I'm beginning to get my head round setting up Boost, but it took a while to understand and some parts are still obscure to me. I hope this post will both be useful to others and help to clarify the areas that remain unclear.

Use Case
My site is mostly a reference site, with articles organised in books on a variety of subjects. There is also a forum where people can discuss the articles (comments directly on the articles are disabled), but the registered, active forum users are only a small subset of the readers.
There are essentially four types of page: summary pages showing a display of the latest book pages published (built by panels and nodequeue views), node pages (books), views pages, and the forum.
The book pages show the full article, the book contents block, and a block from the "related links" module to show articles on similar subjects.
Cron is currently set to run every 24 hours.

How Boost works
This assumes that the default settings are used.
Broadly, Boost works as follows:
1) Anonymous user navigates to a page.
2) If the page is not cached, Boost generates an HTML page containing all the elements of the page (titles, node content, block content), including the read counter.
3) If the page is already cached then Boost feeds up the page from the cache (QUESTION: What happens if the page is in the cache but marked as expired?).
4) When cron next runs, then pages in the cache whose expiry time has passed will be marked as "expired".
5) When an anonymous user next visits the page, the existing cached page will be flushed and the current version of the page sent to the cache.

Solution
Here is how I plan to set up my site to fit my use case.
1) Before installing Boost, the Pathauto module is set up so that all forum entries are automatically allocated a URL beginning "forum" - this will allow them to be excluded from Boost.
2) Cron runs every 24 hours, during the night, so there is no point in having a default expiry interval less than this (QUESTION: is this true?). This means that when new documents are published, their summary (on the panels pages) will only be visible to anonymous users on the following day (QUESTION: is this true?).
3) Exclude all "forum*" pages from Boost so that users can see their forum posts straight away.
4) Since I want to display the read counter, I configure the "Popular content" block, then disable it. It is replaced by the "Boost Ajax statistics" block, configured to collect statistics but not to have the block cached. This will:
a) display an up-to-date "most popular content" list
b) update the read counter: cached pages will display the read counter as it stood when the page was put in the cache.
5) Since articles are never changed once published (they remain for reference), their default expiry time can be set to (for example) one week. The read counter will therefore be updated every week. The timeout for the "book" content type must be set through the Boost settings block displayed with the page (QUESTION: in the settings I can see that there is an option "Check database timestamps for any site changes", but it is not clear which timestamp is meant. In the case of my articles, for example, is it the node timestamp? If so, it will not be very appropriate, because the "related links" will then not change at all).
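The exclusion rule in step 3 can be pictured with a small Python sketch. This is only an illustration of wildcard path matching, not Boost's own code: the function name `is_cacheable`, the `EXCLUDED_PATTERNS` list and the example paths are all hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion list; "forum*" is the pattern from step 3.
EXCLUDED_PATTERNS = ["forum*"]

def is_cacheable(path, patterns=EXCLUDED_PATTERNS):
    """Return False when the path matches any exclusion pattern."""
    normalized = path.lstrip("/")
    return not any(fnmatchcase(normalized, pattern) for pattern in patterns)

# Example paths are made up for illustration.
print(is_cacheable("forum/general/my-first-post"))    # False: excluded
print(is_cacheable("books/gardening/pruning-roses"))  # True: may be cached
```

Because Pathauto gives every forum entry a URL beginning with "forum", a single pattern is enough to keep the whole forum out of the cache.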

I would be very grateful if you could take the time to look through this and tell me if it makes sense, and also answer the questions in the text.

Comments

bgm

Status: Active » Fixed

"3) If the page is already cached then Boost feeds up the page from the cache (QUESTION: What happens if the page is in the cache but marked as expired?)."

The page will be served. It only gets deleted when cron runs.
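The behaviour described here can be sketched as a toy model in Python. It is a simplified illustration under stated assumptions, not Boost's implementation: the `PageCache` class, its method names and the use of plain numbers of seconds for time are all hypothetical.

```python
class PageCache:
    """Toy model: expired pages are still served until cron deletes them."""

    def __init__(self, lifetime):
        self.lifetime = lifetime  # seconds a page stays fresh
        self.pages = {}           # path -> (html, expires_at)

    def serve(self, path, now, render):
        entry = self.pages.get(path)
        if entry is None:
            # Cache miss: build the page and store it with its expiry time.
            html = render(path)
            self.pages[path] = (html, now + self.lifetime)
            return html
        # Cache hit: served as-is, even if the expiry time has passed.
        return entry[0]

    def cron(self, now):
        """Delete every page whose expiry time has passed."""
        expired = [p for p, (_, exp) in self.pages.items() if exp <= now]
        for p in expired:
            del self.pages[p]
        return expired


versions = {"n": 0}

def render(path):
    versions["n"] += 1
    return f"{path} v{versions['n']}"

cache = PageCache(lifetime=3600)
first = cache.serve("node/1", 0, render)      # "node/1 v1"
stale = cache.serve("node/1", 7200, render)   # still "node/1 v1" (expired, not yet deleted)
removed = cache.cron(86400)                   # ["node/1"]
fresh = cache.serve("node/1", 86460, render)  # "node/1 v2"
```

In this model a visit never refreshes an expired page by itself; only the cron run removes it, after which the next visit regenerates it.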

"2) Cron runs every 24 hours, during the night, so there is no point in having a default expiry interval less than this (QUESTION: is this true?)."

Correct.
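A short worked calculation shows why. The sketch below assumes that cron runs at fixed intervals and that a page is deleted at the first cron run at or after its expiry time; the function name `deletion_time` and the use of hours as the unit are assumptions made for illustration.

```python
import math

def deletion_time(cached_at, lifetime, cron_interval, first_cron=0):
    """Time of the first cron run at or after the page's expiry time."""
    expires_at = cached_at + lifetime
    runs = max(0, math.ceil((expires_at - first_cron) / cron_interval))
    return first_cron + runs * cron_interval

# Times in hours; cron runs at hours 0, 24, 48, ...
print(deletion_time(cached_at=1, lifetime=1, cron_interval=24))   # 24
print(deletion_time(cached_at=1, lifetime=12, cron_interval=24))  # 24
print(deletion_time(cached_at=1, lifetime=24, cron_interval=24))  # 48
```

A one-hour and a twelve-hour expiry both end at the same cron run, so any expiry shorter than the cron interval is effectively rounded up to the next run.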

"(QUESTION: in the settings I can see that there is an option "Check database timestamps for any site changes" - but it is not clear what is the timestamp."

It's a cron option to clear a page cache before its expiry time. Otherwise cron waits for the page to expire before deleting its cache.

(Responding quite late, but this might help other people who find this issue while searching.)

bgm

Title: Am I getting this right? Perhaps this could be added to support documentation? » expiration logic, questions on the cron, check for database timestamps

(renaming issue)

Clarification about "Check database timestamps": yes, it checks the "last modified" timestamp of the node, comment or user, if it has one.

In the code, this is the function "boost_has_site_changed".
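As a rough picture of what such a check involves, here is a minimal Python sketch. It is not the code of "boost_has_site_changed": the function `changed_since`, its arguments and the sample timestamps are hypothetical, and the real function may work differently.

```python
def changed_since(last_check, last_modified):
    """Return True if any item was modified after the last check.

    last_modified maps an item name to its "last modified" Unix
    timestamp, or to None when the item has no such timestamp.
    """
    return any(
        ts is not None and ts > last_check
        for ts in last_modified.values()
    )

# Made-up timestamps for illustration.
site = {"node": 1_700_000_500, "comment": 1_699_990_000, "user": None}
print(changed_since(1_700_000_000, site))  # True: the node changed after the check
print(changed_since(1_700_001_000, site))  # False: nothing changed since then
```

Items without a timestamp are simply skipped, which matches the "if it has one" caveat above.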

joel_guesclin

"Check database timestamps for any site changes" is a cron option to clear a page cache before its expiry time. Otherwise cron waits for the page to expire before deleting its cache.

This is very useful. It means that if I change an article, then setting this option will cause the cache to be cleared before its normal expiry time.

joel_guesclin

Would it be helpful for me to put this in the Boost documentation as a use case?

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.