Hello. I use a shared hosting account, and my site is eating up the server's resources. I want to use Boost to improve that.
So far I haven't seen the improvement I expected. I believe I am using the wrong settings.
My site does not get new content very often. What are the correct settings for Maximum Cache Lifetime and Minimum Cache Lifetime?
My current settings are Maximum Cache Lifetime: 1 day and Minimum Cache Lifetime: 0 sec.
What about cache expiration options?
Thanks in advance.

Comments

Anonymous’s picture

Component: User interface » Caching logic
Assigned: Unassigned »

Boost only works for anonymous users, so the first thing to check is to log out, view a page, refresh it (in case the page was not cached on the first viewing), and then view the source. If Boost is working, a comment at the bottom of the page will say when the page was cached and when it should expire.

If no comment appears, then either a module is setting a user ID or your site is served over HTTPS, and Boost has disabled itself. (Boost turns itself off for things like login pages so that usernames are not cached if someone enters them incorrectly.)

You could easily set your Boost lifetime to a month and not need to touch the setting again. Every time you edit a page, Boost automatically wipes the cached copy and regenerates it using the crawler. The crawler never crawls the whole site; it only deals with pages that are edited or deleted, and with related pages that link to them.

Since Boost does not deal with logged-in users, if the majority of your visitors are logged in you will not see any improvement and should look at Authcache or another caching mechanism. Boost only turns PHP pages into static HTML, and server resources are split among many things. One is network bandwidth, in which case you should look at your gzip settings (and make sure you are not wasting CPU cycles compressing already-compressed items such as images). Another is the amount of data you are trying to send the user in the first place. All of this can be analysed by installing Firebug in Firefox, or using the developer tools in Chrome, and selecting the network tab, which gives you a breakdown of what your page is doing.

With Boost installed, the first request for a page will be "normal"; a second request for the cached page will be much faster because only HTML is sent out. Once that is out of the way, you can focus on what else is slowing down the site. Your CSS and JS files should be aggregated using Drupal's normal caching mechanism, with a long lifetime chosen for them.

Whatever your logging mechanism is, analyse it carefully to determine the bottleneck. High CPU usage can be database or PHP related, or caused by server compression. If you have a lot of 404 errors, you could be the victim of dumb bots probing your site for vulnerabilities that don't exist: they run through a list of Joomla, WordPress, Zen Cart, etc. exploits that are irrelevant but still use up your resources. In that case you should certainly consider the Fast 404 Drupal module, which sends out a static page rather than hogging your database resources.

palazis’s picture

Thanks for the fast and detailed response.
Yes, more than 90% of my site's visitors are anonymous, so I need the Boost module.
You are right: I can't find the comment on any page saying when the page was cached and when it should expire.
So Boost is not working; perhaps I did something wrong in the configuration process.
This is probably why the site's statistics show a heavier server load than normal, since core's "Cache pages for anonymous users" is also disabled.
Installation steps.
1) Clean URLS: OK
2) Boost module enabled: OK
3) Cache pages for anonymous users is unchecked
4) Administer > Configuration > System > Boost > Boost Settings: Seems OK
5) Administer > Configuration > System > Boost > File System
Seems OK since I have a cache folder (permissions 0775) and inside that a normal folder.
Inside the normal folder I have many folders (en, el, de, etc), one for each language.
In there I can find lots of html files so I believe this step is OK
6) .htaccess modification. Using Notepad++ (I have LF now), the following code was added below # RewriteBase /
# RewriteBase /

### BOOST START ###

# Allow for alt paths to be set via htaccess rules; allows for cached variants (future mobile support)
RewriteRule .* - [E=boostpath:normal]

# Caching for anonymous users
# Skip boost IF not get request OR uri has wrong dir OR cookie is set OR request came from this server OR https request
RewriteCond %{REQUEST_METHOD} !^(GET|HEAD)$ [OR]
RewriteCond %{REQUEST_URI} (^/(admin|cache|misc|modules|sites|system|openid|themes|node/add|comment/reply))|(/(edit|user|user/(login|password|register))$) [OR]
RewriteCond %{HTTPS} on [OR]
RewriteCond %{HTTP_COOKIE} DRUPAL_UID [OR]
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule .* - [S=7]

# GZIP
RewriteCond %{HTTP:Accept-encoding} !gzip
RewriteRule .* - [S=3]
RewriteCond /home/www/drupal7/cache/%{ENV:boostpath}/studiesinuk.net%{REQUEST_URI}_%{QUERY_STRING}\.html -s
RewriteRule .* cache/%{ENV:boostpath}/studiesinuk.net%{REQUEST_URI}_%{QUERY_STRING}\.html [L,T=text/html,E=no-gzip:1]
RewriteCond /home/www/drupal7/cache/%{ENV:boostpath}/studiesinuk.net%{REQUEST_URI}_%{QUERY_STRING}\.xml -s
RewriteRule .* cache/%{ENV:boostpath}/studiesinuk.net%{REQUEST_URI}_%{QUERY_STRING}\.xml [L,T=text/xml,E=no-gzip:1]
RewriteCond /home/www/drupal7/cache/%{ENV:boostpath}/studiesinuk.net%{REQUEST_URI}_%{QUERY_STRING}\.json -s
RewriteRule .* cache/%{ENV:boostpath}/studiesinuk.net%{REQUEST_URI}_%{QUERY_STRING}\.json [L,T=text/javascript,E=no-gzip:1]

# NORMAL
RewriteCond /home/www/drupal7/cache/%{ENV:boostpath}/studiesinuk.net%{REQUEST_URI}_%{QUERY_STRING}\.html -s
RewriteRule .* cache/%{ENV:boostpath}/studiesinuk.net%{REQUEST_URI}_%{QUERY_STRING}\.html [L,T=text/html]
RewriteCond /home/www/drupal7/cache/%{ENV:boostpath}/studiesinuk.net%{REQUEST_URI}_%{QUERY_STRING}\.xml -s
RewriteRule .* cache/%{ENV:boostpath}/studiesinuk.net%{REQUEST_URI}_%{QUERY_STRING}\.xml [L,T=text/xml]
RewriteCond /home/www/drupal7/cache/%{ENV:boostpath}/studiesinuk.net%{REQUEST_URI}_%{QUERY_STRING}\.json -s
RewriteRule .* cache/%{ENV:boostpath}/studiesinuk.net%{REQUEST_URI}_%{QUERY_STRING}\.json [L,T=text/javascript]

### BOOST END ###

What could be wrong?

Anonymous’s picture

Is anything being created under the folder cache/normal? I also noticed when visiting your site that there is an almost immediate redirect to /en/, which may be a problem for the rewrite path. There are several threads on multilingual module configuration with .htaccess examples. I would cut the .htaccess right back and only use HTML for the time being to debug it.
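
As a rough sketch of what cutting the .htaccess right back could look like, the block below keeps only the skip conditions and the plain HTML rule from the rules posted above. The paths are copied from that post, and the skip count is changed to S=1 because only one Boost rule follows the skip rule. Treat it as a debugging aid under those assumptions, not as the module's generated output.

```apache
### BOOST START (HTML only, for debugging) ###
RewriteRule .* - [E=boostpath:normal]

# Skip the single Boost rule below for non-GET/HEAD requests, excluded paths,
# HTTPS, logged-in users (DRUPAL_UID cookie) and internal redirects.
RewriteCond %{REQUEST_METHOD} !^(GET|HEAD)$ [OR]
RewriteCond %{REQUEST_URI} (^/(admin|cache|misc|modules|sites|system|openid|themes|node/add|comment/reply))|(/(edit|user|user/(login|password|register))$) [OR]
RewriteCond %{HTTPS} on [OR]
RewriteCond %{HTTP_COOKIE} DRUPAL_UID [OR]
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule .* - [S=1]

# Serve the cached HTML file if it exists and is not empty (-s).
RewriteCond /home/www/drupal7/cache/%{ENV:boostpath}/studiesinuk.net%{REQUEST_URI}_%{QUERY_STRING}\.html -s
RewriteRule .* cache/%{ENV:boostpath}/studiesinuk.net%{REQUEST_URI}_%{QUERY_STRING}\.html [L,T=text/html]
### BOOST END ###
```

If cached pages are served with this reduced block, the gzip, XML and JSON rules can be added back one group at a time, adjusting the skip counts to match.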

Also I noticed something.

http://studiesinuk.net/cache/normal/studiesinuk.net/ gives a 500 error. You may need to use the SymLinksIfOwnerMatch option, which is available in the dev version of Boost and is required by some web hosting services.
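
A minimal sketch of that change, assuming the host forbids FollowSymLinks in .htaccess (one possible cause of a 500 error on shared hosting). mod_rewrite accepts either option, so the stock Drupal line can be swapped by hand. Whether this matches exactly what the dev version of Boost generates is not confirmed here.

```apache
# Stock Drupal .htaccess line, commented out:
# Options +FollowSymLinks

# Alternative for hosts that only permit owner-matched symlinks.
# mod_rewrite in .htaccess needs one of these two options enabled.
Options +SymLinksIfOwnerMatch
```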

palazis’s picture

My structure is like this:
cache-->normal-->studiesinuk.net-->en-->lots of html files there
Under normal I also have eshop.palazis.net, palazis.net, www.palazis.net, www.studiesinuk.net
I am not sure I understand what this immediate redirect to /en/ is.

I will also try the dev version

Anonymous’s picture

When I went to the site by just pasting in

studiesinuk.net

to the browser from your .htaccess file, it immediately redirected to

studiesinuk.net/en/

I have noticed a problem however. You should be able to access the cache files directly using the directory structure you mentioned

http://studiesinuk.net/cache/normal/studiesinuk.net/en/_.html

should give the cached index page, but if placed in the browser it is redirected to

http://studiesinuk.net/en/cache/normal/studiesinuk.net/en/_.html

and gives a 404 error, which is correct for that path. It looks like something is misconfigured with the multilingual module, and I suggest you look through the threads; I believe the problem was solved for a Japanese site, if memory serves. It appears Boost is functioning correctly and that it is the rewrite rules that need tweaking.

joeyb583’s picture

I'm having the same issue. All the setup looks fine, the status report is good, and I copied the info to the .htaccess file, but I'm not getting the comment at the bottom of the page source. Any help would be much appreciated. Here's the site link...

http://www.green-clinic.com.php53-2.dfw1-2.websitetestlink.com/

Anonymous’s picture

Status: Active » Closed (works as designed)

The pages are being generated; I can see that much, as the comment appears on

http://www.green-clinic.com.php53-2.dfw1-2.websitetestlink.com/cache/nor...

The Boost code is very specific about its location: it must be placed after RewriteBase

and before

  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteRule ^ index.php [L]

which is the standard Drupal rewrite code for clean URLs. I cannot see any cookies being set that would override Boost. I'd certainly recommend a lifetime much greater than 1 hour.

joeyb583’s picture

Status: Closed (works as designed) » Active

That's interesting. Why would it not be outputting the boost code on the home page? Also, what would you recommend the max timeout be set to?

Here's the .htaccess code...

#
# Apache/PHP/Drupal settings:
#

# Protect files and directories from prying eyes.
<FilesMatch "\.(engine|inc|info|install|make|module|profile|test|po|sh|.*sql|theme|tpl(\.php)?|xtmpl)$|^(\..*|Entries.*|Repository|Root|Tag|Template)$">
  Order allow,deny
</FilesMatch>

# Don't show directory listings for URLs which map to a directory.
Options -Indexes

# Follow symbolic links in this directory.
Options +FollowSymLinks

# Make Drupal handle any 404 errors.
ErrorDocument 404 /index.php

# Set the default handler.
DirectoryIndex index.php index.html index.htm

# Override PHP settings that cannot be changed at runtime. See
# sites/default/default.settings.php and drupal_environment_initialize() in
# includes/bootstrap.inc for settings that can be changed at runtime.

# PHP 5, Apache 1 and 2.
<IfModule mod_php5.c>
  php_flag magic_quotes_gpc                 off
  php_flag magic_quotes_sybase              off
  php_flag register_globals                 off
  php_flag session.auto_start               off
  php_value mbstring.http_input             pass
  php_value mbstring.http_output            pass
  php_flag mbstring.encoding_translation    off
</IfModule>

# Requires mod_expires to be enabled.
<IfModule mod_expires.c>
  # Enable expirations.
  ExpiresActive On

  # Cache all files for 2 weeks after access (A).
  ExpiresDefault A1209600

  <FilesMatch \.php$>
    # Do not allow PHP scripts to be cached unless they explicitly send cache
    # headers themselves. Otherwise all scripts would have to overwrite the
    # headers set by mod_expires if they want another caching behavior. This may
    # fail if an error occurs early in the bootstrap process, and it may cause
    # problems if a non-Drupal PHP file is installed in a subdirectory.
    ExpiresActive Off
  </FilesMatch>
</IfModule>

# Various rewrite rules.
<IfModule mod_rewrite.c>
  RewriteEngine on

  # Block access to "hidden" directories whose names begin with a period. This
  # includes directories used by version control systems such as Subversion or
  # Git to store control files. Files whose names begin with a period, as well
  # as the control files used by CVS, are protected by the FilesMatch directive
  # above.
  #
  # NOTE: This only works when mod_rewrite is loaded. Without mod_rewrite, it is
  # not possible to block access to entire directories from .htaccess, because
  # <DirectoryMatch> is not allowed here.
  #
  # If you do not have mod_rewrite installed, you should remove these
  # directories from your webroot or otherwise protect them from being
  # downloaded.
  RewriteRule "(^|/)\." - [F]

  # If your site can be accessed both with and without the 'www.' prefix, you
  # can use one of the following settings to redirect users to your preferred
  # URL, either WITH or WITHOUT the 'www.' prefix. Choose ONLY one option:
  #
  # To redirect all users to access the site WITH the 'www.' prefix,
  # (http://example.com/... will be redirected to http://www.example.com/...)
  # uncomment the following:
  # RewriteCond %{HTTP_HOST} !^www\. [NC]
  # RewriteRule ^ http://www.%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
  #
  # To redirect all users to access the site WITHOUT the 'www.' prefix,
  # (http://www.example.com/... will be redirected to http://example.com/...)
  # uncomment the following:
  # RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
  # RewriteRule ^ http://%1%{REQUEST_URI} [L,R=301]

  # Modify the RewriteBase if you are using Drupal in a subdirectory or in a
  # VirtualDocumentRoot and the rewrite rules are not working properly.
  # For example if your site is at http://example.com/drupal uncomment and
  # modify the following line:
  # RewriteBase /drupal
  #
  # If your site is running in a VirtualDocumentRoot at http://example.com/,
  # uncomment the following line:
  RewriteBase /


  ### BOOST START ###

  # Allow for alt paths to be set via htaccess rules; allows for cached variants (future mobile support)
  RewriteRule .* - [E=boostpath:normal]

  # Caching for anonymous users
  # Skip boost IF not get request OR uri has wrong dir OR cookie is set OR request came from this server OR https request
  RewriteCond %{REQUEST_METHOD} !^(GET|HEAD)$ [OR]
  RewriteCond %{REQUEST_URI} (^/(admin|cache|misc|modules|sites|system|openid|themes|node/add|comment/reply))|(/(edit|user|user/(login|password|register))$) [OR]
  RewriteCond %{HTTPS} on [OR]
  RewriteCond %{HTTP_COOKIE} DRUPAL_UID [OR]
  RewriteCond %{ENV:REDIRECT_STATUS} 200
  RewriteRule .* - [S=3]

  # GZIP
  RewriteCond %{HTTP:Accept-encoding} !gzip
  RewriteRule .* - [S=1]
  RewriteCond %{DOCUMENT_ROOT}/cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.html -s
  RewriteRule .* cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.html [L,T=text/html,E=no-gzip:1]

  # NORMAL
  RewriteCond %{DOCUMENT_ROOT}/cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.html -s
  RewriteRule .* cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.html [L,T=text/html]

  ### BOOST END ###


  # Pass all requests not referring directly to files in the filesystem to
  # index.php. Clean URLs are handled in drupal_environment_initialize().
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteRule ^ index.php [L]

  # Rules to correctly serve gzip compressed CSS and JS files.
  # Requires both mod_rewrite and mod_headers to be enabled.
  <IfModule mod_headers.c>
    # Serve gzip compressed CSS files if they exist and the client accepts gzip.
    RewriteCond %{HTTP:Accept-encoding} gzip
    RewriteCond %{REQUEST_FILENAME}\.gz -s
    RewriteRule ^(.*)\.css $1\.css\.gz [QSA]

    # Serve gzip compressed JS files if they exist and the client accepts gzip.
    RewriteCond %{HTTP:Accept-encoding} gzip
    RewriteCond %{REQUEST_FILENAME}\.gz -s
    RewriteRule ^(.*)\.js $1\.js\.gz [QSA]

    # Serve correct content types, and prevent mod_deflate double gzip.
    RewriteRule \.css\.gz$ - [T=text/css,E=no-gzip:1]
    RewriteRule \.js\.gz$ - [T=text/javascript,E=no-gzip:1]

    <FilesMatch "(\.js\.gz|\.css\.gz)$">
      # Serve correct encoding type.
      Header set Content-Encoding gzip
      # Force proxies to cache gzipped & non-gzipped css/js files separately.
      Header append Vary Accept-Encoding
    </FilesMatch>
  </IfModule>
</IfModule>

Anonymous’s picture

If you have the Drupal rewrite code in a virtual host configuration file, that can also have the same effect. Boost itself is working, but your web server is not sending out the cached pages, so the rewrite rules are being missed for some reason. You may want to add a simple

order allow,deny
deny from all

in your .htaccess file just to check that .htaccess is working. This kind of error normally comes down to one of two things: a module installed that logs a user in even if they are anonymous (and no DRUPAL_UID cookie is set on your site), or the rewrite rules being ignored.

As for the length of time: Boost can be set for a very long time, especially if you have httprl, Cache Expiration and boost_crawler installed (a bad name for the latter, as it regenerates pages that are updated, inserted or deleted rather than crawling the site). Your anonymous users would then build a cache of static files, and the only reason ever to delete the cache would be if the site underwent style changes that you wanted to propagate to older pages.

joeyb583’s picture

I'll be honest I'm not sure of either of those. We use Rackspace cloud sites for hosting so I'm not sure about the virtual host configuration file.

I also have no clue about a module potentially logging a user in. How would I know or even check on that? I'm fairly new to this. I'm not familiar with httprl, Cache Expiration or Boost crawler either. I'm just looking for a solution to increase speed on all our sites, which will have mostly anonymous users. As you can see from that site, it's pretty slow. I really appreciate your help with this.

Anonymous’s picture

Your problem is the rewrite one. If you are using a cloud server then you almost certainly have a root login, and you probably need someone a little more experienced to go through your Apache configuration if you are not comfortable editing files directly on the server. The site is rather slow, and the main speed loss is from processing the PHP; I can see that looking at it through Firebug. If you rename your .htaccess file (you must remove the .ht at the beginning) and Drupal still works, then there is configuration elsewhere that has set up your system.

joeyb583’s picture

Yeah, I'm comfortable doing that. I've been developing front-end stuff for a while, just not much Drupal and PHP, and server administration is not my cup of tea.

When I was setting up clean URLs, I had to edit the RewriteBase in the .htaccess file and I added some dummy text at the end of it to make sure the site was using that and it broke the site if that tells you anything. Could it still be using rewrite configurations elsewhere?

Anonymous’s picture

It is very likely that it is using rewrite configuration elsewhere. I've seen it a few times, including on the websites of companies with multi-billion dollar turnover :) If the site continues to function, then that's the issue. If not, you'll need to enable a mod_rewrite debug log to see what the problem is, which can only be done by editing the main Apache configuration and restarting the server.

joeyb583’s picture

Gotcha. I'll chat with some Rackspace server admins tomorrow and get them to take a look and see what we can figure out. I'll update accordingly. Thanks again.

joeyb583’s picture

Not much help from those guys. Let's go back to this...

order allow,deny
deny from all

What is that and what does it do?

I also came across this...

http://drupal.org/node/1888588

One guy specifically mentions the vhost here...

http://drupal.org/node/1888588#comment-6939236

Anonymous’s picture

The deny from all statement is just a test to see if your .htaccess is working. If it is, your site will stop responding and return an error (normally 403 Forbidden). The same test can be achieved by renaming the .htaccess file and seeing if the site still works.
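
For reference, an annotated version of that test. When the file is being read, every request is normally refused with 403 Forbidden; a 500 error is also possible on servers that do not recognise the older directives. The Apache 2.4 line is an addition for illustration, not part of the original suggestion.

```apache
# Temporary test only: put at the top of .htaccess, reload a page, then remove.
# Apache 2.2 syntax (also accepted on 2.4 when mod_access_compat is loaded):
Order allow,deny
Deny from all

# Apache 2.4 native equivalent:
# Require all denied
```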

joeyb583’s picture

Yeah, I just did it and I got the "Forbidden, you do not have access" message. Does that mean it's not being overwritten elsewhere?

Anonymous’s picture

No, it just means that the .htaccess file is being read. You need to remove or rename the file and see if the site still works to know whether there is configuration elsewhere.

joeyb583’s picture

Well, I just renamed the file to access and the site still works, but clean URLs do not. When I added ?q= to the path, it worked.

Anonymous’s picture

You are going to have to read up on enabling the mod_rewrite debug log in your virtual host / Apache config to try to work out what's going on. If clean URLs aren't working with a disabled .htaccess, then the root .htaccess is probably in control, and you'll need the debug log to work out why the Boost rules are being ignored.
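
A sketch of what enabling mod_rewrite logging typically looks like, assuming access to the main server or virtual host configuration (these directives are not allowed in .htaccess). The log path is a placeholder, and which variant applies depends on the Apache version the host runs.

```apache
# Apache 2.2: main server or <VirtualHost> config only.
# The log path below is a placeholder.
RewriteLog "/var/log/apache2/rewrite.log"
RewriteLogLevel 3

# Apache 2.4: RewriteLog was removed; rewrite tracing goes to the error log.
# LogLevel alert rewrite:trace3
```

Each logged line shows the pattern and path mod_rewrite tested, which is how a wrong filesystem path shows up. Logging at this level slows the server, so it is best switched off again once the problem is found.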

joeyb583’s picture

Hey man I just figured it out. I stumbled upon this article in the rackspace knowledge space and step #10 was the key...

http://www.rackspace.com/blog/optimizing-your-drupal-site/

Now that I've got that, what would you suggest the length be set to?

Thanks for all your help.

Anonymous’s picture

That article is not quite correct. For one thing, you need to disable page caching (the other pages are fine): under Performance, you need to turn off the check box for

Cache pages for anonymous users

I also do not understand the comment about length.

joeyb583’s picture

Yeah I did that. I just skimmed it until I saw something different and step #10 is what stood out.

The length was referring to the length to set the cache to. Here's the site link...

http://www.green-clinic.com.php53-2.dfw1-2.websitetestlink.com/

Anonymous’s picture

Boost cache length is entirely dependent on what else you have installed. If you have the crawler component installed, then any page that is edited, updated or deleted will be regenerated, so the cache length can be effectively infinite if you untick the box "remove old files on cron", which is my personal preference. However, if you are making major stylistic changes, you will want to alter that, because your cached pages would still have "the old layout".

Cron in Drupal 7 is pretty much automatic as long as an authorised user logs in or someone hits a page that Boost does not cache. How often it needs to run depends on the frequency of updates to the site, and it is really most useful for search indexing. Although the crawler does use cron, the pages are already expired, so an anonymous user could regenerate them before cron gets around to it; fresh content is always going to be provided. The other aspect of cron is checking for available updates. That is a security issue, so cron should run at least once a week even if nothing else is happening on the site.

So the explanation for the issue seems to be that your %{DOCUMENT_ROOT} does not match the filesystem on your cloud server. This would have been diagnosed by enabling the rewrite log, as it would have shown the path that mod_rewrite was looking for in the filesystem.
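
One way to work around such a mismatch, sketched under the assumption that you know the real absolute path of the Drupal root on disk: hard-code it in the RewriteCond in place of %{DOCUMENT_ROOT}, as the first .htaccess posted in this thread does. The path below is a placeholder, not the actual path on any particular host.

```apache
# /path/to/drupal is a placeholder for the real filesystem path of the site root.
RewriteCond /path/to/drupal/cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.html -s
RewriteRule .* cache/%{ENV:boostpath}/%{HTTP_HOST}%{REQUEST_URI}_%{QUERY_STRING}\.html [L,T=text/html]
```

Only the RewriteCond needs the absolute path, because it tests the file on disk; the RewriteRule target stays relative to the directory containing the .htaccess file.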

joeyb583’s picture

I will look into the boost crawler. Seems like the more efficient way to go so the pages are cached automatically upon expiration rather than requiring a user to load the page again before caching (assuming I'm understanding that correctly).

So with the "remove old files on cron" option, if it's enabled, Boost will delete the expired pages when cron is run?

If it's disabled and I'm using the crawler, the crawler will automatically regenerate the cached file upon expiration so the user doesn't need to hit the page first in order to cache it?

Am I understanding that correctly?

In regards to the rewrite log, wouldn't that need to be enabled in the server configurations? That's what I've read. If so, I don't have access to those settings, nor am I familiar with them.

Anonymous’s picture

Status: Active » Closed (works as designed)

I will look into the boost crawler. Seems like the more efficient way to go so the pages are cached automatically upon expiration rather than requiring a user to load the page again before caching (assuming I'm understanding that correctly).

Yes, otherwise you have to rely on cron deleting the pages and anonymous users creating the cache.

So with the "remove old files on cron" option, if it's enabled, Boost will delete the expired pages when cron is run?

Yes, but I find there is little point in deleting stale cache files, apart from saving disk space, unless the layout has changed, since you then still need an anonymous user to visit the page to re-create the cache. There is a theoretical advantage that the db tables would be "in memory" for non-cached pages, but those tend to be few and far between.

If it's disabled and I'm using the crawler, the crawler will automatically regenerate the cached file upon expiration so the user doesn't need to hit the page first in order to cache it?

Am I understanding that correctly?

No, not on expiration; on modification. Also, if the title is changed and the page appears in blocks on other pages, then those other pages are also deleted and then regenerated.

In regards to the rewrite log, wouldn't that need to be enabled in the server configurations? That's what I've read. If so, I don't have access to those settings, nor am I familiar with them.

Yes, but it is unusual to have a cloud server with no access to the configuration, that's more like cheap shared hosting.

joeyb583’s picture

Awesome. Installing Boost crawler now. Thanks for all your help on this. Definitely learned a lot. Top notch support and assistance.

joeyb583’s picture

Last question. So I need the Expire module for this to work as you described? If not, what happens?

Anonymous’s picture

If you don't have Expire, then Boost will run on cron and you'll need to turn on the setting "expire pages on cron run". This would only affect anonymous users, but they could get badly out-of-date pages where comments never update.

One interesting point. If you put your site into maintenance mode, then anonymous users will still see "the site" as the rewrite rules hit the boost cache before going to index.php so the cache needs to be flushed if any major correction work needs to take place.

joeyb583’s picture

So to make sure I understand correctly, the cache either gets cleared when cron is run or when the expire module detects a content update of some sort. At that point, I've got the boost crawler and httprl modules installed that will automatically regenerate those cached pages that were cleared without a user having to hit the page. That correct?

In regards to the maintenance mode, that makes sense. Put the site in maintenance mode and then flush all cache so the users will see the maintenance page.

Anonymous’s picture

So to make sure I understand correctly, the cache either gets cleared
when cron is run or when the expire module detects a content update of
some sort. At that point, I've got the boost crawler and httprl modules
installed that will automatically regenerate those cached pages that
were cleared without a user having to hit the page. That correct?

Exactly.

joeyb583’s picture

Awesome. Last question for real this time. What do you suggest/recommend for cron settings and your internal performance settings, mainly minimum cache lifetime and expiration of cached pages or do those even matter since we're using boost and expire?

Anonymous’s picture

Difficult subject, depends on many things.

  • cron would only be triggered by a logged in user (as anon users see boost)
  • or a visit to an admin page, or a page like new user registration, which might be triggered by search engine spiders
  • or data being posted to the site by a spam bot looking for a vulnerability (so they can prove useful)

You need it running at least once a week for update checking, possibly more if there is a large amount of new content every day as cron also controls your search indexing.

joeyb583’s picture

Yeah I don't foresee it changing much. I'll set it to run once a week.

What about the internal performance settings? I updated the last post maybe after you read the initial one. Any suggestions there?

Anonymous’s picture

If you have the crawler, then cache everything for a very long time, turn off the cron setting that removes stale files, and only ever clear the cache if the style of the site changes. The crawler will handle any updates. The only other thing that might require clearing out the old cache is if you had the Google tracking modules enabled and they changed the JavaScript code. This setup will drop your CPU usage down to virtually nothing and speed up everything, especially for spiders indexing old information, which can be a big drain.

joeyb583’s picture

Gotcha. What about the built-in performance settings, minimum cache lifetime and expiration of cached pages? Do those matter since I'm using Boost? The maximum for those is 1 day.

Anonymous’s picture

You can ignore those settings for the pages but not for the blocks (though any performance increase is going to be very minimal). That highlights a lack in my knowledge as I do not know if those settings affect the aggregate css and javascript functions inside of drupal.

joeyb583’s picture

So I've updated the order of menu items, and when I hit the site in a different browser, not logged in, it's not updating. Do the cache expiry and regeneration happen on update, or is the page basically "flagged" and queued until cron is run?

joeyb583’s picture

I tried running cron and it didn't work. Once I cleared cache, it updated the menu. Is this because this is more than just content?

Anonymous’s picture

Yes, a menu is a block, so it is controlled by Drupal's caching mechanism, not Boost's. Boost caches the output sent to the browser at one instant. If you move your links around or play with stylesheets, Boost will either not find the new styles or the links will be the old ones, just as if you had saved a page to your hard drive.

joeyb583’s picture

Gotcha. So here's my setup and tell me what you think. I believe I'm at a good level of understanding...

Performance Settings
Cache Blocks
Minimum cache lifetime left to <none>.
Expiration of cached pages left to <none>.
Aggregates both CSS and JS files.

Cache Expiration
Left all defaults

Boost Settings
text/html - Max cache lifetime set to 12 months 4 days
text/html - Min cache lifetime set to 12 months 4 days

Boost Cache Configuration
Uncheck remove old cached files on cron

Boost Crawler
Checked Enable the cron crawler

How does that look to you?

Anonymous’s picture

That all looks fine as long as you remember that menus are blocks, and that themes work exactly the same way, boost will show the "old one" unless content is updated. It is only my personal opinion that cron should be unticked, but it's based on seeing sites that are spidered and using thousands of CPU cycles to generate old content which can run into thousands of pages.

joeyb583’s picture

What do you mean specifically by untick cron? You mean the general cron setting?

Also, I went home last night and hit the site from there and it wasn't cached using boost. When I went to another page and then back to the home page, I could see it was using the boost cached pages. I thought the cached pages were automatically generated using boost crawler and httprl? Is that not right? If it should generate on cron run, it's set at every 3 hours so it should have run by the time I hit the page last night. I'm a bit confused by this.

Anonymous’s picture

Look through the boost settings and there is a tick box, "remove stale files on cron", I don't think that should be ticked.

The crawler only crawls for changes; anonymous users generate the cache. It's a frequent misunderstanding caused by the name.

joeyb583’s picture

Yes, I did uncheck that box.

I guess the user has to generate the cache initially, huh? No other way to do that?

Anonymous’s picture

There are lots of ways of doing it, from paid-for systems to submitting your site to a search engine, which will spider it. There's not much available when you're on shared hosting, but there is a thread somewhere here with some PHP that people have put together to mimic a spider.

joeyb583’s picture

Yeah I think I'll stay away from that. What are your thoughts on the Boost expire module?

Anonymous’s picture

This module may become deprecated in favor of a more universal solution (like Boost integration with the Cache Expiration module).

As far as I remember, it serves the same function as the crawler and the Cache Expiration module, so it is not a core part of Boost. It is unlikely that Boost would be directly integrated with Cache Expiration, as other modules rely on Cache Expiration. The change from Boost 6 to Boost 7 was a total redesign that took a very large piece of code that was quite difficult to manage and split it into discrete chunks. I doubt that anyone would go backwards, and I suspect that boost_expire was created without fully understanding how the crawler works, as it appears to be a duplicate project.

joeyb583’s picture

Gotcha, I did read that but just wanted your opinion on it as I have a pretty knowledgeable co-worker using it and figured he just wasn't aware of the situation you described. I think I've got this figured out and I appreciate all your help and patience with me.