Dynamic cache expiration without boost_cron

alex s - November 7, 2008 - 19:57
Project:Boost
Version:6.x-1.x-dev
Component:Expiration logic
Category:support request
Priority:normal
Assigned:Unassigned
Status:closed
Description

If your site contains many pages, boost_cron() is rather slow and can hang the server for some time, so dynamic cache expiration is needed. If you have access to httpd.conf, you can use this solution.
It is based on the Apache RewriteMap directive. A RewriteMap can be an external rewriting program (in this case, a PHP script that checks whether cache files are stale).
This is not a universal solution, but some people may find it useful.
The archive contains installation instructions, my expire.php, and .htaccess. It works with both boost5 and boost6.
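
For readers who cannot open the archive, the mechanism can be sketched roughly as follows. This is not the script from the attachment: the function names, the cache root, the lifetime, and the "expired" reply are assumptions made for illustration. Only the map name boost and the "ok" reply are taken from the rewrite conditions quoted in comment #1. An external RewriteMap program reads one lookup key per line on stdin and must answer each with exactly one line on stdout; the string NULL tells Apache that the lookup failed.

```php
<?php
// Hypothetical external RewriteMap program (a sketch, not the attached script).
// Assumed httpd.conf wiring:  RewriteMap boost prg:/path/to/this/script.php
// In real use the file must be executable and start with a PHP shebang line.

/**
 * Answer one map lookup: "ok" if the cached file exists and is fresh,
 * "expired" if it was stale (and has been deleted), "NULL" otherwise.
 */
function boost_map_lookup($key, $root, $lifetime, $now) {
  // Refuse empty keys and keys that try to climb out of the cache directory.
  if ($key === '' || strpos($key, '..') !== false) {
    return 'NULL';
  }
  $file = $root . $key;
  if (!is_file($file)) {
    return 'NULL';
  }
  if ($now - filemtime($file) > $lifetime) {
    @unlink($file); // drop the stale copy so Drupal regenerates the page
    return 'expired';
  }
  return 'ok';
}

/** Serve lookups from $in to $out until the input stream is closed. */
function boost_map_serve($in, $out, $root, $lifetime) {
  while (($line = fgets($in)) !== false) {
    clearstatcache(); // long-running process: never trust cached stat results
    $reply = boost_map_lookup(rtrim($line, "\r\n"), $root, $lifetime, time());
    fwrite($out, $reply . "\n");
    fflush($out);     // Apache waits for each answer, so never buffer output
  }
}

// When started by Apache, the script would end with a call such as
// (path and lifetime are assumptions):
//   boost_map_serve(STDIN, STDOUT, '/var/www/cache/example.com/0', 1200);
```

A condition such as RewriteCond ${boost:/index.html} ^ok$ then matches only while the cached file is fresh. Any other reply lets the request fall through to Drupal.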

Attachment           Size
dynamic_expire.ZIP   2.97 KB

#1

EvanDonovan - February 25, 2009 - 17:09

After applying this patch, Boost doesn't seem to be serving up the pages from the cache, even though they are being created. Also, it seems to have slowed down the server significantly.

I am also using the patch to exclude the files directory from caching, the patch to stop creating symlinks, and my own custom rewrite rules to make Boost compatible with the Referertools module. However, I had all these working until the patch was applied. Here's my rewrite rules:

<?php

  RewriteCond %{REQUEST_METHOD} ^GET$
  RewriteCond %{REQUEST_URI} ^/$
  RewriteCond %{QUERY_STRING} ^$
  RewriteCond %{HTTP_COOKIE} !DRUPAL_UID
  RewriteCond %{HTTP_COOKIE} !referer_theme
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)safefamilies(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)christianvolunteering(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)techmission(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)urbanyouthworkers(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)agrm(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)ccda(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)youthpartnersnet(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)christianfreeware(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)urbanresource(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)egc(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?ccda\.christianvolunteering(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?server\.ccda(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?mobile\.urbanministry(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?worldvision\.urbanministry(-|.).*$ [NC]
  RewriteCond %{DOCUMENT_ROOT}/cache/%{SERVER_NAME}/0/index.html -f
  RewriteCond ${boost:/index.html} ^ok$
  RewriteRule ^(.*)$ cache/%{SERVER_NAME}/0/index.html [L]

  RewriteCond %{REQUEST_METHOD} ^GET$
  RewriteCond %{REQUEST_URI} !^/cache
  RewriteCond %{REQUEST_URI} !^/user/login
  RewriteCond %{REQUEST_URI} !^/admin
  RewriteCond %{QUERY_STRING} ^$
  RewriteCond %{HTTP_COOKIE} !DRUPAL_UID
  RewriteCond %{HTTP_COOKIE} !referer_theme
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)safefamilies(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)christianvolunteering(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)techmission(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)urbanyouthworkers(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)agrm(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)ccda(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)youthpartnersnet(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)christianfreeware(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)urbanresource(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)egc(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?ccda\.christianvolunteering(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?server\.ccda(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?mobile\.urbanministry(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?worldvision\.urbanministry(-|.).*$ [NC]
  RewriteCond %{DOCUMENT_ROOT}/cache/%{SERVER_NAME}/0%{REQUEST_URI} -d
  RewriteCond %{DOCUMENT_ROOT}/cache/%{SERVER_NAME}/0%{REQUEST_URI}/index.html -f
  RewriteCond ${boost:%{REQUEST_URI}/index.html} ^ok$
  RewriteRule ^(.*)$ cache/%{SERVER_NAME}/0/$1/index.html [L]

  RewriteCond %{REQUEST_METHOD} ^GET$
  RewriteCond %{REQUEST_URI} !^/cache
  RewriteCond %{REQUEST_URI} !^/user/login
  RewriteCond %{REQUEST_URI} !^/admin
  RewriteCond %{QUERY_STRING} ^$
  RewriteCond %{HTTP_COOKIE} !DRUPAL_UID
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)safefamilies(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)christianvolunteering(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)techmission(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)urbanyouthworkers(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)agrm(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)ccda(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)youthpartnersnet(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)christianfreeware(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)urbanresource(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?(www\.)?.*(-|.)egc(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?ccda\.christianvolunteering(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?server\.ccda(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?mobile\.urbanministry(-|.).*$ [NC]
  RewriteCond %{HTTP_REFERER} !^(http://)?worldvision\.urbanministry(-|.).*$ [NC]
  RewriteCond %{HTTP_COOKIE} !referer_theme
  RewriteCond %{DOCUMENT_ROOT}/cache/%{SERVER_NAME}/0%{REQUEST_URI}.html -f
  RewriteCond ${boost:%{REQUEST_URI}.html} ^ok$
  RewriteRule ^(.*)$ cache/%{SERVER_NAME}/0/$1.html [L]
?>

#2

mikeytown2 - May 6, 2009 - 21:38

I think a better idea would be to inject a 1px clear gif at the bottom of the cached page that returns asap but then checks to see if the file is expired. Makes this applicable to any hosting situation & allows for an easy on/off switch; maybe even per page control of the cache.
http://px.sklar.com/code.html/id=256
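
The injection step could look roughly like the sketch below. The function name boost_inject_pixel is a placeholder, and where Boost would call it when writing a page to the cache is an assumption. The script name boost_image.php is the one used later in this thread.

```php
<?php
/**
 * Insert a 1x1 image tag just before the last closing </body> tag.
 * If the page has no </body>, the markup is returned unchanged.
 */
function boost_inject_pixel($html, $src) {
  $tag = '<img src="' . htmlspecialchars($src, ENT_QUOTES) . '" width="1" height="1" alt="" />';
  $pos = strripos($html, '</body>'); // case-insensitive search from the end
  if ($pos === false) {
    return $html;
  }
  // A length of 0 makes substr_replace() insert without overwriting anything.
  return substr_replace($html, $tag, $pos, 0);
}

$page = '<html><body><p>Cached content</p></body></html>';
echo boost_inject_pixel($page, '/boost_image.php'), "\n";
```

Because the tag is added only when the page is written to the cache, removing it (the on/off switch) just means regenerating the cached copy without it.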

#3

mikeytown2 - May 6, 2009 - 23:49

<?php
// Prime PHP for background operations.
if (ob_get_level() > 0) {
  ob_end_clean();
}
header("Connection: close");
ignore_user_abort(true);  // keep running after the browser disconnects

// Output a 1-pixel transparent GIF.
ob_start();
header("Content-type: image/gif");
header("Expires: Wed, 11 Nov 1998 11:11:11 GMT");
header("Cache-Control: no-cache, must-revalidate");
printf ("%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c",71,73,70,56,57,97,1,0,1,0,128,255,0,192,192,192,0,0,0,33,249,4,1,0,0,0,0,44,0,0,0,0,1,0,1,0,0,2,2,68,1,0,59);
// Report the actual size of the buffered image so the browser stops waiting.
header("Content-Length: " . ob_get_length());
ob_end_flush();
flush();

// Do background processing here - UNTESTED CODE BELOW
define('BOOST_CACHE_LIFETIME', 1200);    // 20 minutes
define('BOOST_CACHE_PATH', '...../cache/example.com');
// HTTP_REFERER holds a full URL; keep only its path component.
$referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
$uri = (string) parse_url($referer, PHP_URL_PATH);
$file = BOOST_CACHE_PATH . $uri;

// Delete the cached file only if it exists, is not reached through "..",
// and has outlived the cache lifetime.
if (strpos($uri, '..') === false && is_file($file) && time() - filemtime($file) > BOOST_CACHE_LIFETIME) {
  @unlink($file);
}
exit();
?>

This works in the sense that the image returns right away and the script can keep working in the background. Use $_SERVER['HTTP_REFERER'] to get the page's URL, then use that to check the cached file's modification time. Or should I pass the URL to the script explicitly, with something like this:
img src='boost_image.php?URL=***'

where *** is something like urlencode(XOREncrypt($url,1)); http://www.jonasjohn.de/snippets/php/xor-encryption.htm
Here 1 stands in for a randomly generated key stored in the DB, which would stop almost anyone from forcing the removal of an expired file that is still in the boost cache but does not carry this img tag.
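
A related way to get the same protection is to sign the URL instead of encrypting it. The sketch below swaps the XOR idea for an HMAC, so it is not the XOREncrypt approach described above. The function names, the sig parameter, and the example key are all assumptions. hash_equals() needs PHP 5.6 or later.

```php
<?php
/** Build the image URL for a page path, signed with a secret key. */
function boost_pixel_url($path, $key) {
  $sig = hash_hmac('sha256', $path, $key);
  return 'boost_image.php?URL=' . rawurlencode($path) . '&sig=' . $sig;
}

/** Return the path if the signature matches, or false for a forged request. */
function boost_pixel_verify($path, $sig, $key) {
  $expected = hash_hmac('sha256', $path, $key);
  // hash_equals() compares in constant time to avoid leaking the signature.
  return hash_equals($expected, (string) $sig) ? $path : false;
}

// Assumption: a random key generated once and stored in the DB.
$key = 'replace-with-random-key';
echo boost_pixel_url('/node/42', $key), "\n";

// Inside boost_image.php, PHP has already URL-decoded the parameters:
//   $path = boost_pixel_verify($_GET['URL'], $_GET['sig'], $key);
//   if ($path === false) { exit(); }
```

Without the key, a visitor cannot produce a valid sig for an arbitrary path, so only URLs that were embedded in cached pages can trigger an expiry check.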

#4

mikeytown2 - May 6, 2009 - 23:48
Status:active» needs review

#5

yhager - May 26, 2009 - 04:43

#3 is an awesome idea to get rid of that cron, which is horrible for large sites' performance. However, it means that a stale page is only refreshed after a user has viewed it. So if your boost cache lifetime is 10 minutes and the file has not been accessed for a day, the first visitor will see a day-old file.

#6

mikeytown2 - June 5, 2009 - 08:49
Status:needs review» needs work

This needs a lot of work before it can fly.

#7

mikeytown2 - June 8, 2009 - 06:17
Status:needs work» postponed

Postponed until #453426: Merge Cache Static into boost - Create GUI for database operations is done. I can then use the database to clear the cache. Although, thinking about this, with the DB you could run a boost-only cron every minute and have about the same effect.

#8

mikeytown2 - June 11, 2009 - 19:43

Once the database goes in, making a separate file that one can call for cron makes sense. Shipping Boost with its own cron.php file that boots up the database and then clears expired pages makes more sense than the original post. Will use the image code for #422620: Support Drupal's built in statistics module.
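
The database table this plan relies on is not described in the thread, so the sketch below substitutes a plain filesystem scan to illustrate what a stand-alone purge script could do. The function name, path, and lifetime are assumptions, and a DB-backed version would replace the directory walk with a query for expired entries.

```php
<?php
/**
 * Delete cached files older than $lifetime seconds under $dir.
 * Returns the number of files removed.
 */
function boost_purge_expired($dir, $lifetime, $now) {
  $files = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
  );

  // Collect first, delete afterwards, so the directory is not modified
  // while it is still being iterated.
  $stale = array();
  foreach ($files as $file) {
    if ($file->isFile() && $now - $file->getMTime() > $lifetime) {
      $stale[] = $file->getPathname();
    }
  }

  $removed = 0;
  foreach ($stale as $path) {
    if (@unlink($path)) {
      $removed++;
    }
  }
  return $removed;
}

// Example call from a script run by the system cron (assumed path, 10 minutes):
//   boost_purge_expired('/var/www/cache/example.com', 600, time());
```

Such a script avoids a full Drupal bootstrap, which is the point of shipping a separate cron file, but a scan like this still touches every cached file on each run.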

#9

mikeytown2 - June 19, 2009 - 21:23
Status:postponed» fixed

Alternatives to my own cron file:
http://drupal.org/project/elysia_cron
http://drupal.org/project/supercron

One of these should allow boost cron to run every 10 minutes and all other cron jobs every hour, or something to that effect.

#10

mikeytown2 - June 20, 2009 - 02:18

#11

System Message - July 4, 2009 - 02:20
Status:fixed» closed

Automatically closed -- issue fixed for 2 weeks with no activity.

 
 
