Perhaps gzipping of the cached files would be a nice feature? I've been using a similar manual caching method before I found this module (and the 5.x patch) and kept my cache gzipped.
I've now set my server to gzip the cached files dynamically using mod_deflate, but gzipping and storing the cached files as html.gz and html would perhaps conserve precious resources?
Something like the code below would also be needed in the .htaccess file:
RewriteEngine on
RewriteOptions Inherit
# Check whether the browser accepts gzip-encoded files.
RewriteCond %{HTTP:Accept-Encoding} (gzip.*)
# Make sure there's no trailing .gz on the URL.
#RewriteCond %{REQUEST_FILENAME} !^.+\.gz$
# Check whether a .gz version of the file exists.
RewriteCond %{REQUEST_FILENAME}.gz -f
# All conditions met, so add .gz to the URL filename (invisibly).
RewriteRule ^(.+) $1.gz [L]
Comments
Comment #1
axel commented
Very interesting. I'll try to add this feature (optional) to the 5.x branch.
Comment #2
cybe commented
I'm running a site on a virtual server with too little memory, and it would probably not be possible without the Boost module.
Comment #3
Arto commented
Staffan, that's a nice idea and it would be a pretty useful feature, I think.
However, the problem with Boost is that it is already quite complicated to install and get working, for an average Drupal user. Any added complexity would increase the support burden further still, so I'm a bit cautious about adding new features - especially those related to Apache and .htaccess.
If somebody submits a well-thought-out patch (including documentation changes!) that implements gzip support, I will consider it. But I will not likely be implementing this myself.
Comment #4
Arto commented
Comment #5
BioALIEN commented
Subscribing to this. I think this is the next logical step for this module to further "boost" the site. Arto's concerns are valid, of course, but if documentation is in place along with the code, it would make a great additional (optional) feature.
White-space removal would also be of great help on a busy website!
Comment #6
jediyassin commented
I hope this feature will be available soon; it really makes your pages load fast!
Comment #7
pedropablo commented
A big +1 for this feature request.
I think I am the only user who has slower page loads using Boost (although I am very happy with it; it is a precious module for dealing with hosting providers :-))
I have a fairly well-tuned site, and this feature becomes critical for page load times, to the point that Boost loads pages more slowly than the normal cache system just because it lacks gzip compression (you know, you first load the page and then the rest of its components, so the sooner you get the main page, the faster the whole page loads).
I am not very good at programming, but I think I am going to give it a try...
BTW, thank you Arto for this excellent module.
Comment #8
pedropablo commented
Here is a proposal for a very simple way to provide gzip compression.
It would be quite simple to enable compression by changing the file extension from ".html" to, for instance, ".html.gz". In that case, just by adding one line to the boost.api.inc file, we could have compressed files. To make the changes very, very easy, the package could be distributed with 2 versions of .htaccess: one for the standard ".html" extension, and another in case you want to activate compression using ".html.gz" extensions.
The line to be added to boost.api.inc would be inside the function boost_cache_set. Instead of this:
we would use this:
Once this is done, and the file extension has been changed in the admin panel of the module, the .htaccess file should be modified to reflect the ".html.gz" extension and to check whether the requester supports gzip compression:
RewriteCond %{HTTP:accept-encoding} (gzip.*)
And finally, one last modification to add gzip Content-Encoding headers for .html.gz files:
AddEncoding gzip .html.gz
I have done very little testing, but it seems to work. I don't know the internals of the module, so Arto, what do you think?
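For reference, a minimal sketch of the header side of this idea. It is not taken from the patch. It assumes mod_mime is loaded, that mod_headers may or may not be present, and that cached pages end in ".html.gz". It also assumes the server's default configuration maps .gz to a gzip archive type, which is a common reason for a browser to offer the page as a download instead of rendering it.

```apache
# Sketch only: label pre-compressed cache files so browsers decompress them.
<IfModule mod_mime.c>
  # Send "Content-Encoding: gzip" for files with a .gz extension.
  AddEncoding gzip .gz
</IfModule>

<FilesMatch "\.html\.gz$">
  # The payload is HTML, so force that media type rather than whatever
  # type the server would otherwise assign to the .gz extension.
  ForceType text/html
  <IfModule mod_headers.c>
    # Tell caches that the response depends on the Accept-Encoding header.
    Header append Vary Accept-Encoding
  </IfModule>
</FilesMatch>
```

The FilesMatch block keeps the forced type limited to cached pages, so ordinary .gz downloads elsewhere on the site are not affected.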
Comment #9
pedropablo commented
This solution has been working for almost 2 days in the production environment of Cuentos para Dormir, a Spanish site with short children's stories and about 2000 visits a day. It is not drupal.org, but it is a quite acceptable test site.
The gzip compression (and the Boost module) are working fine, and no problems have been reported so far. And, you know, the page loads quite a bit faster... ;-)
Comment #10
asak commented
Pedropablo, where should I insert the line you mentioned into .htaccess?
Is anyone else using this with success?
This sounds interesting.
Comment #11
gansbrest commented
Yes, it would be great to see what exactly needs to be changed in .htaccess. It doesn't work for me for some reason: the browser tries to save the page instead. I think my server doesn't send the gzip headers. Do I have to enable some Apache modules for this to work?
Thanks a lot!
Comment #12
alex s commented
Here is an example of .htaccess for Drupal 6 that will work with pedropablo's patch
Comment #13
gansbrest commented
Great! Works just awesome! Thanks a lot!
Comment #14
rcclarke commented
Interesting fix, although I've noticed something odd about its behavior:
When I run Boost without the compression patch (i.e. the change to boost.api.inc and the associated .htaccess file), it generates the cache files just fine, and when I reference a cached page, Apache serves it up without any noticeable spike in CPU utilization.
However, when I apply the patch (by modifying the .inc file, using the .htaccess file mentioned above and changing the file extension in admin->boost), I notice that whenever a page is referenced (even those that are cached and compressed), Apache CPU utilization spikes to between 20 and 30% while the page is being served.
Any idea what's causing this behavior? It's acting like it's rebuilding and/or compressing the file when it shouldn't need to...
Thanks
Comment #15
rcclarke commented
Figured out the issue... The cache directory structure used in the patch file is slightly different from the default used by Boost. After adjusting a few of the RewriteCond and RewriteRule statements to account for this difference, everything worked fine.
Comment #16
alex s commented
@rcclarke, sorry, I forgot to say that my .htaccess is for Boost 6.x. The cache directories differ between Boost 5.x and Boost 6.x, so your pages were being generated dynamically by Drupal, not taken from the cache.
For 5.x you need to replace /cache/ with /cache/0/ in all RewriteCond and RewriteRule statements.
Comment #17
christefano commented
If it's for the 6.x-1.x-dev version...
Comment #18
damienmckenna commented
Another vote to have this added as an optional feature. I have to get Boost running first, but will then see what it takes to do.
Comment #19
mikeytown2 commented
Ideally this should create two files: one for browsers that support gzip (*.html.gz) and one for those that don't (*.html), just like JavaScript Aggregator does (#290280: Add GZip compression). Can you rewrite an already rewritten request? example.com is forwarded to example.com/cache/example.com/index.html; can that be rewritten to example.com/cache/example.com/index.html.gz? In short, add .gz to the end of the HTML file name.
Comment #20
mikeytown2 commented
Comment #21
mikeytown2 commented
This is what I'm thinking of for a patch for 6.x:
Modify boost_cache_set(). Use Drupal's file_save_data() instead of PHP's file_put_contents(). Mimic a lot of the logic found in javascript_aggregator_preprocess_page(). Each cached page gets 2 versions: normal (*.html) and gzipped (*.html.gz). Let Apache serve either one based on RewriteCond %{HTTP:Accept-encoding} gzip, among some other rules. Include a "gzip boosted html files" checkbox on admin/settings/performance. I'm going to wait until the next dev release, because implementing this should happen after #174380: Remove symlink creation. Let each path have own file is in the 6.x dev and has been tested.
Thoughts?
Comment #22
mikeytown2 commented
Since the /cache dir is excluded via RewriteCond, we can add a C (chain) flag to the end of Boost's rewrite rule (see http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#rewriterule). Then the final Boost rewrite rule is something like
Thoughts?
Skip is another way, and it might be better than chain.
Comment #23
mikeytown2 commented
Just tested, and skip is the correct way to get this done. Above rules when added to the bottom of the Boost part allow for normal operation when .gz files are not present. Next step is to create the gzipped html along with the normal html.
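To make the skip idea concrete, here is a rough sketch. These are not the rules referred to above. The sketch assumes Drupal lives in the document root and uses a hypothetical cache layout of cache/<server name>/<path>/index.html with a matching index.html.gz next to it. It leaves out checks a real setup would need, such as query strings and logged-in users.

```apache
# Sketch only: pick the gzipped or plain cached copy using the skip flag.
<IfModule mod_rewrite.c>
  RewriteEngine on

  # Client did not advertise gzip: skip the next 1 rule (the .gz rule).
  RewriteCond %{HTTP:Accept-Encoding} !gzip
  RewriteRule .* - [S=1]

  # Gzip-capable client and a compressed copy exists: serve it.
  RewriteCond %{REQUEST_METHOD} ^GET$
  RewriteCond %{REQUEST_URI} !^/cache/
  RewriteCond %{DOCUMENT_ROOT}/cache/%{SERVER_NAME}%{REQUEST_URI}/index.html.gz -f
  RewriteRule .* cache/%{SERVER_NAME}%{REQUEST_URI}/index.html.gz [L]

  # Otherwise fall back to the uncompressed copy, if there is one.
  RewriteCond %{REQUEST_METHOD} ^GET$
  RewriteCond %{REQUEST_URI} !^/cache/
  RewriteCond %{DOCUMENT_ROOT}/cache/%{SERVER_NAME}%{REQUEST_URI}/index.html -f
  RewriteRule .* cache/%{SERVER_NAME}%{REQUEST_URI}/index.html [L]
</IfModule>
```

Because the .gz rule is guarded by a file-existence test, a site with no .gz files behaves exactly as it did before: the request falls through to the plain .html rule, and then to Drupal if nothing is cached.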
Comment #24
mikeytown2 commented
Here are my rewrite rules
Comment #25
mikeytown2 commented
Comment #26
mikeytown2 commented
Thinking about this some more: it will be double trouble for people who already have lots of nodes in a folder. Would having a separate gz dir get us back to where we were? #410730: System limits: Number of files in a single directory
cache/example.com/taxonomy/index.html
maps to
cache/example.com/taxonomy/gz/index.html.gz
or
cache/example.com/gz/taxonomy/index.html.gz
or
cache/gz/example.com/taxonomy/index.html.gz
Comment #27
mikeytown2 commented
cache/gzip/example.com/taxonomy/index.html.gz is the right way to do this, I think... It means we need 8 rules instead of the above 5. There should also be a setting on the performance page to turn gzip functionality on or off.
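As an illustration of that layout only, and not the 8 rules mentioned here, a single rule mapping a request onto the separate gzip tree might look like the sketch below. The cache/gzip/<server name>/<path>/index.html.gz structure follows the example path above; Drupal in the document root is an assumption.

```apache
# Sketch only: look for the compressed copy in a separate cache/gzip tree.
<IfModule mod_rewrite.c>
  RewriteEngine on
  # Only clients that advertise gzip support are sent to the gzip tree.
  RewriteCond %{HTTP:Accept-Encoding} gzip
  # Never rewrite requests that already point into the cache directory.
  RewriteCond %{REQUEST_URI} !^/cache/
  # /taxonomy on example.com maps to cache/gzip/example.com/taxonomy/index.html.gz
  RewriteCond %{DOCUMENT_ROOT}/cache/gzip/%{SERVER_NAME}%{REQUEST_URI}/index.html.gz -f
  RewriteRule .* cache/gzip/%{SERVER_NAME}%{REQUEST_URI}/index.html.gz [L]
</IfModule>
```

Keeping the compressed files in their own tree means the number of entries in each existing cache directory stays the same as before gzip support was enabled.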
Comment #28
mikeytown2 commented
Split rules up into 8+1. The 1st rule is to serve the file if it exists on the server. It's a performance thing from #276495: Update for Rewrite Rules - Give Boost even bigger performance gain that's safe. Next step is changing the PHP code so the .gz files get written to their own dir, and making sure that dir gets cleared.
Comment #29
mikeytown2 commented
There were some dumb errors in the above code... They should be fixed now.
RewriteCond %{REQUEST_URI} !(^/admin|^/cache|^/misc|^/modules|^/sites|^/system|^/themes|^/user/login)
vs
RewriteCond %{REQUEST_URI} ^/admin|^/cache|^/misc|^/modules|^/sites|^/system|^/themes|^/user/login
Comment #30
mikeytown2 commented
Here's a patch that contains a gzip enable/disable setting and moves the html.gz files to the cache/gz dir. The .htaccess rules are in boosted1.txt/boosted2.txt. Please test!
Comment #31
mikeytown2 commented
Committed.