This is stuff that may never happen, but it's an idea generator.

  • Auth Cache via Ajaxify Regions
  • Split Boost into many different modules: Core, Crawler, UI, etc.
  • Other Ideas?
Comment  File                                 Size      Author
#17      boost-message-to-specific-roles.png  24.73 KB  Francewhoa

Comments

Froggie-2’s picture

Thanks to the hard work of Mikey, Boost has become an awesome module.

However, overcoming the system limit of 40000 files per folder is a necessity for large sites. Drupal core creates pages as "nodes". It would be nice if Boost provided an option to create multiple sub-directories within the cache/nodes directory during installation, so that each sub-directory can hold up to 40000 cached pages. For example, the first 40000 nodes get stored in sub-directory "node-A", the next 40000 in sub-directory "node-B", and so on.
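
As a rough illustration of the bucketing described above, here is a minimal sketch. The function name and the spreadsheet-style lettering past "node-Z" are assumptions made for the example; nothing like this exists in Boost.

```php
<?php
// Hypothetical helper, not part of Boost: map a node ID to a cache
// sub-directory so that no directory holds more than $per_dir cached pages.
function example_boost_bucket_dir($nid, $per_dir = 40000) {
  if ($nid < 1) {
    throw new InvalidArgumentException('Node IDs start at 1.');
  }
  // Zero-based bucket number: nids 1..40000 -> 0, 40001..80000 -> 1, ...
  $bucket = intdiv($nid - 1, $per_dir);
  // Turn the bucket number into letters the way spreadsheets name columns:
  // 0 -> A, 25 -> Z, 26 -> AA (an assumption; the post only names A and B).
  $label = '';
  do {
    $label = chr(ord('A') + $bucket % 26) . $label;
    $bucket = intdiv($bucket, 26) - 1;
  } while ($bucket >= 0);
  return 'node-' . $label;
}
```

With the defaults, node 40000 lands in "node-A" and node 40001 in "node-B". The sketch uses intdiv(), so it needs PHP 7 or later.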

Just my 2 cents...

mikeytown2’s picture

@Froggie
The limit is 32k directories, and Apache starts to slow down around 200k nodes in one dir. If one uses a URL alias for the path then this limit is effectively removed. See openjurist http://drupal.org/node/546834

Froggie-2’s picture

Thanks Mikey,
Is there a Drupal-based module for converting node-type URLs to URL aliases?
Thanks again

mikeytown2’s picture

@Froggie
Enable the Path module, it's part of core. Pathauto is also helpful.

Froggie-2’s picture

Thanks Mikey, I'm grateful to you for all the help....

404’s picture

boost is awesome!

please improve the help msg on boost setting page.

mikeytown2’s picture

@404
can you give me some suggestions? also take a look at this
http://drupal.org/node/545908

Should I use something like http://drupal.org/project/advanced_help?

dbeall’s picture

just passin by with a few notes.
Can't say it enough, Thank You mikeytown2 for your patience, persistence, goals, knowledge and know-how. You have brought Boost from being a street rod to a Dragster.
The handbook will get updates soon.. as I am installing RC-4 on 6 sites tonight, 2 different live server setups (1 shared and 1 managed-&-shared VPS) and wamp-xp. There have been so many adjustments, things have changed just a bit.
Agree with some better help text for the non-wizard Drupal people. hmmm, Maybe I will make a list of newbie questions or comments on each item as I go through the setup again. I see the future(lol), and there will be many non-php-server-wizards flocking to Boost, yep, we need this same as we need xmlsitemap.

dbeall’s picture

always forget something.. Advanced help is neat for the people that have installed it, many people don't know about it. Have noticed on many modules it will display the readme and that is important too. Easy to understand text for each setting on performance pages will help more people(my opinion), assuming a new Drupal user(like me) is at the controls.

mikeytown2’s picture

Doing this is a good idea for boost 1.0; as such I made a new issue for it #565796: Better Explanation of Settings and added it to the roadmap.

crea’s picture

Abstract cache storage so it can be swapped.
Then implement Nginx Memcached storage. Nginx can fetch prerendered pages straight from Memcached, firing up php-fcgi in case of cache miss. That way in most cases anonymous visitors get content with blazing speed, and it also removes the need for reverse proxies such as Varnish (getting content from Varnish or Nginx is comparable in terms of speed). Also working with Memcached is much easier, in particular because you don't need to implement tricky filepath.

mikeytown2’s picture

@crea
With stream wrappers in core (7.x), that would be a lot simpler to implement... at least I think it would be. Either way, good idea.

redox’s picture

Component: Miscellaneous » Expiration logic

Expiring per path or node type
One of the websites I maintain could have some paths expiring in 1 year, but views in particular cannot take more than 1 hour to expire.
It would be great for websites where nodes are not updated that often but views are updated very often.

Just an idea..

Thanks for that great module

mikeytown2’s picture

Component: Expiration logic » Miscellaneous

@redox
you can already do that. It's not per directory but by content type. Enable the Boost: Pages cache configuration block; you can set all views to expire in 1 hour.

Francewhoa’s picture

@mikeytown2: Here are four more ideas.

  1. Option to store boost HTML static files on remote server. We would have two options: save static files to local server or save static files to remote server.

    Local cache file relative path is currently cache/www.example.com
    Remote cache file path would be an absolute path. Something like http://www.your-remote-site-domain-name-here.com/your-folder-name-here/cache/www.example.com

    Why storing static files on remote server? Save bandwidth, increase scalability and performance.

  2. A 'lock down' button. This would basically prepare your site for a heavy digging or slashdotting. It locks down the static cache files and doesn’t delete them when a page is updated or a new comment is made. For example, just before sending a large mailing to your subscribers you click on the 'lock down' button. Then, as expected, the site receives heavy incoming traffic. 2 or 3 hours later the traffic is back to normal, so you click on the same button again, which this time reads something like 'unlock'. Then the site is back to normal mode. In this scenario it would be easier and faster to simply click one 'lock down' button as opposed to doing a few clicks to change Boost settings.
  3. An optional automatic 'lock down mode' for every page on your site. This significantly lowers the load on a busy server with lots of traffic and comments. When activated, something like a message or a block is displayed to specific roles such as Admin or Authors. The message would say something like: "The X site is currently experiencing heavy traffic. The automatic lock down mode has been activated. Static cached pages will be served to anonymous visitors during this period. Deactivate automatic lock down mode [with link to settings page]." Maybe the automatic 'lock down mode' could be activated by the Drupal core throttle module.
  4. Automatically back up your .htaccess file before installing the module. First the .htaccess file is copied to the cache/backups/htaccess-files/ folder. Then the file is renamed to something like .htaccess-2009-sept-10-02-39-55 (Y-M-D-Hour-Minute-Second). Easy is good.
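
The .htaccess backup in item 4 could be sketched roughly as follows. Both function names are hypothetical and nothing here is existing Boost code. The sketch also assumes a numeric month in UTC (2009-09-10 rather than 2009-sept-10) so that backup names sort chronologically.

```php
<?php
// Hypothetical helper, not part of Boost: build a timestamped backup name
// in Y-M-D-Hour-Minute-Second order, using UTC and a numeric month.
function example_htaccess_backup_name($timestamp) {
  return '.htaccess-' . gmdate('Y-m-d-H-i-s', $timestamp);
}

// Copy $source into $backup_dir under a timestamped name.
// Returns the path of the backup, or FALSE if anything fails.
function example_backup_htaccess($source, $backup_dir, $timestamp) {
  if (!is_file($source)) {
    return FALSE;
  }
  // Create the backup folder (and any missing parents) on first use.
  if (!is_dir($backup_dir) && !mkdir($backup_dir, 0755, TRUE)) {
    return FALSE;
  }
  $target = rtrim($backup_dir, '/') . '/' . example_htaccess_backup_name($timestamp);
  return copy($source, $target) ? $target : FALSE;
}
```

A call such as example_backup_htaccess('.htaccess', 'cache/backups/htaccess-files', time()) would be made once, before the module touches the file.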

Source for last 3: WP Super Cache

mikeytown2’s picture

A 'lock down' button. This functionality is the same as
admin/settings/performance/boost
Ignore cache flushing:
"Ignore All Delete Commands (Not Recommended)"
Correct?

An optional automatic 'lock down mode'
Works the same as Site maintenance mode; displays a message.
Correct?

Option to store boost HTML static files on remote server
Very similar to what crea requested #11. Stream wrappers is probably how to do this.

Automatic backup your .htaccess file before installing the module.
Another great idea.

Francewhoa’s picture

@mikeytown2: Yes, correct. I just looked at 'Ignore cache flushing: Ignore All Delete Commands (Not Recommended)' and it's the same as the 'lock down' button. The 'Ignore All Delete Commands' option would not automatically turn on CSS & JS caching if enabled, though. Here is another idea: adding a fifth option under 'Ignore cache flushing:' that would read 'Ignore All Delete Commands. And turn on CSS & JS caching'. The point is to have only one button triggering a 'lock down' process. Now it would be more of a usability improvement, not a new feature.

That's good news about the Stream wrappers. I learned something new.

As for the automatic 'lock down mode', it would not be the same as Site maintenance mode. It would be the same as the 'lock down' button, but automated. Sorry for the confusion about the message thing. To clarify the message's purpose: it would be displayed when the automatic 'lock down mode' is activated, but only to some specific roles, such as logged-in Admins and some Content Authors. Anonymous users wouldn't see this message. The point of the message is to let some roles know that the automatic 'lock down mode' is currently activated, so they won't expect their new content to be available during an automatic 'lock down mode'. See the attached mockup to clarify. Another colour would be more appropriate. Red is usually for error messages. Maybe yellow?

I would be happy to contribute testing.

mikeytown2’s picture

@Onopoc
Another idea is to have minimum & maximum cache expiration times. Max would go into effect for cron cache expiration and minimum for nodeapi & comments, etc.
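
One possible reading of the minimum/maximum idea, sketched as a standalone function. The name, the parameters, and the exact rule are assumptions for illustration and not Boost's actual expiration logic: cron only flushes files older than the maximum, while content events such as a node save or a new comment only flush files older than the minimum.

```php
<?php
// Hypothetical decision helper, not Boost's actual expiration logic.
// $age, $min_lifetime and $max_lifetime are in seconds; $trigger is 'cron'
// for cron runs and anything else for content events (nodeapi, comments).
function example_should_flush($age, $min_lifetime, $max_lifetime, $trigger) {
  if ($trigger === 'cron') {
    // Cron only removes files that have outlived the maximum lifetime.
    return $age >= $max_lifetime;
  }
  // Content events flush a file only once it is older than the minimum
  // lifetime, which caps how often a busy page can be regenerated.
  return $age >= $min_lifetime;
}
```

With a 60 second minimum, a page that receives several comments per minute would be rebuilt at most once a minute instead of on every comment.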

You're talking about tighter integration with something like throttle. The only thing is, throttle doesn't get an accurate user count when the site is boosted. It would have to be a manual button; call it the panic/lockdown button or something. It would enable everything, copy htaccess rules if not in place, etc., allowing for fast deployment in short.

#575080: Retroactive CSS/JS cache

mikeytown2’s picture

For stream wrappers, try replacing the cache dir with a stream... I think it will work; http & ftp are built into php.

admin/settings/performance/boost
Boost directories and file extensions
Cache Dir:

http://php.net/stream.streamwrapper.example-1

mikeytown2’s picture

streamwrapper issue: #579716: Stream Wrapper Example

mikeytown2’s picture

Issue tags: +2.0

adding 2.0 tag

404’s picture

> # [x] Enable the cron crawler
> Pre-cache boosted URL's so they get cached before anyone accesses them.

Does this mean when cron runs, boost will crawl the site and prepare for the cached files? If that's the case, when I click cron manually, nothing happens.

The preemptive cache and crawling is an interesting idea, but I just don't know how it works. Would be nice if the boost documentation page could expand on that.

> An administrative interface for pre-generating static files for all pages
> on the Drupal site in one go using the Batch API.
> http://drupal.org/node/546134

This certainly will solve my question.

MORE ON BOOST MODULE:

I don't want anyone to draw the conclusion that boost is difficult to use from my previous comment. It's not!

The performance gain with boost is extraordinary. I have it on all of my sites. I am not savvy enough to use memcached and varnish and nginx. Boost saved the day (saved my server, saved my money,... I use it on work, too, so in a way, helped to save my job! :).

It is poorman's apc/memcached/varnish/nginx/other/crazy/caching/schemes/. It's a drop-in solution for 90% of "drupal is slow" problem.

Froggie-2’s picture

Would be nice to have a feature where Boost can act as a standalone CDN (Content Delivery Network) module. With Boost installed on a powerful main server to constantly crawl and churn out static pages which can then be transferred by ftp to other server or servers (depending on the requirement and geographic location) for speedy display of content via secondary servers.
Just my 2 cents .....

Francewhoa’s picture

The obvious feature request: Improve performance and scalability with caching for authenticated users / log-in users.

xmarket’s picture

Reverse proxy support would be a good feature.

_paul_meta’s picture

it would be great to have the option to only cache certain node types - eg, cache pages and blogs but not some custom nodes which are changing more regularly.

cheers :)

mikeytown2’s picture

@_paul_meta
You can do that right now using the "Statically cache specific pages:" - "Cache pages for which the following PHP code returns TRUE (PHP-mode, experts only)." setting. The code is exactly the same as the blocks code so you can follow this guide exactly and it will work.
http://drupal.org/node/115419

So instead of this article being called "Show a block depending on node type and node id" if using it with this boost setting you can call it "Cache a page depending on node type and node id".

example from that handbook page

  // Only show if $match is true
  $match = false;

  // Which node types
  $types = array('book', 'news', 'anothernodetype' );

  // Match current node type with array of types
  if (arg(0) == 'node' && is_numeric(arg(1))) {
    $nid = arg(1);
    $node = node_load(array('nid' => $nid));
    $type = $node->type;
    $match |= in_array($type, $types);
  }

  return $match;

Modified code

  // Only cache if $match is true
  $match = FALSE;

  // Which node types
  $types = array('book', 'news', 'anothernodetype' );

  // Match current node type with array of types
  if (arg(0) == 'node' && is_numeric(arg(1))) {
    $nid = arg(1);
    $node = node_load(array('nid' => $nid));
    $type = $node->type;
    $match = in_array($type, $types);
  }
  // Page is not a node, cache it
  else {
    $match = TRUE;
  }

  return $match;

_paul_meta’s picture

very helpful reply, thanks!

ddorian’s picture

so i read this: http://www.askapache.com/web-hosting/super-speed-secrets.html
and now i see this issue.
store all html files in TMPFS(which is like a filesystem in RAM). it will be crazy fast. just read the article

mikeytown2’s picture

Very interesting... just from the title, sounds like you just mount the cache folder in a ramdisk. Poormans varnish... very interesting. Thanks for the link!

ddorian’s picture

yes, but it is different from a ramdisk (in a better way), because a ramdisk is a fixed size but tmpfs can increase and decrease in size.
Or maybe it can be done with memcache? Since it stores cache tables in memcache bins, one could use Drupal's page caching and store that table in memcache.
From the article:
If I had to explain tmpfs in one breath, I’d say that tmpfs is like a ramdisk, but different. Like a ramdisk, tmpfs can use your RAM, but it can also use your swap devices for storage. And while a traditional ramdisk is a block device and requires a mkfs command of some kind before you can actually use it, tmpfs is a filesystem, not a block device; you just mount it, and it’s there. All in all, this makes tmpfs the niftiest RAM-based filesystem I’ve had the opportunity to meet. So when you reboot you can restore the files.

crea’s picture

If a PC has free RAM files will be in kernel file cache anyway, so the whole ramdisk idea is stupid IMO. And if a PC doesn't have free RAM that means you are going to swap -> bad system setup.

Francewhoa’s picture

Two suggestions to improve Boost's usability.

Adding new rules generation: 'Multi-sites Boost htaccess rules generation' http://drupal.org/node/800408

Adding link to Advanced Help page http://drupal.org/node/800358

Vacilando’s picture

@mikeytown2 re #27 ... with respect to "Cache pages for which the following PHP code returns TRUE" -- I need something similar.

Is there a way to specify a different expiration period for pages of a certain type (or where PHP returns TRUE, etc.)?

On some sites I want some usual expiration (say 6 or 12 hours) for pages and stories, but up to several weeks for more static content. Is there a way to do this, even if programmatically?

Thanks.

mikeytown2’s picture

@vacilando
there is no PHP code box for that but there is a configuration block
"Boost: Pages cache configuration"
It allows different settings per content container (node vs view), content type (node page vs story) & node ID (2 vs 4).

Just be aware that this clears the cache of the affected level so the new expiration time takes effect. I'm thinking of ways to fix this and I think a creation timestamp in the database will do the trick.

Vacilando’s picture

@mikeytown2 -- thanks, indeed the Pages cache configuration was the one thing I never noticed with Boost. Even after you mentioned it I read too quickly and went looking for the setting in the main Boost config page.

But it is a block. And that has to be allowed for some pages. If it's allowed for a page, it allows changing cache for that page, for the content type page, or for the node. Similar for a view.

But for the view it did not work for me properly. As soon as I set the cache to apply for the whole view (leaving the first 2 dropdowns at default), cleared all caches, went to another browser as anonymous, reloaded twice, well, the Boost signature was: <!-- Page cached by Boost @ 2010-07-20 08:39:42, expires @ 2010-07-20 08:39:41 -->, meaning no caching. All the while I had the expiration time at "default", here meaning 12 hours. As soon as I deleted the config and disabled the block, cleared caches etc., the expiration is again OK 12 hours ahead from present.
So much for a little bug report (for 6.x-1.18).
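
A quick way to spot this symptom is to read the lifetime straight out of the Boost signature. The sketch below is a hypothetical checker, not part of Boost, and it assumes the signature keeps exactly the format quoted above, with both timestamps in the same time zone.

```php
<?php
// Hypothetical checker for the HTML comment Boost appends to cached pages.
// Returns the lifetime in seconds (expires minus cached), or NULL when no
// signature in the expected format is found.
function example_boost_signature_lifetime($html) {
  $stamp = '(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d)';
  $pattern = '/<!-- Page cached by Boost @ ' . $stamp . ', expires @ ' . $stamp . ' -->/';
  if (!preg_match($pattern, $html, $m)) {
    return NULL;
  }
  // Assumes both stamps share one time zone, so parsing them as UTC keeps
  // the difference exact even though the real zone is unknown.
  return strtotime($m[2] . ' UTC') - strtotime($m[1] . ' UTC');
}
```

For the signature in this report the function returns -1, meaning the page expired one second before it was cached. A healthy 12 hour setting would give 43200.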

My main point is, and it fits here, hopefully, in the discussion about 6.x-2.x features. The block (if it works) is perfect for tweaking individual page caches. But for content types and for various views, it would be lovely to have a (collapsed) section with settings on the main Boost config page (/admin/settings/performance/boost). I doubt people ever have hundreds of content types and views on one single site, so it could be a list of content types with the time expiration dropdowns next to them in the row (HTML, XML, JSON), and below also a list of all views like that.
What do you think?

Alternatively/in addition to this, isn't there a snippet of code I could use for programmatically specifying the name of a view and the time all its pages should be cached? (Trying to find a solution for the immediate need...)

Thanks a lot.

mikeytown2’s picture

@vacilando
Part of 2.x is to split up boost into various independent modules. What you're talking about is some sort of "Expiration Grid"
http://drupal.org/node/622820#comment-2956586
BTW this module does everything but views integration for the varnish module. As of right now I'm trying to stay away from the database as much as I can, since on some setups the boost_cache table can cause issues due to all the writes to it. I've got some ideas on how to make it much better behaved.

EDIT: looks like you have already seen that link, lol

doublejosh’s picture

Hope it's still ok to post here, perhaps 7 is plenty to think about, also hello...

I've been thinking about trying a kind of 'remote caching' where I generate my cache files on another restricted mirror server.

On a regular basis: rsync files, code, and wrap up and push data to the second server. Run cron through all the page generation then sync back only the cache files, letting the live server only ever serve that with perhaps a few AJAX blocks, un-cached roles, stats gathering, etc.

Imagine this is how large catalog type sites (without much user interaction) might regenerate their site from time to time when the design is updated.

Suppose this all might be doable with a fairly simple shell script... but I wanted to discuss it here.

This is kind of the opposite approach to simple HTML header/footer with JSON/AJAX page content and generating most DOM elements on the fly.

Francewhoa’s picture

Adding support for internationalization/multilingual http://drupal.org/project/i18n

For example Boost interface could be translated in other languages. To do so contributors could simply create .po files.

ddorian’s picture

I have a working site with i18n and boost and it works: http://tinyurl.com/4qcb8nc
And the boost interface can be translated with localize.drupal.org

Francewhoa’s picture

Thanks ddorian

@All: If you want to contribute Boost interface translation go to http://localize.drupal.org/translate/projects/boost

digi24’s picture

I would really like to see more hooks in boost (and less complexity in the core boost module).

1. Hooks

One specific example that would be extremely useful for me:
Move the boost_settings_database functions into a sub-module and call a hook for settings. This way modules can easily implement lifetimes for their content, keeping logic simple.

The boost_settings_db can remain almost unchanged, just with the difference that it answers to a hook.

Only question to be solved, whether to sort the results by minimum lifetime or some kind of weight.
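
To make the two options concrete, here is a minimal sketch of how collected hook answers might be resolved. The function name and the 'lifetime'/'weight' array keys are assumptions for illustration; no such hook exists in Boost today.

```php
<?php
// Hypothetical resolver: $answers is what a settings hook might collect,
// one array per module with 'lifetime' (seconds) and 'weight' keys.
function example_resolve_lifetime(array $answers, $strategy = 'min_lifetime') {
  if (empty($answers)) {
    return NULL;
  }
  if ($strategy === 'weight') {
    // Lowest weight wins, like Drupal's usual "lighter floats up" ordering.
    usort($answers, function ($a, $b) {
      return $a['weight'] - $b['weight'];
    });
    return $answers[0]['lifetime'];
  }
  // Default: the shortest lifetime wins, so no module's content goes stale.
  return min(array_column($answers, 'lifetime'));
}
```

Minimum lifetime is the safer default because no module can accidentally extend another module's cache time. Weights give site builders explicit control but need a tie-breaking rule when two modules share the same weight.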

2. Tables

The same applies to tables. For large sites the boost table can grow extremely big, making it hard to use an optimal database format, and there are keys for everything. Instead it would be nice to have a lean central table that easily fits into memory and separate lookup tables for stuff that only needs to be looked up (paths, URLs).

3. Goodies

Boost is full of "Goodies", some gzipping here, some noderelations there, multisite rewriting, mobile rewriting... Please keep the core module focussed, if you had hooks you could let the respective modules take care of all the integration.

Besides that, great module, great work, could not live without it.

bgm’s picture

Title: Ideas for boost 6.x-2.x » Ideas for boost 7.x-2.x
Version: 6.x-1.x-dev » 7.x-1.x-dev