Implement fastpath database caching mechanism
matt westgate - August 27, 2006 - 01:07
| Project: | Drupal |
| Version: | x.y.z |
| Component: | base system |
| Category: | bug report |
| Priority: | critical |
| Assigned: | matt westgate |
| Status: | closed |
Description
With a little bit of code shuffling I think we churn out a caching system as fast as, if not faster than a file-based caching approach. Chx has already made a hook to inject an early-staged cache callback, so we ought to be able to establish a database connection and fetch the cache without loading sessions or the modules. Attached is a visual map of where I want to go.
| Attachment | Size | Status | Test result | Operations |
|---|---|---|---|---|
| Page-Caching-Lifecycle-HEAD.pdf | 50.91 KB | Ignored | None | None |

#1
Looking good to me. I'm really interested in this so subscribing now to see updates and post feedback.
#2
This first draft of this patch changes the caching radio buttons from 'disabled' and 'enabled' to 'disabled', 'standard' and 'aggressive'.
Each benchmark below was run with 4 times with
ab -n 1000 -c 20 http://localhost/drupalcvs/and the highs and lows were removed.## Default Drupal Caching
Requests per second: 10.89 [#/sec] (mean)
Time per request: 91.85 [ms] (mean, across all concurrent requests)
Requests per second: 11.37 [#/sec] (mean)
Time per request: 87.92 [ms] (mean, across all concurrent requests)
## Fastpath DB Caching - bypassing init and exit hooks
Requests per second: 15.42 [#/sec] (mean)
Time per request: 64.86 [ms] (mean, across all concurrent requests)
Requests per second: 15.97 [#/sec] (mean)
Time per request: 62.63 [ms] (mean, across all concurrent requests)
## Fastpath File-Based Caching
Requests per second: 10.45 [#/sec] (mean)
Time per request: 95.71 [ms] (mean, across all concurrent requests)
Requests per second: 10.38 [#/sec] (mean)
Time per request: 96.32 [ms] (mean, across all concurrent requests)
#3
So to summarize, it appears the we can handle 50% more requests per second using the aggressive cache mechanism.
#4
Good to see you taking a look at this. Thanks.
I know that this is a first draft, but I think we'll want to document this a little bit better. The aggresive cache is cool but has some drawbacks as the _init and _exit hooks won't run. Depending on what modules you use, this may or may not be a problem. For example, poormanscron will no longer run, node counters will stop working, statistics logging will become inaccurate, etc.
#5
I added developer and end-user documentation.
#6
Slight cleanup with the documentation.
#7
More cleanup with documentation.
#8
My main issue is with the bootstrap code. I wish we could make it a bit more elegant and easier to follow. :-)
#9
Two more details:
#10
Here is a useful snippet for the documentation:
<?phpprint 'The aggresive caching strategy is incompatible with the following modules: '. implode(', ', module_list(TRUE, TRUE));
?>
Needs to be Drupal-ised but it tells the site administrator exactly what modules conflict with the aggressive caching.
Does disabling these modules render the aggressive mode useless? If so, we could just as well advise people to disable these?
#11
Nope. The performance boost also comes from not needing to load common.inc and all the enabled modules, which is done when bootstrap_invoke_all('init') is called.
Attached is a new patch which attempts to make _drupal_bootstrap() a little more elegant and the documentation a little more useful.
#12
I am surely blind but if you call your cache before loading the session how do you know whether the user is anonymous or not?
#13
From a usability perspective, if a user has modules installed that implement init and exit hooks, shouldn't we set the advanced setting to "disabled" so they *can't* click it?
Reason being, if they have modules enabled that are implenting these hooks, going with aggressive caching will obviously break these in some sort of form. Don't want someone accidently turning this on. Similar to how we prevent users from turning on clean urls.
#14
subscribing
#15
Chx is right. I'm loading the session information now.
Statistics module, devel module and throttle module are still useful mods even if they are running impaired for anonymous users. I hope that providing documentation that informs users of the risk and then telling them which modules on their site are problematic is a viable middle ground?
#16
Oh, I forgot to mention. I removed the DRUPAL_BOOTSTRAP_EARLY_PAGE_CACHE phase since we could invoke the fastpath hook in the DRUPAL_BOOTSTRAP_CONFIGURATION phase. There's even a slight performance gain here since we don't even need to include the settings.php file for a fastpath cache. That's Drupal running off of only 3 files folks!
#17
Last patch was against an outdated HEAD.
#18
Updated patch. Fix the description on the cache settings page.
It will now show in the color green if you can safely enable expert mode. If not, it's shown in red. I cleaned up the wording quite a bit. Also, I changed the title on the page to "cache mode" instead of "cache strategy". Minor other tweaks to the interface.
#19
Matt, you and Jeremy created the first fastpath module (fastpath_fscache) how did you solve the session problem there? I can see cache_get called from the fastpath function and global $user; in there. I think that's also broken.
When creating the EARLY_PAGE_CACHE step, I was looking at Jeremy's stuff and how he used a
$_COOKIE['drupal_uid']. I still think that's a better -- even if a bit clunky -- solution for certain environments than creating a database connection.So -1 for removing EARLY_PAGE_CACHE.
#20
Hm, I did not really grok you are now calling fastpath from config. That's even worse than I thought.
First of all, forget variable_get before you load the setting.php. variable_get uses the $conf in settings.php. Second, configuration must be a separation step if you want Drupal to be reusable. I am not even sure whether install does not break with this bold step.
#21
I benchmarked this patch and it gives me a 55% performance improvement.
The latest patch got me to think: maybe we should remove the 'expert mode' radio button, and automatically switch to 'expert mode' when there are no conflicts ... That is, hide it all from the user. :)
#22
Dries I thought about that too. However, if you do use expert mode it if works--what happens when a user then installs statistics, poormanscron, etc... and then notices that the site is behaving a bit odd.
We *shouldn't* automatically give it to them--then they blame Drupal and us :-) Rather, we give them all the warnings and the green light, but ultimately let them make the decision. If it breaks, well they shouldn't have taken the green pill ;-)
#23
subscribe ... there are module enable/disable hooks now so it does seem possible to automatically disable/enable this mode modules chnaged. unfortunately, it seems difficult to know when a user wants to keep expert enabled even though he has devel or statistics enabled. maybhe the UI is more like a checkbox to 'force enable aggressive cache'.
i haven't closely reviewed the logic here yet.
#24
Moshe, that is why I updated the patch-- you can still enable expert mode even if it affects some modules. However, there is a warning in nice bold red text that says this is not advised. If you are good to go, you see some bold green light text saying you have nothing to worry about.
I think that makes the most sense--users are free to do what they want, we provide the necessary information to let them know if it is a good or bad decision.
#25
Shucks. chx is correct, which makes 4 files Drupal needs to include for a fastpath cache. Now that's one bloated CMS! * kidding *
I think it's a bad idea to hide the expert option when there are module conflicts. Modules can still be useful even if they may no longer work for anonymous users, such as statistics module.
New patch addresses recent issues. I'm glad to see that Dries validated my benchmark numbers.
#26
Marking back for review.
#27
mav3rck: what I mean is this. If someone enables caching, Drupal will automatically try to use the aggressive path. If, however, the administrator enables a 'conflicting module', Drupal will automatically switch back to normal caching. In other words: an automated approach without extra user interface.
To check if there are conflicting modules, simple use:
<?phpif (module_list(TRUE, TRUE)) {
// conflicting modules
}
else {
// no conflicting modules
}
?>
I think this approach is beneficial for various reasons: it will always behave correctly (administrators won't think their newly enabled modules are broken when aggressive caching is on), it automatically optimizes performance and there is less documentation to read. :)
#28
Dries, what if enable Devel and Statistics module but I *know* how they break and are ok with the breakage, since I know what is going on. I don't Drupal to turn off aggresive caching now for me :-/
#29
Just a note on this - as an enthusiastic drupal user but relative newbie, I'd rather have control over "aggressive" mode, with the warning, than have the decision made for me.
Two reasons:
For a start, there are things that devel does (clear cache, php info, showing you query information for each page you view (like those 600 on my default forums page with 50 subforums!), probably other stuff I've not got to yet) that wouldn't be affected by the caching to anonymous users if I've got it right. Why disable aggressive caching only because of query logging or whatever when people might be using devel for other reasons anyway?
If statistics module with aggressive caching only collected stats for logged in users, then it could also be useful for comparing page views or referrals for logged in and anonymous in users against external statistics software like google analytics/awstats etc. - google/awstats won't tell you how many users who visited your site were logged in or anonymous, this sort of would - even if it's unpredictable it's better than nothing.
#30
Variable init simply does not belong to the database step -- what if I write code outside of Drupal and want to simply reuse the DB layer? And cache in session, what if I write a little script which serves file thus boots up until the session, what does cache do there? Add more steps if you want or simply move your cache from session to the beginning of late cache, it looks belonging there. This solves your variable init problem as well as that can move back to late cache.
Otherwise, it's a very promising patch.
#31
variable_get to the rescue. Dries can have his automatic switching to non-aggressive -- this will result in the variable being non set to CACHE_EXPERT. m3avrck being the advanced user can force the setting to CACHE_EXPERT in his $conf in settings.php.
So, let's do the automatic stuff.
#32
New patch in response to chx's concerns, which makes sense. I did not add in the automatic enabling of the expert cache mode. If we chose to do that, we ought to inform the user this change has been made. Also, we won't be able to catch every instance of an interfering module. There might be modules that run code everytime their loaded outside of the init and exit hooks. This might be a bad coding practice on the module developer's part, but we should take it into consideration.
#33
flipping the switch
#34
The patch is looking better and better. However, weird stuff is going on with the latest version:
- Without the patch, I can serve 101 requests/second.
- With the patch, and with the caching strategy set to 'recommended', I can serve 172 requests/second (?).
- With the patch, and with the caching strategy set to 'aggressive', I can serve 151 requests/second.
It is strange that I can serve that many pages with the strategy set to 'recommended'.
Update: turns out I get a PHP error with the strategy set to 'recommended'.
#35
Move the cache.inc inclusion down from the database step to the late_page_cache step where it belongs.
I start to dislike the _drupal_cache_init function. I can see three totally distinct function named one for no real good reason. Down with spaghetti!
#36
We do not need to take into consideration that a module can load code outside of function definitions. Our motto is: "we do not babysit broken code". And that kind of code is seriously fubar.
#37
I'm feeling good about this patch and fixed the PHP issue. I also ran the benchmarks again to see if it's still worth adding this feature (after killes' cache splitting patch)
BEFORE THIS PATCH, W/ CACHING
Requests per second: 11.15 [#/sec] (mean)
W/ PATCH, CACHE_RECOMMENDED
Requests per second: 11.01 [#/sec] (mean)
W/ PATCH, CACHE_EXPERT
Requests per second: 17.34 [#/sec] (mean)
I was trying to figure out why this patch makes slows default caching down a bit, but nothing really jumped out me. Probably just the extra PHP logic and all in all I don't think it's worth worrying about. It's safe to say expert caching mode is a considerable performance boost and worth adding to Drupal.
#38
Committed a modified version of your patch to CVS HEAD.
#39
Eh, with this patch, I can't install Drupal anymore because of this one line change in bootstrap.inc:
- foreach ($GLOBALS as $key => $value) {
+ foreach ($GLOBALS as $key) {
#40
#41
Committed to HEAD.
#42