Implement fastpath database caching mechanism

matt westgate - August 27, 2006 - 01:07
Project:Drupal
Version:x.y.z
Component:base system
Category:bug report
Priority:critical
Assigned:matt westgate
Status:closed
Description

With a little bit of code shuffling I think we churn out a caching system as fast as, if not faster than a file-based caching approach. Chx has already made a hook to inject an early-staged cache callback, so we ought to be able to establish a database connection and fetch the cache without loading sessions or the modules. Attached is a visual map of where I want to go.

AttachmentSizeStatusTest resultOperations
Page-Caching-Lifecycle-HEAD.pdf50.91 KBIgnoredNoneNone

#1

m3avrck - August 27, 2006 - 01:27

Looking good to me. I'm really interested in this so subscribing now to see updates and post feedback.

#2

matt westgate - August 27, 2006 - 05:06

This first draft of this patch changes the caching radio buttons from 'disabled' and 'enabled' to 'disabled', 'standard' and 'aggressive'.

Each benchmark below was run with 4 times with ab -n 1000 -c 20 http://localhost/drupalcvs/ and the highs and lows were removed.

## Default Drupal Caching

Requests per second:    10.89 [#/sec] (mean)
Time per request:       91.85 [ms] (mean, across all concurrent requests)

Requests per second:    11.37 [#/sec] (mean)
Time per request:       87.92 [ms] (mean, across all concurrent requests)


## Fastpath DB Caching - bypassing init and exit hooks

Requests per second:    15.42 [#/sec] (mean)
Time per request:       64.86 [ms] (mean, across all concurrent requests)

Requests per second:    15.97 [#/sec] (mean)
Time per request:       62.63 [ms] (mean, across all concurrent requests)


## Fastpath File-Based Caching

Requests per second:    10.45 [#/sec] (mean)
Time per request:       95.71 [ms] (mean, across all concurrent requests)

Requests per second:    10.38 [#/sec] (mean)
Time per request:       96.32 [ms] (mean, across all concurrent requests)

AttachmentSizeStatusTest resultOperations
fastpatch-dbcache.patch8.25 KBIgnoredNoneNone

#3

matt westgate - August 27, 2006 - 05:12
Status:active» needs review

So to summarize, it appears the we can handle 50% more requests per second using the aggressive cache mechanism.

#4

Dries - August 27, 2006 - 07:33

Good to see you taking a look at this. Thanks.

I know that this is a first draft, but I think we'll want to document this a little bit better. The aggresive cache is cool but has some drawbacks as the _init and _exit hooks won't run. Depending on what modules you use, this may or may not be a problem. For example, poormanscron will no longer run, node counters will stop working, statistics logging will become inaccurate, etc.

#5

matt westgate - August 27, 2006 - 17:15

I added developer and end-user documentation.

In order to reduce server load and save bandwidth, Drupal stores and sends compressed cached pages requested by "anonymous" users. By caching a web page, Drupal does not have to create the page each time someone wants to view it. The 'standard' cache option is suitable for most sites without any side effects. 'Aggressive' caching is twice as fast since it skips the loading of enabled modules when serving a cached page. For most modules this is not a problem, however some modules such as statistics and throttle modules will lose a part of their functionality since they track information regarding anonymous user visits and are now no longer able to do so. Other contributed and custom modules may also be affected.

AttachmentSizeStatusTest resultOperations
fastpatch-dbcache_0.patch9.27 KBIgnoredNoneNone

#6

matt westgate - August 27, 2006 - 17:19

Slight cleanup with the documentation.

AttachmentSizeStatusTest resultOperations
fastpatch-dbcache_1.patch9.27 KBIgnoredNoneNone

#7

jvandyk - August 28, 2006 - 02:17

More cleanup with documentation.

AttachmentSizeStatusTest resultOperations
fastcache.patch9.21 KBIgnoredNoneNone

#8

Dries - August 28, 2006 - 07:31
  • Can't we tidy up the bootstrap process a little bit more? It feels a little bit "slapped on" right now. I wouldn't mind to massage the bootstrap process a little to make this a tad more elegant. (Not sure if it is possible at all.)
  • Even with the eplanation, I don't think people will be able to predict the impact of 'aggressive mode'. For example, node counters will stop working. I bet many people won't be able to tell. Maybe we should add 'recommended, no side-effects' after 'standard', and 'expert mode, possible side-affects' after 'aggressive'? I think we should advise against using 'aggressive' caching.
  • I would avoid saying 'twice as fast'. It depends on the modules you shortcut, your LAMP-configuration, etc.
  • To make the form descriptions a little bit shorter, maybe we should move the 'introduction into the page cache' (currently under the 'Page cache'-setting) to the top of the page (i.e. make it a regular help text)?
  • My main issue is with the bootstrap code. I wish we could make it a bit more elegant and easier to follow. :-)

#9

Dries - August 28, 2006 - 07:36

Two more details:

  1. The title of the administration page reads 'page cache', and the title of the first configuration option is 'Page caching:'. It looks kinda dull, don't you think? It is a redundant repetition. I suggest we rename the title of the form element to 'Caching strategy:', or something a bit more descriptive.
  2. The sentence "Enabling the cache will offer a sufficient performance boost for most low-traffic and medium-traffic sites." is redudant (people already figured that, hopefully) and should either be removed, or moved to the "introduction" at the top of the page (see previous comment).

#10

Dries - August 28, 2006 - 07:48

Here is a useful snippet for the documentation:

<?php
print 'The aggresive caching strategy is incompatible with the following modules: '. implode(', ', module_list(TRUE, TRUE));
?>

Needs to be Drupal-ised but it tells the site administrator exactly what modules conflict with the aggressive caching.

Does disabling these modules render the aggressive mode useless? If so, we could just as well advise people to disable these?

#11

matt westgate - August 29, 2006 - 00:50

Does disabling these modules render the aggressive mode useless?

Nope. The performance boost also comes from not needing to load common.inc and all the enabled modules, which is done when bootstrap_invoke_all('init') is called.

Attached is a new patch which attempts to make _drupal_bootstrap() a little more elegant and the documentation a little more useful.

AttachmentSizeStatusTest resultOperations
fastpatch-dbcache_2.patch11.52 KBIgnoredNoneNone

#12

chx - August 29, 2006 - 01:41

I am surely blind but if you call your cache before loading the session how do you know whether the user is anonymous or not?

#13

m3avrck - August 29, 2006 - 01:46

From a usability perspective, if a user has modules installed that implement init and exit hooks, shouldn't we set the advanced setting to "disabled" so they *can't* click it?

Reason being, if they have modules enabled that are implenting these hooks, going with aggressive caching will obviously break these in some sort of form. Don't want someone accidently turning this on. Similar to how we prevent users from turning on clean urls.

#14

webchick - August 29, 2006 - 02:35

subscribing

#15

matt westgate - August 29, 2006 - 03:51

I am surely blind but if you call your cache before loading the session how do you know whether the user is anonymous or not?

Chx is right. I'm loading the session information now.

From a usability perspective, if a user has modules installed that implement init and exit hooks, shouldn't we set the advanced setting to "disabled" so they *can't* click it?

Statistics module, devel module and throttle module are still useful mods even if they are running impaired for anonymous users. I hope that providing documentation that informs users of the risk and then telling them which modules on their site are problematic is a viable middle ground?

AttachmentSizeStatusTest resultOperations
fastpatch-dbcache_3.patch14.1 KBIgnoredNoneNone

#16

matt westgate - August 29, 2006 - 03:56

Oh, I forgot to mention. I removed the DRUPAL_BOOTSTRAP_EARLY_PAGE_CACHE phase since we could invoke the fastpath hook in the DRUPAL_BOOTSTRAP_CONFIGURATION phase. There's even a slight performance gain here since we don't even need to include the settings.php file for a fastpath cache. That's Drupal running off of only 3 files folks!

#17

matt westgate - August 29, 2006 - 04:22

Last patch was against an outdated HEAD.

AttachmentSizeStatusTest resultOperations
fastpatch-dbcache_4.patch13.78 KBIgnoredNoneNone

#18

m3avrck - August 29, 2006 - 04:49

Updated patch. Fix the description on the cache settings page.

It will now show in the color green if you can safely enable expert mode. If not, it's shown in red. I cleaned up the wording quite a bit. Also, I changed the title on the page to "cache mode" instead of "cache strategy". Minor other tweaks to the interface.

AttachmentSizeStatusTest resultOperations
cache_5.patch14.6 KBIgnoredNoneNone

#19

chx - August 29, 2006 - 05:35
Status:needs review» needs work

Matt, you and Jeremy created the first fastpath module (fastpath_fscache) how did you solve the session problem there? I can see cache_get called from the fastpath function and global $user; in there. I think that's also broken.

When creating the EARLY_PAGE_CACHE step, I was looking at Jeremy's stuff and how he used a $_COOKIE['drupal_uid'] . I still think that's a better -- even if a bit clunky -- solution for certain environments than creating a database connection.

So -1 for removing EARLY_PAGE_CACHE.

#20

chx - August 29, 2006 - 06:03

Hm, I did not really grok you are now calling fastpath from config. That's even worse than I thought.

First of all, forget variable_get before you load the setting.php. variable_get uses the $conf in settings.php. Second, configuration must be a separation step if you want Drupal to be reusable. I am not even sure whether install does not break with this bold step.

#21

Dries - August 29, 2006 - 07:35

I benchmarked this patch and it gives me a 55% performance improvement.

The latest patch got me to think: maybe we should remove the 'expert mode' radio button, and automatically switch to 'expert mode' when there are no conflicts ... That is, hide it all from the user. :)

#22

m3avrck - August 29, 2006 - 15:26

Dries I thought about that too. However, if you do use expert mode it if works--what happens when a user then installs statistics, poormanscron, etc... and then notices that the site is behaving a bit odd.

We *shouldn't* automatically give it to them--then they blame Drupal and us :-) Rather, we give them all the warnings and the green light, but ultimately let them make the decision. If it breaks, well they shouldn't have taken the green pill ;-)

#23

moshe weitzman - August 29, 2006 - 17:31

subscribe ... there are module enable/disable hooks now so it does seem possible to automatically disable/enable this mode modules chnaged. unfortunately, it seems difficult to know when a user wants to keep expert enabled even though he has devel or statistics enabled. maybhe the UI is more like a checkbox to 'force enable aggressive cache'.

i haven't closely reviewed the logic here yet.

#24

m3avrck - August 29, 2006 - 17:34

Moshe, that is why I updated the patch-- you can still enable expert mode even if it affects some modules. However, there is a warning in nice bold red text that says this is not advised. If you are good to go, you see some bold green light text saying you have nothing to worry about.

I think that makes the most sense--users are free to do what they want, we provide the necessary information to let them know if it is a good or bad decision.

#25

matt westgate - August 29, 2006 - 19:50

Forget variable_get before you load the setting.php. variable_get uses the $conf in settings.php. Second, configuration must be a separation step if you want Drupal to be reusable.

Shucks. chx is correct, which makes 4 files Drupal needs to include for a fastpath cache. Now that's one bloated CMS! * kidding *

I think it's a bad idea to hide the expert option when there are module conflicts. Modules can still be useful even if they may no longer work for anonymous users, such as statistics module.

New patch addresses recent issues. I'm glad to see that Dries validated my benchmark numbers.

AttachmentSizeStatusTest resultOperations
fastpatch-dbcache_5.patch12.9 KBIgnoredNoneNone

#26

webchick - August 29, 2006 - 20:33
Status:needs work» needs review

Marking back for review.

#27

Dries - August 29, 2006 - 21:17

mav3rck: what I mean is this. If someone enables caching, Drupal will automatically try to use the aggressive path. If, however, the administrator enables a 'conflicting module', Drupal will automatically switch back to normal caching. In other words: an automated approach without extra user interface.

To check if there are conflicting modules, simple use:

<?php
 
if (module_list(TRUE, TRUE)) {
   
// conflicting modules
 
}
  else {
  
// no conflicting modules
 
}
?>

I think this approach is beneficial for various reasons: it will always behave correctly (administrators won't think their newly enabled modules are broken when aggressive caching is on), it automatically optimizes performance and there is less documentation to read. :)

#28

m3avrck - August 29, 2006 - 21:21

Dries, what if enable Devel and Statistics module but I *know* how they break and are ok with the breakage, since I know what is going on. I don't Drupal to turn off aggresive caching now for me :-/

#29

catch - August 29, 2006 - 22:04

Just a note on this - as an enthusiastic drupal user but relative newbie, I'd rather have control over "aggressive" mode, with the warning, than have the decision made for me.

Two reasons:

For a start, there are things that devel does (clear cache, php info, showing you query information for each page you view (like those 600 on my default forums page with 50 subforums!), probably other stuff I've not got to yet) that wouldn't be affected by the caching to anonymous users if I've got it right. Why disable aggressive caching only because of query logging or whatever when people might be using devel for other reasons anyway?

If statistics module with aggressive caching only collected stats for logged in users, then it could also be useful for comparing page views or referrals for logged in and anonymous in users against external statistics software like google analytics/awstats etc. - google/awstats won't tell you how many users who visited your site were logged in or anonymous, this sort of would - even if it's unpredictable it's better than nothing.

#30

chx - August 30, 2006 - 01:20
Status:needs review» needs work

Variable init simply does not belong to the database step -- what if I write code outside of Drupal and want to simply reuse the DB layer? And cache in session, what if I write a little script which serves file thus boots up until the session, what does cache do there? Add more steps if you want or simply move your cache from session to the beginning of late cache, it looks belonging there. This solves your variable init problem as well as that can move back to late cache.

Otherwise, it's a very promising patch.

#31

chx - August 30, 2006 - 01:25

variable_get to the rescue. Dries can have his automatic switching to non-aggressive -- this will result in the variable being non set to CACHE_EXPERT. m3avrck being the advanced user can force the setting to CACHE_EXPERT in his $conf in settings.php.

So, let's do the automatic stuff.

#32

matt westgate - August 30, 2006 - 02:00

New patch in response to chx's concerns, which makes sense. I did not add in the automatic enabling of the expert cache mode. If we chose to do that, we ought to inform the user this change has been made. Also, we won't be able to catch every instance of an interfering module. There might be modules that run code everytime their loaded outside of the init and exit hooks. This might be a bad coding practice on the module developer's part, but we should take it into consideration.

AttachmentSizeStatusTest resultOperations
fastpatch-dbcache_6.patch12.94 KBIgnoredNoneNone

#33

matt westgate - August 30, 2006 - 02:01
Status:needs work» needs review

flipping the switch

#34

Dries - August 30, 2006 - 09:26
Status:needs review» needs work

The patch is looking better and better. However, weird stuff is going on with the latest version:

- Without the patch, I can serve 101 requests/second.
- With the patch, and with the caching strategy set to 'recommended', I can serve 172 requests/second (?).
- With the patch, and with the caching strategy set to 'aggressive', I can serve 151 requests/second.

It is strange that I can serve that many pages with the strategy set to 'recommended'.

Update: turns out I get a PHP error with the strategy set to 'recommended'.

#35

chx - August 30, 2006 - 20:39

Move the cache.inc inclusion down from the database step to the late_page_cache step where it belongs.

I start to dislike the _drupal_cache_init function. I can see three totally distinct function named one for no real good reason. Down with spaghetti!

#36

chx - August 30, 2006 - 20:41

We do not need to take into consideration that a module can load code outside of function definitions. Our motto is: "we do not babysit broken code". And that kind of code is seriously fubar.

#37

matt westgate - August 30, 2006 - 20:42
Status:needs work» needs review

I'm feeling good about this patch and fixed the PHP issue. I also ran the benchmarks again to see if it's still worth adding this feature (after killes' cache splitting patch)

BEFORE THIS PATCH, W/ CACHING
Requests per second:    11.15 [#/sec] (mean)

W/ PATCH, CACHE_RECOMMENDED
Requests per second:    11.01 [#/sec] (mean)

W/ PATCH, CACHE_EXPERT
Requests per second:    17.34 [#/sec] (mean)

I was trying to figure out why this patch makes slows default caching down a bit, but nothing really jumped out me. Probably just the extra PHP logic and all in all I don't think it's worth worrying about. It's safe to say expert caching mode is a considerable performance boost and worth adding to Drupal.

AttachmentSizeStatusTest resultOperations
fastpatch-dbcache_7.patch13.06 KBIgnoredNoneNone

#38

Dries - August 31, 2006 - 18:40
Status:needs review» fixed

Committed a modified version of your patch to CVS HEAD.

#39

beginner - September 2, 2006 - 16:11
Category:feature request» bug report
Priority:normal» critical
Status:fixed» active

Eh, with this patch, I can't install Drupal anymore because of this one line change in bootstrap.inc:
- foreach ($GLOBALS as $key => $value) {
+ foreach ($GLOBALS as $key) {

#40

chx - September 3, 2006 - 07:00
Status:active» reviewed & tested by the community
AttachmentSizeStatusTest resultOperations
conf_0.patch535 bytesIgnoredNoneNone

#41

drumm - September 3, 2006 - 07:13
Status:reviewed & tested by the community» fixed

Committed to HEAD.

#42

Anonymous - September 17, 2006 - 07:15
Status:fixed» closed
 
 

Drupal is a registered trademark of Dries Buytaert.