Cache registry lookups
justinrandell - October 24, 2008 - 15:07
| Project: | Drupal |
| Version: | 7.x-dev |
| Component: | base system |
| Category: | task |
| Priority: | normal |
| Assigned: | justinrandell |
| Status: | closed |
Description
With http://drupal.org/node/298600 we cut out a lot of registry queries, but for a logged in user, all core modules enabled, loading node/1 with one comment, we're still at 12 registry queries with a fully loaded cache.
if we cache registry misses, this drops down to 5.

#1
I can see a
$checked[$function] = FALSE;and that gets cached and it's checked by isset so by theory FALSEs are catched, why do you need the misses separately?#2
@chx: those misses are static cached, but justin idea is to db cache them.
I think justin approach is right: we should cache misses on a global basis, not just on a per-path basis, as misses will probably be transverse (as hooks are).
#3
But then we should array_filter the per router path cache so that misses are not bloating that.
#4
we only call
registry_mark_codewhen code is found in the registry or already loaded, so i don't think we need to worry about filtering the file path cache.#5
ok, here's the patch, real simple now that #298600: DBTNG: make module_implements work regardless of bootstrap phase is in.
all tests pass.
#6
CNR. i'm such a n00b at remembering to change status...
#7
some small benchmarks, all with APC, all core modules enabled, 1 node with 1 comment.
ab -n 1000 -c 10 http://site/node/1
HEAD, not logged in:
Requests per second: 61.91 [#/sec] (mean)
Requests per second: 61.97 [#/sec] (mean)
Requests per second: 61.84 [#/sec] (mean)
HEAD + patch, not logged in:
Requests per second: 65.77 [#/sec] (mean)
Requests per second: 65.91 [#/sec] (mean)
Requests per second: 65.79 [#/sec] (mean)
ab -n 1000 -C my=cookie -c 10 http://site/node/1
HEAD, logged in:
Requests per second: 57.88 [#/sec] (mean)
Requests per second: 58.05 [#/sec] (mean)
Requests per second: 57.98 [#/sec] (mean)
HEAD + patch, logged in:
Requests per second: 61.48 [#/sec] (mean)
Requests per second: 61.42 [#/sec] (mean)
Requests per second: 61.64 [#/sec] (mean)
#8
Please add code comments.
#9
comments added, no code changes. that was a bit lazy of me.
#10
Simplify
+ * Indicates that the _registry_check_code function should writeto
+ * Indicates that _registry_check_code() should writeUhm. A value of FALSE means that something should be done (according to the PHPdoc)? Maybe use an negative integer here instead.
/**+ * Indicates that the _registry_check_code function should write
+ * the registry lookup misses from this request to cache.
+ */
+define('REGISTRY_CHECK_CODE_WRITE_CACHE', FALSE);
Typo in resource:
+ // Only write this to cache if we have new registry resouce lookup misses.#11
#12
Thanks. :)
+ $cache = cache_get('registry_check_code_misses', 'cache_registry');probably means that we query the DB on each page request just to prevent us from doing the same mistakes again (i.e. looking up functions that do not exist at all).
Since Drupal is already query-bloated, can't we find a way to attach this information directly into the primary registry query? Maybe using a magic function name and serializing all cache misses in it?
#13
I see no improvement at all with that patch (even when fixing the condition on the cache_get). Can't reproduce Justin results at all.
#14
@Damien: can you share your benchmark setups and results? good catch on the 'fixing the condition on cache_get', i added the array part of the declaration after the benchmarking, fixed in this patch.
i ran the benchmarks again, against this patch, and i see less queries, and faster results in line with earlier runs. is anyone else able to run some benchmarks? if this only improves drupal on my computer, we need to know :-)
@sun: we run a query that saves us more queries, so its a net gain. also, for a bigger site, where hitting the cache probably means hitting memcached/APC instead of the db, its an even bigger win.
#15
@sun#12: the code would be
#16
I'd really like to get these performance issues fixed. When I committed the registry patch, I was told it would lead to performance improvements. Hasn't happened yet. Kudos to those who are working on fixing the registry performance.
#17
@Dries: i think that D6 -> D7 performance comparisons involve a lot more than the registry. i'd hate to see 'D7 is slower, the registry is to blame' become an accepted truth, because its likely going to narrow the focus too much.
having said that, getting performance iprovements out of the registry will involve a lot of changes outside of the registry code.
for example,
_drupal_bootstrap_full, which loads all enabled modules and a huge chunk of includes/*.inc code for every request. making the bootstrap a lot lighter is something we talked about during the development of the registry patch, but we haven't gotten to it yet.also, the help system, which will load all enabled modules on just about every request, even if we remove
module_load_allfrom the bootstrap. there's an issue for this #299050: Help System - Master Patch (which includes improvements above and beyond making the help system load less code).#18
I was about ready to commit this patch, but then it failed to apply. Can someone reroll? Thanks!
#19
@Dries: can we hold off on this? i'm experimenting with a different approach to the problem, will post later today.
#20
this patch caches all registry lookups, not just misses, and removes the path-based caching. all tests pass.
my benchmarks showed it to be quicker, both for default modules enabled, and all modules enabled, for node/1 with one comment. i'll try to get kbahey to run his tests on it for confirmation.
this makes registry caching simpler, and moves us closer to true lazy loading without killing the db. the more i looked at how much code the path-based cache was loading, and how often the cache is destroyed and rewritten (eg, watch the cache get rewritten for the same path for logged in vs anon requests), the less happy i am with this approach.
#21
Hem. So basically you load a big array of all lookup_key => files out there.
There are two *big* issues with that approach (not that the current approach is right, of course):
(1) there is a big memory cost involved (what is the size of this?), and the cost will increase with the number of modules: bad, that sort of O(module) overhead is what the registry is trying to avoid,
(2) you lost the concept of "type", so you are out of luck if, for example, a function and a class has the same name
I think the idea of caching the lookups is smart. It is based on the hypothesis (probably true most of the time) that functions are mostly always called in the same order, so we only need to remember the name of the first function called in a file.
I suggest we keep the concept of path-based caching, but make it progressive (as in your approach). We could also study how we could break that big array in two part to save space in the case a given file has several points of entry of a file changes: I'm thinking of a (lookup_key, type) => file_id first part and a file_id => filename second part.
Of course, all this will have to be benchmarked (including memory usage).
#22
do you have any numbers? i gathered data on this. i enabled all modules, created a node of each built-in content type, added comments, created users, some new roles, edited permisssions, visited any link in the admin section that wouldn't rebuild the registry and a few other things i've forgotten.
that is, i went out of my way to make the cache as big as i could. the array reached 372 elements, a large number of which had a value of FALSE. how much memory does a single dimension 372 element array with many empty keys take up? on my desktop, ubuntu intrepid, running php cli, that php reports 50kilobytes.
<?phpprint memory_get_usage() . "\n";
$cache = array();
for ($i = 0; $i < 372; $i++) {
if ($i % 2) {
$cache[$i . 'resource_name'] = FALSE;
}
else {
$cache[$i . 'resource_name'] = 'modules/memory/memory.module';
}
}
print memory_get_usage() . "\n";
?>
keep in mind, this patch is eliminating the path cache, so the impact is less than that. i'm willing to listen to objections about this, but not ones that come with no evidence.
ahem:
<?php$cache_key = $type{0} . $name;
?>
care to elaborate on these ideas more? i'm willing to completely ditch this implementation (just like i think we should be willing to ditch the path-based caching) if we can find a better way. right now, i think the path taken by the patch at #20 is the best way.
#23
Benchmarked #20 with ab. No query cache, two nodes on front page, a few core modules and forum blocks enabled.
With APC:
Before patch: 14.68 #/sec
After patch: 16.74 #/sec
Without APC:
Before patch: 5.62 [#/sec]
After patch: 6.06 [#/sec]
Noticeable improvement :) Didn't do any memory profiling.
#24
I don't have a benchmarking environment set up, but I've assembled some crude data by getting values for memory usage at start, peak and end.
The test conditions weren't very well designed (I randomly cleared out the cache tables in between tests), but it didn't pick up anything significant to the naked eye.
#25
This is an interesting turn of events. When we have written the registry cache we presumed that there are not many code paths for any given router path so the cache, based on router items will contain every one of them. My only concern is the size of this cache. Oh, and the constants need to start with REGISTRY_.
#26
Setting CNW for the constant names...
#27
updated patch attached, changes constant names to address chx's comment in #25.
@Arancaytar: thanks for the memory data. i'll try to work up some more data on memory usage to make sure this patch doesn't make things worse memory-wise.
#28
oh, and here's the patch.
@catch: thanks for benchmarks. that's about what i saw as well.
#29
currently, the bulk of the members of the registry lookup cache are misses, and misses for resources that will never exist.
if we can chop some/most of those out 'upstream', then this patch gets even nicer.
first target, in
module_rebuild_cache, this code is bloating the cache with bogus lookups:<?php// Log the critical hooks implemented by this module.
$bootstrap = 0;
foreach (bootstrap_hooks() as $hook) {
if (module_hook($file->name, $hook)) {
$bootstrap = 1;
break;
}
}
?>
#30
updated patch makes
module_rebuild_cacheonly look for bootstrap hooks for installed modules.saves *66* queries on a standard install when
module_rebuild_cacheis called! oh, and makes the registry cache smaller.#31
fixed a bug in the write to cache code - make sure we return, even if there's nothing new written to cache. one less query.
#32
this time its
sytem_modulesthat's looking for hook_help in uninstalled modules. fixed that, another boatload of useless queries gone, registry cache gets smaller again.#33
fresh HEAD, with devel and simpletest enabled, i get:
Executed 303 queries in 29.44 milliseconds.at admin/by-moduleExecuted 241 queries in 39.39 milliseconds.at adminwith this patch, with devel and simpletest enabled, i get:
Executed 128 queries in 20.12 milliseconds.at admin/by-moduleExecuted 145 queries in 25.48 milliseconds.at admin#34
some benchmarks, APC enabled, anonymous user, no page or block cache for all runs.
default modules, no content, front page:
HEAD
Requests per second: 114.93 [#/sec] (mean)
Requests per second: 114.62 [#/sec] (mean)
Requests per second: 114.22 [#/sec] (mean)
HEAD + patch
Requests per second: 125.47 [#/sec] (mean)
Requests per second: 125.57 [#/sec] (mean)
Requests per second: 125.13 [#/sec] (mean)
all core modules, no content, front page:
HEAD
Requests per second: 81.67 [#/sec] (mean)
Requests per second: 81.60 [#/sec] (mean)
Requests per second: 81.47 [#/sec] (mean)
HEAD + patch
Requests per second: 90.04 [#/sec] (mean)
Requests per second: 89.93 [#/sec] (mean)
Requests per second: 89.81 [#/sec] (mean)
all core modules, 50 nodes of each content type with 4 comments each via devel, /node:
HEAD
Requests per second: 44.55 [#/sec] (mean)
Requests per second: 44.64 [#/sec] (mean)
Requests per second: 44.87 [#/sec] (mean)
HEAD + patch
Requests per second: 48.19 [#/sec] (mean)
Requests per second: 48.12 [#/sec] (mean)
Requests per second: 48.17 [#/sec] (mean)
if anyone else can benchmark this to validate those results that would be great.
#35
more better, more smaller registry cache if we make
module_hookusemodule_implementsinstead ofdrupal_function_exists.got to this via trying to figure out why
system_get_module_admin_taskswas adding so many misses to the registry cache.this makes some more of drupal's admin lighter.
#36
I like the principle of that last patch, but it apparently broke the installation of modules. For example for SimpleTest:
#37
Damien is right, patch at #35 is broken.
new patch leaves
module_hookalone, and makessystem_get_module_admin_tasksusemodule_implementsdirectly.same lightening of the registry cache, without the breakage. all tests pass.
i think there might be further opportunities to tune the registry cache, but in the interests of kittens, i think they are best left to follow up patches.
as it is, this patch makes drupal quicker, and run less db lookups, so i think its good to go.
its also an important step to making drupal a lazy loading system, which i intend to follow up on at #332003: cut the fat out of _drupal_bootstrap_full.
#38
changing status.
#39
=== NULLis just a little slower thanissetbut still (is_nullis a lot slower). And then you have$lookup_cache = $name;instead of$lookup_cache = NULL;, whom are you trying to confuse?$type{0}use[0], PHP guys tried to make people use {} for string indexes, it failed and it'ls deprecated since 5.1 and not supported in PHP6 at all.Use the new memory_get_peak_usage to measure real memory requirements. Now, let's presume that there is a linear relation between the size of the registry table and the memory usage, if we make that presumption and add Views 2 and CCK2 (throw out hook perm and i believe they will work enough for a registry rebuild we do not care about anything else), how big the registry table will be and following that and our presumption just how big the memory usage will be?
#40
thanks for the review chx.
fixed.
reworked this to make it clearer. because it was clearly not obvious, the intent is to 'rewarm' the lookup cache with resources that we've cached previously that haven't changed during a rebuild.
the registry rebuild now takes care of keeping the lookup cache warm, and $lookup_cache is set to NULL in
_registry_check_code.also, added some wrapper functions to convert the lookup cache to a string, after some suggestions from chx in #drupal.
all tests pass, so posting now. will post benchmarks shortly.
#41
argh, status update again.
#42
re-ran benchmarks, results have the same trend as #34 - this patch is measurably quicker than HEAD. again, would love to see some other benchmarks to validate this.
#43
1. Further optimization of _registry_check_code(): $type is either a numeric constant value or it's a string. Strict (===) checks are expensive. By wrapping the two checks for the REGISTRY_* constants in an additional is_numeric() block, we can possibly speed up execution.
2. This looks like a c'n'p error: +function _registry_set_lookup_cache(array $lookup_cache) {
3. I've run some benchmarks of the wrapper code which turns the lookup cache into a string, and I (personally) find that code ugly. The results that have been posted on DrupalBin only show you one thing: PHP's garbage collector is working properly, as the extra memory required by serialize() has already been released again by the time the memory_get_*() functions are executed. Additionally, I've seen only light increase of memory (about 500KB difference), but never such high numbers that are shown on DrupalBin, which means they're probably system-specific.
#44
@smk-ka: thanks for the review.
maybe you're looking at an older version of the patch? the patch at #40 doesn't use === anymore.
nope, that should always be an array, and i want to die screaming if its not. as an aside, i think we should do that in more places in drupal (use php 5's type hinting).
on the memory stuff, well, i'd prefer to err on the side of being overcautious about memory consumption. this patch is clearly quicker, and does have to work when we have a larger lookup cache. i think avoiding serialize is the way to go, so i'm setting this back to CNR. can you share your benchmarks or suggest other ways to illustrate that avoiding serialize is not necessary?
#45
rerolled to take out some fuzz, and converted a couple more places where we aren't using the hook implementation code to cut down on useless registry queries.
registry cache, now with even less fat.
#46
smk-ka, it's not about garbage collection, if you were to read PHP source, you would see it creates a clone of the thing to be serialized. It can be a huge lump of memory as my pastebin showed.
#47
A summary of my comments on IRC:
* This patch cleans up a lot of kind of nasty-looking code in addition to being faster. Nice!
* However, it desperately needs more documentation since there is some tweaky stuff in there (that almost-serialized-but-not quite stuff, the name[0] stuff, etc.) that isn't very well explained currently.
* We are also only changing a small sub-set of total hooks. Should we expand this out to others?
* I would love to get at least one other independent benchmarker on this patch. I know catch did some benchmarks but that was awhile back. Khalid?
* Since Dries already stuck his fingers in this patch earlier, I'd kind of like to talk to him to get his thoughts before committing it.
#48
updated patch based on IRC review by webchick.
#49
Thanks, docs are *much* better now. :)
+ $cache_key = $type[0] . $name;This line still perplexes me; perhaps because I still don't understand what $type is here. Is it class/function/etc? If so, are certain that the first character will be enough to make it unique?
(PS: Don't tell me, add it as a comment. ;))
+ // See _registry_set_lookup_cache, we cache a string, not an array+ // to avoid serialize in the cache system.
Why..? chx just posted to me some .c code, which is going a bit overboard. ;) A couple-word summary about why serialize() is bad here though would be nice.
This marks all my concerns with the code/comments being addressed. Leaving CNR in the hopes of benchmarks. ;)
#50
added more comments for
+ $cache_key = $type[0] . $name;, and more comments about the memory impact ofserialize().benchmarks anyone?
#51
In both cases I visited admin/build/modules then a few admin pages before running the benchmarks. Looks very good. Not looked at the patch itself.
/node
no patch:
9.31 [#req/sec]
9.28 [#/sec]
patch:
10.05 [#/sec]
10.24 [#/sec] (mean)
/node/1
no patch:
13.18 [#/sec]
12.90 [#/sec]
patch:
14.43 [#/sec]
14.23 [#/sec]
#52
updated patch to keep up with HEAD. all tests that pass on HEAD pass with the patch.
#53
I gave this a good review and it looks great. Committed to CVS HEAD.
--project followup subject--
Automatically closed -- issue fixed for two weeks with no activity.
#54
Automatically closed -- issue fixed for two weeks with no activity.