With http://drupal.org/node/298600 we cut out a lot of registry queries, but for a logged in user, all core modules enabled, loading node/1 with one comment, we're still at 12 registry queries with a fully loaded cache.

if we cache registry misses, this drops down to 5.

CommentFileSizeAuthor
#52 simple.patch15.38 KBAnonymous (not verified)
#50 simple.patch15.53 KBAnonymous (not verified)
#48 simple.patch15.38 KBAnonymous (not verified)
#45 simple.patch15.19 KBAnonymous (not verified)
#40 simple.noserialize.patch12.68 KBAnonymous (not verified)
#37 simple.patch12.05 KBAnonymous (not verified)
#35 simple.patch11.51 KBAnonymous (not verified)
#32 simple.patch10.58 KBAnonymous (not verified)
#31 simple.patch9.77 KBAnonymous (not verified)
#30 simple.patch9.68 KBAnonymous (not verified)
#28 simple.patch8.32 KBAnonymous (not verified)
#24 memlog.txt4.79 KBcburschka
#20 simple.patch8.32 KBAnonymous (not verified)
#14 cache_registry_misses.patch2.87 KBAnonymous (not verified)
#11 cache_registry_misses.patch2.54 KBchx
#9 cache_registry_misses.patch2.9 KBAnonymous (not verified)
#5 cache_registry_misses.patch2.28 KBAnonymous (not verified)
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

chx’s picture

I can see a $checked[$function] = FALSE; and that gets cached and it's checked by isset so by theory FALSEs are catched, why do you need the misses separately?

Damien Tournoud’s picture

@chx: those misses are static cached, but justin idea is to db cache them.

I think justin approach is right: we should cache misses on a global basis, not just on a per-path basis, as misses will probably be transverse (as hooks are).

chx’s picture

But then we should array_filter the per router path cache so that misses are not bloating that.

Anonymous’s picture

Assigned: Unassigned »

we only call registry_mark_code when code is found in the registry or already loaded, so i don't think we need to worry about filtering the file path cache.

Anonymous’s picture

Status: Needs review » Active
FileSize
2.28 KB

ok, here's the patch, real simple now that #298600: DBTNG: make module_implements work regardless of bootstrap phase is in.

all tests pass.

Anonymous’s picture

Status: Active » Needs review

CNR. i'm such a n00b at remembering to change status...

Anonymous’s picture

some small benchmarks, all with APC, all core modules enabled, 1 node with 1 comment.

ab -n 1000 -c 10 http://site/node/1

HEAD, not logged in:
Requests per second: 61.91 [#/sec] (mean)
Requests per second: 61.97 [#/sec] (mean)
Requests per second: 61.84 [#/sec] (mean)

HEAD + patch, not logged in:
Requests per second: 65.77 [#/sec] (mean)
Requests per second: 65.91 [#/sec] (mean)
Requests per second: 65.79 [#/sec] (mean)

ab -n 1000 -C my=cookie -c 10 http://site/node/1

HEAD, logged in:
Requests per second: 57.88 [#/sec] (mean)
Requests per second: 58.05 [#/sec] (mean)
Requests per second: 57.98 [#/sec] (mean)

HEAD + patch, logged in:
Requests per second: 61.48 [#/sec] (mean)
Requests per second: 61.42 [#/sec] (mean)
Requests per second: 61.64 [#/sec] (mean)

Dries’s picture

Status: Active » Needs review

Please add code comments.

Anonymous’s picture

comments added, no code changes. that was a bit lazy of me.

sun’s picture

Status: Needs review » Needs work

Simplify

+ * Indicates that the _registry_check_code function should write

to

+ * Indicates that _registry_check_code() should write

Uhm. A value of FALSE means that something should be done (according to the PHPdoc)? Maybe use an negative integer here instead.

 /**
+ * Indicates that the _registry_check_code function should write
+ * the registry lookup misses from this request to cache.
+ */
+define('REGISTRY_CHECK_CODE_WRITE_CACHE', FALSE);

Typo in resource:

+  // Only write this to cache if we have new registry resouce lookup misses.
chx’s picture

Status: Needs work » Needs review
FileSize
2.54 KB
sun’s picture

Thanks. :)

+    $cache = cache_get('registry_check_code_misses', 'cache_registry');

probably means that we query the DB on each page request just to prevent us from doing the same mistakes again (i.e. looking up functions that do not exist at all).

Since Drupal is already query-bloated, can't we find a way to attach this information directly into the primary registry query? Maybe using a magic function name and serializing all cache misses in it?

Damien Tournoud’s picture

I see no improvement at all with that patch (even when fixing the condition on the cache_get). Can't reproduce Justin results at all.

Anonymous’s picture

FileSize
2.87 KB

@Damien: can you share your benchmark setups and results? good catch on the 'fixing the condition on cache_get', i added the array part of the declaration after the benchmarking, fixed in this patch.

i ran the benchmarks again, against this patch, and i see less queries, and faster results in line with earlier runs. is anyone else able to run some benchmarks? if this only improves drupal on my computer, we need to know :-)

@sun: we run a query that saves us more queries, so its a net gain. also, for a bigger site, where hitting the cache probably means hitting memcached/APC instead of the db, its an even bigger win.

chx’s picture

@sun#12: the code would be

Dries’s picture

I'd really like to get these performance issues fixed. When I committed the registry patch, I was told it would lead to performance improvements. Hasn't happened yet. Kudos to those who are working on fixing the registry performance.

Anonymous’s picture

@Dries: i think that D6 -> D7 performance comparisons involve a lot more than the registry. i'd hate to see 'D7 is slower, the registry is to blame' become an accepted truth, because its likely going to narrow the focus too much.

having said that, getting performance iprovements out of the registry will involve a lot of changes outside of the registry code.

for example, _drupal_bootstrap_full, which loads all enabled modules and a huge chunk of includes/*.inc code for every request. making the bootstrap a lot lighter is something we talked about during the development of the registry patch, but we haven't gotten to it yet.

also, the help system, which will load all enabled modules on just about every request, even if we remove module_load_all from the bootstrap. there's an issue for this #299050: Help System - Master Patch (which includes improvements above and beyond making the help system load less code).

Dries’s picture

Status: Needs review » Needs work

I was about ready to commit this patch, but then it failed to apply. Can someone reroll? Thanks!

Anonymous’s picture

@Dries: can we hold off on this? i'm experimenting with a different approach to the problem, will post later today.

Anonymous’s picture

Title: Cache registry misses » Cache registry lookups
Status: Needs work » Needs review
FileSize
8.32 KB

this patch caches all registry lookups, not just misses, and removes the path-based caching. all tests pass.

my benchmarks showed it to be quicker, both for default modules enabled, and all modules enabled, for node/1 with one comment. i'll try to get kbahey to run his tests on it for confirmation.

this makes registry caching simpler, and moves us closer to true lazy loading without killing the db. the more i looked at how much code the path-based cache was loading, and how often the cache is destroyed and rewritten (eg, watch the cache get rewritten for the same path for logged in vs anon requests), the less happy i am with this approach.

Damien Tournoud’s picture

Status: Needs review » Needs work

Hem. So basically you load a big array of all lookup_key => files out there.

There are two *big* issues with that approach (not that the current approach is right, of course):
(1) there is a big memory cost involved (what is the size of this?), and the cost will increase with the number of modules: bad, that sort of O(module) overhead is what the registry is trying to avoid,
(2) you lost the concept of "type", so you are out of luck if, for example, a function and a class has the same name

I think the idea of caching the lookups is smart. It is based on the hypothesis (probably true most of the time) that functions are mostly always called in the same order, so we only need to remember the name of the first function called in a file.

I suggest we keep the concept of path-based caching, but make it progressive (as in your approach). We could also study how we could break that big array in two part to save space in the case a given file has several points of entry of a file changes: I'm thinking of a (lookup_key, type) => file_id first part and a file_id => filename second part.

Of course, all this will have to be benchmarked (including memory usage).

Anonymous’s picture

Status: Needs work » Needs review

Hem. So basically you load a big array of all lookup_key => files out there.
....
(1) there is a big memory cost involved (what is the size of this?), and the cost will increase with the number of modules: bad, that sort of O(module) overhead is what the registry is trying to avoid,

do you have any numbers? i gathered data on this. i enabled all modules, created a node of each built-in content type, added comments, created users, some new roles, edited permisssions, visited any link in the admin section that wouldn't rebuild the registry and a few other things i've forgotten.

that is, i went out of my way to make the cache as big as i could. the array reached 372 elements, a large number of which had a value of FALSE. how much memory does a single dimension 372 element array with many empty keys take up? on my desktop, ubuntu intrepid, running php cli, that php reports 50kilobytes.

print memory_get_usage() . "\n";
$cache = array();
for ($i = 0; $i < 372; $i++) {
    if ($i % 2) {
        $cache[$i . 'resource_name'] = FALSE;
    }
    else {
        $cache[$i . 'resource_name'] = 'modules/memory/memory.module';
    }
}
print memory_get_usage() . "\n";

keep in mind, this patch is eliminating the path cache, so the impact is less than that. i'm willing to listen to objections about this, but not ones that come with no evidence.

(2) you lost the concept of "type", so you are out of luck if, for example, a function and a class has the same name

ahem:

  $cache_key = $type{0} . $name;
I suggest we keep the concept of path-based caching, but make it progressive (as in your approach). We could also study how we could break that big array in two part to save space in the case a given file has several points of entry of a file changes: I'm thinking of a (lookup_key, type) => file_id first part and a file_id => filename second part.

Of course, all this will have to be benchmarked (including memory usage).

care to elaborate on these ideas more? i'm willing to completely ditch this implementation (just like i think we should be willing to ditch the path-based caching) if we can find a better way. right now, i think the path taken by the patch at #20 is the best way.

catch’s picture

Benchmarked #20 with ab. No query cache, two nodes on front page, a few core modules and forum blocks enabled.

With APC:
Before patch: 14.68 #/sec
After patch: 16.74 #/sec

Without APC:
Before patch: 5.62 [#/sec]
After patch: 6.06 [#/sec]

Noticeable improvement :) Didn't do any memory profiling.

cburschka’s picture

FileSize
4.79 KB

I don't have a benchmarking environment set up, but I've assembled some crude data by getting values for memory usage at start, peak and end.

The test conditions weren't very well designed (I randomly cleared out the cache tables in between tests), but it didn't pick up anything significant to the naked eye.

chx’s picture

This is an interesting turn of events. When we have written the registry cache we presumed that there are not many code paths for any given router path so the cache, based on router items will contain every one of them. My only concern is the size of this cache. Oh, and the constants need to start with REGISTRY_.

cburschka’s picture

Status: Needs review » Needs work

Setting CNW for the constant names...

Anonymous’s picture

updated patch attached, changes constant names to address chx's comment in #25.

@Arancaytar: thanks for the memory data. i'll try to work up some more data on memory usage to make sure this patch doesn't make things worse memory-wise.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
8.32 KB

oh, and here's the patch.

@catch: thanks for benchmarks. that's about what i saw as well.

Anonymous’s picture

currently, the bulk of the members of the registry lookup cache are misses, and misses for resources that will never exist.

if we can chop some/most of those out 'upstream', then this patch gets even nicer.

first target, in module_rebuild_cache, this code is bloating the cache with bogus lookups:

  // Log the critical hooks implemented by this module.
  $bootstrap = 0;
  foreach (bootstrap_hooks() as $hook) {
    if (module_hook($file->name, $hook)) {
      $bootstrap = 1;
      break;
    }
  }
Anonymous’s picture

FileSize
9.68 KB

updated patch makes module_rebuild_cache only look for bootstrap hooks for installed modules.

saves *66* queries on a standard install when module_rebuild_cache is called! oh, and makes the registry cache smaller.

Anonymous’s picture

FileSize
9.77 KB

fixed a bug in the write to cache code - make sure we return, even if there's nothing new written to cache. one less query.

Anonymous’s picture

FileSize
10.58 KB

this time its sytem_modules that's looking for hook_help in uninstalled modules. fixed that, another boatload of useless queries gone, registry cache gets smaller again.

Anonymous’s picture

fresh HEAD, with devel and simpletest enabled, i get:
Executed 303 queries in 29.44 milliseconds. at admin/by-module
Executed 241 queries in 39.39 milliseconds. at admin

with this patch, with devel and simpletest enabled, i get:
Executed 128 queries in 20.12 milliseconds. at admin/by-module
Executed 145 queries in 25.48 milliseconds. at admin

Anonymous’s picture

some benchmarks, APC enabled, anonymous user, no page or block cache for all runs.

default modules, no content, front page:
HEAD
Requests per second: 114.93 [#/sec] (mean)
Requests per second: 114.62 [#/sec] (mean)
Requests per second: 114.22 [#/sec] (mean)
HEAD + patch
Requests per second: 125.47 [#/sec] (mean)
Requests per second: 125.57 [#/sec] (mean)
Requests per second: 125.13 [#/sec] (mean)

all core modules, no content, front page:
HEAD
Requests per second: 81.67 [#/sec] (mean)
Requests per second: 81.60 [#/sec] (mean)
Requests per second: 81.47 [#/sec] (mean)
HEAD + patch
Requests per second: 90.04 [#/sec] (mean)
Requests per second: 89.93 [#/sec] (mean)
Requests per second: 89.81 [#/sec] (mean)

all core modules, 50 nodes of each content type with 4 comments each via devel, /node:
HEAD
Requests per second: 44.55 [#/sec] (mean)
Requests per second: 44.64 [#/sec] (mean)
Requests per second: 44.87 [#/sec] (mean)
HEAD + patch
Requests per second: 48.19 [#/sec] (mean)
Requests per second: 48.12 [#/sec] (mean)
Requests per second: 48.17 [#/sec] (mean)

if anyone else can benchmark this to validate those results that would be great.

Anonymous’s picture

FileSize
11.51 KB

more better, more smaller registry cache if we make module_hook use module_implements instead of drupal_function_exists.

got to this via trying to figure out why system_get_module_admin_tasks was adding so many misses to the registry cache.

this makes some more of drupal's admin lighter.

Damien Tournoud’s picture

Status: Needs review » Needs work

I like the principle of that last patch, but it apparently broke the installation of modules. For example for SimpleTest:

PDOException: INSERT INTO {simpletest_test_id} DEFAULT VALUES - Array ( ) SQLSTATE[HY000]: General error 1: no such table: simpletest_test_id in simpletest_run_tests() (line 347 of /var/www/dev/sqlite/modules/simpletest/simpletest.module).

Anonymous’s picture

FileSize
12.05 KB

Damien is right, patch at #35 is broken.

new patch leaves module_hook alone, and makes system_get_module_admin_tasks use module_implements directly.

same lightening of the registry cache, without the breakage. all tests pass.

i think there might be further opportunities to tune the registry cache, but in the interests of kittens, i think they are best left to follow up patches.

as it is, this patch makes drupal quicker, and run less db lookups, so i think its good to go.

its also an important step to making drupal a lazy loading system, which i intend to follow up on at #332003: cut the fat out of _drupal_bootstrap_full.

Anonymous’s picture

Status: Needs work » Needs review

changing status.

chx’s picture

Status: Needs review » Needs work

=== NULL is just a little slower than isset but still (is_null is a lot slower). And then you have $lookup_cache = $name; instead of $lookup_cache = NULL;, whom are you trying to confuse? $type{0} use [0], PHP guys tried to make people use {} for string indexes, it failed and it'ls deprecated since 5.1 and not supported in PHP6 at all.

Use the new memory_get_peak_usage to measure real memory requirements. Now, let's presume that there is a linear relation between the size of the registry table and the memory usage, if we make that presumption and add Views 2 and CCK2 (throw out hook perm and i believe they will work enough for a registry rebuild we do not care about anything else), how big the registry table will be and following that and our presumption just how big the memory usage will be?

Anonymous’s picture

FileSize
12.68 KB

thanks for the review chx.

=== NULL is just a little slower than isset but still (is_null is a lot slower).
...
$type{0} use [0], PHP guys tried to make people use {} for string indexes, it failed and it'ls deprecated since 5.1 and not supported in PHP6 at all.

fixed.

And then you have $lookup_cache = $name; instead of $lookup_cache = NULL;, whom are you trying to confuse?

reworked this to make it clearer. because it was clearly not obvious, the intent is to 'rewarm' the lookup cache with resources that we've cached previously that haven't changed during a rebuild.

the registry rebuild now takes care of keeping the lookup cache warm, and $lookup_cache is set to NULL in _registry_check_code.

also, added some wrapper functions to convert the lookup cache to a string, after some suggestions from chx in #drupal.

all tests pass, so posting now. will post benchmarks shortly.

Anonymous’s picture

Status: Needs work » Needs review

argh, status update again.

Anonymous’s picture

re-ran benchmarks, results have the same trend as #34 - this patch is measurably quicker than HEAD. again, would love to see some other benchmarks to validate this.

smk-ka’s picture

Status: Needs review » Needs work

1. Further optimization of _registry_check_code(): $type is either a numeric constant value or it's a string. Strict (===) checks are expensive. By wrapping the two checks for the REGISTRY_* constants in an additional is_numeric() block, we can possibly speed up execution.

2. This looks like a c'n'p error: +function _registry_set_lookup_cache(array $lookup_cache) {

3. I've run some benchmarks of the wrapper code which turns the lookup cache into a string, and I (personally) find that code ugly. The results that have been posted on DrupalBin only show you one thing: PHP's garbage collector is working properly, as the extra memory required by serialize() has already been released again by the time the memory_get_*() functions are executed. Additionally, I've seen only light increase of memory (about 500KB difference), but never such high numbers that are shown on DrupalBin, which means they're probably system-specific.

Anonymous’s picture

Status: Needs work » Needs review

@smk-ka: thanks for the review.

1. Further optimization of _registry_check_code(): $type is either a numeric constant value or it's a string. Strict (===) checks are expensive. By wrapping the two checks for the REGISTRY_* constants in an additional is_numeric() block, we can possibly speed up execution.

maybe you're looking at an older version of the patch? the patch at #40 doesn't use === anymore.

2. This looks like a c'n'p error: +function _registry_set_lookup_cache(array $lookup_cache) {

nope, that should always be an array, and i want to die screaming if its not. as an aside, i think we should do that in more places in drupal (use php 5's type hinting).

on the memory stuff, well, i'd prefer to err on the side of being overcautious about memory consumption. this patch is clearly quicker, and does have to work when we have a larger lookup cache. i think avoiding serialize is the way to go, so i'm setting this back to CNR. can you share your benchmarks or suggest other ways to illustrate that avoiding serialize is not necessary?

Anonymous’s picture

FileSize
15.19 KB

rerolled to take out some fuzz, and converted a couple more places where we aren't using the hook implementation code to cut down on useless registry queries.

registry cache, now with even less fat.

chx’s picture

smk-ka, it's not about garbage collection, if you were to read PHP source, you would see it creates a clone of the thing to be serialized. It can be a huge lump of memory as my pastebin showed.

webchick’s picture

A summary of my comments on IRC:

* This patch cleans up a lot of kind of nasty-looking code in addition to being faster. Nice!
* However, it desperately needs more documentation since there is some tweaky stuff in there (that almost-serialized-but-not quite stuff, the name[0] stuff, etc.) that isn't very well explained currently.
* We are also only changing a small sub-set of total hooks. Should we expand this out to others?
* I would love to get at least one other independent benchmarker on this patch. I know catch did some benchmarks but that was awhile back. Khalid?
* Since Dries already stuck his fingers in this patch earlier, I'd kind of like to talk to him to get his thoughts before committing it.

Anonymous’s picture

FileSize
15.38 KB

updated patch based on IRC review by webchick.

webchick’s picture

Thanks, docs are *much* better now. :)

+  $cache_key = $type[0] . $name;

This line still perplexes me; perhaps because I still don't understand what $type is here. Is it class/function/etc? If so, are certain that the first character will be enough to make it unique?

(PS: Don't tell me, add it as a comment. ;))

+  // See _registry_set_lookup_cache, we cache a string, not an array
+  // to avoid serialize in the cache system.

Why..? chx just posted to me some .c code, which is going a bit overboard. ;) A couple-word summary about why serialize() is bad here though would be nice.

This marks all my concerns with the code/comments being addressed. Leaving CNR in the hopes of benchmarks. ;)

Anonymous’s picture

FileSize
15.53 KB

added more comments for + $cache_key = $type[0] . $name;, and more comments about the memory impact of serialize().

benchmarks anyone?

catch’s picture

In both cases I visited admin/build/modules then a few admin pages before running the benchmarks. Looks very good. Not looked at the patch itself.

/node
no patch:
9.31 [#req/sec]
9.28 [#/sec] 

patch:
10.05 [#/sec] 
10.24 [#/sec] (mean)

/node/1
no patch:
13.18 [#/sec] 
12.90 [#/sec]

patch:
14.43 [#/sec] 
14.23 [#/sec]
Anonymous’s picture

FileSize
15.38 KB

updated patch to keep up with HEAD. all tests that pass on HEAD pass with the patch.

Dries’s picture

Status: Needs review » Fixed

I gave this a good review and it looks great. Committed to CVS HEAD.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.