As follow-ups to #497118: Remove the registry (for functions), we should introduce back some modest memory and function call optimizations.

The elements are 2-fold:

  1. allow hook implementations to live in a single .inc file per module. This would be for info-type hooks like hook_menu() or hook_theme() that are only called rarely. E.g. modulename.hooks.inc
  2. Change functions like module_invoke_all() so that we cache a list of modules the implement certain hooks (and possibly whether the implementation is in the new .inc file). This would let us avoid doing a file_exists() + require_once for every hook invocation which the first element alone would need, plus make all such invocations more efficient.
CommentFileSizeAuthor
#191 557542-cache_module_implements-191.patch1.13 KBartkon
#190 557542-cache_module_implements-190.patch1.37 KBdbehrman
#186 557542-cache_module_implements-185.patch1.92 KBdbehrman
#184 557542-cache_module_implements-184.patch1.63 KBdbehrman
#179 557542-module_implements-cache-179.patch8.88 KBmikeytown2
#178 557542-module_implements-cache-178-D6.patch8.88 KBmikeytown2
#177 557542-module_implements-cache-167-175-innerdiff-D6.patch1.05 KBmikeytown2
#176 557542-module_implements-cache-161-175-innerdiff-D6.patch2.83 KBmikeytown2
#175 557542-module_implements-cache-175-D6.patch8.83 KBmikeytown2
#170 557542-module_implements-cache-161-167-interdiff-D6.patch1.92 KBglennpratt
#167 557542-module_implements-cache-167-D6.patch8.81 KBmikeytown2
#162 557542-module_implements-cache-161-D6.patch7.71 KBglennpratt
#159 drupal-6-module_implements-cache-557542-159.patch8.55 KBglennpratt
#157 drupal-6.20-module_implements-cache-557542-157.patch7.68 KBglennpratt
#155 module-implements-cache-d6.patch11.64 KBcatch
#151 module-implements-cache-3.patch11.64 KBc960657
#149 module-implements-cache-2.patch11.22 KBc960657
#141 module-implements-cache-1.patch11.43 KBc960657
#129 caches.patch1.97 KBcatch
#121 caches.patch2.07 KBcatch
#120 caches.patch2.07 KBcatch
#115 module_implements.patch2.76 KBcatch
#114 module_implements.patch0 bytescatch
#113 module_implements.patch2.42 KBcatch
#110 module_implements.patch1.94 KBcatch
#108 module_implements.patch1.93 KBcatch
#107 module_implements.patch893 bytescatch
#107 head.png479.94 KBcatch
#107 patch.png312.77 KBcatch
#103 drupal.module-implements-cache.103.patch5.44 KBsun
#102 module_implements.patch5.34 KBcatch
#96 head.png390.43 KBcatch
#96 patch.png505.06 KBcatch
#93 module_implements_cache_rerolled-14.patch4.48 KBCorniI
#92 module_implements_cache_rerolled-13.patch4.44 KBCorniI
#91 module_implements_cache_rerolled-12.patch4.4 KBCorniI
#90 module_implements_cache_rerolled-11.patch4.23 KBCorniI
#85 module_implements_cache_rerolled-10.patch4.01 KBCorniI
#80 module_implements_cache_rerolled-9.patch3.97 KBCorniI
#78 module_implements_cache_rerolled-9.patch2.61 KBCorniI
#73 module_implements_cache_rerolled-8.patch4 KBCorniI
#71 module_implements_cache_rerolled-7.patch3.97 KBCorniI
#68 module_implements_cache_rerolled-6.patch7.04 KBCorniI
#66 module_implements_cache_rerolled-5.patch3.62 KBCorniI
#63 module_implements_cache_rerolled-4-test.patch3.78 KBCorniI
#61 module_implements_cache_rerolled-3.patch3.66 KBCorniI
#59 module_implements_cache_rerolled-2.patch3.62 KBCorniI
#57 module_implements_cache_rerolled.patch3.73 KBCorniI
#46 module_implements_cache.patch3.74 KBcatch
#44 module_implements_cache.patch3.74 KBcatch
#42 module_implements_cache.patch3.56 KBcatch
#41 module_implements_cache.patch3.38 KBcatch
#38 module_implements_cache.patch3.58 KBcatch
#38 head.png434.06 KBcatch
#38 patch.png471.7 KBcatch
#32 hook_groups.patch81.55 KBCrell
#31 module_implements.png366.88 KBcatch
#30 module_implements.png241.83 KBcatch
#21 include.patch1.81 KBfago
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

pwolanin’s picture

This is effectively like the page-split patch in that the first part is a totally optional optimization, and the second part is an internal optimization that would not affect the API. Hence it shoudl be ok post-code freeze.

webchick’s picture

The second part I'm fully on-board with, and we might want to look into this sooner than later since we added a bunch more hook calls this release with the expectation that they were 'free' with the inclusion of the registry.

On the first part, I'd like to see benchmarks to see if this makes much of a difference. Also, if it's only for certain hooks, let's definitely not name it *.hooks.inc. As a module developer, if I saw files named that in core and tried to copy them for my own module, I would naturally assume that that file was for hook_init() as well. If we decide to move forward with this, let's make *sure* that the code is split up around easily memorizable lines. *.admin.inc and *.pages.inc and *.install are all very easy.

pwolanin’s picture

well, catch suggested something like "modulename.hooks-not-on-every-page-you-idiot.inc", but that might be a little long ;-)

webchick’s picture

Let's figure out what hooks we're actually talking about first, and pick a name from that. ;)

pwolanin’s picture

hook_menu, hook_menu_alter, hook_theme, hook_theme_registry_alter for sure. maybe hook_action_info and hook_node_info?

other rare hooks:

hook_modules_enabled, hook_modules_disabled, hook_system_info_alter

Crell’s picture

Contrib has a few dozen of these as well, so we cannot just have a hard-coded list. We need some way to declare "and oh yeah, this is an info hook so it has these extra properties".

pwolanin’s picture

This is really a documentation question - the idea is that we would look for *any* hook in this file, but we document that only info-type hooks can be (optionally) placed there for best performance.

Crell’s picture

How would we look for any hook in an optional file without loading and parsing the file to know if we need to load it or not?

pwolanin’s picture

@Crell - right, that's where the caching comes in - so once we check the .module and .inc for hook, we can mark a particular module as implementing it or not.

Crell’s picture

... It sounds like you're heading toward a Reflection API-based reimplementation of the registry, especially if you're talking about it working for any hook. I considered Reflection API but rejected it on the grounds that requires parsing all of the code into memory, so the initial scan would still be ridiculously huge on memory usage.

mattyoung’s picture

subscribe

catch’s picture

Subscribing.

pwolanin’s picture

@Crell - I don't follow why this needs to be huge.

All I need to store is an array linking each hook name to every module that implements it (and whether the .inc needs to be included)

$hooks =  array(
  'menu' => array(
    'system' => TRUE,
    'menu' => TRUE,
    'mycustommodule' => TRUE,
    ...
  )
  'form_alter' => array(
    'mycustommodule' => FALSE,
    ...
  }
)

Then any time I need to invoke a hook, I build this data for the one hook if it's not cache (e.g. cached in a drupal variable). I iterate the array, and only invoke hooks known to exist. If the value is TRUE, I call module_load_include() to make sure the hooks.inc file is in memory.

Also, we shoudl also optimize module_load_include() so it's like drupal_load() and knows what it already loaded.

Crell’s picture

And is that array dynamically built or developer-defined? If developer defined, then we're at hook_hooks(). (May not be a bad place, but that's where we are.)

If dynamic, how do you propose we build that list in the first place? The options I am aware of are:

1) Parse code into memory and use Reflection. This is memory intensive.

2) Tokenize and parse code to analyze where code lives. This is the registry that was just removed.

3) Require developers to explicitly define, somewhere, "I implement this hook and it is in this file".

I am not aware of an option 4.

pwolanin’s picture

build the array automatically:

option 4:

function_exists() || (module_load_include() && function_exists())

Crell’s picture

That's still a variant on option 1. "Keep loading code until we find it or know that we're not going to find it". With an empty cache, that could be a non-small operation.

pwolanin’s picture

@Crell - yes, it would not be a small operation, but rare.

Crell’s picture

I could keep asking questions about implementation, performance, and stability, but I suppose seeing code will answer most of those. :-)

fago’s picture

What about .info.inc ?

So the name suggests to only add some metadata providing hooks there, which area ideally cached or not performance related. Hm but module.info.inc could be easily confused with module.info.

@implementation:
What about going with one fixed file name and make sure it's included when needed with a simple helper function? Should be simple and effective.

pwolanin’s picture

@fago - yes - the idea was to have one fixed file name - well, or a fixed [modulename].X.inc, where X is fixed.

fago’s picture

Status: Active » Needs review
FileSize
1.81 KB

Well, so what about that + manually invoke module_load_all_includes('inc', X) for hooks living in module.X.inc ?

Crell’s picture

A general "sub component of a module" includer like fago suggests is a good idea, but to be honest I think the naming convention for that function is and has always been backwards. :-) We really should just standardize on $module.$subcomponent.inc and be done with it.

That's not a complete solution for the memory problem, but it helps if we have proper standards. (Earlier versions of the registry considered a similar concept, but I kept ending up with overly-complex designs and the self-learning logic seemed much more robust.) It could be useful for contribs, too, like $module.views.inc.

pwolanin’s picture

Status: Needs review » Needs work

I think the static to track if it's already loaded should be in module_load_include

catch’s picture

Is there any way we can unify module_load_include() and drupal_load() - so we can have one central function for require once, and then change it to require?

pwolanin’s picture

it seems like we shoudl be able to - but probably needs to be a post-freeze exception.

Crell’s picture

Issue tags: +Needs committer feedback

Expanding this idea a bit, what if we did the following:

1) Change module_invoke_all() to be: module_invoke_all($hookname, $args_array, $groupname);

2) module_invoke_all() then does the following:

a) Have we already cached which modules implement $hookname? If so, load that array and call those hooks. (We can protect against a stale cache when a module goes missing with a simple function_exists() check, which should be at worst no more expensive than what we do now.)

b) If not, load all $module.$groupname.inc files. Then iterate all module/hook combinations to get a list of all hooks that are implemented. Cache that. Then follow step A. (If $groupname is not specified, skip that part.)

This establishes that for a given hook, it can always live either in $module.module or in $module.$groupname.inc. What $groupname is, that's left up to whoever is implementing the hook. That is, menu_rebuild() arbitrarily decides to use a group of "info", which means hook_menu() implementations could live in either $module.module or in $module.info.inc. Views can specify a group of "views" for hook_views_plugins(), and so those hooks can live either in $module.module or in $module.views.inc. And so on.

Very low overhead, Very easy to extend in contrib, no extra parsing, and no BC break on file organization because .module is still a viable place for any hook. That makes the entire thing "opt in" for module developers who don't want to have zillions of 5 line files. But we still get the ability to move virtually any hook off to another file, using a common, recognizable pattern. (Callbacks would still have to be specified manually as they are in hook_menu and hook_theme.)

The only caveat is that module_invoke_all() currently takes a variable number of parameters rather than an array of arguments. That means we could do one of two things:

module_invoke_all($hook, $group, $list_of_args): This way, we still have a list of args but the $group comes first. If you don't want to use it, specify NULL explicitly.

module_invoke_all($hook, $args_array, $group): Very easy to leave off $group if you want, but you now have to specify your arguments as an indexed array. (Arguably we could encourage the use of groups by making $group the second param, so you do have to specify NULL if you don't want to have an optional group.)

Either one is going to be an API change to module_invoke_all(), which means not quite trivial to write. We'd have to either really really hurry or allow this as a post-freeze exception. I'm willing to write it if people like this approach, but as I'm about to get on a plane the odds of me being able to finish it pre-freeze are not high unless we allow an exception for it. Drieschick, I defer to you on that front.

fago’s picture

Not directly related, but changing to an $args_array would have the advantage that we could support passing arguments by reference by using array($arg1, &$arg_by_ref). That way it wouldn't be necessary anymore to bypass module_invoke_all() just to don't loose the reference.

pwolanin’s picture

In general this sounds like a good idea, but I'm a little leery of the groupname thing. I guess that means in some cases we can force the hook to be in .module? I can see how this might be useful for something like modulename.views.inc

catch’s picture

Of the two options, I think I prefer:

module_invoke_all($hook, &$args_array, $group);

Although it seems like the caching should really be maintained by module_implements() so we'd have module_implements($hook, $group) too. This'd mean module_implements() caching would also count for drupal_alter().

catch’s picture

FileSize
241.83 KB

module_implements() + module_hook() are currently taking up around 10% of page execution time, so we are missing that caching a fair bit.

catch’s picture

FileSize
366.88 KB

Doh, that was a page with a lot of cache misses, immediately after an apache restart, it's actually more like 7% when cached hooks are taken into account.

Crell’s picture

Status: Needs work » Needs review
Issue tags: +Performance
FileSize
81.55 KB

So this has some test failures still, and is going to be very fragile with other parallel patches, but this shows what I was talking about. It adds a $group parameter to module_invoke(), module_invoke_all(), etc. that is optional. If not specified, nothing changes. If it is specified, then we first try to include $module.$group.inc. In order to support optional parameters, $args is also converted to an array, not a variable length

Notes:

- module_load_include() is a horrifically stupid function, as its signature and documentation sucks. I am not touching it here. We fix it another time. :-)

- This patch does potentially introduce a lot of file stats. However, file stats are cached by most modern OSes so should be fairly cheap.

- This does not YET introduce any caching, but that could be added. My concern is that there's so many layers of function hand-off here that it may not be as easy to cache nicely as we'd like.

- I've moved node_menu() off to another file to show how it works.

- The important parts of the patch are quite small, in module.inc. There rest of the patch is just converting code to the new API, which means throwing lots of array() around stuff. :-)

I could definitely use help tracking down the last test failures.

Status: Needs review » Needs work

The last submitted patch failed testing.

alexanderpas’s picture

The problem with the implementation of explicitly specifying a group for a hook, is that it needs to be set in each call of that hook.

What if we take the first part of the hook_name as the groupname and require that the module that declares that group needs to specify it in the .info file. (to help with performance)

for example:
The Block module could declare the block group and hook_block* could live module.block.php
The Views module could declare the views group and hook_views* could live module.views.php
(we could even go as far as to move hook_boot to module.boot.php)

In addition, all that code never gets loaded when the respective module is not installed, as the declared group is not availble.

Also, it keeps the API everyone is used to straight ;)

(development machine not availble here, or i would've provided patch)

Crell’s picture

The problem with making it magic is that most hooks are not multi-word. So if you want to break up a large module, you're going to end up with an awful lot of files. Do we really need to put hook_menu(), hook_theme(), etc. all in different files? It also means we need to do more parsing of the hook that's being called to determine what files to try and include. That raises the CPU cost along a very critical path. I also don't see what the info file has to do with it then, since we'd just magically determine "Oh you're calling hook_foo_bar, I'll check for $module.foo.inc files while I'm at it."

It would still be better than what we have now, though. :-) So I'm open to it if the committers are. I don't think that the requirement for a group name in module_invoke_all() is that big of a deal, though. Generally a given hook is called from only a limited number of places, and all in one module. (And if they are not, that may be a sign that you should have a wrapping function of some kind to unify that lookup anyway.)

Crell’s picture

Hm, I just realized another problem. That effectively means one file per contrib module that declares hooks. Problem. hook_views_data(), hook_views_plugins(), etc. are not that big, and can/should continue to live in $module.views.inc. hook_views_default_views() is frickin' huge, and should be avoided whenever possible. The approach in #34 would push hook_views_default_views() into $module.views.inc as well, which we very much do not want.

catch’s picture

I'd be very concerned about the file_exists() without the caching - we already call module_hook() about 1000 times on each page in HEAD.

catch’s picture

Status: Needs work » Needs review
FileSize
471.7 KB
434.06 KB
3.58 KB

Here's a patch which just implements the caching. I think we should get that right first, then add the optional group/includes stuff onto it - so that we can verify the performance impact in both cases somewhat independently.

Didn't run tests yet but seems to be working. It handles missing functions, and reduces module_hook() calls from 783 to 58 on a completely empty front page. This will be an exponentially bigger saving the more hooks are called, and the more modules installed on a site.

I removed some reset calls from module_implements() in module_load_all() - since nothing seems to break if I do that, there were no supporting comments and it removes three cases where the variable would be set on every page load (those update queries would cancel out the caching, and we don't want a soft and hard cache clear if at all possible). We'll find out if "nothing" really is nothing when the test bot runs.

I don't think this is too far off, so marking CNR for the bot, and for early-ish reviews.

webgrind screenshots attached, I'm getting bugs with screenshots and percentages for some reason, so measurements are in milliseconds. Looks like about 35ms saving - again this is on a very light page, add a few nodes, comments, or additional modules and the number of calls to module_implements() and module_hook() could increase by a factor of 10 or more.

catch’s picture

Title: Allow info-type hooks to move to an .inc file, optimize module_invoke_all() » Cache module_implements() and allow for optional includes for hooks

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
3.38 KB

What about this one?

catch’s picture

This time we also rebuild the cache from scratch when a missing hook is found - this mean if a hook implementation changes from one to another, we ought to find the new implementation as well as not getting fatal errors from the old one. Completely new hook implementations won't be found until the variable is reset though - I've added that to update.php.

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
3.74 KB

Moved some more logic around to save checking hooks twice if they're not cached.

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
3.74 KB

Less broken version.

Anonymous’s picture

the patch at #46 looks fine to me.

ran some benchmarks with all modules enabled, three nodes with two comments each, hitting the front page.

didn't see any improvement with the patch, but i guess that's to be expected. we're loading the same amount of code.

think we'll need this all in one go to test out what gains we can get.

pwolanin’s picture

It doesn't look to me like there is a ever a variable_set() once you generate a list of valid hooks.

catch’s picture

Well it does variable_set() every time there's new hooks. The last time there's new hooks would presumably be the first time the hook cache is complete no?

pwolanin’s picture

ah, ok - missed that. I was looking for the variable_set() inside module_implements().

dropcube’s picture

subscribing

moshe weitzman’s picture

Interesting.

How many variable_set() calls are we expecting per page? This can be an expensive function on a mature site since it leads to a rebuild of all (crufty) variables with its many unserialize() calls. You might say that a site should keep its variables table clean but thats pretty hard and infrequent in practice.

catch’s picture

Hmm, both pwolanin's and moshe_work's comment suggest my code comments are crappy. Here's a summary of what the patch is doing, if I need to add some of this as phpdoc, I guess the header of module_implements_write_cache()?

In the first call to module_implements(), get the array of $hook[$module] from $conf. This starts off the $implements static variable with whatever's in there. Each time module_implements() is called, if we find any $hook that wasn't there before, we add that to $implements.

Then in drupal_page_footer(), we compare the value of $conf['module_implements'] - which is still the same as it was at the beginning of the request, with the &drupal_static('module_implements') static variable we built up during the request.

If, and only if, there are hooks found during the page request which weren't previously cached (array_diff_key() in module_implements_write_cache()), we do a single variable_set() with the expanded array of hooks. This process is only additive - and once it's initially set up, I reckon the only places likely to cause a variable_set() are administrative or CUD pages which have more rarely called hooks. hook_foo_delete() hook_foo_update(), hook_form_FORM_ID_alter() - but again, once these have been found once, we don't see a write again until both the cache has been cleared AND that specific operation is formed.

Additionally, there's a failsafe in there, so that if a module is disabled via the system table, or a cvs update removes a hook implementation, we start the cache building again with the hooks found during that request - this also gives the system a chance to find new hook implementations if they were added outside of the usual modules page / update.php cycle.

pwolanin’s picture

This looks and sounds like a pretty good implementation. The only nagging question I have is regarding multiple threads doing all these steps in parallel. I think the worse case is that we variable_set() (and consequent cache clearing) several times with the same data and incur that extra overhead, but likely on average that's much better than doing any kind of query (and better than using the lock API) since the write shoudl be an atomic operation.

CorniI’s picture

Status: Needs review » Needs work

cool patch :)
a) imho, the variable would be better named module_implements_cache instead of module_implements.
b) I don't understand why we add an array_diff_key in the main code path, wouldn't setting a $GLOBAL['module_implements_cache_write'] = TRUE; and in module_implements_write_cache() an if(isset($GLOBAL['module_implements_cache_write']) be faster and nicer to read?
Especially as the $new_hooks array is not used later.
c) The line

+    $implementations = $cached_implementations += $implementations;

is definitly wrong, the operator+= is misplaced, $cached_implementations is not used later anymore. For better readability, i'd prefer array_merge here anyways.
d) Are the benchmarks from #38 still valid for this patch? Or can you rerun the test with a warm cache (after addressing my feedback)?
Thanks
Corni

catch’s picture

a) agreed. I started thinking about this with cache_set()/cache_get() then pwolanin suggested a variable in irc before I got to rolling a patch, but $cid was still on my mind.
b) I don't think a single array_diff_key() matters much, but wouldn't particularly object to a global there.
c) Yes we can just use $implementations
d) There's no benchmarks there, only profiling output, but with those changes they'd still be valid yeah.

CorniI’s picture

Status: Needs work » Needs review
FileSize
3.73 KB

rerolled, per #56 and #57
verified that it works in xdebug, still gives a good performance improvement :)

Status: Needs review » Needs work

The last submitted patch failed testing.

CorniI’s picture

Status: Needs work » Needs review
FileSize
3.62 KB

uh, testbot doesn't like me :(
i'm trying again, now without the array_merge as it is unneeded as catch told me.
Needs review for the testbot.

Status: Needs review » Needs work

The last submitted patch failed testing.

CorniI’s picture

Status: Needs work » Needs review
FileSize
3.66 KB

I can't run the whole testsuite locally, but I picked some failures and wasn't able to reproduce them...

Status: Needs review » Needs work

The last submitted patch failed testing.

CorniI’s picture

Status: Needs work » Needs review
FileSize
3.78 KB

try something...
sorry for spamming everyone's inboxes!

Status: Needs review » Needs work

The last submitted patch failed testing.

Damien Tournoud’s picture

Please do not use the variable table as a cache. This is not what it has been designed for, and makes clearing the cache a non trivial operation.

There is no point in trying to save one query, as this "query" will only happen if the website is using the standard DB cache.

CorniI’s picture

Status: Needs work » Needs review
FileSize
3.62 KB

per #65 and chx's comments in irc, now again with cache_*(), also I hope i finally found the test failure source.

Status: Needs review » Needs work

The last submitted patch failed testing.

CorniI’s picture

Status: Needs work » Needs review
FileSize
7.04 KB
CorniI’s picture

#576096: Port cache-install.inc to the new cache system went in, so please consider the patch from #66 for reviewing/committing.
Thanks!

catch’s picture

Status: Needs review » Needs work
+++ includes/module.inc	13 Sep 2009 13:45:34 -0000
@@ -344,19 +343,46 @@
+    if($implementations === FALSE) {

Missing a space.

+++ includes/module.inc	13 Sep 2009 13:45:34 -0000
@@ -344,19 +343,46 @@
+        //cache_set('module_implements', $implementations);

This shouldn't be here ;).

Otherwise looks great.

I'm on crack. Are you, too?

CorniI’s picture

Status: Needs work » Needs review
FileSize
3.97 KB

...

pwolanin’s picture

the use of a global is a bit ugly. perhaps you can just set something in the $implementations array - e.g. a key starting with a # can never be a hook name.

$implementations['#changed'] = TRUE;
CorniI’s picture

best idea till now, thanks!
attached patch changes $GLOBALS to the static per #72.

Crell’s picture

Status: Needs review » Needs work

The comment in module_implements() still says that we're using the variable table, when the line right after it is a cache_get() call.

Right now, module_hook() is just a trivial wrapper around function_exists(), is it not? Couldn't we save ourselves some function calls and just do a function_exists() ourselves? (The lazy-load mechanisms I was working on above would do more there, so that may be self-defeating of me to suggest, but I'll mention it anyway for completeness.)

pwolanin’s picture

Yeah, looking at http://api.drupal.org/api/function/module_hook/7 there is no reason to add that extra function call.

catch’s picture

@Crell, yeah I was thinking any type of lazy load stuff probably needs to be encapsulated in module_hook() - at this point, assuming the cache is primed we're calling it once for ever real hook implementation (approx 40 times on HEAD front page, but I guess this can easily go up to a few hundred, so might well be worth skipping that function call).

See also #529194: module_hook() your time has come.

chx’s picture

if you are using drupal_static then you should drop the refresh parameter and use drupal_static_reset instead.

CorniI’s picture

Status: Needs work » Needs review
FileSize
2.61 KB

wrt #548470: Use something other than a single underscore for hook namespacing module_hook instead of function_exists() has an advantage...
chx: doesn't work, because you'd have to do a drupal_static_reset + a cache_set('module_implements', array()); and this is an implementation detail i'd like to hide from the clients.
@Crell: removed the comment
I also replaced a comment and a cache_set with a better way.

catch’s picture

Status: Needs review » Needs work

Cornil - can you roll again with -up, patch has funny arrows in it ;)

CorniI’s picture

Status: Needs work » Needs review
FileSize
3.97 KB

argh sorry...

catch’s picture

Why don't we do a cache_clear_all($cid) instead of cache_set() when we need to rebuild the cache?

CorniI’s picture

It doesn't really matter, imho it is fine this way. query-wise it would be delete+insert vs 2xupdate...
I think a cache_set is fine, but if you want, I can reroll with cache_clear_all :)

RobLoach’s picture

Subscribing. The ability to have [mymodule].[othermodule].inc is what I loved about the the registry as it improved the developer experience.

chx’s picture

Status: Needs review » Needs work

I think I saw Tab characters in that patch. Oh horror!

CorniI’s picture

Status: Needs work » Needs review
FileSize
4.01 KB

/me runs away frightened of the tabs ;)
no functional changes, just tabs=>spaces.

CorniI’s picture

Followup per chx's request which is possible because of drupal_static's :)
#577950: API-cleanup: Factor module_implements $refresh parameter into it's own function

chx’s picture

Status: Needs review » Needs work

Removing module_implements('', FALSE, TRUE); from module_load_all can't be good. A drupal_static_reset is necessary.

Also, I am fairly sure that you want to write that cache only on GET to avoid bloating the hell out of cache with hooks only used when POSTing.

chx’s picture

So we have a problem here that we can't really reset the static cache because we already loaded the cache implementations and this would neuter the db cache. So we need to be able to tell to module_implements that we are in bootstrap and when in bootstrap, make sure it does not static cache non-bootstrap hooks 'cos modules are not loaded and when bootstrap is finished then it can begin caching in earnest.

chx’s picture

On further thinking, the bootstrap issue is moot. You would need to call a Drupal function which, in turn, calls module_invoke_all. Not much of Drupal is loaded at this point, not even theme.inc. If someone manually begins to include .inc files from includes then it's her own problem to get Drupal functioning after that. The statics are exposed enough to make that possible and that's good enough.

So it's okay to remove that but the GET/POST issue still stands.

CorniI’s picture

Status: Needs work » Needs review
FileSize
4.23 KB

Now uses drupal_page_is_cacheable() to determine if we want to write the cache or not.

CorniI’s picture

per request of chx better comment for the addition

CorniI’s picture

typo in comment...

CorniI’s picture

changed comment per catch's request...

catch’s picture

Status: Needs review » Reviewed & tested by the community

I'm (very) happy with this now. We keep the cache a manageable size, all the extra hooks we added to Drupal 7 won't cost us half as much, it's compatible with the lazy load stuff Crell wants to do to reduce memory usage on non-APC hosts, and it's a small amount of code with ~ equal number of comments to code. Should also have less DX side-effects than the registry since you should never get undefined function for a hook implementation.

So setting RTBC.

webchick’s picture

Status: Reviewed & tested by the community » Needs review

Let's get some benchmarks for this.

catch’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
505.06 KB
390.43 KB

Vanilla install with one node on the front page:
ab -c1 -n500

HEAD:

Requests per second:    10.18 [#/sec] (mean)
Time per request:       98.198 [ms] (mean)
Time per request:       98.198 [ms] (mean, across all concurrent requests)
Transfer rate:          71.65 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:    90   98  13.5     95     317
Waiting:       90   98  12.4     95     317
Total:         90   98  13.5     95     317

Patch:

Requests per second:    10.54 [#/sec] (mean)
Time per request:       94.862 [ms] (mean)
Time per request:       94.862 [ms] (mean, across all concurrent requests)
Transfer rate:          74.17 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:    83   95  19.9     88     341
Waiting:       83   95  19.6     88     331
Total:         83   95  19.9     88     341

A bit faster but possibly margin of error - the main advantage here is that the cost of module_implements() is now based on how many hook implementations you actually have(i.e. more or less flat compared to most things on a site), not on how many possible hook implementations could possibly exist ever under any circumstances. So if we enable 50 modules it might go up 20% instead of 400%.

Also attached dreamy webgrind screenshots. You want to look at the right hand column for module_implements() to see where the impact is - numbers are percentage taken of the page request.

donquixote’s picture

Maybe the following could be useful to extend on this:
#519940: Performance: optimize building theme registry by using get_defined_functions() instead of function_exists()

The idea is that in some cases it is much faster to look at all defined functions, instead of looking at all possible module + hook (+ suffix) combinations. This is especially valid for _theme_process_registry, but could also be useful for module_implements() or module_invoke_all(), in some cases.

The other idea is that hooks don't need to be explicitly defined anywhere. Instead, we remember functions that could be a hook implementation, and wait for some code to ask for a hook with that name. This naive approach can make things a lot easier.

The downside:
This only works if either all candidate functions are already loaded at the moment when get_defined_functions() is called (mostly the case in D6), or if you have a list of function names available from elsewhere (defined via a config file or other means).
Btw, I think this is no different from the #93 patch, which uses module_hook / function_exists.

 
Which one is faster?

On my local test site the time for a full scan with get_defined_functions was around 50ms, with views and ubercart and a lot of other modules (measured with PHP microtime).

I imagine that in most situations where you need only one hook name, the #93 patch is faster than that, because it does only test the module + hook combinations with module_hook (= function_exists), for one specific hook name. Number of modules is smaller than number of defined functions, and there are fewer string operations involved for each iteration.

The get_defined_functions solution can be a lot faster if we want to find many different hooks at once, or if we allow different suffix combinations (with _alter_ or _preprocess_, or theme functions).

The idea would be to do one full scan on the modules page, and cache it in the DB.

 
Works for D6?

There is one problem both with the #93 patch and the get_defined_functions solution: It does not find hook implementations in include files. This is no problem for D7, but it can be a problem in D6, see #519940-#9 or #221964-#214 for a description of the problem and a possible solution.

sun’s picture

catch’s picture

Title: Cache module_implements() and allow for optional includes for hooks » Cache module_implements()
Dries’s picture

Title: Cache module_implements() » Cache module_implements() and allow for optional includes for hooks

I think this is a great patch, but some extra code comments would be appreciated. :-)

sun’s picture

+++ update.php	15 Sep 2009 15:03:54 -0000
@@ -292,6 +292,9 @@ if (empty($op) && $update_access_allowed
+  // Reset the module_implements() cache.
+  module_implements('', FALSE, TRUE);

Why? :)

Shouldn't this also happen in drupal_flush_all_caches() to clear any stale data?

+++ includes/module.inc	15 Sep 2009 15:03:57 -0000
@@ -344,19 +343,41 @@ function module_hook($module, $hook) {
 function module_implements($hook, $sort = FALSE, $refresh = FALSE) {

@see module_implements_write_cache()

Could we rename that last argument to $reset? We have another issue in the queue that tries to find a sane pattern for reset functions/arguments, so it would be good to catch this instance when grepping for $reset.

+++ includes/module.inc	15 Sep 2009 15:03:57 -0000
@@ -344,19 +343,41 @@ function module_hook($module, $hook) {
       if (module_hook($module, $hook)) {
-        $implementations[$hook][] = $module;
+        $implementations[$hook][$module] = $module;
+        // We added something to the cache, so write it when we're done.
+        $implementations['#write_cache'] = TRUE;

This means we're writing the cache on every request that finds/invokes not-yet-known hook implementations, right?

+++ includes/module.inc	15 Sep 2009 15:03:57 -0000
@@ -344,19 +343,41 @@ function module_hook($module, $hook) {
+  else {
+    foreach ($implementations[$hook] as $module) {
+      // Ensure no undefined function errors for calling functions.
+      if (!module_hook($module, $hook)) {
+        unset($implementations[$hook][$module]);

Here, the comment could additionally explain the condition.

+++ includes/module.inc	15 Sep 2009 15:03:57 -0000
@@ -371,6 +392,20 @@ function module_implements($hook, $sort 
+ * Write to the module_implements() cache.
+ */
+function module_implements_write_cache() {

@see module_implements()

+++ includes/module.inc	15 Sep 2009 15:03:57 -0000
@@ -371,6 +392,20 @@ function module_implements($hook, $sort 
+  if (isset($implementations['#write_cache']) && drupal_page_is_cacheable()) {

I don't understand where #write_cache is (ever) unset - it seems even to be contained in the cache?

This review is powered by Dreditor.

catch’s picture

FileSize
5.34 KB

This means we're writing the cache on every request that finds/invokes not-yet-known hook implementations, right?

.

Yep, except for POST requests - this is a relatively cheap cache to build (compared to say the menu cache, which is an insert per router item visited (I think)), so I think the trade-off is fine.

Everything else looked valid to me - tried to add as many additional comments as possible.

sun’s picture

Fixed some text.

However, one question: What happens if my module loads an include containing further hook implementations in a POST request, so the cached hook implementations do not contain those implementations?

CorniI’s picture

@sun: Just _fired_ hooks are recorded in the cache, not loaded hooks, so you're question is kinda weird wrt the POST part.
You're simply not expected to conditionally include hook implementations I think (wether it's POST or GET...).

catch’s picture

If it's a POST request, no new hook on that request will get cached - so it will just be as if there wasn't a cache, your module won't be singled out. If those hook implementations then get loaded in a non-POST request, the cache would kick in again for those hooks which have not yet been cached. In other words we only cache the implementations of each hook once, in non-POST requests.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

This looks great now. Committed to CVS HEAD.

catch’s picture

Category: task » bug
Status: Fixed » Needs review
FileSize
312.77 KB
479.94 KB
893 bytes

We missed an obvious error in the patch. drupal_page_is_cacheable() is set to FALSE for all authenticated requests. Since we fetch from cache whenever, I believe I tested with a primed cache (and hence the benchmarks were still OK). This is a very quick fix which just uses the code from drupal_page_is_cacheable() instead of calling it. I don't have time to write it now, but it's possible we could add a very, very simple test which just visits a couple of pages, then does a cache_get() and does an assertTrue() on it.

Attaching webgrind screenshots which show the difference (module_hook() calls down from 700-odd to 60).

catch’s picture

FileSize
1.93 KB

With a test. I get 'base table or view not found' exception when running this, but thinking that might be a local issue.

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
1.94 KB

cache_get() has 'cache' as default table, but cache_clear_all() doesn't, obviously :(

Test fails without the patch, passes with it.

pwolanin’s picture

Title: Cache module_implements() and allow for optional includes for hooks » Cache module_implements()
Status: Needs review » Reviewed & tested by the community

Fixes the test exception

webchick’s picture

Status: Reviewed & tested by the community » Needs work
+++ includes/module.inc	29 Sep 2009 16:07:43 -0000
@@ -421,7 +421,7 @@ function module_implements_write_cache()
-  if (isset($implementations['#write_cache']) && drupal_page_is_cacheable()) {
+  if (isset($implementations['#write_cache']) && ($_SERVER['REQUEST_METHOD'] == 'GET' || $_SERVER['REQUEST_METHOD'] == 'HEAD')) {

The old code here was much easier to understand. Please leave a comment why we're checking the request methods, and why we're checking for GET and HEAD and not also POST and others.

I'm on crack. Are you, too?

catch’s picture

Status: Needs work » Needs review
FileSize
2.42 KB

Howzat?

catch’s picture

FileSize
0 bytes

Added explicit test for logged in user.

catch’s picture

FileSize
2.76 KB
webchick’s picture

Status: Needs review » Fixed

Looks good. Thanks for the extra tests there.

Committed to HEAD!

JacobSingh’s picture

This caused a regression in:

#524728: Refactor install.php to allow Drupal to be installed from the command line.

Tries to access a cache before it is created.

PDOException: SQLSTATE[42S02]: Base table or view not found: 1146 Table 'dev7_drupal.cache' doesn't exist in /Users/jacob/work/github/drupal/includes/database/database.inc on line 1735

David_Rothstein’s picture

Hm, this also broke command line installs for me for a few days (although I was getting a different error message than Jacob), but magically, it seems to be working again. Anyone else experiencing this? The inability to run the installer via the command line seriously cut down on my Drupal productivity for a little while :)

Jacob debugged this pretty well at #524728-88: Refactor install.php to allow Drupal to be installed from the command line but we should really solve it here. It sounds to me like if the problem reoccurs (or if other people are still experiencing it now), the solution might be to remove the funny cache tweaking that is currently in the installer and replace it with a simple call to drupal_flush_all_caches() after the installation is over. We discussed this possibility previously in #577298: Blogs and forums don't get body fields when included in an installation profile as a possible way to future-proof the installer against weird caching bugs, and this might be a good example of that... In general, the installation of Drupal is so different from a regular running Drupal site that I'm not sure we ever really want to trust anything that gets put in the cache during that time period.

catch’s picture

the solution might be to remove the funny cache tweaking that is currently in the installer and replace it with a simple call to drupal_flush_all_caches() after the installation is over.

Yes that seems like the best option to me, and was what we came up with when discussing with Jacob in irc. Note that my debugging suggests we still need the fake cach class, it's the 2007 commit which tries to switch to the 'real' cache early which is the problem - and this needs to be replaced with a cache flush at the very end.

catch’s picture

Status: Fixed » Needs review
FileSize
2.07 KB

Here's a patch which fixes command line installs. Improvements to the comment welcome, but it's better than the obscure comments from 2007 about clean urls check which are in there at the moment.

catch’s picture

FileSize
2.07 KB

missed a spot.

JacobSingh’s picture

Status: Needs review » Reviewed & tested by the community
+++ install.php	6 Oct 2009 11:35:09 -0000
@@ -1421,22 +1416,16 @@ function install_finished(&$install_stat
+  // Flush all caches to ensure that any full bootstraps during the installer
+  // do not leave stale cached data, and that any content types or other items
+  // registered by the install profile are registered correctly.
+  drupal_flush_all_caches();

Is that all it took? I thought I tried this and had problems... Anyway, nice one, it totally works.

I'm on crack. Are you, too?

catch’s picture

Status: Reviewed & tested by the community » Needs work

Bit too simple - it fixes cli installs but regresses back to the original bug from 2007.

David_Rothstein’s picture

Huh, really? That's surprising, since the cache should be totally empty afterwards.

By the way, it looks like the patch missed removing the call to node_types_rebuild() - it removes the code comment only.

It's also strange that command line installs went back to working for me automatically but no one else :) I wonder if this is related to #561306: Installer corrupts cache when database connection is in settings.php in some way... the local script I use replaces settings.php, but maybe others don't? I can't wait until we have "drush install drupal" :) But either way, something is definitely broken here, and it seems like the drupal_flush_all_caches() solution should in theory be able to fix it.

catch’s picture

I can't find the original issue right now, but here's what I get with the current patch:

Command line install works perfectly.

UI install works fine EXCEPT that when you hit the front page of your Drupal site, the sitename is 'Drupal' instead of your actual sitename. What's even more bizarre is that if you just refresh the page, you'll get the correct site name. i ran this several times, including rubbing my eyes, and that's what appears to happen. The refresh of that page should not clear any caches, so it looks like $conf itself is somehow corrupted outside of the variables cache.

David_Rothstein’s picture

Strange indeed, but actually, I can reproduce that exact same problem (site name switching) without the patch as well.

So it doesn't appear to be caused by this patch... or fixed by it either, apparently.

moshe weitzman’s picture

Status: Needs work » Reviewed & tested by the community

This patch fixed CLI installs for me and they were certainly broken without it. Since the site_name bug happens without this patch too, this is RTBC.

FYI, I am about to commit an initial install command for drush that (re)installs core including the whole install profile. Handy for core devs and committers. Requires this patch.

catch’s picture

Agreed with RTBC, the site name patch we can do as a followup issue when this goes in.

catch’s picture

FileSize
1.97 KB

Somehow the node_types_rebuild() call never got deleted. No other changes.

David_Rothstein’s picture

Certainly looks right, and should go in - but Moshe, you might want to avoid doing a CVS update any time soon :)... it turns out the CLI installs are broken for a different reason now also: #412730: Theme system should optionally report when a theme key is not found

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to CVS HEAD. Thanks!

David_Rothstein’s picture

Damien Tournoud’s picture

Priority: Normal » Critical
Status: Fixed » Active

What happens if:

* the cache is not primed
* module_implements is called before DRUPAL_BOOTSTRAP_FULL

... module_implements() will prime the cache based on the bootstrap modules. #fail

catch’s picture

The only situation that would be a problem is when module_implements() is used to invoke exactly the same hook both before DRUPAL_BOOTSTRAP_FULL and afterwards. That would be a bug with the calling function - since by definition it wouldn't be a proper hook invocation for all modules, and it's also the reason we have bootstrap_invoke_all(). Do you have a specific example where such a thing happens and bootstrap_invoke_all() isn't an option?

Damien Tournoud’s picture

The point is: this is documented nowhere, and is a DX hell, because the function doesn't necessarily know it can be called in a non-fully-bootstrapped context, and as such shouldn't use module_implements() (or any function indirectly calling module_implements()).

catch’s picture

Well we can probably prevent caching in that situation by setting a static if the first call to module_implements() is before DRUPAL_BOOTSTRAP_FULL with drupal_get_bootstrap_phase(), then prevent the cache being written if that's the case. Only needs to be checked once since we can't go backwards and unload modules so would have no overhead compared to removing the caching again.

If someone calls module_implements() in hook_boot() then no caching would ever happen, but tough luck I suppose.

CorniI’s picture

sun.core’s picture

Status: Active » Fixed

I like the patch(es) that was committed, but Damien's concerns sound also worrying. However, can we please move this to another issue? This one was RTBC like 5 times already and this means there is not much room for quick discussion. Ideally, more or less copy #133 to a new critical issue and link to it here. Thanks.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

c960657’s picture

Here is a first stab at a D6 backport. Xdebug reports that the time spent in module_implements() is reduced by 75%, but I couldn't measure any difference inthe total page generation time using siege, so it may be Xdebug that skews the results. I'll leave the patch here if anybody wants to benchmark or continue work on it.

Just a small note about the patch: I moved the functionality from module_implements_write_cache() into module_implements(), because D6 does not have drupal_static(), so two functions cannot access each others static variables.

c960657’s picture

Version: 7.x-dev » 6.x-dev
Status: Closed (fixed) » Needs work
Issue tags: -Needs committer feedback

Reopening for D6. The patches are similar, so D6 should see a performance improvement similar to that in D7, unless the two versions differ in some way that affects this.

iamer’s picture

I gave the D6 patch a spin on a test pressflow installation and I it have two problems :

* Apache error log on accessing update.php: PHP Fatal error: Call to undefined function cache_set() in /var/www/html/includes/module.inc on line 433
* All pages at the top : Notice: Undefined offset: 0 in /var/www/html/includes/path.inc on line 54

The second one probably because I have the path cache module enabled. Line 54 says $modules = module_implements('lookup_path');
Maybe it's getting called too early in bootstrap ?

catch’s picture

Category: bug » feature
Priority: Critical » Normal

Not a critical bug in D6.

mauritsl’s picture

Benchmarked the patch from #141 with ab -c10 -n100 (anonymous).
Used /node of a somewhat basic Drupal 6.16 installation with 2 additional blocks (from views module), page and block cache disabled.

Without patch:
Requests per second: 4.98 [#/sec] (mean)
Time per request: 2008.131 [ms] (mean)
Time per request: 200.813 [ms] (mean, across all concurrent requests)
Transfer rate: 142.75 [Kbytes/sec] received

Patched:
Requests per second: 5.04 [#/sec] (mean)
Time per request: 1983.727 [ms] (mean)
Time per request: 198.373 [ms] (mean, across all concurrent requests)
Transfer rate: 144.51 [Kbytes/sec] received

That shows a 1.2% improvement.

c960657’s picture

When building the module list, module_implements() loops over all installed modules, so I would expect the improvement to increase with the number of modules installed.

c960657’s picture

Using XDebug+Webgrind I get the following numbers for a Drupal site with 250 modules:

Without patch
Total self cost: 106
Total inclusive cost: 241

With patch
Total self cost: 8
Total inclusive cost: 19

ezar’s picture

Works fantastic! Please commit to current D6 dev!
Thanks c960657

eZar

c960657’s picture

Status: Needs work » Needs review
FileSize
11.22 KB

New D6 patch that includes the fix from #836630: (Tests for) module_implements() caching is broken.

ezar’s picture

works fine here!
thanks

c960657’s picture

We need to bootstrap the cache (and thus the database layer) in update.php before flushing the module_implements cache.

Status: Needs review » Needs work
Issue tags: -Performance

The last submitted patch, module-implements-cache-3.patch, failed testing.

c960657’s picture

Status: Needs work » Needs review

#151: module-implements-cache-3.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +Performance

The last submitted patch, module-implements-cache-3.patch, failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
11.64 KB

Just needs D6 extension, this works fine for me.

jcmarco’s picture

#155 patch works fine but it breaks the drush updb command.
If this patch goes to core it would be needed to update drush commands/core/drupal/update_6.inc to add the same changes than in that patch

diff --git commands/core/drupal/update_6.inc commands/core/drupal/update_6.inc
index 21278b3..d433bf6 100644
--- commands/core/drupal/update_6.inc
+++ commands/core/drupal/update_6.inc
@@ -282,7 +282,7 @@ function update_main_prepare() {

   require_once './includes/bootstrap.inc';
   // Minimum load of components.
-  drush_bootstrap(DRUSH_BOOTSTRAP_DRUPAL_CONFIGURATION);
+  drush_bootstrap(DRUSH_BOOTSTRAP_DRUPAL_DATABASE);
   require_once './includes/install.inc';
   require_once './includes/file.inc';
   require_once './modules/system/system.install';
@@ -292,6 +292,8 @@ function update_main_prepare() {
   $module_list['system']['filename'] = 'modules/system/system.module';
   $module_list['filter']['filename'] = 'modules/filter/filter.module';
   module_list(TRUE, FALSE, FALSE, $module_list);
+  module_implements('', FALSE, TRUE);
+
   drupal_load('module', 'system');
   drupal_load('module', 'filter');
glennpratt’s picture

Re-roll against 6.20

Status: Needs review » Needs work

The last submitted patch, drupal-6.20-module_implements-cache-557542-157.patch, failed testing.

glennpratt’s picture

Wierd, applied fine for me...

Re-roll with CVS from DRUPAL-6.

glennpratt’s picture

Status: Needs work » Needs review

Test please...

Status: Needs review » Needs work

The last submitted patch, drupal-6-module_implements-cache-557542-159.patch, failed testing.

glennpratt’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
7.71 KB

Sorry, of course the tests don't work for D6. This is catch's patch from 155 re-rolled with git.

Running this patch on a production site for a few months with no issues. DX change having to clear cache when implementing new hooks, same as D7.

It provides a modest boost when running just a few modules, the boost becomes much larger on a site with many modules.

Simple ab with most of core + views + cck enabled.

ab -c5 -n100 http://d6.vbox.local/node/add/page

  • 6.x-dev: Requests per second: 2.69 [#/sec] (mean)
  • Patched: Requests per second: 3.20 [#/sec] (mean)
Jeremy’s picture

Subscribe.

Fabianx’s picture

Subscribe, this is cool ....

pwolanin’s picture

Looks a bit odd - are we always refreshing the cache on every page footer?

 function drupal_page_footer() {
   if (variable_get('cache', CACHE_DISABLED) != CACHE_DISABLED) {
     page_set_cache();
   }
 
   module_invoke_all('exit');
+
+  // Update the hook implementation cache.
+  module_implements('', FALSE, FALSE, TRUE);
catch’s picture

Looks fine i think:

module_implements($hook, $sort = FALSE, $reset = FALSE, $write_cache = FALSE)

Would only refresh if $reset was true, and when $write_cache is TRUE, module_implements should still be checking if there's anything to actually write before doing so.

Just a note while I'd like to see the Pressflow merge get in, I'm not convinced about committing this to D6 since it does change behaviour in terms of adding/removing new hooks (and will require a drush patch for drush updatedb to work), however it's also good that the patch is sitting here for people to find it.

mikeytown2’s picture

Subscribe. Patch #162 applies to pressflow without any rejections.

[]# patch -p1 < 557542-module_implements-cache-161-D6.patch 
patching file includes/common.inc                                            
Hunk #1 succeeded at 1621 (offset -2 lines).                                 
patching file includes/module.inc                                            
patching file install.php                                                    
Hunk #1 succeeded at 47 (offset 8 lines).                                    
Hunk #3 succeeded at 84 (offset 8 lines).                                    
patching file update.php            

Initial cachegrind shows an improvement of 150ms or so in the module_implements() function when using memcache. If the module_hook() check is removed, getting called 3,759 times only takes 21ms under total cumulative time. With that check in place It took 203ms of total cumulative time. I understand why that check is needed; but I think we could optimize this.

Patch below takes 32ms total cumulative time and will still verify that the function exists. Changes done is it only runs the foreach loop once per hook, via the new $verified static variable. This patch brings over a 300ms improvement on our test server compared to stock drupal.

What was the issue with drush, is it #156?

catch’s picture

The drush patch has been committed to 4.x but afaik isn't in a point release yet, this patch breaks drush updb without it though. I have not seen many calls to module_hook() left with the patch applied so interesting that those are still showing up for you. Typing from phone so did not review latest patch yet.

mikeytown2’s picture

glennpratt’s picture

Here's an interdiff for 167 from 161.

seandunaway’s picture

Wow, this is awesome. Applies cleanly to pressflow 6.22.

rjbrown99’s picture

I tried #167 tonight and ran into issues where the queries to the url_alias table seemed to go away. Links reverted back on the main site as well, no more aliases - /my/alias was now simply /node/123. I'm using Pressflow and also have the path_alias_cache module enabled. One thing I noticed was a catch branch of Pressflow with this patch, and it has a follow on patch for path.inc here:
http://bazaar.launchpad.net/~catch-drupal/pressflow/pressflow_implements...

This is similar to comment #143 above. I have not re-tested with that fix applied nor did I debug why my aliases stopped working, but I thought I'd bring it up in the event it is more widely applicable to others.

glennpratt’s picture

@rjbrown99 - Yes, you should use the Pressflow branch for Pressflow, not this patch.

mikeytown2’s picture

good to know about the issues with the path_alias_cache module; we disable it on our test servers.

mikeytown2’s picture

Here's a patch based off of #167 that keeps the behavior of module_implements() the same as before thus fixing the pressflow issue.

New patch uses $implementations[$hook][] = $module; Instead of $implementations[$hook][$module] = $module; and uses a $key in the verification loop.

mikeytown2’s picture

And for the curious here's an interdiff for 175 from 161.

mikeytown2’s picture

And for the very curious here's an interdiff for 175 from 167.

mikeytown2’s picture

Re-worked #175. I had a very odd case where cache_get() didn't exist yet (selective bootstrapping), so we need to check that the cache functions exist before using using them inside of module_implements(). I doubt anyone else will have this issue, but ya never know. What changed

  // Fetch implementations from cache.
  if (empty($implementations)) {
    if (function_exists('cache_get')) {
      $cache = cache_get('module_implements');
    }
    if (empty($cache)) {
      $implementations = array();
    }
    else {
      $implementations = $cache->data;
    }
  }
mikeytown2’s picture

renamed the above patch so the testbot will pick it up. This is the same as #178

Grayside’s picture

Drop this into some drush.inc file and you can use 'drush cc hook' in development. The kind of sites that this patch helps are the kind that make cc all a pain :)

/**
 * Implementation of hook_drush_cache_clear().
 */
function example_drush_cache_clear(&$types) {
  $types['hook'] = '_example_hook_cache_clear';
}

/**
 * Clear widget cache.
 */
function _example_hook_cache_clear() {
  cache_clear_all('module_implements', 'cache', TRUE);
}
donquixote’s picture

Issue summary: View changes

Something here does not make sense (EDIT: talking about D7).

  if (!isset($implementations[$hook])) {
    .. // discover hook implementations
    foreach ($list as $module) {
      ..
    }
  }
  else {
    foreach ($implementations[$hook] as $module => $group) {
      .. // re-check $group + module_load_include() and function_exists()
    }
  }

This means, if I call module_implements('foo') 10x, then the first loop will be called 1x, the second loop will be called 9x.
So, if $group is not empty, then module_load_include() will be repeated 9x in the same request, for the same module+hook, for no reason.

Am I missing something?
Why do we even bother with "Use the advanced drupal_static()" if we waste our time down there in the loop?

The only way this could make sense is if the statically cached $implementations can change over time, from something external. This is theoretically possible, given that drupal_static() is globally accessible, but we are not that bad, are we?

I can open a new issue, but maybe some feedback first..

donquixote’s picture

I should add, the D6 patch by mikeytown prevents this problem with a second static variable, $verified.

   if (!isset($implementations[$hook])) {
     ..
   }
+  elseif (empty($verified[$hook])) {

But I'd say this is not even necessary.
EDIT: Oh yes it is.

dbehrman’s picture

My site hits this a few hundred times, and making this additional modification saves about 1/4 second on each page load.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 184: 557542-cache_module_implements-184.patch, failed testing.

dbehrman’s picture

mikeytown2’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 186: 557542-cache_module_implements-185.patch, failed testing.

Fabianx’s picture

I think that is a D6 patch that cannot be tested, but ooh, it makes sense to also cache the verified part.

dbehrman’s picture

Here's another one. It looks like verified can be completely removed instead of cached for the same effect. But also, thanks to the 'wild hack' in ctools features, there are some hooks that exist sometimes and don't exist other times. The problem is when they don't exist, and you call module_invoke(), it will return null. If the function did exist but returned nothing, you would at least get an empty array. So this patch includes a patch for that, also in the modules.inc file.

artkon’s picture

Using 191 but found a bug, module_invoke should return NULL when no implements otherwise it breaks update.php among others.

mikeytown2’s picture

Status: Needs work » Needs review

The last submitted patch, 190: 557542-cache_module_implements-190.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 191: 557542-cache_module_implements-191.patch, failed testing.

Fabianx’s picture

This issue is for 6.x, tests won't work there.

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 191: 557542-cache_module_implements-191.patch, failed testing.

Status: Needs work » Closed (outdated)

Automatically closed because Drupal 6 is no longer supported. If the issue verifiably applies to later versions, please reopen with details and update the version.