DBTNG: make module_implements work regardless of bootstrap phase
chx - August 22, 2008 - 12:19
| Project: | Drupal |
| Version: | 7.x-dev |
| Component: | base system |
| Category: | task |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | closed |
Description
Currently module_implements relies on module_list (which in itself needs love), this patch makes it rely on the registry instead. This should make db_merge possible during bootstrap on postgresql which currently wont work.
| Attachment | Size |
|---|---|
| module_implements.patch | 13.05 KB |

#1
patch -p0 < module_implements.patch
patching file includes/bootstrap.inc
Hunk #1 succeeded at 820 (offset -11 lines).
Hunk #2 succeeded at 1430 (offset -11 lines).
patching file includes/common.inc
Hunk #1 succeeded at 1526 (offset 29 lines).
patching file includes/menu.inc
Hunk #1 FAILED at 1719.
1 out of 1 hunk FAILED -- saving rejects to file includes/menu.inc.rej
patching file includes/module.inc
patching file includes/registry.inc
patching file index.php
patching file modules/system/system.install
Hunk #1 succeeded at 1091 (offset -4 lines).
#2
Reroll.
#3
For real :)
#4
DB errors abound on my current test install (even after update.php), so I tried on a fresh install. Installed correctly and everything looks good except an error on the bottom of every page: Fatal error: Call to undefined function registry_cache_hook_implementations() in /home/davereid/Projects/drupal-head/includes/common.inc on line 1529. Tried running the system test. Registry parse file test fails 0/3 and stops:
Missing argument 3 for _registry_parse_file(), called in /home/davereid/Projects/drupal-head/modules/simpletest/tests/registry.test on line 32 and defined Warning registry.inc 119 _registry_parse_file()Undefined variable: contents Notice registry.inc 123 _registry_parse_file()
Resource "registry_test_function72ce73c6deb5ef3b6de403e95f904adf" found. Other registry.test 35 RegistryParseFileTestCase->testRegistryParseFile()
Resource "registry_test_classd10b5655b22218fe8b2cf4c444bbb23b" found. Other registry.test 35 RegistryParseFileTestCase->testRegistryParseFile()
Resource "registry_test_interfacedfe2be8143c374ef1733cf36860b1cc2" found. Other registry.test 35 RegistryParseFileTestCase->testRegistryParseFile()
#5
That's an easy fix for common.inc, no idea at all how that call remained in there. I think I fixed the test as well -- it's also clear that we will need a new test. Meanwhile, reviews are still welcome.
#6
Now getting following error on bottom of each page: Fatal error: Call to undefined function registry_cache_path_files() in /home/davereid/Projects/drupal-head/includes/common.inc on line 1530.
I've attached a revised patch. I think someone that has a little more understanding of the registry code to comment on the actual patch. I'll work on learning about it...
#7
Err...didn't mean to change status.
#8
Well, #6 is wrong, that function should not be removed. The patch comes from total module revamp and this was an accident.
#9
Silly me, forgot to add the install files to the registry. Works pretty well!
#10
Subscribe. Will try to review at some point.
#11
All tests now pass. And do you need any proof that patch my speeds up things? Check tracker.test. That was some test bug to hunt down. Little drop became a bit too fast :D
#12
I am either totally confused as to what you're trying to do or there are way too many patches rolled into one here. There's whitespace cleanup, adding .install files to the .info files, what looks like registry changes, and what looks like adding Yet Another Boolean Parameter (YABP). Only the last seems to have any relation to this issue, and I am actually against that approach in most cases for DX reasons.
IIRC from DBTNG, all that needed to be done for this issue was making module_list() itself not suck. Fix that to not be moronic and the rest fixes itself. Let's just do that and do the registry cleanup in another patch.
#13
More kittens are saved by moving the tracker test patch to http://drupal.org/node/309951 . For dropping sort, there was no point in calling watchdog or menu in an alphabetical order so dropping the sort argument is just cleanup. Yes that's makes the patch non-kitten free, but it'd be rather daunting to submit a patch which patches out $sort from the current implementation of module_implements just to be replaced by this version the next day. If we are bent on keeping $sort I can put it back.
#14
I think we crossposted.
#15
Sorry, missed Crell's reply. So this patch makes module_implements rely on the registry instead of module_list. Or you can say that we save some queries to the registry table when the hook registry for this router path is not yet cached. The patch needs to add the install files to the info otherwise hook_schema would fail as the registry wont contain them. To do this we needed a few registry.inc changes where the module name is passed all along. This way when we see a function name which starts with the name of the module, the rest can be a hook, so we store it as such. The rest of the registry changes is just removing what seemed to be cruft at the end of the day. Dunno what YABP are you talking of, really.
#16
Oh and by the way -- this patch paves the way for per hook weights.
#17
This issue is a pretty good example of http://webchick.net/patch-reviewers-are-not-clairvoyant. chx spent about an hour with me in #drupal walking me through the patch. Here's a summary:
#18
Now, for an initial pass.
- foreach (module_implements('watchdog', TRUE) as $module) {+ foreach (module_implements('watchdog') as $module) {
- registry_cache_hook_implementations(FALSE, TRUE);+ module_implements();
...
+ module_implements('', FALSE, TRUE);
cache_clear_all('*', 'cache_registry', TRUE);
At the very least they need comments, but it really is starting to feel like module_implements() is becoming this Frankenstein monster of a function -- it's module_build_hook_cache(), module_clear_hook_cache(), module_get_implementation_of_hook(), at least -- and should likely be split up.
+ * @param $file+ * An associative array about the file. Only the module key is used.
That parameter either needs WAY more comments, or it needs to just be $module.
#19
module_implements split up, utility functions added, doxygen'd commented. That parameter is fixed.
#20
Was hitting:
Fixed that with:
<?php/**
* This is the maintenance version of module_implements.
*
* @param $hook
* The name of the hook (e.g. "help" or "menu").
* @return
* An array with the names of the modules which are implementing this hook.
* Only enabled and already loaded modules are taken into consideration.
*/
function _module_implements_maintenance($hook) {
$implementations = array();
foreach (module_list() as $module) {
$function = $module . '_' . $hook;
if (function_exists($function)) {
$implementations[] = $module;
}
}
return $implementations;
}
?>
Now I'm hitting:
#21
Fixed.
#22
Hmmm,
notice: Undefined variable: refresh in /var/www/drupal/head/includes/module.inc on line 420......
Fatal error: Cannot redeclare cache_get() (previously declared in /var/www/drupal/head/includes/cache.inc:17) in /var/www/drupal/head/includes/cache-install.inc on line 16#23
Readded $sort via another utility function after realizing help needs it.
#24
Re-rolled without unnecessary cruft (coding-style changes).
#25
Fixed some PHPdocs, and renamed _module_implements_sorted() to _module_implements_sort(), and changed $stored into $cache in _module_implements_sort() for better visual separation from $sorted.
#26
To test this patch on a existing site running HEAD, execute the following queries:
ALTER TABLE `registry` ADD `module` VARCHAR(255) NOT NULL, ADD `hook` VARCHAR(255) NOT NULL;TRUNCATE `registry_file`;
and visit devel/cache/clear afterwards.
Testing code in custom.module:
<?phpfunction custom_boot() {
cache_clear_all('hooks', 'cache_registry');
module_implements(NULL, FALSE, TRUE);
var_export(module_implements('schema'));
}
?>
#27
Removed some obsolete cruft from _registry_parse_files().
Fixed _element_info() returning NULL instead of an (expected) array if no elements are returned. (leading to a fatal error otherwise)
Please note that you still need to perform a complete registry rebuild after applying this patch.
Testing results using the example code in #26 (which should simulate the situation we have DIRECTLY after installation):
For D6:
array ( )For D7 unpatched:
array ( )For D7 patched:
array ( 0 => 'block', 1 => 'filter', 2 => 'node', 3 => 'system', 4 => 'user', 5 => 'comment', 6 => 'dblog', 7 => 'menu', 8 => 'simpletest', )Simpletest results:
5799 passes, 5 fails, 0 exceptions
Like always, failed tests are only file related; see #315520: assertFilePermissions() not working on Windows
Aside from that, the implementation looks good, clever, and clean. Some parts might not be trivial, but make perfectly sense in the end.
So this patch should be RTBC.
The only remaining issue - NOT relevant/tied to this issue - is whether we can implement some failsafe algorithm, which automatically performs a complete registry rebuild if the stored data is empty, unavailable, or incomplete due to any reason, since users need to invoke drupal_flush_all_caches() somehow in such situations.
#28
Note that if registry is empty it rebuilds itself. You only had a problem because you ran an unpatched Drupal and then a patched one and the cached contents clashed. This can not happen normally.
#29
subscribing
#30
I've reviewed this patch 2-3 times now, and all the nit-picky things I can find to say about it have been fixed.
At an architectural level, I do have some concerns about the 'best guess' approach it takes to identifying hooks. But I understand that this is a trade-off in agility vs. accuracy... there are concerns about requiring yet another "registry" function with the theme registry, and possibly variable registry, etc.
However, I'd like to get Dries's two cents before committing this, since it seems like it crosses that treshold of "big" change.
#31
i don't like the extra 'guess' hook data in the registry schema either.
here's an alternative for
_module_implements_buildthat doesn't require any changes to the registry schema.its a bit slower than chx's patch, but its simpler, and it is quicker than the current implementation.
<?php
/**
* this is ugly, there's probably a DBNTG way to do it, but you get the idea.
*/
function _module_implements_build($hook) {
$return = array();
$candidate_functions = array();
$modules = array();
$result = db_query("SELECT name FROM {system} WHERE type = 'module' AND status = 1");
while ($module = db_fetch_object($result)) {
$candidate_functions[] = "'" . $module->name . '_' . $hook . "'";
$modules[$module->name . '_' . $hook] = $module->name;
}
$result = db_query("SELECT name, filename FROM {registry} WHERE type = 'function' AND name IN (" . implode(',', $candidate_functions) . ")");
while ($function = db_fetch_object($result)) {
// We need to load the relevant file for this function.
require_once DRUPAL_ROOT . '/' . $function->filename;
registry_mark_code('function', $function->name);
$return[] = $modules[$function->name];
}
return $return;
}
?>
#32
Sounds like this still needs discussion.
#33
to make this easier, i've created add functions in .info files to the registry.
#34
No this does not, the
IN (" . implode(',', $candidate_functions) . ")");is slow and as you have more modules it gets slower. A simple query against an indexed hook column is vastly faster.#35
Rerolled after the patch mentioned in #33 is in.
#36
updated patch always writes the module implementations to cache if we don't get any cache data for a request.
as it was, if we didn't get anything from the cache, then this snippet from
module_implements<?phpif ($cache && $implementations != $cache->data) {
?>
would always be false, and we wouldn't cache when we should.
all tests pass, and db queries for a clean install, all modules enabled hitting node/1 with 1 comment are cut by 80%.
still don't like the hook guess approach, and i'll likely post an alternative soon, but this is a big win for cutting back queries, so even if i don't convince people that we don't need the hook guess stuff, i'd still like to see this go in.
setting to code needs review - someone else should validate the caching change.
#37
The change is good, yes but you can express it with a bit less code. Thanks. That drop in 80% of queries is just before the hook cache hits in but it's still great.
#38
+ * By default, modules are ordered by weight and filename, settings this+ * option to TRUE, module list will be ordered by module name.
This is minor enough that someone could do it before a commit so I'll leave this RTBC; I think "settings" in the comment above should perhaps be "setting".
Or something like "By default, modules are ordered by weight and filename. If this is option is TRUE, modules are ordered alphabetically by module name." might be just as clear, or more so.
#39
Undefined index: lg_debug Notice database_test.test 1689 DatabaseLoggingTestCase->testEnableLogging()
#40
Wrrrooonnnng issue. :) Sorry.
#41
As I said to chx on IRC, this qualifies as a patch that I would like to get Dries's OK on before I commit it. It's kind of dirty, and he might have a better solution.
#42
I tried to make use of this by turning custom_url_rewrite* into hooks, and ran into bizarre problems with simpletest while doing so. This line in tearDown() in drupal_web_test_case.php seemed to be the culprit:
// Reload module list to ensure that test module hooks aren't called after tests.module_list(TRUE);
Should that be using the new module_implements(MODULE_IMPLEMENTS_CLEAR_CACHE) instead, or as well?
#43
updated patch to keep up with HEAD. all tests pass.
@John Morahan - not sure about your question.
#44
simpler version, merged
_module_implements_buildand_module_implements_checkand simplifiedmodule_implements.all tests pass. setting back to code needs review to get some more eyes on the changes.
#45
follow up comment following a request from chx to clarify what i changed.
i've modified the code so that for a cache miss for a given hook, we just call one function,
_module_implements_build._module_implements_buildpulls the hooks from the db, and loads the file for the hook into memory if its not already loaded.this means we don't need the
$loadedstatic or_module_implements_checkhelper.so, just a simplification really. returning the favour for once, as chx usually simplifies my code ;-).
#46
Justin, that won't fly, the reason I had the loaded thing is that it's possible that the implementations are cached but the include file containing the hook is not loaded.
#47
Review of chx patch from #37:
<?php
@@ -1361,7 +1361,7 @@ function _element_info($type, $refresh =
}
}
- return $cache[$type];
+ return isset($cache[$type]) ? $cache[$type] : array();
}
?>
What is the point of this (is that scope-creep)?
<?php+function module_implements($hook, $sort = FALSE) {
?>
The calling pattern of this function looks unecessary complicated. Why not replacing $sort with a bit mask $mode with MODULE_IMPLEMENTS_CLEAR_CACHE, MODULE_IMPLEMENTS_WRITE_CACHE and MODULE_IMPLEMENTS_SORT?
<?php+ // Though drupal_function_exists itself checks function_exists,
+ // most of the time the function will exist because of the per router path
+ // hook cache so we can save lots of calls to drupal_function_exists.
+ if (!function_exists($function)) {
+ drupal_function_exists($function);
}
?>
What is the use of forcing the load of the function here?
I don't see the point of moving the whole _module_implements_check() in a separate function (it is called only once), it could be moved back to module_implements().
<?php+ // We need to load the relevant file for this function.
+ drupal_function_exists($function->name);
?>
In that case, this code should be removed.
<?php+function _module_implements_sort($implementations, $hook) {
+ static $sorted = array(), $cache = array();
?>
Here we are storing cached implementation in $cache just for the purpose of clearing those statics. I suggest moving those statics to module_implements().
#54
Moving to separate functions IMO increased the readibility.
#55
Re. form.inc , it can be omitted if you want, it only caused problems for sun because he applied the patch into an already installed Drupal. I removed six comments attempting to derail the issue over such a minute detail.
Both to Damien and Justin,
<?phpforeach (module_implements('foo') as $module) {
$function = $module .'_foo';
$function();
}
?>
module_implementsmust ensure that the function you are likely to be calling in the near future is actually loaded.About calling patterns, as you wish, really. I hope I can get back Justin in the issue and agree whether #43 or #44 is to be continued.
#56
chx: #43. i missed the 'if its loaded from cache the implementations might not be in memory' part, so #44 is flawed.
#57
We agreed with chx on the IRC that we could remove both _module_implements_check and _module_implements_build and factor them inside module_implements while keeping high readability.
This patch still misses two key parts:
- (1) the default ordering of hooks by weight and module name
- (2) a $sort parameter for _module_implements_maintenance
#58
Here we come. I removed the form.inc patch so kindly please apply this and install after.
#59
Now, registry.test needed an update. Also, if you notice the 68 for the suffix length, well MySQL did not allow longer otherwise the index creation would fail. We can make the weight a smallint to win a few bytes and juggle with the module and the suffix but meh -- when did you use a hook longer than 68 characters?
#60
+ * By default, modules are ordered by weight and filename, settings this+ * option to TRUE, module list will be ordered by module name.
As in #38, I think "settings" should be "setting".
#61
- acked #60
- replaced $cache (the whole previous cache) to $cached_hooks (the count of previously cached implementations), to save memory
I think it would be better if we limit module names to 128 chars. That way, hooks could be 126 characters (because module + '_' + hook = 255, because function names are stored in VARCHAR(255)), and we would be better off here.
#62
I agree with Damien though PHP reference counting makes his memory argument moot (unless there is a new hook called) -- instead it's a performance improvement, and as Zend hashtables store their count storing the count in a variable is a rather speedy operation.
#63
Re-rolled,
- minor PHPdoc fixes for module_implements and _registry_parse_file.
- changed $hook === MODULE_IMPLEMENTS_CLEAR_CACHE tests to be type-agnostic, so invoking module_implements('', ...) does not result in a cleared cache.
#64
is this safe?
<?phpif ($hook === MODULE_IMPLEMENTS_WRITE_CACHE) {
// Only write this to cache if we loaded new implementations.
if (count($implementations) > $cached_hooks) {
cache_set('hooks', $implementations, 'cache_registry');
}
return;
}
?>
what if we have a one module enabled and an another module disabled, and the count of hooks stays the same, but the list if different?
#65
@justinrandell: You have to understand that this function does a progressive caching of the hook. Implementations for a given hook are cached the first time the hook is called. That's the reason why we check for the number of implementations here. Module enabling/disabling is a completely different story. In that case, we simply empty the full cache of hook implementations.
#66
@Damien: yep, my mistake.
#67
watching the registry queries is much easier with the new db logging code :-)
with the patch at #63 we cut out a lot, but for a logged in user, all core modules enabled, loading node/1 with one comment, we're still at 12 registry queries with a fully loaded cache.
if we cache registry misses, this drops down to 5.
if people aren't to allergic to the idea, i'll roll a patch, else i can wait for a follow up issue.
#68
@justinrandell thanks for filing a separate issue at http://drupal.org/node/325665 ;)
#69
ok, lets do the registry misses caching in a follow up.
here's an updated patch, with
drupal_function_existscalling_registry_check_code('function', $function);. otherwise, its the same as at #63.all tests pass.
now, how do we get Dries to weigh in on this issue?
#70
attached patch this time.
#71
@justin: I really don't like those changes added a the last minute.
Your change broke the negative caching in drupal_function_exists(), because _registry_check_code() returns NULL on miss, not FALSE.
#72
@Damien: good catch on NULL vs FALSE, attached patch fixes that,
_registry_check_codereturn FALSE on a miss.all tests pass.
#73
Opened #325994: Reduce system.name to 128 chars for the suggested change to the module name length.
#74
I tried to benchmark using this patch, to see if it speeds up things, using an existing Drupal 7 site (from last Friday or so), but I get the following error when trying to access any page:
And there are no pending updates in update.php.
So, I emptied the database, loaded the Drupal 6 database, and tried to run update.php, but I get another error and can't proceed:
The error page linked is not much better, very vague:
So, this broke update.php.
Backing out the patch makes update.php works again.
I tried creating an system_update_7012() for the new fields and indexes for the registry table, but that caused update.php to white screen (probably a PDO error ...)
#75
just some simple data points of HEAD vs HEAD + this patch.
using ab -n 1000 -c 10 http:///node/1
APC on, page and block caching off, anonymous user, all core modules enabled, one comment on node/1:
HEAD
Requests per second: 27.88 [#/sec] (mean)
Requests per second: 27.74 [#/sec] (mean)
Requests per second: 27.59 [#/sec] (mean)
patched
Requests per second: 30.62 [#/sec] (mean)
Requests per second: 30.50 [#/sec] (mean)
Requests per second: 30.69 [#/sec] (mean)
#76
justin: how does that compare to a Drupal 6 site? Would be useful, but not critical, to know.
#77
- Overall this looks like a good API clean-up. The only part that smells a bit is the maintenance version of module_implements. The documentation of that falls short: "This is the maintenance version of module_implements." Please add some more documentation description why this function is important/needed, and when it should (or should not) be called. By adding some more comments, we might be able to increase the comfort level of having such a function.
-
+ $suffix = '';+ if ($type == 'function' && !empty($module)) {
+ $n = strlen($module);
+ if (substr($resource_name, 0, $n) == $module) {
+ $suffix = substr($resource_name, $n + 1);
+ }
+ }
- Typo:
It'the name.Looks good -- it is a matter of adding a bit more documentation so we can make sense of the maintenance version.
#78
@Dries & Justin
I was trying to exactly do that: compare performance of before/after the patch and to Drupal 6
I wanted to keep all factors the same by upgrading the database from D6 to D7.
See my comment in 74.
If we can get over this, I can do more thorough performance analysis.
#79
@kbahey, there are issues open for update.php please do not derail this one, i will add more docs and add an underscore to the maintenance version as you should never call it.
#80
@chx
Maybe I was not clear.
Without the patch, I can update a site fine.
With the patch, I cannot update a site at all.
So the patch introduced something that causes failure. So was pointing this out.
I am fine that the update be fixed in a separate issue, just wanted to point this out in case it can be easily fixed.
#81
Wait a second, I already had an underscore there. Ah well. It needed a reroll anyways and added "for internal use only".
#82
@Dries, repeated with DRUPAL_6 vs HEAD vs HEAD + this patch
using ab -n 1000 -c 10 http://site/node/1
APC on, page and block caching off, anonymous user, all core modules enabled, one comment on node/1:
DRUPAL_6
Requests per second: 37.50 [#/sec] (mean)
Requests per second: 37.60 [#/sec] (mean)
Requests per second: 37.56 [#/sec] (mean)
HEAD
Requests per second: 27.58 [#/sec] (mean)
Requests per second: 27.63 [#/sec] (mean)
Requests per second: 27.60 [#/sec] (mean)
HEAD + patch
Requests per second: 30.64 [#/sec] (mean)
Requests per second: 30.58 [#/sec] (mean)
Requests per second: 30.66 [#/sec] (mean)
APC off, page and block caching off, anonymous user, all core modules enabled, one comment on node/1:
DRUPAL_6
Requests per second: 9.42 [#/sec] (mean)
Requests per second: 9.44 [#/sec] (mean)
Requests per second: 9.45 [#/sec] (mean)
HEAD
Requests per second: 7.96 [#/sec] (mean)
Requests per second: 7.91 [#/sec] (mean)
Requests per second: 7.93 [#/sec] (mean)
HEAD + patch
Requests per second: 8.34 [#/sec] (mean)
Requests per second: 8.36 [#/sec] (mean)
Requests per second: 8.34 [#/sec] (mean)
note, this is not an upgrade of the 6 install to 7, its two clean installs, enable all modules, create one node and add a comment to the node. clearly HEAD with or without the patch is slower, inline with the results from khalid.
#83
rerolled to keep up with HEAD, and added a couple of lines of comments to the hook gathering code in the registry rebuild cycle as per Dries comments in #77.
so, no code changes, and all tests still pass, so RTBC.
#84
ahem, RTBC.
#85
_module_implements_maintenance should explain how it is different from the regular _module_implements and why it is a safe path. Almost there ...
#86
It's safe because it does not use the DB. Explained. In detail.
#87
As Dries pointed out in #77:
+ 'description' => t("The part of the name after the module. It'the name of the hook this function implements, if any."),And, as in #38 and #60,
+ * By default, modules are ordered by weight and filename, settings this+ * option to TRUE, module list will be ordered by module name.
there's something odd with "settings" here.
But, these are both minor issues; I'll roll a patch if someone doesn't beat me to it.
#88
geez, you'd think people around here actually cared about code quality or something :-)
updated patch attached to address typo's in #87.
#89
Committed to CVS HEAD. Thanks guys.
#90
Automatically closed -- issue fixed for two weeks with no activity.