Problem/Motivation
there seem to be several cases where the module_implements() cache is polluted with incomplete info:
- any hook being invoked from within a hook_boot() -- as described by Jose Reyero in #9;
- entity_get_info() being called from within a hook_hook_info() -- as described in #1415278: Using entity_get_info in a hook_hook_info results in an incomplete module_implements cache.
- a watchdog call during module_load_all() as described in #67;
Proposed resolution
clear the static cache at the end of the full bootstrap and prevent the persistent cache from being written to prior to full bootstrap completion
Remaining tasks
N/A
User interface changes
N/A
API changes
N/A
Data model changes
N/A
And there are different opinions on how to solve it:
- Just document "you can't do this"
- #578044: move module_invoke_all() out of the reach of hook_boot() (though I'm not sure if this fixes the 2nd point)
- Throwing an exception when you do this (#10 / Damien Tournoud)
- previous + find a way then to be able to add hooks to the list of bootstrap hooks (#11 / Jose Reyero)
In the meantime, the patch uploaded in #14 fixes just the caching of incomplete info.
Original issue report
I can't understand the logic of module_implements() function.
Look on definition:
/**
* @param $hook
* The name of the hook (e.g. "help" or "menu"). */
...
function module_implements($hook, $sort = FALSE, $refresh = FALSE) {
...
if (!isset($implementations[$hook])) {
$implementations[$hook] = array();
$list = module_list(FALSE, TRUE, $sort); // second param is $bootstrap = TRUE
foreach ($list as $module) {
if (module_hook($module, $hook)) {
$implementations[$hook][] = $module;
}
}
}
/**
...
* @param $bootstrap
* Whether to return the reduced set of modules loaded in "bootstrap mode"
* for cached pages. See bootstrap.inc. */
...
function module_list($refresh = FALSE, $bootstrap = TRUE, $sort = FALSE, $fixed_list = NULL) {
...
if ($bootstrap) {
$result = db_query("SELECT name, filename, throttle FROM {system} WHERE type = 'module' AND status = 1 AND bootstrap = 1 ORDER BY weight ASC, filename ASC");
}
...
/**
* Define the critical hooks that force modules to always be loaded.
*/
function bootstrap_hooks() {
return array('boot', 'exit');
}
PROBLEM:
1. I'm calling i.e. module_implements('help') expecting list of all modules that implement that hook
2. module_implements calling module_list with $bootstrap param = TRUE?
$list = module_list(FALSE, TRUE, $sort); // second param is $bootstrap = TRUE
3. So instead of receiving all modules with hook_help, I'm getting modules with this hook only which are registered as $bootstrap only
It's also related to: #222109: module_list() should be rewritten as 2 functions: module_list() and module_list_bootstrap()
Comment | File | Size | Author |
---|---|---|---|
#124 | 496170-124.patch | 1.24 KB | marco |
#120 | 496170-120.patch | 9.12 KB | greenSkin |
#104 | 496170-70.patch | 1.77 KB | stefan.r |
#94 | 496170-94.patch | 10.65 KB | David_Rothstein |
#79 | 496170-79.patch | 7.27 KB | stefan.r |
Comments
Comment #1
kenorb CreditAttribution: kenorb commentedIt causing in same cases WSODs and wrong rendered pages.
Read more: #496198: module_list() called with wrong arguments causing WSOD and breaking theme_registry
Comment #2
Heine CreditAttribution: Heine commentedWhen are you calling module_implements? If a full bootstrap finishes, module_list will have been called with refresh TRUE, bootstrap FALSE, and module_list will serve items from the static cache (with all modules).
Comment #3
Damien Tournoud CreditAttribution: Damien Tournoud commentedThe logic is strange and convoluted (as it relies on module_list() ignoring the bootstrap parameter as soon as Drupal is fully bootstrapped), but it does work. Let's fix that in D7 first.
Comment #4
kenorb CreditAttribution: kenorb commented#2
In case when modules not touch module_list() yes, it's working.
Function is in API, so it can be used:
http://api.drupal.org/api/function/module_list
(used by CCK, admin_menu, etc.)
Documentation don't mention anything that you will get invalid list (bootstrap list) when you request for non-bootstrap list.
And it have different behaviour during boot and init mode.
Even if you calling:
module_list(FALSE, TRUE, $sort);
you could get different lists, one time bootstrap, another time non-bootstrap.
It could be called and rebuild during shutdown (when static variables are cleared), when some module want to invoke some hook (module_invoke_all) or even from error handler (hook_watchdog).
When it will be rebuild, it will have bootstrap list, instead of full list of modules.
Comment #5
kenorb CreditAttribution: kenorb commented#2
module_implements() is called each time when watchdog is called, and watchdog could be called anytime from error handler.
Comment #6
kenorb CreditAttribution: kenorb commentedIn 6.x the fix should be simple:
instead of:
Because module_implements() should checking all hooks, not only bootstrap_hooks and this function could called any time (#5).
Comment #7
Heine CreditAttribution: Heine commentedAs to #5, calling a hook in a module that _has not been loaded_ will not work. theme() use in watchdog listeners is wrong. A possible solution is to load modules implementing hook_watchdog during the bootstrap.
Comment #8
Damien Tournoud CreditAttribution: Damien Tournoud commentedmodule_implements()
rely on the fact thatmodule_list(FALSE, TRUE, $sort)
will return:Obviously this is not very elegant, but I don't see any bug or bad logic here. This function has been fully refactored in Drupal 7, so I'll simply won't fix this.
Comment #9
Jose Reyero CreditAttribution: Jose Reyero commentedI'm sorry but I think this needs to be reopened.
The problem is that when any drupal hook is invoked from any module's hook_boot(), the results for module_implements('module_implements_alter') are cached into the database. At this time, a module that implements the hook and is not yet loaded will never see it being called.
Since this 'module_implements' cache can be rebuilt on random pages this is extremely difficult to trace as sometimes the module may be already loaded, sometimes not.
Step by step:
0. Module invokes any hook from hook_boot().
1. module_implements() is called.
2. If the list of implementations for that hook is not yet cached, then it is rebuilt, which ends up in calling drupal_alter('module_implements', $implementations[$hook], $hook);
3. This causes a call again to module_implements, to find out which modules implement hook_module_implements_alter()
4. This list of implementations is saved to the database at the end of the request.
This could be fixed either by adding 'module_implements_alter' to the list of bootstrap hooks or by not caching $implementations['module_implements_alter'], since this is rarely invoked.
As an example, see #1267966: Pathauto patterns are not showing up
There are also other issues with this function, like the list of implementations for a hook depending on whether a it has been called from bootstrap first or not, which ends up on weird issues with random modules that are very difficult to track for module maintainers, since they are always caused by other modules calling hooks when they shouldn't.
Also these issues happen randomly, depending sometimes on whether the cache has been rebuilt on a page with full bootstrap and module init or not.
See also #1415278: Using entity_get_info in a hook_hook_info results in an incomplete module_implements cache.
Comment #10
Damien Tournoud CreditAttribution: Damien Tournoud commentedThis is not supported, period (and in addition it's not going to work). The rest describes how to protect against something that is not supported.
We should just throw an exception when you try to call a non-boot hook when Drupal is not fully bootstrapped.
Comment #11
Jose Reyero CreditAttribution: Jose Reyero commentedI basically agree, though we should find a way then to be able to add hooks to the list of bootstrap hooks.
The problem with this 'not supported' behavior is that it works sometimes. And then if a module does the wrong thing, it ends up in someone's else issue tracker causing major maintenance headaches. So I agree 100% on either supporting it somehow or throwing an exception. But saying it's not supported and allowing it to work is really bad for module maintenance.
Also there's no mention of that not being supported on the documentation and since all module.inc functions are available on hook_boot()...:
http://api.drupal.org/api/drupal/modules!system!system.api.php/function/...
(Same for module_invoke_all(), module_implements(), etc.. )
Comment #12
Dave ReidSeems like we should make bootstrap_hooks() accept a variable instead?
Comment #13
Anonymous (not verified) CreditAttribution: Anonymous commentedI think I am experiencing this bug when devel module and jquery_update are enabled on a site with core page caching enabled.
Both devel and jquery_update modules implement hook_library_alter, but only devel has a bootstrap hook (hook_boot). Sporadically, the site will revert back to core jQuery, which breaks scripts on the site that depend on the later version of jQuery.
I discovered this by waiting until the site exhibited this problem and displaying the output of module_implements('library_alter') in devel's PHP execution page. Only devel module was listed, even though jquery_update is enabled.
The workaround for it is to implement hook_boot in jquery_update OR to implement hook_boot in another module and load the jquery_update module on every page request.
Comment #14
roderikSo there seem to be two cases where the module_implements() cache is polluted with incomplete results:
Related to the 2nd point: #968264: hook_hook_info() can lead to an infinite loop
And there are different opinions on how to solve it:
I completely agree with Jose Reyero (#11) that something should be done about it -- more than just documenting it. Because even if it's not officially allowed, contrib modules are doing this - causing very hard to trace problems in unrelated modules.
But I don't really understand how adding a hook to bootstrap_hooks() would solve things. (I don't see that function being called from anywhere useful, to prevent cache pollution.)
In the meantime, I don't want to have to rewrite other people's hook_boot() implementations until I know where this issue is going... and besides: even though throwing an exception might be the correct thing to do, in practice it will break unsuspecting live D7 sites with contrib modules that have 'officially wrong' behavior that happens to not cause any problems.
I just want my cache to not become polluted. The attached patch fixes this.
I'm aware that this is not a 100% theoretically correct solution - because in e.g. calls to module_implements() from a hook_boot(), incomplete info is returned. But at least this incomplete info does not get cached. My specific problem gets fixed by it. (Namely: a PHP notice that is thrown every time a certain page is rendered, because module_implements() returns wrong results forever.)
Please consider this.
Comment #15
roderik.
Comment #16
Jose Reyero CreditAttribution: Jose Reyero commented@roderik,
If the hook is in bootstrap_hooks(), then modules implementing that hook are loaded for bootstrap, then problem solved, well, kind of...
I don't think not caching the results of module_implements() really fixes the problem either. Because then invoking a hook may have very different results depending on the bootstrap phase on which it is invoked, which creates another issue if the resulting values of that hook are cached before some module implementing them is loaded (An example is hook_entity_info()..)
So I think the good solution should involve a few steps:
1. Make bootstrap_hooks() accept a variable, as @Dave Reid suggests in #12
2. Modules providing bootstrap hooks updating that variable when installed / uninstalled.
3. Throwing an exception when invoking a non bootstrap hook from bootstrap.
Comment #16.0
Jose Reyero CreditAttribution: Jose Reyero commentedAdded summary for current state.
Comment #17
BrockBoland CreditAttribution: BrockBoland commentedHas anyone identified any other situations that cause this bug? I reviewed all our code and didn't find any cases
of the hook_boot() or entity_get_info() problems mentioned in the issue summary.
Part of the problem—and the reason that I haven't tried the patch from #14 yet—is that we haven't been able to replicate this bug consistently, so it would be hard to say whether the patch actually fixed anything. I'm reluctant to patch code so deep in core without really understanding what is happening and why.
Comment #18
roderikFWIW: Another situation: running code in .module files (outside of functions), that call any hook. More detail in #2066205-7: "Error: Unknown condition" in Rule for custom inline condition
@Jose Reyero: Thank you for your very quick response. (I couldn't parse the info immediately. I want to set aside time to get back to this, I do!)
Comment #19
Fabianx CreditAttribution: Fabianx commentedHere is a different patch to solve this problem for the hook_boot() part:
- It uses two different static caches for the hook_boot() problem, so that an incomplete cache can never leak into the system
- That way caches are never written from the wrong implementation
- Both statics are populated with the cache_bootstrap cached implementation
(that is why this bug is seldomly reproducable, it only happens when cache_bootstrap is empty during bootstrap, e.g. after variable_set.
Performance for non-boot users
:
- one if - clause
- one function call to drupal_get_bootstrap_phase().
should be acceptable.
Lets see if this passes tests, also this is major if not critical as watchdog(), url(), drupal_goto() all pollute module_implements() cache and you can't know this from your innocent hook_boot() implementation.
Comment #20
Fabianx CreditAttribution: Fabianx commentedComment #21
Fabianx CreditAttribution: Fabianx commentedComment #22
Fabianx CreditAttribution: Fabianx commentedThat patch as innocent as it looks actually breaks the module_implements cache due to unset() on a static variable putting a new variable in scope, which was then used for $implementations, but only in the first call. The others used the hook_boot() one, leading to a mostly empty $module_implements cache entry ...
Here is a fixed patch that also makes the logic more easy to understand.
EDIT: Interdiff would have been something like:
Comment #23
jbrauer CreditAttribution: jbrauer commentedTwo very minor code style nits updated in this version of the patch no functional changes.
Comment #24
jbrauer CreditAttribution: jbrauer commentedUgh and of course the wrong patch file... Once again this time with the right patch file this is just a couple minor code style items.
Comment #26
chx CreditAttribution: chx commentedWe fixed #667098: drupal_get_bootstrap_phase() is broken so we can use drupal_get_bootstrap_phase, I changed that. I also do not understand the hunk with "future static" , why is that necessary / useful?
Comment #27
chx CreditAttribution: chx commentedLet's try this again.
Comment #29
Fabianx CreditAttribution: Fabianx commented#26:
Yes, once #667098: drupal_get_bootstrap_phase() is broken is not only RTBC, but actually fixed we can start using drupal_get_bootstrap_phase(), but can we do it in a follow-up to replace all of core's drupal_bootstrap(NULL, FALSE) with the right code? There is a lot!
- The future static is for the following scenario:
Given: Cache is warm and 'cache_bootstrap:module_implements' is fully populated
If we reach this function during bootstrap, we will do:
BUT:
This implementations is pointing to the normal static used for bootstrap
and is not the final &drupal_static('module_implements')
Here we then check if we don't have full bootstrap.
and populate the final $implementations - what implementations will be when we reach the final bootstrap.
This is an optimization to avoid two cache_get()'s - if the cache is warm. A cache_get() is cheap, so I am okay to also remove this, if there is a concern.
In case it is cold, two cache_get()s are done, which is also good as another request might have populated the cache in the mean time.
Comment #31
chx CreditAttribution: chx commented#24 is good to go then.
Comment #32
joelpittetI've rolled back this patch on a production server because I was having some issues with module_implements cache bin item balooning out of control and taking down the site and/or making memcached/redis fail.
I'm not setting this back to needs work yet, but may if that rollback turns out to solve this issue. For now though it has...
Comment #33
Fabianx CreditAttribution: Fabianx commented#32: Appreciate the feedback, but this does not change anything for the actual stored module implements cache, but just ensures it works properly. From hook_init() on the behavior is completely like without the patch.
I assume your problem is that the cache item grows too large, which might happen with too many modules. If rolling this back "fixes" it, then in fact your module implements cache is broken.
Comment #34
joelpittet@Fabianx, I came to the same conclusion and was going to write to that this patch isn't necessarily the problem as well.
I think what this patch does is expose another issue that is deeper. The patch itself looks harmless, but so far rolling it back has produced 0 module_implement cache bin corruptions. I'll likely need to leave it for a couple of days to be sure and I will definitely report back if that error surfaces again. So far 12 hours of 0 issues. The big combo seems to be cron job + a small burst of traffic from what I can tell.
The cache item shouldn't grow out of control for 'module_implements'. I'd expect once those implementations are found and accounted for it would stay at that size after that even if re-built. I've cleared, truncated and even switched bins for cache_bootstrap between redis/memcache/database. All had the same corrupt/baloon problem.
I do have a lot of modules, it's a commerce site with features: ~350 active modules.
Comment #35
David_Rothstein CreditAttribution: David_Rothstein commentedI think the patch looks good and really would like to get it in, but given the last few comments I think I should wait for more info to come back before committing this one. (It could be that the patch is correct, but exposes some other more serious problem. If so, it would be good to understand exactly why.) Sorry, this one just has enough potential to break things, given the code that it touches, that I think we need to be extra careful with it...
In the meantime:
Later in the function the code which resets static caches does a lot more, such as drupal_static_reset('module_hook_info') also. Maybe this should explain why drupal_alter is the only one that matters here (I think the others aren't relevant because they either aren't used before bootstrap or are used but run no risk of being corrupted?).
Isn't a key point here not just that it doesn't need a drupal_static(), but that it must not use the same one that the function normally uses because that's the one module_implements_write_cache() looks at when it's writing, but we never want the pre-bootstrap data to be written?
I had trouble understanding this comment also - something about "future static" and "full cache" is confusing. I think what you're trying to say is that because only data from the full boostrap is ever written to the cache, it is always safe to use data that comes from that cache throughout the remainder of the page request, so we can put it in the static variable that is used for the remainder of the (post-full-bootstrap) page request?
Comment #36
Fabianx CreditAttribution: Fabianx commentedThank you very much for the very careful review.
I agree with all the points.
Comment #37
stockliasteroid CreditAttribution: stockliasteroid commentedWe applied this patch on a site after experiencing the issues mentioned in #1934192: hooks disappear from module_implements cache, hooks disappearing from module_implements. After running this patch on that same site for about 24 hours the cache ballooned out of control and the machine ran out of memory. As mentioned in #33, this patch may not actually be causing that issue, but is instead causing an underlying problem manifested via #1934192 to cause the cache to grow out of control.
This site has far fewer modules, around 73 contrib. Without the patch each FPM process runs consistently with about 80MB of memory, with the patch each process eventually ends up pegging at over 200MB.
Just a bit of anecdotal evidence, we've still had no luck diagnosing the missing module_implements hook issue, but in the meantime we've reverted this patch.
Comment #38
Fabianx CreditAttribution: Fabianx commented#37: Interesting, what do you mean with: ballooned out of control?
Any numbers / dumps of what the cache included? Which cache item ran out of control? Can you reproduce the OOM on a dev machine or only on prod?
We run this in production on several sites, so it is a high priority for me to find any issues.
There is a trick you can use to find the problematic module_implements hook:
If you have trigger error setup correctly on prod, this is even safe for production usage and will throw a fatal on dev environments.
Comment #39
stefan.r CreditAttribution: stefan.r commentedComment #40
stefan.r CreditAttribution: stefan.r commented@Fabianx just out of interest, which patch do you run, #24? And this is just to protect against faulty contrib modules?
Comment #41
stefan.r CreditAttribution: stefan.r commentedComment #42
Web-BeestI've stumbled upon this issue while coming across this bug whilst implementing subfolders_domain (Issue https://www.drupal.org/node/2489108).
Patch #24 resolves the issue in my case.
Comment #43
stefan.r CreditAttribution: stefan.r commentedAs mentioned by @Fabianx in #19 this is possibly critical, when only a limited set of hooks run this can lead to strange behavior including data loss.
Comment #44
stefan.r CreditAttribution: stefan.r commentedIf the problem is that the static cache for
drupal_alter()
andmodule_implements()
can be different both before and after a full bootstrap (and shouldn't be called before a full bootstrap in the first place) maybe something like this could work?- A trigger_error() if module_implements() is somehow called anyway
- Clearing the static cache for those functions before moving on to full bootstrap, just in case their statically cached values were based on the (reduced) list of "vital" modules only.
Comment #46
stefan.r CreditAttribution: stefan.r commentedLet's see what happens if we put the trigger_error() in module_invoke_all() instead
Comment #48
stefan.r CreditAttribution: stefan.r commented...with an exception for the installer
Comment #50
stefan.r CreditAttribution: stefan.r commentedand update.php
Comment #51
stefan.r CreditAttribution: stefan.r commentedBack to module_implements() to cover all possible cases.
Comment #53
stefan.r CreditAttribution: stefan.r commentedBack to #50 for now as apparently core itself isn't too strict either about using module_implements() before full bootstrap.
Comment #54
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedHowever its either #51 or #24 as especially the module_implements pollution is the problem.
And even clearing all static caches can break, because drupal_exit() could be called before full bootstrap.
However if you do:
- Remove the trigger_error - that won't fly for BC reasons
- Clear the static cache when we reach full bootstrap
- Ensure that the module_implements cache is never persisted to disk before full bootstrap
Then I think that could fly and is simpler than #24.
Comment #55
stefan.r CreditAttribution: stefan.r commentedThe persisting only happens in drupal_page_footer() (which is only called during full bootstrap), not in hook_exit(), so maybe the extra check isn't needed?
Comment #56
stefan.r CreditAttribution: stefan.r commentedBut just in case it is needed, here's a revised patch :)
Comment #58
stefan.r CreditAttribution: stefan.r commentedComment #59
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedCan we get a test for that, please?
I am also okay without a test, but if we could it would be great.
Patch looks nice to me :).
Comment #60
stefan.r CreditAttribution: stefan.r commentedComment #63
stefan.r CreditAttribution: stefan.r commentedPrevious patches were missing .info files
Comment #65
stefan.r CreditAttribution: stefan.r commentedComment #66
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedRTBC, there are more issues related to this:
- When the result of an incomplete module_implements() is cached in e.g. entity_get_info() or the theme registry, but those should be follow-ups.
This patch is functionally equivalent to #24, but way way simpler. Great work!
Lets get this in.
Comment #67
rwohlebI believe I found an additional edge case that needs a follow-up. If you have code that generates a PHP warning (such as "Call-time pass-by-reference has been deprecated") when the module file is first loaded, it can trigger a watchdog entry, which uses module_implements. The way the bootstrap system works, it has already marked it as a full bootstrap before it actually finishes doing the full bootstrap. As such, when the watchdog entry triggers module_implements early into the run of module_load_all, it thinks all modules are available even though some/many haven't been loaded yet. Tests against function_exists() fail and the corresponding hook implementations are left out.
The upside is that many hooks will remain unaffected since they won't get called until later, but hooks like "module_implements_alter" end up getting cached way too soon. You can end up with things like the token module no longer providing field tokens due to it's reliance on "module_implements_alter", and the hair-pulling experience of trying to track down why. To make it more frustrating, PHP warnings that are generated are dependent on how you have your error reporting level set, so replicating it can be frustrating.
I'm attaching a picture of the callstack for reference.
Comment #68
stefan.r CreditAttribution: stefan.r commented@rwohleb wouldnt that issue be solved by the patch in this issue?
Comment #69
rwohlebUnfortunately the patch does not fix this since it resets the cache right before the module_load_all().
Maybe if the patch was modified to do the reset after module_load_all(), but I haven't looked at the consequences of that.
Comment #70
stefan.r CreditAttribution: stefan.r commentedAs far as I can see, clearing the cache after module_load_all() ought to be fine as well -- only in the mentioned edge case should module_implements() be loaded anyway during the call to module_load_all().
Comment #71
rwohlebI believe the patch in #70 to be sufficient, but looking at this again I'm wondering if just moving the static resets will be enough. If the error generated during module_load_all() is a fatal error, it will display the error immediately through the maintenance theme and exit. I don't believe that in this situation drupal_page_footer() ends up getting called, but if it does, then module_implements_write_cache() will be called. At this point it will write the bad cache since it believes there is a full bootstrap. Correct me if I'm wrong.
Comment #72
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commented#71: Yes, that is a generic problem with our state machine, that a phase is considered to be active when its entered, not when its successful ...
However: A fatal error would indeed exit early, so we should be okay there. (see _drupal_log_error)
-- That race is tricky for sure, but as it is an edge case including an error condition, I still think doing it afterwards is okay.
Should probably open a follow-up to harden module_load_all() though ...
Comment #73
stefan.r CreditAttribution: stefan.r commented#71: drupal_page_footer() is not called in such case, so it should be fine
Comment #74
pounardIgnore this, the most recent patch actually should solve most problems here.
Comment #75
stefan.r CreditAttribution: stefan.r commentedUpdating issue title to reflect #67 as I just ran into the same problem on a site with an early watchdog call during the module_load_all() in drupal_bootstrap_full() causing this exact problem.
Hoping we can get this into the next bugfix release...
Comment #76
stefan.r CreditAttribution: stefan.r commentedComment #77
pounard+1 for seeing this in the next bugfix release, I experience this bug from time to time and it's always very hard to debug.
Comment #78
catchWhen module_implements() is called early, and there's a cache hit on the module implements cache, then it looks to me like always clearing the drupal_static is going to result in a second cache_get(). Might need to store the results of that cache get separately, so that it's not fetched twice.
Comment #79
stefan.r CreditAttribution: stefan.r commentedComment #80
bember CreditAttribution: bember commentedI have a related issue with Sub-pathauto module, which sometimes has its hook_url_inbound_alter() implementation ignored.
#79 fixes the problem.
Comment #81
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedReviewed the changes and this looks great to me!
Comment #82
DrColossos CreditAttribution: DrColossos commentedAny updates on this? Will we see it in D7 soon due to its RTBC status?
Comment #83
stefan.r CreditAttribution: stefan.r commentedHopefully, just waiting for a new bugfix release now
Comment #84
pounardI'm afraid the latest version adds another layer of useless caching
drupal_static()
in the newly created function and will break some modules directly emptying this static cache entry ofmodule_implements
by calling thedrupal_static_reset('module_implements')
manually. I know this is not supposed to be public API but it is probably used this way (I have to admit that sometime, for performance reasons, I do it myself).If you keep the
module_implements
static cache then it's useless to keep another one in the_module_implements_get_implementations()
function, it's the same job twice, you probably should remove one of the statics and call the remaining onemodule_implements
to keep full compatibility.In most cases:
This attempt to solve problems for buggy modules at the cost of complexity, I think that the buggy modules should be fixed instead.
Comment #85
stefan.r CreditAttribution: stefan.r commentedLet's see if @Fabianx can have a look at this and see what we should do here. I do wonder if it's only buggy modules causing early module_implements calls -- I believe core does this as well in come cases.
Comment #86
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commented#84: I am afraid you will have to live with the static cache being cleared as bootstrap is a completely different phase.
My more complex solution also would have implicitly thrown away the static cache used during bootstrap.
As for the static cache, that is something catch asked for and I am indifferent, so back to RTBC for now, but assigned to catch.
Comment #87
catchI think we should avoid the double cache_get() - prefer the extra static caching to that.
However if people prefer to have that, don't mind that much - just wanted to point out that the new patch was functionally different (and a regression in that respect) to the original one on this issue.
@pounard yes I also think it's a bug that 7.x modules invoke non-bootstrap hooks from bootstrap hooks, but unfortunately there are dozens of modules doing that. In 8.x this is effectively impossible because we removed hook_boot()/exit() so a workaround for the actually existing situation in 7.x seems OK. This is extremely hard to debug so there could be a lot of sites running into it without realising the cause.
Since we're detecting the error condition here, we could trigger_error()/watchdog() something to notify people this is happening, but that'd be up to David really.
Comment #88
stefan.r CreditAttribution: stefan.r commentedSo David_Rothstein will have 2 patches to pick from, #70 or #79. If it's the one with the new static, let's mention it in the release notes just in case.
The trigger_error() was already proposed earlier in #44, I think it was decided against.
Comment #89
pounardI'm just thinking about something, but hook_watchdog() could be defined as a bootstrap hook, it would solve, at least, the problem of watchdog() calls during bootstrap. Even thought it'd not fix other buggy cases.
Comment #90
stefan.r CreditAttribution: stefan.r commentedAh yes that makes sense, care to roll a patch?
Comment #91
pounardMaybe this worth a new issue ?
Comment #92
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commented#91: Yes, please.
Comment #93
rwohlebI vote for #70 as it's the simplest implementation without changing expectations of how the cacheing works. A double cache_get won't kill anyone. Further improvements and warning messages could be implemented later, but with D8 on the horizon I'm not convinced it's worth while. The patch from #70 gets the job done as minimally as possible.
Comment #94
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedNot sure I'm qualified to decide, but the criticisms of both approaches (#70 and #79) actually seem valid to me :)
There's also a fair amount of "action at a distance going" on here, which makes the patches a bit scarier. A solution that is fully self-contained (within the problematic functions) would feel better.
And there's no guarantee that clearing the static caches at this point in the bootstrap fully solves the problem, right? (If contrib modules are doing crazy enough things in hook_boot() to trigger this bug, then it wouldn't surprise me at all if some of them are loading non-bootstrap modules early also.)
I'm also concerned that when the earlier patch in this issue was tested on production sites, multiple people reported that it was causing major problems (e.g. #32 and #37), and we still don't seem to fully understand why that happened... I am not sure if the newer patches have had that level of testing or not.
I looked at this for a while and began to wonder if we couldn't fix this in a conceptually simpler way - just by refusing to write to either the static or persistent cache until after all modules have been loaded? It seems like the right thing to do; if all modules aren't loaded yet there's no reason to think the current list of hook implementations will ever be valid later on, so we should not pollute any cache with it. And I don't think we need to micro-optimize performance for sites that are running tons of code in hook_boot() anyway.
That approach looks like the attached patch and seems to work fine when I tried it. It changes more lines of code than the previous patches, but most of the changes are just renaming variables to avoid writing to the static cache except when explicitly desired. I left the tests from #79 in place - the only functional changes are now to includes/module.inc.
Curious what people think of this one.
Comment #95
pounardI think this is also valid, just like the two others, I have no strong opinion about the right way to do it, but I still prefer the simpler patch in #70 just for the sake of simplicity. I'm all in for removing code not adding it.
Comment #96
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commented+1 to #94 - that makes a ton of sense.
Back to RTBC for now - to have some people test the new patch.
Note: A similar strategy will work for schema, theme registry and entity definition caches - which all are also prone to that error (though those are different issues of course).
Comment #97
stefan.r CreditAttribution: stefan.r commented#94 seems fine to me too
Comment #98
catchMost of those caches are using DrupalCacheArray and already check bootstrap phase before writing, i.e. https://api.drupal.org/api/drupal/includes%21theme.inc/function/ThemeReg...
So this is consistent with that, it's just that the hook implementations cache is inconsistent because it doesn't use DrupalCacheArray.
Comment #99
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedCool, is anyone able to test #94 with a real site?
Would be particularly great if @joelpittet or @stockliasteroid could test it (if that's still a realistic possibility) since they had the sites where the earlier patch seemed to lead to problems.
Comment #100
joelpittet@David_Rothstein I've been using the patch from #70 since Aug 18th without any problems in production.
I'll swap it out for #94 and see what happens.
Comment #101
joelpittetHmm, #94 conflicts with #2263365: Second loop in module_implements() being repeated for no reason. which I've also been using and I've performance tested as well. So I'm a bit hesitant to change before the conflict is resolved because it's also RTBC.
Comment #102
stefan.r CreditAttribution: stefan.r commented@joelpittet care to do a reroll of a combined patch and try that on your production setup? Hoping we can still get either of these patches (#70 or #94) into the next bugfix release (which might be due this week).
We've also had no problems with #70 on some of our sites. Not sure about the concerns in #94 but I can understand the approach in the latest patch may feel a bit less disruptive (also given it's consistent with what we do in DrupalCacheArray)
Comment #103
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedI went ahead and committed #2263365: Second loop in module_implements() being repeated for no reason. now since it looked like it was ready to go.
So I actually didn't realize how much real-world testing #70 has already had. Maybe that plus the fact that it doesn't conflict means we should commit that now, then do #94 later. I think the two concrete reasons #94 is a better long-term fix than #70 are:
1. There are some situations (mostly edge cases) where #70 won't fix the bug but #94 would.
2. The double cache-get issue, which should not be a problem with #94.
I think we can live with both of those. The second is the only one that might be considered a regression, but it should be relatively minor (and should only affect sites that are actually invoking hooks early in the bootstrap). Nobody was completely opposed to committing #70 above, so maybe that plan will work for everyone.
Leaving this RTBC for now to see if anyone objects.
Comment #104
stefan.r CreditAttribution: stefan.r commentedSounds great! Re-uploading #70
Comment #105
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commented#104 is missing the tests, but I'll just trigger a retest of #70 instead.
Comment #107
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedI would still like to see #94 getting some real-world testing as it is conceptually more clean and more consistent with the rest of core as catch pointed out, so could we re-roll #94 on current D7 HEAD?
Comment #108
stefan.r CreditAttribution: stefan.r commented@Fabianx Ithink the proposal was to do both (committing #70 for the bugfix release due soon as it can't make things worse than they already are, and #94 at a later point as it got more testing) - are you OK with that?
In any case a re-roll of #94 would be good so it can be tested.
Comment #109
Fabianx CreditAttribution: Fabianx as a volunteer and at Tag1 Consulting commentedFWIW, I am fine with that approach.
Comment #110
stefan.r CreditAttribution: stefan.r commentedCool, let's see if we're still on time for this to be committed... Will be good to finally have this (mosty) fixed!
Comment #111
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedOK, committed #70 to 7.x. Thanks!
So I guess we should leave this issue open (but at lower priority) for rerolling #94 and getting that in, which should hopefully fix any remaining edge cases here.
Comment #113
jenlamptonedit: Oh, #70 nevermind. :)
Comment #114
catchhttps://www.drupal.org/node/496170#comment-10410241 has the test coverage. The idea here seemed to be to get the whole patch fro #94 in, but would be worth getting the test coverage in by itself anyway.
Comment #115
davidwbarratt CreditAttribution: davidwbarratt at Golf Channel commentedDoes anyone know if this issue resolved #2421609: EntityMalformedException: Missing bundle property on entity of type file. in entity_extract_ids()?
Comment #116
szantog CreditAttribution: szantog as a volunteer commentedHmm, something is not good.
I used #24 before core upgrade on 3 sites, it worked well.
After upgrade 2 of them are ok, but the last 1 (it's a real monster, commerce, flag, etc)..
It starts to produce rules and flag issues on my localhost but not on our dev environment, for example: #1925922: Unsupported operand types in factory_by_row() in flag_flag.inc I check our custom codes, do some changes, then it seems, it is fixed. But no, the error comes back once, only drush rr fixes it. Then comes back again 2 days later total randomly. Then I see the differences: On my localhost I switched back to memcache! On db cache the problem came back immediately, on memcache we just saw it rarely, so I found out, it might be our old module_implements trouble.
Here are my experiences:
This whole issue might be some kind of sizing thing. Something overgrow immediately when I use db cache, something later on memcache.
I tried to play with _drupal_bootstrap_full(), this is my current state which is obviously not a fix, I just hope, it will give some idea.
The interesting part is:
drupal_static_reset('module_hook_info');
Without this only drush rr can reanimate the site after getting error! drush cc all produces the same error, code changes doesn't matter anymore.I have no idea, but it seems, the caches still remain corrupted somehow. With the patch above, the site works perfectly, but I though, clear caches at this point not a performance friendly solution
Comment #117
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedI believe all the test coverage was already committed as part of #112. So we'd just need an updated version of the refactoring itself to continue here.
Haven't thought much about this myself but #2351731: hook_element_info_alter() does not always get invoked in themes reminded me that for drupal_alter(), we need to think a bit about how either of these solutions affect themes, which can implement alter hooks too....
Comment #118
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commented@szantog, hm, module_hook_info() already ignores the cache if it's called early in the bootstrap - see https://api.drupal.org/api/drupal/includes!module.inc/function/module_ho... - so it's interesting that that cache clear is needed needed.
It's possible that an approach like #94 will help with that, but I'm not sure; part of the reason for proposing that was that there are some extreme edge cases where things can theoretically not be fixed even with the patch that was committed here.
Comment #119
MixologicJust ran into this again on drupal.org - this time the backtrace in the core dumps looks like so:
And the actual bt:
Comment #120
greenSkin CreditAttribution: greenSkin commentedHere's a re-roll of #94.
Comment #121
xjmComment #122
Fabianx CreditAttribution: Fabianx at Tag1 Consulting commentedWhile we never got #94 in, it also has other problems.
That said I think we can fix the remaining edge cases, by doing this:
and then set the drupal_static at the end of the bootstrap full function.
That way - that at phase is set when it's entered, not when it's completed - would no longer bite us here.
Comment #123
joseph.olstad#122 makes sense to me. I have not yet tested this however.
I believe most of my projects have been for years using a version of one of the above patches.
Comment #124
marco CreditAttribution: marco at Tag1 Consulting commentedThe attached patch implements #122, with an addendum from @peximo to make sure that calling a full bootstrap more than once does not break the logic.
Comment #125
apadernoComment #126
Darren Oh