Closed (fixed)
Project:
Drupal core
Version:
8.0.x-dev
Component:
base system
Priority:
Major
Category:
Task
Assigned:
Issue tags:
Reporter:
Created:
13 Jan 2012 at 16:21 UTC
Updated:
29 Jul 2014 at 20:18 UTC
Jump to comment: Most recent, Most recent file
Comments
Comment #1
fagoI don't think two simple rebuild and cache clear stages are going to suffice, as we have multiple different kinds of rebuilds in our process. First off we've underlying rebuilds for applying module provided configuration (read: exportables), then we've something like menu_rebuild(). However, rebuilding the menu basically requires most caches to be updated for changes to take affect, while rebuilds for applying module provided configuration are really foundational and need to happen before clearing caches, so caches are updated afterwards.
Thus, I think we should
1. Rebuild the registry
2. Apply module provided configuration (node types, system info, ..) ---> General rebuild phase.
3. Clear *all* caches, including all statics so changes take affect afterwards. ---> General cache clear phase.
4. Rebuild the menu, what's going to populate lots of caches again.
Then, we should make sure no module calls menu_rebuild() in an API function, but makes use of that:
instead. Maybe we could improve that by having menu_rebuild() to default to this behavior.
Idea:
Using the "menu_rebuild_needed" flag, makes menu_rebuild() work like any other permanent cache flush. Thus, we could even don't bother with statics and just make sure all permanent caches are cleared in phase 3.
However, I guess having all statics cleared after drupal_flush_all_caches() would be a desired behavior anyway.
Comment #2
amateescu commentedPossibly related: #1387776: Move some rebuilds outside of Drupal
Comment #3
sunI partially disagree with the proposed order.
To clarify what you described in #1, we have:
Unless respective API functions have a primed static cache, the database cache is re-loaded.
Without primed static cache, respective API functions attempt to re-load from database cache. Otherwise, recollect/reconcile/rebuild from scratch.
Synchronizing the system's knowledge about existing classes, modules, and themes with actual filesystem data.
Synchronizing stored data in database with code changes. Typically priming static caches with results again. Also, these can have dependencies on rebuilds of other modules - typically following module dependencies (as declared in .info files).
Final update of router data to match new system state.
And that's basically also what I'd precisely envision for dfac() - to repeat:
hook_flush_caches(): Flush all database caches and static caches you have, so we can continue from a clean slate. (Static caches could be taken over by dfac().)registry_rebuild()(implyingsystem_rebuild_module_data()) +system_rebuild_theme_data(): Hard-coded update of class and extension data.hook_rebuild(): Rebuild whatever data you have that may be based on code.menu_build(): Hard-coded rebuild of the router, so following executed code and requests know about obsolete and new stuff.cache('page')->flush(): As a special exception, intentionally flush the page cache last, so any requests hammering the site during dfac() can potentially still be served.Notes:
hook_rebuild()would ideally be invoked in the order of module dependencies.hook_rebuild()implementations manually calling intomenu_build()are broken code and we do not babysit broken code.hook_menu()implementations do not manually call into any rebuild functions.Comment #4
fagoHow would you like do that? This would mean we'd have to look the whole system, so no cache can get initialized in-between. However, I think if we'd have a solid order of steps with each step being locked, we should be already safe and minimize the down-time.
Agreed.
Makes sense. However, it sounds quite complicated and I can't think of any case requiring that right now. So, going with regular module weights and/or module controlled ordering via hook_module_implements_alter() should do it for now? We can still go and improve it once the need for it really comes up.
I don't think so any more. Point 3) (System components and extensions update) does not depend on 1) and 2) (permanent and static cache clears) in any way, however there are dependencies the other way round. Thus I think point 3) should run first.
Similarly, I think point 4) (Data structure and configuration rebuilds) usually won't require updated caches neither, as reading configuration files and synching them into the db mostly depends on 3). Furthermore, rebuilding items might issue quite some changes and cache-clears - thus flushing all caches afterwards would save us from quite some repetitive cache flushes. Next, rebuilt items won't be properly reflected in caches if the modules in question do not properly clear all affected caches, what sometimes is not even properly possible due to cache-dependency-chains (cache A depends on cache B, but modules clears only B so A stays outdated). Only doing a full-cache clear after rebuilding would guarantee completely fresh caches.
Thus, I'd suggest the following order:
Thinking about it again, the best solution probably boils down to the rebuild-operation one thinks of. Rebuild operations that sync rather foundational, defined configuration items (views, content types, rules, ...) should run before cache clears to make sure caches are properly flushed once, but the other way round rebuilding data that depends on quite some other data and configuration (menu, blocks) should run last and after clearing caches.
Thus, we might even want to go with two different rebuild steps or would require the latter rebuilding steps to move to a cache-like approach: 1. Clear caches/set the rebuild flag. 2. Rebuild once on de-mand and cache results. We even have that flag in place for the menu.
However, we really cannot skip the first rebuild step without having a repetitive drupal_flush_all_caches() to force everything is updated.
Comment #5
sunEnhancing the lock system to allow for a special, global lock doesn't sound that bad, actually.
Nope. Consider:
1) After clearing caches
2) assume you need to rebuild
3) another request primes the caches again
4) call into the rebuild hooks, then
5) either nothing is rebuilt, or
6) the world ends with a nuclear disaster, fatal error.
Yeah, potentially. (It sounds like a very good idea to do though.)
They do, heavily even. Especially this kind of data is typically cached right within the function that builds and returns it.
Even essential stuff like module_list() itself relies on caches. What we're after is a clear separation between
1) flushing any and all old data and caches, and
2) (re)building data structures and configuration
This is essential. We do not want to have any old data in these operations, which (naturally) can only be stale and outdated.
Nope. Data is only rebuilt if there is no cached data in the first place. Again: We need a clear separation of stages and concepts.
A rebuild operation might set a cache after rebuilding - but that is fine, as long as we can be sure that everything was built from scratch.
"Not properly clearing all caches" during the (first) cache flush operation of dfac() means broken code.
We do not babysit broken code, so there cannot be cache dependency chains for rebuild operations in the first place.
Yes, I totally agree that all caches need to be flushed. However, caches are only created when they are built. And that's what the rebuild operation is for (though not everything is directly cached upon rebuild).
To make sure this really gets across, as I really can't stress enough on it:
You cannot reliably rebuild data structures and configuration, if there is a possibility of a function returning old/stale data somewhere in between.
Especially because of the module data dependency chains in the rebuild operations it would be utterly wrong to attempt to (re)build anything while having a chance of stale data anywhere.
All rebuild operations need to be able to rely on the fact that all caches have been cleared.
Without this essential difference in the flow, we're not going to improve anything. If there's any hook implementation deep down the rabbit hole that returns stale data from an old/outdated cache, then the data you are trying to (re)build is fucked up.
That's the exact systematic problem that #996236: drupal_flush_all_caches() does not clear entity info cache revealed.
The new paradigm has to be: Kill it. Build it.
Comment #6
fago@sun:
I'd not say so. I think we are not talking about the same kind of 'rebuild'. So let's try to clarify what we mean with 'rebuild':
To me 'rebuild' is an operation that prepares data, that is used afterwards. There is a difference to caches in that, that the system relies upon this built data to be there afterwards, whereby caches can be refreshed anytime they have been cleared.
Examples of that would be the registry or synching node-types to the database. If node-types have not been imported, they are just missing as the system won't be able to determine their absence.
That makes it in particular problematic as caches might require those items to have been rebuilt in order to not contain stale data. In case of caches it's easier, as you can just clear everything and caches are re-initialized just on-demand, thus automatically building in the right order. Given that, we have to rebuild *before* the complete cache clear in order for caches to be able to reflect the rebuilt data.
(on cache vs rebuilding data: I already tried to handle rebuilding exportables like a regular cache clear by doing a rebuild on-demand with exportable entities, what results entity-crud hooks being possibly triggered by a simple entity-load operation. That way quite some ugly nested hook-invocations as seen in #1267070-7: Lazy-rebuilding the defaults for exported entities might not be desirable occur, what we better avoid by handling rebuilds explicitly.)
Example:
Node types need to be rebuilt before the entity info cache is initialized. Flushing all caches first, then going in rebuild phase probably ends up with someone triggering entity-info cache building before node types can be rebuilt, and voila your entity-info is outdated.
Possible solution:
1. complete cache clear
2. rebuilds
3. complete cache clear
I do think though the first complete cache clear is not necessary for most kind of rebuilds, however I agree that we should clear module lists first (as required by the registry rebuild).
->
1. clear module caches (no dependencies)
2. rebuilds (may depend on modules being up2date)
3. complete cache clear
If you do think of rebuilds that require more caches cleared, please give some examples.
Comment #7
sunExtremely simple example of how not flushing caches upfront and not being able to rely on a clean slate fails:
Albeit abstract, there's most likely no simpler example.
If you really want just one of plenty real-world examples that involves a plethora of cached data in statics and database, witness _block_rehash().
There's no point in flushing caches after rebuilding. You've either rebuilt everything from scratch and from a clean slate, or you did not. At the point you're no longer sure, the system is unreliable and the design of the process broken.
Comment #8
fagoSo what's with the example I've given then?:
The same holds for quite some other exportables that need to be rebuilt. To apply those exportables though, we'd require another cache clear. Thus, are you suggesting we should do it twice?
Indeed, that one needs to happen after all caches have been cleared. Probably best we'd just have two hooks, one before flushing caches and one afterwards. :/ Let's face it, we have real-world use-cases for both.
Comment #9
fagook, here is a patch overhauling cache-clearing in 3 phases. Let's see what the test-bot says.
Patch also fixes revamps simple-tests resetAll() methods to rely more on drupal_flush_caches() - a good proof it's working right (or not). Also, fixed a test-problem in the comments fields tests where previously caches were rebuilt without the full module list being loaded.
Comment #11
fago>Drupal installation failed.
Cannot reproduce. :/
Comment #12
fago#9: drupal_8_flush_caches.patch queued for re-testing.
Comment #14
sunThe situation you are describing can only happen in the current system, the broken system we need to fix.
In the new system, it does not matter what is rebuilt first or later, and which function primes its cache(s) after rebuilding. It does not matter at all, because we're running from a clean slate and everything has been reset to NULL in the first place.
Therefore, there cannot be a wonky dependency between node types and entity info caches. If one depends on the other, that's perfectly fine: the other is rebuilt, too. It's only rebuilt once, and only needs to be rebuilt once, because the new flow and clear separation between
hook_flush_caches()andhook_rebuild()ensures that everything old we knew before is forgotten, and the registry and extensions are updated before we reach the rebuild stage.Thus, if you happen to have a rebuild that primes a cache and it doesn't contain everything that's there, then that's obviously a major bug and broken code in the module that should have returned its data when the data was rebuilt.
Your argumentation would mean that Drupal wouldn't be able to build (not rebuild) the data it requires when starting from null, nada, scratch, ground zero, no pre-existing information. That's not the case, Drupal is and must be able to do so.
What this API change needs to achieve is to essentially make dfac() the system get "back" to that null state. Only after doing so, we have a reliable system state and are able to [re]build information. (Depending on the individual implementation, it's not necessarily a "rebuild", it may just be a straight "build" (because there's no previous information anymore).)
Sorry, but this is zero improvement to the current dfac() nightmare and getting nowhere. This doesn't separate concepts at all.
Comment #15
fagoI agree with you that ideally, we'd just have cache-like systems that we can flush + rebuild afterwards. However, in reality we haven't.
We have lots of such "broken code":
All of those are not flush-able like caches, so that they re-initialize on demand. If you just clear those data, then necessary information won't be rebuilt - it will be missing.
Let's consider the example of node types. First off there is
node_types_rebuild()what is necessary to detect new/gone-away node types from modules. Then there isnode_type_cache_reset(), which is about clearing the node-type cache. Clearing the cache won't help you to discover new node-types. Deleting all node-types from the db won't help you to automatically discover them either, you just loose your customizations and your non-code node types. That is why node_types_rebuild() is invoked explicitly on module enable / disable and dfac.Comment #16
sunAttached patch implements a clear and concise separation between cache and rebuild concepts.
Comment #18
fagoMeantime, I realized my proposal won't fly. Rebuilding even foundational stuff without properly cleared caches partly relies on a working system to do even this foundational basic rebuilds. However, we really should be able to recover from any wrong, incomplete or whatever caches by running dfac(). Thus, finally +1 for sun's approach even if it means some caches have to be re-freshed twice.
Ideally, I guess rebuilds should try to clear all necessary caches after rebuilding. E.g., like rebuilding node-types already triggers a menu rebuild. That said, rebuild operations should offer hooks other modules can use to clear necessary caches (= node type CRUD hooks).
This requires the global lock to work, else we might end up with old blocks.
Comment #19
sunWe should figure out what's wrong with the testbot first. I'm able to successfully install Drupal with this patch applied and do not see any kind of hiccups.
Comment #20
sunPossibly related: #1213536: Non-resettable theme_get_registry() cache causes problems for non-interactive installations
Comment #21
sun#16: drupal8.flush-rebuild.16.patch queued for re-testing.
Comment #23
sunHrm. Re-rolling this patch is painful.
Comment #25
sunI'm not able to reproduce the installation test failure locally -- neither through the UI, nor via run-tests.sh.
Regardless of that, trying whether further adjustments to install_finished() make any difference.
Comment #26
sunI think this is a good idea. Not sure whether to do it in this patch or in a follow-up issue though?
I've no idea how to resolve this. (this is actually the same comment as the @todo in install_finished())
There are actually no non-drupal statics to reset currently. But it would be good to keep the comment (without @todo), in order to state that - if regular statics need to be reset at some point - then they should be reset here.
This @todo is actually wrong. It should instead clarify that this is needed, because the drupal statics of list_themes() and drupal_add_js/css() are completely reset by dfac(), so the install finished page would only contain the bare/raw text output without any maintenance theme styling.
Let's fix the calling functions accordingly as part of this patch.
Comment #27
aspilicious commentedDoesn't this run menu_rebuild twice?
-24 days to next Drupal core point release.
Comment #28
sun@aspilicious: Nope, menu_rebuild() is not a hook implementation of hook_rebuild().
Resolved all issues in #26:
Note: If this patch fails, then we need a system_list_reset() after invoking it in dfac(). (or re-order the invocation)
Comment #30
sunwhoopsie -- didn't catch all instances of 'flush_caches' throughout core ;)
whoopsie -- missed the inline @todo here ;)
This patch is RTBC from my perspective.
Comment #31
sunComment #32
hefox commentedSkim read the issue, so may be missing out anyone bringing this up, but perhaps a less generic hook then hook_rebuild would be good?
Feels like it could cause some confusion when newer people see it vs. registry_rebuild, menu_rebuild, etc. and also limits the namespace (can't have a module named registry, menu, etc.). Also, seems like a possibility of mistakenly naming a function [module]_rebuild and accidentally implementing that hook XD!
Comment #33
sunReplaced "database caches" with "persistent caches."
Comment #34
Anonymous (not verified) commentedi really like this patch. a couple of smallish things:
- i think 'database caches' reference should be replaced with 'persistent caches'
- the moving around of _system_update_bootstrap_status() could do with some comments, reading the patch it's unclear why that's important
- i think we should implement system_cache_flush() and return those bins, rather than special casing 'core' bins
Comment #35
sunBased on #34:
1) renamed (in #33 already)
2) The system_list_reset() is performed in dfac(), module_enable(), and module_disable() already. _system_update_bootstrap_status() invoked it another time, needlessly.
3) Incorporated the system_cache_flush() suggestion. Thanks!
Comment #36
sunMinor phpDoc adjustments.
Comment #37
Anonymous (not verified) commentedok, i think this is ready to fly. setting to RTBC so catch will see it.
Comment #38
chx commentedHuh. I have a lot of problems with the new doxygen. The old had examples which seem to be gone. Also,
+ * Flushes and resets all caches, and rebuilds data structures.
I presume you mean
All static caches are reset and all cache bins are flushed. And then what is rebuilt...? That's where it gets confusing. What about adding to+ * - Rebuilds and synchronizes data structures (invoking hook_rebuild()).
"The following core modules implement this hook: block doing bananas" etc. The big problem is that without an intimate knowledge of core you have no chance to know whether something is rebuilt on this call or the next request when a cache miss occurs and is stored into cache.
Comment #39
sun#38 asks for phpDoc clarification on what subsystem caches are cleared and what subsystem data is rebuilt. Also, to clarify in phpDoc that, in fact, the menu router has not been rebuilt yet when hook_rebuild() is invoked.
Comment #40
berdirApplied the patch, executed update.php, then my theme information was lost on the final confirmation page.
Ah-ha. Then how do you explain this?:
We do have a menu.module....
Comment #41
Anonymous (not verified) commentedHahahaha. I mean, stabbity stab stab.
Comment #41.0
Anonymous (not verified) commentedUpdated issue summary.
Comment #42
sunNote: Doing this revealed many calls to menu_rebuild() throughout the system that look very dubious to me. I'll create a follow-up issue to analyze whether those invocations are actually correct and needed, or whether we're unnecessarily rebuilding the router repeatedly. In any case, separate issue.
Also updated the issue summary with API changes, as well as some more notes about other/related efforts elsewhere.
Comment #44
sunD'OH! Forgot the instance in dfac() ;) No other change, so the last interdiff still applies.
Comment #45
fagoThe bastard is green! Good work! ;)
The entire system isn't empty, the content stays. This should be phrased better.
Outdated comment? It's now set to the maintenance theme and install_finished() doesn't re-initialize anything.
ouch! Imho that should be fixed in the bootstrap.. Well, I guess that's a no trivial task and is best done by the bootstrap cleanup anyway. Maybe we should add a @todo here to remove it once the bootstrap has been cleaned up?
hm, not necessarily. There might be data that needs to be rebuilt in order to be available to the module. E.g. consider a block_node module "building" blocks for each node type using a rebuild mechanism - it won't see new node types as hook_node_rebuild() is invoked later on by default. Thus, we should clarify that the order of rebuilds might be important (which can be altered as usual for hooks).
That said, modules rebuilding data should clear their caches *again* once the have rebuilt the data. Taken the node-block example, the block rebuild might have primed the node type cache which might contain old information once node types got rebuilt. Thus, node_types_rebuild() should make sure to update/clear the node-type cache once rebuilt.
Also, that shows that rebuilding data is more painful than the regular cache-approach of building the data on demand, as possible dependencies have to be resolved manually. Maybe we should try to clarify on when it is a good idea to pursue which approach (cache vs rebuild)?
Comment #46
sun1) Fixed docs on "empty."
2) Fixed docs on the theme reset. (removed the todo)
3) The schema reset and corresponding comment is not touched by this patch, merely relocated. Separate issue.
4) Honestly, I'd like to see a separate issue for the multidimensional rebuild/cache issue you're describing. Because I don't really get what you're trying to explain. ;) That issue will be able to rely on hook_rebuild() being there, so can focus on that operation only, and can properly discuss that detail. Discussing it in here only leads to more confusion around this patch. I'd really like to get rid of this issue here first, because the change is anything but trivial.
EDIT: Last patch had "0 passes". Let's watch out.
Comment #47
chx commentedI added further doxygen. Yes, we usually don't list the core implementation of a hook but this is so hard more documentation can not hurt.
Comment #48
xjmStill "0 passes."
Comment #49
Anonymous (not verified) commentedupdating patch for latest commits, reroll only except for a menu_rebuild() call in install.core.inc which was removed in latest head.
Comment #50
Anonymous (not verified) commentedhmm, there was a menu_rebuild() in standard_install() that wasn't ported over.
Comment #52
sun#50: 1404198-50.patch queued for re-testing.
Comment #54
Anonymous (not verified) commentedwow. so, inside the deep rabbit hole i just dug around in, i discovered something rather obvious.
in the parent drupal, when you properly blow away all caches and rebuild node types, you don't get any new node types from modules enabled in the child drupal. because module_implements(), $deity bless it, checks to see if you have loaded the module's implementation before returning it.
so, adding the three lines at the bottom to testCommentEnable() makes the test pass:
but is also clearly completely wrong.
Comment #55
Anonymous (not verified) commentedhere's a patch that makes DrupalWebTestCase::resetAll() load all code for enabled modules before calling drupal_flush_all_caches() to make sure we don't rebuild stuff in the parent Drupal and miss any implementations.
so the only change #50 is to resetAll():
Comment #56
Anonymous (not verified) commentedumm. with patch this time.
Comment #57
sunThanks for debugging this, @beejeebus!
So this patch should actually prevent that from happening in the first place. dfac() is supposed to be able to fully re-initialize and build or rebuild Drupal into a state in which it can fully operate. Regardless of whether there currently is bogus or outdated information or not.
The actual cause and problem is the static variable in module_list().
Apparently, module_list() calls into system_list(), which uses a drupal_static() already, so the system list cache is fully flushed and reset by dfac(), but module_list() is not.
That can obviously call for trouble only. So I'd suggest to fix that static instead.
Interdiff is against #50.
Comment #59
sunThankfully, #1541958: Split setUp() into specific sub-methods made the operations being performed in setUp() much more clear. To explain the fatal errors and quote from there:
Comment #61
sunSo #59 resolved a lot of failures. Now dfac() only needs to ensure that all modules that are supposed to be enabled are actually loaded. Added in this patch.
Comment #62
sunVictory! :)
That said, if we don't want to introduce the new drupal statics, then we could as well reset them manually within dfac(). However, I think that adding them is the proper change. Especially because of the module_list() vs. system_list() difference.
Comment #63
Anonymous (not verified) commentedi want to set this to RTBC, but i guess fago should have a look first?
Comment #64
sunIf #1541958: Split setUp() into specific sub-methods lands first, this patch will need to be re-rolled. To do so, just skip/remove the hunk in drupal_web_test_case.php.
Comment #65
fagoTrue. Ok, let's do so.
Why do we re-order that in module_enable()/disable()?
That looks problematic, as updating the bootstrap status relies upon module_implements() which relies upon module_list(), thus both caches need to flushed before we call this function. Otherwise, the new module might be missed my _system_update_bootstrap_status().
That said, clearing all that different module-related function and their interdependencies is quite a mess too. This would deserve its own clean-up issue.
Else, patch looks good to me.
Comment #66
sunThe concern in #65 sounds valid to me. Thus, I've reverted all changes to _system_update_bootstrap_status() in attached patch.
Comment #67
fagoGood!
Comment #68
sunHm. I'd prefer the change for setUp() to land with the follow-up patch in #1541958: Split setUp() into specific sub-methods instead of this one, because the follow-up fix really belongs over into that issue/patch.
Comment #69
dries commentedCommitted to 8.x. Big clean-up. Thanks for all the careful reviews.
Comment #70
sunThanks. This is a huge API change and obviously needs a change notice. :)
Comment #71
sunCreated two change notices:
http://drupal.org/node/1561490
http://drupal.org/node/1561492
Comment #72
moshe weitzman commentedNice patch, but the name menu_router_rebuild() is misleading. We are rebuilding two subsystems here - menu router and menu links. Conflating the two is a step backward IMO. Hopefully we can fix that in a followup.
Comment #73
xjmAwesome!
Is there a followup issue for #72? If so, can we link it here, and if not, should we file one?
Comment #74
sunHm. I'm not sure whether we need that follow-up - the Web services initiative will rewrite the router system anyway. It's possible that these changes will remove the double notion.
Comment #75.0
(not verified) commentedUpdated issue summary.