Updates run in unpredictable order
| Project: | Drupal |
| Version: | 7.x-dev |
| Component: | update system |
| Category: | bug report |
| Priority: | critical |
| Assigned: | Unassigned |
| Status: | needs review |
| Issue tags: | D7 upgrade path, Needs Documentation |
I'll leave it up to others to decide if this is a bug or not and if it's critical or not and if it should be fixed in D6 or not.
The contrib module updates in update.php do not run in any predictable order, it's not alphabetic and it's not by weight, and it's not by dependency, it's seems to be a pretty random order. I ran into this while working on fixes for the D6 updates for CCK. When CCK is updated, the content module is one of the last to be updated, even though it is first alphabetically and everything else is dependent on it. The fieldgroup module is one of the first to update even though it has a weight of 9 and the others are zero.
We just fixed the install process so modules install in dependency order, it seems like updates should also run in dependency order. The update will be a harder fix than the install though because updates now run on uninstalled as well as installed modules so the dependency must be observed even when the modules aren't installed.
This is not a new problem in D6, but it has new implications now that modules that are not installed will get updated.
The problem will be mitigated, a little, by http://drupal.org/node/208602, if it gets in, so developers can prevent updates from running if some important previous step hasn't happened yet, but that is not a total fix.

#1
KarenS: modules which are not *enabled* are updated, not modules, which are not *installed*. I don't remember having updates running in predictable orders anytime.
I wonder how dependencies will be resolved for disabled modules (which could depend on modules which are not even in the file tree, so we cannot track their dependencies).
#2
@gabor, right, I mis-spoke, disabled modules are updated, not modules which are not installed.
#3
OK I have no idea how to fix this issue in core, but maybe this might be a potential workaround for cck.
There was a unique index introduced after RC1 that had to be rolled back and replaced. Because we can't know what the schema is in core and had to deal with updates with two potential schemas, a variable_set() was put into the first update, then subsequent behaviour was dependent on that. This effectively makes one update dependent on another. If you know exactly which module requires the update to have been run already, the respective update function could also delete that variable. If it's one update dependency for multiple dependent modules, then I guess the worst that happens is a stale variable gets left in the table.
Patch and discussion is here upwards: http://drupal.org/node/164532#comment-679172
Also, I don't see that this is that much of a different situation to the 4.7 cck to 5.x core content type creation (whch iirc relied on users following instructions correctly), or 5.x update status to 6.x core update module - which requires a full uninstall run on the contrib before upgrading. Obviously it's better not to have to worry about users following instructions, but both came to mind reading this.
#4
Well, we never had ordering / dependency support for updates, and this is indeed sad to be missing. But we did live without it so far and being a totally new feature to implement I'd suggest you depend on some existing technology to implement it when you need:
- http://drupal.org/node/208602 is not committed, so #abort-ing updates of one module is possible
- you can check the schema version of a module in your update functions, to ensure that a module you depend on already run its updates *to the point you require them*
- you need to have your users run update.php multiple times to get through this, or we can build in something automated at the end of update.php to look again for update functions to run, but that would possibly go into an infinite loop with #abort-ed updates being tried to run again.
Looks like you can do this with the current tools, and no dependency ordering would help you require specific schema versions, which you might not get with #abort taking effect even with some kind of consistent update ordering.
Or to put it differently, if module B updates run after module A updates, but module A aborted some of its updates, you are screwed again. You need to look for the schema version anyway, which requires some specific code on your part, but leads to the expected result instead of more failures in this case.
#5
gabor - you correctly say that #abort can cause problems but i think this will be very rarely used if we guarantee the order in which updates are run. modules with pending updates should be considered unreliable until their updates are run. we are reducing the meaning of dependency to mean "except when running update.php".
my inclination is to fix this for 6, even at this late hour.
#6
Moshe: so you would not run updates for modules which depend on modules for which updates were not run / are not runnable? I think would be just a small step in the right direction, since often we make assumptions about a specific schema version of a module being already there, but I am looking forward to non-disruptive ways of solving dependencies in the update as well. There is no such feature to mark this as a bug of, so at best, this can be made a task.
#7
"There is no such feature to mark this as a bug of, so at best, this can be made a task." - one could say it is a bug in the dependency feature.
I'll ask a naive question - how come we can guarantee module run order during install and not during update. i would think they are the same problem. karen says "the dependency must be observed even when the modules aren't installed". whats hard about that? dependencies are declared in .info file which is always safe to load.
#8
The difference IMHO is that in install we expect a clean slate, while in update, we always go from one version to another, so the beginning of the road is also "unknown" which results in more error conditions. Lamenting on whether this is a bug or not, does not move this forward anyway. Dependencies were never a feature in updates, so I would not call them a bug, but I am not interested in arguing about it, so let it be.
#9
Moshe redirected me to this issue from #220945: patch: drush mm enable/disable/dot/uninstall where I try to enable/disable modules. It was a too quick patch for drush, but atm i'm reverse enginering D5 to make it work ;-)
Afaik this dependencies stuff is about directed graph.That is walk through the graph start to end or the other way around (depending on dependency or reverse dependency) according to a 'Topological Sorted List' and halt on failure.
Whether installing modules or enabling or updates (backwards) disabling/uninstalling (forward) them. For a graph example see the aforementioned issue or http://build2be.com/files/build2be.com/ryan.png (ubercart).
I want to contribute to this issue but don't know how to start. So some feedback is welcome.
Regards,
Clemens Tolboom
#10
Talked @ FOSDEM to Gabor and Gerhard. Their advise was to first patch D7
... so i will try to deliver a patch for D7
... guess i have to provide this for install.php, update.php and module.inc
... that will take some time :-/
#11
Hm, there is a graph implementation already in the install code as chx pointed out (he wrote the code). Look into what happens when you enable a module which has dependencies, all dependencies are nicely installed as well on the web interface in Drupal 6. So drush could use that in Drupal 6 or even backport that code for Drupal 5 for use. The same is not run on running updates though, and this is what this issue is about. I am not sure how the drush operations relate to update ordering.
#12
Update.php function update_batch() has a simple for each in preparing the batch. This should not be the case!
$operations[] should contain the batches in a topological sorted list (TSL) order. This way a dependant module gets the 'right' updated dependency when updating itself.
In order to do this some changes should be made to modules.inc to provide a TSL of all modules choosen to update.
Ie function module_get_tsl( $modules = array()) which will according to 'dependencies' builds the right order out of the given list.
I've done this in my drush patch for mm for enabling/disabling/uninstalling and it works great.
I'm in preparation to document this idea.
#13
I documented my thoughts http://build2be.com/content/module-management-drupal without getting much response.
Just talk to chx a little about DFS, TSL and yes I could try to provide a few patches. But what I want is a little more: rewrite install.php, update.php, module.inc and system.inc to streamline all module related code into module.inc. And I guess that's to far fetched to do.
I made my drush patch for D5 a real module drush_mm (not only) to make it easier to test this issue against.
@KarenS What is the recipe to reproduce this issue?
#14
We ended up using the #abort solution for cck field modules - all contrib fields should do the same :-( See #304813: CCK module developers read this!!
#15
In #310898: Making module_enable and module_disable API functions module_enable and module_disable do not run in a unpredictable order. I implemented a TSL based on the modules dependency graph. Maybe this can be used here too?
#16
If you need a workaround to ensure that a module like CCK has its updates run first (without requiring other dependent modules to add any code of their own), you could just do that via hook_form_alter. For example:
<?phpfunction content_form_alter(&$form, $form_state, $form_id) {
if ($form_id == 'update_script_selection_form') {
// Preserve the Drupal core behavior where system.module's updates run first.
$form['start']['system']['#weight'] = -100;
// Make sure CCK's updates run next, before any others.
$form['start']['content']['#weight'] = -99;
// (You could clean up the above code to make this more robust, but it's probably good enough as is, since other modules will default to #weight = 0.)
}
}
?>
It's really ugly, but as far as I can tell from testing it, it works fine. Seems like it might be a much better approach than #304813: CCK module developers read this!! (is it too late to matter)? You could also probably use
unset()statements in the above code to prevent dependent modules from running updates if they are not enabled (the other issue that CCK is facing).(I've been banging my head against this all day for Acquia, where we are considering in general how we might have to deal with complicated update dependencies in the Acquia Drupal distribution, and so far this is the solution I have.)
#17
Presumably a real solution would go in Drupal 7 now.
#18
In thinking about a way to fix this for D6 that involves minimal changes, one possibility is that Drupal could continue to run updates exactly as it does now, but just before running them, it could simply invoke something like drupal_alter() on the list of module updates. This would allow CCK (or any other module) to implement a function in its install file that reorders the updates, enforces dependency checking, etc. For an example, see the patch I posted at #304813: CCK module developers read this!!... which more or less does this already for CCK, but since there is no hook it needs to use a very sketchy method to "intercept" the list of module updates, and that is not really good.
Any thoughts on whether something like this kind of drupal_alter() could make it into Drupal 6 as a bugfix for the issue here? It would be completely nonintrusive (would not change Drupal core's behavior at all), but it would be adding a new hook to a stable release... not sure if that's a no-no or not. (Note by the way that it couldn't use "drupal_alter" exactly, since the hook would need to be called for disabled modules also. But that's not really a big deal.)
#19
From the complex upgrade experiences we have it looks like it would indeed be a good idea to try and fix this in a backwards compatible way. We should however sketch up a few scenarios. How would multiple modules "interact" on the chain of alter calls to form a list which is fine for all? To me it sounds like the modules know rules that "I should come before these other modules" (and the list of those other modules is dynamic depending on the site, eg. all field modules); or "I should come after this module" (eg. after a base module of a module suite). If we only provide a simple alter, modules altering the rules might do changes which go against requirements set by previous modules possibly, right?
#20
Could we perhaps use the dependency system here - so core runs first, then modules with no dependencies, then modules which depend on those modules, then modules which depend on them etc. etc. - Are there any situations where that pattern would be inappropriate?
#21
Yes, if modules run updates on Drupal 5 database tables for example on a Drupal 6 site. Take image module. It does not have any 5.x-2.x table release, but it does include db updates for 5.x-2.x in the Drupal 6 version. So you are likely used 1.x stable and now update to 6.x, but all your images will be lost, since the image module tries to run updates which should be run BEFORE system module's updates (system module updates the core tables). Although system module does not depend on image module in any way :)
#22
Well, to some extent, this is a potential issue with many Drupal hooks, right? Any module that implements something like hook_nodeapi() or hook_form_alter() can do a whole lot of damage to other modules if it chooses to :) So maybe the solution is just to say that any module implementing this hook should be careful to only affect modules it actually cares about, and if it doesn't, it should be filed as a bug against the module implementing the hook...
For example, the code I proposed in #304813: CCK module developers read this!! is definitely too "greedy" in that it forces system.module to always run first and content.module to always run second. It should instead be rewritten so that it checks module dependencies and only sorts CCK to be in front of modules that depend on it, rather than in front of any module at all.
Perhaps to facilitate this process, the results of module_rebuild_cache() could be passed along with the hook, so that modules have an easy reference to figure out module dependencies they might need to know, etc.
#23
Well, the consequences of wrong update function order are a bit farther reaching then the consequences of nodeapi hooks not runnning well. Anyway, let's see a dependency based update function order.
#25
Marked #168018: Module dependencies not enforced on update as a duplicate, subscribe, and IMHO this is critical.
#26
Not sure this is "we cannot ship without this fixed", but it is really close.
#27
For me the solution looks quite trivial :0 Let me illustrate this by an example (guess that's not complete so please reply)
Say we have just _installed_ a new filefield and enabled it and _afterwards_ installed a new drupal version _and_ updated views. Futhermore the *.install have a new
<?phpfunction hook_install_dependencies
?>
<?phpfunction filefield_install_dependencies() {
return array(
// ie patch for new views schema change
"6003" => array(
"views" => array(
"after" => "6005"
"before" => "6006"
),
),
);
}
?>
<?phpfunction views_install_dependencies() {
return array(
// ie patch for new node teaser short field
"6005" => array(
"node" => array(
"before" => "6025"
),
),
);
}
?>
By laying out the dependencies named in
hook_install_dependenciesas 'module' - 'schema-version' together with the current and latest version and adding all updateable modules we get a graph like below (http://drupal.org/files/issues/update_0_0.png)Having a graph we can run update.php on it. This example shows a problem though. FileField was _installed and enabled_ before the updated views version was installed. To solve this extra complication the
function hook_install_dependencies()must be used with installing new modules too. That way the filefield module wasn't even enabled before the new views arrived.For module dependencies _without_ a
hook_install_dependencieswe could use the module dependencies graph which is illustrated with the taxonomy and forum module.#28
A better hook name would be
hook_schema_dependenciesbecause it's about schema (version) dependencies.#29
Because we already use the metadata from the docblock during the update process, why not putting @requires tags in that:
<?php/**
* Upgrade FileField to Drupal 7.
*
* @requires content:7001
*/
function filefield_update_7001() {
// Do your stuff
}
?>
#30
In #27 the image is broken so here is a new version.
@29 Good point but we need more then require right? Hope the image helps
#31
I like the docblock suggestion. But do we need #393068: Make the registry class hierarchy, interface and docblock aware for that?
Anyway: a given module must tell _both_ a require and a _reversed_ require. Better words are maybe: depends and rdepends. It is ie not up to views to tell filefield what schema(s) to use. See #27
Anyway ... I'll try to device a test for this first. graph.inc does not have a Topological Sorted List TSL) but a weight assigned. That's kinda random TSL :p
#32
@clemens: graph.inc does generate a topological order (as soon as the graph is not cyclic), or something close if the graph is cyclic. What makes you think otherwise?
#33
The update path is broken because of this (see #583226: Update broken: Call to undefined function field_attach_create_bundle()). We really need to sort this one out.
#34
This patch implements @requires and @blocks directives to the update functions. I tested the logic locally, but we need to add automated tests to that.
Added an example dependency between system_update_7038 (enable the field modules) and comment_update_7005 (create bundles for comments).
#35
A quick review first:
I did not think it was that easy :) Cool.
@requires versus @blocks
- I think blocks is a weird term.
- In #27 I popped before/after which are at least each opposites.
- Maybe depend/rdepend (debian) is better
- Your code doc states: forward/backward
-- I like before/after
Because the functions are taken from the doxygen we need at least a function_exists? But guess batch API solves that?
The requires are nicely listed on 'pending updates' ... there is something with newlines. My try with two looks like
7007 - Create comment Field API bundles. @requires system_update_7038@requires ystem_update_7038
I'm writing a test atm. Hope to finish soon.
#36
I get errors array key 'module' does not exists.
The comment example update
@requires system_update_7038leads to _adding_ 'system_update_7038' to the graph _and_ into the batch which is not required! It should be just a claim 'I need system in schema version 7038 _or higher_'That explains the error. So only the modules in the batch $start should be in the final batch?
#37
I'm unsure about the exact input $start but
Do we have to test for $updates === FALSE ?
<?php// Set the installed version so updates start at the correct place.
foreach ($start as $module => $version) {
drupal_set_installed_schema_version($module, $version - 1);
$updates = drupal_get_schema_versions($module);
?>
====
This test
$version > $max_versioncannot happen right? Unless it's shortcutting FALSE?<?php$max_version = max($updates);
if ($version > $max_version) {
continue;
}
?>
====
afaik we cannot choose anymore what version to upgrade to so $update < $version will never happen.
Hmmm are we missing the latest version?
<?phpwhile ($update = array_shift($updates)) {
if ($update < $version) {
continue;
}
?>
#38
@clemens.tolboom: no, there is no issue there, a module that is not installed cannot be updated.
There is no UI to select the version by default, but the underlying code does and should still support this functionality. The idea is that a contrib module could restore the confusing select boxes.
#39
I moved the graph functionality into its own function so we can write a test for it without having to run the batch.
I added too many TODOs :(
We now have a ordered schema update but still not a correct update scheme. A
@requires system_update_7038is now just used as an ordering principle but not as a validation to "Is this update really possible". But that is up to update.php form(s)Furthermore in writing this comment I realize we need to add _all_ versions of the included modules through @required / @blocked into the graph.
#40
Added some tests for
function _update_calculate_graph.I atm don't know how to run the batch directly.
#41
Can we change @requires and @blocks?
I really am still puzzled about these. I propose
@before system_update_7004
@after views_update_7003
#42
The last submitted patch failed testing.
#43
I guess #521838: drupal_get_schema_versions() takes 30% of page execution time on /admin got this broken.
Not sure what yet. #40 ran successful once :(
#44
A new patch for #521838: drupal_get_schema_versions() takes 30% of page execution time on /admin is committed again.
#45
#40 works fine now
#46
Code style fixes, need to add validation in update.php
#47
missed some tabs
#48
This changes to @before and @after for clarity/consistency.
I'm also not really sure what validation is needed, unless we modify update_get_update_list() to build the tree and sort dependencies there?
#49
Here's a patch that adds an extra stage after the selection (non)form. This will show any incompatible updates and then allow you to apply remaining updates. Please see the screencaps for a better example. I've cleaned the strings up a little bit since taking the screencaps.
#50
Images are missing somehow for me.
#51
The last submitted patch failed testing.
#52
i don't think the test failures are related....
#53
mikey_p: What do you think about the feasibility of testing for missing or circular dependencies? When I tested it at the Drupalcamp PDX sprint, these cases threw exceptions. (mikey_p pointed out this is really an edge case within an edge case, and I agree updates don't need to "work" in this case, just that it should fail in a predictable way if graph.inc can't create an update plan).
On the whole this code seems to work as advertised. Tagging for API docs.
#54
The function I added ( update_check_dependencies() )to check the dependencies and invalidate update functions whose dependencies are not met, will definitely end up in an infinite loop on circular dependencies.
To properly test this out, add some additional update functions to an enabled module, check that update.php picks these up and then add an unmet dependency to one of them, such as @after system_update_8001 and try to submit the batch in update.php it should pop up a page similar to this:
Of course it should still run remaining updates, but not anything with unmet dependencies.
This could definitely use some review of the strings and such.
#55
tagging
#56
Patch contains:
- two new test cases for wrong update scenarios with update_test_3/4 modules. The test are commented out. But I think we should have these as working tests.
- bug fix of update_test_2.module
- moved some documentation from function _update_calculate_plan back into update_batch
We still have a few TODO's which should either get explained or removed.
Furthermore I don't thinks circular dependencies are an edge case. It is just 3 contrib modules away :p
My quick effort in checking for circular dependencies based on http://drupal.org/files/issues/module_enable_disable_310898.patch (isCircularMember) failed. It throws always an exception :(
<?php
// First we need all participants
$participant=array();
foreach( array_keys($updates_graph) as $from) {
$result[$from]=1;
if (isset($updates_graph[$from]['edges'])) {
foreach( $updates_graph[$from]['edges'] as $to) {
$participant[$to]=1;
}
}
}
$has_circular_member=FALSE;
foreach( array_keys($updates_graph) as $from) {
if (isset($participant[$from])) {
unset($participant[$from]);
}
else {
$has_circular_member=TRUE;
break;
}
if (isset($updates_graph[$from]['edges'])) {
foreach( $updates_graph[$from]['edges'] as $to) {
if (isset($participant[$to])) {
unset($participant[$to]);
}
else {
$has_circular_member=TRUE;
break;
}
}
}
if ($has_circular_member) {
break;
}
}
if ($has_circular_member) {
throw new DrupalUpdateException("Circular updates.");
}
?>
#57
The last submitted patch failed testing.
#58
I don't think this needs circular dependency checking. Any such dependency would be a bug in the code of items that form such a dependency IMO.
#59
This is great. However, when I run it with #563106-32: Cannot upgrade from Drupal 6 to Drupal 7 - meta issue, the comment update is ran first, before the system update which should be the first one to run.
comment module
Update #7000
* Comment order settings removed.
system module
Update #7007
* Inserted into {role_permission} the permissions for role ID 1, Inserted into {role_permission} the permissions for role ID 2
...
The patch attached fixes some notices during the update process.
#60
@scor: I'm not sure what you mean. This seems to be by design. The only constraint in that patch (we probably need more) is that comment_update_7005() needs to run after system_update_7038().
#61
subscribing
#62
Yeah if I'm understating this system properly, there is absolutely *no* default ordering, other than updates that specifically specify they need to run before or after other functions. So system.install running first is not guaranteed in any way.
#63
This patch adds a few more @after commands so that (in my testing) core can be upgraded when combined with #563106-41: Cannot upgrade from Drupal 6 to Drupal 7 - meta issue. Is there a reason why the tests from #56 were dropped?
#64
Actually at the top of update_get_update_list() we attempt to set all system.module updates to run first. We also set this up to be displayed first in update.php in the validate function, I think also wherever the updates are listed.
#65
@quicksketch : We definitely need these test. See ie http://drupal.org/node/521838#comment-2081246 and further down.
Hope someone can put these back in :)
#66
as mikey_p said in #64, I was referring to the comment at the begining of update_get_update_list()
<?php// Make sure that the system module is first in the list of updates.
$ret = array('system' => array());
?>
which should be consistent with the order in which the updates are actually run (unless there are before/after commands which move some update before any system.install update function).
oops, the tests dropped of the patch, I forgot to add them to bzr. I've added them to this patch.
I tried the patch #563106-41: Cannot upgrade from Drupal 6 to Drupal 7 - meta issue with this one on two Drupal 6 sites update with success! I got some failed updates due to the upload update but this dealt with in #563000: Remove upload module and provide upgrade path to file and does not block this patch.