Problem/Motivation
Under circumstances system updates will not be executed as the first ones even if there are no dependencies preventing this. However there are cases where the system and the DB should be prepared before other updates are allowed to run.
Actually there are a lot of places in the core, which are suggesting that it is desired that system updates are the ones being executed first, but under some circumstances this is not completely guaranteed - e.g. having one system update and multiple updates of another module leads to assigning lower weights to the updates of the other module.
Proposed resolution
In update_build_dependency_graph()
find update functions without edges to system updates and add such updates as edges to system updates so that system updates are executed first / get assigned lower weights by \Drupal\Component\Graph\Graph::searchAndSort()
.
Provide a general check in \Drupal\FunctionalTests\Update\UpdatePathTestBase
assuring system updates will be executed first.
Remaining tasks
Review.
User interface changes
None.
API changes
None.
Data model changes
None.
Comment | File | Size | Author |
---|---|---|---|
#89 | 1894286-nr-bot.txt | 4.58 KB | needs-review-queue-bot |
#85 | 1894286-85.patch | 11.37 KB | DanielVeza |
Issue fork drupal-1894286
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #1
larowlanWe have DependencyHookInvocationTest.php to test the order is correct, is it possible that we have a circular dependency?
Comment #2
larowlanModules implementing hook_update_dependencies().
All of these are concerned with getting the users_data sorted so there is no reason why user.module updates should come before the system.module ones
Comment #3
chx CreditAttribution: chx commentedDependencyHookInvocationTest only tests when dependencies are enabled. The order does seem to be the same all the time but when it looks like I won't trust it...
Comment #4
BerdirAs discussed with @chx, changing the priority of this to normal.
I haven't seen random upgrade path test failures like we used to have frequently in quite a while and while this does look strange, it is IMHO neither critical nor major until we have proof that this actually causes problems with the upgrade path.
Comment #5
DamienMcKennaClosing this in favor of #211182: Updates run in unpredictable order which has several patches and over 200 comments.
Comment #6
DamienMcKennaSorry, I hadn't realized the other issue was committed and that this is a symptom of the problem still existing.
Comment #12
hchonovI was experiencing this yesterday and today and suddenly it disappeared. While debugging for #2951279: After 8.5.0 upgrade: Unknown column revision.revision_default it happened multiple times that the
system_update_8501
was executed as one of the last instead as one of the first updates. This cased my Drupal update to fail, because it is necessary that this particular system update is executed before performing entity CRUD operations in other updates.I've cooked something which should further ensure that system updates are executed as soon as possible.
Comment #13
hchonovHaving a bunch of contrib updates and updating Drupal from one minor to another minor version could be classified as dangerous if we cannot ensure correct update order.
Therefore I think that this issue is at least major even if the failures are random, as we now have a lot of complains in #2951279: After 8.5.0 upgrade: Unknown column revision.revision_default , which for me is enough proof that there is something wrong.
Comment #14
hchonovI've finally found something which is reproducible - update of a "standard" profile installation from 8.4.8 to 8.6.x revision "e9dd745e03168b4759f8d3261b3ebd93246cb168".
Case 1:
There are no update dependencies. The updates order as returned by
update_resolve_dependencies()
looks like this:There is absolutely no reason for the taxonomy updates to be executed before the system update, but this is what
\Drupal\Component\Graph\Graph::searchAndSort()
generates.Case 2:
If we define through
hook_update_dependencies
thattaxonomy_update_8502
andblock_content_update_8400
have to be executed beforesystem_update_8501
then the updates order looks like this:With the attached patch to this comment the cases look like this:
Case 1:
Case 2:
The new patch ensures that system updates will be executed as soon as possible.
Comment #15
hchonovIf there are multiple system updates and there is an update defined to run before the first system update, then we should not add regular updates as edges to later system updates otherwise the dependencies will not be executed before the system updates.
What went wrong with the last patch is this case:
Define that
taxonomy_update_8502
should run beforesystem_update_8501
and add new updatesystem_update_8502
.With the previous patch the system updates will be executed before the taxonomy updates
taxonomy_update_8501
andtaxonomy_update_8502
, but it should be the other way around and the new patch is making this right.Comment #16
hchonovI've implemented a general test asserting that system updates will be executed first.
Comment #18
hchonovComment #20
hchonovComment #23
tstoecklerHad a look at the patch and it seems to be doing three things. None of which seem inherently problematic, but I'd like to get a better understanding why they are needed and also why all three things are needed. The three things are:
Notes on the actual patch:
As far as I can tell the code here is correct, but I think this could use an example with a bunch of update functions and their weight before and after calling this function.
Should we not do a
array_unique()
on this? I think there could be multiple paths from one update function to another.This is a mix between describing the behavior generally and providing an example. I think that makes it less understandable than just giving the example. Also I think the fact the U1 has an edge to U2 is kind of missing in the explanation. I suggest:
The code here is correct, but it is strange to read
foreach ($graph as $non_system_func)
since the graph contains system functions as well. What do you think about doing a$non_system_graph = array_filter($graph, ...);
before this?I think we should do an
array_unique()
on the result of thearray_merge()
Same question here: Could we use a
array_filter()
here?I don't think this is being used
Comment #24
hchonov1. Add +1 to all weights of non-system functions
is needed to ensure that system updates will run before the other updates when in the same group of weights. So if the graph outputs something like -then this weight reordering would ensure that system updates will be the first to run in a group, because we put them in a group before the group they initially belonged to.
2. Add edges to "grandparents" of system updates
During my manual tests I've noticed that somehow the order U1=>U2=>S1 might break if not both U1 and U2 get an edge to S1. I think that it is not wrong at doing so when it increases the stability.
3. Add edges to system updates to updates without any system update edges
This is one more step into increasing the stability and is similar to 1. by ensuring that everything depends on the system updates.
One might argue that 1. and 3. are doing the same thing, but I thought that the more safety features we have the better.
About the patch:
I've thought that it is not a problem to set the same update multiple times as an edge :). But I guess you are right, that it is better to return only the unique entries. Done.
This is used when we come to a function, which is not a system update. Consider the forbidden case S1->U1->S2 (given they don't have dependencies between each other), now when we arrive at U1, we'll set
$system_updates_finished
toTRUE
and when we reach S2, we'll assert that this variable is stillFALSE
, which will then fail as expected.Comment #25
tstoecklerRe
1. ...
2. ...
3. ...
So I think if 2. is in some way required that seems like a separate bug. And while you're right that doing both 1. and 3. isn't a problem in and of itself, but I do think adding code to an already complex subsystem comes at a high maintenance cost. So I would prefer we only do either 1. or 3. I personally think 3. is a bit more elegant than 1. but I'll leave that up to you. If I understood correctly the test wouldn't need any changes.
Re patch fixes:
1. I think that's already helpful, thanks! (Although we might have to update/remove it per the above.)
2. ✔️
3. ✔️
4. No, that's fine.
5. ✔️
6. See 4 😉
7. Yup, you're right. I had the logic messed up in my head.
Leaving at needs review because my only criticism is a conceptual issue and it wouldn't hurt to get more eyes/opinions on this.
Comment #26
tstoecklerThis needed a re-roll, but Git was able to handle it.
Comment #28
kfritschePlain re-roll.
Comment #31
geaseSimple reroll.
Comment #32
hchonovThis should fix the failing tests complaining about vfs. The other one does not fail locally. Let's give it another try.
-----
I must've got confused - the test still fails locally and therefore I've re-opened the issue where it was introduced and posted an additional test for demonstrating the problem - https://www.drupal.org/project/drupal/issues/3031128#comment-13349092
Comment #34
hchonovAccording to the referenced issue we are not allowed do this check in this particular test and therefore I am simply overriding the method with an empty body.
Comment #35
baikhoPatch doesn't apply anymore on HEAD of 8.9.x and 8.8.x, needs a re-roll.
Comment #36
baikhoComment #37
gnunes CreditAttribution: gnunes at bio.logis Genetic Information Management GmbH commentedI've rerolled this patch for 8.8.x, 8.8.5, 8.9.x, 9.0.x and 9.1.x.
It needs review, please.
Comment #38
alexpott@gnunes if a patch applies to more that one branch then there is no need for multiple patches. So we only need on 8.x patch and one 9.x patch.
Comment #39
gnunes CreditAttribution: gnunes at bio.logis Genetic Information Management GmbH commentedComment #42
hchonovThe 8.8 patches are failing because of the committed #3063912: Move UpdatePathTestBase::runUpdates() to a trait.
The 8.9.x branch passes because of the committed #3098475: Add more strict checking of hook_update_last_removed() and better explanation, which has not yet been committed to 8.8.x.
Comment #43
hchonovJust to ensure that #3098475: Add more strict checking of hook_update_last_removed() and better explanation fixes the 8.8.x fails.
Comment #44
alexpottThis bugfix shouldn't be changing the text.
This is a worry. I think the code in assertSystemUpdatesPriorityExecution should be able to run with broken stuff. Because otherwise contrib tests could hit this too. Or future core tests.
Comment #45
alexpottRe #44.1 hmmm actually I don't get the changes to system_requirements... ah I see it included #3098475: Add more strict checking of hook_update_last_removed() and better explanation. Let's get a new set of patches on this issue.
Comment #46
alexpottHere's the same patches as #37
Comment #47
chr.fritschI just wanted to note that this patch also fixes #3131523: Running update path with core 8.8 and pathauto breaks
Comment #48
alexpottGiven the importance of running system updates first and in-wild issues that would benefit from this I'm promoting this to critical as I think it will help the overall stability of our update eco system.
Comment #49
andypostMaybe new function should start with underscore or it is supposed to be new API?
Comment #50
alexpottGood idea @andypost start with an underscore and have @internal. This is definitely not API.
Comment #51
andypostHere's a fix for #49 and CS
Comment #53
andypostValid patch for 8.9
Comment #55
dwwKnowing this is a critical bug, trying to give this a thorough review. I haven't (yet) read any of the comments in the thread. Just trying to make sense of the patch on its own. Having some trouble with that. ;) Hopefully this is useful feedback. I know the db update system is confusing and hard. But I'd love to try to help keep it accessible for a wider swath of the core contributor community to make sense of.
Reading this whole method a few times, I'm not totally clear what "group" means here. It's new terminology to me for the DB update system. Does a group just mean "all updates with the same weight value"? Do we need to introduce and use "group" here for this?
Given the complications below, can we get a slightly more complex/realistic example to help make sense of the algorithm this function is implementing? This example is so trivial, it doesn't add much to understanding the function.
The example from above says we're going to start with 2 updates with weight 0. So we're going to have:
$new_weights = array_combine([0,0], [0,0]);
Due to the collision on 0, isn't that just going to give us:
$new_weights = [0 => 0]
? Is that intentional? Seems a bit wonky and magical. If that's intentional, can we document it? If it's a bug, I'm shocked, but then we should fix it. ;)This nested foreach could use some comments. I still don't fully understand why it's doing this in this way. The
$old_weight
might be much higher than the$system_update_weight
, but we only increment$new_weight
by 1.If the goal is to ensure that everything *not* a system update has a weight higher than all the system updates, I'd expect we'd keep track of the
max()
of$system_updates
and then in the final loop, we just check each update, and if it's not a system update, we simply add$max_system_update_weight
to the defined weight. That seems more efficient, more obvious, and more clear. But it's probably the wrong behavior. ;) I don't know why. Please comment this to help me/everyone understand what's going on. ;)And here... don't know what "in a group with a system update" actually means.
instead of $func, can we call this $update_name for consistency with the above code, to avoid confusion that it's some magic PHP syntax inside a lambda (anonymous) function? ;)
A) Wants a comma after the first "U2".
B) "pointing to the a system update S," doesn't parse. I think we can remove "the a" and have "pointing to system update S". Maybe "S1" is better for consistency with the U* examples?
Why is everything in this graph $non_system_func? I don't get it just from reading the patch. Oh, because of the line right below it:
Hrm. I guess this works. Not sure there's a cleaner alternative.
Very efficient and clever. And also a lot to try to make sense of in a ~160 char line. ;) Probably I'll be outvoted and this should stay as-is. But for mere mortals trying to understand this code, maybe it's worth unwinding this a bit? /shrug
A) It's slightly confusing that these loops have the keys as system_function vs. non_system_function in advance, but then immediately test data['module']. More of the same from 2 points up. But I don't see a cleaner way to do this. /shrug
B) Trying to make sense of this whole block of code, it seems like thanks to the outer loop, for a given non-system update U1 that has no edges pointing to a system update, we're going to add U1 as an edge to *every* system update. Right? If so, I think the comment above should say that's the intended behavior.
We re-compute all this many times. It's not clear why. Can't we do a single pass through the entire graph, figure out if each update has a system_update_edge once, and then re-use that information inside the nested loops? Might make the whole thing easier to understand, and perhaps more efficient overall?
A) The whole first sentence seems like "test might retrieve schema version before updates are executed, which will lead to (something else) will retrieve schema version before updates are executed". ;) So what? I don't know why we need to prevent that. Maybe we don't need to explain, but it seems a bit circular.
B) "To prevent this, we ..." (wants a comma).
2nd sentence doesn't parse: "... update information have been called they might..." -- huh? I'm not going to pretend I understand this enough to suggest better wording. ;)
I thought we didn't do periods at the end of @see comments like this.
I have very little idea what this code block is doing. Clearly my PHP lambda function fu is lacking. Why pass $db_update_controller twice to bindTo()? Reading https://www.php.net/manual/en/closure.bindto.php only partly explains. I guess it's not the job of inline comments to teach readers about such intricacies of PHP closures, but ugh.
Naive Derek asks: "Why not just call
$db_update_controller->getModuleUpdates()
directly?" ;) WTF do we need the anonymous function and doublebindTo()
for? Help?!$a === 'b'
seems harder to mentally parse than$a !== 'b'
. Maybe that's just me. Also seems to rely too much on operator precedence. We don't want to negate $a and then compare to $b. We want to negate the comparison of $a and $b. If I understand correctly, let's use !== for this.Elsewhere in this patch, we use:
foreach ($updates as $update_function => $update_data)
which seems a bit more self-documenting than this.Thanks/sorry!
-Derek
p.s. Not changing the status, since I don't want to hold up a critical bug for mostly minor cosmetic commenting stuff. We could open a follow-up. But if anyone wanted to re-roll to address any/all of this, I'd feel a lot more comfortable RTBC'ing.
Comment #57
daffie CreditAttribution: daffie commentedThe number of updates this function returns is only two (system_update_8901 and update_test_semver_update_n_update_8001). This is so, because the number of installed modules is very small (system and update_test_schema). We should add a new test with all or almost all core modules installed.
Comment #60
SpokjeComment #61
SpokjeGonna have to let this one go since, I really don't grok most of the logic in this patch.
Comment #62
dwwTagging this as a possible target for the #BugSmash initiative...
Comment #67
Taran2LComment #68
Taran2LOK, the idea of the latest patch is to run updates with modules dependencies in mind. Naturally, it puts core (system) things first. Tested this approach with Open Social based project upgrade with 220 updates. It worked.
Let me know if we should move in this direction.
Possible pitfalls: this patch might create a circular dependency ...
Comment #69
Taran2LComment #70
daffie CreditAttribution: daffie commented@Taran2L: Your patch needs testing.
Comment #71
Taran2L@daffie, yes definitely. But before investing time into it I would like to see some pre-approves of the taken direction.
Comment #72
daffie CreditAttribution: daffie commentedOpened a thread on the MR.
Comment #73
alexpott@Taran2L when you provide a whole new branch to review on an issue it is on you to provide information to the previous branch author and other following the issue as to how your new work differs from past efforts. The comment #68 needs to explain way way more - like why it might create circular dependencies. And it needs to add the test coverage from the previous work on this issue because just saying this fixes things is not enough in this area. FWIW when I've looked that this issue before #53 was a good solution and all we needed to do was to address #55 much of which was cosmetic. Can't we go back to trying to do that.
Comment #74
Taran2L@alexpott, OK, fair points, but I'm not sure the language is. Isn't one of the crucial feature of the forks is to allow of different opinions ?
Anyway, the main difference in #68 from previous work in #53 is that it changes order for every pending update (not only core ones), forcing an update subsystem to take into the account dependencies of the modules. This actually broads the scope of the issue, but also (naturally) makes system updates to run as soon as possible (just because most of the code in the end depends on system module)
Technically it makes each first pending update of each module to depend on the last updates of all the modules that this particular one depends on
This approach might create a circular dependency in conjunction with
hook_update_dependencies()
when calling code is setting expectation on the order. Personally to me, this should not be an issue in the real world, because it's just wrong to expect updates from the things that a module does not explicitly depends on.Also, as I asked in #68
If this is a direction that is interesting, I can provide more details/description/create a followup etc, but I don't think investing my time into a direction that won't be considered/committed at the first place worth it.
Comment #75
alexpott@Taran2L thank you for #74 - I was not saying that alternate approaches are not welcome I was saying that when an alternate approach is offered it needs take into account the previous state of the issue. Given the prior approach was close to rtbc given @dww's review that means completely new need come a bit more fully formed than #68. I'm sorry that I did not thank you for working on this complex issue. It will be great to have this solved.
I'm not sure that the logic as explained in #74 is correct. It's perfectly possible for a module to needs it's update to come in-between another module's (that it is dependent on) update. This is why
hook_update_dependencies()
is bi-directional - i.e. you can set the dependencies of your own updates but you can also set dependencies of other updates.Comment #76
Taran2L@alexpott, no worries ..
So, tests from previous version do pass with the new approach without any adjustments. Is it worth to explore the thing more ?
Comment #77
alexpott@Taran2L as far as I understand from your description one of the documented purposes of
hook_update_dependencies()
will be broken by the patch. Ie...So I'm not sure that it is worth exploring - unless I'm wrong it does not break that.
Comment #78
Taran2L@alexpott, the proposed solution works like that:
hook_update_dependencies()
rulesanyway, in theory ... I think core lacks such complex test coverage, so can't tell for sure.
Comment #79
Taran2L@alexpott, my updates does preserve
hook_update_dependencies()
only when there is no dependency between modules.So, it needs some work. I've spent some time on it and was not able to achieve significant progress. I guess, we need to get back to #53 and (maybe) create a follow up to fix the order for every single module.
The idea is to:
Comment #80
alexpott@Taran2L I think the explicit dependency needs to be the last thing - it's there to resolve super gnarly issues.
Comment #81
xjmUnpublishing the second MR (and also hiding the previous patches before the first MR for clarity). Thanks all!
Comment #83
bircherI tried to grok the patch and I am not sure I understand all of it.
But reading the description of
_update_prepare_graph_for_sort
it seems to me that a much simpler algorithm could be used to change the weights: double all the weights and subtract 1 if it is from `system` (or add 1 if it is not from system)ie
I am not sure I can say something about the rest of the patch without digging in more.
Comment #85
DanielVezaThe suggestion is to revert back to #53, so here is a reroll of that for D10.
Now to address the feedback in #55
Comment #87
larowlanPatch at #85 was never reviewed
Comment #88
larowlanTagging for issue summary update, but likely will be 'address feedback in #55'
Comment #89
needs-review-queue-bot CreditAttribution: needs-review-queue-bot as a volunteer commentedThe Needs Review Queue Bot tested this issue. It either no longer applies to Drupal core, or fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
Apart from a re-roll or rebase, this issue may need more work to address feedback in the issue or MR comments. To progress an issue, incorporate this feedback as part of the process of updating the issue. This helps other contributors to know what is outstanding.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.