| Project: | Drupal core |
| Version: | 7.x-dev |
| Component: | update system |
| Category: | bug report |
| Priority: | normal |
| Assigned: | kwinters |
| Status: | needs review |
Issue Summary
When updating from admin_menu 6.x-3.0-alpha1 to 6.x-3.0-alpha3 in Drupal 6.14 I received the following PHP warning:
array_pop(): The argument should be an array in [path]/update.php on line 314.
The line in question is:
<?php
list($module, $version) = array_pop(reset($_SESSION['updates_remaining']));
?>In D7 HEAD the line has been moved to 127 but is otherwise identical.
When I ran update.php again, the update finished normally. In both cases it was the only module that had changed (I updated nodewords just before that, but had already run update.php for that).
The main problem might be with the admin_menu module, but the fact that update.php doesn't handle it gracefully and doesn't give useful feedback is a bug in update.php.
Just reporting for now, hopefully patch to come later.
Comments
#1
Forgot to mention: #307570: Database errors when upgrading from Drupal 5 to Drupal 6 is the only other reference to this error message in the update issue queue, but there's a lot going on over there that isn't directly related to this issue.
#2
Made some progress, posting notes since I'm out of time to spend on this for today.
The updates are run as a batch, and when the batch is finished function update_finished is called, which puts the results into the session.
After that returns and the results page is displayed, the code makes the assumption that either update_success will be true OR updates_remaining will contain a non-empty array with exactly one item in it.
The first thing to fix is to validate updates_remaining and give an error message if it's empty / not an array.
Next, the reset() call should be turned into a foreach and every updates_remaining entry should output a message.
Finally, figure out what situation can lead to updates_remaining being empty on failure and provide more useful warning messages (narrow it down to the module / update number, for example).
The admin_menu update was not repeatable (likely a mysql timeout, etc.) so I'll also need to create a test module and / or SimpleTest.
#3
The original bug was not reproducible, and it looks like most of the problem has been fixed in D7 batch improvements.
However, the research revealed a related problem: an error during update will only abort the process for that one module. If multiple modules error out, only one error message will be displayed. Attached patch will output all errors from the session rather than just the first one, AND output a more useful message if the error list is not in a valid format.
#4
#3: update_multiple_errors.patch queued for re-testing.
#5
This is still worth fixing, and will become important soon because the upgrade path work is about to kick off. Can anyone help review?
#6
The issue here is actually somewhat more broad.
The current code in D7 is correct to only use the first array element in $_SESSION['updates_remaining'], because you can only trigger, e.g., one fatal PHP error on the page. The other elements in the array represent update functions that never had a chance to run, not update functions that triggered errors themselves, so we don't want to display them as errors. For that reason, the patch in #3 doesn't seem correct to me.
The biggest problem, though, is that the current code only works correctly in the case of PHP fatal errors or other situations where the batch process was interrupted. In D7, however, we except hook_update_N() implementations to throw exceptions, and those exceptions are caught and handled, so the batch API and $_SESSION['updates_remaining'] don't know anything about them.
Putting it all together it means that not only multiple update failures, but if fact any update failures corresponding to a handled exception, are ignored and treated as almost indistinguishable from the 'success' condition by update.php. I think to fix that we need a larger cleanup such as in the attached. This makes sure that all update failures, whether handled exceptions or not, get prominently highlighted on the update results page.
I'm also attaching some screenshots.
#7
In the code comments of the patch, I referred to user interruption (in addition to a PHP fatal error) as one way the current error condition can be triggered. Trying that out, I'm actually not sure I can reproduce it - e.g., by trying to hit the stop button in my browser or navigating away from the page while update.php is running. I thought in general it was possible to interrupt a batch process that way, though.
In any case, PHP fatal errors definitely do it :)
#8
In #3 I attached a screenshot where I triggered multiple errors and reported them using the #3 patch. Here were the update functions:
<?phpfunction test_goto_update_7125() {
$ret = array();
$ret['#abort'] = array('success' => FALSE, 'query' => 'fail 1');
trigger_error("Cannot divide by zero", E_ERROR);
die();
return $ret;
}
?>
and
<?phpfunction ex_test_update_7125() {
$ret = array();
trigger_error('Test 2', E_USER_ERROR);
die();
}
?>
Note that both of them are fatal, but they still both got run. I'm pretty sure this is just a result of how the batch process works (multiple executions). So, I still definitely think that $_SESSION['updates_remaining'] needs to be treated as an array like in #3 and I have a feeling that the patch doesn't handle that condition correctly.
Hopefully I'll get some time in the next few days to go over the patch in more detail and / or reroll, but if someone beats me to it, this is one way to test.
#9
Agree that we need better feedback from the batch process.
I spent part of yesterday testing out D6 -> D7 updates. First a did not get what went wrong with the db. updates until I realized that if node_update_7002 failed node_update_7003, 7004, 7005 did not run. We should inform the user that one update failed and list the ones that did not run caused of decencies rules. Now if it fails you have to dig into module.install files to figure out if/what other updates that did not run if one of them failed.
#10
Well, you will still need the code to find out what the updates actually do and why they failed. I don't think spitting out the code on the error screen is reasonable (and you need to look at it eventually regardless).
However, we can at least inform the user that some updates may not have been run because of these errors, and potentially which ones. I really think that this should be a different issue though, so please split it off. Single issues that try to do too much often don't get done at all.
#11
@kwinters, I tried your example code but modified it to add this line at the top of each function:
<?phpsyslog(LOG_NOTICE, __FUNCTION__ . " was called");
?>
So I could see for sure which functions actually were run. When I did this, only one function appeared in the syslog, which is what I expected. The batch API bails out when it hits a fatal error; I don't see how you can ever hit two of them.
I think your patch showed two errors on the screen because it reported all functions the batch API didn't finish running, but that includes ones it never tried to run also. Using @steinmb's example, it would have reported node_update_7003, 7004, 7005 as having "errors", even though those functions never ran.
I agree we might want to display somewhere the functions that never ran (maybe in the detailed report at the bottom of the update page) and also agree that that may be best left to a separate issue :)
For reporting errors, though, we should only report the actual functions that were known to have problems when they ran, and that is what I was trying to do with my patch.
#12
It looks like changes to the batch module underlying the update process in the last 9 months have removed the need for multiple fatal errors. I'm pretty sure my patch didn't report anything that didn't run, but that's irrelevant now. I tried it again, and normal fatal errors (function not defined, etc.) kill the batch process immediately and display an error. Multiple Exceptions where also handled fine.
However, 2x trigger_error with E_USER_ERROR level reports both issues in the log, but I couldn't get it to display a nice update.php result screen with either patch. Is this working for you? I ran two update funcs, each containing only trigger_error('Test 1', E_USER_ERROR); with variations in text message. E_USER_ERROR is handled differently than normal fatal errors, so maybe this is just an unhandled case. Screenshot attached.
#13
Ah, yes, that could explain it :) I believe it is the case that some drastic changes were made to the internals of the batch API in the last several months.
Using trigger_error() with E_USER_ERROR, I can reproduce the same thing as you. I am not sure if that is a bug or not; it is documented in the hook_update_N() docs that exceptions should be thrown in case of an error.
#14
Considering the hook_update_N() documentation, I'd consider it semi-OK to not treat incorrectly-reported errors (trigger_error) special. It would still be good to display them, however. Old update functions won't necessarily be updated for D7, and (not that this happens often in updates) external libraries could report errors however they wanted.
More importantly, this might represent a bug or design issue with the batch process underlying it. All that code is used by other batch processes, right? This behavior would need to be documented elsewhere as well, or perhaps fixed in the batch code.
In either case, that has gone beyond my knowledge of the batch process. Split it off into a different issue?