As has been described in several issues (#349375: Sometimes update_help() execute ages, Drupal freezing, #222073: Update Status causing Drupal to run very slowly, probably others), the lack of responsiveness of updates.drupal.org can cause sites with update.module enabled to freeze on many pages (list in update_project_cache()).
While a simple workaround can be to disable this module, as suggested, this loses an important safety asset of drupal: the ability to be warned about the need to apply (security) updates.
The cause for this behaviour lies in the fact that the module triggers _update_refresh() for these commonly used paths, especially the already heavy admin/build/modules page.
Adding just one setting could avoid the problem like this:
- by default, don't change the current behaviour
- if this setting is enabled, do not check automatically on these admin paths, but only on expired cron or manual checks
That way, sites would still be usable without fear of such checks, which recur on every such access once a fail has occurred until fixed, and cause a multitude of dangling tcp connections from the sites to u.d.o., (typical symptom: netstat -an shows lots of LAST_ACK, TIME_WAIT and FINAL_WAIT to the u.d.o. IP) and actually prevent accessing admin/build/modules to disable update.module, meaning a direct DB server action is needed to restore operation.
And they'd still get their notifications from cron jobs.
Going one step further, it might even be valuable to disable update_cron(), since cron tasks can already be heavy, and enable another path for the equivalent of a manual update check, that would also trigger _update_cron_notify(), which is not the case with the current manual refreshes, since these are targeted to interactive use. This URL could then be used by (non-drupal) cron tasks to trigger periodic update checks, outside the time allocated to run hook_cron implementations.
| Comment | File | Size | Author |
|---|---|---|---|
| #7 | update-update_and_notify_frequency_fix-359171-7.patch | 2.74 KB | rjacobs |
| #6 | update-update_and_notify_frequency_fix-359171-6.patch | 2.68 KB | rjacobs |
Comments
Comment #1
dave reidComment #2
fgmFeature...
Comment #3
dwwThis doesn't need a setting. What we need is to ensure that there's no possible way for update.module to attempt to fetch new available update data via the requirements checking via update_help() on admin/*. If we have cached available update data, we can see if we're missing a security release, but if we don't have data, we should just give up there. This is intimately related to the problems at #220592: Core cache API breaks update.module: fetches data way too often, kills site performance, etc but this is a slightly different part of the bug, that will need a different kind of patch. Therefore, I think these should be handled in separate issues to make each patch easier to review and commit. At least, I hope it'll be easier this way. ;) If not, we can always roll the fix for this back into #220592 and mark this one duplicate after all...
Comment #4
dwwI'm going to call this "won't fix" based on re-reading #209242: update.module reports bogus results when .info files are newer than the cached available update data in detail (where the code in update_requirements() that can trigger a re-fetch was added -- see also #200028: Add db caching to update_get_projects() and update_calculate_project_data()). Once #220592: Core cache API breaks update.module: fetches data way too often, kills site performance, etc lands, these caches will actually work, and we're not going to be having the massive problems people are currently seeing. We really can't afford to *never* invalidate the data during hook_requirements(). However, once #220592 is in, we'll almost never do it except during the edge cases that #209242 was designed to fix...
Therefore, there's nothing to fix here...
Comment #5
rjacobs commentedGiven that there have been multiple issues floating around related to update_refresh() running too often or failing, etc., I want to be sure not to open a new one. From what I can see, this is is currently the best issue to work with.
I've done my best to sort through the discussion that lead up to #220292 being committed (which seems to have addressed the majority of issues related to the frequency of update_refresh() running, etc.). I've also had a look at #209242 and generally follow the logic that was added to check the ctime value of .info files to ensure update_refresh() gets triggered when cached data may be stale. The problems I've observed recently seem to be related to the combination of these two fixes that can play out in certain circumstances. Namely it's still possible to have update_refresh() running somewhat frequently on a /admin page load (with notable wait times), and it also seems that update notifications may sometimes be missed.
Here is what seems to lead to these problems:
What seems to happen under these conditions is that visiting the admin page will bypass the cache mechanisms and force an update_refresh(). This then leads to the specific issues I alluded to above:
It would seem that the first problem could be addressed by ensuring that the project .info ctime check (that's part of update_get_available()) also happens on cron. This would make it likely that cron would trigger the update_refresh() in the "background" before the update is requested due to a visit to an /admin page.
I suppose that a solution to the second problem could be to introduce a new variable (e.g. update_last_notify) that would be used to track the frequency that _update_cron_notify() runs, and update that separately from update_last_check. This way, the "Check for Update" configuration could be maintained regardless of the frequency of update_refresh(). An alternative to this could be to allow _update_cron_notify() to be triggered whenever update_refresh() runs. This could lead to notifications going out more often than desired, but at least they would not be missed.
I just want to get my preliminary analysis out there (hopefully in the correct issues thread) and see if there is something to this. I am more than happy to personally explore patches, but want to first see if I'm correctly isolating the problem I have been witnessing.
Comment #6
rjacobs commentedI'll take a stab at an initial patch here. I'm hesitant to introduce a new variable, but the more I think about it the more it seems like the best solution. We need a way to track when update_refresh() runs (update_last_check) in order for the project cache timing to work, but this frequency can't effectively be linked to the update notification functionality. So it just seems cleaner to have a separate timestamp variable for each.
The attached patch introduces this new variable (update_last_notify) and ties it to the update notification cycle, while leaving the update_last_check variable with the sole responsibility of tracking when update_refresh() runs. I've also added some logic to run update_get_available() upon each cron run to ensure the project cache is updated in the background, thus avoiding the remaining cases where it could still be fired during an /admin pageload.
Comment #7
rjacobs commentedI did at bit more research on this and discovered the following:
So I think I may at least be on the correct track with my approach. With that I went ahead and made some small adjustments to the patch (made sure I'm using the same update variable name being discussed in the Drupal 7/8 approach, more consistent comment terminology, etc.).
I'm hesitant to mark this issue as a duplicate of #1263040: No notifications sent when updates are available (followup and tests) for a couple reasons:
Comment #8
rjacobs commentedMarking this as "major" because it relates to a potential failure in the update notification system that some may depend on for security update information.
Comment #9
akamaus commented@rjacobs, looks like you made a substantial investigation. Can you please describe code paths which lead to downloading the update information and sending email notifications? The logic seems quite convoluted for me and, afaik, undocumented. I guess it's the primary reason of the current sad situation.
Comment #10
rjacobs commentedHi akamaus. Do you mean provide some added core code documentation in patch form, or provide additional elaboration in this queue about the original vs. patched logic introduced in #7?