I ran "check updates" and everything was fine. Then I ran Cron (15 seconds later), and I no longer had any update data. It is set to weekly checking, if that matters, but the data never lasts that long.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

NancyDru’s picture

Version: 6.0-rc4 » 6.0

Why am I losing this data just by running cron?

dww’s picture

Status: Active » Postponed (maintainer needs more info)

Good question. I suggest you try to debug it and let us know what you find out. ;) Thanks.

NancyDru’s picture

Am I the only one seeing this behavior?

dww’s picture

IIRC, you had a similar problem back with update_status 5.x, and no one else could reproduce it. Something seemed to be screwed up with the caching on your site, such that it never managed to save the cached data. But no, I've never seen this problem in all my testing of update.module in D6 core (or 5.x contrib for that matter).

Perhaps this is related to #222073: Update Status causing Drupal to run very slowly and/or #223925: Performance issues behind proxy but I have no time to look closely right now.

NancyDru’s picture

I see the following cid's in cache_update both before and after running cron:

  • update_project_projects
  • update_info
  • update_project_data

In the variable table I see:

  • update_advanced_project_settings
  • update_check_frequency
  • update_last_check
  • update_notification_threshold

None of them change value when cron runs.
Modules enabled:

  • required
  • update

The problem happens with only these modules enabled.

Is it possible this is just a problem with the status report?

NancyDru’s picture

Running Cron from the status report, it reports that Cron ran successfully, but update_get_available() returns an empty array.

NancyDru’s picture

Okay, I had a minimum cache lifetime set to 3 minutes. When I changed it to none, the problem went away.

BTW, should update_get_available() be running on every admin page?

dww’s picture

Category: bug » support
Status: Postponed (maintainer needs more info) » Fixed

Ok, great. Glad to hear that the problem was in fact the cache lifetime on your site.

yes, update_get_available() should be running on every admin page. however, if your caching is working, all it does is get the cached data. it runs on every admin/* page so that if your site is missing a security update, there's a drupal_set_message() that your site is running known-insecure code on every page in admin/*. this is by design.

NancyDru’s picture

Category: support » bug
Status: Fixed » Active

Update sets its own expiration, so "minimum cache lifetime" should not apply. Besides, the text on that setting says "applies to both page and block" - Update is neither.

dww’s picture

Title: No update data available » Core cache API breaks update.module: fetches data way too often, kills site performance, etc
Version: 6.0 » 6.x-dev
Priority: Normal » Critical

See #319486-3: Fetched update informations at every cron and beyond for more. Basically, lots of things can break update.module's caching of available update data:

- site that set a [sic] "minimum" cache lifetime
- improperly configured memcache
- ...

AFAIK, this is the earliest reported instance of this bug (and there are a *lot* of duplicates in various shades) so, let's use this one for the final fix...

The current proposal (which I still haven't gotten confirmation from Dries and/or Gabor about the viability of being committed to D6) is to move {cache_update} out of the regular cache API into an update.module-specific table that just does its own non-volatile storage, without relying on the (highly fragile) core cache API. I think it's the safest solution for this in D6 for many reasons (mostly explained at #319486, especially comments #8 and #15)... There's no data migration problem, since we can just drop the current table and make the new one in the schema upgrade, and the actual code changes will probably be less than 50 lines total (with comments).

catch’s picture

dww’s picture

Assigned: Unassigned » dww
Status: Active » Needs review
FileSize
11.77 KB

Over at #319486-16: Fetched update informations at every cron, Gabor said he didn't have time to look closely, but was in principle ok with this strategy assuming Dries doesn't veto it. At the risk of wasting a few hours of my life on a patch that's doomed, here's one for D6 that implements the proposal explained above.

  • DB update to drop {cache_update} and add {update_cache}. The new table doesn't mess with the 'headers' column, since that's meaningless for this cache.
  • Adds 3 new (heavily PHP doc'ed) private functions to update.module:
    • _update_cache_set()
    • _update_cache_get()
    • _update_cache_clear()
  • changes the few call sites in update status to use these new methods
  • changes update_flush_caches() to point to the new table (so things are still always cleared when you run update.php -- I think that's reasonable)

I've verified that this patch solves the bug reported here, at least in terms of minimum cache lifetime. Here's how to reproduce the bug:

  1. Start with a clean D6 core test site.
  2. Visit admin/settings/performance, set "Minimum cache lifetime" to "1 min".
  3. Visit admin/reports/updates, click "Check manually", and see that "Last checked" says "0 sec ago" (or maybe "1 sec ago").
  4. Stand up from your computer, breath deeply, relax your eyes, and stretch your wrists for a little while.
  5. Reload admin/reports/updates and see that "Last checked" is some reasonable value larger than 1 second.
  6. Do anything that core causes core to invoke cache_clear_all(), for example, add a comment to a node on your site, press the "Save configuration button" on admin/settings/performance, etc.
  7. Reload admin/reports/updates and notice that "Last checked" is back to "0 sec ago".

Without the patch, step #5 only works if #6 doesn't happen during #4. ;) Once you apply the patch, you should be able to take as long as you want at step #4, and repeat #6 as often as you like, and you should only see #7 if you actually click "Check manually" again, or if a day goes by and cron checks for updates again...

More testing and careful review would be most appreciated. This does *NOT* fix #359171: Avoid cases where update data is refreshed on admin pageloads and ensure update notifications are always sent as configured yet -- that'll be a follow-up patch.

Also, as I explained at #319486, this is *NOT* a "fix it first in D7 and backport it" bug fix issue. We might radically change how the caching API in D7 works to solve this a different way. And, this bug is right now crippling sites (and putting a punishing load on the d.o infrastructure in the process). So, we're definitely need to fix this ASAP in D6 (preferably before the next core release) and then take our time figuring out how we want to fix this long term in D7.

Thanks,
-Derek

dww’s picture

@catch Re: #11: Thanks for the pointer. Indeed, that's the opposite problem. Further evidence of how utterly broken the "minimum cache lifetime" setting is, but that's not a solution to this bug. If you use the D6 backport patch from there (http://drupal.org/files/issues/cache-DRUPAL-6.patch) and try to repeat the steps in #12, you'll see the same broken behavior.

dww’s picture

NancyDru’s picture

Wow, Derek, I had for gotten I even posted this. I will try to test this later today. Who, knows, maybe I can turn Update back on for some of the worst sites. I have some sites that I had to turn it off or I could never get into admin, even increasing my max execution time.

pwolanin’s picture

@dww - need to dig in more - but kind of seems like this sort of data could go into the general {cache} table with CACHE_PERMANENT as the lifetime? The it should (I hope) only expire when you specifically purge it?

dww’s picture

@pwolanin: I'm skeptical. That might be true, but a) it'd require a LOT of heavy testing to ensure it's really true and b) I'm nervous that's still not going to save the day if memcache is involved. Given how much I've been burned by the core cache API in the recent past, I'm probably too jaded to come to a clear-headed conclusion, but I'd rather just use a completely separate table and a few helper functions and guarantee that no other external factor is going to break this code and purge the data at the wrong time.

NancyDru’s picture

+1 for a separate table. When I added caching to Taxonomy Image, I did some timing tests. Using the general table (cache), the queries were really slow, even on a lightly loaded system. The only way to get my pay-off was to create a separate table; then it screamed.

dww’s picture

There's no need to merge into the shared {cache} table. We've already got a separate table, we might as well keep it. In fact, maybe all the schema updates from #12 are stupid (just to rename the table from {cache_update} to {update_cache}). Even if we stop using the core cache API, we could probably just keep the table exactly as it is, and just start using our own private helper functions so we're not calling cache_(get|set) anymore.

That said, it's possible that something like the attached patch is all we need. I haven't tested with memcache at all, so I have no idea if that completely honors CACHE_PERMANENT like it should. However, this does solve the minimum cache lifetime woes from #12. I'm still a little skeptical, but if this really solves the problem, it's obviously a lot less of a change in D6 (and we might as well just commit it to HEAD, too, even if we revisit the cache API to fix some of the brokenness there).

There are a few other places this patch should probably touch. It currently never clears the cache of projects and the current status of the site (not the fetched available release data -- that's a separate story) unless we're on certain pages, we manually check for updates, or via cron. Previously, those were cached for at most an hour, now we're potentially caching much longer. Since we're using CACHE_PERMANENT, we don't have an easy way to keep track of when these should expire. We'd either have to stash those as variables, or put them in {cache_update} itself in another record or something...

Anyway, this needs more thought, testing, and I should look more closely at the rest of the code (especially when it's not past my bed time), but this might be a much smaller change that solves the problem, which should therefore be much easier to commit to D6.

NancyDru’s picture

Well, the cache gets cleared when a module is enabled or disabled. In between those events, I don't really see any major reason for it to expire.

pwolanin’s picture

@dww - the default param for the cache_set() is $expire = CACHE_PERMANENT, so you could just omit that param.

dww’s picture

Status: Needs review » Needs work

@NancyDru: The cache of info about your current status (are you out of date or not?) should be cleared when you upgrade the version you're currently running. That's not the same as enable/disable. Unless you happen to visit the right pages, update.module would continue to think you're missing an update. In the case of a missing security release, you're still going to get that red warning on every admin/* page load until you happen to visit the update report itself, which will clear the cache, and then say there's no problem.

@pwolanin: Duly noted, thanks. I'm tempted to leave it as explicitly setting CACHE_PERMANENT, though, from the perspective of defensive programming. If we go this route, it's vitally important that these are CACHE_PERMANENT, and I think it's not unreasonable to spell that out explicitly in the code. We could also add a "// This defaults to CACHE_PERMANENT, which we desperately need..." comment above each call site, but I'm not sure that's any better.

@all: In discussion with killes in IRC, it seems like there's really no good solution to the memcache problem without either patching the drupal memcache glue itself (which is complicated in many ways) or just avoiding the core cache API as I originally propose.

However, I did have a nice idea to simplify #12 a lot. There's really nothing to gain from renaming {cache_update} to {update_cache}, which is where the DB update and the bulk of the code in #12 comes from. We can continue to use the identical {cache_update} table, we just need to stop using cache_(get|set) and use _update_cache_(get|set) instead. Then, we can still use {cache_update}.expires to keep track of when the data should be purged, but ensure that we're the only ones actually purging that data when "minimum cache lifetime" is set, and avoid having this data live in memcache at all. I think that's the winner. I'm going to re-roll now. Stay tuned.

dww’s picture

Status: Needs work » Needs review
FileSize
7.73 KB

This patch:

- solves the "minimum cache lifetime" bug explained in #12.

- requires no schema changes at all.

- touches no public-facing update.module APIs (it adds an optional parameter to update_invalidate_cache(), but I thought that was cleaner than a separate _update_cache_clear($cid) method).

- is guaranteed to work if you have memcache installed. ;)

- continues to use {cache_update}.expired as the time when the cached data should really expire, so my concerns in #19 are basically addressed.

- has characteristically thorough PHPDoc given update module's (aka my) very high standards. ;)

Reviews/testing urgently requested. Meanwhile, I'll pound on this some more and look closely at some related issues...

dww’s picture

Note to reviewers: the cache id we *really* care about preserving is 'update_info'. That's the (giant) array of all the parsed data from the XML we fetch from updates.d.o. It's sort of a lame, nondescript cache id, and I'd be happy to rename that to something like 'update_available_releases' or something, but I don't want to risk feature creep jeopardizing this patch from landing in D6.

The other cache ids in use ('update_project_data' and 'update_project_projects') are still worth caching, but much less important in terms of if they get cleared. 'update_project_projects' is just the list of all projects on the site, and some goodies from the .info files we care about. 'update_project_data' is basically a cached copy of the current status of the site -- which modules are up to date, which are out of date, etc. Both of these can at least be recomputed locally without touching the network (just a lot of touching the disk, especially if you use cvs_deploy -- that's why we cache them at all).

So, I'm totally fine with giving 'update_info' a much better name, but only if no one freaks out about 3 more hunks in the diff. ;) Either way, I wanted to clarify this part, since it came up in an IRC discussion with pwolanin and killes...

pwolanin’s picture

Looks good, though a little code comment about that being the expensive entry would help clarify why others are still being cleared regularly.

dww’s picture

Now with more comments about the various cache ids in use, and why some of them are only valid for an hour.

NancyDru’s picture

A little question based on a recent issue I had on a customer's site. Is saving that big array as one big chunk going to possibly present problems with "max_allowed_packet?" Their host had it set at 1M and we did end up with problems in Views/NodeQueue when we added the last term. I got them to bump that to 3M, but this big array could possibly exceed even that. And, no, they are not on 6.x yet; I'm just trying to think these things through.

@dww: I do see the "no information available" message even on enable/disable.

dww’s picture

@NancyDru:

In terms of the giant array as a single entry in the table causing potential problems with max_allowed_packet -- yes, that's a concern. That's a lot of what #238950: Meta: update.module RAM consumption is about. We could arguably split up the 'cache_info' array into separate cache records for each project's list of available updates. However, that'd require a bigger patch. We should almost certainly do so for D7, and should consider doing so for D6, but it's not nearly as urgent as what this patch is solving.

Re "no information available" -- that's expected, but that's different than what I'm talking about. That's because the submit handler for the modules page clears the cache of available updates ('update_info' from #24). Maybe it shouldn't do that, I'm not sure. But, that's unrelated to this issue and this patch.

dww’s picture

Here's a port to HEAD/D7. Untested (my laptop can no longer run HEAD since I haven't gotten around to recompiling PHP et al for PDO support).

Side note: I marked #359171-4: Avoid cases where update data is refreshed on admin pageloads and ensure update notifications are always sent as configured "won't fix" -- see comment #4 for details. So long as this patch lands, that isn't a problem (and in fact, we need to keep the existing behavior because of #209242: update.module reports bogus results when .info files are newer than the cached available update data)

Gerhard Killesreiter’s picture

Status: Needs review » Reviewed & tested by the community

I've tested the D6 patch on a site that uses memcached and it works as advertised in #12.

A rather simple and elegant patch to fix an important problem.

Gerhard Killesreiter’s picture

dww asked for more info:

I've tested my install with and without the patch. Without the patch, the cache is cleared after e.g. editing a vocabulary, with the patch it is not. I am using the memcache.inc replacement for cache,inc (as on drupal.org) and my memcached config is looking like this:

  'memcache_servers' => array( //'127.0.0.1:11211' => 'default',
                                '127.0.0.1:11211' => 'default',
                                '127.0.0.1:11311' => 'form',
                                '127.0.0.1:11411' => 'page',
                                '127.0.0.1:11511' => 'filter',
                                '127.0.0.1:11611' => 'menu',
                                '127.0.0.1:11711' => 'node',
                                '127.0.0.1:11811' => 'path',
                                '127.0.0.1:11911' => 'taxonomy',
                                ),
  'memcache_bins' => array(
                           // core
                           'cache' => 'default',
                           'cache_block' => 'default',
                           'cache_form' => 'default',
                           'cache_page' => 'default',
                           'cache_filter' => 'default',
                           'cache_menu' => 'default',
                           // from cck
                           'cache_content' => 'default',
                           // added by advcache patches
                           'cache_node' => 'default',
                           'cache_path' => 'default',
                           'cache_taxonomy' => 'default',
                           // views
                           'cache_views' => 'default',
                            ),

There is no special entry for cache_update, so it gets mapped to the default bin and thus cleared on cache clears. If I had an extra bin, it would not get cleared for normal cache_clear_alls, but I'd consider it unusual to make an extra bin for this.

dww’s picture

Status: Reviewed & tested by the community » Needs work

Almost RTBC. ;) pwolanin pointed out that hook_flush_caches() isn't going to do any good in the memcache case, since drupal_flush_all_caches() just turns around and calls cache_clear_all() on each table. So, we should probably just directly truncate the cache in that hook implementation, and not return anything. I guess. Seems a bit of a hack, but not sure what else to do in this case.

Also, the D7 patch should *probably* use DBTNG. I was hoping to just reopen #394590: DBTNG: Update module instead, but I guess I should just do it in here... ;)

dww’s picture

Status: Needs work » Needs review
FileSize
11.76 KB
11.89 KB

Improvements since the last round of patches:

A) Fixed update_flush_caches() to just directly call update_invalidate_cache() as per pwolanin's suggestion in IRC (see #32)

B) D7 version is now fully DBTNG.

C) I bit the bullet and ended up performing s/update_info/update_available_releases/ for that cid since 'update_info' is so lame and unspecific.

D) We're now a bit more careful about how we invalidate data during _update_refresh() to fix a long-standing bug in cvs_deploy: #235425: HEAD release always considered Not Supported during cron

This could use more review and testing, but I'm getting increasingly happy that this is RTBC, and will fix a whole host of inter-related problems.

sun’s picture

Code (especially comments) looks very good. Note, we should have tests for this functionality - NOT in this issue/patch - but focused in #253501: Tests for update.module.

cburschka’s picture

Status: Needs review » Needs work
+ * When we relied on the core cache API, there were all sorts of
+ * potential problems that would result in attempting to fetch available
+ * update data all the time, including if a site sets a so-called "minimum
+ * cache lifetime" (which is both a minimum and a maximum), or if a site uses
+ * memcache.
+ *
+ * We continue to use the {cache_update} table, but instead of using
+ * cache_set(), cache_get(), and cache_clear_all(), there are private helper
+ * functions that implement these same basic tasks but ensure that the cache
+ * is not prematurely cleared, and that the data is always stored in the
+ * database, even if memcache is in use.

This comment explains the issue really very well, and I'm glad it's so detailed. Unfortunately, it is written in a form resembling a changelog. So far as I can tell, inline comments are only used to explain what is being done and why - but not the differences to the previous versions, let alone in code that isn't even one major release old.

I think you can fix it by removing all references to "how it used to be" or "what has changed" and instead explaining why you don't use the caching API, period.

dww’s picture

Status: Needs work » Needs review

I respectfully disagree. We're explaining what goes wrong if you don't do it this way. That's important if anyone later tries to touch this code.

pwolanin’s picture

Status: Needs review » Needs work

how about something like jsut this in place of the above:

 * This module uses a standard cache table, {cache_update}, but instead of
 * cache_set(), cache_get(), and cache_clear_all(), there are private helper
 * functions that implement these same basic tasks but ensure that the cache
 * is not prematurely cleared, and that the data is always stored in the
 * database, even if memcache is in use.  This is because the cached data
 * is very expensive to fetch and must persist the full time specified.
pwolanin’s picture

Status: Needs work » Needs review

cross-post

sun’s picture

Re-rolled to address Arancaytar's docs issue.

One remaining issue left:

/**
 * Implementation of hook_flush_caches().
 *
 * Called from update.php (among others) to flush the caches.
 * Since we're running update.php, we are likely to install a new version of
 * something, in which case, we want to check for available update data again.
 * However, because we have our own caching system, we need to directly clear
 * the database table ourselves at this point and return nothing, for example,
 * on sites that use memcache where cache_clear_all() won't know how to purge
 * this data.
 */
function update_flush_caches() {
  update_invalidate_cache();
  return array();
}

I wonder whether we really want to flush this cache every time hook_flush_caches() is invoked. The comment speaks of update.php only (which is obvious), but if it's only update.php (I'd guess so), then it should use the same check as http://api.drupal.org/api/function/drupal_flush_all_caches/7 for _system_theme_data().

pwolanin’s picture

@sun - you can manually trigger this hook with the button on the admin/settings/performance page. I'm not aware of other uses in core.

Why do we need the same check? The check around themes seems to involve a race condition that will leave you theme-less.

cburschka’s picture

I respectfully disagree. We're explaining what goes wrong if you don't do it this way. That's important if anyone later tries to touch this code.

dww: I'm sorry to have been unclear. sun's new version of the comment is exactly what I had in mind, and I believe it meets your criteria as well:

+ * We specifically do NOT use the core cache API for saving the fetched data
+ * about available updates. It is vitally important that this cache is only
+ * cleared when we're populating it after successfully fetching new available
+ * update data. Usage of the core cache API results in all sorts of potential
+ * problems that would result in attempting to fetch available update data all
+ * the time, including if a site has a "minimum cache lifetime" (which is both
+ * a minimum and a maximum) defined, or if a site uses memcache.
+ *
+ * Update module still uses the {cache_update} table, but instead of using
+ * cache_set(), cache_get(), and cache_clear_all(), there are private helper
+ * functions that implement these same basic tasks but ensure that the cache
+ * is not prematurely cleared, and that the data is always stored in the
+ * database, even if memcache is in use.
+ */

It's all in the tense. ;)

pwolanin’s picture

Status: Needs review » Needs work

oh dear, things get very ugly when I enable a module in D7 - jsut some DBTNG issue I think:

# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).
# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).
# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).
# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).
# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).
# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).
# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).
# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).
# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).
# PDOException: DELETE FROM {cache_update} WHERE (cid = :db_condition_placeholder_906_modules, :db_condition_placeholder_906_submit, :db_condition_placeholder_906_#action, :db_condition_placeholder_906_#form_id, :db_condition_placeholder_906_#args, :db_condition_placeholder_906_#build_id, :db_condition_placeholder_906_#type, :db_condition_placeholder_906_form_build_id, :db_condition_placeholder_906_#token, :db_condition_placeholder_906_form_token, :db_condition_placeholder_906_form_id, :db_condition_placeholder_906_#id, :db_condition_placeholder_906_#description, :db_condition_placeholder_906_#title, :db_condition_placeholder_906_#attributes, :db_condition_placeholder_906_#required, :db_condition_placeholder_906_#method, :db_condition_placeholder_906_#theme_wrapper, :db_condition_placeholder_906_#tree, :db_condition_placeholder_906_#parents, :db_condition_placeholder_906_#submit, :db_condition_placeholder_906_#processed, :db_condition_placeholder_906_#defaults_loaded, 
pwolanin’s picture

$form['#submit'][] = 'update_invalidate_cache';

No...

pwolanin’s picture

Status: Needs work » Needs review
FileSize
12.44 KB

The $form array was getting passed in as the $cid. Bad news.

webchick’s picture

Status: Needs review » Needs work

I spent an hour looking over this patch and another hour reading the issue(s), at the request of dww. I'm not sure how helpful this will be, but I tried my best. :)

As a gut reaction, I *vastly* prefer the approach taken in #19. The idea of a completely separate caching system for this data is positively abhorrent to me, and I don't understand why, rather than being skeptical and jaded about the core caching API, we wouldn't simply write tests to determine if the CACHE_PERMANENT type works as advertised or not, and fix it if there are bugs. The answer for this unique separate cache table approach seems to be "memcache," but isn't that then something for memcache module to special-case? Why should anything in core need to care about one particular caching backend?

(Note that I will 110% own up to not being anything like a performance guru, and am happy to step completely aside with no hard feelings at all if Gábor and/or Dries think this is the right way to go. I should be talking with both of them tomorrow, so I will ask.)

So while I can appreciate the urgency with which this issue needs to be fixed in 6.x, right now I'm not very eager to commit this to 7.x, especially without tests, and especially given that the 7.x patch has obviously not been tested by anyone but Peter or he wouldn't have encountered a gazillion flaming errors when he applied it.

Then, in terms of a code (actually, mostly comment) review for #44:

+      // Set the projects array into the cache table. Since this is not the
+      // data we fetched about available updates, it is ok to invalidate it
+      // after an hour. We don't want this to persist too long, or if someone
+      // upgrades their version of a module, update status will be using stale
+      // data unless an admin visits one of the paths where this data is
+      // flushed (see update_project_cache()).

The same comment is used twice on top of two different variables. We're missing the nice, specific explanations that dww gave for what each of these variables stands for back in #24. Let's get those moved into each section. That last sentence could also use some work, as I had to read it about 3 times to understand what it was saying. Could it possibly be split into two?

+    update_invalidate_cache($cid);

If we're going to copy/paste function signatures from includes/cache.inc, might as well name this update_clear_cache(), no? That way it's clear what function you're replicating. (Oh, wait. Nevermind. That's an old function name.)

+  // during update_get_projects(), for example, to modules that implement
<code>

should be ...projects()<strong>;</strong> for example, ...

<code>
-    cache_set('update_info', $available, 'cache_update', REQUEST_TIME + (60 * 60 * 24 * $frequency));
+    _update_cache_set('update_available_releases', $available, REQUEST_TIME + (60 * 60 * 24 * $frequency));

+100 to s/update_info/update_available_releases/

+ * Note: this function completely ignores the {cache_update}.headers field
+ * since that is meaningless for the kinds of data we're caching.

Then should we not have an update function to drop the column?

Also, I know that the renaming of the table from cache_update to update_cache was removed in an earlier patch, but I kind of think that we should do that to make it clear that this is an "other" kind of cache. Views does the same thing with 'views_object_cache' so there's somewhat a precedence for this. That said, I know you're trying to keep the line count down on this schema/API change, so probably best to wait to see what Gábor has to say.

dww’s picture

Status: Needs work » Needs review
FileSize
14.54 KB
14.99 KB

@sun: Thanks for rerolling -- and yes, Arancaytar, that makes sense about the tense.

@pwolanin: Ugh, sorry about the form bug. :( That was an unintended side-effect of adding the optional argument to update_invalidate_cache(). Maybe we really should have just left that alone and added an _update_cache_clear() that takes the $cid... I think that's probably safest for the D6 backport, so that's what I'm going to do. And in D7, we might as well just call it _update_cache_clear() for consistency with the rest of the private cache functions, even if we get rid of update_invalidate_cache() entirely in D7.

@webchick: Thanks for the review. I know how you feel. Part of me would vastly rather use something like #19, and tell memcache to hack around it (although there's still the problem of not being able to use {cache_update}.expire for keeping track of when the data should expire *sigh*, so we'd have to do some kind of hack for that -- makes #19 less elegant than it appears).

However, this just reminds me of {cache_form} all over again (which caused a full day of grief and data loss after we upgraded d.o to D6). We call it a cache table, and we use the cache API, but it's *NOT* like our other caches. It *needs* to be non-volatile. Every D6 site running memcache has to stand on its head to do configuration hacks to special case this cache-table-that's-not-really-a-cache-table. It should be called {form_persistent_storage}, but since it's called {cache_form} and we use the cache API, (which is plug-able, it's not just memcache per se), every possible cache plugin needs to special case it. Is it better to always treat everything the same, even if they have totally different characteristics, or should we just embrace the fact that not all caches are created equally, and some just don't belong in the usual cache API?

Imagine this was the node access system. Should we put something weird into the node access system in core that's related to (but not really) node access, and then say "all node access modules should hack around this special case -- core wants to be internally consistent and consider everything node access"?

Of course, in the end, we should fix the (plug-able) cache API to support truly non-volatile caches. If/when that happens, we can revisit this. But, that's definitely not happening in D6, and until it happens in D7, we can't leave update.module this broken. Just because we commit these patches now doesn't mean we're going to ship 7.0 with exactly this code. But, we should try to keep HEAD and D6 as close as possible in terms of fixing bugs, and for now, this is the best solution we have that ensures we fix the problem.

And yes, obviously we need #253501: Tests for update.module. Long ago, I explained how to do it, and no one ran with it. I've been spread way too thin with 1000 other responsibilities, and it turns out I'm a solo maintainer for this subsystem after all (#447700: Update MAINTAINERS.txt to reflect reality for update.module). So, that hasn't happened yet. But, that's no reason not to fix things we know are broken. ;)

All that said, here are new patches for both HEAD and D6 that fix your "comment review" points from #45, and address the badness with update_invalidate_cache() from #42-43... The D6 patch also fixes a few other trivial comment-style bugs and typos that were already fixed in HEAD (I was diffing between HEAD and D6 to make sure I got everything in the backport that should be there).

I've tested heavily on D6. I still don't have a working HEAD test environment, but it *should* work now. ;)

p.s. While testing all this in D6, I found #448268: Cached update status data not cleared when saving theme admin page due to wrong form_id. I resisted the urge to roll in the fix for that here -- too many kittens have already been wounded in this issue.

pwolanin’s picture

is this function actually used anywhere, or this is just to preserve the API?

+function update_invalidate_cache() {
+  _update_cache_clear();
+}
webchick’s picture

Thanks, dww.

Ok, so the answer to the "why a completely separate cache table and functions" question isn't "memcache," it's more accurately "other caching backends, such as memcache." I see that this was clarified in the comments in the most recent re-roll (or else I was just tired last night and mis-read them). +1. Before, it sounded too much like "core is special casing a bunch of crap because a particular contrib module can't deal."

I also wasn't quite sure what you meant by:

"However, this just reminds me of {cache_form} all over again (which caused a full day of grief and data loss after we upgraded d.o to D6). We call it a cache table, and we use the cache API, but it's *NOT* like our other caches. It *needs* to be non-volatile."

But I searched the infrastructure queue for cache_form and found #376666: Release node created without title by "preview" which sounds like what you're talking about.

Incidentally, I'm aware of #253501: Tests for update.module; the tests I was talking about in this particular case (which wasn't very clear, sorry) were for tests for the core cache API which shouldn't need any fancy workarounds. But the way you explained it, this separate caching mechanism sounds like it is needed regardless if the core cache API is working 100% as advertised or not.

webchick’s picture

Hit Dries up in IRC about this patch, and he asked for a summary.

I was going to write something up, but then noticed that http://drupal.org/node/220592#comment-1520722 contains both a good overview of the patch's changes and a set of reproduction steps for the bug that we're fixing.

chx’s picture

While there is no doubt of the valors of this patch for Drupal 6, I am unhappy with it for Drupal 7. We can solve this properly there. So please commit to D6 only.

chx’s picture

Erm. OK, i see now: we do not want to use the cache api here at all because cache can go away any time. And while if your cache backend is flaky it only wrecks your site, in this case, it wrecks drupal.org and that sucks. So OK with D7 too.

dww’s picture

Re: Summary -- #12 doesn't really explain the heart of the bug, and that's not the implementation in #46, either. Here goes...

The problem

  • Our cache API assumes that things are always volatile, and can be cleared anytime and regenerated. The "expire" field is just a hint -- after that time, it'll *definitely* be cleared -- before then, it'll *probably* be cleared. This is especially the case in two configurations:
    1. You set the "minimum cache lifetime" setting.
    2. You use a different plugable cache backend, such as memcache.
  • Some things that core tries to use the cache API for should really be handled by the vaporware "time-based persistent storage API". The two most urgent candidates are {cache_update} (this bug) and {cache_form} (see #376666: Release node created without title by "preview" for the full story of data loss and pain when we upgraded d.o to D6).
  • If {cache_update} is managed by the volatile cache API, and you turn on "minimum cache lifetime", your site will connect to updates.drupal.org to fetch the XML release history data for all of your projects basically all the time. At minimum it'll happen at whatever your "min cache lifetime" setting is. But, just turning on that knob means that every cache_clear_all() will also do it (e.g. every new comment, page edit, saving a settings form, etc).
  • update.module tries to be extra paranoid about warning site admins if they're missing a security update. So, there's a hook_help() implementation that checks the cached data about the state of updates on the site, and if it's missing a security update, it prints an error message via drupal_set_message(). Due to #209242: update.module reports bogus results when .info files are newer than the cached available update data, if the cache is empty, we need to refresh that data, even in this hook_help() case. However, due to this bug, on some sites, that cache is effectively always empty, and update tries to refetch update data over the network on every single admin/* page load. This has resulted in a string of duplicate bug reports and forum posts about "Drupal is slow", "I can't access my admin section", etc.
  • While all this is bad enough from the end site admin's perspective, it's crippling the d.o infrastructure. :( The load on updates.drupal.org regularly hovers around 30, sometimes spiking higher. The OSL folks regularly got paged about the load going so high it'd normally be considered a crisis, but it's just this bug, and they had to change the monitoring system to stop notifying them about the load average on the web nodes. Fixing this bug has been one of the top priorities for the infrastructure team.

The solution

  • Since the "time-based persistent storage API" is vaporware, update.module is implementing its own.
  • The DB schema needs for this are basically identical to what cache API wants, and the {cache_update} table already exists (both in D6 and D7), so instead of introducing a schema change for this, we just re-use the existing {cache_update} table.
  • All the call sites in update module that used to call cache_(set|get|clear_all) now use private helper functions _update_cache_(set|get|clear) instead. These functions ensure two critical things:
    1. The data always lives persistently in the database, regardless of what cache backend you're using.
    2. The data in {cache_update} is never cleared unless {cache_update}.expire is reached, or the cache is explicitly invalidated.
  • We do want to clear these caches whenever hook_flush_caches() is being invoked (e.g. when update.php runs). However, we can no longer just return the table name, we need to directly clear our cache inside update_flush_caches() since this hook is "owned" by the plug-able cache back-end, so it won't necessarily know how to get rid of our data in the DB.

Other changes in this patch

  • Renamed the ambiguous and lame cache id that stores the data we fetch over the network from 'update_info' to 'update_available_releases' for better self-documenting clarity.
  • Improved lots of documentation comments about all of this, both for the private persistent cache API and in general about how/why/what update module is caching.
  • Fixed a related bug about when exactly we invalidate the cache of available releases that will allow me to fix a long-standing bug in cvs_deploy: #235425: HEAD release always considered Not Supported during cron -- technically, that could be a separate patch. However, it completely depends on the API changes I'm making here, and I decided it was best to just take care of it in one pass instead of trying to coordinate two inter-related (conflicting) patches.
  • The D6 backport fixes a few comment-style bugs and typos that were already fixed in HEAD but never backported.

p.s. @pwolanin #47: Yes, in D6, it's still used as a the submit callback for admin/build/(modules|themes). In HEAD, it's totally gone, and we have an equivalent function with a more specific name that you added: update_cache_clear_submit().

dww’s picture

Status: Needs review » Needs work

Tee hee, we seem to be clearing the cache on every cron run, for some reason. Investigating now...

pwolanin’s picture

here's why: http://api.drupal.org/api/function/system_cron/6

calls module_invoke_all('flush_caches') on every cron run.

dww’s picture

Status: Needs work » Needs review
FileSize
14.92 KB
15.37 KB

Cool, yet another critical bug fixed by this patch. ;) Basically unrelated, but system_cron() clears all caches. So, the "how often to check for updates" setting has been useless -- we always check after every cron run. Don't have time now for the CVS archeology to figure out when/how that happened. But, yet further proof that we shouldn't be using the regular cache API for this. ;)

New patch changes our hook_flush_caches() implementation to see if update.php is actually calling us (by testing MAINTENANCE_MODE) and only clearing the cache if we're in 'update' mode. Added tongue-in-a-cheek comment about 'update' mode. ;)

pwolanin’s picture

Status: Needs review » Reviewed & tested by the community

Applied and tested D6 version - no longer fetches on every cron run. Looks good.

webchick’s picture

Sounds like dww and pwolanin have shaken out the last of the bugs. Looks ready to go from my side, I checked with Gábor and he said it was ready to go from his, so committed to HEAD. Thanks a lot for all of the work on this patch! :D

Gábor Hojtsy’s picture

Status: Reviewed & tested by the community » Fixed

Committed to Drupal 6 too.

dww’s picture

YAY!!!!

Thanks, folks. The infrastructure team loves you.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

mcurry’s picture

subscribe

anea02’s picture

Version: 6.x-dev » 6.19
Assigned: dww » Unassigned
Priority: Critical » Normal
Status: Closed (fixed) » Active

This thread comes closest to the issue I have been experiencing over the last few days - sorry if I am wrong.

Basically, I was observing slow admin page load times (35-40 seconds). Site is running about 45 modules, btw. Searching through the Drupal forums turned up suggestions of trying to switch on MySQL query caching, adjusting site memory upwards via PHP.ini. I am sure early on I read that update.module may be causing the problem but that theory was undermined by someone else's comments and I moved on to trying to debug the issue. After a couple of days of trying to trace the issue with 'devel', 'trace', 'bench_charts', I finally settled on 'xdebug'. Information was still fairly thin until I turned on tracing so I could see function names. Page loads were getting bogged down with 'fsockopen' which was being called by an update function. I found a bit of code that allowed me to echo what was going on.

What I saw was that on every admin page load updates.drupal.org was being queried for module updates. Each of these network accesses were 'costing' about 250ms and my site was doing it 46 times per page load. (!)

After reading this page (higher up) I tried a couple of different cache setting under the admin/settings/performance but nothing seemed to alter this behaviour.

Disabling the update module removes the issue.

My question is this - am I seeing a bug or is this in fact 'by design'?. If it is by design would it be possible for a feature to be implemented whereby 'cron' does in fact collect update information, stores it and the admin page loads only look at that info (stored in DB, for example)? (this obviously needs more thought but I am sure you understand what I mean 'generally') Or perhaps a manual update page whereby I can check updates myself (as admin) instead of the constant checks I am seeing now? Naturally I think the update.module is essential and so would like to keep using it but at present I am forced to switch it off as 40 second page loads prevent me from working 'optimally' :-)

Another idea - is it a plausible suggestion that a web service at updates.drupal.org receives a XML file from Drupal installations listing ALL installed modules and their versions and returns a XML file with the results? This would reduce 'fsockopen' requests to 1 from 46 in my case.

Apologies if I am stating ideas that have already been implemented in D7 or, LOL, if there is a UI tick box I have failed to see that makes my issue go away :-) , D'oh!

Adam

dww’s picture

Version: 6.19 » 6.x-dev
Assigned: Unassigned » dww
Priority: Normal » Critical
Status: Active » Closed (fixed)

a) Yes, this was somewhat fixed in D7 by using the queue API for retrieving data about available updates.

b) The *original* design of the Update status module sent an XML-RPC request to drupal.org as you propose. That was painfully expensive on the d.o servers. So no, it's not a viable option to try to speed up your individual site by putting the computation load on the single d.o server. updates.d.o is just serving these XML files from cache now, in fact.

c) There are a bunch of other support requests open about this. Please don't reopen old fixed issues like this.

Thanks,
-Derek

yan’s picture

Status: Closed (fixed) » Active

That issue has been marked as a duplicate of this one:
#222073: Update Status causing Drupal to run very slowly

But it seems to me, that the patch that was applied here didn't really solve the problem. At least I am experiencing the same problem: With update status enabled, the admin pages are not usable. Not only are they extremely slow, they are also not built correctly sometimes. The admin menu doesn't show and sometimes the css isn't loaded correctly. I have a lot of modules enabled and I suppose it has to do with it. But update status shouldn't 'break' the whole admin area.

I'm re-opening this issue because the other one was marked as a duplicate. Please excuse if that's wrong.

rjacobs’s picture

Humm, when trying to make sense out of many issues previously linked to this one (update module fetching updates too frequently or failing to fetch updates), it's a bit confusing for this issue to be currently marked as "active" given that there has been little activity since it was initially closed. Since this was committed, #62 and #64 are the only comments in about 3 years (and it looks like #62 was addressed). Perhaps #64 is something of an isolated case which may not be related to this same issue? If not it would be great to get more info from yan to confirm.

rjacobs’s picture

Status: Active » Closed (fixed)

I'm going to take the liberty of changing the status back to "fixed". If yan can elaborate more on the specifics of their problems, perhaps this can be re-opened.

Please note that there is a related issue currently open at #359171: Avoid cases where update data is refreshed on admin pageloads and ensure update notifications are always sent as configured. Though I believe that issue does have some connections to this one, it appears to be much less critical, and I think its fix can be discussed independently of this one.