Core cache API breaks update.module: fetches data way too often, kills site performance, etc
NancyDru - February 12, 2008 - 01:10
| Project: | Drupal |
| Version: | 6.x-dev |
| Component: | update.module |
| Category: | bug report |
| Priority: | critical |
| Assigned: | dww |
| Status: | closed |
Description
I ran "check updates" and everything was fine. Then I ran Cron (15 seconds later), and I no longer had any update data. It is set to weekly checking, if that matters, but the data never lasts that long.

#1
Why am I losing this data just by running cron?
#2
Good question. I suggest you try to debug it and let us know what you find out. ;) Thanks.
#3
Am I the only one seeing this behavior?
#4
IIRC, you had a similar problem back with update_status 5.x, and no one else could reproduce it. Something seemed to be screwed up with the caching on your site, such that it never managed to save the cached data. But no, I've never seen this problem in all my testing of update.module in D6 core (or 5.x contrib for that matter).
Perhaps this is related to #222073: Update Status causing Drupal to run very slowly and/or #223925: Performance issues behind proxy but I have no time to look closely right now.
#5
I see the following cid's in cache_update both before and after running cron:
In the variable table I see:
None of them change value when cron runs.
Modules enabled:
The problem happens with only these modules enabled.
Is it possible this is just a problem with the status report?
#6
Running Cron from the status report, it reports that Cron ran successfully, but update_get_available() returns an empty array.
#7
Okay, I had a minimum cache lifetime set to 3 minutes. When I changed it to none, the problem went away.
BTW, should update_get_available() be running on every admin page?
#8
Ok, great. Glad to hear that the problem was in fact the cache lifetime on your site.
yes, update_get_available() should be running on every admin page. however, if your caching is working, all it does is get the cached data. it runs on every admin/* page so that if your site is missing a security update, there's a drupal_set_message() that your site is running known-insecure code on every page in admin/*. this is by design.
#9
Update sets its own expiration, so "minimum cache lifetime" should not apply. Besides, the text on that setting says "applies to both page and block" - Update is neither.
#10
See #319486-3: Fetched update informations at every cron and beyond for more. Basically, lots of things can break update.module's caching of available update data:
- site that set a [sic] "minimum" cache lifetime
- improperly configured memcache
- ...
AFAIK, this is the earliest reported instance of this bug (and there are a *lot* of duplicates in various shades) so, let's use this one for the final fix...
The current proposal (which I still haven't gotten confirmation from Dries and/or Gabor about the viability of being committed to D6) is to move {cache_update} out of the regular cache API into an update.module-specific table that just does its own non-volatile storage, without relying on the (highly fragile) core cache API. I think it's the safest solution for this in D6 for many reasons (mostly explained at #319486, especially comments #8 and #15)... There's no data migration problem, since we can just drop the current table and make the new one in the schema upgrade, and the actual code changes will probably be less than 50 lines total (with comments).
#11
Seems like the opposite problem to here, but just in case:
#227228: cache_clear_all and cache_get fail to clear caches when Minimum cache lifetime is on
#12
Over at #319486-16: Fetched update informations at every cron, Gabor said he didn't have time to look closely, but was in principle ok with this strategy assuming Dries doesn't veto it. At the risk of wasting a few hours of my life on a patch that's doomed, here's one for D6 that implements the proposal explained above.
I've verified that this patch solves the bug reported here, at least in terms of minimum cache lifetime. Here's how to reproduce the bug:
Without the patch, step #5 only works if #6 doesn't happen during #4. ;) Once you apply the patch, you should be able to take as long as you want at step #4, and repeat #6 as often as you like, and you should only see #7 if you actually click "Check manually" again, or if a day goes by and cron checks for updates again...
More testing and careful review would be most appreciated. This does *NOT* fix #359171: Never fetch update data at admin/*, only compare existing data yet -- that'll be a follow-up patch.
Also, as I explained at #319486, this is *NOT* a "fix it first in D7 and backport it" bug fix issue. We might radically change how the caching API in D7 works to solve this a different way. And, this bug is right now crippling sites (and putting a punishing load on the d.o infrastructure in the process). So, we're definitely need to fix this ASAP in D6 (preferably before the next core release) and then take our time figuring out how we want to fix this long term in D7.
Thanks,
-Derek
#13
@catch Re: #11: Thanks for the pointer. Indeed, that's the opposite problem. Further evidence of how utterly broken the "minimum cache lifetime" setting is, but that's not a solution to this bug. If you use the D6 backport patch from there (http://drupal.org/files/issues/cache-DRUPAL-6.patch) and try to repeat the steps in #12, you'll see the same broken behavior.
#14
(Whoops, http://drupal.org/files/issues/cache-flush-D6.patch is actually the most recent patch from #227228-27: cache_clear_all and cache_get fail to clear caches when Minimum cache lifetime is on but it still doesn't solve this).
#15
Wow, Derek, I had for gotten I even posted this. I will try to test this later today. Who, knows, maybe I can turn Update back on for some of the worst sites. I have some sites that I had to turn it off or I could never get into admin, even increasing my max execution time.
#16
@dww - need to dig in more - but kind of seems like this sort of data could go into the general {cache} table with CACHE_PERMANENT as the lifetime? The it should (I hope) only expire when you specifically purge it?
#17
@pwolanin: I'm skeptical. That might be true, but a) it'd require a LOT of heavy testing to ensure it's really true and b) I'm nervous that's still not going to save the day if memcache is involved. Given how much I've been burned by the core cache API in the recent past, I'm probably too jaded to come to a clear-headed conclusion, but I'd rather just use a completely separate table and a few helper functions and guarantee that no other external factor is going to break this code and purge the data at the wrong time.
#18
+1 for a separate table. When I added caching to Taxonomy Image, I did some timing tests. Using the general table (cache), the queries were really slow, even on a lightly loaded system. The only way to get my pay-off was to create a separate table; then it screamed.
#19
There's no need to merge into the shared {cache} table. We've already got a separate table, we might as well keep it. In fact, maybe all the schema updates from #12 are stupid (just to rename the table from {cache_update} to {update_cache}). Even if we stop using the core cache API, we could probably just keep the table exactly as it is, and just start using our own private helper functions so we're not calling cache_(get|set) anymore.
That said, it's possible that something like the attached patch is all we need. I haven't tested with memcache at all, so I have no idea if that completely honors CACHE_PERMANENT like it should. However, this does solve the minimum cache lifetime woes from #12. I'm still a little skeptical, but if this really solves the problem, it's obviously a lot less of a change in D6 (and we might as well just commit it to HEAD, too, even if we revisit the cache API to fix some of the brokenness there).
There are a few other places this patch should probably touch. It currently never clears the cache of projects and the current status of the site (not the fetched available release data -- that's a separate story) unless we're on certain pages, we manually check for updates, or via cron. Previously, those were cached for at most an hour, now we're potentially caching much longer. Since we're using CACHE_PERMANENT, we don't have an easy way to keep track of when these should expire. We'd either have to stash those as variables, or put them in {cache_update} itself in another record or something...
Anyway, this needs more thought, testing, and I should look more closely at the rest of the code (especially when it's not past my bed time), but this might be a much smaller change that solves the problem, which should therefore be much easier to commit to D6.
#20
Well, the cache gets cleared when a module is enabled or disabled. In between those events, I don't really see any major reason for it to expire.
#21
@dww - the default param for the cache_set() is
$expire = CACHE_PERMANENT, so you could just omit that param.#22
@NancyDru: The cache of info about your current status (are you out of date or not?) should be cleared when you upgrade the version you're currently running. That's not the same as enable/disable. Unless you happen to visit the right pages, update.module would continue to think you're missing an update. In the case of a missing security release, you're still going to get that red warning on every admin/* page load until you happen to visit the update report itself, which will clear the cache, and then say there's no problem.
@pwolanin: Duly noted, thanks. I'm tempted to leave it as explicitly setting CACHE_PERMANENT, though, from the perspective of defensive programming. If we go this route, it's vitally important that these are CACHE_PERMANENT, and I think it's not unreasonable to spell that out explicitly in the code. We could also add a "// This defaults to CACHE_PERMANENT, which we desperately need..." comment above each call site, but I'm not sure that's any better.
@all: In discussion with killes in IRC, it seems like there's really no good solution to the memcache problem without either patching the drupal memcache glue itself (which is complicated in many ways) or just avoiding the core cache API as I originally propose.
However, I did have a nice idea to simplify #12 a lot. There's really nothing to gain from renaming {cache_update} to {update_cache}, which is where the DB update and the bulk of the code in #12 comes from. We can continue to use the identical {cache_update} table, we just need to stop using cache_(get|set) and use _update_cache_(get|set) instead. Then, we can still use {cache_update}.expires to keep track of when the data should be purged, but ensure that we're the only ones actually purging that data when "minimum cache lifetime" is set, and avoid having this data live in memcache at all. I think that's the winner. I'm going to re-roll now. Stay tuned.
#23
This patch:
- solves the "minimum cache lifetime" bug explained in #12.
- requires no schema changes at all.
- touches no public-facing update.module APIs (it adds an optional parameter to update_invalidate_cache(), but I thought that was cleaner than a separate _update_cache_clear($cid) method).
- is guaranteed to work if you have memcache installed. ;)
- continues to use {cache_update}.expired as the time when the cached data should really expire, so my concerns in #19 are basically addressed.
- has characteristically thorough PHPDoc given update module's (aka my) very high standards. ;)
Reviews/testing urgently requested. Meanwhile, I'll pound on this some more and look closely at some related issues...
#24
Note to reviewers: the cache id we *really* care about preserving is 'update_info'. That's the (giant) array of all the parsed data from the XML we fetch from updates.d.o. It's sort of a lame, nondescript cache id, and I'd be happy to rename that to something like 'update_available_releases' or something, but I don't want to risk feature creep jeopardizing this patch from landing in D6.
The other cache ids in use ('update_project_data' and 'update_project_projects') are still worth caching, but much less important in terms of if they get cleared. 'update_project_projects' is just the list of all projects on the site, and some goodies from the .info files we care about. 'update_project_data' is basically a cached copy of the current status of the site -- which modules are up to date, which are out of date, etc. Both of these can at least be recomputed locally without touching the network (just a lot of touching the disk, especially if you use cvs_deploy -- that's why we cache them at all).
So, I'm totally fine with giving 'update_info' a much better name, but only if no one freaks out about 3 more hunks in the diff. ;) Either way, I wanted to clarify this part, since it came up in an IRC discussion with pwolanin and killes...
#25
Looks good, though a little code comment about that being the expensive entry would help clarify why others are still being cleared regularly.
#26
Now with more comments about the various cache ids in use, and why some of them are only valid for an hour.
#27
A little question based on a recent issue I had on a customer's site. Is saving that big array as one big chunk going to possibly present problems with "max_allowed_packet?" Their host had it set at 1M and we did end up with problems in Views/NodeQueue when we added the last term. I got them to bump that to 3M, but this big array could possibly exceed even that. And, no, they are not on 6.x yet; I'm just trying to think these things through.
@dww: I do see the "no information available" message even on enable/disable.
#28
@NancyDru:
In terms of the giant array as a single entry in the table causing potential problems with max_allowed_packet -- yes, that's a concern. That's a lot of what #238950: Meta: Reduce RAM resource consumption is about. We could arguably split up the 'cache_info' array into separate cache records for each project's list of available updates. However, that'd require a bigger patch. We should almost certainly do so for D7, and should consider doing so for D6, but it's not nearly as urgent as what this patch is solving.
Re "no information available" -- that's expected, but that's different than what I'm talking about. That's because the submit handler for the modules page clears the cache of available updates ('update_info' from #24). Maybe it shouldn't do that, I'm not sure. But, that's unrelated to this issue and this patch.
#29
Here's a port to HEAD/D7. Untested (my laptop can no longer run HEAD since I haven't gotten around to recompiling PHP et al for PDO support).
Side note: I marked #359171-4: Never fetch update data at admin/*, only compare existing data "won't fix" -- see comment #4 for details. So long as this patch lands, that isn't a problem (and in fact, we need to keep the existing behavior because of #209242: update.module reports bogus results when .info files are newer than the cached available update data)
#30
I've tested the D6 patch on a site that uses memcached and it works as advertised in #12.
A rather simple and elegant patch to fix an important problem.
#31
dww asked for more info:
I've tested my install with and without the patch. Without the patch, the cache is cleared after e.g. editing a vocabulary, with the patch it is not. I am using the memcache.inc replacement for cache,inc (as on drupal.org) and my memcached config is looking like this:
'memcache_servers' => array( //'127.0.0.1:11211' => 'default','127.0.0.1:11211' => 'default',
'127.0.0.1:11311' => 'form',
'127.0.0.1:11411' => 'page',
'127.0.0.1:11511' => 'filter',
'127.0.0.1:11611' => 'menu',
'127.0.0.1:11711' => 'node',
'127.0.0.1:11811' => 'path',
'127.0.0.1:11911' => 'taxonomy',
),
'memcache_bins' => array(
// core
'cache' => 'default',
'cache_block' => 'default',
'cache_form' => 'default',
'cache_page' => 'default',
'cache_filter' => 'default',
'cache_menu' => 'default',
// from cck
'cache_content' => 'default',
// added by advcache patches
'cache_node' => 'default',
'cache_path' => 'default',
'cache_taxonomy' => 'default',
// views
'cache_views' => 'default',
),
There is no special entry for cache_update, so it gets mapped to the default bin and thus cleared on cache clears. If I had an extra bin, it would not get cleared for normal cache_clear_alls, but I'd consider it unusual to make an extra bin for this.
#32
Almost RTBC. ;) pwolanin pointed out that hook_flush_caches() isn't going to do any good in the memcache case, since drupal_flush_all_caches() just turns around and calls cache_clear_all() on each table. So, we should probably just directly truncate the cache in that hook implementation, and not return anything. I guess. Seems a bit of a hack, but not sure what else to do in this case.
Also, the D7 patch should *probably* use DBTNG. I was hoping to just reopen #394590: DBTNG: Update module instead, but I guess I should just do it in here... ;)
#33
Improvements since the last round of patches:
A) Fixed update_flush_caches() to just directly call update_invalidate_cache() as per pwolanin's suggestion in IRC (see #32)
B) D7 version is now fully DBTNG.
C) I bit the bullet and ended up performing s/update_info/update_available_releases/ for that cid since 'update_info' is so lame and unspecific.
D) We're now a bit more careful about how we invalidate data during _update_refresh() to fix a long-standing bug in cvs_deploy: #235425: HEAD release always considered Not Supported during cron
This could use more review and testing, but I'm getting increasingly happy that this is RTBC, and will fix a whole host of inter-related problems.
#34
Code (especially comments) looks very good. Note, we should have tests for this functionality - NOT in this issue/patch - but focused in #253501: Tests for update.module.
#35
+ * When we relied on the core cache API, there were all sorts of+ * potential problems that would result in attempting to fetch available
+ * update data all the time, including if a site sets a so-called "minimum
+ * cache lifetime" (which is both a minimum and a maximum), or if a site uses
+ * memcache.
+ *
+ * We continue to use the {cache_update} table, but instead of using
+ * cache_set(), cache_get(), and cache_clear_all(), there are private helper
+ * functions that implement these same basic tasks but ensure that the cache
+ * is not prematurely cleared, and that the data is always stored in the
+ * database, even if memcache is in use.
This comment explains the issue really very well, and I'm glad it's so detailed. Unfortunately, it is written in a form resembling a changelog. So far as I can tell, inline comments are only used to explain what is being done and why - but not the differences to the previous versions, let alone in code that isn't even one major release old.
I think you can fix it by removing all references to "how it used to be" or "what has changed" and instead explaining why you don't use the caching API, period.
#36
I respectfully disagree. We're explaining what goes wrong if you don't do it this way. That's important if anyone later tries to touch this code.
#37
how about something like jsut this in place of the above:
* This module uses a standard cache table, {cache_update}, but instead of* cache_set(), cache_get(), and cache_clear_all(), there are private helper
* functions that implement these same basic tasks but ensure that the cache
* is not prematurely cleared, and that the data is always stored in the
* database, even if memcache is in use. This is because the cached data
* is very expensive to fetch and must persist the full time specified.
#38
cross-post
#39
Re-rolled to address Arancaytar's docs issue.
One remaining issue left:
<?php/**
* Implementation of hook_flush_caches().
*
* Called from update.php (among others) to flush the caches.
* Since we're running update.php, we are likely to install a new version of
* something, in which case, we want to check for available update data again.
* However, because we have our own caching system, we need to directly clear
* the database table ourselves at this point and return nothing, for example,
* on sites that use memcache where cache_clear_all() won't know how to purge
* this data.
*/
function update_flush_caches() {
update_invalidate_cache();
return array();
}
?>
I wonder whether we really want to flush this cache every time hook_flush_caches() is invoked. The comment speaks of update.php only (which is obvious), but if it's only update.php (I'd guess so), then it should use the same check as http://api.drupal.org/api/function/drupal_flush_all_caches/7 for _system_theme_data().
#40
@sun - you can manually trigger this hook with the button on the admin/settings/performance page. I'm not aware of other uses in core.
Why do we need the same check? The check around themes seems to involve a race condition that will leave you theme-less.
#41
dww: I'm sorry to have been unclear. sun's new version of the comment is exactly what I had in mind, and I believe it meets your criteria as well:
+ * We specifically do NOT use the core cache API for saving the fetched data+ * about available updates. It is vitally important that this cache is only
+ * cleared when we're populating it after successfully fetching new available
+ * update data. Usage of the core cache API results in all sorts of potential
+ * problems that would result in attempting to fetch available update data all
+ * the time, including if a site has a "minimum cache lifetime" (which is both
+ * a minimum and a maximum) defined, or if a site uses memcache.
+ *
+ * Update module still uses the {cache_update} table, but instead of using
+ * cache_set(), cache_get(), and cache_clear_all(), there are private helper
+ * functions that implement these same basic tasks but ensure that the cache
+ * is not prematurely cleared, and that the data is always stored in the
+ * database, even if memcache is in use.
+ */
It's all in the tense. ;)
#42
oh dear, things get very ugly when I enable a module in D7 - jsut some DBTNG issue I think:
# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).
# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).
# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).
# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).
# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).
# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).
# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).
# Notice: Array to string conversion in DatabaseStatementBase->execute() (line 1662 of /Users/Shared/www/drupal-7/includes/database/database.inc).
# PDOException: DELETE FROM {cache_update} WHERE (cid = :db_condition_placeholder_906_modules, :db_condition_placeholder_906_submit, :db_condition_placeholder_906_#action, :db_condition_placeholder_906_#form_id, :db_condition_placeholder_906_#args, :db_condition_placeholder_906_#build_id, :db_condition_placeholder_906_#type, :db_condition_placeholder_906_form_build_id, :db_condition_placeholder_906_#token, :db_condition_placeholder_906_form_token, :db_condition_placeholder_906_form_id, :db_condition_placeholder_906_#id, :db_condition_placeholder_906_#description, :db_condition_placeholder_906_#title, :db_condition_placeholder_906_#attributes, :db_condition_placeholder_906_#required, :db_condition_placeholder_906_#method, :db_condition_placeholder_906_#theme_wrapper, :db_condition_placeholder_906_#tree, :db_condition_placeholder_906_#parents, :db_condition_placeholder_906_#submit, :db_condition_placeholder_906_#processed, :db_condition_placeholder_906_#defaults_loaded,
#43
$form['#submit'][] = 'update_invalidate_cache';No...
#44
The $form array was getting passed in as the $cid. Bad news.
#45
I spent an hour looking over this patch and another hour reading the issue(s), at the request of dww. I'm not sure how helpful this will be, but I tried my best. :)
As a gut reaction, I *vastly* prefer the approach taken in #19. The idea of a completely separate caching system for this data is positively abhorrent to me, and I don't understand why, rather than being skeptical and jaded about the core caching API, we wouldn't simply write tests to determine if the CACHE_PERMANENT type works as advertised or not, and fix it if there are bugs. The answer for this unique separate cache table approach seems to be "memcache," but isn't that then something for memcache module to special-case? Why should anything in core need to care about one particular caching backend?
(Note that I will 110% own up to not being anything like a performance guru, and am happy to step completely aside with no hard feelings at all if Gábor and/or Dries think this is the right way to go. I should be talking with both of them tomorrow, so I will ask.)
So while I can appreciate the urgency with which this issue needs to be fixed in 6.x, right now I'm not very eager to commit this to 7.x, especially without tests, and especially given that the 7.x patch has obviously not been tested by anyone but Peter or he wouldn't have encountered a gazillion flaming errors when he applied it.
Then, in terms of a code (actually, mostly comment) review for #44:
+ // Set the projects array into the cache table. Since this is not the+ // data we fetched about available updates, it is ok to invalidate it
+ // after an hour. We don't want this to persist too long, or if someone
+ // upgrades their version of a module, update status will be using stale
+ // data unless an admin visits one of the paths where this data is
+ // flushed (see update_project_cache()).
The same comment is used twice on top of two different variables. We're missing the nice, specific explanations that dww gave for what each of these variables stands for back in #24. Let's get those moved into each section. That last sentence could also use some work, as I had to read it about 3 times to understand what it was saying. Could it possibly be split into two?
+ update_invalidate_cache($cid);If we're going to copy/paste function signatures from includes/cache.inc, might as well name this update_clear_cache(), no? That way it's clear what function you're replicating. (Oh, wait. Nevermind. That's an old function name.)
+ // during update_get_projects(), for example, to modules that implement
<code>
should be ...projects()<strong>;</strong> for example, ...
<code>
- cache_set('update_info', $available, 'cache_update', REQUEST_TIME + (60 * 60 * 24 * $frequency));
+ _update_cache_set('update_available_releases', $available, REQUEST_TIME + (60 * 60 * 24 * $frequency));
+100 to s/update_info/update_available_releases/
+ * Note: this function completely ignores the {cache_update}.headers field+ * since that is meaningless for the kinds of data we're caching.
Then should we not have an update function to drop the column?
Also, I know that the renaming of the table from cache_update to update_cache was removed in an earlier patch, but I kind of think that we should do that to make it clear that this is an "other" kind of cache. Views does the same thing with 'views_object_cache' so there's somewhat a precedence for this. That said, I know you're trying to keep the line count down on this schema/API change, so probably best to wait to see what Gábor has to say.
#46
@sun: Thanks for rerolling -- and yes, Arancaytar, that makes sense about the tense.
@pwolanin: Ugh, sorry about the form bug. :( That was an unintended side-effect of adding the optional argument to update_invalidate_cache(). Maybe we really should have just left that alone and added an _update_cache_clear() that takes the $cid... I think that's probably safest for the D6 backport, so that's what I'm going to do. And in D7, we might as well just call it _update_cache_clear() for consistency with the rest of the private cache functions, even if we get rid of update_invalidate_cache() entirely in D7.
@webchick: Thanks for the review. I know how you feel. Part of me would vastly rather use something like #19, and tell memcache to hack around it (although there's still the problem of not being able to use {cache_update}.expire for keeping track of when the data should expire *sigh*, so we'd have to do some kind of hack for that -- makes #19 less elegant than it appears).
However, this just reminds me of {cache_form} all over again (which caused a full day of grief and data loss after we upgraded d.o to D6). We call it a cache table, and we use the cache API, but it's *NOT* like our other caches. It *needs* to be non-volatile. Every D6 site running memcache has to stand on its head to do configuration hacks to special case this cache-table-that's-not-really-a-cache-table. It should be called {form_persistent_storage}, but since it's called {cache_form} and we use the cache API, (which is plug-able, it's not just memcache per se), every possible cache plugin needs to special case it. Is it better to always treat everything the same, even if they have totally different characteristics, or should we just embrace the fact that not all caches are created equally, and some just don't belong in the usual cache API?
Imagine this was the node access system. Should we put something weird into the node access system in core that's related to (but not really) node access, and then say "all node access modules should hack around this special case -- core wants to be internally consistent and consider everything node access"?
Of course, in the end, we should fix the (plug-able) cache API to support truly non-volatile caches. If/when that happens, we can revisit this. But, that's definitely not happening in D6, and until it happens in D7, we can't leave update.module this broken. Just because we commit these patches now doesn't mean we're going to ship 7.0 with exactly this code. But, we should try to keep HEAD and D6 as close as possible in terms of fixing bugs, and for now, this is the best solution we have that ensures we fix the problem.
And yes, obviously we need #253501: Tests for update.module. Long ago, I explained how to do it, and no one ran with it. I've been spread way too thin with 1000 other responsibilities, and it turns out I'm a solo maintainer for this subsystem after all (#447700: Update MAINTAINERS.txt to reflect reality for update.module). So, that hasn't happened yet. But, that's no reason not to fix things we know are broken. ;)
All that said, here are new patches for both HEAD and D6 that fix your "comment review" points from #45, and address the badness with update_invalidate_cache() from #42-43... The D6 patch also fixes a few other trivial comment-style bugs and typos that were already fixed in HEAD (I was diffing between HEAD and D6 to make sure I got everything in the backport that should be there).
I've tested heavily on D6. I still don't have a working HEAD test environment, but it *should* work now. ;)
p.s. While testing all this in D6, I found #448268: Cached update status data not cleared when saving theme admin page due to wrong form_id. I resisted the urge to roll in the fix for that here -- too many kittens have already been wounded in this issue.
#47
is this function actually used anywhere, or this is just to preserve the API?
+function update_invalidate_cache() {+ _update_cache_clear();
+}
#48
Thanks, dww.
Ok, so the answer to the "why a completely separate cache table and functions" question isn't "memcache," it's more accurately "other caching backends, such as memcache." I see that this was clarified in the comments in the most recent re-roll (or else I was just tired last night and mis-read them). +1. Before, it sounded too much like "core is special casing a bunch of crap because a particular contrib module can't deal."
I also wasn't quite sure what you meant by:
"However, this just reminds me of {cache_form} all over again (which caused a full day of grief and data loss after we upgraded d.o to D6). We call it a cache table, and we use the cache API, but it's *NOT* like our other caches. It *needs* to be non-volatile."
But I searched the infrastructure queue for cache_form and found #376666: Release node created without title by "preview" which sounds like what you're talking about.
Incidentally, I'm aware of #253501: Tests for update.module; the tests I was talking about in this particular case (which wasn't very clear, sorry) were for tests for the core cache API which shouldn't need any fancy workarounds. But the way you explained it, this separate caching mechanism sounds like it is needed regardless if the core cache API is working 100% as advertised or not.
#49
Hit Dries up in IRC about this patch, and he asked for a summary.
I was going to write something up, but then noticed that http://drupal.org/node/220592#comment-1520722 contains both a good overview of the patch's changes and a set of reproduction steps for the bug that we're fixing.
#50
While there is no doubt of the valors of this patch for Drupal 6, I am unhappy with it for Drupal 7. We can solve this properly there. So please commit to D6 only.
#51
Erm. OK, i see now: we do not want to use the cache api here at all because cache can go away any time. And while if your cache backend is flaky it only wrecks your site, in this case, it wrecks drupal.org and that sucks. So OK with D7 too.
#52
Re: Summary -- #12 doesn't really explain the heart of the bug, and that's not the implementation in #46, either. Here goes...
The problem
The solution
Other changes in this patch
p.s. @pwolanin #47: Yes, in D6, it's still used as a the submit callback for admin/build/(modules|themes). In HEAD, it's totally gone, and we have an equivalent function with a more specific name that you added:
update_cache_clear_submit().#53
Tee hee, we seem to be clearing the cache on every cron run, for some reason. Investigating now...
#54
here's why: http://api.drupal.org/api/function/system_cron/6
calls
module_invoke_all('flush_caches')on every cron run.#55
Cool, yet another critical bug fixed by this patch. ;) Basically unrelated, but system_cron() clears all caches. So, the "how often to check for updates" setting has been useless -- we always check after every cron run. Don't have time now for the CVS archeology to figure out when/how that happened. But, yet further proof that we shouldn't be using the regular cache API for this. ;)
New patch changes our hook_flush_caches() implementation to see if update.php is actually calling us (by testing MAINTENANCE_MODE) and only clearing the cache if we're in 'update' mode. Added tongue-in-a-cheek comment about 'update' mode. ;)
#56
Applied and tested D6 version - no longer fetches on every cron run. Looks good.
#57
Sounds like dww and pwolanin have shaken out the last of the bugs. Looks ready to go from my side, I checked with Gábor and he said it was ready to go from his, so committed to HEAD. Thanks a lot for all of the work on this patch! :D
#58
Committed to Drupal 6 too.
#59
YAY!!!!
Thanks, folks. The infrastructure team loves you.
#60
Automatically closed -- issue fixed for 2 weeks with no activity.