Hey all!
I've just discovered a (potentially serious) bug in 4.7.2.
Basically, if you change the status of a node via admin/node, it tries to save time by executing queries directly against the database.
However, cached pages aren't cleared during this operation, even if they really should be.
So anonymous users could potentially get a stale front page for... a long time.
And even better (worse?) if an admin demotes/unpublishes a node because he really doesn't want it to be visible, it'll still be visible for anonymous users until the cache is cleared.
Anybody want to tackle this? It could be a nightmare for a newbie admin in the wrong situation.
Comment | File | Size | Author |
---|---|---|---|
#10 | cache_vs_admin_node__HEAD_10.txt | 323 bytes | pwolanin |
#9 | cache_vs_admin_node_9.txt | 297 bytes | pwolanin |
#1 | cache_vs_admin_node_1.0.patch.txt | 1.69 KB | alexmarkley |
Comments
Comment #1
alexmarkley CreditAttribution: alexmarkley commentedI've taken a stab at patching this issue, mostly for academic and/or illustrative purposes.
This patch makes the assumption that using established mechanisms to perform tasks is better than making up new mechanisms. Thus, I am doing a node_load and node_save on every nid in the list.
Granted, this is slower than the previous method, but it addresses my concern and stands up to some pretty thorough testing.
I'm not suggesting this is a finished patch because there are a number of loose ends which should be addressed by somebody higher up in the food chain than myself.
obviously some of those concerns are more relevant than others.
I'm just trying to do my part, so do with it what you will. :)
ttyl
--Alex Markley
Comment #2
beginner CreditAttribution: beginner commentedI am not sure it is that critical.
Anyway, bugs ought to be fixed in head first: you don't want the bug to reapear during the next release.
In head, I believe there is no moderate bit, anymore.
See this for the patch format:
http://drupal.org/patch
I'll test a patch for cvs.
Comment #3
dwwsorry, i disagree w/ beginner about HEAD vs. 4.7.2. if it's a bad bug in 4.7.2, it should be fixed there. maybe it's not even a bug anymore in HEAD due to other changes. doesn't matter, it still needs to be fixed in 4.7.x. 4.7.3 is definitely going to be released before 4.8.0/5.0.0, so it's more important to get it fixed in 4.7.x first, IMHO.
the issue sounds scary and wrong, though i haven't tried to reproduce it or understand the relevent code. knowing when to invalidate the cache is a tricky question, since you can't invalidate it too frequently, or it becomes useless as a cache. however, in the kinds of cases you're describing, you need to invalidate, or you get incorrect behavior. correctness is always more important than efficiency, if you ask me (i.e. you must be correct first, and if you're correct, then you can worry about being fast).
"This patch makes the assumption that using established mechanisms to perform tasks is better than making up new mechanisms."
very often true. however, in rare cases, you sometimes need to use a new mechanism to do something specific that a more general mechanism could also do. this is the nature of optimization -- some things (e.g. things you do over and over in a tight loop) need to be more efficient than the general case...
"Thus, I am doing a node_load and node_save on every nid in the list."
that sounds like a potentially huge performance hit. if this stuff was being optimized in the first place, then it's possible to keep correctness without incurring such a big cost by simply adding 1 more query to invalidate the appropriate records in the cache table.
that said, i haven't looked *at all* at the patch or the surrounding code, so i won't comment on the specifics. ;) mostly, i just wanted to bump this back to 4.7.2...
also, to prevent this thread from spiraling into too many disconnected things, i'd highly recommend submitting a new issue (as a "feature" for "cvs") about the improvements to the UI on admin/node you'd like to see. let's keep *this* issue focused on the correctness bug (which i'm inclined to leave as critical), and then we can worry about the new features for the future in another thread, focused on a the appropriate branch of the source, and with less urgency...
thanks,
-derek
Comment #4
pwolanin CreditAttribution: pwolanin commentedI assume it's really expensive to clear the entire cache?
Looking at the way cache_set stores node info, it doesnt' seem that you can be sure that you properly reset the cache unless you know all possible alises for a node- is that right?
includes/common.inc:1403: cache_set($base_root . request_uri(), $data, CACHE_TEMPORARY, drupal_get_headers());
Comment #5
dwwI assume it's really expensive to clear the entire cache?
of course. ;) it's not the operation of clearing it itself, it's the fact that *every* page viewed by anonymous visitors will then have to be fully re-built and re-cached...
Comment #6
alexmarkley CreditAttribution: alexmarkley commentedhey... Just to clarify, I'm not proposing clearing the entire cache. This patch simply uses the appropriate channels to notify modules that content has been updated. Thus, the appropriate cached items can be cleared.
The fact is that an administrator who promotes/demotes something has a right to expect that action to be reflected to anonymous uses immediately, unless he has explicitly set a minimum cache item lifetime.
Does that make sense?
Additionally, the slowdowns associated with using node_load and node_save are not expensive for anybody but the admin. In terms of page load time, this is actually pretty fast. The only extra slowdowns are associated with rebuilding the appropriate cache items, which really has to be done anyway.
I hope this clarifies my position.
--Alex Markley
Comment #7
alexmarkley CreditAttribution: alexmarkley commentedTo clarify again: nothing about this patch results in a fully cleared cache.
I haven't investigated how things work on the back-end, but the "correct" way to do it is to only clear cache items that may be affected by the change that has been made. (For example, if the item is being promoted/demoted...)
Clearing cache items is not an insane slowdown unless it happens too often. On the other hand, if it doesn't happen when it is supposed to, we get undefined/unexpected/unwanted behavior.
ttyl
--Alex Markley
Comment #8
pwolanin CreditAttribution: pwolanin commentedWait a minute, take a look at the actual code for node_save:
http://api.drupal.org/api/4.7/function/node_save
The end of the function is:
So every call to node_save invokes cache_clear_all()!
Would it not be much simpler to keep all the exisintg code and add a single call to cache_clear_all()?
Comment #9
pwolanin CreditAttribution: pwolanin commentedOk, patch attached which adds a signle call to cache_clear_all()
Tested with drupal 4.7.2 and seems to solve the bug for me.
Comment #10
pwolanin CreditAttribution: pwolanin commentedpatch for HEAD
Comment #11
alexmarkley CreditAttribution: alexmarkley commentedOoh, you're right. (!)
This strikes me as a serious design flaw.
The implications of this are actually staggering, since practically any user can generate a node_save. (For example, if a site has a forum enabled.)
If a node_save operation clears the entire cache, wouldn't that mean that a user could DoS a really large site by simply creating a forum node and programmatically altering it every few seconds?
Now that I understand more about how the cache system "works", I've uncovered several more stale cache bugs. What's really bad is that some of these stale cache entries might never go away if somebody doesn't manage to trigger a cache_clear_all().
The module I've been designing to export node-related information plays nice with the system by watching node updates via hook_nodeapi and clearing appropriate cache entries when it thinks it should. (For example, it only clears cache entries for pages it generates, and only when a node that it handles comes through.)
However, implementing such intelligence would require every single module that sets a cache item to watch for an update to trigger a delete for its own cache entries.
Is this even practical? I'm in completely over my head.
ttyl
--Alex Markley
Comment #12
alexmarkley CreditAttribution: alexmarkley commentedConsidering the current state of the Drupal caching system, pwolanin's patches seem like the right thing to do. They prevent the original problem that I was reporting, while also avoiding the odd behavior and slight slowdowns that my patch caused.
I have this pwolanin's patch installed on my site, and it's working great! +1; thanks, pwolanin. :)
ttyl
--Alex Markley
Comment #13
pwolanin CreditAttribution: pwolanin commentedBeing bold I've moved the status to RTBC ;-)
Quite frankly, I'm not sure how to avoid clearing the cache on node_save. Imagine, if you will, that content from a single node appears on every page in a site. For example, if the "new forum topics" block is enabled and placed in the sidebar on every page. If the a forum node title is changed, or a new topic created, every page on the site is stale. Similarly, every page could be a "view" using the views module and a particular node could appear on most or all pages.
I'm not sure it's possible within the Drupal framework to predict or determine which pages the content from a particular node will appear on.
Since no content is cached for authenticated users, and many forum topics, etc. are created on Drupal.org perhaps the maintaners can add their input on what role caching plays. I think that if you set a minimum cache lifetime, the cache is not cleared until that lifetime expires, so in this way your theoretical DOS attack wouldn't necessarily have any effect in terms of invalidating the cache (It might, however, have a major impact by locking the node table...)
Comment #14
drummCommitted to HEAD.
Comment #15
pwolanin CreditAttribution: pwolanin commentedGreat! Needs to commited to the 4.7 branch too.
Comment #16
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedapplied
Comment #17
(not verified) CreditAttribution: commented