Hey all!

I've just discovered a (potentially serious) bug in 4.7.2.

Basically, if you change the status of a node via admin/node, it tries to save time by executing queries directly against the database.

However, cached pages aren't cleared during this operation, even if they really should be.

So anonymous users could potentially get a stale front page for... a long time.

And even better (worse?) if an admin demotes/unpublishes a node because he really doesn't want it to be visible, it'll still be visible for anonymous users until the cache is cleared.

Anybody want to tackle this? It could be a nightmare for a newbie admin in the wrong situation.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

alexmarkley’s picture

I've taken a stab at patching this issue, mostly for academic and/or illustrative purposes.

This patch makes the assumption that using established mechanisms to perform tasks is better than making up new mechanisms. Thus, I am doing a node_load and node_save on every nid in the list.

Granted, this is slower than the previous method, but it addresses my concern and stands up to some pretty thorough testing.

I'm not suggesting this is a finished patch because there are a number of loose ends which should be addressed by somebody higher up in the food chain than myself.

  • this patch doesn't apply to HEAD! I can barely keep up with 4.7, so I have not tried adapting this patch to HEAD. (Although I expect that it's easy to do...)
  • node_operations() really needs to go and/or be rewritten. Yes, that is my honest opinion. No, I'm not kidding.
  • using node_save has "side effects" on the admin/node content list. Specifically, content is marked as updated when it's really not. Additionally, the content ordering changes. I'm not sure if this is a problem or not... Perhaps it would be possible to prevent node_save from changing $node->changed in a situation like this? I don't know, I'm thinking out loud.
  • maybe I'm wrong, but I think the operations in the admin/node form should be more comprehensive and/or have more predictable effects. For example, we have a 'sticky' operation, but no 'unsticky' operation...

obviously some of those concerns are more relevant than others.

I'm just trying to do my part, so do with it what you will. :)

ttyl
--Alex Markley

beginner’s picture

Version: 4.7.2 » x.y.z

I am not sure it is that critical.

Anyway, bugs ought to be fixed in head first: you don't want the bug to reapear during the next release.
In head, I believe there is no moderate bit, anymore.
See this for the patch format:
http://drupal.org/patch

I'll test a patch for cvs.

dww’s picture

Version: x.y.z » 4.7.2

sorry, i disagree w/ beginner about HEAD vs. 4.7.2. if it's a bad bug in 4.7.2, it should be fixed there. maybe it's not even a bug anymore in HEAD due to other changes. doesn't matter, it still needs to be fixed in 4.7.x. 4.7.3 is definitely going to be released before 4.8.0/5.0.0, so it's more important to get it fixed in 4.7.x first, IMHO.

the issue sounds scary and wrong, though i haven't tried to reproduce it or understand the relevent code. knowing when to invalidate the cache is a tricky question, since you can't invalidate it too frequently, or it becomes useless as a cache. however, in the kinds of cases you're describing, you need to invalidate, or you get incorrect behavior. correctness is always more important than efficiency, if you ask me (i.e. you must be correct first, and if you're correct, then you can worry about being fast).

"This patch makes the assumption that using established mechanisms to perform tasks is better than making up new mechanisms."

very often true. however, in rare cases, you sometimes need to use a new mechanism to do something specific that a more general mechanism could also do. this is the nature of optimization -- some things (e.g. things you do over and over in a tight loop) need to be more efficient than the general case...

"Thus, I am doing a node_load and node_save on every nid in the list."

that sounds like a potentially huge performance hit. if this stuff was being optimized in the first place, then it's possible to keep correctness without incurring such a big cost by simply adding 1 more query to invalidate the appropriate records in the cache table.

that said, i haven't looked *at all* at the patch or the surrounding code, so i won't comment on the specifics. ;) mostly, i just wanted to bump this back to 4.7.2...

also, to prevent this thread from spiraling into too many disconnected things, i'd highly recommend submitting a new issue (as a "feature" for "cvs") about the improvements to the UI on admin/node you'd like to see. let's keep *this* issue focused on the correctness bug (which i'm inclined to leave as critical), and then we can worry about the new features for the future in another thread, focused on a the appropriate branch of the source, and with less urgency...

thanks,
-derek

pwolanin’s picture

I assume it's really expensive to clear the entire cache?

Looking at the way cache_set stores node info, it doesnt' seem that you can be sure that you properly reset the cache unless you know all possible alises for a node- is that right?

includes/common.inc:1403: cache_set($base_root . request_uri(), $data, CACHE_TEMPORARY, drupal_get_headers());

dww’s picture

I assume it's really expensive to clear the entire cache?

of course. ;) it's not the operation of clearing it itself, it's the fact that *every* page viewed by anonymous visitors will then have to be fully re-built and re-cached...

alexmarkley’s picture

hey... Just to clarify, I'm not proposing clearing the entire cache. This patch simply uses the appropriate channels to notify modules that content has been updated. Thus, the appropriate cached items can be cleared.

The fact is that an administrator who promotes/demotes something has a right to expect that action to be reflected to anonymous uses immediately, unless he has explicitly set a minimum cache item lifetime.

Does that make sense?

Additionally, the slowdowns associated with using node_load and node_save are not expensive for anybody but the admin. In terms of page load time, this is actually pretty fast. The only extra slowdowns are associated with rebuilding the appropriate cache items, which really has to be done anyway.

I hope this clarifies my position.
--Alex Markley

alexmarkley’s picture

To clarify again: nothing about this patch results in a fully cleared cache.

I haven't investigated how things work on the back-end, but the "correct" way to do it is to only clear cache items that may be affected by the change that has been made. (For example, if the item is being promoted/demoted...)

Clearing cache items is not an insane slowdown unless it happens too often. On the other hand, if it doesn't happen when it is supposed to, we get undefined/unexpected/unwanted behavior.

ttyl
--Alex Markley

pwolanin’s picture

Wait a minute, take a look at the actual code for node_save:

http://api.drupal.org/api/4.7/function/node_save

The end of the function is:


  // Clear the cache so an anonymous poster can see the node being added or updated.
  cache_clear_all(); 

So every call to node_save invokes cache_clear_all()!

Would it not be much simpler to keep all the exisintg code and add a single call to cache_clear_all()?

pwolanin’s picture

Status: Active » Needs review
FileSize
297 bytes

Ok, patch attached which adds a signle call to cache_clear_all()

Tested with drupal 4.7.2 and seems to solve the bug for me.

pwolanin’s picture

patch for HEAD

alexmarkley’s picture

Status: Needs review » Active

Ooh, you're right. (!)

This strikes me as a serious design flaw.

The implications of this are actually staggering, since practically any user can generate a node_save. (For example, if a site has a forum enabled.)

If a node_save operation clears the entire cache, wouldn't that mean that a user could DoS a really large site by simply creating a forum node and programmatically altering it every few seconds?

Now that I understand more about how the cache system "works", I've uncovered several more stale cache bugs. What's really bad is that some of these stale cache entries might never go away if somebody doesn't manage to trigger a cache_clear_all().

The module I've been designing to export node-related information plays nice with the system by watching node updates via hook_nodeapi and clearing appropriate cache entries when it thinks it should. (For example, it only clears cache entries for pages it generates, and only when a node that it handles comes through.)

However, implementing such intelligence would require every single module that sets a cache item to watch for an update to trigger a delete for its own cache entries.

Is this even practical? I'm in completely over my head.

ttyl
--Alex Markley

alexmarkley’s picture

Considering the current state of the Drupal caching system, pwolanin's patches seem like the right thing to do. They prevent the original problem that I was reporting, while also avoiding the odd behavior and slight slowdowns that my patch caused.

I have this pwolanin's patch installed on my site, and it's working great! +1; thanks, pwolanin. :)

ttyl
--Alex Markley

pwolanin’s picture

Status: Active » Reviewed & tested by the community

Being bold I've moved the status to RTBC ;-)

Quite frankly, I'm not sure how to avoid clearing the cache on node_save. Imagine, if you will, that content from a single node appears on every page in a site. For example, if the "new forum topics" block is enabled and placed in the sidebar on every page. If the a forum node title is changed, or a new topic created, every page on the site is stale. Similarly, every page could be a "view" using the views module and a particular node could appear on most or all pages.

I'm not sure it's possible within the Drupal framework to predict or determine which pages the content from a particular node will appear on.

Since no content is cached for authenticated users, and many forum topics, etc. are created on Drupal.org perhaps the maintaners can add their input on what role caching plays. I think that if you set a minimum cache lifetime, the cache is not cleared until that lifetime expires, so in this way your theoretical DOS attack wouldn't necessarily have any effect in terms of invalidating the cache (It might, however, have a major impact by locking the node table...)

drumm’s picture

Status: Reviewed & tested by the community » Fixed

Committed to HEAD.

pwolanin’s picture

Status: Fixed » Reviewed & tested by the community

Great! Needs to commited to the 4.7 branch too.

killes@www.drop.org’s picture

Status: Reviewed & tested by the community » Fixed

applied

Anonymous’s picture

Status: Fixed » Closed (fixed)