Editing a node does not wipe its cache entry
| Project: | Drupal |
| Version: | 7.x-dev |
| Component: | node system |
| Category: | bug report |
| Priority: | normal |
| Assigned: | bjaspan |
| Status: | patch (code needs review) |
When a node is saved, cache_clear_all() is called. If a cache minimum lifetime is in effect, cache_clear_all() does not clear any cache entries until cache_clear_all() is called again after the minimum lifetime has passed since the first call. Imagine this scenario:
0. You are an editor at a busy newspaper site.
1. Your cache minimum lifetime is 1 hour.
2. At 8am, you create a new story. It is temporarily unpublished.
3. At 8:15am, a curious user visits node/larger-nid-than-yesterday and hits your story. She gets Access Denied because the story is unpublished.
4. At 8:45am, you publish the story and it goes on your home page. cache_clear_all() is called but, because you have a minimum cache lifetime, the Access Denied page entry is not wiped.
5. Because the story's URL is in the page cache with Access Denied, that is what all anonymous users see.
6. You have cron running, but cron does not call cache_clear_all().
7. At 10am, no new stories have been saved, so cache_clear_all() is still not called, and your users still see Access Denied.
7. At *noon*, you save a new story.
8. Now, users finally see the first story, "published" three hours ago.
9. The publisher fires you for deploying such a stupid CMS.
It seems to me that when a node is saved, that node's page view entries should be wiped from the cache immediately. See the attached patch.
| Attachment | Size |
|---|---|
| node-clear-cache.patch | 1.18 KB |

#1
Looks good to me. The Drupal code police are going to call you on putting spaces around the string concat operator '.' - I'm not one of them but felt I should probably give you a heads up.
#2
@macgirvin
The Drupal Coding Standards Council has declared it heresy to have a space on one side of the '.' but not the other. For 7.x, any blasphemer who commits this violation will be coding Drupal in COBOL for a month. The exact issue # is left as a penance for the transgressor.
@bjaspan
I agree with your that this is a case where the cache needs to be forced cleared. I wondering if we really need two attempts, rather than only one (which we know would be the way it works). Also, what about comments and other node related actions? Should those trigger a flush as well?
#3
@kbahey - If you only knew how many batches of punched cards I've had to redo because my fingers tripped up on IDENTFICATION DIVISION.
IDENTIFICATION DIVISION.PROGRAM-ID. SMILEY.
PROCEDURE DIVISION.
MAIN.
DISPLAY ':-)'.
STOP RUN.
#4
@macgirvin: In that case, why calling the full
cache_clear_all()anyway? Isn't that purely wasteful?#5
@kbahey: Yes, the same idea probably applies in a number of other locations (e.g. comments), though nodes are probably the single most important.
@Damien: The comments explain why cache_clear_all() is still needed. Many page might be affected by a change in a node, it is just that "minimum cache lifetime" should still apply to them.
#6
The patch uses the drupal 5 syntax for the url() function. Here's a patch that uses the new syntax with an associative array for the arguments.
I was able to reproduce the problem manually (and the patch does indeed fix it), but when I wrote a test for it the page never got cached even though page caching was enabled.