Posted by Reg on February 13, 2008 at 5:48am
13 followers
| Project: | Drupal core |
| Version: | 8.x-dev |
| Component: | node system |
| Category: | bug report |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | needs work |
Issue Summary
I was modifying a module which required me to modify a node and save it. However, the changes to that node kept reverting back to what it was before I did the save. Upon investigation I determined that it was the node_load cache and when I did a load just after the save with the flag to reset the cache, all worked as it should.
So first, is there a deliberate reason why node_save can save without keeping the node_load cache in sync?
Second, if it is not deliberate then perhaps a change to the node system along the lines of the following would be useful:
In node_load:
function node_load($param = array(), $revision = NULL, $reset = NULL) {
global $user;
static $nodes = array();
if ($reset) {
if (is_numeric($reset)) {
if (array_key_exists($reset, $nodes)) {
unset($nodes[$reset]);
return true;
} else {
return false;
}
} else {
$nodes = array();
}
}In node_save (about 5 lines from the bottom):
// Call the node specific callback (if any):
if ($node->is_new) {
node_invoke($node, 'insert');
node_invoke_nodeapi($node, 'insert');
}
else {
node_invoke($node, 'update');
node_invoke_nodeapi($node, 'update');
node_load(null, null, $node->nid);
}
// Update the node access table for this node.
node_access_acquire_grants($node);
// Clear the cache so an anonymous poster can see the node being added or updated.
cache_clear_all();
}
Comments
#1
This is a little cleaner for the node_load part:
if ($reset) {if (is_numeric($reset)) {
$result = array_key_exists($reset, $nodes);
if ($result) unset($nodes[$reset]);
return $result;
} else {
$nodes = array();
}
}
#2
I'm changing this to a bug as there is no way that I can see where what you save should be allowed to be out of sync with what you load at such a fundamental level.
#3
Yes, it would be smart to invalidate this cache from within node_save()
#4
This is fixed the patch proposed at http://drupal.org/node/111127#comment-768053. That patch is much bigger and intended for D7.
#5
Here is a patch for Drupal 6 that should fix the bug.
Note, I wrote this for Drupal 6 rather than Drupal 7 because:
(a) the bug appears to already be fixed in Drupal 7, although I think it may have been fixed inadvertently (see #424602: Unsaved node changes are reflected on subsequent calls to node_load()), and
(b) static caching in Drupal 7 is currently in a state of flux, so I wasn't even positive what the correct fix would be right now
#6
I just hit this bug in 6.x.
@#5 node_load() flushes the entire node cache. That seems a bit extreme. node_save() could instead insert the node it just saved into the node cache. That would require exposing the node_load() cache to node_save() as well. Sounds worth it.
#7
FWIW I didn't see a test for this in 7, so here's a test patch.
#8
Digging into this, there's no special logic in 7 to clear the cache on save. The difference between 6 and 7 is that in 7 node_load() returns the node from the cache and 6 stores and returns a COPY of the node from the cache. In 7 when you call node_load() and fiddle with the node you've already updated the cache.
There's two solutions for 6: stop copying the node before returning it from the cache; update the node cache in node_save(). The former has the advantage of compatibility.
In 7, the consequences of node_load() returning the real cached object means that this now becomes unstable:
$node = node_load($nid);
$node->body = "Something";
node_save($node);
$node->body = "Something else";
// code passes
$node = node_load($nid);
print $node->body;
What is $node->body? If the node cache happened to have been cleared between the two node_load() calls its going to be its "Something". If not, its "Something else". This exposes the existence of the cache to the user and allows action at a distance. ++ungood
The attached test class demonstrates the bug. It passes in 6 and fails in 7 (remove setUp to run it in 6).
#9
Yup, exactly :) As I already noted above, please see #424602: Unsaved node changes are reflected on subsequent calls to node_load()
#10
Here's full patches with tests to solve this bug and #424602 in both 6.11 and 7.x. I don't think #424602 is really a duplicate of #154859 as nodes have their own in memory cache unrelated to the database caches.
I'm new to PHP and Drupal, so please excuse any non-idiomatic stylings.
I turned on all error_reporting() in the tests because it seems like a good idea to run the tests with all the warnings flipped on. It exposes some issues in simpletest and the node code. Feel free to turn them off if you don't want to deal with them right now.
#11
Due to the nature of patches for 6.x, I assume this won't go in as is. However, this is rather nasty bug to fall on... Maybe a clean workaround could be laid out for 6.x developers, something along the lines of:
if you node_save() something you might want to node_load() later, make sure you clean the nodes cache by calling node_load(array(), NULL, TRUE).
Subscribing.
#12
I just got to this issue and I think the scope of it is even wider, probably covered in other issues but worth mentioning, maybe the same happens on D7?
If you are in any node/xxx/yyy page (node/123/edit for instance), the menu system will call node_load(123) at the very beginning of the page request, and the node will eventually be statically cached in the node_load() function.
This means that after clicking SAVE in the node/123/edit page, the menu will first call node_load(123) function but the node has not been updated at all yet, so it's really caching the old node. So every node_load(123) function called along the update process will always get the old node.
Maybe the patches already solved this, but thought it worth mentioning.
#13
I think this issue is becoming too confused with #424602: Unsaved node changes are reflected on subsequent calls to node_load(). Please move any discussion related to that problem there (or really to the meta-issue which it was marked duplicate of). Probably I made a mistake by even referring to it in the first place -- although it (coincidentally) helps fix the bug here, it doesn't actually completely fix it, and plus, we shouldn't rely on it but rather should have an actual direct fix.
The attached patch for D7 combines the fix from #5 with a reworked version of the test from #7 (thanks @schwern, and don't be discouraged that your patch sat for so long! - back in April, many people weren't yet focused on the "bug-fixing" part of the D7 development cycle...) The test works with cloned nodes so that it will fail without the actual fix, regardless of what happens with #424602: Unsaved node changes are reflected on subsequent calls to node_load().
Regarding the comment above in #6, I think we actually do need to clear the entire node cache, rather than just the node being saved. Instead of explaining it here, I figured I'd just explain it in a code comment in the patch :)
#14
Not so sure about the clear all nodes upon saving one. Couldn't that be implemented by the text filter module or the node ref module in your examples? I don't think node.module itself needs to be so aggressive, since core does not support these things. My main concern is that we blow away the whole persistent entity cache - http://drupal.org/project/entitycache. Perhaps catch can chime in here.
#15
This should use entity_get_controller('node')->resetCache(); to clear the static.
entitycache.module doesn't clear the persistent cache when you do ->resetCache(); so it wouldn't break that particularly. We could add an optional $nid argument to resetCache() too I guess.
#16
OK, using the resetCache() function now instead.
Seems to me that's probably far enough to go for this patch, since it makes the behavior consistent with http://api.drupal.org/api/function/node_delete_multiple/7... Trying to deal with only clearing a single $nid or group of $nids really sounds more complicated, at least off the top of my head (can a text field easily figure out the cache clearing function for the type of object it is attached to and know how to call it?). Plus it seems like a lot of work for minimal gain, no? The code right above this clears the entire page cache, so clearing the static cache for the rest of the current request seems like nothing compared to that...
#17
Makes sense.
Could save a few cycles by having NodeSaveTestCase extend DrupalUnitTestCase instead of DrupalWebTestCase? Or will the node_save() fail in that state?
#18
Yeah this is fine, we'll just need to be very careful in entitycache to keep the persistent and static cache clearing separate, but that's already the case.
Cross posted with Moshe - UnitTestCase has no database available, at all, so anything which could possibly trigger a database hit is a no go.
#19
No longer applies. :(
#20
Rerolled.
The reroll was completely trivial, so should be RTBC again assuming the tests pass.
#21
The tests passed :)
#22
The node_load() cache being still there after doing a node_save() is currently the only possibility to retrieve the unchanged node after an update. To be able to get the unchanged node is really important when you want to react on changes, thus most rules rely on reacting on changes! This patch makes it impossible to react on this changes on API level!
Ideally the API would make the unchanged node object available in hook_node_presave and hook_node_update. Thus when we remove the only previously possibility to get the unchanged node, we really need to make it available properly instead.
Also FIELD_LANGUAGE_NONE is LANGUAGE_NONE now.
#23
With the current behaviour of objects being shared #154859: Document that cached entity objects are not necessarily cloned any more we are already close to the desired behaviour no? If the changed was previously loaded with node_load() the node is already updated in the cache. However in some situations, the updated node is created fresh e.g. from form values or comes from somewhere else, e.g. the form cache - then the node_load() cache hasn't been updated yet. So what about just updating the cached node with the new one? That way we would save further lookups of a node we already had in memory.
+ // Clear the static cache of nodes. Since the content of one node can depend+ // arbitrarily on the content of another (for example, via a text filter or
+ // field that renders part of one node inside of another), it is not
+ // sufficient to refresh only the node that is being saved; instead, all
+ // nodes must be refreshed.
If a node depends on another node it should really only save a pointer to it - thus the nid or vid. In that case everything is fine as any further request to get the the node will retrieve the updated node anyway. So fixing the cache of the changed node only should be fine.
#24
For the issue of being able to react on changes I created #651240: Allow modules to react to changes to an entity
#25
I fixed the FIELD_LANGUAGE_NONE define and re-rolled the patch.
While I'm still not sure about the complete (static) cache clear I trust the opinion of performance specialists like catch. :)
#26
#25: node-save-cache-221081-20.patch queued for re-testing.
#27
The last submitted patch, node-save-cache-221081-20.patch, failed testing.
#28
This bug was mostly fixed as a result of #651240: Allow modules to react to changes to an entity, but I'm running into one scenario where it still occurs.
The issue is that if you call node_load() from within an implementation of (let's say) hook_node_update(), you get the old node rather than the new one that was just saved to the database. Someone else already pointed this out in the comments on api.drupal.org, and a realistic case where I just ran into it is #1541142: Immediately deploying an edited entity sends the old version of the entity rather than the new one.
The fix would be something like in the attached patch, which I'm sending to the testbot to see what happens. (However, it's not just nodes that this issue occurs for; other entities have the same problem.) Also, it appears that $entity->save() in Drupal 8 already does this correctly, so for Drupal 8 it may fix itself on its own once everything is converted to the new entity system.
I think this is a clear bug, but I'm not sure if it's backportable to Drupal 7 or not? It makes the hooks behave the way you'd expect them to, but I guess I can imagine it breaking some existing code...
#29
Another issue still remaining is the inconsistency between save and delete.
In node_save(), we do this:
<?php// Clear the static loading cache.
entity_get_controller('node')->resetCache(array($node->nid));
?>
But in node_delete(), we do this:
<?php// Clear the page and block and node_load_multiple caches.
entity_get_controller('node')->resetCache();
?>
(And a similar thing for other entities.)
It's not clear to me that this discrepancy exists for a reason.
(Not to mention that the code comment about clearing the page and block caches looks to be out of date, and if we don't clear the page cache when a node is deleted anymore I bet that's a bug too.)
#30
#28: entity-reset-cache-ordering-221081-28.patch queued for re-testing.
#31
The last submitted patch, entity-reset-cache-ordering-221081-28.patch, failed testing.
#32
#28: entity-reset-cache-ordering-221081-28.patch queued for re-testing.
#33
The last submitted patch, entity-reset-cache-ordering-221081-28.patch, failed testing.