Document that objects are shared between modules
moonray - June 26, 2007 - 21:50
| Project: | Drupal |
| Version: | 7.x-dev |
| Component: | base system |
| Category: | task |
| Priority: | critical |
| Assigned: | Unassigned |
| Status: | active |
Description
The way taxonomy terms and vocabularies are cached in functions like taxonomy_get_term() allows later functions to alter the cached data. This is because PHP (at least in version 5) doesn't copy objects, but passes them by reference.
So, the solution to the problem is to clone the objects in cache before returning them.
example (note the last line: drupal_clone()):
function taxonomy_get_term($tid) {
static $terms = array();
if (!isset($terms[$tid])) {
$terms[$tid] = db_fetch_object(db_query('SELECT * FROM {term_data} WHERE tid = %d', $tid));
}
return drupal_clone($terms[$tid]);
}
#1
#2
This problem still exists in D6.
#3
I've quickly gone through the entire taxonomy.module and found 2 more places where we're returning cached objects (thus effectively returning references to these objects, thus allowing for corruption).
Note that performance implications might have to be tested, I have no idea how heavy drupal_clone() is.
#4
Bump! Marking this as critical because it can potentially cause very weird errors.
#5
I wonder whether the taxonomy module needs to clone these or not. Throughout Drupal, we cache a lot of object, and I don't think we ever clone them.
It's rare that a consumer will alter a cached object. If they do, it's probably a good idea to have _them_ clone the object. It should be rare, not?
#6
I agree that it should be rare. And I also know we're cloning almost nowherein core. But many people have been running Drupal on servers with PHP 4, while this problem only occurs with PHP 5. When Drupal 7 will be PHP 5 only, it's almost certain that more of these weird "bugs" will appear.
IMVHO, it's better to have the solution in core.
A quick google search gave me these results:
-article by Steven: http://acko.net/node/54
-issues: http://drupal.org/node/10490, http://drupal.org/node/63239, http://drupal.org/node/57884
#7
Part of the problem is that somewhere down the road, you don't realize you have to clone the object before using it. If you get it directly from the DB, it's a fresh object anyway, so why bother, right?
Except, things are inconsistent. Sometimes objects are cached, sometimes they come straight from the DB.
When you're working with an API (i.e. getting taxonomy terms), you don't expect to be altering the cached version of objects (IMO).
#8
I see the problem with the inconsistent object behavior. Some people (http://drupal4hu.com/node/52 - although this had lots of comments before chx moved the site) say we should solve this by using arrays and that would give us lots of array functionality which is nonexistent for objects, and we get the on-demand cloning (AKA lazy copy) implemented by PHP.
Anyway, we can approach this problem by fitting the results with cloning, or rethinking where we use objects. The latter is obviously not a Drupal 6 matter, the former could be.
#9
This would need a much more through patching -- why just taxonomy -- and I do not think that is a 6.x matter. Goba, I never throw away content, if you re-read my blogpost, the comments are summed up in there.
#10
Of course you're right chx. But you have to start somewhere and get feedback on that ;) And yes, we're close to a beta for D6, so it's definitely something for D7.
#11
bump - for 7.x and or possible backport?
#12
subscribing
#13
Looks like this one got lost, moving out of the taxonomy queue since it's not specific to there.
#14
Here's a related issue #359746: node_view() modifies the node object stored in the menu (candidate for duplicate)
#15
Duplicate #424602: Unsaved node changes are reflected on subsequent calls to node_load()
#16
Obviously cloning everything before returning is not practical (as it essentially double the memory usage), and switching all our data structures to arrays seems to go in the wrong direction. It think this is essentially a documentation issue.
Most of the objects used throughout the process of a request are shared between all the modules (typical example: objects returned by menu_get_object()). We simply need to document that. I'm with Dries on that, it's the caller responsibility to do the cloning if it needs to.
#17
Erm.
#18
I just duplicated this here #453406: noad_load_multiple() should return cloned objects?, thanks for pointing me here Damien.
It doesn't seem right that it's the "caller responsibility". I think that everyone agrees that conceptually it's not the right thing to do, but we will do it because we are also practical. On that basis, I would say that the most practical argument that holds against this solution is that it means Drupal core will have to do that too, things like the node building which change the node object[1] will have to do that for node objects, and it's one of the largest objects in drupal. I wonder where else core would need to do that, and consequently, if in that case the "caller responsibility" would still maintain its practicality.
[1] and this currently hits Poll module for example, check #3 at #372650: Poll node is not displayed when poll block is displayed as well and see how ugly the current possible fix is.
#19
#20
subscribing