why are nodes cached on a per-role basis?
firebus - October 31, 2007 - 07:17
| Project: | Advanced cache |
| Version: | 5.x-1.6 |
| Component: | Code |
| Category: | support request |
| Priority: | normal |
| Assigned: | firebus |
| Status: | duplicate |
Description
if i'm not mistaken, if you're only using core modules, node_load should be the same for all users. whether or not the node is displayed is a separate question that is handled outside of node_load.
i can imagine that there might be a contrib module that hooks different things into node_load depending on the user's role (although this seems unwise to me), but i don't know of any off-hand.
are there situations in core, or with common contrib modules, where one definitely needs to cache nodes per role?

#1
I too would be VERY interested in hearing why nodes are cached on a per-role basis. For my application, there's not much (if anything) different between the nodes I present to my authenticated and unauthenticated users. Besides, even if I cached the same node data for all users, I would still only display node data based on the user's roles.
So, why would we want to create separate node cache entries for each role combination? Would it be wrong for me to strip out the role stuff from node caching? I hate hacking my modules...
Responses appreciated,
Jonathan
#2
It's basically because I couldn't prove that it wasn't needed to guarantee serving the right node to the right person. I'd be all for using another strategy for people who know that this isn't an issue. Do you think including two patches is a good approach? One for basic and one for per-role? The cache would be more effective without the role based loading. On the other hand, a typical site has 1 role that includes 90%+ of the logged in users. For these sites the current patch is still very effective. In fact, on the sites where I run this patch, the node caching is one of the most effective patches.
#3
I understand that serving the right node to the right person is an issue, but I would think this responsibility could be delegated to the display code (hook_view), that way a single copy of the actual node data could be cached for all users. There are probably benefits/drawbacks to either approach, but if it's possible to control what content is being displayed to the user, such as with a custom node module, you might as well just use a single cached node for all users. This approach seems to make sense for my situation at least.
I'm not sure about another patch, though the change itself would be easy enough... just excluding the user role number from the cache key in node_load.
jonathan
#4
But hey... you've gotta admit that the advcache_array2int function which I use to make the integer representing the roles is as cool as it gets =)
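For readers who have not looked at the module, here is a rough sketch of how an array of roles could be folded into a single integer. This is NOT the actual advcache_array2int source. The function name example_roles_to_int and the mapping of role ID r to bit (r - 1) are assumptions made for illustration only.

```php
<?php
// Hypothetical sketch, not the real advcache_array2int().
// Assumption: role IDs are small positive integers, and role ID r
// sets bit (r - 1) of the result.
function example_roles_to_int(array $roles) {
  $int = 0;
  // Drupal keys $user->roles by role ID, so only the keys matter here.
  foreach (array_keys($roles) as $rid) {
    $int |= 1 << ($rid - 1);
  }
  return $int;
}

$roles = array(2 => 'authenticated user', 4 => 'editor');
echo example_roles_to_int($roles), "\n"; // prints 10 (bits 1 and 3 set)
```

Under this assumed mapping, every distinct combination of roles gives a distinct integer, which is what makes it usable as part of a cache key.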
#5
It is pretty cool. I could think of some similar uses for something like that. It got me wondering why user/roles themselves aren't cached though.
#6
even if node_cache doesn't need it, array2int should still be used in places that cache multiple nodes or comments - eg search caching and comment caching - basically anything that is currently limited by role or node access query rewrites could be extended using this function.
but then we've got the overhead of keeping track of which keys we've cached (for a single node it makes sense to iterate through all possible role combos, but for things like comments that have paging and sorting to cache as well, it would be prohibitive!) and i'm not sure if my solution for this (see my comment caching alternative) is good enough...
my impression is also that the view code handles which node to display, so it should be safe to cache the node once for all roles. my only worry is that there are some cases (perhaps with contrib modules?) where the node_render returns a different string for different roles.
i'll take this issue and do some investigation...
#7
i like the idea of caching things by role, but only if it can be done in a way that's optional. as i argued for node caching above, there are many situations where caching nodes (or anything else) by role doesn't buy you anything except having lots of identical data (nodes, comments, whatever) in your cache storage.
for nodes and perhaps for other things, i'd think you could cache the same data and use the view code to show/hide bits of data based on the user's roles. no need to store numerous copies of the data for each role combination.
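A minimal sketch of what "optional" could look like at the cache-key level. The helper name example_node_cache_key and the $per_role flag are hypothetical and not part of advcache. The 'nid::roles' key shape is taken from the cache_clear_all calls quoted later in this thread.

```php
<?php
// Hypothetical helper, not part of advcache.
// $per_role would come from a site setting; $roles_int is the integer
// that represents the user's role combination.
function example_node_cache_key($nid, $roles_int, $per_role) {
  if ($per_role) {
    // One cache entry per node per role combination.
    return $nid . '::' . $roles_int;
  }
  // One shared cache entry per node for all users.
  return (string) $nid;
}

echo example_node_cache_key(42, 10, TRUE), "\n";  // prints 42::10
echo example_node_cache_key(42, 10, FALSE), "\n"; // prints 42
```

With the flag off, every user hits the same entry, and invalidating a node means clearing exactly one key.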
#8
The tables that come with the advcache installation are empty.
One more issue is in advcache_nodeapi().
  $maxrole = pow(2, (int)db_result(db_query("SELECT MAX(rid) FROM {role}")) - 1);
  for ($i = 1; $i < $maxrole; $i++) {
    cache_clear_all($node->nid. '::'. $i, 'cache_node');
  }
My system has $maxrole = pow(2, 27). It means cache_clear_all will execute roughly pow(2, 27) times, which results in a PHP time-out.
And, by the way, I’m also running memcache. Is there any conflict with it?
I hope I’m making sense here.
#9
$int = advcache_array2int($user->roles);
$int is defined but never used in this module.
can anyone explain?
#10
As this module is only good for users having only one (1) role, I made a query to check if the user has only one role.
$roles = db_result(db_query("SELECT COUNT(*) FROM {users_roles} WHERE uid = %d", $user->uid));
Having the value of $roles, we can then check it before we proceed to the clearing of cache. So, I replaced
if (!in_array($node->type, variable_get('advcache_node_exclude_types', array('poll')))) {
with
if (!in_array($node->type, variable_get('advcache_node_exclude_types', array('poll'))) && $roles == 1) { ...
Another point here:
  if (!in_array($node->type, variable_get('advcache_node_exclude_types', array('poll')))) {
    $maxrole = pow(2, (int)db_result(db_query("SELECT MAX(rid) FROM {role}")) - 1);
    for ($i = 1; $i < $maxrole; $i++) {
      cache_clear_all($node->nid. '::'. $i, 'cache_node');
    }
  }
This code is fine if the maximum rid is low. But in my case, where MAX(rid) is 28, cache_clear_all() would execute about 2^27 times. That takes so long that it results in a PHP time-out.
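To put numbers on that, here is a small standalone sketch that mirrors the loop bound from the snippet above. The function name example_clear_iterations is made up for illustration.

```php
<?php
// Mirrors the bound of:
//   for ($i = 1; $i < pow(2, $max_rid - 1); $i++) { ... }
// and returns how many times the loop body would run.
function example_clear_iterations($max_rid) {
  return pow(2, $max_rid - 1) - 1;
}

foreach (array(3, 10, 28) as $max_rid) {
  echo $max_rid, ' => ', example_clear_iterations($max_rid), "\n";
}
// prints:
// 3 => 3
// 10 => 511
// 28 => 134217727
```

Each extra role ID doubles the number of cache_clear_all() calls per node save, whether or not those keys were ever cached.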
Studying in depth, I learned that cache_clear_all api is all about deleting the entries stored in the database.
Evaluating the code above, it is just like executing the database command:
  $maxrole = pow(2, (int)db_result(db_query("SELECT MAX(rid) FROM {role}")) - 1);
  for ($i = 1; $i < $maxrole; $i++) {
    db_query("DELETE FROM {cache_node} WHERE cid = '%s'", $node->nid. '::'. $i);
  }
To avoid the loop, I wrote a query that does practically the same thing. The trick here is in the SQL rather than in PHP.
I replaced
  $maxrole = pow(2, (int)db_result(db_query("SELECT MAX(rid) FROM {role}")) - 1);
  for ($i = 1; $i < $maxrole; $i++) {
    cache_clear_all($node->nid. '::'. $i, 'cache_node');
  }
with
  db_query("DELETE FROM {cache_node} WHERE cid LIKE '%d::%%'", $node->nid);
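The effect of the prefix match can be sketched without a database. The snippet below uses a plain PHP array as a stand-in for the cache_node table. The function name example_clear_by_prefix and the sample keys are invented for illustration.

```php
<?php
// In-memory stand-in for the cache_node table: cid => cached data.
// Removes every entry whose cid starts with "<nid>::", which is what
// a LIKE '<nid>::%' condition selects.
function example_clear_by_prefix(array $cache, $nid) {
  $prefix = $nid . '::';
  foreach (array_keys($cache) as $cid) {
    if (strncmp($cid, $prefix, strlen($prefix)) === 0) {
      unset($cache[$cid]);
    }
  }
  return $cache;
}

$cache = array(
  '12::1' => 'a',
  '12::10' => 'b',
  '112::1' => 'c',
  '13::1' => 'd',
);
$remaining = example_clear_by_prefix($cache, 12);
// Only the entries for other nodes remain: '112::1' and '13::1'.
print_r(array_keys($remaining));
```

Because the match is anchored at the start of the key and includes the '::' separator, clearing node 12 does not touch node 112 or node 13. One pass covers every role combination that was actually cached, regardless of how large MAX(rid) is.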
So, this is my final code for the nodeapi() of advcache module:
function advcache_nodeapi($node, $op) {
  switch ($op) {
    case 'update':
    case 'insert':
    case 'delete':
      global $user;
      $int = advcache_array2int($user->roles);
      // Number of roles assigned to the current user.
      $roles = db_result(db_query("SELECT COUNT(*) FROM {users_roles} WHERE uid = %d", $user->uid));
      if (!in_array($node->type, variable_get('advcache_node_exclude_types', array('poll'))) && $roles == 1) {
        // Delete every per-role cache entry for this node in one query.
        db_query("DELETE FROM {cache_node} WHERE cid LIKE '%d::%%'", $node->nid);
      }
      // It is unfortunate that we have to use the wildcard here, but it
      // comes from the fact that the signature to taxonomy_node_get_terms
      // has a $key parameter which goes into building the cache key, which
      // we can't reliably reconstruct here.
      cache_clear_all('node::'. $node->nid, 'cache_taxonomy', TRUE);
      if ($node->type == 'forum') {
        cache_clear_all('*', 'cache_forum', TRUE);
      }
      break;
  }
}
This worked out fine on my system. I hope the authors could comment on this.
#11
http://drupal.org/node/199465