node cache_clear_all routine does not scale for sites with many roles
Reg - December 10, 2007 - 10:59
| Project: | Advanced cache |
| Version: | 5.x-1.6 |
| Component: | Code |
| Category: | bug report |
| Priority: | critical |
| Assigned: | firebus |
| Status: | patch (code needs review) |
Description
Ok, sorry if anyone got offended, I was just in a goofy mood. Anyway, in the advanced Cache Module, lines 54+ you get this:
$maxrole = pow(2, (int)db_result(db_query("SELECT MAX(rid) FROM {role}")) - 1);
for ($i = 1; $i < $maxrole; $i++) {
cache_clear_all($node->nid. '::'. $i, 'cache_node');
}So let's do the math with 19 roles, like I have: pow(2, 19 - 1) = 262,144. Thats how many times this code is sending one line deletes to MySQL on my system. It took a page that transitions in 8 secs. up to 45 secs.
Not knowing if the 2^roles formula is absolutely necessary I have for now changed this code as follows:
cache_clear_all($node->nid. '::', 'cache_node', TRUE);The "TRUE" says regard the ID as a wildcard and creates a query like this:
DELETE FROM cache_node WHERE CID LIKE '1993::%';

#1
i'll take another look at the code, and will check with robert to see if there was any reason (other than memcache support) not to do a wildcard.
if not, it probably makes sense to default to the wildcard clear, and use the original version for memcache support.
i've also thought about keeping track of which role combinations exist (in your site with 19 roles, you almost certainly have a much smaller number of actual role combinations), or adding a configuration setting for which role combos you want to cache (would help both node and comment cache) to improve this feature.
#2
As per the recommendations, I installed memcache and when looking at it I made the following change to accommodate wildcards: http://drupal.org/node/199483. Not a perfect solution but workable within the limitations of memcache's features.
#3
i noticed that!
on my site, which has fewer roles but over 100K users, node caching would not scale for me if the node cache was completely flushed on every node update :)
i think you're correct that http://drupal.org/node/199483 would improve the default behavior of memcache, however i don't think flushing the whole cache is a good solution for advcache.
#4
Agreed.
Drupal provides the wildcard feature just for such situations like the roles problem. Memcache, not being designed for Drupal, simply doesn't work with Drupal's methodology. It may be that memcache has some other mechanism to key a group of cached items. This will probably only get fixed in the short term by someone compromising somewhere such as another parameter being added on the the cache_clear_all routine such as "GroupKey" or whatever works for memcache as it is now or perhaps (using pseudo code) something like:
If $wildcard is not a boolean value then assume it is a group key for memcache
OR perhaps
If using "node_cache" and CID has a "::" in it then use the left hand side of the "::" for a memcache group key.
In the longer term some sort of collaborative agreed solution needs to be put into effect.
#5
I had a look through memcache's features and from what I can see the only methodology that's going to work until it offers something like wildcards or groups is to pull from the database all the existing user/role combinations with an SQL SELECT statement and then delete those one by one from memcache. Like you said, there won't be many combinations and it will scale well enough.
#6
That's interesting. The overhead of this is at least constant (can't balloon like 2^19 roles), and it's only on clearing cache, and not a terribly complicated query. I think I like the approach. Have time to write a patch?
#7
I can't do it in the near term as I am about to go live with a site and you know how busy that can get -- I'm just focusing on "must do's" right now but if no-one else has done it when I resurface then of course I'll fix it. probably a few week down the road though.
Admittedly however, if there is someone out there that's good at SQL they would be a better choice than me. For me it's a lot of trial and error when I start doing SQL stuff.
#8
SELECT DISTINCT(GROUP_CONCAT(rid ORDER BY rid ASC SEPARATOR '-')) FROM {users_roles} GROUP BY uidShould do the trick.
Potential Issues
My gut feeling is still that the issue of role combinations should be exposed as a configuration page. The set of cachable role combos would be stored with variable_set. The admin could manually choose valid role combos, or could run the query above to automatically set things up. It would be up to the admin to update the config if they added roles or added valid combos.
The config solution is also nice if you have some role combos that you don't want to enable caching for in your use case.
#9
I couldn't sleep so I thought I would have a crack at this. Firebus, I hope you know this module because truly the code below is just a best guess on how it's supposed to work and it need someone who actually knows if it's not. The line:
$int = advcache_array2int($roles);I pulled straight from the routine (two line up) which doesn't seem to be being used and I suspect it was the start of what we are doing below but just never got finished.$results = db_query("SELECT DISTINCT (GROUP_CONCAT(rid ORDER BY rid ASC SEPARATOR '-')) roles FROM {users_roles} GROUP BY uid ORDER BY uid");while ($result = db_fetch_object($results)) {
$roles = explode('-',$result->roles);
$int = advcache_array2int($roles);
cache_clear_all($node->nid. '::' . $int, 'cache_node', TRUE);
}
#10
GROUP_CONCAT isn't database neutral. You'll have to use GROUP BY
#11
yeah, i realized that :)
i don't know of a method to get a unique value per role-combo using GROUP BY alone. GROUP_CONCAT allows us to generate a unique string per role-combo.
AFAIK, the other aggregate functions (SUM, etc.) all have possible collisions for distinct role combos (eg, one use has roles 3 and 5, another has 2 and 6, they look the same to SUM)
but i could be missing something, or perhaps there's another SQL-only method?
aha! of course! just like we're doing now with advcache_array2int:
SELECT DISTINCT SUM(POW(rid, 2)) FROM {users_roles} GROUP BY uid;
#12
Close!
SELECT DISTINCT(SUM(POW(2, rid))) as sum, uid FROM users_roles GROUP BY uid;it's 2^rid
#13
also, you'll want to see if you need the -2 in there as well to accurately align with the real cache ids.
#14
The GROUP_CONCAT works great and most people will be using MySQL so I think it's fine for a solution here. Clearly an alternative is needed for other databases to build a patch though.
On another note, is the logic in while loop how it supposed to work? If not, what is right for this module?
#15
@Reg: "The GROUP_CONCAT works great and most people will be using MySQL " <--- doesn't cut it. It has to be ANSI SQL compatible or it doesn't get in.
#16
Then I leave it to people better than I to figure out the SQL. Still would like to know the correct logic for the loop that does the deleting. I'm willing to bet what I posted only being a best guess is not correct.
#17
It's good that memcache got the tweak it needed however for this issue it's not a great solution since we are clearing all nodes out of cache far too often. We need a more precise solution on this end for the total solution to be useful.
I was hoping someone would tell me if #9 above was the right structure for clearing just the nodes needed but no-one has answer that. And if not, tweak it so that it is right. Regarding the "GROUP_CONCAT", I'm happy to replace it with someone Db independent but I would like to get the basic structure of the lines of code right first. If I knew the module and the code better I would just do the whole thing and offer up a complete solution.
#18
Reg, could you elaborate with example data how your suggestion works? It would keep the discussion going at a moment when I don't have time to actually implement and test your suggestions. Thanks.
#19
I guess so, I'm not sure what you are looking for but here's a start. If I use this SQL statement directly in MySQL:
SET SESSION group_concat_max_len =512;# MySQL returned an empty result set (i.e. zero rows).SELECT DISTINCT (
GROUP_CONCAT( rid
ORDER BY rid ASC
SEPARATOR '-' )
) roles
FROM users_roles
GROUP BY uid
ORDER BY uid;
then I get this data on my system:
roles3
8
10-12
10-11-12-18
10-11-12-13-14-15-18-19-20
10-11-12-18-20
4-10-12-18-20
4
3-10-12
20-21
The SET SESSION... line just lets you see the result instead of getting the word [blob] returned on each line.
#20
Ah, ignore the "empty result set" part. I'm not sure how that got there.
#21
The tables the come the advcache installation are empty.
One more issue is in the advcache_nodeapi().
$maxrole = pow(2, (int)dbresult(dbquery(“SELECT MAX(rid) FROM {role}”)) - 1);
for ($i = 1; $i < $maxrole; $i++) {
cache_clear_all($node->nid. ‘::’. $i, ‘cachenode’);
}
My system has $maxrole = pow(2, 27). It means cacheclearall will execute pow(2, 28) times. And it will result ti a PHP time-out.
And, by the way, I’m also running memcache. Is there any conflict with it?
I hope I’m making sense here.
#22
No conflict with memcache.
#23
$int = advcache_array2int($user->roles);
This line of code is found on advcache_nodeapi().
But then, I always wonder why $int is never used.
I used this replacing $maxrole with $int.
for ($i = 1; $i < $int; $i++) {
cache_clear_all($node->nid. ‘::’. $i, ‘cachenode’);
}
And it works fine.
But then my problem still exists. The tables that come with the advcache installation are empty even after I applied the patches. Why is it that these tables are not used? What are these tables for?
#24
raising priority and category level..
guys this is not about "oh what kind of site has over 31 roles"? this bug just killed my prod site when I added role #5.. because its not about how many roles there are (which is a purely unacceptable excuse in my professional mind) it is about where the rid sequence is at. this took down my site because that sequence suddenly was 31. why is absolutely irrelevant.
there needs to be a clear explanation in the code for what the heck 2^max(roles.rid) has to do whatsoever with clearing cached nodes. i defy the coder to provide an explanation to this thinking.
and i really do think this code needs to be downgraded from production-stable to beta. this is most definitely not production stable and there is a serious risk of bringing a site down with such a lurking bug as this.
#25
$int = advcache_array2int($user->roles);
$int is defined but neve used in this module.
can anyone explain?
#26
The thinking is that every combination of roles is unique and the cache key wanted to represent this uniqueness. There's probably a better implementation, though. Try something like this:
<?phpfunction array2int($roles) {
return md5(sort($roles));
}
?>
#27
As this module is only good for users having only one (1) role, I made a query to check if the user has only one role.
$roles = db_result(db_query("SELECT COUNT(*) FROM {users_roles} where uid = $user->uid"));
Having the value of $roles, we can then check it before we proceed to the clearing of cache. So, I replaced
if (!in_array($node->type, variable_get('advcache_node_exclude_types', array('poll')))) {
with
if (!in_array($node->type, variable_get('advcache_node_exclude_types', array('poll'))) && $roles == 1) { ...
Another point here:
if (!in_array($node->type, variable_get('advcache_node_exclude_types', array('poll')))) {
$maxrole = pow(2, (int)db_result(db_query("SELECT MAX(rid) FROM {role}")) - 1);
for ($i = 1; $i < $maxrole; $i++) {
cache_clear_all($node->nid. '::'. $i, 'cache_node');
}
}
This line of code is good if the maximum rid is low. But in my case, where MAX(rid) is 28, cache_clear_all() would execute 2^27 times. And it will take so much time that it will result a php-time-out.
Studying in depth, I learned that cache_clear_all api is all about deleting the entries stored in the database.
Evaluating the code above, it is just like executing the database command:
$maxrole = pow(2, (int)db_result(db_query("SELECT MAX(rid) FROM {role}")) - 1);
for ($i = 1; $i < $maxrole; $i++) {
db_query("DELETE FROM cache_node WHERE cid = '$node->nid'.::.'$i'");
}
To avoid the loop, I made a query that would practically do the same. But this one, the trick is on the sql and not on php.
I replaced
$maxrole = pow(2, (int)db_result(db_query("SELECT MAX(rid) FROM {role}")) - 1);
for ($i = 1; $i < $maxrole; $i++) {
cache_clear_all($node->nid. '::'. $i, 'cache_node');
}
with
db_query("DELETE FROM {cache_node} WHERE cid LIKE '$node->nid::%'");
So, this is my final code for the nodeapi() of advcache module:
function advcache_nodeapi($node, $op) {
switch ($op) {
case 'update':
case 'insert':
case 'delete':
global $user;
$int = advcache_array2int($user->roles);
$roles = db_result(db_query("SELECT COUNT(*) FROM {users_roles} where uid = $user->uid"));
if (!in_array($node->type, variable_get('advcache_node_exclude_types', array('poll'))) && $roles == 1) {
db_query("DELETE FROM {cache_node} WHERE cid LIKE '$node->nid::%'");
}
// It is unfortunate that we have to use the wildcard here, but it
// comes from the fact that the signature to taxonomy_node_get_terms
// has a $key parameter which goes into building the cache key, which
// we can't reliably reconstruct here.
cache_clear_all('node::'. $node->nid, 'cache_taxonomy', TRUE);
if ($node->type == 'forum') {
cache_clear_all('*', 'cache_forum', TRUE);
}
break;
}
}
This worked out fine on my system. I hope the authors could comment on this.
#28
@jeuelc: did you try my replacement role2int function in #26?
#29
did you mean this:
$int = array2int($role)
no, i haven't tried this. i don't know where in the module is $int used.
#30
Can people please test the approach in this patch and report back? Thanks.
#31
i really think this is moot. see dupe ticket http://drupal.org/node/188085
there's no reason to cache nodes on a role-combo basis, since all access involving node viewing and node lists happens before the node cache is accessed.
#32
<?phpfunction array2int($roles) {
asort($roles);
return md5($roles);
}
?>
This is my proposed solution.