#636454: Cache tag support adds 'cache tag' support to the cache API. This issue is for using it to replace the combination of cache_clear_all(), CACHE_TEMPORARY and minimum cache lifetime.

The problem

The page and block caches are cleared in their entirety every time a node, comment or user are updated. This leads to a very low cache hit rate.

To deal with this the minimum cache lifetime feature exists in core, however it is complex, few people understand it, it relies on side-effects like $user->cache and $_SESSION, and due to all of those reasons is prone to extremely hard to track down bugs.

Additionally the use of CACHE_TEMPORARY is not suitable for cache backends such as memcache, and leads to issues like #891600: Page and block caches are cleared the same way on cron as on content change.

Proposed solution

  • Attach cache tags to items that are cached in the drupal_render() and page caches.

  • At the same time, implement cache()->deleteTags() in the relevant places so that those tags are cleared on system events.

    This is likely to involve support for nodes, comments (which should not add their own tags, but clear tags of the node they're attached to), and users.

  • System configuration changes that aren't granular enough to use the tag system can just do cache($bin)->flush() for the affected cache bins. These changes should be relatively infrequent.

  • Add a #cache_tags key to drupal_render() alongside #cache.

  • Populate 'tags' in node_view() and similar - this needs to be supportable for things that don't actually get cached (in the same way that #attached is).

  • Node submits and similar places should clear tags (not the CRUD API functions).

Original report
The "minimum cache lifetime" feature was useful at one point, but is in a very sorry state right now:

  • The cache lifetime setting is global and inflexible
  • The per-user cache lifetime indicator ($user->cache) is global, instead of being per bin (and as a consequence a flush on one bin will flush non-expired data in the other bin)
  • Some modules are known to break it (because the global $user->cache indicator is lost)

Non-technical users should not use this feature. Moreover, now that we have made cache properly pluggable per bin, we will be able to implement new and better strategies in contrib (the memcache module already have some of that).

I suggest we remove the feature from core.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Damien Tournoud’s picture

Status: Active » Needs review
FileSize
9.54 KB
 7 files changed, 12 insertions(+), 103 deletions(-)
aspilicious’s picture

 /**
- * Remove the 'threshold', 'mode' and 'sort' columns from the {users} table.
+ * Remove the 'threshold', 'mode', 'sort' and 'cache' columns from the {users} table.
  *

Can you make that line less then 80 characters makes me and the doc people happy :)

Damien Tournoud’s picture

Sure.

catch’s picture

I don't think we should remove this. Core page caching with a frequent cron run is almost useless without this set. Making the variables (both the setting and the internal user->cache) per-bin would not be overly difficult.

It's tweaky, and every time I look at the code I have to figure out what it does all over again, but I'd feel uneasy about removing something at this stage, especially since it's mainly good for those who can't install memcache etc. and those sites are going to be hurting with some of the other crap with have in D7 at the moment. If you're using a different cache.inc you don't need to worry about this anyway.

If you can show that it's not actually a performance benefit for real sites as it stands, then sure let's kill it. If not this feels like D8 to me.

Dries’s picture

I'd be in support of removing this. It is inflexible, and actually hinders better cache implementations in contrib. This assumes there is no significant performance regression.

moshe weitzman’s picture

i think we need a bit more solid plan that "better solutions in contrib". could you describe what you envision here. unfortunatelym varnish is not available for many sites.

page and block cache get cleared if you blow on them. minimum lifetime is what currently saves you from peril.

retester2010’s picture

#3: 730060-remove-cache-lifetime.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, 730060-remove-cache-lifetime.patch, failed testing.

pillarsdotnet’s picture

This solves the same problem as #1015946: Eliminate $user->cache and {session}.cache in favor of $_SESSION['cache_expiration'][$bin].
Either this issue or that one should be closed as a duplicate.

catch’s picture

Title: Consider removing the minimum cache lifetime feature » Replace CACHE_TEMPORARY, cache_clear_all() (no arguments) and minimum cache lifetime with cache tags support
Version: 7.x-dev » 8.x-dev
Component: base system » cache system
Anonymous’s picture

subscribe. more hot caching action!

Anonymous’s picture

Issue summary: View changes

Updated issue summary.

Berdir’s picture

Issue summary: View changes

removed duplicate lines

catch’s picture

Possible minimal approach to this:

Once #636454: Cache tag support is in:

- tag all page and block cache items with $tags = array('content' => TRUE);
- use CACHE_PERMANENT or add a new variable for default length (which can be long).
- all places that currently call cache_clear_all() instead call cache_tags()->invalidate(array('content' => array(TRUE)));
- remove cache_clear_all()
- remove CACHE_TEMPORARY

At that point we haven't really done much more than #891600: Page and block caches are cleared the same way on cron as on content change, since the page/block cache will still get cleared every content change, but the next step after that would be replacing array('content' => TRUE) with more granular cache tagging and clearing (i.e. array('node' => array(1), 'user' => array(32)); at which point there's not the need for minimum cache lifetime at all any more.

carlos8f’s picture

@catch, that as along the lines of what I was thinking too. Seeing as this is a flag-oriented tag instead of an ID-oriented one, maybe we could do:

$tags = array('flags' => array('content'));

I like the idea of having a default expiration for all cache entries (configurable, probably default to 2 weeks or something), so expire() will garbage-collect old entries after a while, regardless.

We could just throw out CACHE_PERMANENT and CACHE_TEMPORARY in favor of tagging. Having a cache retain data permanently never seemed to make sense. CACHE_TEMPORARY is only used for page and block caches in core. There's never even been a way to set an expiration time for page and block caches in Drupal. Instead we have minimum lifetime, which is a pile of hacks :) Also the ancient hack of cache_clear_all() with no arguments clearing page and block... A flag in the cache tags would be much cleaner.

carlos8f’s picture

but the next step after that would be replacing array('content' => TRUE) with more granular cache tagging and clearing (i.e. array('node' => array(1), 'user' => array(32));

I think "general wipes" will still have to be a fact of life, especially in the page cache, because of the problem of new content.

Example: you have a view. It gets cached in the page cache. New content is posted, which should display in the view. You can't invalidate the page cache by the new entity ID, because it didn't exist when the page cache was set.

Therefore, the new data can only be seen in the view by waiting out the cache_lifetime, clearing a specific cid (or wildcard), employing session-specific cache bypass (such as minimum_lifetime does), or issuing a general wipe (CACHE_TEMPORARY or cache tag flag "content"). In the case of page cache, the only method *not already employed* is clearing specific cid's, but it may be impossible to enumerate the relevant cid's in the case of a page cache.

Cache tags can do two things for us then: allow cache relating to existing, enumerable content to be invalidated (block displaying user details on a profile node, for example), and provide a very efficient method of mass-invalidating data (basically, a faster, less hard-coded general wipe).

moshe weitzman’s picture

@carlos8f - perhaps we should treat new content as a tag also. for example, an View of recent articles might get tagged with 'new-article' and then when a new article node is posted, we clear flush all items with the following tags: node-[NID], new-node, new-article.

catch’s picture

We discussed new content a bit in #636454: Cache tag support.

I think it's possible (although possibly not in core) to provide a mechanism where we have a naming scheme for new content cache tags.

To take core content listings:

/blog - clear this when 'node type blog' is posted.

/node - clear this when 'node promoted'

/taxonomy/term/n - clear this when 'node_in_taxonomy_term_n'.

So you'd need to define a tag for the content listing based on the criteria, then at the other end, clear that tag when new content is posted (i.e in hook_entity_insert()).

Then if we have a way for the page cache to inherit tags from child elements (which we sort of have with #attached), we can use this for new content as well.

If we don't manage to do that in core and are doing similar to cache_clear_all() still, then at the least we should make sure there's a good way to opt out of that behaviour (which is not the case now in D7) so that custom code or contrib can attempt it.

For things that really do just want to clear the entire page cache, then isn't cache('page')->flush() enough?

carlos8f’s picture

@moshe, interesting idea, it works for simplistic cases at least :)

It leaves a granularity question, i.e. should we go for tagging things with everything under the sun, or keep it simple and err on clearing more of the cache.

A 'node promoted' tag might only make sense to invalidate the front page, and only then if you're using the default front page handler. A 'node' tag could probably substitute. A 'node_in_taxonomy_term_n' tag seems overly specific, so more likely a node is tagged with every one of its taxonomy terms, and those term tags are also invalidated when the node is inserted, which invalidates /taxonomy/term/n page cache as well.

Lots of possibilities, and with greater power is greater responsibility to keep it all organized.

moshe weitzman’s picture

Yeah, some interesting questions there. You tag example makes sense to me.

It would also be good to think of ways that a performance architect can measure how well she is doing with cache hit rates. We are building a dynamic system here where the site is constantly changing underneath the architect's nose. We always have this in web dev, but its getting a bit worse.

We already have query logging in DBTNG. Perhaps cache system could offer similar system for forensics and auditing.

catch’s picture

those term tags are also invalidated when the node is inserted

Well this is where it gets tricky.

If the terms are rendered with the node, we might want to invaidate cached node rendering when terms are updated. So it would make sense to tag the nodes with the terms.

However if we then clear tags every time a node is inserted with those terms, we're unnecessarily invalidating potentially hundreds of thousands of rendered nodes with that term (even though we only care if the term itself is edited).

So while the naming convention is annoying, I think there's value (at least for custom code) to make tags which match common listing properties, rather than trying to re-use tags that are otherwise used to identify rendered HTML which contains items being identified.

moshe weitzman’s picture

FileSize
22.3 KB

Attached patch humbly implements #12 as I think thats a good next step. cache_clear_all() is gone, and in its place (for now) is cache_clear_content() which clears the content tag. CACHE_TEMPORARY is gone as well.

This patch isn't passing all tests. Help wanted. I may not get back to this for a while.

Anonymous’s picture

Status: Needs work » Needs review

so the bot can tell me that is failing.

Status: Needs review » Needs work

The last submitted patch, clear_content.patch, failed testing.

moshe weitzman’s picture

Status: Needs work » Needs review
FileSize
22.25 KB

Now rolled against current HEAD

Status: Needs review » Needs work

The last submitted patch, clear_content.patch, failed testing.

moshe weitzman’s picture

Anyone available to work on this?

Anonymous’s picture

Status: Needs work » Needs review
FileSize
22.71 KB

so close, but so far.

moshe's patch just needed to pass on an elements cache tags in drupal_render_cache_set(). attached patch does that, making block cache tests pass locally. lets see what the bot thinks.

Anonymous’s picture

and the same thing for drupal_page_set_cache() - need to make sure to pass on tags.

Status: Needs review » Needs work

The last submitted patch, 730060-27-cache-clear-content.patch, failed testing.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
24.08 KB

removed the failing test, as it was testing CACHE_TEMPORARY items which we've removed.

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

I reviewed the code and it looks good. All tests pass so RTBC.

Thanks a lot for finishing this off.

sun’s picture

Status: Reviewed & tested by the community » Needs work
+++ b/core/includes/cache.inc
@@ -69,17 +69,9 @@ function cache_get_backends() {
+function cache_clear_content($debug = FALSE) {

Stale $debug argument.

+++ b/core/modules/system/tests/cache.test
@@ -339,40 +339,6 @@ class CacheClearCase extends CacheTestCase {
-    $this->setupLifetime(300);

CacheTestCase::setupLifeTime() is obsolete and can be removed.

Let's also remove all instances of system.performance:cache_lifetime throughout core.

+++ b/core/modules/user/user.admin.inc
@@ -738,7 +738,7 @@ function user_admin_permissions_submit($form, &$form_state) {
   // Clear the cached pages and blocks.
-  cache_clear_all();
+  cache_clear_content();

I'm still not sure why this rename (API change) has to be in this patch, but if others agree with it, then fine.

moshe weitzman’s picture

cache_clear_all() was renamed to more clearly identify what it does. It is code cleanup, since I gutted that function anyway. Hopefully thats clear.

tim.plunkett’s picture

For the reroll:

+++ b/core/includes/common.incundefined
@@ -108,7 +108,7 @@ const HTTP_REQUEST_TIMEOUT = -1;
+ * The block cache is cleared in cache_clear_content(), and uses the same clearing

This goes over 80 characters and needs to be rewrapped.

+++ b/core/modules/update/update.moduleundefined
@@ -843,7 +843,7 @@ function _update_cache_clear($cid = NULL, $wildcard = FALSE) {
+ * on sites that use memcache where cache_clear_content() won't know how to purge

This looks over too.

catch’s picture

Hmm I'd actually be tempted to remove cache_clear_all()/cache_clear_content() altogether and use cache_invalidate() instead, since eventually we'll want to replace this with specific tags, but doesn't have to be done here anyway.

- * on sites that use memcache where cache_clear_all() won't know how to purge
+ * on sites that use memcache where cache_clear_content() won't know how to purge

That comment is completely stale, should be cache() something instead.

Anonymous’s picture

i'm open to #34, this patch is really just one of many to integrate tag support.

catch’s picture

Status: Needs work » Needs review
FileSize
29.62 KB

Here's a re-roll, should cover everything in #31, #33 and I did #34 (it's just a find and replace if we want to put it back again, apart from the cache_clear_all() removal hunk).

Patch is bigger 'cos it removes lots of stuff 29 files changed, 51 insertions(+), 197 deletions(-).

Status: Needs review » Needs work

The last submitted patch, tags.patch, failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
30.2 KB

Too many arrays!

Status: Needs review » Needs work

The last submitted patch, tags.patch, failed testing.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
30.2 KB

fixed $period typo in system_performance_settings(), should be all green now.

Status: Needs review » Needs work

The last submitted patch, 730060-40-no-cache-temp-for-you.patch, failed testing.

Anonymous’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 730060-40-no-cache-temp-for-you.patch, failed testing.

Anonymous’s picture

Status: Needs work » Needs review
FileSize
30.84 KB

ok, so that was not the bot's fault, i just failed at attention to detail.

this patch takes out the CACHE_TEMPORARY bits of CacheInstallTestCase.

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

Green

bigjim’s picture

One thing that would be good here would be the ability pass custom tags, especially from Views, from hook_block_info. To better support custom Cache plugins. I assume that should be another ticket?

catch’s picture

Custom cache tags for content is the next thing on the agenda, but we don't have an explicit issue for that, it was discussed a fair bit in #636454: Cache tag support though. We should open the follow-up to work on that definitely.

I think at least to start with, we'd want to add this to the drupal_render() cache, in a similar way to #attached is handled now, so:

- add cache tags to the renderable array to be cached
- bubble up cache tags to higher level render arrays in a similar way to what is currently done for #attached. So tags on a node_view() would also end up tagging the overall page cache entry, at least until those start getting treated as separate requests.

I'm not sure this will work in hook_block_info() though, I think this is more of a replacement for something like views content cache - so when the view is actually built/rendered you'd base the tags from the content then.

bigjim’s picture

@catch

"I'm not sure this will work in hook_block_info() though, I think this is more of a replacement for something like views content cache - so when the view is actually built/rendered you'd base the tags from the content then."

Yea that's pretty much what I envisioned, I'm working on something similar using the D7 cache_tags module.

msonnabaum’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
30.98 KB

Quick re-roll. Location of locale tests moved.

msonnabaum’s picture

Status: Needs review » Reviewed & tested by the community

We'll definitely need followups for most of the cases where we're clearing on the "content" tag, but this is a necessary first step that will enable us to work on those issues.

Looks great to me.

aspilicious’s picture

Status: Reviewed & tested by the community » Needs review
+++ b/core/includes/common.incundefined
@@ -108,10 +108,11 @@ const HTTP_REQUEST_TIMEOUT = -1;
+ * The block cache is cleared when the 'content' cache tag is invalidated, ¶

trailing nasty whitespace

Don't want to push this to needs work for this

2 days to next Drupal core point release.

aspilicious’s picture

Status: Needs review » Reviewed & tested by the community
Dries’s picture

I feel this issue lacks more in depth discussion about why this will lead to performance improvements. It seems mostly hypothetical given no one has done any benchmarking. In fact, I feel slightly uneasy about removing these things for those who don't have memcache etc. It is not clear if this will introduce any performance regressions for them. Long story short, I wonder if we truly understand the ramifications of this patch; I don't feel like I do after reading all comments in this issue.

That said, I'd be happy to defer to catch for this patch. He has been very active in it and has been a driving force behind Drupal's cache system.

I think this also warrants an entry in CHANGELOG.txt.

catch’s picture

Issue tags: +revisit before beta

@Dries, benchmarks were done in #636454: Cache tag support at some length. There is not really a simple benchmark that could measure the changes introduced in this specific issue.

I'll try to summarize what the implications of the patch are:

- CACHE_TEMPORARY is gone. This means that #891600: Page and block caches are cleared the same way on cron as on content change can be moved to 7.x, since we will no longer clear the page and block caches every cron run any longer. For sites that do not have content updates more frequently than cron runs, they should see much, much better cache hit rates.

- minimum cache lifetime is also gone. If you have frequent content updates, and previously had a long minimum cache lifetime set, then you'll see worse performance here, since there's no longer an attempt in core to rate-limit the cache flushes. However you also skip an expensive variable_set() every time a comment or node is updated which has been a serious problem on some sites I've profiled.

- if we don't do any follow-up work after this, then it is still entirely possible for contrib cache backends to implement minimum cache lifetime or similar themselves. It would also be possible to write a cache backend that just puts a short expires value on tagged cache items and doesn't bother actually implementing tagging too.

- We can only really justify the removal of minimum cache lifetime though, if we also do #1605290: Enable entity render caching with cache tag support which is the next stage of getting this change applied to things. I'm personally intending to spend some time on that issue and I know others are as well, however we should definitely tag this issue with 'revisit before release' in case that doesn't get done for some reason.

Not committing this yet in case you want to discuss this some more, but I'll commit it at some point towards the end of the week if there's no reply since we'll need it to properly work on the follow-ups.

pillarsdotnet’s picture

Pasted Catch's summary from #54 into a Change Notice. Please review and expand.

sun’s picture

Issue tags: -revisit before beta

Status: Reviewed & tested by the community » Needs work
Issue tags: +revisit before beta

The last submitted patch, 730060-49-no-cache-temp-for-you.patch, failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
31.36 KB

Re-rolled, only conflict was PSR-0 test conversions.

catch’s picture

Status: Needs review » Reviewed & tested by the community

Back to RTBC.

aspilicious’s picture

+++ b/core/includes/common.incundefined
@@ -110,10 +110,11 @@ const HTTP_REQUEST_TIMEOUT = -1;
+ * The block cache is cleared when the 'content' cache tag is invalidated, ¶

trailing whitespace

21 days to next Drupal core point release.

oriol_e9g’s picture

FileSize
31.36 KB

Trailing whitespace fix, no credit please.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, cache_tags_9.patch, failed testing.

aspilicious’s picture

Status: Needs work » Needs review
FileSize
33.42 KB

Update module conflicted

catch’s picture

Issue tags: -revisit before beta

#63: cache_tags_63.patch queued for re-testing.

Anonymous’s picture

Issue tags: +revisit before beta

#63: cache_tags_63.patch queued for re-testing.

Anonymous’s picture

Status: Needs review » Reviewed & tested by the community

ok, so i guess this is RTBC.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

catch, thanks for the summary in #54. That was helpful.

I committed the patch to 8.x but I agree that we need to revisit this before the release. As far as I can tell, this patch reduces complexity but is not guaranteed to yield better performance, even after we completed #1605290: Enable entity render caching with cache tag support.

aspilicious’s picture

Can we have a change notification?

catch’s picture

Title: Replace CACHE_TEMPORARY, cache_clear_all() (no arguments) and minimum cache lifetime with cache tags support » Change notification for: Replace CACHE_TEMPORARY, cache_clear_all() and minimum cache lifetime with cache tags support
Category: bug » task
Priority: Normal » Critical
Status: Fixed » Active
pillarsdotnet’s picture

There already is a change notification, but it probably should be edited / improved.

tim.plunkett’s picture

tim.plunkett’s picture

This issue calls for a new change notice.
There are already two: Cache tag support added and Cache Tag Support, with varying levels of accuracy and detail.

I wrote the first one, but I am not confident on merging/updating/adding a new one. Can someone who worked on this issue help with that?

aspilicious’s picture

And I also found this change notidication, that probably needs a different example now: http://drupal.org/node/1362360

catch’s picture

OK so these are the differences, I'm not really sure where to put these in which change notice, so pasting here for now:

Clearing cached HTML (currently the page, block and render caches).

 -    cache_clear_all();
+    cache_invalidate(array('content' => TRUE));

Setting cached HTML:

 -  cache($bin)->set($cid, $data, CACHE_TEMPORARY);
 +  cache($bin)->set($cid, $data, CACHE_PERMANENT, array('content' => TRUE));

(you might also want to set an explicit TTL instead of using CACHE_PERMANENT).

Also cache backends no longer have to worry about the concept of minimum cache lifetime, although it could optionally be re-introduced as a backend-specific feature.

tim.plunkett’s picture

Status: Active » Needs review

@aspilicious good find! Fixed that.

And I updated http://drupal.org/node/1534648.

tim.plunkett’s picture

Title: Change notification for: Replace CACHE_TEMPORARY, cache_clear_all() and minimum cache lifetime with cache tags support » Replace CACHE_TEMPORARY, cache_clear_all() and minimum cache lifetime with cache tags support
Category: task » bug
Priority: Critical » Normal
Status: Needs review » Fixed

Talked with @msonnabaum, and added more details about CACHE_TEMPORARY to the issue summary.

Status: Fixed » Closed (fixed)
Issue tags: -revisit before beta

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous’s picture

Issue summary: View changes

Minor spelling correction and formatting improvements.

catch’s picture

Issue summary: View changes
Issue tags: -revisit before beta

'content' cache tag is on its way out, which was the only thing to revisit here.