Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
Problems with the current system:
- Any code does
['foo' => TRUE']
, then any other code no longer can do['foo' => [35 => 35]]
. We've had core bugs because of that. - If the
$tags
argument ofdrupal_merge_cache_tags()
has something like['entity_test' => [0 => '1']]
and the$other
argument has['entity_test' => [1 => '1']]
then the result will contain the value'1'
twice.
Proposed resolution
Use strings rather than arrays.
i.e. change from:
$tags = array(
'my_custom_tag' => TRUE,
'node' => array(1, 3),
'user' => array(7),
);
to:
$tags = array(
'my_custom_tag',
'node:1',
'node:3',
'user:7',
);
This was raised in #2254181-39: Comment indentation is incorrect for comments following a replying-comment: don't render cache comments for which threading is enabled.
Remaining tasks
Roll patch + review.
User interface changes
None.
API changes
- Cache tags should be strings. An exception is now thrown when they're not, to improve the DX. This happens when using
::set()
or::setMultiple()
on a cache back-end and when usingCache::invalidateTags()
orCache::deleteTags()
. i.e.: always. Verified with test coverage. - Cache tag merging should now be done using
Cache::mergeTags()
rather than withNestedArray
,array_merge()
ordrupal_merge_cache_tags()
. Cache::buildTags()
is provided for the rare use cases where you manually need to build cache tags (only 3 cases in core, but likely more in contrib).HtmlViewSubscriber::convertCacheTagsToHeader()
andHtmlViewSubscriber::convertHeaderToCacheTags()
have been removed.
Comment | File | Size | Author |
---|---|---|---|
#48 | 2340123_48.patch | 552 bytes | Mile23 |
#38 | cache_tags_as_strings-2340123-38.patch | 182.01 KB | Wim Leers |
Comments
Comment #1
catchI'd be OK with this.
The idea for allowing a prefix + array was for things like array('node' => array_keys($nodes));. Now that we have getCacheTag() on entities, that use case doesn't really exist as much, and the 'flattening' of cache tags has come up in a couple of places already as an issue.
Comment #2
BerdirI'm not sure right now, if you have cache tags like 'node' => array(1, 2, 3), do we support to clear all them?
I'll also leave znerol a ping, he has written a redis cache backend, want to make this is not a problem for him.
Comment #3
catchNope, it's just convenience for setting them. They get stored/treated as individual tags in the implementations.
Comment #4
Wim LeersThis passes all unit tests,
RenderTest
and all cache tags tests.I'm definitely going to have missed some things, so letting testbot tell me where.
It's already possible to review this though.
74 files changed, 279 insertions(+), 444 deletions(-)
Comment #6
Wim LeersChasing HEAD.
Comment #8
Wim LeersSigh, the last-minute cleanup I did caused that syntax error. That'll teach me.
In doing so, I also ran
BlockViewBuilderTest
and made that one pass.Comment #10
znerol CreditAttribution: znerol commentedRe #3: The redis module has a copy of
flattenTags()
which converts the ones containing an array to strings. After this patch we simply can remove this method and everything will be fine.Comment #11
Wim LeersThis should be green, or close to green.
The majority of the failures were hidden in dark corners of Drupal. Debugging that is very, very painful. This is precisely why I want more strict cache tags: because then you can validate the cache tags to easily surface problems. The only reason I've managed to get it green, is by adding such validation to key parts: setting cache items, merging tags, invalidating tags, deleting tags.
Comment #13
Wim LeersI got sick of the red patches, so I iterated in a helper issue until it was green (sorry for the noise! :(). This will be green.
Comment #14
damiankloip CreditAttribution: damiankloip commentedMaybe we should keep an implementation of flattenTags so people can use it if they are not doing
$entity->getCacheTags()
/working with entities?Comment #15
Wim LeersBut where would that live? I don't see a good place for such a helper method, which is only rarely going to be useful?
Comment #16
BerdirIn general, this seems to simplify things a lot. Only not sure about some places that are now using merging that doesn't seem necessary.
This looks strange, why are we doing that conditional? if tags is not set, wouldn't this cause a notice?
How often does this happen? Is it worth to build something like Cache::buildCacheTags($prefix, array $ids) ?
Here we don't do it conditionally.
Do we really need to make it so complicated everywhere? It might be merged anyway somewhere, so why bother?
Wrong cache tag deletion. A few times in here.
nice trick with the comment ;)
Missing something here.
not sure if this is supposed to stay, needs to be formatted properly then.
Comment #17
Wim Leers#16:
Yep, this does simplify things a lot :)
$element['#cache']['tags']
, which might be empty, so we default it to the empty array. The merging happens unconditionally.That being said… since #2273277: Figure out a solution for the problematic interaction between the render system and the theme system when using #pre_render, it is guaranteed that
$element['#cache']['tags']
is set, so we can remove this conditional. Simplified now!That being said… I changed it to use the exact same code pattern as in
drupal_render_cache_set()
. Consistency FTW.$build['#cache']['tags']
is therefore also an array of strings, andis a silly prefix,
book
would be better, but there's nothing "wrong" about it?Also wrote the needed test coverage.
Comment #18
Berdir1. Yes, it was the same logic, but the logic was strange, I didn't understand what the conditional was for? It was not a notice-safe (empty/isset) safe check, so the only case where it could fire if tags is set to FALSE/NULL/empty string and I'm not sure why we support that? If the array tags would not be set, then it would trigger a notice?
Anyway, looking at the interface, I see that you removed the check, so looks fine to me now.
2. I'm not sure, but as @catch said, this was the main reason we went with the current structure in the first place, to avoid code like that. Having a helper method for that would be a nice counter-argument if someone says the old API was easier for those use cases. Agreed that just three use cases is not much.
4.
Now we have:
And we could also do
Which seems a lot easier to me and merging there seems pointless, as a) it is only test code and we know that it is unique and b) If it would not be, then it could also get duplicated in a different part of the render array, so why bother to make it unique?
I don't care that much (again) but I'm pointing it out for two reasons: a) There are other conversions where we don't explicitly merge together, it's not clear from looking at the patch when you should do it and when not, so we should have clear rules what you need to do. b) It's example where the after looks more complicated than HEAD, that's something reviewers/committers could push back?
Comment #19
Wim Leers#18.1: Hrm, you're right. Must've been a leftover from refactoring then :)
#18.4: Good points. The main reason for this is simply this: I first converted cache tags themselves, then I converted cache tag merging. So only at a relatively late stage, I realized that it would be easier to understand cache tags if I'd consistently use
Cache::mergeTags()
. Clearly, you're right in saying that merging seems unnecessarily complex in the example you cited, and in other similar cases.The general rule (now documented) is this: if you have two arrays of cache tags, use
Cache::mergeTags()
. If you just have to add a single cache tag string, then use$cache_tags[] = $cache_tag_string
. But almost always, you have methods that return an array of cache tags, so you'll almost always seeCache::mergeTags()
.However, in the case of page cache tags, there is also a certain necessity: due to HEAD's usage of array keys, it was effectively impossible to have a duplicate of
array('rendered' => TRUE)
— this is not the case witharray(0 => 'rendered')
. Of course, this should not be a concern of anything except the caching logic itself, so I did that in this reroll, but… we at least need aarray_unique()
inDefaultHtmlFragmentRenderer
to remove duplicate cache tags, because these cache tags are also used by reverse proxies, where Drupal's cache layer (the page cache system) is not involved at all!So, to address this, I:
Cache::mergeTags()
and changed the cited example and similar ones to use$tags[] = $string
in favor of$tags = Cache::mergeTags($tags, [$string])
Cache::validateTags($tags)
calls in the various cache back-ends to$tags = Cache::mergeTags($tags, [])
, which removes duplicatesNestedArray::merge
in all of Drupal core and discovered 3 remaining calls that needed to be replaced withCache::mergeTags()
.use
statements that I'd forgotten to remove in previous rerolls.Comment #21
Wim LeersAs expected, a few test failures after the changes in #19. Easy fixes.
Comment #22
BerdirI went through most of the changes here, it's a few patterns repeated many times.
I think I've been the only one doing code reviews here, so I'm a bit careful about RTBC'ing myself. But I'm +1 on that. So if someone else could have a look over it, then I think this can go in.
We have a number of critical overlapping/related cache tag issues, and no matter in which direction we're moving, they are going conflict badly, so best option is just to get them in one by one and then re-roll the others. In that light, I'll RTBC in the next days if nobody else does.
Comment #23
effulgentsia CreditAttribution: effulgentsia commentedLooks great to me.
Comment #25
damiankloip CreditAttribution: damiankloip commentedAnother an empty array just for the validation provided from mergeTags is not intuitive or obvious. Seems like a strange pattern. Shouldn't that just validate the tags?
Seems crazy to remove flattenTags to just create a follow up to add pretty much the same thing back in, but ok...
Comment #26
Wim LeersChasing head. Same number of deletions and insertions, only slightly bigger because one hunk was moved to a new file in HEAD, so this patch had to follow.
Comment #27
catch#25 looks it needs fixing.
Unless I'm missing something Cache::mergeTags() will work fine with a single array, and validateTags() is public so could be used.
Comment #28
BerdirvalidateTags() was replaced with mergeTags() as that also ensures uniqueness, I guess we could also make validateTags() do that?
The method that I suggested has nothing to do with current flattenTags(), tags are already flattened, all the time, everywhere.
What I suggested is a method to build array('node:1', 'node:2') from $ids = array(1, 2), something like Cache::buildTags($prefix = 'node', $ids = array(1, 2)).
should be pretty easy to add that here, but I'm also OK with a follow-up.
Comment #30
Wim Leers#25/27/28: great points. Talked to catch in IRC. I used
::mergeTags()
mainly because it removes duplicates, and none of our cache back-ends have access to a "set" primitive, hence we have to remove duplicates in all of our cache back-ends. The other reason I used it is because it sorts, and the sorting is just nice for DX: it makes it easier when debugging. But it's only actually useful in the DB cache back-end (used by default for most cache bins, hence will be used by *many* sites, just like in D7) and the memory back-end (which is used by default in allKernelTestBase
tests).So I replaced all of those
::mergeTags()
calls witharray_unique()
(which again stresses the cache tags are nothing special anymore, they're just an array of strings, so the standard functions can be used). And for better DX, I added asort()
in the DB and Memory back-ends, because only there it's actually useful.For good measure, I also updated
GenericCacheBackendUnitTestBase
, so that it now verifies that::set()
and::setMultiple()
of each cache back-end indeed validates cache tags. I've also added tests to ensure thatCache::invalidateTags()
andCache::deleteTags()
do the same.I've also updated the doxygen to say
string[] $cache_tags
rather thanarray $cache_tags
.With that, we now have a highly consistent, fully tested, very strict, very simple mechanism for cache tags, which makes the DX much better.
See
interdiff_fixes.txt
for just this part.#28: RE
buildTags()
: agreed that it's easier to do here than delay further by continue to comment about it here, so: done!See
interdiff_buildcachetags.txt
for just this part.Comment #31
Wim LeersUpdating the IS.
Comment #33
damiankloip CreditAttribution: damiankloip commentedNice, thanks Wim.
It doesn't make too much difference but do you think an array_map implementation would be better? functionally no difference. So prob doesn't matter.
Do you think we should throw our own message if e.g. an object is passed in? Otherwise it will throw a (recoverable?) fatal error and we'll get our friend the Drupal 'The website has encountered an error. Please try again later.' page?
Comment #34
Wim LeersI forgot the silliest detail during rebasing, hence the failures… :(
Discussed in IRC, we decided it's not necessary.
Comment #36
Wim LeersQuite often, I hate PHPStorm. This is one of those times.
(Because PHPStorm once again froze, what I typed ended up in the source code and hence ended up in the patch. But not in the interdiff.)
Interdiff of #34 still applies.
Comment #38
Wim LeersChasing HEAD after #2254181: Comment indentation is incorrect for comments following a replying-comment: don't render cache comments for which threading is enabled landed + fixed the 3 fails.
Silliest interdiff ever, for stupidest copy/paste mistake ever, introduced in #30:
Comment #39
damiankloip CreditAttribution: damiankloip commentedRe berdir in #28; yes, not the same is fine but it is a substitute for flatten tags as you have a helper to create the strings for you? Just using a prefix instead of an array key...
Comment #40
BerdirOk, we're green again and I think good to go, if @damiankloip is happy with the interdiff from #30.
Re: the function: Yes, both result in strings, but it is very different from before because the context it is used in and who is using it is completely different. So, IMHO, we're not replacing flattenTags() with buildTags(), we're removing flattenTags() because it is no longer useful and are adding a new method for a new use case :) But that's a detail, apparently we're both happy with buildTags() as it is now.
Comment #41
catchRight the problem that both were trying to solve is the same (having to loop over entity IDs to build cache tag strings manually), but we've taken it out of the cache storage API, and discovered it's less of a problem after applying cache tags everywhere than was expected when cache tags originally went in.
Patch looks great now, committed/pushed to 8.0.x.
Comment #42
Wim LeersYAY!
Next: rerolling #2304987: Don't invalidate cache tags of referenced entities, use entity list cache tags correctly, add test coverage for entity list cache tags and getting that one to RTBC.
Comment #43
Wim LeersFiled a tiny follow-up to update
core.api.php
's Cache API documentation, which I forgot to do here: #2344691: Update core.api.php's Cache API documentation: cache tags are now set as strings.Comment #44
Fabianx CreditAttribution: Fabianx commentedThanks Wim, This is great!
Comment #47
YesCT CreditAttribution: YesCT commentedlooks like a copy and paste error got in here.... check out #2364555-3: Add @covers annotation, fix some --strict for PHPUnit
Comment #48
Mile23Re-opening this as a followup. Apparently this issue introduced a copy-paste error in
core/tests/Drupal/Tests/Core/Cache/CacheTest.php
, wheretestInvalidateTags()
is really testingdeleteTags()
.Here's a teeny tiny patch to rectify this.
I've moved this change out of the newest patch in #2364555: Add @covers annotation, fix some --strict for PHPUnit.
Comment #49
YesCT CreditAttribution: YesCT commentedcommitters might prefer a separate issue for this follow-up... but maybe not. the correction looks fine to me though.
Comment #50
Wim LeersGood catch — sorry about that!
Comment #51
alexpottThe followup is fixing a bug which is why a new issue is ALWAYS preferable to doing a followup. I can't change the category because the original was a task and not a bug and that is the majority of the work. If we weren't committing major tasks right now then I have to spend time on this issue explaining why this is an exception. So people please please please open followups as new issues and put a comment on the old issue and relate the two issues.
Committed e95124d and pushed to 8.0.x. Thanks!
Comment #53
Mile23OK, will make followup next time. Thanks.