Problem/Motivation

Problems with the current system:

  1. Any code does ['foo' => TRUE'], then any other code no longer can do ['foo' => [35 => 35]]. We've had core bugs because of that.
  2. If the $tags argument of drupal_merge_cache_tags() has something like ['entity_test' => [0 => '1']] and the $other argument has ['entity_test' => [1 => '1']] then the result will contain the value '1' twice.

Proposed resolution

Use strings rather than arrays.

i.e. change from:

$tags = array(
  'my_custom_tag' => TRUE,
  'node' => array(1, 3),
  'user' => array(7),
);

to:

$tags = array(
  'my_custom_tag',
  'node:1',
  'node:3',
  'user:7',
);

This was raised in #2254181-39: Comment indentation is incorrect for comments following a replying-comment: don't render cache comments for which threading is enabled.

Remaining tasks

Roll patch + review.

User interface changes

None.

API changes

  1. Cache tags should be strings. An exception is now thrown when they're not, to improve the DX. This happens when using ::set() or ::setMultiple() on a cache back-end and when using Cache::invalidateTags() or Cache::deleteTags(). i.e.: always. Verified with test coverage.
  2. Cache tag merging should now be done using Cache::mergeTags() rather than with NestedArray, array_merge() or drupal_merge_cache_tags().
  3. Cache::buildTags() is provided for the rare use cases where you manually need to build cache tags (only 3 cases in core, but likely more in contrib).
  4. HtmlViewSubscriber::convertCacheTagsToHeader() and HtmlViewSubscriber::convertHeaderToCacheTags() have been removed.
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

catch’s picture

I'd be OK with this.

The idea for allowing a prefix + array was for things like array('node' => array_keys($nodes));. Now that we have getCacheTag() on entities, that use case doesn't really exist as much, and the 'flattening' of cache tags has come up in a couple of places already as an issue.

Berdir’s picture

I'm not sure right now, if you have cache tags like 'node' => array(1, 2, 3), do we support to clear all them?

I'll also leave znerol a ping, he has written a redis cache backend, want to make this is not a problem for him.

catch’s picture

I'm not sure right now, if you have cache tags like 'node' => array(1, 2, 3), do we support to clear all them?

Nope, it's just convenience for setting them. They get stored/treated as individual tags in the implementations.

Wim Leers’s picture

Issue summary: View changes
Status: Active » Needs review
Issue tags: +Needs tests
FileSize
108.81 KB

This passes all unit tests, RenderTest and all cache tags tests.

I'm definitely going to have missed some things, so letting testbot tell me where.

It's already possible to review this though.

74 files changed, 279 insertions(+), 444 deletions(-)

Status: Needs review » Needs work

The last submitted patch, 4: cache_tags_as_strings-2340123-4.patch, failed testing.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
108.8 KB

Chasing HEAD.

Status: Needs review » Needs work

The last submitted patch, 6: cache_tags_as_strings-2340123-6.patch, failed testing.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
112.75 KB
5.04 KB

Sigh, the last-minute cleanup I did caused that syntax error. That'll teach me.

In doing so, I also ran BlockViewBuilderTest and made that one pass.

Status: Needs review » Needs work

The last submitted patch, 8: cache_tags_as_strings-2340123-8.patch, failed testing.

znerol’s picture

Re #3: The redis module has a copy of flattenTags() which converts the ones containing an array to strings. After this patch we simply can remove this method and everything will be fine.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
161.26 KB
54.47 KB

This should be green, or close to green.

The majority of the failures were hidden in dark corners of Drupal. Debugging that is very, very painful. This is precisely why I want more strict cache tags: because then you can validate the cache tags to easily surface problems. The only reason I've managed to get it green, is by adding such validation to key parts: setting cache items, merging tags, invalidating tags, deleting tags.

Status: Needs review » Needs work

The last submitted patch, 11: cache_tags_as_strings-2340123-12.patch, failed testing.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
165.73 KB
5.43 KB

I got sick of the red patches, so I iterated in a helper issue until it was green (sorry for the noise! :(). This will be green.

damiankloip’s picture

Maybe we should keep an implementation of flattenTags so people can use it if they are not doing $entity->getCacheTags()/working with entities?

Wim Leers’s picture

But where would that live? I don't see a good place for such a helper method, which is only rarely going to be useful?

Berdir’s picture

In general, this seems to simplify things a lot. Only not sure about some places that are now using merging that doesn't seem necessary.

  1. +++ b/core/includes/common.inc
    @@ -3152,8 +3117,7 @@ function drupal_render_cache_set(&$markup, array $elements) {
    -  $cache_tags = $elements['#cache']['tags'] ?: array();
    -  $cache_tags['rendered'] = TRUE;
    +  $cache_tags = Cache::mergeTags($elements['#cache']['tags'] ?: [],  ['rendered']);
    

    This looks strange, why are we doing that conditional? if tags is not set, wouldn't this cause a notice?

  2. +++ b/core/lib/Drupal/Core/Menu/MenuTreeStorage.php
    @@ -193,7 +193,11 @@ public function rebuild(array $definitions) {
    -    Cache::invalidateTags(array('menu' => $affected_menus));
    +    $cache_tags = [];
    +    foreach ($affected_menus as $affected_menu) {
    +      $cache_tags[] = 'menu:' . $affected_menu;
    +    }
    +    Cache::invalidateTags($cache_tags);
    

    How often does this happen? Is it worth to build something like Cache::buildCacheTags($prefix, array $ids) ?

  3. +++ b/core/lib/Drupal/Core/Page/DefaultHtmlFragmentRenderer.php
    @@ -72,7 +73,7 @@ public function render(HtmlFragmentInterface $fragment, $status_code = 200) {
    -      $tags['rendered'] = TRUE;
    +      $tags = Cache::mergeTags($tags, ['rendered']);
           $page->setCacheTags($tags);
    

    Here we don't do it conditionally.

  4. +++ b/core/modules/block/tests/modules/block_test/block_test.module
    @@ -27,7 +28,7 @@ function block_test_block_view_test_cache_alter(array &$build, BlockPluginInterf
    -    $build['#cache']['tags'][\Drupal::state()->get('block_test_view_alter_cache_tag')] = TRUE;
    +    $build['#cache']['tags'] = Cache::mergeTags($build['#cache']['tags'], [\Drupal::state()->get('block_test_view_alter_cache_tag')]);
    

    Do we really need to make it so complicated everywhere? It might be merged anyway somewhere, so why bother?

  5. +++ b/core/modules/book/src/BookManager.php
    @@ -451,7 +451,7 @@ public function deleteFromBook($nid) {
    -    \Drupal::cache('data')->deleteTags(array('bid' => $original['bid']));
    +    \Drupal::cache('data')->deleteTags(array('bid:' . $original['bid']));
    

    Wrong cache tag deletion. A few times in here.

  6. +++ b/core/modules/comment/src/Tests/CommentDefaultFormatterCacheTagsTest.php
    @@ -101,17 +102,16 @@ public function testCacheTags() {
    -    $this->assertEqual($build['#cache']['tags'], $expected_cache_tags, 'The test entity has the expected cache tags when it has comments.');
    +    sort($expected_cache_tags);
    +    $this->assertEqual($build['#cache']['tags'], $expected_cache_tags); //, 'The test entity has the expected cache tags when it has comments.');
    

    nice trick with the comment ;)

  7. +++ b/core/modules/system/src/Form/ThemeSettingsForm.php
    @@ -426,10 +426,10 @@ public function submitForm(array &$form, FormStateInterface $form_state) {
    -      Cache::invalidateTags(array('theme' => $values['theme']));
    +      Cache::invalidateTags(array('theme:'));
    ...
         }
    

    Missing something here.

  8. +++ b/core/modules/system/src/Tests/Cache/PageCacheTagsTestBase.php
    @@ -14,7 +14,7 @@
     abstract class PageCacheTagsTestBase extends WebTestBase {
    -
    +protected $dumpHeaders = TRUE;
    

    not sure if this is supposed to stay, needs to be formatted properly then.

Wim Leers’s picture

Issue tags: -Needs tests
FileSize
168.8 KB
7.54 KB

#16:
Yep, this does simplify things a lot :)

  1. The logic is still the same as before? The first operand is $element['#cache']['tags'], which might be empty, so we default it to the empty array. The merging happens unconditionally.
    That being said… since #2273277: Figure out a solution for the problematic interaction between the render system and the theme system when using #pre_render, it is guaranteed that $element['#cache']['tags'] is set, so we can remove this conditional. Simplified now!
  2. There are only three cases in all of core. But your suggestions is similar to what damiankloip asked in #14. I don't think it's worth it, but if you both want something like that, and catch +1s it, I'm fine with it. I opened an issue for this: #2342241: Add Cache::buildCacheTags(string $prefix, array $ids)? — if catch +1s, I promise I'll get that done.
  3. See my reply to point 1, it really is the same. The difference is that at this point, we're certain that we're going to have cache tags.
    That being said… I changed it to use the exact same code pattern as in drupal_render_cache_set(). Consistency FTW.
  4. First, I don't think it's complicated; it's consistent. Second: cache tags are an array of strings; $build['#cache']['tags'] is therefore also an array of strings, and
    </code>
    <li>I don't see what's wrong? I think <code>bid

    is a silly prefix, book would be better, but there's nothing "wrong" about it?

  5. Fixed :) And yes, I use this frequently while debugging tests. :)
  6. Fixed.
  7. Removed. Was just for debugging.

Also wrote the needed test coverage.

Berdir’s picture

1. Yes, it was the same logic, but the logic was strange, I didn't understand what the conditional was for? It was not a notice-safe (empty/isset) safe check, so the only case where it could fire if tags is set to FALSE/NULL/empty string and I'm not sure why we support that? If the array tags would not be set, then it would trigger a notice?

Anyway, looking at the interface, I see that you removed the check, so looks fine to me now.

2. I'm not sure, but as @catch said, this was the main reason we went with the current structure in the first place, to avoid code like that. Having a helper method for that would be a nice counter-argument if someone says the old API was easier for those use cases. Agreed that just three use cases is not much.

4.
Now we have:

 $build['#cache']['tags'] = Cache::mergeTags($build['#cache']['tags'], [\Drupal::state()->get('block_test_view_alter_cache_tag')]);

And we could also do

$build['#cache']['tags'][] = \Drupal::state()->get('block_test_view_alter_cache_tag');

Which seems a lot easier to me and merging there seems pointless, as a) it is only test code and we know that it is unique and b) If it would not be, then it could also get duplicated in a different part of the render array, so why bother to make it unique?

I don't care that much (again) but I'm pointing it out for two reasons: a) There are other conversions where we don't explicitly merge together, it's not clear from looking at the patch when you should do it and when not, so we should have clear rules what you need to do. b) It's example where the after looks more complicated than HEAD, that's something reviewers/committers could push back?

Wim Leers’s picture

FileSize
172.25 KB
14.2 KB

#18.1: Hrm, you're right. Must've been a leftover from refactoring then :)

#18.4: Good points. The main reason for this is simply this: I first converted cache tags themselves, then I converted cache tag merging. So only at a relatively late stage, I realized that it would be easier to understand cache tags if I'd consistently use Cache::mergeTags(). Clearly, you're right in saying that merging seems unnecessarily complex in the example you cited, and in other similar cases.
The general rule (now documented) is this: if you have two arrays of cache tags, use Cache::mergeTags(). If you just have to add a single cache tag string, then use $cache_tags[] = $cache_tag_string. But almost always, you have methods that return an array of cache tags, so you'll almost always see Cache::mergeTags().
However, in the case of page cache tags, there is also a certain necessity: due to HEAD's usage of array keys, it was effectively impossible to have a duplicate of array('rendered' => TRUE) — this is not the case with array(0 => 'rendered'). Of course, this should not be a concern of anything except the caching logic itself, so I did that in this reroll, but… we at least need a array_unique() in DefaultHtmlFragmentRenderer to remove duplicate cache tags, because these cache tags are also used by reverse proxies, where Drupal's cache layer (the page cache system) is not involved at all!

So, to address this, I:

  1. inspected every call to Cache::mergeTags() and changed the cited example and similar ones to use $tags[] = $string in favor of $tags = Cache::mergeTags($tags, [$string])
  2. changed the Cache::validateTags($tags) calls in the various cache back-ends to $tags = Cache::mergeTags($tags, []), which removes duplicates
  3. inspected every instance of NestedArray::merge in all of Drupal core and discovered 3 remaining calls that needed to be replaced with Cache::mergeTags().
  4. Removed now-obsolete use statements that I'd forgotten to remove in previous rerolls.

Status: Needs review » Needs work

The last submitted patch, 19: cache_tags_as_strings-2340123-19.patch, failed testing.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
176.91 KB
8.06 KB

As expected, a few test failures after the changes in #19. Easy fixes.

Berdir’s picture

I went through most of the changes here, it's a few patterns repeated many times.

I think I've been the only one doing code reviews here, so I'm a bit careful about RTBC'ing myself. But I'm +1 on that. So if someone else could have a look over it, then I think this can go in.

We have a number of critical overlapping/related cache tag issues, and no matter in which direction we're moving, they are going conflict badly, so best option is just to get them in one by one and then re-roll the others. In that light, I'll RTBC in the next days if nobody else does.

effulgentsia’s picture

Status: Needs review » Reviewed & tested by the community

Looks great to me.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 21: cache_tags_as_strings-2340123-21.patch, failed testing.

damiankloip’s picture

+++ b/core/lib/Drupal/Core/Cache/MemoryBackend.php
@@ -112,7 +112,7 @@ public function set($cid, $data, $expire = Cache::PERMANENT, array $tags = array
+      'tags' => Cache::mergeTags($tags, []),

+++ b/core/lib/Drupal/Core/Cache/PhpBackend.php
@@ -138,8 +138,8 @@ public function set($cid, $data, $expire = Cache::PERMANENT, array $tags = array
+      'tags' => Cache::mergeTags($tags, []),

Another an empty array just for the validation provided from mergeTags is not intuitive or obvious. Seems like a strange pattern. Shouldn't that just validate the tags?

Seems crazy to remove flattenTags to just create a follow up to add pretty much the same thing back in, but ok...

Wim Leers’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
177.28 KB

Chasing head. Same number of deletions and insertions, only slightly bigger because one hunk was moved to a new file in HEAD, so this patch had to follow.

catch’s picture

Status: Reviewed & tested by the community » Needs work

#25 looks it needs fixing.

Unless I'm missing something Cache::mergeTags() will work fine with a single array, and validateTags() is public so could be used.

Berdir’s picture

validateTags() was replaced with mergeTags() as that also ensures uniqueness, I guess we could also make validateTags() do that?

Seems crazy to remove flattenTags to just create a follow up to add pretty much the same thing back in, but ok...

The method that I suggested has nothing to do with current flattenTags(), tags are already flattened, all the time, everywhere.

What I suggested is a method to build array('node:1', 'node:2') from $ids = array(1, 2), something like Cache::buildTags($prefix = 'node', $ids = array(1, 2)).

should be pretty easy to add that here, but I'm also OK with a follow-up.

The last submitted patch, 26: cache_tags_as_strings-2340123-26.patch, failed testing.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
181.9 KB
9.27 KB
4.76 KB
13.2 KB

#25/27/28: great points. Talked to catch in IRC. I used ::mergeTags() mainly because it removes duplicates, and none of our cache back-ends have access to a "set" primitive, hence we have to remove duplicates in all of our cache back-ends. The other reason I used it is because it sorts, and the sorting is just nice for DX: it makes it easier when debugging. But it's only actually useful in the DB cache back-end (used by default for most cache bins, hence will be used by *many* sites, just like in D7) and the memory back-end (which is used by default in all KernelTestBase tests).
So I replaced all of those ::mergeTags() calls with array_unique() (which again stresses the cache tags are nothing special anymore, they're just an array of strings, so the standard functions can be used). And for better DX, I added a sort() in the DB and Memory back-ends, because only there it's actually useful.

For good measure, I also updated GenericCacheBackendUnitTestBase, so that it now verifies that ::set() and ::setMultiple() of each cache back-end indeed validates cache tags. I've also added tests to ensure that Cache::invalidateTags() and Cache::deleteTags() do the same.
I've also updated the doxygen to say string[] $cache_tags rather than array $cache_tags.

With that, we now have a highly consistent, fully tested, very strict, very simple mechanism for cache tags, which makes the DX much better.

See interdiff_fixes.txt for just this part.


#28: RE buildTags(): agreed that it's easier to do here than delay further by continue to comment about it here, so: done!

See interdiff_buildcachetags.txt for just this part.

Wim Leers’s picture

Issue summary: View changes

Updating the IS.

Status: Needs review » Needs work

The last submitted patch, 30: cache_tags_as_strings-2340123-29.patch, failed testing.

damiankloip’s picture

Nice, thanks Wim.

It doesn't make too much difference but do you think an array_map implementation would be better? functionally no difference. So prob doesn't matter.

Do you think we should throw our own message if e.g. an object is passed in? Otherwise it will throw a (recoverable?) fatal error and we'll get our friend the Drupal 'The website has encountered an error. Please try again later.' page?

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
182.15 KB
602 bytes

I forgot the silliest detail during rebasing, hence the failures… :(


Do you think we should throw our own message if e.g. an object is passed in? Otherwise it will throw a (recoverable?) fatal error and we'll get our friend the Drupal 'The website has encountered an error. Please try again later.' page?

Discussed in IRC, we decided it's not necessary.

Status: Needs review » Needs work

The last submitted patch, 34: cache_tags_as_strings-2340123-34.patch, failed testing.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
182.15 KB

Quite often, I hate PHPStorm. This is one of those times.

(Because PHPStorm once again froze, what I typed ended up in the source code and hence ended up in the patch. But not in the interdiff.)

Interdiff of #34 still applies.

Status: Needs review » Needs work

The last submitted patch, 36: cache_tags_as_strings-2340123-36.patch, failed testing.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
182.01 KB
707 bytes

Chasing HEAD after #2254181: Comment indentation is incorrect for comments following a replying-comment: don't render cache comments for which threading is enabled landed + fixed the 3 fails.

Silliest interdiff ever, for stupidest copy/paste mistake ever, introduced in #30:

diff --git a/core/modules/user/src/PermissionsHash.php b/core/modules/user/src/PermissionsHash.php
-      $tags = Cache::buildTags('menu', $sorted_roles);
+      $tags = Cache::buildTags('user_role', $sorted_roles);
 
damiankloip’s picture

Re berdir in #28; yes, not the same is fine but it is a substitute for flatten tags as you have a helper to create the strings for you? Just using a prefix instead of an array key...

Berdir’s picture

Status: Needs review » Reviewed & tested by the community

Ok, we're green again and I think good to go, if @damiankloip is happy with the interdiff from #30.

Re: the function: Yes, both result in strings, but it is very different from before because the context it is used in and who is using it is completely different. So, IMHO, we're not replacing flattenTags() with buildTags(), we're removing flattenTags() because it is no longer useful and are adding a new method for a new use case :) But that's a detail, apparently we're both happy with buildTags() as it is now.

catch’s picture

Status: Reviewed & tested by the community » Fixed

Right the problem that both were trying to solve is the same (having to loop over entity IDs to build cache tag strings manually), but we've taken it out of the cache storage API, and discovered it's less of a problem after applying cache tags everywhere than was expected when cache tags originally went in.

Patch looks great now, committed/pushed to 8.0.x.

Wim Leers’s picture

Wim Leers’s picture

Filed a tiny follow-up to update core.api.php's Cache API documentation, which I forgot to do here: #2344691: Update core.api.php's Cache API documentation: cache tags are now set as strings.

Fabianx’s picture

Thanks Wim, This is great!

  • catch committed 2c6cd92 on 8.0.x
    Issue #2340123 by Wim Leers: Setting cache tags can be tricky: use...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

YesCT’s picture

looks like a copy and paste error got in here.... check out #2364555-3: Add @covers annotation, fix some --strict for PHPUnit

Mile23’s picture

Status: Closed (fixed) » Needs review
FileSize
552 bytes

Re-opening this as a followup. Apparently this issue introduced a copy-paste error in core/tests/Drupal/Tests/Core/Cache/CacheTest.php, where testInvalidateTags() is really testing deleteTags().

Here's a teeny tiny patch to rectify this.

I've moved this change out of the newest patch in #2364555: Add @covers annotation, fix some --strict for PHPUnit.

YesCT’s picture

Status: Needs review » Reviewed & tested by the community

committers might prefer a separate issue for this follow-up... but maybe not. the correction looks fine to me though.

Wim Leers’s picture

Good catch — sorry about that!

alexpott’s picture

Status: Reviewed & tested by the community » Fixed

The followup is fixing a bug which is why a new issue is ALWAYS preferable to doing a followup. I can't change the category because the original was a task and not a bug and that is the majority of the work. If we weren't committing major tasks right now then I have to spend time on this issue explaining why this is an exception. So people please please please open followups as new issues and put a comment on the old issue and relate the two issues.

Committed e95124d and pushed to 8.0.x. Thanks!

  • alexpott committed e95124d on 8.0.x
    Issue #2340123 followup by Mile23: Setting cache tags can be tricky: use...
Mile23’s picture

OK, will make followup next time. Thanks.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.