The cache API deals with stale cache items in a non-consistent way. I propose some changes that will make common usage of the API easier while allowing callers to make better use of stale (invalid/expired) data.
$cache->get() may return expired values. This may sometimes be useful, but callers need to check the expire time themselves. This is a somewhat unexpected behaviour (filed as a bug here: #534092: cache_get() returns expired cache items).
On the other hand, if an item has been invalided via $cache->invalidateTags(), it cannot be retrieved via $cache->get(), even if it has not yet been deleted from the backend.
The cache API uses very inconsistent wording when it comes to “removing” cache entries:
- $cache->flush() deletes all items – I propose a renaming to deleteAll() (for consistency with delete() and deleteMultiple()).
- $cache->expire() deletes entries that have expired – I propose a renaming to deleteExpired().
- $cache->invalidateTags() effectively deletes entries with the specified tags (they are not deleted right away, but they cannot be retrieved through get()). I suggest we stick to this but extend $cache->get() with the possibility to retrieve those values.
In addition to implementing these proposals, the attached patch adds the following:
- Two new functions, $cache->invalidate($cid) and $cache->invalidateMultiple($cid), that mark existing cache items as invalid. This is equivalent to modifying the items so they get an expire time in the past.
- $cache->get()/getMultiple() gets an options second argument, $cache->get($cid, $allowInvalid) and $cache->getMultiple($cids, $allowInvalid), that specifies whether expired or explicitly invalidated (via $cache->invalidate()/invalidateMultiple()/invalidateTags()) items should be returned. By default only valid items are returned.
- cache_invalidate() is renamed to cache_invalidate_tags() to match $cache->invalidateTags() and not be confused with $cache->invalidate().
- The structure of DatabaseBackend and MemoryBackend are brought more in sync. We should consider introducing a common abstract base class, BackendBase, that implements shared functionality.
- Terminology clean-up in tests and Doxygen comments.
Comment | File | Size | Author |
---|---|---|---|
#38 | cache-invalid-16.patch | 100.95 KB | c960657 |
#35 | cache-invalid-15.patch | 100.95 KB | c960657 |
#30 | cache-invalid-14.patch | 95.33 KB | c960657 |
#28 | cache-invalid-13.patch | 94.06 KB | c960657 |
#27 | cache-invalid-12.patch | 94.68 KB | c960657 |
Comments
Comment #1
c960657 CreditAttribution: c960657 commentedComment #2
catchI haven't reviewed the actual patch yet but the overall description sounds good.
Comment #3
Lars Toomre CreditAttribution: Lars Toomre commentedIf/when this is re-rolled, please consider the following documentation adjustments. Thanks.
I think that description should start with '(optional)' as well as specifying what the default value is.
I think this should be '@return object|false'.
Also needs '(optional)' at start as well as noting what the default value is.
Can we add a @return type hint here? Also I would suggest replacing CID with 'cache ID'.
I think the coding standard is bool, not boolean, in this case.
I think this should be ::deleteAll() instead of ::flush().
I think you meant ::deleteExpired() here.
This should be rewrapped for 80 characters.
Left over from some development? This does not match the function. Also should be '@return bool'.
Should be ::invalidateMultiple() perhaps?
Comment #4
c960657 CreditAttribution: c960657 commentedThanks for the review. This patch addressess the concerns raised.
Also, deleteTags() is added for completeness. In the database backend, the cache items are not actually deleted from the database, but they are never returned.
Comment #5
Lars Toomre CreditAttribution: Lars Toomre commentedThanks @c960657 for incorporating many of the suggested changes. I have now done a fresh review from start and noticed the following. Hopefully, you you might be able to include these comments in your next re-roll.
Small nit... this should be type hinted as 'mixed|false' to help developers.
Not sure what is meant by the phrase 'flat tags'. Can you expand the comment/explanation?
I think that '::deleteAll()' here should be '::deleteTags()'.
Comment #6
c960657 CreditAttribution: c960657 commentedThank you for your scrutinous review. This patch addresses your latest comments.
I have added a @see pointer DatabaseBackend::flattenTags(). The concept of flat tags is used internally in this class. This is the documentation for a protected method, so I think it is acceptable to use a "local" lingo.
Comment #7
Lars Toomre CreditAttribution: Lars Toomre commentedThanks @c960657 for incorporating many of the suggested changes in the patch #6. I have now done a fresh review from start and noticed the following. Hopefully, you you might be able to include these comments in your next re-roll.
I am leaving as needs work because at minimum the unrelated changes (#mail) need to be deleted.
Sorry for not noticing this before... I understand that the correct type hint is bool instead of boolean (although I prefer boolean!!).
Change s/Delete/Deletes/
Ibid.
For clarity in api display, it might be helpful to change to 'Invalidates cache items ...'
I think this should be '::deleteAll()' instead of '::flush()'.
This docblock should be '::deleteExpired()'.
Just my opinion, but the existing description was better especially if put in single quotes to connote jargon.
Should be int instead of integer.
Exceeds 80 characters... maybe drop the 'the'?
Leftover debug code??
Ibid.
Comment #8
c960657 CreditAttribution: c960657 commentedI have reworded the whole block. The @return part is now expressed in terms of data types ("An indexed array of strings").
Comment #9
Lars Toomre CreditAttribution: Lars Toomre commentedA couple of thoughts from fits reading through of the code.....
I would think you meant 'may' here.
Being naive, it would be really helpful if the documentation could specify if these are exact matches or partial matches.
Comment #10
moshe weitzman CreditAttribution: moshe weitzman commentedNeeds reroll at least for that typo. otherwise, looks good.
Comment #11
c960657 CreditAttribution: c960657 commentedReroll. There were quite a few merge conflicts due to #1393392: Convert prefix cache clears to cache tags, then remove support for them.
You mean whether all tags should match or just one? I have now used the wording from invalidateTags() and deleteTags():
Comment #12
moshe weitzman CreditAttribution: moshe weitzman commentedPatch missing :(
Comment #13
c960657 CreditAttribution: c960657 commentedOops :-)
Comment #15
c960657 CreditAttribution: c960657 commentedI had to add the following to Drupal\locale\Tests\LocaleTranslationTest::testStringTranslation().
In general, I am not sure I like the static caching of tags in invalidateTags() that is not also in deleteTags(). It means that other long-running jobs don't pick up what has been deleted/invalidated after they started. This behaviour is not consistent with that of delete()/invalidate()/deleteAll()/invalidateAll(). I suggest we deal with this in a separate ticket (the issue exists for invalidateTags() already).
Comment #17
c960657 CreditAttribution: c960657 commented#15: cache-invalid-6.patch queued for re-testing.
Comment #18
moshe weitzman CreditAttribution: moshe weitzman commentedReviewed and it looks like a good cleanup to me. catch is really the best evaluator so lets send this along ...
Comment #19
catchThese functions are exactly the same.
Yet this is changing?
There's no information here about why you'd invalidate an item rather than delete it. Overall I can see the case for allowing expired and tag-invalidated items to potentially be returned from cache()->get(), but I can't see any value in allowing people expiring tags to differentiate bewteen invalidating them and deleting them. What's the use-case for that?
Additionally with things like tags it's simply not possible to physically delete items in particular backends (even the MySQL implementation doesn't do this). Memcache doesn't even delete items when you flush a bin at this point, since it uses 'virtual' Drupal bins inside a single Memcache bin depending on configuration. So I agree with the general aim to allow getters to say "give me something if it happens to be there" - since we're inconsistent on that with expired items at the moment, but not at all with the distinction between invalidation and deletion that's added here - it seems like a lot of additional complexity for no gain to me.
Comment #20
c960657 CreditAttribution: c960657 commentedOops, cache_delete_tags() should call deleteTags().
The patch replaces all calls to invalidateTags() with deleteTags(), because the current invalidateTags() actually deletes the items (from the callers point of view).
delete() is relevant if the data being cached itself has been deleted, e.g. if you delete a node, then any cached information about that node will probably be of little use. In some cases would exposing information of deleted content be considered a bug.
invalidate() is relevant if you want to mark a cache item as stale but still allow it to be used until the item has been rebuilt. This would be useful e.g. in drupal_theme_rebuild(). Currently we destroy any cached information, i.e. all threads has to wait for the registry to be rebuilt. If we had just marked it as invalid, one thread could do the rebuilding while others use the cached value.
The idea behind this API change is that delete()/deleteTags() etc. should make the data appear to the user as being deleted. Whether it is actually deleted from the MySQL table or just marked as deleted is an implementation detail that is not revealed externally. Physically deleting items that are marked as deleted can be done during garbage collection.
Do you still disagree after reading my explanation above?
Comment #21
catchHmm I can see the value, but I think something like drupal_theme_rebuild() should probably just change to a write-through cache instead. Would you mind maybe trying to implement the invalidate logic in that function (or another one if it's easier) so we can see how it looks?
Comment #22
c960657 CreditAttribution: c960657 commentedA write-through cache is not always feasible. Sometimes rebuilding the cache may take a long time (especially if many items are invalidated at once), and sometimes rebuilding it may not even be possible, e.g. if rebuilding function is calling an external REST service that is not responding. Spawning a cache rebuild in a background/queue job is one option, but we don't have a good framework for that (yet).
Here is a rewrite of the theme registry load/rebuild functions with added stampede protection (the code has not been tested). Contrary to "standard" use of semaphores, the threads that do not acquire the semaphore can continue using the stale data. Some of the complexity could be wrapped in some helper function (see #1225404: Modern event based lock framework proposal or the Cache Graceful module), but here it is in raw form:
Comment #23
c960657 CreditAttribution: c960657 commentedReroll following the commit of #1651206: Added a cache chain implementation and #1808112: Missing method visibility, bogus phpDoc and coding style in Cache backend classes.
Fixed.
I added some comments to deleteXxx() explaining why it is sometimes better to call invalidateXxx().
Comment #24
c960657 CreditAttribution: c960657 commentedSorry, bad patch.
Comment #26
c960657 CreditAttribution: c960657 commentedComment #27
c960657 CreditAttribution: c960657 commentedComment #28
c960657 CreditAttribution: c960657 commentedReroll.
Comment #30
c960657 CreditAttribution: c960657 commentedComment #31
c960657 CreditAttribution: c960657 commentedThe test bot is happy again. Is this something that you would like to see in D8?
Comment #32
moshe weitzman CreditAttribution: moshe weitzman commentedback to rtbc.
Comment #33
webchickThis one has catch written all over it.
Comment #34
catchOK I read through this again. While the tag invalidation vs. deletion still makes me a bit wary in terms of complexity, I see how it could be useful and as far as I can see there's not really any significant overhead - it's just adding an extra value to load and check against.
However this really, really needs more docs on the invalidate tags vs. delete tags use case, they're almost exactly the same now except for the wording change - i.e. why you'd want to use each one and how it fits with the $allow_invalid argument to ->get().
Moving this to a task since it's just bringing consistency to the API rather than adding a completely new feature.
Comment #35
c960657 CreditAttribution: c960657 commentedI updated the patch with additional explanation in the DocBlock for the CacheBackendInterface itself and various functions. Once we start using this in code, e.g. in the theme registry, we can refer to those functions as example use-cases.
I also added invalidateAll() for completeness to match deleteAll().
Comment #36
moshe weitzman CreditAttribution: moshe weitzman commentedback to catch
Comment #38
c960657 CreditAttribution: c960657 commentedReroll (conflict due to fix for #1848968: Too many checksum tag queries executed by the cache backend)
Comment #39
catchOK, committed/pushed to 8.x, thanks!
This will need a change notice (or updating http://drupal.org/node/1272696 would probably be better).
Comment #40
catchComment #41
tim.plunkettI updated http://drupal.org/node/1272696 with relevant changes, but didn't add any new docs for the new stuff:
Comment #42
tim.plunketthttp://drupal.org/node/1534648 should be merged into the above change notice.
Comment #43
xjmComment #44
c960657 CreditAttribution: c960657 commentedI added a description of the new invalidateXxx() methods (sorry for the delay). Comments?
Comment #45
YesCT CreditAttribution: YesCT commentedIn http://drupal.org/node/1272696 invalidate info was added. Looks like that takes care of the task from #41.
Was #42 done? I'm not sure I understand what it needs. Maybe it's asking for a link to http://drupal.org/node/1534648 to be added to http://drupal.org/node/1272696
Comment #46
webchickThis has been waiting for a change notice for over 5 weeks now. Let's please get this knocked out.
Comment #47
chrisjlee CreditAttribution: chrisjlee commented@webchick
I have one drafted for http://drupal.org/node/1534648. This is my first change notice. YesCT helped me look over this one:
Comment #48
irek02 CreditAttribution: irek02 commentedNew cache API looks good. I made sure it includes nicely revised content of Cache tag support is added. Kicking this sucker out of queue.
Comment #49
c960657 CreditAttribution: c960657 commentedTwo follow-up tasks:
Comment #50
jibranRemoved tag.
Comment #51.0
(not verified) CreditAttribution: commentedFix missing /ul> tag