A missing serial can be a sign of cache inconsistency: either the serial was evicted, or the serial bin was flushed or Memcached was restarted. To prevent serial collisions in this case, we need to flush all content caches that use serials. We thus trade some cache efficiency for reliability.
To keep serials from being evicted, there should be a separate Memcached instance dedicated to serials (probably with evictions disabled via the "-M" option).

Comments

crea’s picture

Flushing all content caches on every missing serial is wrong, because a serial can also be missing simply because the tag is new.
We need a "cold vs hot" cache flag. On the first access to the Memcached bin the cache is "cold", so we initialize the serial, flush the content caches and mark the cache as "hot". On subsequent serial operations, if a serial is missing but the cache is still "hot", we initialize the serial too, but don't flush.
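
A minimal Python sketch of the cold/hot idea, not the module's actual code. A plain dict stands in for the serial Memcached bin, and the names SerialStore, HOT_FLAG and the "serial:" key prefix, as well as the initial serial value of 1, are all assumptions made for illustration.

```python
class SerialStore:
    """Tag serials kept in a dedicated bin, guarded by a cold/hot marker."""

    HOT_FLAG = "serials_hot"  # hypothetical marker key stored in the serial bin

    def __init__(self, serial_bin, flush_content_caches):
        self.bin = serial_bin              # dict standing in for the serial bin
        self.flush = flush_content_caches  # callback that flushes all content caches

    def get_serial(self, tag):
        if self.HOT_FLAG not in self.bin:
            # Cold: the bin was flushed or Memcached restarted, so flush once.
            self.flush()
            self.bin[self.HOT_FLAG] = 1
        key = "serial:" + tag
        if key not in self.bin:
            # Hot but missing: treated as a new tag, so no flush.
            self.bin[key] = 1  # initial value is an assumption
        return self.bin[key]

    def bump(self, tag):
        """Invalidate everything under `tag` by incrementing its serial."""
        self.get_serial(tag)
        self.bin["serial:" + tag] += 1
        return self.bin["serial:" + tag]
```

Because the marker lives in the same bin as the serials, a restart or bin flush removes it together with them, which is what makes the next access "cold" again.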

In this scenario we assume evictions don't happen:

  • In general, serials are small and almost all the same size, so it's possible to make the Memcached bin big enough that evictions don't happen.
  • It is possible to disable evictions completely using the "-M" Memcached option. That doesn't really solve the underlying problem (no free memory), though; it only hides the consequence. This brings us to the next point:
  • Regardless of whether evictions are enabled or disabled in Memcached, the site administrator should be notified by monitoring tools (Nagios or similar) about any "no memory/eviction" problem in the serial Memcached bin.
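
As a rough illustration of the monitoring point, the check below parses the text returned by Memcached's "stats" command and reports eviction or memory pressure. It is only a sketch: the function names and the 90% fill threshold are assumptions, and fetching the stats text from the server and wiring the result into Nagios are left out.

```python
def parse_stats(raw):
    """Turn 'STAT name value' lines from Memcached's stats output into a dict."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def serial_bin_alerts(raw, max_fill=0.9):
    """Return a list of problems found in the serial bin's stats output."""
    stats = parse_stats(raw)
    alerts = []
    if int(stats.get("evictions", 0)) > 0:
        alerts.append("evictions detected")
    limit = int(stats.get("limit_maxbytes", 0))
    used = int(stats.get("bytes", 0))
    # max_fill is an arbitrary threshold chosen for this example.
    if limit and used / limit >= max_fill:
        alerts.append("memory nearly full")
    return alerts
```

The memory check matters mostly when "-M" is set, since in that mode the eviction counter stays at zero while the bin fills up.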
crea’s picture

Some points learned from #493448: Memcache and cache_clear_all wildcard:
Just move serials from the cache id into the cache entry itself. Even if the serial for a cache entry is missing, we can always construct the cache id. On a successful cache_get() where the serial is missing or differs from the one stored in the cache entry, return a cache miss and flush (read: delete) the cache entry directly. This way serial collision is impossible. Also, because we delete cache entries directly instead of letting them be LRU-evicted, we generally don't have evictions and don't have to worry about old, invalid generational keys pushing out still-valid direct keys (e.g. the case described in http://blog.evanweaver.com/articles/2009/04/20/peeping-into-memcached/).
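
A minimal Python sketch of this approach, under stated assumptions: dicts stand in for the content and serial bins, and the SerialCheckedCache class and its new_serial parameter are hypothetical. The thread does not say how a missing serial is re-created; the sketch seeds it from a clock value so that a fresh serial is unlikely to match one stored in an older entry.

```python
import time


class SerialCheckedCache:
    """Content cache whose entries carry the serial they were written under."""

    def __init__(self, content_bin, serial_bin, new_serial=time.time_ns):
        self.content = content_bin    # dict standing in for the content bin
        self.serials = serial_bin     # dict standing in for the serial bin
        self.new_serial = new_serial  # assumption: seed for a missing serial

    def set(self, cid, tag, data):
        if tag not in self.serials:
            self.serials[tag] = self.new_serial()
        entry = {"tag": tag, "serial": self.serials[tag], "data": data}
        self.content[cid] = entry

    def get(self, cid):
        entry = self.content.get(cid)
        if entry is None:
            return None
        current = self.serials.get(entry["tag"])
        if current is None or current != entry["serial"]:
            # Stale or unverifiable: delete directly instead of waiting for LRU.
            del self.content[cid]
            return None
        return entry["data"]

    def invalidate(self, tag):
        """Wildcard-style clear: every entry written under `tag` becomes stale."""
        current = self.serials.get(tag)
        self.serials[tag] = self.new_serial() if current is None else current + 1
```

The cache id stays stable across invalidations here, so a stale entry is found and deleted on its next read rather than lingering under an old generational key.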

crea’s picture

Status: Active » Closed (won't fix)