We've had ongoing reports of problems caused by using memcache with some bins (cache_form), and some bins make less sense going into memcache to begin with (cache_page). This patch adds the ability to exempt any bin from memcache using syntax like this:
$conf['memcache_bins'] = array(
'cache' => 'default',
// Any bin that goes to 'none' will fall through to database caching.
'cache_form' => 'none',
'cache_menu' => 'none',
);
I solved the general problem of how to include memcache.inc and cache.inc at the same time by copy and paste. Ugly, yes, but super easy to maintain. It adds almost no complexity yet makes the module much more flexible.
I've abandoned the idea of memcache.db.inc, and actually think it should be removed from the project. It is woefully outdated and essentially unmaintained.
The implementation of this particular feature (excluding bins) in Drupal 7 is easier, and will look completely different, I imagine.
| Comment | File | Size | Author |
|---|---|---|---|
| #38 | database.inc_.patch | 19.25 KB | robertdouglass |
| #37 | database.inc_.patch | 11.41 KB | robertdouglass |
| #28 | memcache-database.diff | 15.66 KB | robertdouglass |
| #27 | memcache-db.patch | 2.72 KB | robertdouglass |
| #12 | memcache-readme.patch | 1.29 KB | bleen |
Comments
Comment #1
Mark Theunissen commentedOut of curiosity, what is the problem with cache_form? And why doesn't cache_page make sense?
We have a need for what we call a 'durable' bin, that doesn't get cleared during deployments. For that, we have a separate memcached daemon cluster running and map a bin to that.
Comment #2
robertdouglass commentedMark, there are rumors and suspicions, unconfirmed afaik, that some multi-step forms don't function properly with memcache under certain circumstances. Part of the problem is that cache is supposed to be impermanent - you should be able to lose your entire cache and not lose functionality or state on the site. The cache_form table, however, isn't really a cache, but a temporary state store in the case of multistep forms. As I understand it, if the cache_form table went to /dev/null, some forms would not work. This makes memcache less than suitable for caching this information.
As for the page cache, it is easy to exceed the available limits of memcache, so it's nice to be able to route page cache to something with adequate storage available (like the db).
Comment #3
rwohlebsubscribe
Comment #4
bleen commentedLoving this idea ... testing on one of my sites right now ... so far so good
Comment #5
bleen commented... note that using this patch (and excluding cache_form from being memcachified) has completely alleviated my issue (#962578: If cron runs while a form with an imagefield is open, error: "... This form was missing from the server cache..." on img upload ) which IMO confirms that #966114: cache_clear_all ignores cache "expire" attribute is, in fact, a legitimate problem.
Comment #6
rwohlebI'm running this in a large production environment without issue.
Comment #7
robertdouglass commentedhttp://drupal.org/cvs?commit=452056
Committed to 6.x-1.x. Need documentation updates to README.txt, project page.
Comment #8
bleen commentedsweet
Comment #9
emackn commentedI'm a little confused about something. Why would you set up a memcache bin, and then exclude it?
What are the gains/trade-offs for simply not specifying a memcache bin versus specifying a bin but excluding it?
Thanks.
Comment #10
bleen commentedemacken check out this from teh README.txt:
With this setup, anything that would have normally gone into the cache_form, or cache_block, etc... tables will autmatically be sent to the "default" cluster even though neither are specified in this setup. By explicitly excluding cache_form (or whatever you want to exclude) you are telling memecache to follow this algorithm when cache_set() is called:
Does that make sense?
Comment #11
emackn commentedyes, thanks, didn't realize that not specifying a bin would send the cache sets to the default.
Comment #12
bleen commentedI had a minor spelling error in #8
Comment #13
digi24 commentedA suggestion regarding the committed code patch:
Could you please move the whole _cache_get, _cache_set and _cache_clear_all function into a separate file? The reason is simple: Many users probably need to exclude certain bins from memcache and handle them another way. (Cacherouter is IMHO no option, as it provides a very good idea, but neglected and inefficient engines support.)
Having the alternative engine functions in a separate file would make custom cache engines besides memcache API way easier to maintain. And users not requiring this functionality in this patch, could just comment it and save some CPU cycles per request.
Comment #14
robertdouglass commented@bleen18, thanks - added the docs.
@digi24 - provide a patch and I'll review it. Thanks.
Comment #15
jeremy commentedI get the feeling this is an attempt to mask another bug -- if cache_forms is not working in memcache, why? Is moving cache_menu into the database even an option for large Drupal websites? I also don't like how this was implemented, cutting and pasting code from core. Inevitably as core changes this file will get out of sync. If this logic has to stay in, I agree it should be moved into an include file as requested in #13.
Comment #16
jeremy commentedPossibly related:
#978310: drupal.org forms expire way too early
Comment #17
robertdouglass commentedJeremy, I agree that this might be a way to mask other bugs. I also argue that disk space is cheap and memory is expensive, and some people use cache_set strategies (like rendered node caching) that quickly fill up memory, and they should have the option of sending high volume caches to disk.
I'll make a separate file and include it with the extra code if that helps people.
Comment #18
digi24 commented@robert
Regarding the patch - I just moved the functions to seperate files and put in a require_once. This could maybe made nicer and use a better name then 'none'. So I am not uploading it.
One thing I came across, I would suggest to also add the following patch, which moves the cache_clear_all logic out of the db-specific function, by simply slightly postponing the specific functions:
and in the specific _cache_clear_all:
Comment #19
ball.in.th commentedSubscribing. Very interested in this.
Comment #20
jeremy commentedBumping up the priority, as this is currently blocking a release. There have been some performance related fixes that I'd like to release, but not until this patch is either backed out (very tempting), or the logic is moved into an include file.
> I also argue that disk space is cheap and memory is expensive
I see two problems with this argument: 1) your patch sends a configurable number of caches into the database which is also going to end up in memory at least partially, and 2) disks are slow while memory is fast -- if you're deploying the memcache module you're doing it because you want to improve performance. While more expensive than disk space, memory really isn't that expensive these days.
Comment #21
ball.in.th commentedCould 'none' be changed to something else, e.g., database, core, etc, to avoid confusion? It seems 'none' is used in cacherouter to literally mean no cache -- http://drupal.org/node/797610 .
Comment #22
bleen commentedre #20: Please do not back this out ... this change solves a whole bunch of issues that occur when cache is cleared while a form is being filled out. The form cache really is a special case in Drupal and this lets me treat it as such.
Comment #23
jeremy commented> this change solves a whole bunch of issues that occur when cache is
> cleared while a form is being filled out. The form cache really is a
> special case in Drupal and this lets me treat it as such.
Where is the bug report for this? Have you tried the latest development snapshot to confirm whether or not it's necessary to split out cache_form? If it's still true with the latest development snapshot, update the bug report with steps on how to duplicate this, and we'll fix it properly.
Comment #24
robertdouglass commentedJeremy - I agree that hunting down the weirdness with cache form is a good idea, but there's a 90% chance that it is not a memcache problem, rather a core architecture problem. I still maintain that independent of that particular case in point, this functionality is needed and wanted, and wish to hold on to it in the 6.x version (7.x doesn't need it). I'll get the code put into a separate file today, as requested.
Comment #25
robertdouglass commentedJeremy, I didn't understand this sentence:
"which is also going to end up in memory, at least partially" - how do you mean that?
A typical case that I run into is a site that has 4-5 web servers that are CPU bound and never exhaust their memory, so memcache is put onto the webservers (~100M or so per machine). If a site is caching nodes, or has millions of something, then memcache quickly gets exhausted, and most cache calls miss. This is a side effect of deciding that *all* cache_sets need to be stored in memory. The long tail of cache items that account for the "millions of something" are better off in persistent storage. This helps sites like the one I described avoid needing to buy a dedicated memcache machine with 16G Ram just to serve the cache.
The problem might not be as acute if we were tuning slab management to be most efficient for the typical cache item size we're sending in, but we don't do that.
In this way, the cacherouter module has the right idea - being able to direct specific caches to specific storage - but I never wanted to introduce that whole abstraction to this module. This is a compromise.
Comment #26
robertdouglass commented@digi24 I don't think the argument that you could just comment out the include or require statement actually holds. You'd be risking a fatal function not found error if you ever got into a situation where one of the _cache_* were called.
I'm not opposed to putting the _cache_* functions into a separate file, but the only advantage I see of doing so is making it easy for someone else to write a new set of _cache_* functions (say to mongodb) and including that file instead of the database file. Honestly, very few people are going to do that.
So I'll provide a patch to move them into an include file, but I dispute the value of doing so.
Comment #27
robertdouglass commentedHere's a patch that I'm committing to change 'none' to 'database'.
Comment #28
robertdouglass commentedHere's a patch to move the _cache_* functions to a new file, which I called database.inc.
@digi24 I haven't integrated any of your suggested changes from #18 yet. I'm a little reticent to do so as the "easy to maintain" part of this solution (copy and paste from cache.inc) starts to fall apart.
Comment #29
ball.in.th commentedThe code looks good. Maybe the _cache_* functions in database.inc should be renamed to memcache_database_cache_* instead to avoid name collisions -- http://drupal.org/coding-standards#naming .
Comment #30
robertdouglass commented@ball.in.th - what name collisions? Since we hijack cache.inc altogether I don't expect there to ever be a name collision. I'm not opposed to the idea, though.
Comment #31
ball.in.th commented@robertDouglass
Yes, collisions are not likely. Just might be a good idea to have module name as a prefix.
Comment #32
bleen commented@jeremey: just to answer your question from #23, the bug report can be seen here: #966114: cache_clear_all ignores cache "expire" attribute
@robertDouglass: Thanks for keeping this on your radar!
Comment #33
jeremy commented@bleen18 that issue should now be fixed in the latest development snapshot, and hence was closed.
@robert:
> "which is also going to end up in memory, at least partially" - how do you mean that?
Databases use memory for performance. Especially if using innodb, then it's going to load all indexed columns into RAM.
If you have a cache bin that is abnormally large and is pushing other cache bins out of memory, you should move it to a dedicated memcached instance so it only affects itself. Memcache uses LRU to evict pages so the most popular items will live in the cache.
> I agree that hunting down the weirdness with cache form is a good idea, but there's
> a 90% chance that it is not a memcache problem, rather a core architecture problem.
What core architecture problem are you talking about?
Comment #34
robertdouglass commentedCaches should be disposable information. You should be able to run the website (albeit inefficiently) without any caching. cache_form is essential (as I understand it) to multi step forms, and Drupal actually breaks if it isn't available. That's the architectural problem I meant.
Comment #35
jeremy commented@robert: I believe the goal was to make this database.inc file configurable via settings.php, much like 'cache_inc' is configurable in core.
Specifically, this line:
Should look more like this line (from bootstrap.inc):
Comment #36
thebuckst0p commentedSubscribe. I've been having problems with cache_form in memcache causing form_build_id tokens to become invalid and return empty data from CCK "Add another" buttons. This patch looks promising.
Comment #37
robertdouglass commentedOk, I have made it so. The new variable is called "memcache_extra_include". I wrote some documentation and have committed this change.
Comment #38
robertdouglass commented#959200 by robertDouglass, bleen18: Added Add ability to exclude bins (tables) from memcache.
This time with -N so the new file shows up.
Comment #40
quicksketchThough this issue is now somewhat old and already done with, I want to back up something robertDouglass pointed out:
That's right. The cache_form table is not really a "cache" at all. It's a temporary storage of in-progress forms. Any form that utilizes multi-step through $form_state['storage'] or #ahah will automatically make entries into the cache_form table. The problem with this table is that it's content's aren't "temporary" in the manner that they can be disposed of at any time. For example let's say you start a node at node/add/image, using ImageField as an upload. If someone flushes the caches while you're typing out the title to your image, then you try to upload after the caches have been cleared, you'll find that it's impossible. ImageField will helpfully give you the error, #795004: An unrecoverable error occurred. This form was missing from the server cache. Try reloading the page and submitting again..
Now moving cache_form into the database instead of memcache still does not entirely solve this problem, but it does prevent Memcache from pruning in-use "cache" entries that may contain a form because it has used up all its space or from restarting memcache causing forms to break. So kudos, thanks for the patch and I'm looking forward to using 1.8+.
Comment #41
pieter_duijves commentedfor Drupal 7, someone managed to exclude a bin from memcache and have it fall back to the db with these lines in settings.php :