cache_menu: data blobs identical and huge;
| Project: | Drupal |
| Version: | 6.x-dev |
| Component: | menu system |
| Category: | bug report |
| Priority: | critical |
| Assigned: | pwolanin |
| Status: | closed |
Firstly, I'm shooting in the blind here as I don't really know what cache_menu does - I can't easily figure it out from the code and there doesn't seem to be documentation on it.
However I highly presume that cache_menu is broken. Here's what I've got:
- every single page I visit creates two entries in cache_menu, one in 'primary-links', one in 'navigation'.
- navigation entries have the 'data' field as around 80KiB
- primary-link entries have around 6KiB in the data field
- this is every single page. e.g. node/1, node/1/edit, node/1/delete, node/2, node/2/edit... all added as they're visited for the first time by any user
Also, all the entries have '0' as the expiry.
I would presume that the bug is that:
1. the expiry should not be 0 but time() + 86400 or something similar instead
2. should not have duplicates for navigation and primary-links
3. the navigation entries shouldn't be a whooping 80KiB each
4. not every single page should be in that cache
It may well be a combination of bugs here. If the first of the list above, I expect its a nice easy fix.
Killes confirms in IRC the same situation. Both of us are running D5->D6 upgraded sites, rather than new sites.

#1
I think that the main problem is 1. and that there's no cron call to actually clear the menu table. The second part will be discussed at http://drupal.org/node/226728.
We might resolve this other issue by calling drupal_flush_all_caches. In that case this issue would also be fixed since that function calls menu_rebuild which flushed the menu cache.
Not yet marking this a duplicate.
#2
The problem may well be 1, but also more than that. On killes suggestion, I checked the contents of 'data' on a few of the entries. And they're identical (at least for node/1 and node/2 anyway, haven't delved further than that yet).
The 80KiB is the exact same 80KiB for all(?) of the entries. Something's going way wrong with this. Its either putting a load of data in there that it shouldn't be putting in, or it should be putting it into one row in the table rather than one row per page.
#3
every rendered menu on every page will create a cache entry. The cache is cleared as needed, so I don't think the expiry time is wrong.
So yes, on a default install you'll get 3 entries per page.
#4
D6 cache_menu is only per page - NOT per user. all the access checking, localization, etc, is done dynamically used the cached data.
so, for d.o assuming 200,000 nodes + 150,000 user profiles, at 80 kB per page, that's like 28 GB?
There is no reason you'd want to clear this on cron - the data should not get stale. Since the site is getting spidered, you'd just create load building it all again.
#5
but what about the fact that the data is identical for all(?) the pages? That's a hell of a lot of wasted space. 200,000 or whatever copies of the exact same 8KB of data. This just seems completely wrong. And 20 odd GB for Drupal.org is a hell of a lot of space - there has to at least be an option to disable this, or limit it to xMB or xGB, or expire all items after x amount of time.
#6
Unfortunately, there's no way I can run d,o with a cache table that big, be it 80 or 28 GB.
My initial calculation was based at 250k nodes, 250k user profiles + the relevant "edit" pages. The "edit" entries might not exist all the time, but there are plenty of other pages.
#7
so, I guess we never did this calculation, but I'd suggest an option for 6.x (At least) is an option in the performance UI to toggle off menu caching?
#8
equally, there's no way I can run my site with so much cache_menu, and I fail to see the point of having the same exact 7KB entry thousands of times when its identical. Can someone explain to me what's in the data field to help me better understand this? Without this table and cache_form my entire database is only around 100MB. With them it grows to multiple GB.
I guess a good option would be to add an entry for popular pages only (e.g. with more than x veiws per day or week). Otherwise you may have a page visited once *ever* and thousands others like it all taking up pointless room.
#9
here's a patch against 6.x, also applies to HEAD with small offset.
This exposes the option via the UI, but we could also make this a silent option (e.g. advanced configuration in settings.php only)
#10
afaik know bugs get fixed in 7.x then backported. Correct me if wrong.
#11
#12
@catch - sure should deal with HEAD first
here's a radically different, and perhaps better, patch. Implements a 2-level caching scheme so we don't duplicate the stored data, but also don't sacrifice the advantage of caching.
#13
here's a patch that covers book module too, and has slightly better logic in the cache_set()
#14
I tested this on a clean-ish site comparing cache_menu entries with and without the patch applied. Without, I get three entries - 26KB for navigation, 47 bytes for my empty secondary/primary links. With the patch applied I get three 36 byte entries instead. So it looks along the right lines to me.
#15
Very good trick! But the cache tables should never be queried directly, always use cache_get because cache.inc is pluggable (for eg. memcached).
#16
right - also CNW since the new cids don't jive with the cache-clearing code. Both problems should be easily fixable, however.
#17
Thanks chx, new patch attached.
#18
Hadn't seen Peter's comment, I changed the CIDs in the cache invalidation code to match the new ones.
#19
I think for D7 we will want to improve the cache API with something like
cache_entry_exists(), but that can come later. It seems silly to have to fetch 80kB of data to see that the row exists in the table.The cache clearing still needs to be fixed more, I'll try to re-roll later today.
#20
ok, here's a working version. Note that ALL cids are back to the form
'links:'. $menu_name . ':something', so our cache clearing code can still work properly using a wildcard against'links:'. $menu_name . ':'#21
Tested the patch in #20. Testing scenario:
* * D7 HEAD with all cache tables cleared.
* * Created 10,000 nodes via devel.module in all content-types.
* * Promoted all nodes to 1
* * Ran wget against the localhost site.
Without the patch, the cache_menu table was at 66MiB after the test. With the patch, the result was only 1MiB.
Success!
#22
made nodes with Devel and spidered ~450 pages on my localhost with wget (in a race w/ Senpai)
with patch: table size = 0.50 MB
without patch: table size = 11.56 MB
Note this is a bit over-optimistic, since the menus are the same on every page, but does suggest the scale of the problem with the existing core code.
#23
#24
I had a hard time understanding the code (without reading up on the issue details). Can we take another pass over the code comments?
I suspect this patch might have a negative impact on performance. Is there something better that we can do for Drupal 7?
#25
@Dries - any performance trade-off should be minimal, since it is usually only a single, simple, extra query, and for most pages we will be writing much, much smaller blobs into the DB so those queries will be faster.
For D7 I'd like to add an extra API function as suggested in #19, but since killes says d.o cannot be moved to D6 without this, I'd like to see some version in D6 and HEAD before mucking with the cache API.
I'll take another pass at the code comments.
#26
here's a patch with improved code comments.
#27
surely if an issue is critical as this one is then the patch for 6.* should be made first? I'm happy to test a patch on my live site but its on 6 (obviously) not 7. And I badly need a patch as my database is getting silly big without one, which also causes problems with my automated backups.
#28
OK, thanks Peter. I've committed this to CVS HEAD. Should go in into Drupal 6 as well as this is a pretty bad problem.
#29
Just checked and this still applies cleanly to 6.x
#30
Thank you for that patch. My test site (yes, on a cheap, shared hosting ... :) got a suspension warning the next day after 6.1 upgrade, with the menu cache taking over 60% database space on its own. I can confirm this applies cleanly to 6.1 and works.
#31
Yes - we really need this in before 6.2. Also, we should probably further examine whether to remove or limit the caching by book module - the "printer friendly" is always going to produce a unique subtree (I think) , so any site with a large set of books and which exposes printer-friendly to anonymous users (i.e. spiders) may have trouble.
#32
OK folks... I'm a real newbie here but I have a pretty good grasp on what to do if given good instructions. I ran into this cache problem last night - I'm running Drupal 6x in a Wamp stack locally first before I commit it to my site. I was just "fishing around" and clicked the "Track" link on the "My History" page and the whole thing locked up tight. No errors or warnings, just a blank screen and a progress bar at the bottom of my browser. Being one whose a sucker for punishment, I got fed up with my browser locking up so I uninstalled the whole thing deleted the database in MySQL - fished around and found those two (huge) files and deleted them. I have since reinstalled Drupal and things are working OK.
I'm glad I found this page! But now my question is: How do I install the patch? Of all the issues I've been reading, this is a big one because I'd hate to have to do the same radical surgery on my site as I did on my local test.
I assume these are the cache files were talking about(?)
ib_logfile0: 10,240 KB
ib_logfile1: 10,240 KB
ibdata1: 18,432 KB
#33
I believe these would be the database files containing everything that is in it; i don't know what the overhead should be, but 10MB may not be so large.
On linux you would just put the patch file in the main drupal directory, open a terminal and from within that directory issue "patch -p0 < 2-level-menu-cache-231587-26.patch". Seems there is a version of the patch utility for windows too: http://gnuwin32.sourceforge.net/packages/patch.htm, but i've never tried that, so maybe you'll need another tool (you may want to check CygWin or MinGW environments too).
#34
DaveT777: http://drupal.org/patch has instructions for applying patches, including on windows (easiest using Cygwin).
#35
@DaveT777 - that sounds more like a PHP memory issue (RAM) and unrelated to the bug being fixed here. Try tweaking your settings in php.ini.
#36
Thanks for the quick response! I'll give the php.ini tweak a shot first before I look at applying the patch. Is the patch going to be integrated into newer versions of Drupal? If so, and if I don't have problems with the issue you've been discussing, then I'll wait until the new version is released. I'm an "If-it-ain't-broke-don't-fix-it" type of guy.
Thanks for the help!
DaveT777
#37
Thanks. Reviewed and committed to 6.x too.
#38
Automatically closed -- issue fixed for two weeks with no activity.
#39
Will this improvement ever see the light on Drupal-5 ? Is Drupal-5 also affected by the problems outlined in this post ?
#40
I've encountered this on upgrade to 6.1, 5.x probably has not been affected by this issue at all. Got fixed in 6.2 too.