cache_menu: data blobs identical and huge;

jakeg - March 8, 2008 - 12:01
Project:Drupal
Version:6.x-dev
Component:menu system
Category:bug report
Priority:critical
Assigned:pwolanin
Status:closed
Description

Firstly, I'm shooting in the blind here as I don't really know what cache_menu does - I can't easily figure it out from the code and there doesn't seem to be documentation on it.

However I highly presume that cache_menu is broken. Here's what I've got:

- every single page I visit creates two entries in cache_menu, one in 'primary-links', one in 'navigation'.
- navigation entries have the 'data' field as around 80KiB
- primary-link entries have around 6KiB in the data field
- this is every single page. e.g. node/1, node/1/edit, node/1/delete, node/2, node/2/edit... all added as they're visited for the first time by any user

Also, all the entries have '0' as the expiry.

I would presume that the bug is that:
1. the expiry should not be 0 but time() + 86400 or something similar instead
2. should not have duplicates for navigation and primary-links
3. the navigation entries shouldn't be a whooping 80KiB each
4. not every single page should be in that cache

It may well be a combination of bugs here. If the first of the list above, I expect its a nice easy fix.

Killes confirms in IRC the same situation. Both of us are running D5->D6 upgraded sites, rather than new sites.

#1

Gerhard Killesreiter - March 8, 2008 - 12:48
Version:6.1» 6.x-dev

I think that the main problem is 1. and that there's no cron call to actually clear the menu table. The second part will be discussed at http://drupal.org/node/226728.

We might resolve this other issue by calling drupal_flush_all_caches. In that case this issue would also be fixed since that function calls menu_rebuild which flushed the menu cache.

Not yet marking this a duplicate.

#2

jakeg - March 8, 2008 - 14:25
Title:cache_menu size; expiry set to 0 for all» cache_menu: data blobs identical and huge; expiry set to 0 for all

The problem may well be 1, but also more than that. On killes suggestion, I checked the contents of 'data' on a few of the entries. And they're identical (at least for node/1 and node/2 anyway, haven't delved further than that yet).

The 80KiB is the exact same 80KiB for all(?) of the entries. Something's going way wrong with this. Its either putting a load of data in there that it shouldn't be putting in, or it should be putting it into one row in the table rather than one row per page.

#3

pwolanin - March 9, 2008 - 16:34

every rendered menu on every page will create a cache entry. The cache is cleared as needed, so I don't think the expiry time is wrong.

So yes, on a default install you'll get 3 entries per page.

#4

pwolanin - March 9, 2008 - 17:32

D6 cache_menu is only per page - NOT per user. all the access checking, localization, etc, is done dynamically used the cached data.

so, for d.o assuming 200,000 nodes + 150,000 user profiles, at 80 kB per page, that's like 28 GB?

There is no reason you'd want to clear this on cron - the data should not get stale. Since the site is getting spidered, you'd just create load building it all again.

#5

jakeg - March 9, 2008 - 18:59

but what about the fact that the data is identical for all(?) the pages? That's a hell of a lot of wasted space. 200,000 or whatever copies of the exact same 8KB of data. This just seems completely wrong. And 20 odd GB for Drupal.org is a hell of a lot of space - there has to at least be an option to disable this, or limit it to xMB or xGB, or expire all items after x amount of time.

#6

killes@www.drop.org - March 9, 2008 - 19:14

Unfortunately, there's no way I can run d,o with a cache table that big, be it 80 or 28 GB.

My initial calculation was based at 250k nodes, 250k user profiles + the relevant "edit" pages. The "edit" entries might not exist all the time, but there are plenty of other pages.

#7

pwolanin - March 9, 2008 - 19:19

so, I guess we never did this calculation, but I'd suggest an option for 6.x (At least) is an option in the performance UI to toggle off menu caching?

#8

jakeg - March 9, 2008 - 19:47

equally, there's no way I can run my site with so much cache_menu, and I fail to see the point of having the same exact 7KB entry thousands of times when its identical. Can someone explain to me what's in the data field to help me better understand this? Without this table and cache_form my entire database is only around 100MB. With them it grows to multiple GB.

I guess a good option would be to add an entry for popular pages only (e.g. with more than x veiws per day or week). Otherwise you may have a page visited once *ever* and thousands others like it all taking up pointless room.

#9

pwolanin - March 9, 2008 - 20:44
Status:active» patch (code needs review)

here's a patch against 6.x, also applies to HEAD with small offset.

This exposes the option via the UI, but we could also make this a silent option (e.g. advanced configuration in settings.php only)

AttachmentSize
toggle-cache-menu-231587.patch3.98 KB

#10

catch - March 9, 2008 - 20:45
Status:patch (code needs review)» active

afaik know bugs get fixed in 7.x then backported. Correct me if wrong.

#11

catch - March 9, 2008 - 20:51
Status:active» patch (code needs review)

#12

pwolanin - March 9, 2008 - 21:24
Version:6.x-dev» 7.x-dev

@catch - sure should deal with HEAD first

here's a radically different, and perhaps better, patch. Implements a 2-level caching scheme so we don't duplicate the stored data, but also don't sacrifice the advantage of caching.

AttachmentSize
2-level-menu-cache-231587-12.patch4.15 KB

#13

pwolanin - March 9, 2008 - 21:55

here's a patch that covers book module too, and has slightly better logic in the cache_set()

AttachmentSize
2-level-menu-cache-231587-13.patch6.32 KB

#14

catch - March 9, 2008 - 22:43

I tested this on a clean-ish site comparing cache_menu entries with and without the patch applied. Without, I get three entries - 26KB for navigation, 47 bytes for my empty secondary/primary links. With the patch applied I get three 36 byte entries instead. So it looks along the right lines to me.

#15

chx - March 10, 2008 - 11:07
Status:patch (code needs review)» patch (code needs work)

Very good trick! But the cache tables should never be queried directly, always use cache_get because cache.inc is pluggable (for eg. memcached).

#16

pwolanin - March 10, 2008 - 11:24

right - also CNW since the new cids don't jive with the cache-clearing code. Both problems should be easily fixable, however.

#17

killes@www.drop.org - March 10, 2008 - 11:25
Status:patch (code needs work)» patch (code needs review)

Thanks chx, new patch attached.

AttachmentSize
2-level-menu-cache-231587-14.patch6.14 KB

#18

killes@www.drop.org - March 10, 2008 - 13:15

Hadn't seen Peter's comment, I changed the CIDs in the cache invalidation code to match the new ones.

AttachmentSize
2-level-menu-cache-231587-15.patch6.5 KB

#19

pwolanin - March 10, 2008 - 16:04
Status:patch (code needs review)» patch (code needs work)

I think for D7 we will want to improve the cache API with something like cache_entry_exists(), but that can come later. It seems silly to have to fetch 80kB of data to see that the row exists in the table.

The cache clearing still needs to be fixed more, I'll try to re-roll later today.

#20

pwolanin - March 11, 2008 - 01:43
Assigned to:Anonymous» pwolanin
Status:patch (code needs work)» patch (code needs review)

ok, here's a working version. Note that ALL cids are back to the form 'links:'. $menu_name . ':something', so our cache clearing code can still work properly using a wildcard against 'links:'. $menu_name . ':'

AttachmentSize
2-level-menu-cache-231587-20.patch6.29 KB

#21

Senpai - March 12, 2008 - 02:20

Tested the patch in #20. Testing scenario:
* * D7 HEAD with all cache tables cleared.
* * Created 10,000 nodes via devel.module in all content-types.
* * Promoted all nodes to 1
* * Ran wget against the localhost site.

Without the patch, the cache_menu table was at 66MiB after the test. With the patch, the result was only 1MiB.

Success!

#22

pwolanin - March 12, 2008 - 02:32

made nodes with Devel and spidered ~450 pages on my localhost with wget (in a race w/ Senpai)

with patch: table size = 0.50 MB

without patch: table size = 11.56 MB

Note this is a bit over-optimistic, since the menus are the same on every page, but does suggest the scale of the problem with the existing core code.

#23

Senpai - March 12, 2008 - 03:00
Status:patch (code needs review)» patch (reviewed & tested by the community)

#24

Dries - March 13, 2008 - 20:32

I had a hard time understanding the code (without reading up on the issue details). Can we take another pass over the code comments?

I suspect this patch might have a negative impact on performance. Is there something better that we can do for Drupal 7?

#25

pwolanin - March 13, 2008 - 21:44
Title:cache_menu: data blobs identical and huge; expiry set to 0 for all» cache_menu: data blobs identical and huge;

@Dries - any performance trade-off should be minimal, since it is usually only a single, simple, extra query, and for most pages we will be writing much, much smaller blobs into the DB so those queries will be faster.

For D7 I'd like to add an extra API function as suggested in #19, but since killes says d.o cannot be moved to D6 without this, I'd like to see some version in D6 and HEAD before mucking with the cache API.

I'll take another pass at the code comments.

#26

pwolanin - March 13, 2008 - 23:09

here's a patch with improved code comments.

AttachmentSize
2-level-menu-cache-231587-26.patch6.92 KB

#27

jakeg - March 14, 2008 - 07:45

surely if an issue is critical as this one is then the patch for 6.* should be made first? I'm happy to test a patch on my live site but its on 6 (obviously) not 7. And I badly need a patch as my database is getting silly big without one, which also causes problems with my automated backups.

#28

Dries - March 14, 2008 - 08:51
Version:7.x-dev» 6.x-dev

OK, thanks Peter. I've committed this to CVS HEAD. Should go in into Drupal 6 as well as this is a pretty bad problem.

#29

catch - March 14, 2008 - 09:26

Just checked and this still applies cleanly to 6.x

#30

wrwrwr - March 22, 2008 - 09:09

Thank you for that patch. My test site (yes, on a cheap, shared hosting ... :) got a suspension warning the next day after 6.1 upgrade, with the menu cache taking over 60% database space on its own. I can confirm this applies cleanly to 6.1 and works.

#31

pwolanin - March 22, 2008 - 16:24

Yes - we really need this in before 6.2. Also, we should probably further examine whether to remove or limit the caching by book module - the "printer friendly" is always going to produce a unique subtree (I think) , so any site with a large set of books and which exposes printer-friendly to anonymous users (i.e. spiders) may have trouble.

#32

DaveT777 - March 24, 2008 - 05:00

OK folks... I'm a real newbie here but I have a pretty good grasp on what to do if given good instructions. I ran into this cache problem last night - I'm running Drupal 6x in a Wamp stack locally first before I commit it to my site. I was just "fishing around" and clicked the "Track" link on the "My History" page and the whole thing locked up tight. No errors or warnings, just a blank screen and a progress bar at the bottom of my browser. Being one whose a sucker for punishment, I got fed up with my browser locking up so I uninstalled the whole thing deleted the database in MySQL - fished around and found those two (huge) files and deleted them. I have since reinstalled Drupal and things are working OK.

I'm glad I found this page! But now my question is: How do I install the patch? Of all the issues I've been reading, this is a big one because I'd hate to have to do the same radical surgery on my site as I did on my local test.

I assume these are the cache files were talking about(?)

ib_logfile0: 10,240 KB
ib_logfile1: 10,240 KB
ibdata1: 18,432 KB

#33

wrwrwr - March 24, 2008 - 08:39

I believe these would be the database files containing everything that is in it; i don't know what the overhead should be, but 10MB may not be so large.

On linux you would just put the patch file in the main drupal directory, open a terminal and from within that directory issue "patch -p0 < 2-level-menu-cache-231587-26.patch". Seems there is a version of the patch utility for windows too: http://gnuwin32.sourceforge.net/packages/patch.htm, but i've never tried that, so maybe you'll need another tool (you may want to check CygWin or MinGW environments too).

#34

catch - March 24, 2008 - 11:13

DaveT777: http://drupal.org/patch has instructions for applying patches, including on windows (easiest using Cygwin).

#35

pwolanin - March 24, 2008 - 11:36

@DaveT777 - that sounds more like a PHP memory issue (RAM) and unrelated to the bug being fixed here. Try tweaking your settings in php.ini.

#36

DaveT777 - March 24, 2008 - 13:36

Thanks for the quick response! I'll give the php.ini tweak a shot first before I look at applying the patch. Is the patch going to be integrated into newer versions of Drupal? If so, and if I don't have problems with the issue you've been discussing, then I'll wait until the new version is released. I'm an "If-it-ain't-broke-don't-fix-it" type of guy.

Thanks for the help!

DaveT777

#37

Gábor Hojtsy - March 25, 2008 - 14:03
Status:patch (reviewed & tested by the community)» fixed

Thanks. Reviewed and committed to 6.x too.

#38

Anonymous (not verified) - April 8, 2008 - 14:11
Status:fixed» closed

Automatically closed -- issue fixed for two weeks with no activity.

#39

unknownguy - July 31, 2008 - 09:02

Will this improvement ever see the light on Drupal-5 ? Is Drupal-5 also affected by the problems outlined in this post ?

#40

wrwrwr - August 16, 2008 - 12:14

I've encountered this on upgrade to 6.1, 5.x probably has not been affected by this issue at all. Got fixed in 6.2 too.

 
 

Drupal is a registered trademark of Dries Buytaert.