Problem/Motivation

The issue involved the menu_mask variable being empty, resulting in WSOD. This sometimes could result in the inability to install D7/D6.

Proposed resolution

The proposed solution in #33 explains the problem occurring when "2 requests come in at exactly the same time and menu_masks variable is not defined", resulting in the lack of a request lock and the variable remaining undefined. Proposed solution in #53 is RTBC.

Remaining tasks

Comments by mikeytown in #57 in response to webchick in #55 to clarify the solution in the patch comment need review.

User interface changes

none

API changes

none

Original report by [username]

I haev seen too many requests to this and now it came to me that there is a way to rebuild without running a query to see whether router is empty.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

chx’s picture

FileSize
534 bytes
moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

looks good to me, and auto-recovers when you get into a nasty situation of your site going permanent WSOD.

chx’s picture

For D6 too.

Gábor Hojtsy’s picture

Why should this happen in the middle of an API function?

chx’s picture

'Cos if the menu build does not happen (or fail) then your masks will be empty -- and this is the only case when your masks can be empty -- so it's a good time and time place to try to recover.

pwolanin’s picture

As an alternative, can we set the variable that says a rebuild is needed and maybe hard code a few masks in for this just this page request?

pwolanin’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
887 bytes

how about this? A quick test shows it works if the masks are deleted/emptied.

alexanderpas’s picture

+1 to #1 nice patch, also fixes #307937: MYSQL Syntax error on installation with pre-filled tables
-1 to #7, that just looks ugly...

chx’s picture

so how do we proceed?

alexanderpas’s picture

how about RTBC on the #1 patch?

robertDouglass’s picture

I've been bitten by this bug. Subscribe. Will review.

webchick’s picture

I can't currently install D7 because of this, which is seriously impeding my ability to commit patches. ;) It'd be nice of chx/pwolanin could have a discussion about the best way to proceed.

alexanderpas’s picture

webchick, did applying #1 fixed it for you?

dropcube’s picture

It's happening for me every time I try to install D7 after the DBTNG patch. Would be good to find what is causing the issue, for what reason the menu rebuild does not happen during a fresh installation. Subscribing.

catch’s picture

I've had this on occasion but not been able to consistently reproduce.

chx’s picture

Give me file level access to some place where I can reproduce something and I will figure out. I can't help a bug I can't see.

pwolanin’s picture

People have seen this on D6 and D7, so it's some kind of race condition I think. But I'm at a loss as to what.

Actually, the D7 code has a flaw that's partly fixable... let me go find it.

pwolanin’s picture

some of the D7 problems may be fixed by this: http://drupal.org/node/238760#comment-1023094

alexanderpas’s picture

chx: see #307937: MYSQL Syntax error on installation with pre-filled tables for a way to forcefully reproduce this bug...

dropcube’s picture

@alexanderpas: I can not reproduce the bug with the instrucctions there, the first step is "instal site normally", how is that if I can not even install Drupal. The error appeared during the installation process.

pwolanin’s picture

I think this is a simpler way to achieve what chx's patch does in #1. With this patch in place, I can variable_del the menu masks via a PHP node and life goes on normally. Without the patch, Drupal dies until I hit update.php or otherwise manually force a menu rebuild.

Patch is against 6.x, though applies to 7.x also with offset.

chx’s picture

Status: Needs review » Reviewed & tested by the community

Masterstroke.

pwolanin’s picture

Assigned: chx » pwolanin
FileSize
839 bytes

with added code comment

alexanderpas’s picture

even better!

alexanderpas’s picture

webchick’s picture

Version: 7.x-dev » 6.x-dev

Ok, I committed this. I've run into this bug before, and it's good to have a fix in. However, I'd LOVE to figure out WHY this happens so we can fix the underlying issue. Talking to Peter and chx though it doesn't sound like it's so easy to determine since it's very erratic.

Moving down to 6.x for consideration.

Gábor Hojtsy’s picture

Status: Reviewed & tested by the community » Fixed

Seconded webchick's comment on the fact that it would be better to know the underlying cause. Committed to 6.x now as a stop-gap solution.

Anonymous’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.

Yura Filimonov’s picture

Since you have marked #307937 as a duplicate of this issue, I'll let you know that:
- I have tried installing Drupal 6.6, downloaded recently on WAMP 2.0c
- during the installation, I get the MySQL syntax error described in #307937
- if the installation is successful (happened once), after the site times out on anything, the menu isn't completely built

My cookies are enabled and PHP memory limit is set to 128Mb.

Anyone confirmed that the problem is completely fixed? Judging by posts elsewhere, it is not.

Thanks.

bkline’s picture

Version: 6.x-dev » 6.19
Status: Closed (fixed) » Needs work

This bug is still present in the most recent release version of Drupal 6. It doesn't happen every time, but I've had it occur twice: once with 6.4 and once with 6.19, both on Ubuntu 10.10. Cookies are definitely enabled. The first time it happened I was installing from the O'Reilly Using Drupal pre-packaged archive. I found a posting from someone who reported having solved the problem by replacing that archive with the latest production release of Drupal 6. So I pulled down 6.19 and tried again. Same problem. I found another posting that said installation had failed using Chrome as the browser but succeeded with Firefox. So I cleared out the settings.php file and the database and tried again using Firefox. This time it succeeded. However, I was skeptical that the browser could actually be the problem so I did some additional testing using fresh downloads of 6.19 and was able to get successful installations using either browser. So it's much more likely to be a timing problem. Yes, I know that makes the bug harder to reproduce, and therefore harder to fix. But it's still a bug which hasn't been fixed. If you do some googling, you'll see plenty of postings from frustrated users struggling with the bug well after this issue was marked as "closed (fixed)" two years ago. I've changed the "Version" field from "6.x-dev" to "6.19"; there's no indication in this interface whether the "Version" field is intended to refer to the version in which I'm reporting the problem, or a target version for the fix. Apologies if I guessed wrong. If the setting to "6.x-dev" was intended to mean "we fixed this in the development branch two years ago but we have no intention of ever deploying the fix in the released version of the system" then the bug is in the process, not the code. It's hard to believe the Drupal development community would really expect all of its users to dig through the bug database and apply all of the approved patches. And I'd also like to think that comment #26 isn't just being ignored.

mgriego’s picture

I ran into this one just now while I was attempting an install of Open Atrium. I figured it was an issue with Atrium's install profile, until I tried to do a "standard" installation and failed there too. When I searched d.o, I saw #280015: cannot install drupal - menu_router error. Comment #40 mentioned the system date. I was installing onto a VMware guest that had been suspended for a long time, and ntpd had died at some point. My system date was about 40 days in the past. Once I resynced my system time, the install went off with no problems.

mikeytown2’s picture

Title: Menu first rebuild does not happen » menu_masks variable is empty (race condition)
Version: 6.19 » 6.x-dev
FileSize
3.46 KB

The Error

User warning: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ') ORDER BY fit DESC LIMIT 0, 1' at line 1 query: SELECT * FROM menu_router WHERE path IN () ORDER BY fit DESC LIMIT 0, 1 in _db_query() (line 169 of includes/database.mysqli.inc).

How to trigger error:
2 requests come in at exactly the same time and menu_masks variable is not defined. menu_execute_active_handler() runs for both of them. In both cases the menu_masks variable doesn't exist thus menu_rebuild() will run on both. The first request to get a lock will complete fine. The request that doesn't get a lock will wait and not run the _menu_router_build() function; this function sets the menu_masks variable, but in the request that didn't acquire the lock, the variable is still not defined. Thus later on when menu_get_ancestors() is ran, the foreach loop (in this function) is not ran and we get 2 empty arrays returned causing the SQL query to blow up on us inside of menu_get_item().

Possible Solutions:

  • Have $write_database = FALSE like we do in module_rebuild_cache() and system_theme_data() and duplicate work.
  • Read from the database for the menu_masks variable and set it in the $conf global (thus allowing variable_get to work correctly in this request).
  • If the menu_masks variable is empty provide an array that goes from 255 down to 1.
  • Have an alternate foreach loop in menu_get_ancestors that checks for a mask everywhere.

My Opinion:
I think we should read from the database and set the menu_masks variable after lock_wait runs inside of menu_rebuild. If for some reason that variable doesn't exist we will still be ok because inside of menu_get_ancestors we will have an alternate foreach loop that doesn't require this variable. Patch attached doesn't contain the alt menu_get_ancestors logic, only a place holder for it.

Thoughts?

mikeytown2’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, drupal-261148-33-menu_masks-race-condition.patch, failed testing.

mikeytown2’s picture

Status: Needs work » Needs review
FileSize
1.89 KB

Removed the menu_get_ancestors placeholder logic as that could be causing issues.

chx’s picture

Why is this set to D6 (and not D8 and backported down)?

Congratulations on the detective work! But #36 makes me want to cry like a baby. That's sooo fugly. Can't we just throw an 503 from menu_get_ancestors? If you run into this when called from _menu_find_router_path then you are in so, so deep trouble.

noslokire’s picture

Man I am happy we are have now surrounded this bug after 8 months of digging. Nice work Mike.

Good news, I think you in the right place.
Bad news, applied patch didn't correct our issue.

Good news, we see this a couple times per day randomly across our network of sites so there are plenty of chances to test the approach in #37 or nail down a better approach to #36.

Thanks for the detective work on this!

mikeytown2’s picture

Here's an alt approach that is similar to #7. We will test and report back if this fixes the issue for us. I don't feel like pulling apart the bitwise math in the foreach loop so I'm trying the alt option here. Yeah it slightly slower but this happens about once every 500k requests for us. Patch attached.

chx’s picture

Thanks for your persistence despite my slightly scathing review. That approach is indeed looking better, but I think range(1, 256) would be better. Perhaps with a comment that MENU_MAX_DEPTH is 9 so we need 9 bits at most?

mikeytown2’s picture

Assigned: pwolanin » Unassigned

#39 fixes this bug on our sites.

Code wise the array needs to start from 255 and count backwards. 255 comes from the fact that the path column in the menu_router table is varchar(255). Something like this would work instead of the hard coded array if we want to go that route.

array_reverse(range(1, 255));
chx’s picture

Reading http://php.net/range

Prior to PHP 4.1.0, range() only generated incrementing integer arrays. Support for character sequences and decrementing arrays was added in 4.1.0.

So range(511,1)

Why 511? What this masks things does, unless I completely forgot what I coded some years ago :) is that we map node/%/foo/%/bar to 10101 and the max depth is 9. So there can not be more than 9 bits. Which means we need 511 not 256. Correct me if I am wrong. (The leftmost bit is always 1 because we do not support %/foo so the binary mapping always work.)

mikeytown2’s picture

Patch against D8, D7 & D6 using chx's advice from #42

Status: Needs review » Needs work

The last submitted patch, drupal-menu-261148-43-menu_masks-race-condition-D6.patch, failed testing.

mikeytown2’s picture

Status: Needs work » Needs review
FileSize
669 bytes

Man I wish the test bot was a little smarter. Now with only the D8 patch attached. D7 & D6 patches attached in #43

chx’s picture

FileSize
707 bytes
mikeytown2’s picture

This test might fail on your patch.
MenuRouterTestCase::testMenuGetItemNoAncestors

mikeytown2’s picture

Status: Needs review » Reviewed & tested by the community

@chx
Is the default parameter getting removed in D8 for variable_get? If not, what is the advantage of your way? In either case it's doing what it should so I will yield to CHX on this point. Calling this RTBC.

catch’s picture

Status: Reviewed & tested by the community » Needs work

The difference with #46 is that we avoid calling range() on every request - instead that will only happen if nothing is returned from variable_get() - chx pointed this out to me in irc when I asked the same thing, it's easy to forget the variable_get() default actually gets called either way.

That's a good reason to add a comment here though explaining why it's doing that.

chx’s picture

Status: Needs work » Needs review
FileSize
819 bytes
catch’s picture

Status: Needs review » Reviewed & tested by the community

Thanks :)

catch’s picture

Version: 8.x-dev » 7.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)
Issue tags: +Needs backport to D7

Committed/pushed to 8.x, moving to 7.x for backport. Backport should be straightforward so marking novice for that.

mikeytown2’s picture

Patches for D7 & D6 based off of #50. The D6 patch also takes care of some white space issues.

aspilicious’s picture

Status: Needs review » Reviewed & tested by the community

7.X patch is exactly the same :)

webchick’s picture

Version: 7.x-dev » 8.x-dev
Priority: Critical » Normal
Status: Reviewed & tested by the community » Needs work

Damn. Nice sleithing! Committed and pushed to 7.x for parity with 8.x. That should also take care of the critical.

However, this comment doesn't explain the code below it. Or at least not to me.

+++ b/includes/menu.incundefined
@@ -321,7 +321,14 @@ function menu_get_ancestors($parts) {
+  // If the optimized menu_masks array is not available use brute force to get
+  // the correct $ancestors and $placeholders returned. Do not use this as the
+  // default value of the menu_masks variable to avoid building such a big
+  // array.
+  if (!$masks) {
+    $masks = range(511, 1);

How does range(511, 1) "use brute force to get the correct $ancestors and $placeholders returned"? Can we clarify this comment more? We need something along the lines of #40-#42. range() normally goes "little to big," and it's not clear what significance the number 511 has.

abautu’s picture

I'm using this on D6, but I changed
$masks = range(511, 1);
to
$masks = range($end, 1);
since values larger than $end will be ignored anyway in the foreach loop.

mikeytown2’s picture

I think I've explained this better and I've also used the tip from abautu in #56

The 511 comes from (2^9) - 1. 9 comes from MENU_MAX_PARTS.

Patches for D8, D7, and D6. The D6 patch also takes care of some white space issues.

mikeytown2’s picture

Status: Needs work » Needs review
BrockBoland’s picture

Needs issue summary

h4rrydog’s picture

Updated issue summary.

h4rrydog’s picture

Issue summary: View changes

Updated issue summary.

Status: Needs review » Needs work

The last submitted patch, 57: drupal-261148-57-menu_get_ancestors-D8.patch, failed testing.

alansaviolobo’s picture

Issue summary: View changes
Issue tags: +Needs reroll
mikeytown2’s picture

Version: 8.0.x-dev » 7.x-dev

No longer needed in D8; RouteProvider::getCandidateOutlines() seems like this is where the code landed and it uses $end instead of 511 so this change has been incorporated in D8.

mikeytown2’s picture

Status: Needs work » Needs review
FileSize
1.16 KB

Patch from #57 applies cleanly to latest D7 so this is essentially the same patch. Will post the D6 one once this goes green.

catch’s picture

#356399: Optimize the route rebuilding process to rebuild on write is related and specifically fixes on race condition where the variable gets empty after a lock_wait() on menu_get_item(), this might be a duplicate.

joelpittet’s picture

Should this be a duplicate or should we re-open #356399: Optimize the route rebuilding process to rebuild on write for D7?

  • webchick committed 34ec8f9 on 8.3.x
    #261148 by chx and pwolanin: Allow recovery from a failed menu build.
    
    
  • catch committed d79b935 on 8.3.x
    Issue #261148 by mikeytown2, chx, pwolanin: Fixed menu_masks() variable...

  • webchick committed 34ec8f9 on 8.3.x
    #261148 by chx and pwolanin: Allow recovery from a failed menu build.
    
    
  • catch committed d79b935 on 8.3.x
    Issue #261148 by mikeytown2, chx, pwolanin: Fixed menu_masks() variable...

  • webchick committed 34ec8f9 on 8.4.x
    #261148 by chx and pwolanin: Allow recovery from a failed menu build.
    
    
  • catch committed d79b935 on 8.4.x
    Issue #261148 by mikeytown2, chx, pwolanin: Fixed menu_masks() variable...

  • webchick committed 34ec8f9 on 8.4.x
    #261148 by chx and pwolanin: Allow recovery from a failed menu build.
    
    
  • catch committed d79b935 on 8.4.x
    Issue #261148 by mikeytown2, chx, pwolanin: Fixed menu_masks() variable...
vijaycs85’s picture

#65 still applies to current 7.x.

  • webchick committed 34ec8f9 on 9.1.x
    #261148 by chx and pwolanin: Allow recovery from a failed menu build.
    
    
  • catch committed d79b935 on 9.1.x
    Issue #261148 by mikeytown2, chx, pwolanin: Fixed menu_masks() variable...