We want to optimize (= speed up) how menu_router_build() writes information into the menu_router table.
This will speed up not only explicitly requested menu rebuilds, but also content type configuration form submits, views configuration submits, and other things.

This issue going into D8 is a requirement for
#1170278: Introduce hook_menu_router_update()

menu_router vs menu_links

A menu rebuild consists of two parts (in D6 / D7):

  1. menu_router_build(), dealing with the menu_router table.
    • Invoke hook_menu() on all modules. -> not discussed here.
    • Update the menu_router table, with the information from hook_menu().
  2. _menu_navigation_links_rebuild(), dealing with the menu_links table.
    -> discussed elsewhere

All of these need optimization, but the second one is dealt with in another issue,
#1010480: Optimize _menu_navigation_links_rebuild()

In this issue we only discuss how the data is saved into the menu_router table.

Drupal versions: D6 big speedup, D7/D8 do also benefit

The improvement has the biggest benefit for Drupal 6.
However, also D7 and D8 can benefit.

Diff-based optimization approach

In original D6, menu_router_build will erase the entire menu_router table, and then rewrite it row by row.
In original D7 / D8, menu_router_build will also erase the table, but then rewrite it in a few big queries, each writing a bunch of rows. This is faster than D6, but can still be improved.

The patches (see below) does a big select query, to compare the table contents with the new data and build a diff (rows to insert, rows to update, rows to delete).
In an average case, this diff will be just a few rows.
It will then execute the diff.

The diff data can be made available to other processes, such as _menu_navigation_links_rebuild(). This is to be discussed in another issue.
#1170278: Introduce hook_menu_router_update()

Memory considerations

It was argued that the diff-based approach could have an unacceptable cost in memory, by loading the full menu router with a big SELECT. This would affect both PHP memory and SQL memory.

However: There are already other situations in Drupal where we load the full menu router table, or something as big as it:
- During menu rebuild, we do a module_invoke_all() on hook_menu(). The data generated this way will become the new menu_router, so it has more or less the same size.
- Building the admin menu (contrib) will load something as big as menu_router (both SQL and PHP memory). The same, if we want to show a fully expanded system navigation menu.
- The form for editing the admin menu or system navigation menu. Building this form needs to load a menu with around as many items as the menu_router table.

Benchmarking: Distinguish different parts of menu rebuild.

We need to make sure to benchmark the different parts of menu rebuild separately.
While the above patch does optimize what it's designed for by sth like factor 10, it does have no effects on invocation of hook_menu, or on rebuild of menu_links. So finally the menu rebuild might be around 1/3 faster.

Original text (quite out of date)

As can be seen in various discussions on this site, the function menu_router is a major cause of trouble. See
menu_link_save making multiple (60+) duplicate queries on admin/build/modules
menu_rebuild causes site to lock up
No-op and slow queries during menu rebuild
etc, or try a google search to find more people complain (some of them being duplicates, if I remember right).

Looking at query logs and stack traces, it seems to me that the implementation is far from optimal. For instance, _menu_update_parental_status is called tons of times, and each call causes a SELECT query.

Now, this my naive idea of a decent implementation:
- One big query to load the menu_links and menu_router tables into php arrays.
- Do some computations (without any database queries!!).
- Write the changes to the DB, if any.
- Add some locking mechanics to prevent other requests from doing a menu_rebuild at the same time, or whatever other bad things might happen. But, please don't wait for the locking framework for long operations.

This would save us a lot of queries for this frequent operation.

Thanks,
donquixote

Files: 
CommentFileSizeAuthor
#277 menu.patch3.85 KBdanblack
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu_155.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#273 512962-drupal.menu_router_build.SELECT.d6.test-me-273.patch6.09 KBj0rd
PASSED: [[SimpleTest]]: [MySQL] 190 pass(es).
[ View ]
#267 512962-POST-APOCALYPSE_version_a.patch4.13 KBxjm
PASSED: [[SimpleTest]]: [MySQL] 33,774 pass(es).
[ View ]
#267 512962-POST-APOCALYPSE_version_b.patch4.15 KBxjm
PASSED: [[SimpleTest]]: [MySQL] 33,759 pass(es).
[ View ]
#266 drupal._menu_router_save.d8.clean_.02.patch4.13 KBdonquixote
PASSED: [[SimpleTest]]: [MySQL] 32,975 pass(es).
[ View ]
#265 drupal._menu_router_save.d8.clean_.01.patch4.11 KBdonquixote
PASSED: [[SimpleTest]]: [MySQL] 32,985 pass(es).
[ View ]
#261 drupal._menu_router_save.d8.clean_.patch4.56 KBdonquixote
PASSED: [[SimpleTest]]: [MySQL] 32,989 pass(es).
[ View ]
#253 drupal.menu_router_build.SELECT.d8.static_query.patch4.71 KBdonquixote
PASSED: [[SimpleTest]]: [MySQL] 33,539 pass(es).
[ View ]
#246 drupal.menu_router_build.SELECT.d8.for_real.patch4.74 KBdonquixote
PASSED: [[SimpleTest]]: [MySQL] 33,546 pass(es).
[ View ]
#246 drupal.menu_router_build.SELECT.d8.drush-benchmark.patch5.51 KBdonquixote
PASSED: [[SimpleTest]]: [MySQL] 33,559 pass(es).
[ View ]
#244 drupal.menu_router_build.SELECT.d8.patch4.74 KBdonquixote
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal.menu_router_build.SELECT.d8.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#243 drupal.menu_router_build.SELECT.d8.drush-benchmark.patch5.51 KBdonquixote
PASSED: [[SimpleTest]]: [MySQL] 33,553 pass(es).
[ View ]
#236 drupal.menu_router_build.SELECT.d8.drush-benchmark.patch5.51 KBdonquixote
PASSED: [[SimpleTest]]: [MySQL] 29,856 pass(es).
[ View ]
#228 drupal.menu_router_build.SELECT.d6.test-me.patch6.09 KBdonquixote
PASSED: [[SimpleTest]]: [MySQL] 190 pass(es).
[ View ]
#229 drupal.menu_router_build.SELECT.d6.test-me.patch6.09 KBdonquixote
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal.menu_router_build.SELECT.d6.test-me_0.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#227 menu_router_build.SELECT.d6.test-me.patch6.09 KBdonquixote
PASSED: [[SimpleTest]]: [MySQL] 190 pass(es).
[ View ]
#226 menu_router_build.SELECT.D6.testme.patch6.09 KBdonquixote
PASSED: [[SimpleTest]]: [MySQL] 190 pass(es).
[ View ]
#225 menu_router_build.SELECT.D6.patch6.09 KBdonquixote
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu_router_build.SELECT.D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#223 drupal.menu_router_build.SELECT.d6.patch6.09 KBdonquixote
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal.menu_router_build.SELECT.d6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#223 drupal.menu_router_build.SELECT.d6.drush-benchmark.patch6.9 KBdonquixote
FAILED: [[SimpleTest]]: [MySQL] Drupal installation failed.
[ View ]
#214 menu-router-build-optimize-512962-214-D6.patch14.33 KBaspedia
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu-router-build-optimize-512962-214-D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#205 menu_router_D6.patch10.92 KBAndrew Answer
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu_router_D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#193 menu-router-build-optimize-512962-193-D6.patch14.44 KBaspedia
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu-router-build-optimize-512962-193-D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#191 menu-router-build-optimize-512962-191-D6.patch8.55 KBaspedia
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu-router-build-optimize-512962-191-D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#190 menu-router-build-optimize-512962-190-D6.patch7.96 KBaspedia
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu-router-build-optimize-512962-190-D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#186 menu-router-build-optimize-512962-186-D6.patch6.51 KBaspedia
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu-router-build-optimize-512962-186-D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#184 menu-router-build-optimize-512962-184.patch6.51 KBaspedia
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu-router-build-optimize-512962-184.patch.
[ View ]
#166 drupal-512962-166-D6.patch12.13 KBmikeytown2
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-512962-166-D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#143 drupal-512962-143-D6.patch11.38 KBmikeytown2
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-512962-143-D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#130 drupal-512962-D6.patch10.38 KBmikeytown2
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-512962-D6_2.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#128 drupal-512962-D6.patch7.81 KBmikeytown2
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-512962-D6_1.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#122 drupal-512962-D6.patch7.42 KBmikeytown2
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-512962-D6_0.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#109 drupal-512962-D6.patch6.73 KBmikeytown2
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-512962-D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#94 issue-512962-menu_router-3.patch3.42 KBdonquixote
Failed on MySQL 5.0 ISAM, with: 9,647 pass(es), 7,024 fail(s), and 1,366 exception(es).
[ View ]
#91 issue-512962-menu_router-1.patch3.66 KBdonquixote
Unable to apply patch issue-512962-menu_router-1.patch
[ View ]
#89 issue-512962-menu_router.patch3.53 KBdonquixote
Failed to install Drupal on MySQL 5.0 ISAM.
[ View ]
#86 issue-512962-menu_router.patch3.58 KBdonquixote
Invalid patch format in issue-512962-menu_router_1.patch.
[ View ]
#79 menu_router.patch.txt3.52 KBdonquixote
#81 issue-512962-menu_router.patch3.57 KBdonquixote
Invalid patch format in issue-512962-menu_router_0.patch.
[ View ]
#80 issue-512962-menu_router.patch3.52 KBdonquixote
Invalid patch format in issue-512962-menu_router.patch.
[ View ]
#71 querylog.png304.6 KBpicardo
#61 menu_router.patch11.25 KBdonquixote
Invalid patch format in menu_router.patch.
[ View ]
#34 drupal-menu-router-rebuild.patch11.41 KBdonquixote
Invalid patch format in drupal-menu-router-rebuild.patch.
[ View ]
#32 drupal-menu-rebuild.patch35.7 KBdonquixote
Invalid patch format in drupal-menu-rebuild.patch.
[ View ]
#27 menu.rebuild.draft_.inc_.txt28.27 KBdonquixote

Comments

noticed a typo..

As can be seen in various discussions on this site, the function menu_rebuild is a major cause of trouble.

menu_rebuild() is generally not the cause for trouble on the module page. It is expensive, but not expensive enough to cause a 30s delay. The true causes are: (1) modules that do silly things in hook_menu() and other cache flush, (2) setup issues.

That said, if we could store the whole menu tree in memory, menu_rebuild() could be easily rewritten. But this tree can be of any unreasonable size, and we really don't want to do that.

Ok, I see your point.
On the other hand, on most systems the number of menu items will be manageable. And where they are not, the menu_rebuild will need even more queries...

So, what I could imagine:
- Count the number of menu items. If they are not too many, use the fast and simple rewrite. This would do a big favour to the average Joe.
- If there are too many menu items in total, the function could try to process one menu after the other. (I'm not sure if this would work)
- If a single menu has too many menu items, use the old method, or some whatever optimized version.

donquixote:
I don't know whether this is technically possible, but I definitely think that it should be explored at least. I know I thought the "DELETE then INSERT" pattern leaves something to be desired, especially if only a few menu items are changing.

Maybe there's some way that we could save the full menu_rebuild for things like the modules page submit handler, but do a lesser form of rebuild when saving a view or a panel (or even a content type). The lesser form of rebuild wouldn't have to delete the whole thing, but would just insert or delete a few rows based on parameters passed from the calling module as to what paths should be registered or removed.

I wonder what merlinofchaos would think about that. I know at least on my site that Views and Panels (2) were causing the biggest issues with race conditions on menu_rebuild. I haven't tried Panels 3 enough yet to know whether that is going to still be a problem on Panels 3.

Damien Tournoud said,

menu_rebuild() is generally not the cause for trouble on the module page. It is expensive, but not expensive enough to cause a 30s delay.

I disagree. Ok, my module page doesn't load 30 seconds, but it's still slow, 4650 queries in 2243.19 milliseconds (and used to be slower than that).

The big majority of queries (I am using devel query display) happen inside menu_router. Some identical SELECT queries happen 50 times.

Some time ago I had the biggest problem with _menu_router_build, which is also part of menu_rebuild, see http://drupal.org/node/302638#comment-1609162. Switching to MYISAM solved the issue for me, but I really don't see this as a clean solution.

The true causes are: (1) modules that do silly things in hook_menu() and other cache flush, (2) setup issues.

Which modules would that be, and what are the silly things they can do? Any examples?

Maybe I could say more useful things if someone could explain to me what exactly the menu_router table is supposed to do, and how it relates to menu_links.. can I read this up somewhere?

One more question.
If the menus can grow so big, then what about the admin/build/menu-customize/* list of menu items? I don't see a LIMIT in the sql used to load the list.

donquixote:
The {menu_router} table, confusingly enough, has very little relation to how most people conceive of a menu (that is, a list of links displayed on a website). Rather, the {menu_router} table is what dispatches all page requests on a Drupal site, since all page requests are handled by index.php. The 'q' query string parameter, such as mydrupalsite.org/index.php?q=admin/build/modules, is known as the menu path and Drupal has to look in the {menu_router} table to see what function will handle display of it.

Anyway, that's my understanding of the Drupal menu system. Someone with more knowledge can probably give you a more detailed answer.

Ok then, but what is the role of the menu_router table in a menu_rebuild?
As I understand, the entries of menu_router are used to update the menu_links table, but how? I could try to understand all the code, but would be nicer if there would be a public explanation somewhere.

I think the menu_router table is used to figure out which paths are a best fit for a particular page request. But I'm not entirely sure... All I know is that if you DROP the menu_router table your site will be toast.

I think that the menu_links table might only be used for paths that are actually in a public menu, such as Primary Links, whereas menu_router handles *all* paths.

donquixote: http://drupal.org/node/102338 and then includes/menu.inc. Note that this is not trivial code (this doesn't mean that there may not be some ways to furhter optimize menu rebuilding).

- One big query to load the menu_links and menu_router tables into php arrays.

Of course... just make sure you buy a LOT of RAM because menu_links is designed to hold hundreds of thousands of links.

Much rather, I agree with Merlin -- let's throw out menu rebuild, provide CRUD and let modules do their work through that. menu links already have a crude CRUD.

What would be necessary to "throw out menu_rebuild & provide CRUD"? It sounds like a good approach to me; I also think that the idea of manipulating menu_rebuild as a PHP array is absurd on even medium-sized sites. Is there an issue where merlin mentions this as an idea?

Of course... just make sure you buy a LOT of RAM because menu_links is designed to hold hundreds of thousands of links.

Sorry, but.. we are already doing it! At least per menu. Look at:
- Create Content > Menu settings > Parent item. There you have a full list of menu items. At some point this list (at least the html of the select tag) will be completely in memory.
- Site building > Menus > Navigation > List items. Again, we can assume that the html of that list will be completely in memory.
- And of course, any menu block can be set as expanded (especially if it's some sort of suckerfishy on-hover thing).

So, if we would manage to process one menu at once, and only keep a minimum collection of data per menu item in memory, I don't see why it should be such a problem.

That said, I don't deny there might be different and better ways to reduce the number of queries which happen in menu_rebuild. I suspect a lot of redundancy in the multiple calls to _menu_update_parental_status. If the menu_rebuild would use a custom private function instead of menu_link_save, we could also use a different function to bulk fill the p1, p2, p3 etc in a more efficient way.

I know core has a few places which do this, however if you look at them carefully, you will see a variable_get around the panel choosers... #191360: Scalable menu selector

Version:6.x-dev» 7.x-dev

Also, features go into D7. If you want to help, however, then #484234: Big task cleanup: Remove type and tasks from hook_menu() needs to happen first, then we can do more.

donquixote: I don't think you're quite understanding yet; menu_router contains more than just items from user customized menus. It contains *all* the possible path combinations that exist on the site. So it's a larger list to load in memory, and it's not necessarily as chunkable because the items might not be part of a user-defined menu.

chx: Sounds good. I can't really help code, as this issue is beyond my ability, but I can definitely review to a certain extent.

Ok, so it's actually the menu_router table having this sometimes evil number of rows? Or both?

I checked with xdebug, and found the following:
- There are different functions that can slow down a site. However, menu_rebuild is called in many requests on the admin backend, and so it has a big share in slowing things down. For instance, on "add content type" or "add field" operations with CCK, it makes around 500ms on my system.
- Inside the function we have menu_router_build (around 170ms) and _menu_navigation_links_rebuild (around 300ms). By default, the latter one does not touch the menu_router table, unless other modules would throw in their own functionality.

What happens inside menu_router_build?
- menu_router_build looks for implementations of hook_menu in all the modules, collects them in an array of $callbacks (all in memory!!), and then calls _menu_router_build, with $callbacks as an argument.
- _menu_router_build generates a $menu array (all in memory!!) from the information in $callbacks, then deletes the entire menu_router table, then fills it with data from the $menu array.

What happens inside _menu_navigation_links_rebuild?
- Most of the time in _menu_navigation_links_rebuild is spent in multiple calls to menu_link_save (around 1ms each).
- Inside menu_link_save, we typically have two direct DB queries, and one call to _menu_update_parental_status. The queries use only menu_links, not menu_router.
- _menu_navigation_links_rebuild takes an argument called $menu, which is generated in menu_router_build, and basically contains the full menu_router information.

---------------------

Conclusion I (regarding menu_router):
- menu_router doesn't make the RAM explode. Or if it does, it already happens in the current implementation.
- deleting the entire table contents of menu_router is definitely too heavy.
- The entire first part (menu_router update and cleanup) can be optimized independently of the menu_links table.

Conclusion II (regarding menu_links):
- menu_links could sometimes be too big for RAM, although there are examples in other places in the Drupal code, where the entire table is in fact loaded into RAM (as I explained in a previous post). menu_links can be chunked in different ways.
- menu_link_save and _menu_update_parental_status are designed for one-time use. For processing and saving tons of menu items, we should find a better solution.
- The entire second part (menu_links update and cleanup) can be optimized independently of the menu_router table.

I had another close look at the code.

In _menu_navigation_links_rebuild in menu.inc:

<?php
   
foreach ($menu_links as $item) {
     
$existing_item = db_fetch_array(db_query("SELECT mlid, menu_name, plid, customized, has_children, updated FROM {menu_links} WHERE link_path = '%s' AND module = '%s'", $item['link_path'], 'system'));
      if (
$existing_item) {
       
$item['mlid'] = $existing_item['mlid'];
       
// A change in hook_menu may move the link to a different menu
       
if (empty($item['menu_name']) || ($item['menu_name'] == $existing_item['menu_name'])) {
         
$item['menu_name'] = $existing_item['menu_name'];
         
$item['plid'] = $existing_item['plid'];
        }
       
$item['has_children'] = $existing_item['has_children'];
       
$item['updated'] = $existing_item['updated'];
      }
      if (!
$existing_item || !$existing_item['customized']) {
       
menu_link_save($item);
      }
    }
?>

$menu_links is extracted from the $menu array which was generated in menu_router_build.

The following things can speed this up:
- First of all, if $menu_links would only contain things that have changed, then we'd have a lot less iterations in the foreach. To get there, we would have to change the menu_router operations.
- Do more checks on the $item vs $existing item, to avoid unnecessary calls to menu_link_save
- Replace menu_link_save with something custom: Collect all the items to save in memory (it won't be bigger than what's already in $menu_links, so no problem), then find a nice algorithm to store them all at once, including an update to the parental status, where necessary.

Thanks for this research, donquixote. Looks like I stand corrected, at least in part. I just assumed that since menu_router was the larger table in terms of rows that it was the more expensive part of the query.

I understand that the standard procedure is to fix things in 7.x and then backport, but I feel like that procedure is very difficult to follow in cases like this for the following reasons:

1) The issue is, as has been said on several issues (most of which are now marked duplicate), the biggest performance bottleneck in 6.x. 6.x has at least a year of life ahead of it, if not more, and we need a fix for this issue in 6.x.
2) Very few people know how to use DBTNG syntax vs. the number of people who know how to write SQL statements.
3) Any DBTNG code would have to be converted back into SQL statements for 6.x.
4) The performance implications are much more dramatic when you have a significant number of modules installed, which is not the case for anyone on 7.x.

My proposal would be that we work on a fix for 6.x, performance test it, and then port it forward to 7.x, although the Version can remain set to 7.x-dev as per the standard. I know this is unconventional, but I think it's been done on previous issues, especially critical performance-related ones.

Yes!

There is another bottleneck, which is related. It happens when the admin_menu is rebuilt.
Without these two bottlenecks, adding CCK fields and content types would be MUCH faster.

The function in question is _admin_menu_rebuild_links, in modules/admin_menu/admin_menu.inc.
Most of the time is spent on admin_menu_link_save, which in turn calls menu_link_save.

About memory issues: admin menu is a suckerfishy thing, so it is proven to fit into memory. (and the _admin_menu_rebuild_links has it all in memory as well).

What does _admin_menu_rebuild_links do?
- It uses menu_router_build to get the data from hook_menu implementations. Fortunately, this information is still in memory cache (static var in menu_router_build), so this call to menu_router_build is really a cheap one.
- Most of the time is spent on multiple calls to admin_menu_link_save, which then again calls menu_link_save.

Conclusion:
- _admin_menu_rebuild_links suffers from the same problems as the core menu_rebuild, and offers the same possibilities for optimization.
- If we implement bulk menu update functionality in core, it will be a good idea to make this functionality available to other modules, such as _admin_menu_rebuild_links.

As for precedent for starting in 6.x and then porting forward, that's what happened in the earlier stages of #251792: Implement a locking framework for long operations (around #58-71). Unfortunately, that issue seems to have stalled recently and a locking framework might not even be necessary for menu_rebuild if the function was reworked.

Status:Active» Closed (won't fix)

Sorry guys, I have linked the issues that need help, pointed out a roadmap. Before that (well before that) I wrote plenty of handbook pages about the menu system. This issue where people demonstrate how much they do not understand the menu system is not going to further anything.

@chx:
It is true, I only begin to understand the menu system, and it was even worse when I started this thread. What I do understand, however, is that nobody likes slow pages and bottlenecks. I will never have the time to understand all of Drupal's code, but I also know that a lot of the successful things people have been doing ever since were based on incomplete information. Looking for performance bottlenecks and possibilities for optimization does not always require a complete understanding of the system.

I agree that my topic starter was too quick and too naive. I no longer argue for doing it all in memory. But, with the latest posts I think I'm not so far away.

We are dealing with a heavy bottleneck, and even a temporary quick solution would help a lot. There are tons of realworld D6 sites out there, with realworld performance requirements, which sometimes make working solutions more important than roadmaps - that's how I see it.

About your "need help" issue ('big task cleanup'): I don't see a conflict with the optimization possibilities I am proposing here, but maybe I'm again missing something. I guess (really can't tell much, it's not very well explained) it makes perfect sense from a D7 point of view, but doesn't help much for existing D6 sites. But maybe that's again because of my insufficient understanding of the menu system.

so, we need to clean up tabs to get rid of any relations between two router elements and when that's done then throw out menu rebuild and create a CRUD instead.

@chx: What does this mean for existing D6 modules using hook_menu to define tabs? Will they still work as expected?

And what about modules that look into the tables directly?
Right now I find the following modules in my install doing SQL queries on menu_router (searching for "{menu_router}"):
book, devel, fast_gallery, i18n, menu, system
None of them uses tab_parent in the SQL query, as far as my simple search can tell.

Of course, this simple analysis doesn't tell much. Do you think it will be a safe move on D6?

StatusFileSize
new28.27 KB

I prepared a draft implementation for menu_router, which should solve the performance issues.
Please note that it is not at all supposed to work out of the box. It still needs some code review and testing + bugfixing.

Some of the optimization ideas are still not implemented, mainly because of my insufficient understanding of the menu system. I only implemented those ideas that appeared to be safe to me. Especially the second part might still have some potential of optimization.

About chx' roadmap:
I don't think that we have a conflict here. The rewrite is meant as a temporary solution, and does not close any doors towards tab_parent / tab_root or CRUD.

<?php
/**
* DRAFT IMPLEMENTATION
*
* This is a draft implementation for a proposed rewrite of menu_rebuild()
* and the mechanics around it.
* See http://drupal.org/node/512962, "Rewrite the function menu_rebuild"
*
* The proposed implementation does not imply any changes in the DB structure
* or the API specifications. All public functions remain backwards compatible.
*
* About memory limitations:
* - menu_links can be too big for RAM.
* - menu_router was not too big for RAM in the old algorithm.
* - We make sure that whatever we load into RAM is not much bigger than
*   menu_router.
*
* Estimated effects:
* - Memory consumption not much worse than the old implementation.
* - Average case number of SQL queries heavily reduced.
* - Average time for menu_rebuild heavily reduced.
*
* The implementation is supposed to be split in 3 files,
* to reduce average bootstrap time:
* - "menu.inc" with the public functions
* - "menu.rebuild_router.inc" working on hook_menu and menu_router
* - "menu.rebuild_links.inc" working on menu_links
*
* The code is not (yet) tested, except that
* - Eclipse does not find syntax errors.
* - Including the file with require_once does not kill Drupal. (the two
*   public functions need to be renamed to avoid name clashes)
*/
?>

Status:Closed (won't fix)» Needs work

@donquixote: please publish a proper patch, and do not move functions around (it is probably pointless anyway). If you don't, your patch will be unreviewable and, as a consequence, will remain unreviewed.

@Damien Tournoud:
I would love to submit a decent patch, but this will take more time:
- I assume you want a D7 patch. I don't even have D7 installed.
- I don't have CVS or diff installed right now.
- Not all the ideas made it into the draft implementation, because I'm not yet finished with understanding the menu system.
- Before I continue working on this, I would like to have more information about the planned menu CRUD and its implications on existing D6 modules using hook_menu. Maybe I should even have a brainstorm chat with someone..

The main idea of the draft was to have a conceptual showcase, so people who are actually working on the D6 or D7 menu system can have a look and take away some of the ideas. The message is: Yes, the menu rebuild can be much faster, memory does not explode, while API and DB remain unchanged. And here is how an implementation can look like.

---

"move functions around":
That's not what I did.
One reason for the poor performance of the old menu_rebuild is that it uses shared functions like menu_link_save, which are not optimized for bulk operations.
For the new implementation I introduced new files with new functions, which are not used anywhere outside the menu_rebuild. There is no reason to load these files in normal page requests, which is why they go into separate files.

I don't think DamienTournoud necessarily meant a D7 patch, but at least a patch that shows what's being changed. Your proof-of-concept might be good, but it's really unreviewable without seeing how things are being changed.

Diff is pretty easy to use if you are on any Unix based system, or if you are using Cygwin on Windows.

As for the custom functions, they might possibly be useful, but moving them into separate includes seems like a no-go. I don't think it saves that much code weight to move them into separate includes, and it makes the menu system file structure unnecessarily complex.

"Unix based system"
-> no (have it on another partition, but the Drupal I'm working on lives on Win XP).

"Cygwin"
-> currently not installed. (had it some windows installs ago)

"how things are being changed"
-> menu_rebuild() and menu_router_build() are changed. Old versions can be found in menu.inc.
EDIT: and _menu_router_build() is changed.
-> All the other functions in the draft are new (and I gave them new names to avoid nameclashes).
-> Some functions in menu.inc might be redundant now, but it doesn't hurt to keep them.

Maybe I'll use CVS to create the patch. I have some modules in the pipe, so I will need it anyway.

StatusFileSize
new35.7 KB
Invalid patch format in drupal-menu-rebuild.patch.
[ View ]

Ok, here's a patch for Drupal-6-9. It is nothing beautiful, just the above in a different format, for those who prefer that.
The only difference is that I left _menu_router_build() inside menu.inc, so it can be compared to the old version.

Maybe some other day I will upload a nicer patch.

EDIT: Please don't waste too much time reviewing this patch. I'll soon have a working patch ready for the first part of the problem (menu_router rebuild). So, better save your energy for that one.

Subscribing.

StatusFileSize
new11.41 KB
Invalid patch format in drupal-menu-router-rebuild.patch.
[ View ]

Here's the new patch for menu_router rebuild on DRUPAL-6-13.

I tested it on my local install, trying the following things:
- Open the modules page (which always triggers a menu_rebuild), and see if the menu will break.
- Activate and deactivate a module (simplenews), and see if menu items appear and disappear as they are supposed to do.
- Add and delete content types, and see if menu items appear and disappear as they are supposed to do.

Result:
- One issue I found (don't know if this is normal) is that menu changes for enabled/disabled modules only take effect after the redirect (I enabled the redirect page in devel settings).
- Otherwise the system seems to work. Maybe there are still some hidden bugs, but it worked ok on my system, with the few tests I made.
- Time and number of SQL queries inside menu_router_build() is heavily reduced.
- Most of what is left of the execution time for menu_router_build happens inside _menu_router_build_calc_callbacks, which scans the implementations of hook_menu and hook_menu_alter.
- Inside _menu_router_build_calc_callbacks, most of the time is spent in views_menu_alter() and views_get_all_views().

I think there is not much left for optimizing menu_router_build any further. We have to scan all the modules, since some of them can have dynamic menu definitions.

Next step will be to optimize _menu_navigation_links_rebuild. Ideas are in the draft, but they need some more thinking.
- The new implementation of menu_router_build provides some useful information about which rows in menu_router have been added, modified or deleted. We can use this information to update only a part of the menu_links table. The draft does not use this information.
- The entire table menu_links can be too big for memory. What we can do is scan the table for those rows with module='system' and where the router_path matches a given path in menu_router, or a prefix of such a path (for parent items). The rows we find this way can be cached in a PHP array, so we can save a lot of SELECT queries.
- We can compare new rows to existing rows in the PHP array, and skip rows which remain unchanged.
- We can collect any cleanup steps (such as, updating the has_children field) in a PHP array, and execute them all at once. This way, we can save a lot of duplicate UPDATE queries.
- These things will not be possible with existing functions like menu_link_save. We need new functions which are optimized for bulk operations.

Nice work! I'll try to test this soon on one of my development sites. Does the patch apply, or would I have to do it manually?

Now, this is an interesting patch! There is a lot of work to be left, however. First we need this converted to Drupal 7... fixes get into D7 first. Of course, the debug statements need to go, a bit more commenting would also be useful but in general -- I like. I am ... more than surprised. This is unexpectedly good stuff.

Please find me in contact or on IRC if you need help in converting this to D7. I am sorry to be so dismissive but... I really, really did not expect this.

On rereading this issue, I realize the need for apologize. So do I apologize for being harsh. I was reading shallow and the stupid remarks came from EvanDonovan not from donquixote. Thanks for your persistence here.

One question.
As the TODO says, I would like to split the _menu_router_build_calc_menu up into smaller functions, and move it all into menu.rebuild_router.inc. Smaller functions make for nicer code, and more fine-grained profiler info in xdebug. Moving into the separate file reduces code weight for normal requests - which I still believe is worthwile.

The downside is that the patch becomes harder to compare with existing code.

@donquixote: it is not useful at this time to move those functions to a separate file. The difference is really hardly measurable (we do have a lot of discussions going on in #497118: Remove the registry (for functions). Please make sure your patch is as small as possible, and doesn't change anything more then strictly necessary.

chx:
Another one: Could you write something for post #26 ? How will crud and the restructuring of tab_parent / tab_root change the way modules define their menu items?

This is what comes to my mind if you say crud:

Instead of hook_menu, modules will use hook_enable and hook_disable, where they call crud functions like add_router_item(), or remove_router_item() etc. This would suck for two reasons:

  • All the existing modules would need to be modified. This is a no-go for D6.
  • It can easily happen that hook_disable or hook_enable is not called, when in fact it should be. For instance, when a module is disabled via phpmyadmin, or if a module developer wants to add new menu items while the module is active, or if the menu definitions are dynamic (based on other data in the db, such as content type definitions etc). In a similar way, it can happen that hook_enable is called more than once, so we need to be careful with duplicates. Both of this can cause chaos in the menu_router table.
  • It will be harder to repair or update the menu_router table, because we can't just call hook_enable or hook_disable any time we want. Too many side effects.
  • Typically, these crud functions will not be optimized for bulk operations.

We could of course introduce a new hook_update or something, where modules can restore or dynamically update their menu definitions. This would always be called after hook_enable, and then any time something needs to be flushed, repaired, rebuilt, etc. This would solve the problem with side effects of hook_enable. I still don't like it, because:

  • Again, all the existing D6 modules need to be modified.
  • These crud functions will not be optimized for bulk operations.
  • hook_menu is a nice example for the "hollywood principle" ("inversion of control"), which is nice. By introducing a crud system we would give that advantage away.

If we want to keep hook_menu (which I strongly recommend), we could use crud functions inside menu_router_build instead. Now this would be quite pointless, because:

  • CRUD functions typically handle one menu item at once. Thus, we lose the potential for optimization based on bulk operations.
  • menu_router_build will most likely be the only place where these crud functions are used, so no need for being overly generic.

But maybe your crud approach is meant for menu_links rather than menu_router. In this case you are too late, because:

  • crud-like functions already exists for menu_links (menu_link_save, etc).
  • Again, these functions are not designed for bulk operations, and thus we should rather use some custom-tailored functions for _menu_navigation_links_rebuild.

chx was talking about removing hook_menu() and taking a CRUD approach to add / remove menu router items. Of course, that has the disadvantages you listed, but I believe it makes a lot of sense considering how modules actually use the menu router (see for example Views).

I don't know what you mean by "not designed for bulk operations". As soon as add / remove is a one-off operation (you only add / remove menu router items when something changed: for example, you created a new views), there is no point in "optimizing for bulk operations".

@Damien Tournoud:
While I agree that for now it's better to reduce the patch size, in general I still think it's a good idea to move things that are rarely called into separate files. It obviously doesn't make a big difference for one single file, but: If all module (and core) developers would work this way, I imagine the amount of code in memory would be less than 1/3 of what it is now. This would be a useful thing especially for ajax requests which are supposed to be cheap and fast.

The discussion you mention, as I understand, tries to introduce some quite complex solution, which I'm not very keen about.

It could all be so simple: All the hook implementations go into the *.module file, but most of the code is moved out into "private" functions. Most of the "private" functions go into separate files, unless they are really called with each request. Plus, we can have one separate file per menu callback.

I don't think additional files add to complexity, not at all. The opposite is true: Splitting things up into smaller classes is good style in OOP, and the same is true for files and functions (if we manage to group the functions logically, not just randomly).

Maybe we can do this as a second step, once this patch made it into D7 and D6. At least, this will make the CVS history more meaningful.

Of course, that has the disadvantages you listed

Which are heavy ones!!

there is no point in "optimizing for bulk operations"

True, if we really never do a complete repair / rebuild.

but I believe it makes a lot of sense considering how modules actually use the menu router (see for example Views)

It doesn't make a big difference, if now instead of hook_menu we have to call hook_update. Except that in hook_update we won't have the nice hollywood interface any more, and need to call explicit crud functions instead. So, it only gets uglier.

Or, if you don't want hook_update, then please don't forget the poor module developer, who does'nt want to call a crud function for new menu items added in the code.

It can be a different thing for hook_menu_alter, I will have to look into it.

After all, scanning the modules for hook_menu is not that expensive, if done occasionall with menu_rebuild. If we really want to further improve that, we could let the modules define if they are "dynamic" or not. Then, a "quick menu rebuild" would only scan the dynamic modules and those which are newly enabled or disabled, while a "full menu rebuild" would scan all modules (to catch changes in the code). All of this is very possible with hook_menu.

@donquixote: I was not saying it couldn't be useful to split. I'm just saying do not do it right now, as it will only make the patch unreviewable.

After all, scanning the modules for hook_menu is not that expensive, if done occasionall with menu_rebuild. If we really want to further improve that, we could let the modules define if they are "dynamic" or not. Then, a "quick menu rebuild" would only scan the dynamic modules and those which are newly enabled or disabled, while a "full menu rebuild" would scan all modules (to catch changes in the code). All of this is very possible with hook_menu.

The problem is: you never really know if you need to quick or a full menu rebuild. The dynamic/static flag sounds a little bit fragile at first sight.

Issue tags:-Performance

you never really know if you need to quick or a full menu rebuild

I imagine to have a link similar to "flush all caches", for doing a full menu rebuild. Maybe it can even be included.

A full rebuild is necessary in the following situations:
1) a module did not set the "dynamic" flag, when in fact it should have done so. (it's probably a good idea to make all modules dynamic by default, and wait for module developers to explicitly define their modules to be static).
2) you are working on a module, and changed menu definitions in the code. This means, you will know you have to do a full menu rebuild.
3) The menu_router table is broken, and needs to be repaired. Again, you know where to click.

The quick rebuild would happen in the following situations:
- New modules are activated / deactivated.
- One or more modules set the "need quick menu rebuild" flag, or call it directly. Maybe they can even specifiy what exactly has to be refreshed.

-----

That said, I think we can perfectly live without these "advanced optimizations". If we implement all of the other optimization steps I proposed in this thread, we will be more than happy.

Issue tags:+Performance

Adding tag

Issue tags:+Performance

Folks, CRUD is far away but doing a diff style menu rebuild is a good idea and we have a patch. Let's finish this.

chx, if I said anything stupid in this issue, please let me know. I am sorry if my comments irritated or offended you.

I spoke as strongly as I did in #20 because I really wanted to see some movement on this issue & donquixote wouldn't have been able to come up with the patch that he did if he had written it originally for 7.x.

I'll try to benchmark the patch today. Have to get Panels 3 upgraded first... Btw, I agree with Damien Tournoud that the patch shouldn't be split into separate files, at least not at this point. If that is helpful, it can always be done down the road.

donquixote, I really appreciate your persistence on this. I hope I didn't confuse with any of my misguided attempts at explanation.

subscribe. nice to see some deep digging here by donquixote.

There's a bug in the patch.

<?php
/**
* Execute UPDATE queries
*
* @param $update array of rows with modifications
*/
function _menu_router_build_save_update($update) {
  foreach (
$update as $changes) {
   
$set = array();
    foreach (
$changes as $k => $v) {
     
$set[] = "$k = %s";
    }
   
$set = implode(', ', $set);
   
$sql = "UPDATE {menu_router} SET $set";
   
$args = array_values($changes);
   
array_unshift($args, $sql);
   
call_user_func_array('db_query', $args);
  }
}
?>

There is no WHERE statement in the sql. And the call_user_func_array is unnecessary.
I'm surprised this didn't break when I tried it.

Instead, it should be:

<?php
/**
* Execute UPDATE queries
*
* @param $update array of rows with modifications
*/
function _menu_router_build_save_update($update) {
  foreach (
$update as $path => $row_changes) {
   
$set = array();
    foreach (
$row_changes as $k => $v) {
     
$set[] = "$k = '%s'";
    }
   
$set = implode(', ', $set);
   
$sql = "UPDATE {menu_router} SET $set WHERE path = '%s'";
   
$args = array_values($row_changes);
   
$args[] = $path;
   
db_query($sql, $args);
  }
}
?>

I'll re-post when I have a new tested version.

EDIT:
Fixed the bug in this post mentioned by jmpalomar (#55). Thanks!

Sorry for initially leading this issue astray. I shouldn't have attempted to explain something I didn't actually understand. Thanks to kbahey's coaching, now I know the difference between menu_router & menu_links and I see how menu_links is actually a superset of menu_router (user-defined menu paths, system paths, and book paths).

I've learned a lot about the menu system from this issue. Thanks, donquixote, chx, et al. I'll be glad to help benchmark a patch once it's re-rolled.

I tried the patch on my 6.12 install. It runs really fast, but I'm getting lots of warnings:

( ! ) Notice: Undefined index: line in C:\amp\Apache2.2\htdocs\a12\includes\common.inc on line 595
Call Stack
# Time Memory Function Location
1 0.0006 58408 {main}( ) ..\index.php:0
2 0.3890 17327728 menu_execute_active_handler( ) ..\index.php:18
3 0.3997 18040840 call_user_func_array ( ) ..\menu.inc:348
4 0.3997 18040984 drupal_get_form( ) ..\menu.inc:0
5 0.3998 18043504 call_user_func_array ( ) ..\form.inc:102
6 0.3998 18043728 drupal_retrieve_form( ) ..\form.inc:0
7 0.3998 18045856 call_user_func_array ( ) ..\form.inc:366
8 0.3998 18046016 system_modules( ) ..\form.inc:0
9 0.7015 21624704 menu_rebuild( ) ..\system.admin.inc:623
10 0.7025 21624688 menu_router_build( ) ..\menu.inc:1673
11 0.7036 21668024 _menu_router_build_rebuild( ) ..\menu.inc:1698
12 1.0381 30068864 _menu_router_build_save_update( ) ..\menu.inc:1739
13 1.0382 30069720 call_user_func_array ( ) ..\menu.rebuild_router.inc:169
14 1.0382 30069936 db_query( ) ..\menu.rebuild_router.inc:0
15 1.0383 30071488 _db_query( ) ..\database.mysql-common.inc:42
16 1.0391 30073728 trigger_error ( ) ..\database.mysqli.inc:128
17 1.0391 30074224 drupal_error_handler( ) ..\common.inc:0

( ! ) Notice: Undefined index: file in C:\amp\Apache2.2\htdocs\a12\includes\common.inc on line 596
Call Stack
# Time Memory Function Location
1 0.0006 58408 {main}( ) ..\index.php:0
2 0.3890 17327728 menu_execute_active_handler( ) ..\index.php:18
3 0.3997 18040840 call_user_func_array ( ) ..\menu.inc:348
4 0.3997 18040984 drupal_get_form( ) ..\menu.inc:0
5 0.3998 18043504 call_user_func_array ( ) ..\form.inc:102
6 0.3998 18043728 drupal_retrieve_form( ) ..\form.inc:0
7 0.3998 18045856 call_user_func_array ( ) ..\form.inc:366
8 0.3998 18046016 system_modules( ) ..\form.inc:0
9 0.7015 21624704 menu_rebuild( ) ..\system.admin.inc:623
10 0.7025 21624688 menu_router_build( ) ..\menu.inc:1673
11 0.7036 21668024 _menu_router_build_rebuild( ) ..\menu.inc:1698
12 1.0381 30068864 _menu_router_build_save_update( ) ..\menu.inc:1739
13 1.0382 30069720 call_user_func_array ( ) ..\menu.rebuild_router.inc:169
14 1.0382 30069936 db_query( ) ..\menu.rebuild_router.inc:0
15 1.0383 30071488 _db_query( ) ..\database.mysql-common.inc:42
16 1.0391 30073728 trigger_error ( ) ..\database.mysqli.inc:128
17 1.0391 30074224 drupal_error_handler( ) ..\common.inc:0

( ! ) Notice: Undefined index: line in C:\amp\Apache2.2\htdocs\a12\includes\common.inc on line 595
Call Stack
# Time Memory Function Location
1 0.0006 58408 {main}( ) ..\index.php:0
2 0.3890 17327728 menu_execute_active_handler( ) ..\index.php:18
3 0.3997 18040840 call_user_func_array ( ) ..\menu.inc:348
4 0.3997 18040984 drupal_get_form( ) ..\menu.inc:0
5 0.3998 18043504 call_user_func_array ( ) ..\form.inc:102
6 0.3998 18043728 drupal_retrieve_form( ) ..\form.inc:0
7 0.3998 18045856 call_user_func_array ( ) ..\form.inc:366
8 0.3998 18046016 system_modules( ) ..\form.inc:0
9 0.7015 21624704 menu_rebuild( ) ..\system.admin.inc:623
10 0.7025 21624688 menu_router_build( ) ..\menu.inc:1673
11 0.7036 21668024 _menu_router_build_rebuild( ) ..\menu.inc:1698
12 1.0381 30068864 _menu_router_build_save_update( ) ..\menu.inc:1739
13 1.0426 30073952 call_user_func_array ( ) ..\menu.rebuild_router.inc:169
14 1.0426 30073952 db_query( ) ..\menu.rebuild_router.inc:0
15 1.0429 30074800 _db_query( ) ..\database.mysql-common.inc:42
16 1.0432 30077296 trigger_error ( ) ..\database.mysqli.inc:128
17 1.0432 30077936 drupal_error_handler( ) ..\common.inc:0

( ! ) Notice: Undefined index: file in C:\amp\Apache2.2\htdocs\a12\includes\common.inc on line 596
Call Stack
# Time Memory Function Location
1 0.0006 58408 {main}( ) ..\index.php:0
2 0.3890 17327728 menu_execute_active_handler( ) ..\index.php:18
3 0.3997 18040840 call_user_func_array ( ) ..\menu.inc:348
4 0.3997 18040984 drupal_get_form( ) ..\menu.inc:0
5 0.3998 18043504 call_user_func_array ( ) ..\form.inc:102
6 0.3998 18043728 drupal_retrieve_form( ) ..\form.inc:0
7 0.3998 18045856 call_user_func_array ( ) ..\form.inc:366
8 0.3998 18046016 system_modules( ) ..\form.inc:0
9 0.7015 21624704 menu_rebuild( ) ..\system.admin.inc:623
10 0.7025 21624688 menu_router_build( ) ..\menu.inc:1673
11 0.7036 21668024 _menu_router_build_rebuild( ) ..\menu.inc:1698
12 1.0381 30068864 _menu_router_build_save_update( ) ..\menu.inc:1739
13 1.0426 30073952 call_user_func_array ( ) ..\menu.rebuild_router.inc:169
14 1.0426 30073952 db_query( ) ..\menu.rebuild_router.inc:0
15 1.0429 30074800 _db_query( ) ..\database.mysql-common.inc:42
16 1.0432 30077296 trigger_error ( ) ..\database.mysqli.inc:128
17 1.0432 30077936 drupal_error_handler( ) ..\common.inc:0

AND

user warning: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ':1:{i:0;s:17:\"administer school\";}' at line 1 query: UPDATE menu_router SET access_arguments = a:1:{i:0;s:17:\"administer school\";} in on line .
user warning: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ':1:{i:0;s:24:\"wysiwyg_profile_overview\";}, tab_parent = admin/settings/wysiwyg' at line 1 query: UPDATE menu_router SET page_callback = drupal_get_form, page_arguments = a:1:{i:0;s:24:\"wysiwyg_profile_overview\";}, tab_parent = admin/settings/wysiwyg, tab_root = admin/settings/wysiwyg, title = Profiles, type = 136, description = in on line .
There is a security update available for your version of Drupal. To ensure the security of your server, you should update immediately! See the available updates page for more information.

Subscribing.

newbuntu (#52):
Thanks for the testing!
The bug you noticed is exactly what I explained in #50. In all of the 4 stack traces you posted, you have

12 1.0381 30068864 _menu_router_build_save_update( ) ..\menu.inc:1739
13 1.0426 30073952 call_user_func_array ( ) ..\menu.rebuild_router.inc:169
14 1.0426 30073952 db_query( ) ..\menu.rebuild_router.inc:0

The next version of the patch should fix that. (or you can fix it for yourself if you like).
Of course, the error only appears if there is actually something to update (insert or delete is not update). Otherwise, the function is never called.

I've been testing this patch and I got same SQL error #52. I think problem is that missed some quotations.

Maybe this:

    foreach ($row_changes as $k => $v) {
      $set[] = "$k = %s";
    }

should be this:

    foreach ($row_changes as $k => $v) {
      $set[] = "$k = '%s'";
    }

Is correct?

Thanks *very* much to @donquixote for your persistence and digging on this issue - I think it will make a big difference for a lot of people. I very much agree that fixing the d6 performance issues is important, the reality is that people will be running d6 sites for years to come...

Subscribing.

Title:Rewrite the function menu_rebuildxdebug (wincachegrind) numbers need to be multiplied by 10

As pointed out in this blog post,
http://www.matsgefvert.se/blog/archives/401/comment-page-1#comment-1666
and this bug report
http://sourceforge.net/tracker/index.php?func=detail&aid=1957915&group_i...,
the profiling numbers from xdebug need to be multiplied by 10, if viewed with WinCacheGrind.

This means, we have 5 seconds menu_rebuild, not just 500 millis.. Like this, the numbers make more sense, and fit much better with the perceived request time.

Title:xdebug (wincachegrind) numbers need to be multiplied by 10Rewrite the function menu_rebuild

oops.. thought that was the comment title.

I manually applied this patch, since I needed to incorporate the corrections from #50. menu_rebuild() is significantly faster with the patch. The number of queries on my site has gone down from 11277 to 8512 (it's that exact number every time, when I load a node which simply calls menu_rebuild() with Devel query logging enabled). The query time has been cut almost in half - now it averages between 5 and 7 seconds - still slow, but much better than before.

donquixote, could you re-roll the patch for us with the corrections? Also, could you remove the debugging code and make it all one file?

Thanks again for your work on this, and I'll be excited to see how it progresses.

StatusFileSize
new11.25 KB
Invalid patch format in menu_router.patch.
[ View ]

Here you go, a new patch for D6.

Changes:
- added the above discussed bugfixes
- removed debugging code
- tried to make the menu.inc part of the patch as small as possible. Lines are now synced even if they are in a different function (such as _menu_router_build_calc_callbacks).
- everything that would not sync with anything in the old menu.inc is moved to menu.rebuild_router.inc

I disagree with the idea of all in one file, as it would not improve patch readability, neither would it improve code quality.

About profiling:
As explained abve, menu_rebuild consists of two parts, and the patch improves only one of them.

So, if you want to do any profiling, the place to look is menu_router_build.

You will see improvements in menu_rebuild as well, but it will still be slow thanks to the other part, _menu_navigation_links_rebuild. We'll have to fix that in a later step (see my above comments).

Thanks for the re-roll. I'll try to test again in a few days.

I had a look in D7. I have yet to understand how exactly the D7 menu rebuild is different, but I found the following function to be interesting:

<?php
/**
* Helper function to save data from menu_router_build() to the router table.
*/
function _menu_router_save($menu, $masks) {
 
// Delete the existing router since we have some data to replace it.
 
db_delete('menu_router')->execute();
 
// Prepare insert object.
 
$insert = db_insert('menu_router')
    ->
fields(array(
     
'path',
     
'load_functions',
     
'to_arg_functions',
     
'access_callback',
     
'access_arguments',
     
'page_callback',
     
'page_arguments',
     
'fit',
     
'number_parts',
     
'tab_parent',
     
'tab_root',
     
'title',
     
'title_callback',
     
'title_arguments',
     
'type',
     
'block_callback',
     
'description',
     
'position',
     
'weight',
    ));
  foreach (
$menu as $path => $item) {
   
// Fill in insert object values.
   
$insert->values(array(
     
'path' => $item['path'],
     
'load_functions' => $item['load_functions'],
     
'to_arg_functions' => $item['to_arg_functions'],
     
'access_callback' => $item['access callback'],
     
'access_arguments' => serialize($item['access arguments']),
     
'page_callback' => $item['page callback'],
     
'page_arguments' => serialize($item['page arguments']),
     
'fit' => $item['_fit'],
     
'number_parts' => $item['_number_parts'],
     
'tab_parent' => $item['tab_parent'],
     
'tab_root' => $item['tab_root'],
     
'title' => $item['title'],
     
'title_callback' => $item['title callback'],
     
'title_arguments' => ($item['title arguments'] ? serialize($item['title arguments']) : ''),
     
'type' => $item['type'],
     
'block_callback' => $item['block callback'],
     
'description' => $item['description'],
     
'position' => $item['position'],
     
'weight' => $item['weight'],
    ));
  }
 
// Execute insert object.
 
$insert->execute();
 
// Store the masks.
 
variable_set('menu_masks', $masks);
  return
$menu;
}
?>

Now having a look at the DB queries for this function.

DELETE FROM menu_router

INSERT INTO menu_router (path, load_functions, to_arg_functions, access_callback, access_arguments, page_callback, page_arguments, fit, number_parts, tab_parent, tab_root, title, title_callback, title_arguments, type, block_callback, description, position, weight) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4, :db_insert_placeholder_5, [...]

The entire "crime" of first deleting the entire table, then rebuilding it from scratch, now happens in this function. It does not reset the auto_increment ids, btw.

One difference to D6 is that it does the insert in one big query, instead of each row for itself. This makes it somewhat faster.

I don't know if we'd gain much by doing a diff-like update in this function.

There is also a difference in the API, because now menu_router_build does no longer trigger any writing on the database. So, modules which call menu_router_build (like admin_menu) should consider that. On the other hand, it's probably a bad idea to rebuild the router table without rebuilding menu_links as well.

subscribing

Very interesting. Not sure whether this can be backported, but it would definitely make sense for D7. IIRC, recent benchmarks outlined that it is the sheer amount of queries executed during menu rebuild that are so slow. So we should definitely benchmark this patch in D7.

@sun:
The patch is D6, not D7 !!

Guys, seriously, please read my posts. I feel like talking against a wall :(

@donquixote: Thank you very, very much for working on this.

subscribing

I applied donquixote's patch and tested it, but I don't see any improvement in page load time. IT took 40+ seconds on my local dev install, just as before. It's also worth mentioning that the same slowdown has been happening for drush on commandline as well: it takes almost that long for drush to grab available commands from contributed modules.

@picardo:
As I said, you will still have a long load time. The patch does only speed up one of the offending functions. There is another one that is still slowing things down, and that needs to be dealt with in another patch. Actually, this second function is typically the more evil one.

If you want to do useful benchmarks, you need xdebug, and look specifically for the execution time of menu_router_build(). Alternatively, you can check the number of queries on menu_router using devel module.

Nevertheless, you should see something between 20% and 40% reduction of total request time (rough estimate).

EDIT:
40 seconds looks VERY much!! Maybe your problem is more than just the menu_rebuild. I would be very interested to see some analysis of your problem with xdebug and/or devel query reporting.

You can also have a look at the requests for creating new content types or CCK fields, which also trigger menu_rebuild. Do they take as long as the modules page? Do you see improvements with the patch?

StatusFileSize
new304.6 KB

I'm not sure I can do xdebug but I did the devel query log analysis. I attached a screencap since the query log won't let me highlight and select these parts. It seems like the slowest queries are found in cache_set, and they are called many times. I don't get this. I tried it with caching enabled and disabled, but the slow queries are always cache_set.

@picardo:
Thanks for the screenshot!
To my knowledge, cache_set can do a lot of different things, and disabling the cache does not affect all of that.
cache_set is the function name, but just as interesting is the name of the table the query is run on. ("UPDATE cache_content" or "INSERT INTO cache_menu"). You see, there are different tables that cache_set operates on, and even if it's the same table, it could be a completely different purpose.

The problem is totally unrelated from menu_rebuild, it's a separate bottleneck that needs its own fix. xdebug will be a big help for further analysis, but I think the findings should go into a separate thread (most likely there's an existing one already).

You can ask me if you need help with setting up xdebug, I can assist with skype or what you like. Let's nail this down and then find the appropriate issue, or open a new one, but let's not further bloat this issue with unrelated stuff. I have the feeling it already has far too many posts, and people don't even bother to read the 10 latest.

picardo & donquixote: As I said above in #60, the query time and number of queries definitely dropped by a significant amount after applying the patch in its current form. It's not a cure-all, but it's a good start. I agree that the cache_set() should be a separate issue - I see it too when I run devel.

Identified the problem #68 / #71. Proposed solution:
allow to exclude folders from drupal_system_listing

subsribing

Title:Rewrite the function menu_rebuildRewrite the function menu_rebuild (EDIT: esp. menu_router_build)

This issue is blown-up in an unhealthy way.

I propose to split it up:
- Let's keep this issue for the patches for menu_router_build(). All the patches so far have been about this function, so I don't see a conflict.
- Let's discuss the other function in #593682: Optimize _menu_navigation_links_rebuild().

Thanks for revisiting this, donquixote. I am still watching this issue & will help test any further patches. In fact, I was just thinking about this the other day, since I am about to start another round of performance tuning for my site.

If we want to get them into 7.x, though, we'll have to get someone to mentor on how to convert to the DBTNG syntax. Even if they ultimately don't make it into 7.x, or get backported to 6.x, though, we might be able to get them into the Pressflow Drupal distribution if they were good enough.

Title:Rewrite the function menu_rebuild (EDIT: esp. menu_router_build)Optimize menu_router_build()

StatusFileSize
new3.52 KB

First attempt of a D7 patch.
I'm totally not surprised if this breaks - I don't even have a working D7 install to test with.

What the patch does:
Instead of deleting everything from menu_router and then re-inserting the new table contents, we first check for changes and then delete, update and insert not more than absolutely necessary.
In D7 this is easier than in D6, because it all happens in one function, _menu_router_save().

Memory consumption:
The only thing that could grow considerably big is the $delete array. That is, if the table contains a lot of cruft items that have to be removed. Otherwise, the newly introduced arrays don't consume more memory than the $menu array given as the function argument.
If big $delete arrays really turns out to be a problem, we could stop if it grows too big, and do a full delete + rebuild in this case (or do a "NOT IN" checking against $update, instead of "IN").

StatusFileSize
new3.52 KB
Invalid patch format in issue-512962-menu_router.patch.
[ View ]

Ok, let's see if this triggers a test!

StatusFileSize
new3.57 KB
Invalid patch format in issue-512962-menu_router_0.patch.
[ View ]

Patch created from D7 root dir.

EDIT:
Can someone explain to me how I can request a test??

Status:Needs work» Needs review

Mark the issue as 'needs review'.

Status:Needs review» Needs work

The last submitted patch failed testing.

Invalid patch format:
I made this patch with Tortoise CVS, so what's wrong with it? Line endings?

Needs to be rolled from the root of the Drupal install, not /includes.

StatusFileSize
new3.58 KB
Invalid patch format in issue-512962-menu_router_1.patch.
[ View ]

Tortoise CVS sucks. No matter where I say "make patch", it fails to prefix the filename with "includes/".
Doing it manually now.

Status:Needs work» Needs review

need review..

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new3.53 KB
Failed to install Drupal on MySQL 5.0 ISAM.
[ View ]

Fixed line endings and one syntax error.
Still don't have a working D7 install, so I almost expect this to break.

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new3.66 KB
Unable to apply patch issue-512962-menu_router-1.patch
[ View ]

There was a problem caused by an empty "WHERE path IN ()" sql statement for deleting rows.

Status:Needs review» Needs work

The last submitted patch failed testing.

Detect a non-applicable patch
Ensure the patch applies to the lastest checkout.

I don't get it, I don't have anything later than that!!
After this test result I made a CVS update, then made a new patch, changed the line endings and added the "includes/" manually (Notepad++), then made a diff (WinMerge) to compare with the patch from #91. No change, except for a linebreak.

What am I doing wrong??

Status:Needs work» Needs review
StatusFileSize
new3.42 KB
Failed on MySQL 5.0 ISAM, with: 9,647 pass(es), 7,024 fail(s), and 1,366 exception(es).
[ View ]

This one is done with WinMerge, comparing to an independent copy of D7. Maybe this helps.
Btw: Is there a way to test a patch without spamming the issue?

Status:Needs review» Needs work

The last submitted patch failed testing.

donquixote, spamming the issue is fine. I sometimes open a separate issue just for spamming the test bot if running into testbot-specific issues so it doesn't bump up in people's trackers.

The proposed new functionality seems likely to cause memory issues. I'm tempted to make this a won't fix - the rebuild should be happening rarely or never when your site is in production, so this is not a high-impact performance issue.

Wrapping the delete and re-insert in a transaction would potentially prevent transient 404s, but that's a separate issue.

created this #650858: wrap menu_rebuild() operations in a transaction to wrap the rebuild in a transaction.

@pwolanin (#97):

The proposed new functionality seems likely to cause memory issues.

I have repeatedly explained how it can be done without memory issues.
For the above patch there is still one example where it would consume more memory than the old one (if the old table contents is bigger than the new one), but I also pointed out how to fix this. And the above patch is not the finished thing, anyway.

Why is this important?

Menu rebuild happens every time you work on content type and field definitions.
Speeding this up allows me to produce more site in less time, with a less disrupted train of thought. Performance on the admin backend was the biggest argument against using Drupal, when we had to choose a new CMS solution at my previous workplace.
Of course, the tradeoffs are not the same as for end-user performance, but it is still worth it imo.

I should mention that all of this applies much more to D6 than to D7. menu_router_save in D7 uses only one big query to insert the new rows, while the D6 version had separate queries for each row. And actually I'm mostly interested in a D6 solution - the one above, which I am successfully using for months now on several Drupal installs. I'm working on the D7 patch to "follow the book", so to say.

One thing where we could get into trouble is sequence of table rows. I will hopefully soon find out if this is a problem or not.

Version:7.x-dev» 6.x-dev

If menu_router_save() is using a multi-insert already, then this probably should be moved back to D6 if there's no other optimization there. It only makes sense to have issues against 7.x and backport when the same issue is in both releases. In fact I'm going to do that now.

I'm not so sure if it's really that irrelevant for D7.

Here is my xdebug result on localhost, with a fresh install of D7, only the "most common modules" from the default install enabled:

menu_rebuild: 1830 ms
  _menu_navigation_links_rebuild: 900 ms
  menu_router_build: 210 ms
  _menu_router_save: 520 ms
    InsertQuery_mysql->execute: 370 ms
    DeleteQuery->execute: 100 ms

After the patch:

  _menu_navigation_links_rebuild: 860 ms
  menu_router_build: 160 ms
  _menu_router_save: 69 ms

This is on a medium-performance localhost, but it's a very basic Drupal install.
If the patch (a new version of it) can be shown to have no nasty side effects, I would still consider it worthy for D7.

So we save ~ 500ms because of a much smaller (or non-existent if no changes) insert query compared to a delete and very big insert query. If there's really no memory implications then this seems worth doing to me.

I also noticed we're not using db_truncate() in _menu_router_save() - opened #650944: Use db_truncate() in _menu_router_save() since that makes sense anyway.

Looking at the patch it does:

+  $q = db_query("SELECT * FROM {$table_name}");

AFAIK, that means the full result set (the full router) will be pre-fetched into memory.

hmmm
I was all the time thinking about memory occupied by PHP arrays. If this is how MySQL works (pre-fetch the full table in memory), we have an additional problem to think of. I think this can be solved, but this depends on my understanding or non-understanding of the inner workings of the database engine.

Let's start with a bit of theory.
- The old menu_router has n_old rows, the new one has n_new rows.
- The $menu argument has n_new elements, and this array is copied all around as a by-value argument already in the old algorithm. If we stay in O(n_new), we are quite safe*.
- n_old will usually be around the same size as n_new. In many cases it will be smaller, because more often we add functionality (modules, content types..), instead of removing functionality.
- However, it can happen that we we did something which creates a lot of menu_router items, and now we want to get rid of all this junk. The menu_router table might be bigger than what we can handle, and much bigger than n_new.

To be protected against this, we could count the table rows before doing anything. If the table is too big, we delete it, otherwise we do the diff-based approach.

<?php
$q
= db_query("SELECT COUNT(*) n_rows FROM {menu_router}");
if (
$row = db_fetch_array($q)) {
  if (
$row['n_rows'] > (int)($n_new * 1.5 + 100)) {
    ...
// delete and insert all
 
} else {
    ...
// diff-based
 
}
}
?>

*Multiples of n_new
Of course, O(n_new) can still be quite a lot, if it can be multiplied with any constant number. We can't just ignore that.
The old algorithm does already copy the full $menu array to different functions as a by-value argument, and has a few other arrays that can be expected to be around the same size. And, it has the big insert statement, that has to be stored in memory before it is executed.
If the new algorithm does in fact result in a higher multiplicity number than the old one, I am sure we can reduce that by reducing the number of array copies.

I am not sure how PHP handles array copies, if it does any lazy copying or so.. but I guess it does not. And I don't know how efficient or inefficient the database engine's memory cache is, compared to PHP arrays. At least, it does not count against PHP memory_limit (which is a common problem on shared hosting).

-----

That said, I think a more important question is if we get problems with the sequence of menu items.
Unfortunately I have difficulties with testing atm, because even my unpatched D7 fails a lot of the test cases...

I tried to find out why the patch fails the test.
At my localhost I get a "Failed to set field name to s193537hWiVVMKY" with the PageNotFoundTestCase->setUp().

The problem is the following:
- The PageNotFOundTestCase->setUp() method creates a dummy user and wants to log this user in using ->drupalPost().
- drupalPost() calls drupalGet('user'), which does a http call to the 'user' page, but returns a "Page not found" instead of the expected login page.

If I call the 'user' page manually, I get the login page, as expected.

If anyone has an idea why this happens, please let me know!

Hi donquixote, is it possible that your test script sends an 'incomplete' HTTP header, e.g. that the Host header is missing. If you're multi-hosting, or if 'localhost' is not resolvable, then you may run into this type of problems.

How can i check that?

The returned "Page not found" page is clearly a page generated by Drupal (I printed the HTML output).

The requested URL is
http://localhost/d7/user

Hi donquixote,

Usually I use WireShark to capture the network traffic. The capture filter should be "tcp port 80". You can send me the network trace via mail at olivier dot biot at-sign gmail dot com.

Best regards,

Olivier

Status:Needs work» Needs review
StatusFileSize
new6.73 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-512962-D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

This does a multi insert; been using it for boost and it works quite well. Need to optimize the packet size though; currently set at 50k so it will only insert 320 records if your max_packet_size for MySQL is set to 16MB. It's still a large improvement in comparison to doing a lot of db_query commands.

Some issues with the patch:
* The patch contains some unrelated whitespace removal.
* The code you need before you call _db_insert_multi is quite lengthy. Wouldn't it be attractive to have a reusable function that does the fieldname conversion itself?
* I looked at the patch, but I don't really see what's the point with it. You still delete the entire thing and insert the entire thing. It's all in one go, so now you can expect the same performance for D6 as we currently have in D7. The original intention was to be better than that!

I think it can be seen as a safe intermediate step to speed up D6 menu rebuild, but not more.

subscribe

* Being able to do a multi insert (what this patch does #109) is a big speed improvement when dealing with lots of records and/or a slow MySQL server. Also the timing between the del and insert should be a lot smaller since there is hardly any logic that needs to run between each insert. Overall a good step in the right direction, and as far as I know the only patch out there for 6.x ATM.
* What do you mean by the fieldname conversion? I don't see how this can be abstracted out any more. The insert needs the fieldname and the % placeholder. It looks big because there are 20 fields in the menu_router table.
* Blame the white space on my editor; does house cleaning for me - Minor issue.

In terms of reducing the number of calls to db_query() if we could keep track of the size of the data so far we could push the query size closer to the limit of the max_packet_size. run mb_strlen($text, '8bit') on the text fields; set the rest to 255. When that gets within 50kb of the max_packet_size then that's the chunk size.

First step is to see if this actually helps performance. Can I get someone to profile this patch? Once we know it helps then we can work on making this patch a lot better.

EDIT
The other option for reducing the number of calls to db_query() would be a check that runs after the $data array has been built splitting up that array... the overhead on this would probably be too much though, since there isn't a simple way to get the size of an array in bytes in PHP.

Overall a good step in the right direction, and as far as I know the only patch out there for 6.x ATM.

Really?
D6 patch: #512962-61: Optimize menu_router_build() / _menu_router_save()
I don't remember exactly if this is the final and working version, but at least on my own modified copy of D6 I have this thing (or a modified version) running. It works for me, and it is as fast as expected. (use xdebug to benchmark _menu_router_build especially)

Someone who confirms that it's faster: #512962-73: Optimize menu_router_build() / _menu_router_save()

One issue with the above patch: When I wrote this thing I was not aware of the way PHP4 handles objects, so I think we need to add some "&" to make it PHP4 compatible.

* What do you mean by the fieldname conversion? I don't see how this can be abstracted out any more. The insert needs the fieldname and the % placeholder. It looks big because there are 20 fields in the menu_router table.

I imagine something like this:

<?php
$fields_legend
= array(
 
'db_fieldname_1' => 'array_fieldname_1',
 
'db_fieldname_2' => 'array_fieldname_2',
 
'db_fieldname_3' => array('array_fieldname_3', '%d'),  // non-default placeholder
 
'db_fieldname_4' => array(NULL, '%s', $default_value),  // default value
 
'fieldname_5'// here the array and db fieldnames are identical. db fieldnames are never numeric, so it should be safe.
);
$default_placeholder = '%s';
$suppress = FALSE;
_db_insert_multi('menu_router', $fields_legend, $data, $default_placeholder, $suppress);
?>

It's a matter of taste..

--------

Anyway, if for some reason we don't want the diff-based approach that I came up with, then yours is probably the best we can achieve, and definitely better than the current situation. Your patch is safer than my patch, because it does not affect the sequence of rows in the database.

I also think the _db_insert_multi should take care of the chunking, to get more reuse value. Once it's tested, the function could go into database.inc.

-----

Another idea would be to provide a generic function for

<?php
function db_replace_table_contents($table_name, $primary_key, $fields_legend, $data, $options) {
 
$options += array(
   
'default_placeholder' => '%s',
   
'preserve_order' => FALSE// if true, it will delete and then refill
   
'truncate' => FALSE,
   
'suppress_warnings' => FALSE,
   
'max_rows_to_read' => NULL// if the table contains more rows, the function will truncate and refill, instead of doing anything diff-based.
   
'chunk_size' => NULL// number of rows to insert or update at once
   
'delete_chunk_size' => NULL// number of rows to delete at once
   
'remember_changes' => TRUE// can be TRUE, FALSE, or a number limit to avoid memory problems. If TRUE, it will return an array with all inserted, updated and deleted rows. If FALSE, it will only return an array with the count for update, insert, delete.
 
);
  ...
}
?>

An additional argument could control the chunking, if needed.
The return value would be an array of the changes.
Some of the optional arguments could go into an $options array.

I think I should combine $fields & $placeholders into 1 array. For a given table the field name has to be unique, thus I can use that for the array key.

<?php
array('path' => "'%s'",
     
'load_functions' => "'%s'",
     
'to_arg_functions' => "'%s'",
       );
?>

This is 6.x so simpler is better IMHO; try to not add major things, only do minimal changes if possible.

Would it be possible to approach this using LOAD DATA INFILE (mysql) / COPY (pgsql) ? ...if not for all, as an option maybe.

Another possibility: insert to in memory table, then truncate {menu_router} and insert {menu_router} select from in memory table?

I was in the postgreSQL IRC channel last night and they suggested using copy as well; named pipes or stdin where the recommendations. Did some quick checks and for MySQL it can be turned off; setting is called "local_infile". Long story short, I haven't seen an easy cross platform way to make this fly. Granted I didn't look very hard, but it wasn't extremely obvious how to easily do inserts this other way.

Using Temporary tables is an interesting idea; would renaming it work better? The idea behind this is for operations that take a long time, let the old data still be used until it is replaced; not deleted and recreated like it is now. A different way of looking at the locking patch? #251792: Implement a locking framework for long operations

To shorten the time between delete and insert?
Ok, if we want to continue with that path, sure. I still would like to see a solution that works without full delete, though.

The lock is still a good idea, even for a shorter timespan.

It seems to me this part of the menu rebuild is just a little piece of the whole story. Here, while the whole menu rebuild may take 1 minute, the inserts to the menu_router table after the delete take less than 4 seconds.

So the big problem is not here, I think. The menu_router table needs a CRUD interface. For example, CCK forces a menu rebuild just when a field label has been changed, and that's just to update the secondary tabs under the "Manage fields" screen. Wouldn't it be nicer to have an API to update just that entry in menu_router table? That API could then invoke a hook to let other modules know about that change, so it can be altered, or additional actions taken.

However, it seems this idea about CRUD was discussed here above. Too late for D6 it seems...

It seems to me this part of the menu rebuild is just a little piece of the whole story. Here, while the whole menu rebuild may take 1 minute, the inserts to the menu_router table after the delete take less than 4 seconds.

How do you get to these 4 seconds? xdebug? What exactly is taking 4 seconds? It's taking more time on my machine.
The whole menu rebuild has two parts: One that deals with menu_router, another that deals with menu_links. The above patches only speed up the menu_router part, so there is really no surprise there is still a bottleneck. To speed up the menu_links part, visit #593682: Optimize _menu_navigation_links_rebuild().

My own benchmarks (xdebug + wincachegrind + factor 10)
D6 benchmarks: It seems that I never published benchmarks for my D6 patch - shame on me! Anyway, I tested a lot on my localhost and still have the patched version running in my projects, and the part that's responsible for writing to menu_router is much much faster.
D7 benchmarks: #101. Time for _menu_router_save() drops from 520ms to 96ms on my system. Your results might be different.

So the big problem is not here, I think. The menu_router table needs a CRUD interface. For example, CCK forces a menu rebuild just when a field label has been changed, and that's just to update the secondary tabs under the "Manage fields" screen. Wouldn't it be nicer to have an API to update just that entry in menu_router table? That API could then invoke a hook to let other modules know about that change, so it can be altered, or additional actions taken.

I think CRUD is a misleading concept in this case.
What we need instead is a more targetted listener pattern:

  1. CCK tells the menu router system that one or more implementations of hook_menu will return a different result.
  2. The menu router system calls all the hook_menu implementations to build the new $menu array.
  3. The menu router system compares the new $menu array with the old menu_router table contents, and replaces the old contents with the new ones. This doesn't need a full delete + full insert. Only the changes need to be written to the database. Or that's the idea.
  4. The menu router system notifies the menu links system about the changes in the menu_router table. If we make a hook for that, it can also notify other modules that are interested (such as admin_menu).
  5. The menu links system does the necessary updates in the menu_links table (for the system navigation menu).
  6. Other modules, such as admin_menu, do their own updates to whatever tables they manage.

This is not what I would call CRUD.

Re: "How do you get to these 4 seconds?"

I added a simple call to timer_start('menu_router') after the DELETE FROM {menu_router} statement in _menu_router_build(), and a call to timer_stop() after the foreach loop where the INSERT statements take place. My menu router table has around 4000 records, if that helps. MySQL is installed on a separate server connected to the Apache with 1 Gbps (private LAN). Also, the Apache has APC installed.

Re: "I think CRUD is a misleading concept in this case."

Well, CCK could just invoke something like menu_router_update() to just update the title of a secondary tab. This could trigger a hook so that other modules could react on that, and that's probably less work that rebuilding the whole menu router table. I guess, though, this may not be as easy as that.

Anyway, changes to D6 would have to be compatible with current behavior, and your approach looks more focussed on that. For D7, I haven't had a chance to look how it works, yet. :(

StatusFileSize
new7.42 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-512962-D6_0.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Here's the updated patch with a better data structure for db_insert_multi(), smarter chunk sizing, and a simple error check. If the round trip for a db_query call is 0.5ms and the menu router table has 1,000 records, this will save 1/2 a second. How ever you slice this, it is bound to improve Drupal's performance.

Where else in core would db_insert_multi() be useful?

List of functions that do an insert inside a loop & don't do an update ... if (!db_affected_rows()) ... insert

http://api.drupal.org/api/function/actions_synchronize/6
http://api.drupal.org/api/function/block_admin_configure_submit/6
http://api.drupal.org/api/function/block_add_block_form_submit/6
http://api.drupal.org/api/function/filter_admin_format_form_submit/6
http://api.drupal.org/api/function/_locale_parse_js_file/6
http://api.drupal.org/api/function/locale_translate_edit_form_submit/6
http://api.drupal.org/api/function/module_rebuild_cache/6
http://api.drupal.org/api/function/node_access_write_grants/6
http://api.drupal.org/api/function/profile_save_profile/6
http://api.drupal.org/api/function/system_initialize_theme_blocks/6
http://api.drupal.org/api/function/system_theme_data/6
http://api.drupal.org/api/function/upload_save/6
http://api.drupal.org/api/function/taxonomy_save_vocabulary/6
http://api.drupal.org/api/function/user_admin_perm_submit/6

Some of these could use my multi delete function as well; or a better variant of it.

<?php
/**
* Delete records from the database where IN(...).
*
* NOTE Be aware of the servers max_packet_size variable.
*
* @param $table
*   The name of the table.
* @param $field
*  field names to be compared to
* @param $placeholder
*   db_query placeholders; like %d or '%s'
* @param $data
*   array of values you wish to compare to
* @return
*   returns db_query() result.
*/
function boost_db_delete_in_multi($table, $field, $placeholder, $data) {
 
// Get the number of rows that will be inserted
 
$rows = count($data);
 
// Create what goes in the IN ()
 
$in = $placeholder;
 
// Add the rest of the place holders
 
for ($i = 1; $i < $rows; $i++) {
   
$in .= ', ' . $placeholder;
  }
 
// Build the query
 
$query = "DELETE FROM {" . $table . "} WHERE $field IN ($in)";
 
// Run the query
 
return db_query($query, $data);
}
?>

example usage of del

<?php
/**
* Removes info from database. Use on 404 or 403.
*
* @param array $files
*  An array of files. Each file is a secondary array with keys for 'filename' and 'hash'.
*  The hash is the primary key in the database. If omitted it will be recalculated from the filename.
*/
function boost_remove_db($files) {
 
$hashes = array();
  foreach (
$files as $file) {
    if (empty(
$file['hash'])) {
     
$file['hash'] = md5($file['filename']);
    }
   
$hashes[] = $file['hash'];
  }
  if (
$hashes) {
   
boost_db_delete_in_multi('boost_cache', 'hash', "'%s'", $hashes);
  }
}
?>

subscribing

just found this MySQL variable http://www.mysqlperformanceblog.com/2006/06/08/mysql-server-variables-sq...
bulk_insert_buffer_size
Should probably use that as well in this patch.

subscribing

Subscribe

StatusFileSize
new7.81 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-512962-D6_1.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

bulk_insert_buffer_size incorporated into this patch. I've been testing this locally and it seems to work very well :) If the overhead for a bulk insert can be improved, let me know; the code is a little heavy. I'm thinking about a function that returns the max packet size; it still requires these variables to be set

<?php
  $data
= array();
 
$chunks = 0;
 
$loop_counter = 0;
 
$max_insert = 51200;
 
$chunk_size = 0;
 
$max_packet = db_get_max_packet():
?>

And for this loop to run
<?php
 
// Run the multi insert
 
foreach ($data as $values) {
   
db_insert_multi('menu_router', array(
     
'path' => "'%s'",
     
'load_functions' => "'%s'",
     
'to_arg_functions' => "'%s'",
     
'access_callback' => "'%s'",
     
'access_arguments' => "'%s'",
     
'page_callback' => "'%s'",
     
'page_arguments' => "'%s'",
     
'fit' => "%d",
     
'number_parts' => "%d",
     
'tab_parent' => "'%s'",
     
'tab_root' => "'%s'",
     
'title' => "'%s'",
     
'title_callback' => "'%s'",
     
'title_arguments' => "'%s'",
     
'type' => "%d",
     
'block_callback' => "'%s'",
     
'description' => "'%s'",
     
'position' => "'%s'",
     
'weight' => "%d",
     
'file' => "'%s'",
      ),
$values);
  }
?>

Any ideas on how to make this better looking; the code in this patch isn't exactly pretty.

Subscribing

StatusFileSize
new10.38 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-512962-D6_2.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

New functions this patch adds:
get_memory_limit()
Implement memory_get_usage() if it doesn't exist
get_free_memory()
db_get_max_packet()
db_multi_insert()

I decided to rename db_insert_multi() to db_multi_insert() so the db_multi_* functions will be grouped together if more of them from boost get ported over.

Testing & feedback would be appreciated.

@markus_petrux (#121).

It seems to me this part of the menu rebuild is just a little piece of the whole story. Here, while the whole menu rebuild may take 1 minute, the inserts to the menu_router table after the delete take less than 4 seconds.

I did not really think this through.. 4 seconds you said, and I thought oh wow, only 4 seconds, that's fast!

But no, actually 4 seconds is a lot! Enough to disrupt someone's train of thought. It is less than 4 seconds on my own install, and I still complain. Menu rebuild happens each time I add new or change existing CCK fields and node types, which happens really a lot!

Here, while the whole menu rebuild may take 1 minute

Now this is the more interesting part. I would expect something like 10 or 15 seconds total (which is still a pain), but not 1 minute. Could you find out what happens in this 1 minute?

@donquixote: Let me post a snippet of the timers I used now:

<?php
function menu_rebuild() {
 
variable_del('menu_rebuild_needed');
 
$menu = menu_router_build(TRUE);
timer_start('menu-links');
 
_menu_navigation_links_rebuild($menu);
$timer = timer_stop('menu-links');
print
'menu-links: '. number_format($timer['time'] / 1000, 6, '.', '') .'<br />';
 
// Clear the menu, page and block caches.
 
menu_cache_clear_all();
 
_menu_clear_page_cache();
  if (
defined('MAINTENANCE_MODE')) {
   
variable_set('menu_rebuild_needed', TRUE);
  }
}
/**
* Collect, alter and store the menu definitions.
*/
function menu_router_build($reset = FALSE) {
  static
$menu;
  if (!isset(
$menu) || $reset) {
   
// We need to manually call each module so that we can know which module
    // a given item came from.
   
$callbacks = array();
timer_start('menu-hooks');
    foreach (
module_implements('menu') as $module) {
     
$router_items = call_user_func($module .'_menu');
      if (isset(
$router_items) && is_array($router_items)) {
        foreach (
array_keys($router_items) as $path) {
         
$router_items[$path]['module'] = $module;
        }
       
$callbacks = array_merge($callbacks, $router_items);
      }
    }
   
// Alter the menu as defined in modules, keys are like user/%user.
   
drupal_alter('menu', $callbacks);
$timer = timer_stop('menu-hooks');
print
'menu-hooks: '. number_format($timer['time'] / 1000, 6, '.', '') .'<br />';
timer_start('menu-router');
   
$menu = _menu_router_build($callbacks);
$timer = timer_stop('menu-router');
print
'menu-router: '. number_format($timer['time'] / 1000, 6, '.', '') .'<br />';
   
_menu_router_cache($menu);
  }
  return
$menu;
}
?>

menu-hooks: 1.264840
menu-router: 4.181670
menu-links: 29.619120

Menu router records after rebuild: 3983
Menu links record after rebuild: 2970
Total modules enabled: 220

We have 2 modules that alter the rebuild of the menu links significantly: admin_menu and a custom module that adds a few links to the admin_menu. So this is basically admin_menu.

We also have a lot of different content types and CCK fields, which is something that also generates a lot of links, and also impacts the process of admin_menu.

Thanks for the info!
menu-links is definitely the more evil part, but it's also the part that I consider more difficult to fix.
I remember something like 5:2 or 2:1 for the duration of _menu_navigation_links_rebuild() compared to menu_router_build().

Originally I wanted to fix both of them, but I consider it more healthy to treat them separately. If you want you can fight with menu_links..

From a theoretical point of view, I imagine there is a lot of potential to a faster menu_links rebuild. But in practice, there is really a lot of head scratching.

Maybe you could you measure inside menu_navigation_links_rebuild and report that in the sister issue?

Well, I think our issue is the number of menu links that admin_menu needs to process + the complexity of the code involved in Drupal core for menu links, but also in admin_menu itself.

My conclusion is that the menu system would be a lot better if it had a CRUD interface.

For us, this issue only matters when doing Views and CCK stuff. End users won't generate menu rebuilds, so when we launch, we'll have to allocate batch windows to performs certain changes. I'm afraid we'll have to live with that for a while. It is not a priority right now for us either, mostly because there are more issues on the table yet.

subscribe

looking at http://api.drupal.org/api/function/_menu_navigation_links_rebuild/6 seems like the quickest way to speed this up is to use db_multi_select for the second loop of this function. All the other queries seem to grab multiple rows. This is the code block I'm thinking of optimizing; if some one could verify my guess that would be nice.

<?php
 
if ($menu_links) {
   
// Make sure no child comes before its parent.
   
array_multisort($sort, SORT_NUMERIC, $menu_links);
    foreach (
$menu_links as $item) {
     
$existing_item = db_fetch_array(db_query("SELECT mlid, menu_name, plid, customized, has_children, updated FROM {menu_links} WHERE link_path = '%s' AND module = '%s'", $item['link_path'], 'system'));
      if (
$existing_item) {
       
$item['mlid'] = $existing_item['mlid'];
       
// A change in hook_menu may move the link to a different menu
       
if (empty($item['menu_name']) || ($item['menu_name'] == $existing_item['menu_name'])) {
         
$item['menu_name'] = $existing_item['menu_name'];
         
$item['plid'] = $existing_item['plid'];
        }
       
$item['has_children'] = $existing_item['has_children'];
       
$item['updated'] = $existing_item['updated'];
      }
      if (!
$existing_item || !$existing_item['customized']) {
       
menu_link_save($item);
      }
    }
  }
?>

The other thing I see is to do a db_multi_update on this code

<?php
  $placeholders
= db_placeholders($menu, 'varchar');
 
$paths = array_keys($menu);
 
// Updated and customized items whose router paths are gone need new ones.
 
$result = db_query("SELECT ml.link_path, ml.mlid, ml.router_path, ml.updated FROM {menu_links} ml WHERE ml.updated = 1 OR (router_path NOT IN ($placeholders) AND external = 0 AND customized = 1)", $paths);
  while (
$item = db_fetch_array($result)) {
   
$router_path = _menu_find_router_path($item['link_path']);
    if (!empty(
$router_path) && ($router_path != $item['router_path'] || $item['updated'])) {
     
// If the router path and the link path matches, it's surely a working
      // item, so we clear the updated flag.
     
$updated = $item['updated'] && $router_path != $item['link_path'];
     
db_query("UPDATE {menu_links} SET router_path = '%s', updated = %d WHERE mlid = %d", $router_path, $updated, $item['mlid']);
    }
  }
?>

Knowing which one is slower would be nice. Currently too busy to benchmark ATM.

@mikeytown,
could you post this stuff in the other issue please?
#593682: Optimize _menu_navigation_links_rebuild()

Status:Needs review» Needs work

Does drupal_strlen() do what I need it to do, get the string size in bytes? Short answer: don't think so.

Can produce a high performance menu.inc that uses memcache only and does not use the database?

I might add, has anybody tried putting in a sleep() statement into menu.inc to see if we can slow down the database calls? Furthermore, does db_query() close the database connections? Running SHOW PROCESSLIST; in MySQL reveals that most of the connections are idle while being jammed up...

subscribing

Status:Needs work» Needs review
StatusFileSize
new11.38 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-512962-143-D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

New patch. Changes from last one is it includes a new function called
drupal_strlen_bytes().

Hi,

May we "solve" this situation placing a static HTML menu instead of the Dupal's default menu? Could it avoid these queries or are they called anyway?

Thank you in advance!

@javiaruz: Read the articles on the Drupal menu system in the handbook. This issue is inherent to the way Drupal handles path routing: http://drupal.org/node/102338.

subscribing

The patch from #143 produces the following message during a menu_rebuild() call:

    * warning: pg_query(): Query failed: ERROR: syntax error at or near "WHERE" LINE 1: SHOW VARIABLES WHERE Variable_name = 'max_allowed_packet' ^ in /var/www/html/includes/database.pgsql.inc on line 139.
    * user warning: query: SHOW VARIABLES WHERE Variable_name = 'max_allowed_packet' in /var/www/html/includes/menu.inc on line 2625.
    * warning: pg_query(): Query failed: ERROR: syntax error at or near "WHERE" LINE 1: SHOW VARIABLES WHERE Variable_name = 'bulk_insert_buffer_siz... ^ in /var/www/html/includes/database.pgsql.inc on line 139.
    * user warning: query: SHOW VARIABLES WHERE Variable_name = 'bulk_insert_buffer_size' in /var/www/html/includes/menu.inc on line 2630.

@thekevinday
what database are you using, & what version?

in /var/www/html/includes/database.pgsql.inc on line 139.
This means PostgreSQL, or not?

here's the function in question; I handle PostgreSQL.

<?php
/**
* Get the maximum size of a database query that can be run
*
* Uses database information & free memory to compute the max SQL query size.
*
* @param $default int
*   If an error occurs with the query, use this value
* @return int
*   max size in bytes
*/
function db_get_max_packet($default = 524288) {
 
// Get max insert size for the currently connected database
 
$max_packet = 0;
  if (
stristr($db_type, 'pgsql')) {
   
// Set Max Packet size to 16MB if using postgreSQL.
   
$max_packet = 16777216;
  }
  else {
   
// Get maximum packet size for mysql
   
$max_packet = db_fetch_array(db_query("SHOW VARIABLES WHERE Variable_name = 'max_allowed_packet'"));
   
// default to 1/2 MB
   
$max_packet = (int)$max_packet['Value'] > $default ? (int)$max_packet['Value'] : $default;
   
// Get bulk insert buffer size
   
$insert_buffer_size = db_fetch_array(db_query("SHOW VARIABLES WHERE Variable_name = 'bulk_insert_buffer_size'"));
   
// default to 1/2 MB
   
$insert_buffer_size = (int)$insert_buffer_size['Value'] > $default ? (int)$insert_buffer_size['Value'] : $default;
   
// Set max
   
$max_packet = $max_packet > $insert_buffer_size ? $insert_buffer_size : $max_packet;
  }
  return
min($max_packet, get_free_memory());
}
?>

postgresql version 8.4.2

@thekevinday
can I get the output of this from you

<?php
echo $db_type;
?>

subscribe

Sorry for the long delay, I overlooked your last post, mikeytown2.

I assume you actually want me to do this:

<?php
global $db_type;
echo
$db_type;
?>

The results are: pgsql

Edit:
Looking back at the example code I gave, I do not see $db_type being pulled from a global variable.

Should the function in question instead start like the following?

<?php
function db_get_max_packet($default = 524288) {
  global
$db_type;
 
// Get max insert size for the currently connected database
 
$max_packet = 0;
?>

I think this is important:

Menu rebuild happens each time I add new or change existing CCK fields and node types, which happens really a lot

sadly this is an architectural flaw with node module and cck. Perhaps fixing that would be another way forward.

We could make a new issue "postpone menu_rebuild when working with CCK fields and node types" or "allow to schedule a postponed menu_rebuild". Core would provide an infrastructure for this, that CCK and other modules can use.

"postpone" means: A flag is set somewhere, and a message is shown "the menu has to be rebuilt.", with a link. Similar to content permissions.

I think we should really do this outside of this far too long issue.

--------

Other possible directions, menu_router side:

  • Allow a partial menu_router_build, only for those items that have changed.
    Maybe this is what people mean when talking about CRUD. I would prefer not to use the term CRUD. For me C(R)UD means create, update, delete. With CRUD, adding the same item ten times will create 10 identical items. "Partial update" would be a bit different. Running it 10x should produce the same result as running it 1x, and it should be independent of the previous contents of the menu_router table.
  • Optimize menu_router_build, following my own patches and ideas from somewhere above. The idea is to do only write to the menu_router table if something has actually changed. This means we have to read the entire menu_router table and compare it to the menu array we want to save. The "SELECT * FROM menu_router" could be chunked to save memory, but first we need to make the existing patch work.
  • Optimize menu_router_build, following mikeytown2's approach.

---------

Other possible directions, menu_links side:

  • #593682: Optimize _menu_navigation_links_rebuild(). Some good ideas, but no useful patch yet. The relationship of menu_links with menu_router is still a mystery for me, so I don't expect to fix this any time soon. The idea would again be to write only those things that have changed, to avoid duplicate UPDATE or INSERT queries, and to combine multiple SELECT queries into one (chunked, if necessary to save memory).
  • Allow a partial rebuild of menu_links. This requires that menu_router_build returns a diff set of changes to the menu_router table, instead of just the rebuilt menu array.

All of this should happen in the linked issue above.

perhaps we should look at doing updates where possible rather than delete and insert?

perhaps we should look at doing updates where possible rather than delete and insert?

This is what I was trying to do all the time with my patches. #94 is the last one, but it's too long ago, so I'm not sure what is missing. I'm also not sure which of my patches was for D6, and which of them was for D7.

On the other hand, I do have a working solution on my own patched pressflow D6, that has performed flawlessly on different production sites. Maybe I should just package that thing, but then we'd have this old "D7 first" argument again. We are turning in circles.

The other argument was memory consumption. On my own webspace I always have sufficient memory, but this might be different on most shared hosting accounts.

Doing UPDATE instead of DELETE and INSERT means that we have to know the old contents of the menu_router table. Otherwise, how can we decide when to UPDATE and when to DELETE ?

The most efficient is to fetch all the menu_router table in one big SELECT * FROM {menu_router} statement, but this is also the most memory intensive. We already have an array of similar size in PHP memory, built from hook_menu(). The big SELECT would put an array of similar size into MySQL memory, as someone above argued. In some cases this array could even be a lot bigger, if the menu_router table contains a lot of junk items from some insane module.

The solution is a chunked db_query("SELECT * FROM {menu_router} LIMIT %d, %d", $i*100, ($i+1)*100), fetching 100 items at a time. This is faster than individual SELECT statements, and - I imagine - less memory intensive than doing it all in one unlimited SELECT. Alternatively we could chunk by module name or other criteria.

Now we can loop through this chunked result set, and compare it to the $menu array obtained from hook_menu. After that we can do all the DELETE, UPDATE and INSERT that is necessary, and we get a diff set that we can use in _menu_navigation_links_rebuild().

If we are really afraid about memory, we could even chunk the DELETE queries: Every time we have 100 items to delete, we do it, and flush the $items_to_delete buffer. Then we continue with the chunked SELECT. (but we have to reduce the LIMIT offset by the number of deleted items).

We only need to fetch the keys, not the full contents of the table. Basically - we need to know which to delete, which to update, and which to insert (would be simplified by having a merge query).

Maybe similar to this patch: http://drupal.org/node/307756#comment-2018146

We only need to fetch the keys, not the full contents of the table.

I would fetch the entire row, so we can compare the values and skip the update if nothing has changed.
This method has worked for me very well, and I think it can be done in a way where we don't have to worry about memory.

I think a big and simple select should be faster than a big update. Or not?

#103

AFAIK, that means the full result set (the full router) will be pre-fetched into memory.

The solution has to look something like this:

<?php
function db_table_save($table_name, $new_rows) {
 
$chunk_size = 100;
 
$schema = drupal_get_schema($table_name);
  if (empty(
$schema)) {
    return
FALSE;
  }
 
$primary = $schema['primary'];
  for (
$i=0;;++$i) {
   
$update_rows = array();
   
$delete_keys = array();
   
$q = db_query("SELECT * FROM {$table_name} LIMIT %d, %d", $i*$chunk_size, ($i+1)*$chunk_size);
    for (
$j=0; $j<$chunk_size; ++$j) {
     
$row = db_fetch_array($q);
      if (!
$row) {
        break;
      }
     
$row_key = $row[$primary];
      if (!isset(
$new_rows[$row_key])) {
       
// delete
       
$delete_keys[$row_key] = $row_key;
      }
      else {
       
// update if different
       
$new_row = $new_rows[$row_key];
        foreach (
$row as $key => $value) {
          if (!isset(
$new_row[$key]) && !is_null($new_row[$key])) {
           
// do nothing.
         
}
          else if (
$value != $new_row[$key]) {
           
$update_rows[$row_key][$key] = $new_row[$key];
          }
        }
      }
      unset(
$new_rows[$row_key]);
    }
    if (
$j<$chunk_size) {
      break;
    }
  }
 
_db_delete($table_name, $schema, $delete_keys);
 
_db_update($table_name, $schema, $update_rows);
 
_db_insert($table_name, $schema, $new_rows);
}
?>

I understand why this is a problem and why this generates so many queries, but what I don't understand is why this locks up mysql from serving any other requests at the same time. Is there a way of configuring mysql to be able to handle this kind of query traffic?

subscribe

The patch in #109 is the last patch working reliably for me. All following patches result in pages not found errors while caches are flushed and the menu is rebuild.

+1

StatusFileSize
new12.13 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-512962-166-D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Going back to a simple design where the max amount of data we push to SQL is 1MB. Placed functions into the correct files. Lets get this RTBC.

One issue with lock.inc is if you use lock_acquire to renew a lock that disappeared you get a mysqli_num_rows() expects parameter 1 to be mysqli_result, boolean given in includes/database.mysqli.inc in the db_result function. I've increased the lock time from the default 30 seconds to 60 seconds and this error still appears.

<?php
 
if (isset($locks[$name])) {
   
// Try to extend the expiration of a lock we already acquired.
   
if (!db_result(db_query("UPDATE {semaphore} SET expire = %f WHERE name = '%s' AND value = '%s'", $expire, $name, _lock_id()))) {
     
// The lock was broken.
     
unset($locks[$name]);
    }
  }
?>

In the code above the update db_query fails so it returns a FALSE; thus throwing a php error in the watchdog. The fix would be to make sure db_query is not FALSE, then run it through db_result.

subscribing i'm going to have to try this out on some of my servers soon.

subscribing
+ question:
I'd like to test the #166 patch, I should have a couple of sites available to do so. A lot of my websites sport the duplicate entry errors. Are there official D6 releases to patch against or only dev?

Patch should work against 6.19 or 6.x-dev

Great! thanks.

Because I'm having a horror story on one 6.19 version. When cron runs, I get 53 pages full of duplicate entry entries in my log, all in one minute. It's a website under heavy daily use with lots of views and content types and cck's and stuff.

I think it is even preventing cron from running automatically at this point, I can execute cron manually but no longer through my control panel. I'll patch later today and see what happens.

Thanks again!

Hi mikeytown2

I just patched, and I'm afraid I can't report a success as of now.

I noticed a new error: warning: mysqli_num_rows() expects parameter 1 to be mysqli_result, boolean given in /home/[...]/domains/[...]/public_html/includes/database.mysqli.inc on line 178.

Before I ran update.php, I got a duplicate entry error again when visiting a settings page for Administration Tools (module.) Or well, hundreds of them to be precise, but that was the case before the patch as well.

I was mistaken by the way when I said a cron run invoked the errors. They just happen, seemingly at random. Just thought I'd mention it.

what tables do you get the duplicate entry errors on?

on the notes for #166 I mention the new error you now encounter. It's a separate issue, one that I need to create a bug for...

if for example its the

Duplicate entry 'variables' for key 'PRIMARY' query: INSERT INTO cache ...

then try this patch
http://drupal.org/node/561990#comment-3129014

depending on the error, there are different patches to use.

Possible connection with #840680: "page not found" error but panels page exists, subscribing.

Have the same issue with Possible connection with #840680: "page not found" error but panels page exists, subscribing.

subscribe

Subscribe

Tracking. I had a horrifying few hours today... As far as I can figure, the locking mechanism in menu_rebuild() failed somehow and _menu_navigation_links_rebuild() was executed before the router was completely rebuilt. Then, all the menu items in {menu_links} using router patterns that weren't yet rebuilt got deleted, because of this bit:

<?php
// Find any item whose router path does not exist any more.
 
$result = db_query("SELECT * FROM {menu_links} WHERE router_path NOT IN ($placeholders) AND external = 0 AND updated = 0 AND customized = 0 ORDER BY depth DESC", $paths);
 
// Remove all such items. Starting from those with the greatest depth will
  // minimize the amount of re-parenting done by menu_link_delete().
 
while ($item = db_fetch_array($result)) {
   
_menu_delete_item($item, TRUE);
  }
}
?>

I also had some 404s for completely valid view paths. These 404s went away when I re-cleared all the site caches, but the menu items did not come back. I'm feeling a bit dubious about the wisdom of deleting menu items automatically at all, but fixing the rebuild so that it's not such a hog should at least reduce the risk of it happening again.

Subscribe

Applied the patch in #166, we actually found this to be consistently slower on menu rebuilds by about 10 seconds.

Thanks for the heads up. I optimized for the database layer; looks like this group of function has multiple issues. Try it with the database on another server on the network; speed should be faster due to less db_query commands.

Just took a look at this issue again. Looks like D7 is not involved here, according to catch in #100, so presumably a change could still possibly get in for D6, if it didn't modify public-facing APIs.

It's been a long time since I tested, but #61 seemed faster for me than Drupal 6.11 or whatever it was at the time. Now that I am on Pressflow, which uses a transaction, I haven't really had issues with this so I haven't tested again, and I was hesitant to use a patch in production that wasn't very heavily tested.

I am wondering whether the chunked multi-insert is actually better than #61; it certainly seems more complex. Maybe in the next few weeks, I will re-roll #61 against the current 6.x-dev.

Hopefully #61 can be used to slim down PHP time. MD5 comparison is an idea I have for seeing if the row needs to be updated; could be faster than the current _menu_router_build_calc_changes() function in #61.

SELECT path, MD5(CONCAT(path, load_functions, to_arg_functions, access_callback, access_arguments, page_callback, page_arguments, fit, number_parts, tab_parent, tab_root, title, title_callback, title_arguments, type, block_callback, description, position, weight, file)) AS md5
FROM menu_router

Then use something close to md5(implode('', $item)); to do the comparison. If there is a match, then no need to write that row to the database.

I will say that chunked multi-inserts can give a dramatic speed up; but more likely for 10K+ rows. Could be overkill for menu_router.

StatusFileSize
new6.51 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu-router-build-optimize-512962-184.patch.
[ View ]

I finally had enough of this on a site we're building with several views and content types, having a menu rebuild for each view, content type and field add/change.

The attached patch is against the latest 6.x-dev from CVS, and is as minimal as I can make it. It is based off the patch in #61, but without file additions and API changes.

My rough benchmarking resulted in having rebuild's prior to patching coming in at ~95 seconds with about 2578 queries, and 96% of the request time in mysql_query(), and after, coming in at ~5 seconds, with 206 queries, and 58% of the request in mysql_query() when the router had not changed. The menu_router table after rebuild has about 2400 records in it. The memory usage difference was negligible.

These benchmarks were done on my laptop which doesn't have the best disk IO performance for large amounts of MySQL writes, but the massive difference is testament enough for me!

Obviously, this patch needs testing, especially with adding and removing modules that add entires with hook_menu(). My own tests haven't revealed any issues, so hopefully all is well.

Status:Needs review» Needs work

The last submitted patch, menu-router-build-optimize-512962-184.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new6.51 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu-router-build-optimize-512962-186-D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

So D6 patch files need a -D6 in them hey...

Status:Needs review» Needs work

Seems I might have lied about the memory usage difference being negligible. Patch on the way that will attempt to minimize memory usage.

Thanks for picking up #61 again. I think this is the way to do this.

What kind of memory issues did you run into? PHP or SQL ?

I think the previous arguments were about SQL memory. The solution would be to chunk the queries, with LIMIT 200, or with some kind of filter. Maybe a filter such as, first digit of MD5(path) or something, would be more robust to changes in the table (if the locking framework fails).

For PHP memory, we already have the $menu array in the current implementation. It is copied around, but then again, PHP does not actually copy the array contents unless we modify the copy. So yes, with the #61 approach we could have increased memory cost in the same magnitude as $menu.

@aspedia: kbahey of 2bits.com has a patch for Drupal that creates a memory.inc file which allows you to track memory on a given page request. Maybe statistics from that would be helpful to see what the difference is in memory consumption.

Thanks for the re-roll of #61 - I'll see if I have time to try it out in the next week or so.

Status:Needs work» Needs review
StatusFileSize
new7.96 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu-router-build-optimize-512962-190-D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

It was PHP memory issues, or rather, after the patch the rebuild was using about 40MB more memory than after, but I've reworked it, and now it actually uses less, at least it does on my current large site with ~2400 router items.

Below is output from a XHProf diff on my large site:

                              Run Without   Run With     Diff          Diff%
Number of Function Calls      867,079       571,674      -295,405      -34.1%
Incl. Wall Time (microsec)    268,936,519   156,169,036  -112,767,483  -41.9%
Incl. CPU (microsecs)         13,760,000    15,910,000   2,150,000     15.6%
Incl. MemUse (bytes)          80,304,584    65,153,688   -15,150,896   -18.9%
Incl. PeakMemUse (bytes)      93,358,168    87,592,464   -5,765,704    -6.2%

And on a clean install (Installer was run with patch):

                            Run Without   Run With    Diff      Diff%
Number of Function Calls    55,605        40,888      -14,717 -26.5%
Incl. Wall Time (microsec)  634,324       543,753     -90,571 -14.3%
Incl. CPU (microsecs)       460,000       450,000     -10,000 -2.2%
Incl. MemUse (bytes)        4,280,040     4,277,984   -2,056 -0.0%
Incl. PeakMemUse (bytes)    4,494,664     4,539,272   44,608 1.0%

So on the clean install, peak memory usage was slightly up, but everything else was reduced, and on my large install, everything dropped except the CPU cycles, so the patch is a little more CPU intensive.

This patch obviously will need more testing, and probably better benchmarking, but I'm quite happy with the current results.

StatusFileSize
new8.55 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu-router-build-optimize-512962-191-D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Updated patch which improves memory usage even further.

StatusFileSize
new14.44 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu-router-build-optimize-512962-193-D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Updated patch which includes some navigation rebuild changes. I'd love feedback on this. I haven't run into any issues as yet, and it seems to improve general rebuild performance.

I added a $full parameter to menu_rebuild(), which allows for bypassing any caching this patch does. in effect, menu_rebuild(TRUE) should be identical to running menu_rebuild() before the patch, at least in the outcome.

Test bot is ignoring. Please reroll or reset status.

@figaro: This is a D6 patch. It needs to be manually reviewed. Way back earlier in the thread, catch said all the optimizations for D7 were already done, so it should legitimately be on D6.

An initial +1 from me on the functionality in #193.

I don't have profiled benchmarks, but seeing 18% fewer queries when a rebuild happens, and reduced execution time as well.

What is current the relationship of the patch in #193 and the patch mentioned in #192?

#193 can sometimes bring in a 50% speed increase in total execution time; via this simple test script

<?php
timer_start
('menu_rebuild');
menu_rebuild();
menu_rebuild();
menu_rebuild();
echo
'menu_rebuild: ' . timer_read('menu_rebuild') . "  ms<br>\n";
?>

@mikeytown (#197):
I suggest the following testing script instead:

EDIT:
I thought myself in the other issue.
What we need to do is, to benchmark these two functions individually:
_menu_router_build()
_menu_navigation_links_rebuild()

I think the best way is to simply hack into the code of each of these functions, on a test site. The timer result can be printed with devel dpm() instead of print.

Status:Needs review» Needs work

The patch in #193 includes improvements for the navigation links rebuild... the #192 reference is probably obsolete, and I don't think that patch works correctly anyway.

I did basic benchmarks with xhprof, and I've noticed improvements. I have had some issues with admin_menu however. Deleting all items in the navigation and admin_menu menus seems to fix this though. I'm not sure what the cause is, but it is obviously something different in the way the navigation links are handled.

I'm now running the patch in #193 on several sites, and issues with admin menu are the only problems I've come across, and I've been able to fix them with a few hacks.

If we run into problems I will endeavour to fix them and provide an updated patch here, and am happy to debug any issues people can tell me how to reproduce, however I do not have the time to continue looking into this further.

For the reference of others, here are the commands I run to fix the admin menu issues with patch #193, as well as a small patch I had to make to admin_menu to get it to work. Not sure if this patch is the cause of my problems. I can't remember exactly why I needed it, but it seems to work.

Drush commands:

drush php-eval 'menu_rebuild(TRUE)';
drush -y dis admin_menu;
drush -y pm-uninstall admin_menu;
drush php-eval 'menu_rebuild(TRUE)';
drush sqlq "DELETE FROM {menu_links} WHERE menu_name IN ('admin_menu', 'navigation')";
drush php-eval 'menu_rebuild(TRUE)';
drush -y en admin_menu;

EDIT: The above commands WILL delete all custom links that have been put in to the admin menu or navigation menus. For us this is never a problem, as we put all custom links in other menus.. but it should be noted for others doing the above. If you are unsure, don't do it. It is basically doing a "reset" on every link in the admin menu and navigation menus, and deleting any custom items. Again, if you don't want this, don't run the above commands.

Admin menu patch against latest 6.x-3.x-dev (as of 6th Nov 2010)

=== modified file 'admin_menu.module'
--- admin_menu.module 2010-11-05 22:18:18 +0000
+++ admin_menu.module 2010-11-30 15:36:58 +0000
@@ -148,7 +148,7 @@
   // those are moved back into the administration menu, then the menu system
   // will not calculate/assign the proper menu link parents when the parent item
   // (here: 'admin') is marked as customized.
-  db_query("UPDATE {menu_links} SET menu_name = 'admin_menu', customized = 0 WHERE router_path = 'admin'");
+//  db_query("UPDATE {menu_links} SET menu_name = 'admin_menu', customized = 0 WHERE router_path = 'admin'");
   // Flush client-side caches whenever the menu is rebuilt.
   admin_menu_flush_caches();

So in summary, the patch seems to break existing links in the admin menu, and I have no idea why, but after doing the above, it all seems to run fine.

By "break" I mean items don't appear to be parented correctly. Again, no idea why.

As Mentioned in #197:

<?php
timer_start
('menu_rebuild');
menu_rebuild();
menu_rebuild();
menu_rebuild();
echo
'menu_rebuild: ' . timer_read('menu_rebuild') . "  ms<br>\n";
?>

Will show huge gains in speed when comparing this patch to core. But FWIW, because it is not getting passed the TRUE param, that is essentially serving the cached copy, no?

In my first runs through it a the time for a menu_rebuild() in core is fairly similar to the time taken by a menu_rebuild(TRUE) with this patch.

(Which, don't get me wrong...is TOTALLY worthwhile when you consider issues like #941970: Views rebuilds the menu more than it needs to )

The menu navigation rebuild changes here are /not/ in D7 afaik. If at all possible it'd be great to split the link and router rebuild changes into separate issues, so that the links part could be done against D7 first (note I haven't reviewed the patch properly so that might not be an option).

I think it would be difficult to break apart in its current state.

The router part is built and cached, and then the links are built based on the changes that are picked up from the router build part, so instead of rebuilding all nav links, it only works on the ones that were added or updated in the current rebuild.

I doubt this approach will be able to be applied however, as some modules seem to alter menu_links directly by deleting or removing links from it. These links are then not picked up by the rebuild as nothing changed in the router.

I don't know of many modules that do this, but admin_menu is one that does. I don't know how to work around this at this stage, and I don't know enough about D7 to port a patch at this time, I might get some time over the coming weeks to have a look though.

subscribing

subscribing

StatusFileSize
new10.92 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu_router_D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

#61 patch posted by donquixote rewritten by me to use with Drupal 6.20. It almost twice speed up menu building in my case, and without any side effects.
@donquixote: may be you can apply your technique to #593682: Optimize _menu_navigation_links_rebuild()? Your changes really boost performance.
@aspedia: did you solve problem with #193 and admin_menu?

@Andrew Answer (#205):
_menu_navigation_links_rebuild has more complex requirements, because manual changes need to be preserved. I am a bit scared of this beast.

EDIT:
That said, if there was a well-written and complete spec for the outcome of this process, then I guess a complete clean rewrite could be possible.

+

I just encountered #178 again. A module install triggered a cache clear while cron was running and so two different processes tried to rebuild the menu at the same time, and a bunch of menu items got deleted. At least there's only a dozen menu items to add back on this site, unlike the hundreds there were last time...

@#209: I don't believe so, because it was caused by two simultaneous menu router rebuilds (and and had nothing to do with views)? See #178.

Ya this is easy to duplicate, just open a couple tabs and do module enables or other stuff that rebuilds the menus. I'm surprised a fix hasn't gotten rolled into 6x this this is such an old issue. What is the consensus patch people are using?

Subscribing

Subscribing

StatusFileSize
new14.33 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu-router-build-optimize-512962-214-D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Here is a new patch against the latest 6.x-dev snapshot, which hopefully fixes the issues I mentioned previously in #193 and #199. We'll run this for a few days and see how it goes, but it looks good so far. If there are no problems, I'll look at splitting the nav rebuild out as catch mentioned in #201.

The menu_router_build() function is getting quite long. Do we care?
In general I prefer to split things up in shorter functions. But maybe in this case we want to reduce the namespace footprint.

Two thoughts I had with the original patch approach:
1) Should we keep the $changes_insert, $changes_delete, $changes_update, and make them available to _menu_navigation_links_rebuild() ? Maybe it does allow a faster nav rebuild - but I don't really know.
2) Should we have a hook_menu_router_rebuilt(), so other modules can react to a router rebuild? Probably too late for D6, although I personally believe that introducing new hooks should be allowed even in old releases.

That said, I think discussion about these two points can be postponed, or even moved to a separate issue. Let's see if #214 works and get it in if it does.

---

@aspedia
Eh, one more thing.
Looking at the patch, I see that we have a diff-based update. This makes me happy.
The other thing I see is a static $checksum thing. I don't see this explained in any post in this thread (using a quick find-in-page for "checksum"). I imagine to prevent race con
Could you quickly explain what this is about, for the sake of documentation and me being curious? Thanks :)

Ok, the checksum thing. I'm not sure if you're referring to simply the static function used, or what/why I've used the checksums, so I'll try and explain both here.

The static function is used, as I wanted to store the cache data in the cache_menu table. I also wanted to avoid changing what menu_router_build() returned, as I believe it should just return $menu as it has always done. To this end, at the end of menu_router_build(), where I have access to both "checksum" arrays, I pass them through to the static function for "storage". After the nav links building is done, the menu cache is cleared, which, if i had put the checksums directly into the cache_menu table, would now be wiped, effectively negating all the work I did. :), so after the menu cache is cleared, I then insert the checksums into the cache table, by getting them from the static $checksums in _menu_router_build_checksum_cache(). So it's got nothing to do with race conditions, it's just simply a way of storing the data, so that I can cache it after everything is done, and without changing return values of various existing functions.

In regards to what function the checksums perform, well, I find it to be a very efficient way of quickly determining if a router path has actually changed from its previous definition. The patch uses two checksum arrays, the first is for the router, which creates a checksum of all properties that the router table cares about (which means if a link is "updated", but does not update any properties for the router, then we don't need to change it). And then I have a full checksum array, which keeps track of "any" change to the router item. This list can then be used to signify nav links that, may now be in a different menu for example, but nothing else about the router has changed.

I'm not sure what else to add, but I feel like I haven't explained this very well. Hopefully it answers your question though?

I also wanted to avoid changing what menu_router_build() returned, as I believe it should just return $menu as it has always done.

I agree, the behavior / signature of this function has to remain stable.
In my older patches above I tried to solve this with an by-reference argument to the function (in fact it was an object instead of by-reference array, in complete disregard of php4).

I wanted to store the cache data in the cache_menu table

I wonder, is this the correct / ideal place?
http://api.drupal.org/api/drupal/includes--cache.inc/function/cache_set/6
menu_links and menu_router are completely different things, and menu_router should rather be named just "router". These values belong to menu_router, while cache_menu is for menu links. Best I can think of is the variable table - this only flushes when we really want it.

In regards to what function the checksums perform, well, I find it to be a very efficient way of quickly determining if a router path has actually changed from its previous definition.

So the checksum allows you not having to do a SELECT on menu_router, to find the changes. Probably leaving a smaller memory footprint (**), and being more efficient - although the SELECT approach is pretty fast already.
On the other hand, we now get a bigger footprint in the database, introducing a new cache value.
And we introduce a side effect: Every time the cache_menu is wiped (*), we need a complete router rebuild, where a diff-based one would have been sufficient.

(*) I don't remember when the cache_menu is wiped. For sure it happens in menu_rebuild(), but you already take care of that. But I would find it not surprising if there are other pieces of Drupal / contrib that flush this cache table.

(**) If we are concerned about memory in the database engine for a SELECT query, we could chunk the query. If we are worried about memory in PHP, then we can find other tricks. Such as, md5 an item as soon as we fetch it from the database.
I think none of these worries is justified for menu_router.

Status:Needs work» Needs review

I thought about putting the checksums in the variables table, however, this would put the checksums in $GLOBALS for every page, which I didn't really like. Also, I do think that cache_menu is the correct place. Yes, I agree that it is mostly used for menu links and the like, but it is not cache_menu_links, it is cache_menu, which I go to mean I can put anything menu related in there. The only time in core that the entire table is wiped, (and not a specific cid) from memory, is during the menu rebuild, or on module pages and the like, which I believe is the best behavior, as it still allows for a full "flush" to be done without much effort. The general idea of this patch, is not to make the rebuild fast "every" time, but for most of the time. Having said that, I did originally put the cache in the plain old "cache" table, but I felt that if cache_menu was cleared, there was probably a good reason for it, and that we should do a full rebuild in this case anyway.

Selecting from the database instead of making a cache also has its own issues when, for whatever reason, the router table gets out of sync and we end up with ghost or rogue items in there that should not be, that may cause issues, whereas with the cache, clear the cache, and the table is all fresh and pristine. I'm not sure of a clean interface that would allow the same to be possible using a select approach to calculate the diffs. I agree there will be cases where a diff approach would be ok, and we do a full rebuild, but hopefully those cases will be minimal.

In my mind, the cache tables grow to a reasonable size anyway, and the amount of data the checksums takes up is minimal compared to some of the other cached items in cache_menu. For example, on a largish site we're putting together, the site currently has these stats related to this patch:

# records in menu_router - 2833
# records in menu_links - 2122
Size of menu_router (MyISAM/utf8_general_ci) - 2.2 MiB
Size of menu_links (MyISAM/utf8_general_ci) - 728.6 KiB

Size of menu-rebuild:router-checksums in cache_menu - 527.4 KiB

That checksum size, whilst half a Meg might seem like a lot, is actually a great deal smaller than all of the admin_menu links caches in the same table. I don't think the DB footprint is much of a concern here. If your DB can't handle an increase in size of around even 1MB, then you have bigger problems I believe.

We've been running the above patch for about half a week now on our development sites (which tend to have the most problems due to ever changing menus as content types and views are added/removed) and have not had any issues noticed as yet, so am marking as needs review. If we notice any problems I'll change it back.

If your DB can't handle an increase in size of around even 1MB, then you have bigger problems I believe.

I am not worried about size. With "footprint" I just mean that you look in the cache_menu table and see something you would not see without this patch. This does not have to be a problem, it just means a difference to the old behavior, that we'd avoid with a SELECT approach.
It also means, we create something that other modules can interact with, that theoretically can cause side effects, and can be affected from side effects.

Selecting from the database instead of making a cache also has its own issues when, for whatever reason, the router table gets out of sync and we end up with ghost or rogue items in there that should not be

The idea of the SELECT approach is to achieve a guaranteed 100% replacement of all table contents. Any ghost or rogue item has to and will be detected and deleted. The only thing we can not guarantee with this approach is the order of items in the table.

In #61 there was

<?php
// compare old and new rows
_menu_router_build_calc_changes($rows_old, $rows_new, $changes);
+
// save changes
_menu_router_build_save_insert($changes->insert);
_menu_router_build_save_update($changes->update);
_menu_router_build_save_delete($changes->delete);
?>

In the D7 attempt of #94 (which is not going to work) this was wrapped into a function

<?php
+/**
+ * TODO:
+ * This can be added to the DB abstraction layer for reuse.
+ * Maybe the $primary_key param arg can be made more
+ * flexible to support multi-column primary keys.
+ */
+function _menu_router_save_table_contents($rows_new, $table_name, $primary_key) {
?>

The idea is that after this function to run, the table contents will be identical with $rows_new (except for the order). So it wants to be a complete replacement for TRUNCATE + INSERT. The only thing that could spoil the fun is if anything in Drupal needs the exact order of router items (and does not order the items itself). But I think the checksum approach is no different in that.

Probably the checksum approach is a little bit faster.

if cache_menu was cleared, there was probably a good reason for it, and that we should do a full rebuild in this case anyway.

My point is, the SELECT approach is meant to be a "full rebuild", just that it works smarter than the full delete + insert.

Yes, I agree that it is mostly used for menu links and the like, but it is not cache_menu_links, it is cache_menu, which I go to mean I can put anything menu related in there.

I would argue, this is not menu related. It was a mistake that this thing is called "menu_router", it should rather be called just "router". Anyway, yes, there is some connection with the menu links system, and the router table contains meta information for building the navigation menu. Probably better to ask "when is this wiped". I think there are many cases where we want to wipe the cache_menu for menu_links related stuff, but do not need to rebuild menu_router.

I tend to agree with most of what you've said. To be honest, I'm just happy I now have something that works and doesn't mean waiting for ages when rebuilding the menu. I'd be happy to evaluate a select based approach if there is a consensus that this would be better. I personally don't have the time to write a patch for this however. I agree that there could be room for improvement in the patch I've uploaded, but for my own personal needs this works perfectly, and there is currently not much motivating me to write an alternate version that functions differently. Happy to test and trial other approaches however.

Ok, I do it :)
Btw, here is a little piece of code to test the beast..

<?php
function mymodule_menu() {
 
$items = array();
  for (
$i = 0; $i < 3; ++$i) {
   
$path = 'imagene-test-' . rand(1, 5);
   
$items[$path] = array(
     
'page callback' => 'exp',
     
'page arguments' => array($i),
    );
  }
  return
$items;
}
?>

This will almost always generate some update, delete and insert.
Nice, eh?
(patch to come in the next post)

StatusFileSize
new6.9 KB
FAILED: [[SimpleTest]]: [MySQL] Drupal installation failed.
[ View ]
new6.09 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal.menu_router_build.SELECT.d6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Here is the patch, based on SELECT.
Things to consider:

  • The physical order in the menu_router table will usually not be identical with the one we would get without this patch. I assume this is not a problem. If I am wrong, then this entire approach (and the one from aspedia) fails. Or, we would have to make it optional.
  • Memory consumption for the db engine and the php request will be a bit higher than it used to be. But, there are other places in Drupal that typically take up much more of memory. The order of magnitude is the old size of the menu_router table, plus the new size of the new menu_router table. I say this, because previous posts in this issue have raised concerns about memory consumption.

I made two versions of the patch.
One of them contains benchmark and debug statements for "drush cc menu", and the function has a return value that might be useful in the future (just to inspire some imagination).
The other is just the plain patch.

The patch turned out a lot simpler than those in older posts. I guess my skills have improved since then :)
The patch is against the current 6.x-dev from git, but I only tested it on 6.19 so far. I will probably do more testing in the next weeks.

Benchmark results:
The drush command tells me,

972 rows inserted.                                                                                                              [ok]
Time for _menu_router_save_table_contents() slow: 1369 ms                                                                       [ok]
1 rows updated.                                                                                                                 [ok]
Time for _menu_router_save_table_contents() fast: 40 ms

Maybe the approach by aspedia will be even faster than the 40 ms, but I don't think this matters.

The _menu_router_save_table_contents() could be generified and go into the db layer, but I think that's too much for D6.

Status:Needs review» Needs work

The last submitted patch, drupal.menu_router_build.SELECT.d6.drush-benchmark.patch, failed testing.

StatusFileSize
new6.09 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu_router_build.SELECT.D6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

That's quite interesting - it always tests the last file I upload. And apparently it can test D6! (did not notice that before). Trying again.

StatusFileSize
new6.09 KB
PASSED: [[SimpleTest]]: [MySQL] 190 pass(es).
[ View ]

One more try.
Where is this test bot stuff documented?? Time to put a link next to the attachment form.

StatusFileSize
new6.09 KB
PASSED: [[SimpleTest]]: [MySQL] 190 pass(es).
[ View ]

Status:Needs review» Needs work
StatusFileSize
new6.09 KB
PASSED: [[SimpleTest]]: [MySQL] 190 pass(es).
[ View ]

EDIT:
Ok I give up.

drupal.menu_router_build.SELECT.d6.drush-benchmark.patch is queued by the test bot (and obviously fails, thanks to the drush_log)
drupal.menu_router_build.SELECT.d6.test-me.patch is ignored.
Where is the logic in that?

EDIT:
Anyone with permissions, feel free to delete the garbage.
See also #1162082: Specify why a patch is ignored by the test bot.

Status:Needs work» Needs review
StatusFileSize
new6.09 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal.menu_router_build.SELECT.d6.test-me_0.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Ehm, the "needs review", probably. duh.

Status:Needs work» Needs review

#229 works for me. I like the simple logic that avoids problems during rebuild.

#229 works for me. Looks like no problems and it speed up menu rebuild! Thanks donquixote!

We have a tasty patch in #229. Let's not let it grow a beard.

If anyone is concerned about memory (I say this, because it has come up over and over above), we could add some things to fix it:
- chunked SELECT. This could be done with LIMIT, or some criterion. LIMIT is only safe if no other write query interferes. (*)
- delete the table contents, if the row count in {menu_router} is more than double the number of new items. This is relevant if some crazy module has filled the router table with a lot of junk.

(*) Do we already lock the db for this process?

@donquixote
Good job of avoiding the max packet issue from #166. I think the multi insert can be useful elsewhere, any reason why we don't roll it into it's own function?

I think the multi insert can be useful elsewhere, any reason why we don't roll it into it's own function?

Reasons:
- We do not want to further delay this fix.
- We want to keep the patch size small, while we are in D6. If we factor out this function, it should probably live in the db layer, so the patch would then live in more than one file.
- I want to avoid yet another "D7 vs D6" debate (*).

Imo we should get this one committed, and then work on a refactoring patch that would move the multi insert into a separate, reusable function in the db layer. We could even have a generic function for the "replace table contents".

Good luck then, to convince core maintainers of these "new features" for the D6 db layer :)
I imagine it will very likely trigger

------

(*) D6 vs D7 vs D8.
We have been talking "6.x-dev" since #100 (catch).
All this under the assumption that the only optimization is a multi-insert.
D7/D8 already has its own way of _menu_router_save() using dbtng, which does multi-insert (afaik), but which does not do any diff-based something.
So, strictly thinking, we now have to switch this back to D8, then backport to D7 and D6. Feels like turning in circles :(
All there is to change for D8 and D7 is _menu_router_save(), which currently uses a db_truncate() approach.
Having done that on D8 / D7, the D6 patch from #229 will still be the same (maybe some line numbers need to change).

So, I guess before I talk forever, I simply bite the apple and do the work for D7 / D8.
I leave this as "needs review", so others can have a look at the D6 patch and decide if it has any flaws.

Sounds good, lets try to get #229 in.

Version:6.x-dev» 8.x-dev
StatusFileSize
new5.51 KB
PASSED: [[SimpleTest]]: [MySQL] 29,856 pass(es).
[ View ]

I measure a significant performance boost on D7 as well.
Patch includes benchmark code, which obviously will need to be removed later. There is a place where you choose "TRUE" or "FALSE" to use either the old truncate-all or the new diff-based technique. For me the diff-based approach was around 6x faster.

The patch is rolled against D8, although it was tested with Drupal 7.0 only.
D7 or D8, all you need is replace the function _menu_router_save().

@donquixote: Your patch is certainly a lot easier to follow than mine, and if it is working for others then that is excellent as it is sorely needed.

My only concern is the approach you've taken will not allow the nav links rebuild to optimized as it doesn't pass any of the "CRUD" information on to the nav links, and due to the way nav links work, it then becomes I believe quite hard to do "CRUD" ops on the nav links as well.

My only concern is the approach you've taken will not allow the nav links rebuild to optimized as it doesn't pass any of the "CRUD" information on to the nav links, and due to the way nav links work, it then becomes I believe quite hard to do "CRUD" ops on the nav links as well.

I would like to make the crud info available to others, but I think this can happen in a separate patch. I was afraid that sending a return value into the void would only cause more questions and delay.

Also, we can bikeshed a bit about the magic number of chunked insert. D7/D8 nowadays use 20, while my D6 patch does 100 at a time. I don't think 100 is too much.

I do have some advanced and nice ideas about how exactly to expose the crud data from the router.
#550254-36: Menu links are sometimes not properly re-parented ("hook_menu_router_update")
#512962-216: Optimize menu_router_build() / _menu_router_save() ("hook_menu_router_built")
#653784: Separate out menu links from hook_menu ("hook_menu_rebuild")
I think this deserves a separate issue. Just created,
#1170278: Introduce hook_menu_router_update()

Let's get this simple thing in, then continue in the new issue.
(going to produce a new D8 patch without the benchmark / debug stuff)

I wonder if dbtng does accept placeholders anywhere.
Some looking around told me that db_update()->execute($args) usually does PDOStatement::execute($args), which treats all arguments as strings. Thus, they will all be with quotes. That is as if we do all '%s' instead of %d even for numbers.

Probably not a big deal, but why then have we been so crazy about using the correct placeholders in the past?

pass any of the "CRUD" information on to the nav links

If you look at the patch in #, there is some serialize() going on. The diff has to be created on the serialized data. However, for nav links rebuild we rather want the unserialized data.
Also, there is the question of 'access callback' (with space), vs 'access_callback' (with underscore). Which version should we send to nav rebuild? I think currently we send the version with spaces, although the nav rebuild does not use any of the fields where this matters.

#229 works for me. No problems seen and does speed up menu rebuilds! Many thanks!

I used this patch @#229 with Pressflow 6.22

What is needed to get #229 or #239 in?
We need to decide if we go the long route D8 -> D7 -> D6, but then...
Could someone with more "authority" say something?

StatusFileSize
new5.51 KB
PASSED: [[SimpleTest]]: [MySQL] 33,553 pass(es).
[ View ]

A new D8 patch without the benchmarks.
And the one with benchmarks re-uploaded so we have them all in one post.

D6 or D8 ?
I think we should really commit this into D8 then D7 then D6.
The direct performance benefit for D7 and D8 is not as big as for D6, but we do gain some interesting options to use the diff for other things (see #1170278: Introduce hook_menu_router_update()).

StatusFileSize
new4.74 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal.menu_router_build.SELECT.d8.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

What's going on.. I wanted to upload two patches not one.
So here is the clean, "non-benchmark" patch.

EDIT:
Hell, why is this "ignored" ???
Maybe because the other test is still running?

#244: Try renaming it so it doesn't end with a number.

StatusFileSize
new5.51 KB
PASSED: [[SimpleTest]]: [MySQL] 33,559 pass(es).
[ View ]
new4.74 KB
PASSED: [[SimpleTest]]: [MySQL] 33,546 pass(es).
[ View ]

Issue summary:View changes

Add an up-to-date summary

Issue summary:View changes

complete list of benefits at the top, including the follow-up issue.

+++ b/includes/menu.incundefined
@@ -3649,17 +3648,83 @@ function _menu_router_save($menu, $masks) {
+  $q = db_select('menu_router')
+    ->fields('menu_router')
+    ->execute();

Since this query is not tagged or anything, can we just use a static query instead?

+++ b/includes/menu.incundefined
@@ -3649,17 +3648,83 @@ function _menu_router_save($menu, $masks) {
+  while ($row_old = $q->fetchAssoc()) {
+    $path = $row_old['path'];
+    if (!isset($menu[$path])) {
+      $delete[$path] = TRUE;
     }
+    else {
+      // Compare the old row with the new row.
+      $row_changes = array();
+      foreach ($row_old as $key => $value_old) {
+        if ($menu[$path][$key] != $value_old) {
+          $row_changes[$key] = $menu[$path][$key];
+        }
+      }
+      if (count($row_changes)) {
+        $update[$path] = $row_changes;
+      }
+    }
+    unset($insert[$path]);
+  }

Could unsetting some of these variables help with memory? Most notably $q, and possibly others? (Or is it worth testing, at least?) This would apply to the D6 patch as well.

One other question, why is D6 doing chunks of 100, but D7 only 20? Edit: Just noticed you answered this above: 20 is what D7/8 were already using, but 100 works on D6 in testing.

20 vs 100.
I decided on 100, before I even noticed the 20 used in D7/D8.
The decision is a bit arbitrary, but I think 100 is a better tradeoff than 20. I imagine 100 is faster than 20, but still far too small to cause memory issues.

> Could unsetting some of these variables help with memory?

I assume this memory is freed automatically when the function ends.
Do you know a way to test memory consumption?
Can you confirm that the memory "problem" even exists? I don't want us to waste time on a non-existing problem.

If we don't have any better way of measuring, then the most reasonable thing is trying to use not more memory than other essential parts of Drupal. So I think if we find another part in Drupal that does use the same amount of memory or more, then we are done. Probably the module_invoke_all('menu') is such a candidate.

> Since this query is not tagged or anything, can we just use a static query instead?

Ok why not. Guess this is a policy, good practice or sth?

Since this query is not tagged or anything, can we just use a static query instead?

Ok why not. Guess this is a policy, good practice or sth?

Static queries are faster; see: http://drupal.org/node/310072#comment-4389610

As for testing the memory use, this will be an important part of benchmarking (regardless of whether or not we unset). There are the PHP functions memory_get_usage() and memory_get_peak_usage(), to start. Otherwise, tools like xdebug have profiling functionality, I believe.

Does anyone know if a chunked query is actually cheaper on (SQL) memory?

There are two possible approaches of chunking.

By count.
This absolutely requires that the table does not change between queries.

<?php
for ($i = 0;; ++$i) {
 
$q = db_query_range("SELECT * FROM {menu_router}", $i * $n, $n);
  [...]
}
?>

By filter condition.
This would survive table changes between requests - although there are other reasons why we would prefer the table not to change during the entire process.

<?php
$characters
= array('a', 'b', ...);
for (
$characters as $c) {
 
$q = db_query_range("SELECT * FROM {menu_router} WHERE MD5(path) LIKE ':c%'", array(':c' => $c));
  [...]
}
?>

I think this approach is smarter, and more likely to reduce SQL memory cost.
For the "LIKE" I hope there is something better that does the same. Any ideas? Ideally I want a hash function in SQL that reduces a string to one character.

Third alternative, combine LIMIT and WHERE.

<?php
$characters
= array('a', 'b', ...);
$q = db_query_range("SELECT * FROM {menu_router} ORDER BY path", 0, 100, array(':c' => $c));
for (
$characters as $c) {
  foreach (
$q as $row) {
    [...]
  }
  if (!
$row) break;
 
$q = db_query_range("SELECT * FROM {menu_router} ORDER BY path WHERE path > ':c'", 0, 100, array(':c' => $row->path));
}
?>

Hey, I have no idea if this even works.
The "ORDER BY" could kill our intention about memory..

StatusFileSize
new4.71 KB
PASSED: [[SimpleTest]]: [MySQL] 33,539 pass(es).
[ View ]

And here we have a patch with static query.
(benchmarks should be similar to those with #246)

This would survive table changes between requests - although there are other reasons why we would prefer the table not to change during the entire process.

Right, for example, if something else tries to rebuild, it deletes all your menu custom menu items.

The second query pattern does not seem like a good idea to me. A WHERE and a LIKE and a wildcard in there; I am sure it will be less performant. Edit: or the ORDER BY as well. Definitely stick with the simple query.

I just tested the D6 patch (in #512962-229: Optimize menu_router_build() / _menu_router_save()) on an actual D6 site with about 1500 entries in {menu_router}. I repeatedly cleared the menu cache both with and without the patch. The PHP memory use and peak memory use over the course of menu_rebuild() were not different in any statistically significant way. However, in my small sample, the average processing time for menu_rebuild() dropped from ≈7.2s avg. to ≈6.4s avg. This is in the case where no menu items were actually changed.

@xjm
Can you benchmark the specific part of menu rebuild that we are attacking in this issue?
(although i think it has already been confirmed quite well that we get a good speed-up on D6)

Well, I just wanted to see what the impact would be on a full, live page load, and whether the memory issue was real in my case.

Ideally we would automate some simpletests both with an unchanged menu and with an updated menu. I can't test that on my production site. :) I'll see if I can set up a script to get some data from drush in my development environment (which is too slow to even consider a cache clear through the browser).

Category:bug» task
Priority:Major» Normal

Sorry, but as the title suggests, this is a performance improvement task.

@xjm: Please create a separate bug report about a potential bug in the menu router rebuilding.

@donquixote: Note that we're slowly reaching 300 comments on this issue, at which point issue follow-ups are going to turn into a mess (due to paging). As of now, the patch still looks a bit "rough" in terms of code and especially comments. I'd recommend to continue to improve it in this issue, but as soon as it's deemed "ready", it would likely make sense to re-start in a fresh and new issue, having a solid issue summary and a clean and working patch right from the start. Ideally, the issue summary would then also hold some details about the different approaches that have been investigated in this issue over time, including hints for why they didn't work out, etc.

Title:Optimize menu_router_build()Optimize menu_router_build() / _menu_router_save()

the patch still looks a bit "rough" in terms of code and especially comments.

I would love to improve, but..

Comments:
A lot of the comments are just left in the same way as they used to be. I don't think they are any worse now than they used to be.
"rewrite some field values"
"Execute in batches to avoid the memory overhead of all of those records in the query object."
-> The logic of these parts has not changed, so I did not feel the need to change the comments..

Those comments that I added are quite ok, imo.
"Determine the diff of new and old table contents."
"Compare the old row with the new row."
"Save the changes."
"Delete rows that no longer exist."
"Update rows that have changed."
"Insert new rows, in chunks of 20."
-> I don't see how this can be said any better.

There are some comments that were useful for working on the patch, but which probably should not be committed.
"new in D7"
"gone in D7"
If that's all, I'm ok with it.

A few things that could be done, but that I don't see as being in the scope of this patch:
- improved docblock for _menu_router_save, with @param $menu and @param $masks. I tried to do that just now, and then found that there is no explanation for $masks to be found anywhere.
- signature of _menu_router_save: Write "array $menu" and "array $masks" instead of just "$menu" and "$masks". Yeah, probably I'd do that in my own module code, but for core I don't even know if that's an accepted policy.

Otherwise, I would really need some more specific feedback, or I can only guess.

Code:
Nothing that pops into my eye. Hint?

StatusFileSize
new4.56 KB
PASSED: [[SimpleTest]]: [MySQL] 32,989 pass(es).
[ View ]

To move on, here is a new patch with cleaned up / slightly improved comments.
For anything beyond that, I really need more specific feedback.

Status:Needs review» Needs work

Didn't do an in-depth review but here's a few things that jumped out in terms of coding standards:

+++ b/includes/menu.incundefined
@@ -3590,42 +3590,39 @@ function _menu_router_build($callbacks) {
+  $field_placeholders = array(
+    'path'              => "'%s'",
+    'load_functions'    => "'%s'",

There's no need for placeholders in Drupal 7/8.

+++ b/includes/menu.incundefined
@@ -3590,42 +3590,39 @@ function _menu_router_build($callbacks) {
+  $field_names = array_keys($field_placeholders);

And they're dropped here anyway.

+++ b/includes/menu.incundefined
@@ -3649,17 +3646,81 @@ function _menu_router_save($menu, $masks) {
+
+  $q = db_query('SELECT * FROM {menu_router}');

Please use something like $result instead of $q.

+++ b/includes/menu.incundefined
@@ -3649,17 +3646,81 @@ function _menu_router_save($menu, $masks) {
+  // Insert new router items, in chunks of 20.
+  if ($n_insert = count($insert)) {

$n_insert is never used. Just if ($insert) { should be fine here?

I'm also wondering if once this patch is in we'd be able to reduce the lock window for router rebuilds at all, feels like there should be much less chance of a race condition.

9 days to next Drupal core point release.

Thanks for the feedback!

There's no need for placeholders in Drupal 7/8.
And they're dropped here anyway.

valid point.
I think D7/8 always uses the '%s' placeholder? Or it looks at the value type to choose a placeholder?
I checked this one day, but don't remember..
anyway, looks like this is no longer needed.

I wonder, can we do it without the field names array? The only place where they are used is here,

<?php
   
// Prepare the insert object.
   
$q = db_insert('menu_router')
      ->
fields($field_names);
?>

Does the db_insert() always need explicit field names? Afaik, it does..

Please use something like $result instead of $q.

Ok.. that's a private habit, but yeah, "$result" is the Drupal standard.
Do we use the same for the insert query? We can not say "$insert", this is already in use.
What about "$query"? That fits both for the select and the insert.

$n_insert is never used. Just if ($insert) { should be fine here?

Same story then for if ($update) and if ($delete).

I'm also wondering if once this patch is in we'd be able to reduce the lock window for router rebuilds at all, feels like there should be much less chance of a race condition.

The chance for race condition on menu_router will become smaller, but we should still attempt to play it safe, and keep the lock.

I agree with keeping a lock, just wondering if we can reduce the window of the lock to the actual insert/update/delete or something along those lines. Doesn't have to happen here either way.

Placeholders are handled by PDO, there's just no need to use them at all except in db_query( and they are all in :placeholder format).

$query is fine instead of $q.

Status:Needs work» Needs review
StatusFileSize
new4.11 KB
PASSED: [[SimpleTest]]: [MySQL] 32,985 pass(es).
[ View ]

So, here's another attempt.

StatusFileSize
new4.13 KB
PASSED: [[SimpleTest]]: [MySQL] 32,975 pass(es).
[ View ]

Alternative, with the $field_names array moved into the db_insert(..)->fields().

StatusFileSize
new4.15 KB
PASSED: [[SimpleTest]]: [MySQL] 33,759 pass(es).
[ View ]
new4.13 KB
PASSED: [[SimpleTest]]: [MySQL] 33,774 pass(es).
[ View ]

Hmm, which of #255 and #256 the above is preferred? For now rerolling both I guess.

xjm ... I think you meant #266 & #267

IMO #267 is a bit easier to follow whats going on. I dont think there are any other differences...

Status:Needs review» Needs work

One thing we need to think about:
The order of tabs and menu items on one level, if they don't have explicit weights.

Possible solution:
Collect the diff by menu parent and weight, and always wipe and re-add the full set of items that has the same parent and the same weight.
I suppose this is still possible, but can be a bit tricky.

Status:Needs work» Needs review

Do you need to switch version to D6 to re-test a D6 test? Or is the version of one test remembered for the re-test?

Version:8.x-dev» 6.x-dev
StatusFileSize
new6.09 KB
PASSED: [[SimpleTest]]: [MySQL] 190 pass(es).
[ View ]

I believe we need to change the branch to D6 to test a D6 patch. Please let me know if I'm wrong, but currently it doesn't seem to be working as it's testing against D6. I've re-rolled the patch to avoid bad hunks against dev and changed to 6.x-dev to see if testbot will pick it up.

I did review this patch. Personally I'm having a problem on my live site with this issue:
#246653: Duplicate entry in menu_router: Different symptoms, same issue Comment #228

My thoughts on the mater, that a bug exists when a lock times out between the DELETE and INSERTs when rebuilding the menu. I believe this patch could assist in minimizing this problem as there's less chance to have the menu rebuilt when it's not quite ready.

I also re-wrote the code to test where all entries are deleted, then right after INSERTED in a single query. This proved much slower than I expected, because of array_merge and array_values surprisingly.

Additionally with this patch, I don't find that it's in fact faster, but the reduced amount of queries I believe my help with my bug. This code actually seems to take longer to run than current core code in my tests. I'm unsure why, but that's what XHProf said, despite this new diff code only performing a single query vs many from existing core code. I didn't test it heavily, just a couple times, so could have been my load.

On the issue of memory vs. queries: Personally since my site is on a large server and has lots of resources, but slow database, the trade off for memory is well worth it.

Sidenote
The memory vs. query debate is something I see through out Drupal. Personally I believe that everything should be optimized for speed and reduce the amount of queries, which is the bane of Drupal. Drupal doesn't really need to be able to be installed on your toaster (NetBSD tm). Drupal has never worked well on shared hosting and always requires a bare minimum low mem VPS. With these getting cheaper, I believe it's less and less important to optimized for toaster servers. With that said, if this was put into D6 core, it could cause problems with pre-existing installs, so it may always exist as a patch.

If your runing a D6 multi-lingual site and need some query vs memory optimizations take a look at this i18n patch
#877016: i18n creates big database load Comment #14 (tests don't work on D6 for i18n, but this patch saved hundreds of queries)

Hi,
some comments, I hope they are helpful.
(I said this before in this issue, but you probably don't want to read all the comments)

  1. If you want to test performance, you should definitely benchmark the function itself, in addition to the total request time, to get a statistically significant result.
    I typically do that with xdebug. xhprof might do the same, I have no idea. Or you could put manual microtime(TRUE) etc.
  2. The rebuild happens in two steps: router rebuild, and navigation links rebuild. The second typically eats way more time. This issue is about optimizing the first step because that is an easier target, but it was always thought as a first step only.
  3. You may be right with your assumption of the locking framework.
    However, I would rather think that the navigation links rebuild is a fragile operation, and would not be surprised if something goes wrong in there.
  4. Performance is easier to measure than memory consumption. Especially, if you are interested in memory consumption of specific functions or variables. (if you know better, let me know!)
    My rationale in this thread has always been that this diff-based rebuild can't eat more memory than a lot of other things happening on the site, where we operate with arrays of the same magnitude or higher. E.g.
    - loading a menu links tree (admin menu) and doing all the access checks.
    - views_get_all_views()
    - invoking hook_menu() on all modules.
    While for performance we should be happy about every millisecond, for memory we should focus more on the big ones.

@donquixote Thanks for the patch. Some more observations from my testing yesterday if anyone is interested.

xhprof gives me the milliseconds for each function call. Also tells you which functions call which functions and how many times. Also the memory before and after the function call. You should play around with it, it's great.

With that said, while _menu_router_build() was sped up, it did use a couple more megs of memory. I am working on a large site with huge menus and 9x languages...so this won't happen for most people. Overall the entire time to clear the caches was a little slower, but only by about 1 second. I'd really need to test further, as the reduction in queries should make it faster.

And yes, navigation links does take the longest time and does way too many queries. With admin_menu installed for my install, a cache rebuild does around 12,000 queries and takes over 30 seconds at times. I believe your code reduced this by a couple hundred queries (more than 500 for sure). If I remove admin_menu it's still around 6000 queries per cache rebuild.

Your code I believe could also be optimized for memory and speed by flattening the $args array and doing array_slice()'s instead of array_merge()'s. I have a feeling you could also save memory by not using a keyed array, as I believe each key takes up additional memory. That would also remove the need to do array_values()'s. Perhaps constants used for array keys may not duplicate memory? Anyone know? I noticed array_merge() followed by array_values() took a lot of CPU time as well, when I changed the insert chunks from 100 to 2000 ...so I think if these can be avoided, it would speed things up. Again these are just my unconfirmed thoughts.

Unfortunately a refactor like this makes the code very hard to follow.

I'd also need to do more reading on PHP arrays to provide a better more memory efficient solution, here's a start:
https://sheriframadan.com/2012/10/a-closer-look-into-php-arrays/

Issue summary:View changes

Added links to patches to summary.

I want to share my (D7 and totally impossible to include in core) patch that seems to have solved my issues with a very slow menu rebuild process.

My issue was that whenever we did something that would initiate a menu_rebuild, we would have multiple processes performing the rebuild and all other requests hanging in the meantime, since there would constantly be a menu_rebuild in progress.
This would go on for up to 10 minutes.

Using Background Process, I delegated the actual rebuilding to one, single process running in the background and while it may lead to a small delay in a new menu router item being available or give a 404 to a request just as the menu cache is being refreshed, it is a small price to pay compared to a 10 minute service lockdown.

Comments are welcome, but I just hope to help others in a situation similar to mine.

(I couldn't seem to add the patch to just my comment, so I added it to the issue - 512962-menu_background_process.patch)

Related issues:+#1499738: db_merge does not allow multivalue inserts.
StatusFileSize
new3.85 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch menu_155.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Apologies, I've written the attached patch (D7) without seeing the existing work on this issue.

Its currently using db_merge. It would be better if it could support multivalue #1499738: db_merge does not allow multivalue inserts. and I'll hopefully give implementing that a go soon.

As its all wrapped in a transaction anyway (menu_rebuild function) so I'm not sure how much performance can be gained. Like all patches here removing db_truncate is an important first step #2229013-21: PDOException: SQLSTATE[HY000]: General error: 1030 Got error -1 from storage engine: TRUNCATE {cache_field} ; Array ( ) in cache for reliability.

Status:Needs review» Needs work

The last submitted patch, 277: menu.patch, failed testing.