Hi,
I just noticed that access checks are performed against all menu callbacks on every page. Tracing this, I found it was fired from the navigation menu, which calls menu_tree(), and this one calls menu_tree_page_data(), and here the menu structure is loaded from cache, but access check is performed against every single menu item, even it is not going to be displayed on the block. So menu items of type MENU_CALLBACK are also checked here.
The whole menu system is so complex, though it looks like in this case menu_tree_page_data() could be optimized to omit 'hidden' items, that will be ignored anyway by menu_tree_output().
menu_tree_page_data() is also called from menu_navigation_links(), where 'hidden' menu items are also ignored. So maybe this could also benefit from such optimization.
menu_tree_page_data() is also called from menu_set_active_trail(). This one processes all menu items. I'm not really sure if checking hidden menu items here is relevant or not, though.
If was correct that there is a performance issue here, could you please consider backporting any fix to D6 as well?
Comment | File | Size | Author |
---|---|---|---|
#59 | menu-tree-page-data-D6-4.patch | 5.04 KB | c960657 |
#54 | menu-tree-page-data-D6-3.patch | 9.04 KB | c960657 |
#50 | menu-tree-page-data-D6-2.patch | 11.7 KB | c960657 |
#43 | menu-tree-page-data-4.patch | 7.96 KB | c960657 |
#34 | menu-tree-page-data-3.patch | 5.19 KB | c960657 |
Comments
Comment #1
kbahey CreditAttribution: kbahey commented@markus_petrux
If you have even a proof of concept patch for this, I will be happy to test D7 and D6 with and without the patch, in order to see if it has a measurable effect.
Comment #2
markus_petrux CreditAttribution: markus_petrux commentedI'm not exactly sure of how the new menu system works, so I would prefer if the patch could come from someone who is able to evaluate the possible impact of touching this.
Comment #3
markus_petrux CreditAttribution: markus_petrux commentedI'm attaching a patch that could be used to see what I meant.
Here's how to test the performance of the patch:
1) Apply the attached patch to a D6 installation.
2) Try the following code with different users:
Results:
Comment #4
markus_petrux CreditAttribution: markus_petrux commentedThe previous patch is mostly coded for testing performance as described.
Here's a patch that would apply the fix to menu.inc, and book.module, which is the one that could benefit from the optimization of menu_tree_all_data(). It's suited to test the functionality, and if it works ok, then applied to D6 branch.
I might have missed something that invalidates the optimization, but if not, there's a significant gain here because the relevant code is executed for every page on the site. ie. user access is checked for all menu items for each and every page view, which is not really necessary, and the performance impact may depend on which access callbacks are involved.
Comment #5
markus_petrux CreditAttribution: markus_petrux commentedOops. Minor fix on the patch in #4. Sorry. :-/
Comment #6
catchWhile the patches might be against Drupal 6, this really needs review and testing against 7.
Comment #7
markus_petrux CreditAttribution: markus_petrux commentedHere's a patch against HEAD.
Comment #8
markus_petrux CreditAttribution: markus_petrux commentedhmm... testing I've found that it is only necessary to change menu_tree_all_data() and menu_tree_page_data(), and certain calls to them.
The reasoning of the optimization is simple. Do not perform access control for items that will not be used. Checking for access control may take a lot of time depending on the number of modules installed, the number of menu items involved. It may range from a few milliseconds to even more than one second per page (access callbacks may involve additional DB queries), for all pages on the site.
Updated patches for both, D6 and D7.
Comment #9
markus_petrux CreditAttribution: markus_petrux commentedComment #10
kbahey CreditAttribution: kbahey commentedI benchmarked the D7 version with 5 concurrent users, with a 2 minute test duration
Without patch
With patch
So the patch is slower as it stands ... :-(
Comment #11
kbahey CreditAttribution: kbahey commentedI ran the test again, with the same parameters (5 users, 2 minute stress test), but hitting only the front page, instead of a combination of the front page (which has lots of nodes on it) and individual node pages.
Before the patch:
After the patch:
There is a very slight improvement with the patch.
Can others please benchmark this with different scenarios?
Comment #12
catchBenchmarked with ab, front page with 30 nodes, then an individual node with 90 comments. Results are pretty close to kbahey's:
node
HEAD:
6.54 reqs/sec
Patch:
6.79 reqs/sec
node/n
HEAD:
13.39 reqs/sec
Patch:
12.86 reqs/sec
Comment #13
markus_petrux CreditAttribution: markus_petrux commentedI think it's not only the time spent on performing access checks, but also memory usage. Access checks may load more files, and some may require additional database queries.
Maybe a plain Drupal installation is not enough to note the difference. As the number of modules installed grows, there may also be more menu items that will be checked, or not with the patch, for user access. Also, the difference may be more significant when a content access module is used.
If you add a print inside function _menu_check_access() you'll see how many times this function is invoked per page.
Comment #14
markus_petrux CreditAttribution: markus_petrux commentedhmm... it seems to me that the latest patch I posted is not completely accurate. My concern is that access check is performed for all menu items, when it shouldn't be necessary. I'll try to investigate this a little more.
Comment #15
kbahey CreditAttribution: kbahey commentedI used the performance module (part of devel), and this is what I got.
Without the patch 6.75 MB avg
With the patch 6.50 MB avg
So, slight savings on a front page of a relatively simple site.
Comment #16
markus_petrux CreditAttribution: markus_petrux commentedWell, unless I'm missing something, the patches in #8 are correct to me. It just confused me the fact that I was using admin_menu in the site I last checked, and that was giving me a lot of access checks. So setting this back to 'needs review'.
kbahey/catch: thanks for taking the time to test this. Does it make sense or not to you what I'm seeing in menu_tree_page_data() ? While the menu structure is cached, function menu_tree_check_access() is called for all the items in the tree, but then when functions like menu_tree_output() are involved, for example inside menu_tree() used to render menu blocks, all 'hidden' items are ommited, so there seems to be no need to call menu_tree_check_access() for them. Am I wrong in this?
Comment #17
markus_petrux CreditAttribution: markus_petrux commentedAnother approach, maybe, would be caching the result of menu_tree_check_access() per user, or per combination of roles.
If it was cached per user, the benefit would just affect anonymous users. For registered users the benefit would only be noticed when the same user visits the same menu path.
If it was cached per combination or roles, the benefit of a single cached block would be reused more times. But it would invalidate access control made at user level. Maybe this worths thinking about. It would be real approach if there was a way to determine if there's something that is performing access control at user level. Maybe it is only possible when an access control module is enabled, and that it does this, offer a way to control access at user level? If this was true, then such a module could tell the menu system not to cache the result of menu_tree_check_access().
If the result of menu_tree_check_access() was cached, and is was possible to reuse that information for many different users, then the benefit would be great, in terms of performance, but also reducing memory and cpu usage. Note that a single execution of menu_tree_check_access() may involve hundreds of items, and that's for every page request.
Comment #18
catchDid kcachegrind of the front page.
Incl. is total time spent in menu_tree_page_data() and all the functions it calls. Self is time spent only in menu_tree_page_data()
HEAD:
Incl: 116879
Self: 3332
Patch:
Incl: 28075
Self: 602
and node/n
HEAD:
Incl: 54411
Self: 637
Patch:
Incl: 39620
Self: 475
This doesn't explain the reproducible slowdown on node/n pages though.
Comment #20
catchresetting.
Comment #21
JohnAlbinMake sure you don't remove menu items that have their 'alter' flag set to TRUE.
Its possible to dynamically modify those types of menu items so that they are no longer hidden. See http://api.drupal.org/api/function/hook_translated_menu_link_alter/6
HOOK_translated_menu_link_alter() is called during _menu_link_translate() which is called during _menu_tree_check_access(). Which means that HOOK_translated_menu_link_alter() will be called after your trimming of hidden menu items. :-(
Comment #22
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedNote that none of the performance tests has any kind of error margin. This makes the results worthless for assessing whether this is an improvement.
Physis Lab: FAIL.
Comment #23
markus_petrux CreditAttribution: markus_petrux commentedI think what I noticed here is being addressed in this issue: #519046: Clean up toolbar menu code.
Comment #24
dyke it CreditAttribution: dyke it commentedsubscribing
Comment #25
c960657 CreditAttribution: c960657 commentedThe issue was marked as a duplicate of #519046: Clean up toolbar menu code, though it seems like the scope of that issue later changed. It contains a reference to #509584: APi fixes for menu_tree_data() including depth param that also contained an optimization, but AFAICT this doesn't completely address the point raised here. Access control is still done on a lot more items than necessarty, so I think it is relevant to reopen this issue.
Markus, do you agree?
Comment #26
markus_petrux CreditAttribution: markus_petrux commentedSure, feel free to proceed.
I haven't had the time to follow recent changes with these issues, but what you say makes sense.
Comment #27
c960657 CreditAttribution: c960657 commentedHow about this? Instead of looking up all items on each level on the menu tree, menu_tree_page_data() can now return only the items in the active trail. This prevents access checks on items outside the active trail when invoked from menu_set_active_trail() that is used by menu_get_active_breadcrumb() and menu_get_active_title().
In an empty D7 install on a randomly selected page, admin/config/regional/settings, the number of calls to _menu_link_translate() is reduced from 46 to 23.
The D6 backport is straight forward (I have attached it for those interested in benchmarking on an existing site). On a D6 site with a lot of menu items, this patch cuts ~15% of the total page generation time on frontend pages, i.e. pages that do not utilize the "navigation" menu. In particular, quite a lot is saved by not retrieving the items with plid == 0 and hidden <> 0.
The menu system is still somewhat black magic to me, so please give this a thorough review.
Comment #29
c960657 CreditAttribution: c960657 commented#27: menu-tree-page-data-1.patch queued for re-testing.
Comment #30
sunInteresting. Need to think about the non-obvious implications of this patch.
Sidenote: How does this patch relate to #620618: Optimize menu tree building and use it for toolbar ?
Comment #31
c960657 CreditAttribution: c960657 commentedFrom a visual inspection, I think the two patches will complement each other. AFAICT the new menu_build_tree() only kicks in in menu_tree_page_data(), after the $only_active_trail argument has reduced the number of menu items fetched from the database.
Comment #32
c960657 CreditAttribution: c960657 commentedBased on a visual inspection I think the two patches will complement each other. AFAICT the new menu_build_tree() only kicks in in menu_tree_page_data(), after the $only_active_trail argument has reduced the number of menu items fetched from the database.
Comment #33
c960657 CreditAttribution: c960657 commentedReroll.
Comment #34
c960657 CreditAttribution: c960657 commentedReroll (what happened to automatic retesting of patches?).
Comment #35
c960657 CreditAttribution: c960657 commentedWe have been running with the D6 version of this in production on several large sites for two weeks, so far without problems but with a considerable performance boost (in particular on our test server - for some reason).
In the beginning we were a bit confused, because the title of our front page suddenly changed, but it turned out that we had two different links with different titles pointing to the same path. AFAICT the old code makes no guarantee about which trail it returned by menu_set_active_trail() in that case (see the following line in menu_tree_page_data() - there is no ORDER BY), but at least it is stable. This stableness is broken by this patch - it is still stable after the patch, but the trail returned is not the same as before the patch.
Comment #36
catchThis and #327918: Optimize menu_tree_page_data() performance are duplicates of each other, although the patches are pretty different. I haven't compared the results or anything yet.
Comment #37
c960657 CreditAttribution: c960657 commented@catch: This is #327918 :-)
Comment #38
catchI always do that :( #620618: Optimize menu tree building and use it for toolbar
Comment #39
c960657 CreditAttribution: c960657 commentedDid you see my comment in #32? The issues may be duplicate, but I think the patches in the two issues will complement each other.
Comment #40
YesCT CreditAttribution: YesCT commented#34: menu-tree-page-data-3.patch queued for re-testing.
Comment #41
sun#34: menu-tree-page-data-3.patch queued for re-testing.
Comment #43
c960657 CreditAttribution: c960657 commentedThis implements the same optimization as in earlier patches following the refactoring made in #620618: Optimize menu tree building and use it for toolbar.
The 'expanded' and 'only_active_trail' options for menu_build_tree() are mutually exclusive. I don't know if there is a more intuitive way to do this ...
Comment #44
sunDo we have an actual use-case for this in core?
I think that this new option might make things even worse, since the only use-case I can think of would be breadcrumbs. However, all of the current code can re-use already cached menu tree parameters + cached menu trees, even if those may contain more items than actually needed, but that should still be faster than re-building everything from scratch instead of re-using caches.
I may be mistaken though. Can't really evaluate without actual use-case.
Comment #45
pwolanin CreditAttribution: pwolanin commentedsubscribe
Comment #46
c960657 CreditAttribution: c960657 commentedThe problem in D7 in definitely smaller that it is in D6.
In D7, menu_tree_page_data() is only called from menu_set_active_trail() if the current menu path is found in an active menu (by default the active menus are "navigation", "management", "user-menu" and "main-menu"). In D6 it was always called
menu_get_active_trail() can reuse the cached tree if that has already been generated in the current request (the tree is saved to {cache} also, but the expensive call to menu_tree_check_access() is not). This is the case if the menu is displayed somewhere. However, a lot of modules expose their URLs to the Navigation menu (e.g. each Panels page is exposed this way), and this menu may not be enabled on the site.
On a fresh D7 install, if you disable the Navigation menu on admin/structure/block/list/garland and then load filter/tips as the anonymous user, the patch reduces the number of calls to user_access() from 34 to 27.
Comment #47
sunNote that I've merged this patch into #576290: Breadcrumbs don't work for dynamic paths & local tasks
Comment #48
c960657 CreditAttribution: c960657 commentedThis was fixed as part of #907690: Breadcrumbs don't work for dynamic paths & local tasks #2. The D6 patch in #27 still applies.
As mentioned in #35 we have been running with the D6 version in production on several large sites since June without problems. #27 lists some performance metrics.
Comment #49
sunmmm... note that this patch wasn't merged as is. It required quite some adjustments to make it work correctly and not make it lead to major performance slowdown. Not sure whether it is a good idea to attempt to backport that.
Comment #50
c960657 CreditAttribution: c960657 commentedThis adopts some ideas from the D7 patch that was committed. Now the static cache ($tree) in menu_tree_page_data() can be reused if the function was already called with $only_active_trail == FALSE.
If scenarios where this reuse fails, there will be a performance hit, because the active trail is looked up twice. In the event that menu_tree_page_data() is only called with $only_active_trail there is a performance win, because we don't have to do access checks on the whole menu tree but only on the active trail. If you have a large menu tree, I would expect the potential win is greater than the potential hit. But I'm not sure how often you would fall into the former case ...
Comment #51
geerlingguy CreditAttribution: geerlingguy commentedSubscribe - I've traced some performance issues to this function on two of my sites.
Comment #52
kenorb CreditAttribution: kenorb commented+1 for 6.x patch
Comment #54
c960657 CreditAttribution: c960657 commentedReroll.
Comment #56
c960657 CreditAttribution: c960657 commentedThe patch still applies. The testbot fails due to #961172: All D6 Core patches are failing.
Comment #57
c960657 CreditAttribution: c960657 commented#54: menu-tree-page-data-D6-3.patch queued for re-testing.
Comment #59
c960657 CreditAttribution: c960657 commentedComment #60
kim.pepper59: menu-tree-page-data-D6-4.patch queued for re-testing.