Early Bird Registration for DrupalCon Portland 2024 is open! Register by 23:59 PST on 31 March 2024, to get $100 off your ticket.
As suggested earlier (http://drupal.org/node/22035) I want to open an issue to backport the performance fix to 4.6
We have a url_alias table of 8,000 entries / 1M and now even when we turn off path & pathauto modules it is still being fully loaded, slowing the site to a crawl (15-40 sec load)
This fix is desperately needed, I am sure not only by us. 4.7 is way out there (especially with contrib modules) and we are already dependent on these paths for search traffic.
Please help !!!
Comment | File | Size | Author |
---|---|---|---|
#42 | path_alias_backport_0.patch | 4.77 KB | chx |
#31 | i18n.module.path_alias_backport_0.patch | 934 bytes | klance |
#30 | i18n.module.path_alias_backport.patch | 918 bytes | klance |
#23 | path_alias_backport2_files.zip | 38.14 KB | Marandb |
#17 | path_alias_backport2.patch | 5.05 KB | matt westgate |
Comments
Comment #1
Adrian Freed CreditAttribution: Adrian Freed commentedI second the motion. This is a showstopper for many drupal projects for me including a million+ node site.
Comment #2
dkruglyak CreditAttribution: dkruglyak commentedRe-filing under Drupal base system. Also note a discussion on some benchmarks in our case: http://drupal.org/node/43431
Comment #3
dkruglyak CreditAttribution: dkruglyak commentedFound OLD patch, have questions. It is introduced here http://drupal.org/node/22035#comment-29593, with disclaimer that it is broken.
Could you help me understand what version that patch applies to? HEAD/4.7? 4.6.x? How different is it from 4.6.5? I can try hacking but any guidance is much appreciated. Maybe you could point me to which functions to work on and where to start modifying 4.6.5...
Comment #4
Jeremy CreditAttribution: Jeremy commentedHere's a backport from CVS HEAD that applies to 4.6.5. It works well in my own testing, but I certainly suggest that you test it first before using it in a production environment.
Feedback of how it helps on busy 4.6 sites would be welcome. It'd also be interesting to see some benchmark results.
Comment #5
dkruglyak CreditAttribution: dkruglyak commentedI just finished my own try to do the backport and found a bug in the original code, which also got carried into the patch.
Many of the aliases on my system do not get looked up. When rendering links, a call to drupal_get_normal_path before drupal_get_path_alias corrupts the map. I fixed the drupal_lookup_path function and rewrote it for greater efficiency by getting rid of array_search with two maps. Frontpage load improves 20s->0.7s.
Please review for commit to both 4.6.5 patch & HEAD:
Comment #6
dkruglyak CreditAttribution: dkruglyak commentedIf you are applying this patch, do not forget to build index on src column in url_alias table.
Comment #7
chx CreditAttribution: chx commenteddkruglyak , you need to read http;//drupal.org/patch
Comment #8
dkruglyak CreditAttribution: dkruglyak commentedThanks for the link. I am still not sure how to properly submit a "patch-of-a-patch". Could you provide any pointers on that?
In any case, the code is available for review / testing, it is just one function changed.
Comment #9
chx CreditAttribution: chx commentedit's easy: apply the patch, change what you think needs changed and provide a new patch.
Comment #10
Jeremy CreditAttribution: Jeremy commentedDkruglyak, your changes didn't work on my test server but I updated my patch to use two maps following your claim that this offers a dramatic improvement in performance.
Is anyone willing to benchmark Drupal 4.6.5 versus Drupal 4.6.5 + my first patch versus Drupal 4.6.5 + the attached patch?
BTW: Both patches update update.inc, so the easiest way to update your database is to run update_132 from update.php. (How many times can you use update in a sentence?)
Comment #11
dkruglyak CreditAttribution: dkruglyak commentedSorry, I think my code broke "as is" since I renamed $map only in one function, but it is used in other places as well. To fix another bug you should remove these lines from bootstrap.inc :
The problem is that the first call to drupal_get_normal_path for "node/xxxx" populates maps with "node/xxxx":"node/xxxx" pairs. Subsequent call to drupal_get_path_alias instead of fetching the alias from DB gets alias equal to normal path from the map.
This sequence of calls is not invoked for every link, but is obviously a bug. Should be fixed both in 4.6 and HEAD.
Comment #12
Jeremy CreditAttribution: Jeremy commentedCan you detail how to trigger the bug you're seeing?
It looks to me like the code is doing the right thing. It first does a db_query to see if there's an alias for the current path. Only if it doesn't find an alias in the url_alias table does it trigger the logic you mention above which prevents it from doing additional db_queries on the same path.
Comment #13
dkruglyak CreditAttribution: dkruglyak commentedThe code is not doing the right thing. When drupal_get_normal_path is called for a normal alias the maps get corrupted with normal/normal pair, even when alias exists.
I originally discovered problem with links in "Recent blog posts" block. The problem does not show inside regular blog posts, maybe because of different call sequence.
Comment #14
matt westgate CreditAttribution: matt westgate commentedThis patch takes the current working path aliasing for Drupal 4.7 and backports it to 4.6. You'll need to run update.php to add an index to the url_alias table.
This patch is a lifesaver for sites with thousands of aliases. The main difference is that aliases are loaded on demand rather than trying to cache the entire path table into memory for each request (i.e., many small, fast queries vs one huge, slow query).
Comment #15
Adrian Freed CreditAttribution: Adrian Freed commentedMatt, I didn't have much luck with that patch. I put the changes in carefully by hand and noticed that the mysql update
ALTER TABLE url_alias ADD INDEX ('src') worked by hand without the quotes around src.
Also my path.module has a call to drupal_rebuild_path_map() which was removed by your patch.
Comment #16
moshe weitzman CreditAttribution: moshe weitzman commentedI added a note to the pathauto project page which links here. I fear that lots of sites have grown slow because of this bug.
Comment #17
matt westgate CreditAttribution: matt westgate commentedNew patch attached which should fix the above issues.
Comment #18
Bill St. Clair CreditAttribution: Bill St. Clair commentedMatt,
Just installed your path_alias_backport2.patch in my Drupal 4.6.5 installation. I did a test import of over 2,600 nodes from the HTML files of my old site. It got very slow, on my 2 GHz Pentium M laptop. The patch sped it right back up.
Thank you, thank you, thank you.
Comment #19
dopry CreditAttribution: dopry commentedTested with 4.6.5, upgrade goes fine. no apparent bugs.
Looking at the code, moving to selecting and caching the specific record needed seems like a vast improvement over selecting them all at one. Seems like it will save processing time and memory. I don't have a lot of aliases to benchmark against, so can't provide hard numbers.
Setting to ready to commit since its already accepted in 4.7 and applies cleanly to 4.6.5.
Comment #20
Adrian Freed CreditAttribution: Adrian Freed commentedI have had good luck with the patch.
It does result in lots of fast querie:there are few hosting services out there crazy enough to meter
mysql requests so a small minority of users may be inconvenienced.
Comment #21
Dries CreditAttribution: Dries commentedDrupal 4.6 is frozen.
Comment #22
dgrant CreditAttribution: dgrant commentedWhat happens when 4.7 is finally released and I want to upgrade? Do I lose the new "function update_132()" that you replaced? Actually, I just checked Drupal 4.7 updates.inc file and update_132 happens to be a postgresql-only update, so it won't affect me if this update doesn't get applied since I don't use postgresql.
Comment #23
Marandb CreditAttribution: Marandb commentedI applied the patch & it made quite a bit of difference on my site. (4.6.5)
Just a note: the patch would NOT run for me using stock 4.6.5 files. (Failed, Failed, Failed) I had to patch each file manually. I hope it is ok that I attached a zip with all the patched files (4.6.5 Stock). Patching is a PAIN in the rear for those of us who are not Unix smart / web developers.
Anyway that said, here are a few quick before & after stats.
HomePage - 189 queries
Before: 269.98ms
After: 60.26ms
Forum Topic - 401 queries
Before: 574.00ms
After: 70.41ms
Page - 160 queries
Before: 424.58ms
After: 52.45ms
-- Marand B.
Comment #24
yongrokc CreditAttribution: yongrokc commentedIt's first time I'm trying to update the database of my first installation of drupal, 4.6.5.
But, I can't seem to find update.php... Where is it and how do I go about using it?
Thanks,
Andy
Comment #25
njehlen CreditAttribution: njehlen commentedI tried using this patch (specifically, I used the files in the attachment that Marandb included) when I installed pathauto, but i get this error:
Fatal error: Call to undefined function: filter_xss_bad_protocol() in /home/action/public_html/ace/includes/common.inc on line 630
Any ideas what I might be doing wrong?
Comment #26
chx CreditAttribution: chx commentedYou are running an outdated 4.6.x , update to latest 4.6.x.
Comment #27
insomoz CreditAttribution: insomoz commentedThis works, I have just tested, I can now finally edit, delete, add content, without having to worry about my running like a snail.
Thanks to the person who released this script.
Load that would hit 30+ and cliimb is now a steady 0.01
Comment #28
venkat-rk CreditAttribution: venkat-rk commentedInsomoz, did you run the patch attached to comment #17 or comment #23?
Comment #29
asb CreditAttribution: asb commentedI applied the patch submitted by Jeremy on January 10, 2006 - 03:08 and ran update.php after this; everything seems to be working fine.
The patch has solved a major performace issue for me; page loading times have decreased from approx. 16-20 sec. to 1-2 sec. or less; accessing the admin menu took 20-30 secs before and takes now just 2 secs. Editing the menu system was impossible because of timeouts (90 sec. in php.ini) and now takes approx. 30 sec. and - more importantly - makes editing the menus possible again.
Thank you VERY MUCH for this patch!
Regards -asb
Comment #30
klance CreditAttribution: klance commentedThanks to matt westgate for this patch. My site is running much faster, though the menu system is still very slow. Just wanted to point out that when I applied the patch, i18n broke, so I made similar alterations to i18n_get_normal_path($path) so that it wouldn't complain anymore. Seems to have done the trick.
Comment #31
klance CreditAttribution: klance commentedI was a little hasty on that verdict, and there are a few mistakes in the patch I submitted. When accessing language-specific administration pages, the previous patch leads to a Page Not Found error for URLs preceded by a locale. This patch fixes that problem, and I have tested it thoroughly.
Comment #32
jasonwhat CreditAttribution: jasonwhat commentedClearer instructions? There are a bunch of different patches going on here and it might be helpful to have some specific instructions of what files need patched. The path.module file does not get patched right?
Rather than applying the patch, I used the already patched files and replaced my old ones. I'm on a CivicSpace install. However when I went to update.php there was "no update". (note: my install had no updates.inc file) Is there something I can do to change this? Or does someone have the SQL statements so I could just apply them directly through php admin? Will using this patch cause complications when upgrading to 4.7?
Comment #33
RobRoy CreditAttribution: RobRoy commentedI think that last patch from matt westgate (comment #17) is the money one. I just patched my Drupal 4.6.5 and it works fine. If you don't have updates.inc, all you need to do is add an INDEX on the src field.
Comment #34
kc CreditAttribution: kc commentedI used the files from #17 but I get this as I try to acess my admin/categories with 7000 terms
Maximum execution time of 30 seconds exceeded in /home/xyz/public_html/includes/bootstrap.inc on line 389
same when I try to access my sitemap with all 7000 categories w/aliases
Comment #35
Stefan Nagtegaal CreditAttribution: Stefan Nagtegaal commentedIs this performance fix currently in HEAD or not? if not, I'll make a patch for it..
Sorry for asking, but I do not have time (this moment) to check if it's present myself...
Comment #36
pamphile CreditAttribution: pamphile commentedI know this question does not belong here, but inquiring minds want to know...
How do you apply a Drupal patch ? Where can I read more on this ?
see:
http://drupal.org/diffandpatch
http://drupal.org/node/19036
Comment #37
nathandigriz CreditAttribution: nathandigriz commentedDoes this patch still work on version 4.6.6? upgrade to 4.6.6 after applying the patch?
Comment #38
mediafrenzy CreditAttribution: mediafrenzy commentedI'd also like to know which patch works for 4.6.6? I'm a bit confused by the number of different patches being discussed..
Thanks very much.
Comment #39
dkruglyak CreditAttribution: dkruglyak commentedIt appears that there are two critical patches making significant changes to path / alias processing. They should be consolidated for 4.6.5 / 4.6.6, since they touch the same files and I am not sure how to do that.
I created a new issue here: http://drupal.org/node/56972
Comment #40
tng@neuralgourmet.com CreditAttribution: tng@neuralgourmet.com commentedOK. I have a problem. I applied Matt's patch from post #17 to a Drupal 4.6.5 system without any errors. This took care of the slow page loads beautifully. Unfortunately, after I applied the patch this error would be thrown every time anyone posted anything:
Fatal error: Unknown function: drupal_rebuild_path_map() in /home/xxxxxx/public_html/modules/path.module on line 148
I reversed the patch and everything is back to the way it was before -- including slow page loads.
What would be my best course of action? Can I just take path.module from 4.6.6 and copy it into modules/ ?
Thanks for any suggestions in advance.
Comment #41
eurokate CreditAttribution: eurokate commentedI applied the path alias patch and also the i18n patch above. My site sped up noticeably, but the i18n module was still broken. When trying to access any nodes with a language prefix I got a page not found error. Is the i18n patch submitted by klance correct? I couldn't make sense of it.
Comment #42
chx CreditAttribution: chx commentedThis patch is badly needed though the patch was not something that likely could get into 4.6 but now it is. Why? Because this version does not change any API because that's not acceptable. This is now only a performance fix and nothing else. I have not removed drupal_get_path_map (though now it's unused by core) nor I changed conf_url_rewrite. This is the way a backport should be done for Drupal.
I have tested with and without conf_url_rewrite.
Patch is against DRUPAL-4-6.
Comment #43
bradlis7 CreditAttribution: bradlis7 commentedPath seems to be working fine for me.
+1
Comment #44
bradlis7 CreditAttribution: bradlis7 commentedwith the current patch applied I mean
Comment #45
xeeya CreditAttribution: xeeya commentedSadly I`m getting the following error when I applied the above latest patch for 4.6. I`ve also run update.php after the patch.
:(
Comment #46
xeeya CreditAttribution: xeeya commentedSorry, of course I meant 4.6.6 . The drupal installation runs on a local machine.
Comment #47
chx CreditAttribution: chx commentedxeeya, your report has nothing to do with this patch, your webserver can't read bootstrap.inc , you reuploaded wrong or whatever. Please do not litter this issue further.
Comment #48
beginner CreditAttribution: beginner commentedThe patch still applies cleanly.
in #21 dries said he wouldn't fix. Chx: have you talked to him?
Are patches still applied to the 4.6 branch?
http://drupal.org/project/issues?projects=3060&versions=11555,11262,1023...
With this one, there are 3 RTBC patches that have been sitting for weeks.
Comment #49
chx CreditAttribution: chx commentedYes, I talked with him and my patch is not the same as it was before -- mine has no API changes.
Comment #50
magico CreditAttribution: magico commentedWhen will this patch be applied to 4.6?
Comment #51
Dries CreditAttribution: Dries commentedWe're not going to backport this to 4.6. It's not a bugfix.