Status: Closed (fixed)
Project: Pathauto
Version: 5.x-2.x-dev
Component: Code
Priority: Normal
Category: Feature request
Assigned:
Reporter:
Created: 7 Jun 2006 at 01:39 UTC
Updated: 1 Jun 2007 at 04:31 UTC
When selecting 'Bulk update node paths' under 'Node path settings', I get the following error. Of course, this is for a very large site. Maybe it would be useful to have it use the cron, limit its scope somehow, or do something more graceful than this. I assume that if I run this several times, it will eventually get all the nodes; I've run it three times so far, and it's still doing this.
Fatal error: Maximum execution time of 30 seconds exceeded in /home/vhosts/staging.tpmcafe.com/public_html/modules/pathauto/pathauto.module on line 376
Obviously, the timeout happens at whatever line execution has reached at the time, not necessarily line 376.
| Comment | File | Size | Author |
|---|---|---|---|
| #22 | bulkupdate.patch | 2.45 KB | gerhard killesreiter |
Comments
Comment #1
lambert-1 commented:
This sounds like a PHP issue to me. Check your php.ini file for max_execution_time:
http://www.php.net/manual/en/ini.php#ini.list
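For reference, the limit can be raised either globally in php.ini or per-script; a minimal sketch (the value 300 is illustrative, not a recommendation from this thread):

```ini
; php.ini — raise the maximum execution time (seconds); the default is 30
max_execution_time = 300
```

The same effect can be had per-script with PHP's `set_time_limit(300)`, which resets the timer when called.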
Comment #2
aaron commented:
I'm not sure that's it. The search cron does exactly this to prevent timeouts (you can configure it at admin/settings/search). On a large site with tens of thousands of posts (like the one where I just installed pathauto), you will get timeouts simply because the update takes a very long time.
Alternatively, I suppose I could set max_execution_time to 10 hours just to make sure... ;)
Comment #3
greggles commented:
Yeah, so I'd say this is a feature request to have pathauto model its bulk alias generation on the search index creation.
Personally, this seems pretty low priority because it doesn't seem like you'd need it very often. Aaron, if you can provide a patch it's much more likely to get fixed; otherwise, it will probably wait until someone else really needs it.
Aaron, how many nodes are in the site you are describing? And is it on a dedicated or shared server? What specs?
Comment #4
aaron commented:
Yeah, I'm going to work on it, because we have at least one large site that needs it, and probably more will follow as they see other sites using it. I'll post something when I get around to it. It's also a lower priority for us right now, but probably higher than for most other folks.
Comment #5
greggles commented:
Just a note that I marked http://drupal.org/node/26550 as a duplicate of this issue.
Comment #6
greggles commented:
@aaron - any progress?
Also, note that there is another issue that suggests a change to the queries to get better performance:
http://drupal.org/node/76172
Thanks.
Comment #7
acidcortex commented:
Everything is explained at http://drupal.org/node/76172.
See it, and patch it if you like.
I have a lot of other modifications for Drupal, but I can't post them before December; I need to work on some other stuff.
steph (acidcorteX)
Comment #8
greggles commented:
@acidcortex - thanks for the link, but I see these as separate issues.
At some point it's possible to create a site so big that it will break pathauto. Improving query performance doesn't change the fact that the bulk update feature doesn't currently scale well to large sites.
So, the workarounds are things like raising max_execution_time, running the bulk update repeatedly until it finishes, and improving the performance of the queries. Those are all workarounds to the real problem: making pathauto work in configurable-sized chunks (so an admin can enter 1000 or 2000 or 10,000), so that the bulk update is guaranteed to complete after several executions whether the site is on amazing hardware or a shared server.
@aaron - if I've mis-stated your intention then feel free to restate what you see as the goal. This seems like a relatively difficult task because you have to have:
1. a guess at the number of updates that are going to succeed
2. some means of keeping track of which aliases have been created and which still need to be done
It's the second item that's the tougher one, in my mind, but perhaps someone has a brilliantly simple idea on that point.
Comment #9
aaron commented:
> 1. a guess at the number of updates that are going to succeed

My suggestion would be to make this a user-configurable number, à la search. Some sites will be able to handle larger batches than others, based on server memory, bandwidth usage, etc. Maybe offer 100/500/1000/2500/5000/10000 or something.

> 2. some means of keeping track of which aliases have been created and which need to be done

Use a variable to store the last node->nid processed; on the next cron run, process the next nn nodes, storing the last node->nid for the run after that. Loop back to the start when we reach the end.
here's a brainstorm to get things rolling:
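(The brainstorm code itself did not survive in this copy of the thread. A rough sketch of the approach as described above, written as a hypothetical Drupal 5 `hook_cron()` implementation; the variable names and the default chunk size are invented for illustration, and the actual alias creation is elided because this thread does not pin down pathauto's internal API:)

```php
<?php
/**
 * Hypothetical sketch only: chunked bulk aliasing on cron, resuming from a
 * stored cursor, as described in comments #9 and #10. Not the actual patch.
 */
function pathauto_cron() {
  // User-configurable chunk size, à la the search module.
  $limit = variable_get('pathauto_bulk_limit', 100);
  // Cursor: last nid processed on the previous cron run.
  $last = variable_get('pathauto_bulk_last_nid', 0);

  $result = db_query_range('SELECT nid FROM {node} WHERE nid > %d ORDER BY nid', $last, 0, $limit);
  $count = 0;
  while ($row = db_fetch_object($result)) {
    // Third argument TRUE resets node_load()'s static cache so a long run
    // does not exhaust PHP memory (see comment #19).
    $node = node_load($row->nid, NULL, TRUE);
    // ... create or update the alias for $node via pathauto here ...
    variable_set('pathauto_bulk_last_nid', $node->nid);
    $count++;
  }
  if ($count < $limit) {
    // Reached the end of the node table: loop back to the start
    // (comment #10 suggests setting a "done" flag here instead).
    variable_set('pathauto_bulk_last_nid', 0);
  }
}
```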
Comment #10
aaron commented:
Obviously the prior solution needs work: as written, it will just loop forever, reprocessing nn nodes every hour or so. We'll want to add a flag to tell us we're done (in the loop-back branch), and when a new bulk update is submitted, we'll unset the flag and start the count back at zero. I think the only problem we'll have is if the server times out, in which case it will just keep starting over. For that case, we'll also want a pre-processing watchdog alert, so the admin can check whether bulk updates are starting but not completing (maybe the alert can even include a built-in suggestion to set the limit lower if it's not completing).
Comment #11
greggles commented:
Yes - great ideas. Keep them coming!
One comment: the ID that we store will have to work for users, nodes, taxonomy, etc. Are there any other kinds of objects, now or in the foreseeable future, that it should be aware of?
Comment #12
chrisschaub commented:
Hi. Is there any status on this issue? It's pretty important, and I'd like to help if needed.
Comment #13
greggles commented:
I'm not working on it. Much of this work really needs to happen in core, so Drupal 6 would be the timeframe to do that.
Comment #14
chrisschaub commented:
Is pathauto going to be moved into core?
Comment #15
greggles commented:
schaub123 - if you want to discuss random parts of pathauto, please don't do it in the issue queue. The proper place for that is the paths discussion group: http://groups.drupal.org/paths
Comment #16
chrisschaub commented:
Ok, fair enough, sorry about that. But maybe this should be marked as "won't fix", since I thought it was an open issue; otherwise I would have looked elsewhere. Thanks.
Comment #17
greggles commented:
schaub123 - the issue is open regardless of whether or not I (or anyone else) is currently working on it.
I would only mark it "won't fix" if it were something that should never be fixed.
Comment #18
moshe weitzman commented:
I recently did a bulk update on a client site and it crashed MySQL, because of too much memory use or too many queries or something. So, I have two suggestions:
- $GLOBALS['conf']['dev_query'] = FALSE will temporarily disable devel.module's query logging, which consumes lots of memory for no benefit here.
- use the batch operations patch when it lands (very soon): http://drupal.org/node/127539
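The first suggestion can be wrapped around the bulk update itself; a minimal sketch, assuming the update is triggered from custom code (the save-and-restore dance is my addition, not from the thread):

```php
<?php
// Temporarily disable devel.module's query log during a memory-heavy bulk
// run, then restore whatever setting was in effect before.
$old = isset($GLOBALS['conf']['dev_query']) ? $GLOBALS['conf']['dev_query'] : NULL;
$GLOBALS['conf']['dev_query'] = FALSE;

// ... run the pathauto bulk update here ...

$GLOBALS['conf']['dev_query'] = $old;
```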
Comment #19
greggles commented:
I like the idea of optionally disabling the dev query log, mostly because it's easy to implement.
There's still a lot of hard work to be done to get this working in configurable chunks, even after the batch operations patch lands (and I'm not really interested in working on that, personally).
@moshe - if you can try to reproduce the problem to see exactly what fell over and where, that would be interesting. There was a change to the node_load call so that it resets the cache each time; if you were doing bulk operations on nodes with a pathauto from before that change went in, you could quite easily exhaust PHP memory.
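For context, the node_load change referred to here looks like this in Drupal 5, where the third argument resets the function's static cache:

```php
<?php
// Without the reset, every node loaded during a bulk run stays in
// node_load()'s static cache, so a long run exhausts PHP memory.
$node = node_load($nid, NULL, TRUE); // TRUE = reset the static cache
```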
Comment #20
moshe weitzman commented:
Sorry, I moved on and can't put more time into identifying the problem. I was using 1.44.2.5 2007/01/20 23:26:2. The update in question was an aliasing of terms; most terms were in a freetagging vocabulary.
Comment #21
litwol commented:
Subscribing.
Comment #22
gerhard killesreiter commented:
Here's a patch that splits the node update into batches of 100.
Comment #23
greggles commented:
This is now fixed - thanks, Killes, for the brainstorming. I didn't use the progress meter; the update just runs in chunks whose size the user can specify. The default is 50. I'd love feedback on whether it should be 50 or 500 or 5000.
If someone wants to wire up the progress meter, that would of course be awesome, but at least this is now generally "fixed".
I also documented this (and some other features) in the handbook at http://drupal.org/node/144904. I'd love a review or two there as well.