I just updated from Pathauto 1.2 to 2.0-beta2 and tried a bulk generation of node aliases, which used to time out and never finish. With just the default 50 specified, it completed quickly, BUT it says 0 aliases generated. However, looking at the URL aliases at /admin/build/path, I see it created a dozen or more (which might be the only ones it needed to create). So the problem is likely just an error in the displayed results message.

Comments

greggles

Status: Active » Postponed (maintainer needs more info)

This is probably pretty simple, as you point out, but I just can't get the same behavior to happen for me. I tried deleting all aliases and then building some more, but it reported "23 aliases generated".

Any more thoughts on whether it's actually doing anything?

I did think of a scenario where this new "50 at a time" style of update could fail.

Currently Pathauto just selects nodes in the order that they exist in the database. If

1) You have patterns for one node type, but not for another and no default pattern
2) You have more than 50 of the type that is not set with a pattern
3) You have several unaliased nodes of the type that should get aliased
4) The data is organized in the database in a manner that selects the 50 nodes that are not set to be aliased BEFORE it selects the nodes that should get aliased

Result: it will select the 50 nodes that don't have a pattern, churn through them all checking this fact, and then finish without actually doing anything.

To test for this situation you can do:

SELECT COUNT(1) FROM url_alias;

Run that query before the bulk update and after the bulk update. If the number is the same both times then you probably have this problem.

The solution is just to increase the number of nodes to attempt to alias in a given run.
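
As a rough illustration of the scenario above, here is a minimal Python sketch using an in-memory SQLite database. The two tables are simplified stand-ins for Drupal's node and url_alias tables, and bulk_update() and its patterns argument are hypothetical names invented for the example, not Pathauto's actual code. It runs the count query before and after a run of 50, then retries with a larger limit.

```python
import sqlite3


def build_db(n_quotes=60, n_articles=5):
    """Simplified stand-in schema: quote nodes are stored before article nodes."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, type TEXT)")
    db.execute("CREATE TABLE url_alias (src TEXT, dst TEXT)")
    rows = [("quote",)] * n_quotes + [("article",)] * n_articles
    db.executemany("INSERT INTO node (type) VALUES (?)", rows)
    return db


def bulk_update(db, patterns, limit=50):
    """Take up to `limit` unaliased nodes in database order; alias those with a pattern."""
    batch = db.execute(
        "SELECT n.nid, n.type FROM node n "
        "LEFT JOIN url_alias a ON a.src = 'node/' || n.nid "
        "WHERE a.src IS NULL ORDER BY n.nid LIMIT ?",
        (limit,),
    ).fetchall()
    created = 0
    for nid, node_type in batch:
        if node_type not in patterns:
            continue  # no pattern for this type: checked, then skipped
        db.execute(
            "INSERT INTO url_alias (src, dst) VALUES (?, ?)",
            (f"node/{nid}", f"{node_type}/{nid}"),
        )
        created += 1
    return created


def alias_count(db):
    return db.execute("SELECT COUNT(1) FROM url_alias").fetchone()[0]


db = build_db()
patterns = {"article"}  # assumption: only articles have a pattern, no default pattern

before = alias_count(db)
created_small = bulk_update(db, patterns, limit=50)   # the batch holds only quotes
after_small = alias_count(db)
created_large = bulk_update(db, patterns, limit=100)  # the batch now reaches the articles

print(before, created_small, after_small, created_large)
```

In this sketch the run of 50 leaves the alias count unchanged because every selected node is a quote, while the run of 100 gets past the quotes and aliases the articles.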

greggles

Status: Postponed (maintainer needs more info) » Active

I found another scenario where this might happen: the tracker bulk update or the user blogs bulk update. But you said you were doing nodes, so I guess that's not your problem.

hawkdrupal

I think you nailed it... but how to fix it?

I have a few "articles" nodes (our custom CCK node type), then 12,000 "quotes" nodes (imported in LARGE bulk from our other site), then more "articles" nodes. Pathauto aliased the first batch of articles, but likely ran into all the quotes and stopped there.

I know it would be a big load to traverse every node in a large database looking for needed aliases, but that's really what is needed. Maybe it could do a query first to subset the table based on the node type, THEN do a batch of 50 at a time?

hawkdrupal

Forgot to mention... the "quotes" nodes don't get or need aliases. They are never accessed via URL, just displayed randomly via a script. (In an ideal scenario, the random quotes module I'm using would have stored them in a separate table, but it didn't...)

Maybe Pathauto could skip over (rather than look at/count) node types that aren't set to get aliases.
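
The idea of skipping types without patterns could be sketched roughly as follows, in Python with an in-memory SQLite database. The tables are simplified stand-ins for Drupal's node and url_alias tables, and bulk_update_filtered() and the patterns set are hypothetical names for illustration only, so this is not how Pathauto itself is written. The node layout mimics the one described above: a few articles, 12,000 quotes, then more articles.

```python
import sqlite3


def bulk_update_filtered(db, patterns, limit=50):
    """Select only unaliased nodes whose type has a pattern, up to `limit` per run."""
    types = sorted(patterns)
    if not types:
        return 0  # nothing is configured to be aliased
    placeholders = ", ".join("?" for _ in types)
    batch = db.execute(
        "SELECT n.nid, n.type FROM node n "
        "LEFT JOIN url_alias a ON a.src = 'node/' || n.nid "
        f"WHERE a.src IS NULL AND n.type IN ({placeholders}) "
        "ORDER BY n.nid LIMIT ?",
        (*types, limit),
    ).fetchall()
    for nid, node_type in batch:
        db.execute(
            "INSERT INTO url_alias (src, dst) VALUES (?, ?)",
            (f"node/{nid}", f"{node_type}/{nid}"),
        )
    return len(batch)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, type TEXT)")
db.execute("CREATE TABLE url_alias (src TEXT, dst TEXT)")
# Assumed layout: 3 articles, 12,000 quotes, then 4 more articles.
rows = [("article",)] * 3 + [("quote",)] * 12000 + [("article",)] * 4
db.executemany("INSERT INTO node (type) VALUES (?)", rows)

patterns = {"article"}  # assumption: quotes have no pattern and never need aliases

first_run = bulk_update_filtered(db, patterns, limit=50)
second_run = bulk_update_filtered(db, patterns, limit=50)
print(first_run, second_run)
```

Because the quotes never enter the result set, a batch of 50 reaches the later articles on the first run, and a second run finds nothing left to alias.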

greggles

Component: User interface » Code
Status: Active » Needs review
Attachment: patch (status: new, file size: 1.76 KB)

Yep - that makes complete sense.

Here's a patch that makes sure we only select nodes that have patterns.

If you can apply it and test it that will be very helpful to me.

I think this should close all the cases where that new "50 at a time" style query will fail, but perhaps not ;)

greggles

Status: Needs review » Fixed

Now applied - please let me know if this doesn't fix your problem.

Anonymous

Status: Fixed » Closed (fixed)