Similar to http://drupal.org/node/257281

It turns out this becomes really important with 100k+ nodes and older versions of MySQL (ex: 5.0.51a).

Please credit to:

--author="gcassie <gcassie@80260.no-reply.drupal.org>"

Patch against 7.x-1.x HEAD coming shortly.

Files: 
CommentFileSizeAuthor
#1 pathauto-castconcats-1415930-1.patch2.29 KBRaines37
PASSED: [[SimpleTest]]: [MySQL] 269 pass(es).
[ View ]

Comments

Status:Active» Needs review
StatusFileSize
new2.29 KB
PASSED: [[SimpleTest]]: [MySQL] 269 pass(es).
[ View ]

I tested #1 and changed my life. bulk updating stopped working since I loaded many thousands nodes as the first query to find "unaliased" nodes tooks more than 2 minutes (but I never investigated the issue). However, applying this patch the query takes less than a second and the bulk update works again! Thank you!

Status:Needs review» Reviewed & tested by the community

Status:Reviewed & tested by the community» Needs review

We would have to confirm #1 works on pgsql and sqlite as well. This issue cannot be RTBC until that happens.

@Dave AFAIK the same kind of change has already landed the code in alternative issues (http://drupal.org/node/257281) so I don't see why we should block this issue waiting for some sqlite/pgsql people to appear here (and if they don't have this issue I don't see how we can expect them to appear here and be willing to test this patch).

This is a BLOCKING issue for people using mysql 5.0 as it break bulk updating...

Status:Needs review» Reviewed & tested by the community

@Dave can you give me more informations about the workflow you ask? Why do you think the "CAST" will break anything? The same cast is used in other pathauto queries. Why should we wait for less used DB users to take care of testing a patch they don't need? I put this again to "RTBC" so maybe you see my questions. (I didn't find any policy about testing postgre or sqlite before RTBC, but I may have missed it.)

Simply make a special IF case for those few not using mysql and be done with it, it's important to speed up this bulk process, thank you.

Thanks a lot, this patch saved my day !
I was not aware about that MySQL performance issue, the difference is just amazing ...
(now re-gerating path for a 250000 node site)

Postgres user here.

Sort of have a related issue, in that not having CAST means that bulk generation doesn't work and returns an error:

Ambiguous function: 7 ERROR: function concat(text, integer) is not unique LINE 6: LEFT OUTER JOIN url_alias ua ON CONCAT('taxonomy/term

The above patch removes the error, but the URL alias isn't actually generated.

Instead of just 'CHAR', using 'CHAR(10)' seems to get the alias created nicely.

So from:
$query->leftJoin('url_alias', 'ua', "CONCAT('taxonomy/term/', CAST(td.tid AS CHAR)) = ua.source");

to:
$query->leftJoin('url_alias', 'ua', "CONCAT('taxonomy/term/', CAST(td.tid AS CHAR(10))) = ua.source");

Not sure if that'll affect mysql?

Tested it by deleting a term alias, and then regenerating using the bulk option - haven't tried it with 1000s of nodes.

I can confirm the patch in #1 works like magic.

Also, it's worth noting that optimizing the relevant tables (taxonomy_term_data, taxonomy_index, etc.) can greatly speed up everything, especially if you imported huge volumes.

Another note : you don't have to generate huge amounts of aliases to hit the performance problem. The bulk generation stalled on a dev site with 29 aliases to create. I'm talking about MySQL here.

Thanks to all contributors in this thread, especially to Raines37 for the patch.

Status:Reviewed & tested by the community» Needs work

Comment in #9 indicates this needs work for PostgreSQL, which is why I asked in #4 for it to be tested in all three database system supported by Drupal core to ensure compatibility.

This query without CAST is a problem in mysql version minor of 5.5.

Hi Dave,

Could not the table structure be changed to reflect fields a bit more? Such as adding columns for entity_bundle, entity_id, entity_revision, etc?
It takes extra space, but then lookups could be done much faster and in a more stable format.

- Jeremy