Pathauto supports bulk generation of aliases for nodes that are not aliased.
The number of unaliased nodes that will be updated is controlled by the Pathauto setting:
"Maximum number of objects to alias in a bulk update"
Updating via the admin interface
The bulk update is currently a manual action done by:
- visiting the pathauto settings (/admin/settings/pathauto)
- ticking the checkbox under "Node Path Settings" for "Bulk generate aliases for nodes that are not aliased"
- clicking the "Save Configuration" button
- repeating as many times as required to generate all aliases
If you have a lot of nodes to update, this means a lot of tedious button clicking. You can experiment with the "Maximum number of objects to alias in a bulk update" setting to find a value that is as high as possible without causing the request to time out.
Using the command line to bulk update unaliased nodes
A faster way to update would be from a command line. To do so, I set up a cron-update-pathauto.php script which contains:
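<?php
// Drupal loads its own includes relative to the Drupal root, so change
// into it first (this file sits one level above the webroot directory).
chdir(dirname(__FILE__) . '/webroot');
// This gets Drupal started.
include_once './includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);
// This gets Pathauto started updating aliases.
_pathauto_include();
node_pathauto_bulkupdate();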
The include paths here assume you'll put this cron-update-pathauto.php file above your top-level Drupal directory with appropriate permissions (e.g. `chmod 500 cron-update-pathauto.php`, assuming the file is owned by the user who will be executing it).
Adding the unaliased node update to cron
To enable this to be called from cron, I also set up a modified version of the cron script from http://drupal.org/node/65307 as cron-pathauto.sh:
#!/bin/bash
#############################
# CONFIGURATION OPTIONS
#############################
# Set the complete local path to the directory that holds
# cron-update-pathauto.php (the directory above the Drupal root).
root_path=/var/www/
# Set the complete path to the php parser if
# different from standard
parse=/usr/bin/php
cronjob=cron-update-pathauto.php
##############################
# END OF CONFIGURATION OPTIONS
##############################
# Stop right away if the directory is missing.
cd "$root_path" || exit 1
if [ -e "$cronjob" ]
then
  if "$parse" "$cronjob"
  then
    echo "$cronjob has successfully been parsed."
  else
    echo "$cronjob not parsed."
    exit 1
  fi
else
  echo "$cronjob not found."
  exit 1
fi
exit 0
Finally, I added the actual cron job to run this periodically. Depending on how long it takes to run on your site and how quickly you want to build the aliases, you could run it every 15 minutes, or more or less often.
crontab -e
# minute hour mday month wday command
*/15 * * * * /home/htdocs/cron-pathauto.sh >/dev/null 2>&1
Note that this cron job could be left on a site permanently if you are importing nodes and not creating aliases as you import.
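If a single run can take longer than the cron interval, two runs may overlap. The following is a minimal sketch of one way to skip a run while the previous one is still going, using a lock directory. The names run_locked and LOCK_DIR are made up for this example and are not part of Pathauto or the cron script above.

```shell
#!/bin/bash
# LOCK_DIR is a hypothetical path; use any location the cron user can write to.
LOCK_DIR="${LOCK_DIR:-/tmp/cron-pathauto.lock}"

# Run the given command only if the lock can be taken.
# mkdir is atomic: it fails when the directory already exists.
run_locked() {
  if ! mkdir "$LOCK_DIR" 2>/dev/null; then
    echo "Previous run still in progress, skipping."
    return 75
  fi
  "$@"
  local rc=$?
  # Release the lock and pass on the command's exit status.
  rmdir "$LOCK_DIR"
  return "$rc"
}

# Example, in a wrapper script called from crontab:
#   run_locked /home/htdocs/cron-pathauto.sh
```

Note that if a run is killed before it finishes, the lock directory stays behind and has to be removed by hand before the next run will start.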
Updating via drush
Another way to do this is to install Drush and then run this command:
drush php-eval '_pathauto_include(); node_pathauto_bulkupdate();'
As shown above, this command could be included in a script called periodically via cron.
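A rough sketch of such a script follows. It assumes drush is on the cron user's PATH and that the site lives in /var/www/webroot. DRUSH, DRUPAL_ROOT and pathauto_bulk_update are names invented for this example; --root is the drush option for pointing at a Drupal installation.

```shell
#!/bin/bash
# Hypothetical settings; adjust for your own installation.
DRUSH="${DRUSH:-drush}"
DRUPAL_ROOT="${DRUPAL_ROOT:-/var/www/webroot}"

# Run one Pathauto bulk update through drush and report the outcome.
pathauto_bulk_update() {
  # --root tells drush which Drupal installation to bootstrap.
  if "$DRUSH" --root="$DRUPAL_ROOT" php-eval '_pathauto_include(); node_pathauto_bulkupdate();'; then
    echo "Pathauto bulk update finished."
  else
    echo "Pathauto bulk update failed." >&2
    return 1
  fi
}

# Example, as the last line of a script called from crontab:
#   pathauto_bulk_update
```

Each call still aliases at most the number of nodes set in "Maximum number of objects to alias in a bulk update", so it has to run repeatedly until every node is aliased.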
Performance tips
Pathauto's speed in Drupal 6 is directly tied to the creation of tokens. In Drupal 6, when a module uses tokens, all of the related tokens are calculated first, and only then are the ones actually needed picked out. In Drupal 7, only the tokens being used are calculated.
Disable CCK token generation for fields you don't use
If you have nodes with a large number of CCK fields it can be particularly slow to generate tokens (and therefore slow to calculate the Pathauto aliases).
One helpful tip is to figure out which CCK fields you do not need tokens for and prevent token generation for those fields. In your admin interface, browse to Administer › Content management › Content types and click on "manage fields" for each content type. Click the "Display fields" tab at the top of the page and then click the Token sub-tab. The URL should be something like /admin/content/node-type/food/display/token. Review the fields on this list and click "Exclude" for any fields which are not used for token generation on your site (note that there are modules other than Pathauto which might use these tokens).
Disable modules that create tokens you don't need
You can review which modules are creating tokens, again using drush php-eval, this time to print a list of the modules that implement hook_token_values.
greggles@biff:~/d6$ drush php-eval "print_r(module_implements('token_values'));"
Array
(
[0] => content
[1] => text
[2] => hrules
[3] => og
[4] => signup
[5] => single_field_viewer
[6] => token
[7] => rules
[8] => crazy_token_generator_of_doom
)
greggles@biff:~/d6$
If you see a module in that list which you don't really need on your site then you could disable it and see if it performs any faster. If you need the module, but not the tokens from it, consider working with the module maintainer to make the portions of code related to token optional (e.g. with a variable in the admin interface or by simply moving the token code to a sub-module).
Tweaking command line memory
If you are running this from the command line and set "Maximum number of objects to alias in a bulk update" to a large number, you are likely to run into PHP's memory limit. If the command dies partway through, the memory limit is the likely cause. The command line uses a separate php.ini configuration file from the PHP running inside the web server. On Ubuntu that file is stored in /etc/php5/cli/php.ini. You can modify this file to set the memory_limit parameter to a very high value such as 512M (512 megabytes), or set it to -1 to remove the memory limit.
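As an alternative to editing php.ini, the limit can be raised for a single run with the PHP command line option -d, which sets an ini value for that invocation only. A small sketch follows. PHP_BIN and php_with_memory are names made up for this example, while -d and --ini are standard PHP CLI options.

```shell
#!/bin/bash
# PHP_BIN is a hypothetical setting pointing at the php command line binary.
PHP_BIN="${PHP_BIN:-php}"

# Run a PHP script with a memory limit that applies to this run only,
# leaving php.ini untouched.
# Usage: php_with_memory 512M cron-update-pathauto.php
php_with_memory() {
  local limit="$1"
  shift
  "$PHP_BIN" -d "memory_limit=$limit" "$@"
}

# To see which php.ini file the command line actually loads:
#   php --ini
```

Passing -1 as the limit removes the memory limit for that run, the same as setting it in php.ini.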
Comments
I've got a better
I've got a better idea:
Is there a version of this for Drupal 6?
The script above currently does not work there, and I get missing cache.inc errors.
Multi-site setup
I wanted to use the script on a site of my own, but I have a multi-site setup, which means the settings are somewhere other than the default location and you need to trick the script so it can find settings.php properly.
I took a little bit of coding from drush module, maybe this script should be added as a drush pathauto module?
Anyway:
Thank you!
Great piece of code, just what I was looking for. Thank you!
Script for D7
Your script doesn't work for D7. This script works for me:
Doesn't work in D6
The PHP script returns immediately. Nothing happens.
Does anyone know how the function node_pathauto_bulkupdate() runs without any arguments? Does it pull the number from General Settings, or does it just run without a limit?
Anyway, there has to be a better way (in D6 at least) of bulk updating aliases across sites.
Bulk generate path aliases for large sites
Hi,
has anyone managed to fix this script for D6? (also still pending as a support request in #505042: Bulk generate path aliases for large sites)
Thanks & greetings, -asb
Bulk generate path aliases for Drupal 6
Hi asb,
The details on this page worked for me. I did adjust the path to pathauto for my installation.
I looped from my Mac which is missing some shell commands, so I used
for i in {1..250}; do echo $i; curl http://domain/script.php; done
to generate the paths. Doing this manually would have been painful.
I refreshed "Delete aliases" page to monitor the progress.
Thanks to all for the help above!
Updating pathauto from the command line (tweaked)
Although my suggestion (below) of using views bulk operations works in most cases, it failed last night with an out of memory error after re-aliasing 25,000 of my 40,000 nodes. But there is no way to start it again from where it left off.
By contrast, here is how to trigger pathauto from the command line, based on js's example. This is faster (I believe) than the VBO option.
First, in the pathauto settings at (/admin/build/path/pathauto), set "Maximum number of objects to alias in a bulk update:" to as high a number as you can that repeatedly works without timing out. I use 400. With 40,000 nodes, that means I need to run this script 100 times. Create a file called pathauto.php with these contents (note the different paths versus js's example):
Now, run the following from a bash shell:
for i in {1..100}; do echo $i; php pathauto.php; done
or...
or
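The run count used above (40,000 nodes at 400 per run gives 100 runs) can also be worked out in the shell instead of by hand. This is only a sketch: runs_needed is a made-up helper name, and pathauto.php is assumed to be the script described in this comment.

```shell
#!/bin/bash
# Number of runs needed to alias TOTAL nodes at PER_RUN nodes per run,
# rounded up so that a final partial batch still gets its own run.
runs_needed() {
  local total="$1" per_run="$2"
  echo $(( (total + per_run - 1) / per_run ))
}

# Example:
#   n=$(runs_needed 40000 400)
#   for ((i = 1; i <= n; i++)); do echo "$i"; php pathauto.php; done
```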
Creating URLs in batches
If you run into a PHP timeout problem with the above script you can lower the number of new URL aliases to create to something like 100 and then add this code at the end of the script - and don't forget to stop it after a while :)
Well this is the combination
Well this is the combination of both ideas
At least works for me.
Enjoy It.
enzo
--
enzo - Eduardo Garcia
weKnow - http://www.weknowinc.com
Please use the git author option: --author="enzo " for any patch I did and used in a new module release
Transliteration
Hey, this script is very nice! Thank you! I needed to have the option "Transliterate prior to creating alias" checked, so I added this to the script:
variable_set('pathauto_transliterate', TRUE);
Here's my code (with a 10 sec page refresh, and 1000 aliases created per refresh):
One of the comments is misleading/incorrect
To clarify, the line below sets up how many nodes to alias in a given run and isn't really related to how many nodes you have.
From what I can tell, this script will run as long as you have a web browser pointed at http://yoursite.com/pathauto.php, and you will need to monitor the aliasing progress and remove the script when aliasing is complete. Your Drupal site should flash this message when aliasing is complete:
This is a fantastic way out.
This is a fantastic way out. Works just fine :)
Cheers.
-----------------------------
Subir Ghosh
www.subirghosh.in
Best solution for Drupal 6
The best way to update pathauto aliases on Drupal 6 is to use the incredibly powerful Views Bulk Operations. Install it, then create a view of all of your nodes with style VBO. The key step is that VBO supports the Batch API, which allows Drupal to do thousands of tasks without timing out, and with a nice status page showing you its progress. Anyway, the pathauto option is near the bottom. You need to create separate views for terms and users as well.
VBO for taxonomy terms and vocabularies
> The best way to update pathauto aliases on Drupal 6 is to use the incredibly powerful Views Bulk
> Operations. Install it, then create a view of all of your nodes with style VBO. [...] You need to create
> separate views for terms and users as well.
Using VBO is a really great idea. However, I wasn't yet able to build a view to bulk update all term and vocabulary aliases (I don't want to update node paths). Any ideas?
Thanks, -asb
VBO for taxonomy
Yep, VBO doesn't support the pathauto function for taxonomy terms. So, I would use this solution instead.
node_pathauto_bulkupdate()
node_pathauto_bulkupdate() updates non-aliased nodes, but what if you need to refresh all nodes? I.e. create new aliases, replace existing on cron runs?
node_pathauto_bulkupdate() updates non-aliased nodes
Are you referring to pathauto_node.inc (~line#100) where the query does a "WHERE alias.src IS NULL"? I'd also like to know if this is an actual limitation and how to get around it safely.
drush
With drush you can run
drush sql-query "TRUNCATE {url_alias}"
to delete all URL aliases.
By the way, the various scripts do not work.
I'm running pathauto-6.x-1.5.
Anyone got a solution?
--
Open is Better
On my installation profile,
On my installation profile, with pathauto 2, I want to bulk alias all term menus, and use below at the end of profile tasks along with other useful functions menu_rebuild, drupal_cron_run, etc:
I don't do bootstrap, and aliases are bulk updated.
love, light n laughter
Not updating user and taxonomy paths
This script will create user and taxonomy aliases
Getting the results
Thanks for the script, I improved on it by adding this bit to make it call itself continuously:
But is there a way to get to the results in $messages? It would be great to be able to show these here.
Worked great. Thanks for
Worked great. Thanks for helping us out
Global Redirect
To speed up a node_save()-based content migration I recently worked on, I disabled Pathauto during import. Afterward, I tried both the scripted node_pathauto_bulkupdate() version as well as the drush version of updating url_alias. Unfortunately, afterward, Global Redirect isn't working, and it's possible to visit the /node form of URLs as well as the full path URLs. Why would the manual bulk update process not allow Global Redirect to work? Is there a table other than url_alias involved?
See #825006: It removes
See #825006: It removes aliases. Have lots of fun reading it. This doesn't just apply to users of the 'Scanner' module.
also
.dan. is the New Zealand Drupal Developer working on Government Web Standards
drush command needs fixing
The drush command looked a little stale so I thought I'd try this for bulk updating all node aliases:
drush php-eval " module_load_include('inc', 'pathauto', 'pathauto'); module_load_include('inc', 'pathauto', 'pathauto.pathauto'); node_pathauto_bulk_update_batch_process(); "
However, I'm getting this error:
Missing argument 1 for node_pathauto_bulk_update_batch_process(), called in /usr/local/Cellar/drush/4.5/commands/core/core.drush.inc(637) : eval()'d code on line 1 and defined pathauto.pathauto.inc:86
I'm dealing with ~9000 node aliases that need updating, and the in-browser bulk update doesn't work (of course), any thoughts on how I can clean this up?
Here is my script for bulk
Here is my script for bulk updating node aliases
It works well during the update process, but it can't cope with more than several thousand nodes
that may bypass updating the redirects
Hey Eugene, I think that would be fine if I didn't need pathauto to play nicely with the Redirect module, which would automatically create redirects (massively important for SEO) when URL aliases are changed.
It's important for me not to wipe the old URL aliases, which your script is doing.
It did give me an idea though. I'm going to stuff all the current URL aliases into Redirect's redirect.source table in the db. Then I'll try and collect the output of pathauto_node_update_alias_multiple() into an array and stuff it into Redirect's redirect.redirect table. (I sense a module in the works B-) )
Great idea.. I was looking
Great idea.. I was looking for a solution for the same.
Did you create a custom module for this? Can you share the code you used?
Drush command
All: Rather than discussing the Drush command code here, please help with #867578: Add drush commands for bulk alias updating/deleting instead. Directing our energy over there will get this done faster. Thanks.