Bulk generation of Pathauto node aliases - Manually, from cron, or command line

Last updated on
30 April 2025

Pathauto supports bulk generation of aliases for nodes that are not aliased.

The number of unaliased nodes that will be updated is controlled by the Pathauto setting:
"Maximum number of objects to alias in a bulk update"

Updating via the admin interface

The bulk update is currently a manual action done by:

  • visiting the Pathauto settings (/admin/settings/pathauto)
  • ticking the checkbox under "Node Path Settings" for:

    "Bulk generate aliases for nodes that are not aliased"
  • clicking the "Save Configuration" button
  • repeating as many times as required to generate all aliases

If you have a lot of nodes to update, this can be a very tedious button-clicking process. You can experiment with the "Maximum number of objects to alias in a bulk update" setting to find a value that is relatively high but does not cause the page request to time out.
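To estimate how many times the form must be saved, divide the number of unaliased nodes by the batch size and round up. The sketch below does that arithmetic; the function name and the example figures (12000 nodes, 50 per pass) are hypothetical and not Pathauto defaults.

```shell
#!/bin/bash
# Number of bulk-update passes needed: ceil(unaliased / batch_size).
# Both arguments are example values chosen for illustration only.
passes_needed() {
  local unaliased="$1" batch_size="$2"
  echo $(( (unaliased + batch_size - 1) / batch_size ))
}

passes_needed 12000 50   # prints 240
```

At 240 passes, clicking through the admin form by hand is clearly impractical, which is the motivation for the scripted approaches that follow.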

Using the command line to bulk update unaliased nodes

A faster way to update would be from a command line. To do so, I set up a cron-update-pathauto.php script which contains:

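<?php
// Drupal 6 resolves its includes relative to the current directory,
// so change into the Drupal root (here ./webroot) before bootstrapping.
chdir(dirname(__FILE__) . '/webroot');

// This gets Drupal started.
include_once './includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

// This gets Pathauto started updating aliases.
_pathauto_include();
node_pathauto_bulkupdate();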

The include paths here assume you'll put this cron-update-pathauto.php file directly above your top-level Drupal directory (named webroot in this example) with appropriate permissions (e.g. `chmod 500 cron-update-pathauto.php`, assuming the file is owned by the user who will be executing it).

Adding the unaliased node update to cron

To enable this to be called from cron, I also set up a modified version of the cron script from http://drupal.org/node/65307 as cron-pathauto.sh:

#!/bin/bash
#############################
# CONFIGURATION OPTIONS
#############################

# Set the complete local path to the directory that holds
# cron-update-pathauto.php (i.e. the directory above the
# Drupal root). Default is /var/www/
root_path=/var/www/

# Set the complete path to the PHP parser if
# different from standard
parse=/usr/bin/php

cronjob=cron-update-pathauto.php

##############################
# END OF CONFIGURATION OPTIONS
##############################

# Stop if the directory does not exist.
cd "$root_path" || exit 1

if [ -e "$cronjob" ]
then
  # Run the PHP script and report on its exit status.
  if "$parse" "$cronjob"
  then
    echo "$cronjob has successfully been parsed."
  else
    echo "$cronjob not parsed."
    exit 1
  fi
else
  echo "$cronjob not found."
  exit 1
fi

exit 0

Finally, I added the actual cron job to run this periodically. Depending on how long a run takes on your site and how quickly you want the aliases built, you could run it every 15 minutes, or more or less often.

crontab -e
# minute  hour  mday  month  wday  command
*/15      *     *     *      *     /home/htdocs/cron-pathauto.sh >/dev/null 2>&1

Note that this cron job could be left on a site permanently if you are importing nodes and not creating aliases as you import.
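If a single run can take longer than the cron interval, two runs may overlap. The original setup does not guard against this; the following is a minimal sketch of one possible guard, using a lock directory. The function name, the lock path in the usage comment, and the exit code 75 are all assumptions made for illustration.

```shell
#!/bin/bash
# Run a command only if no other instance holds the lock directory.
# mkdir either creates the directory or fails, so only one caller wins.
run_exclusive() {
  local lock_dir="$1"
  shift
  if ! mkdir "$lock_dir" 2>/dev/null; then
    echo "Another run is still in progress; skipping." >&2
    return 75   # arbitrary non-zero status meaning "try again later"
  fi
  "$@"
  local status=$?
  rmdir "$lock_dir"
  return "$status"
}

# Hypothetical usage from cron-pathauto.sh:
#   run_exclusive /tmp/cron-pathauto.lock /usr/bin/php cron-update-pathauto.php
```

Note that if the process is killed before it reaches rmdir, the lock directory stays behind and has to be removed by hand.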

Updating via drush

Another way to do this is to install Drush and then run this command:

drush php-eval '_pathauto_include(); node_pathauto_bulkupdate();'

As shown above, this command could be included in a script called periodically via cron.
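A minimal sketch of such a script might look like the following. The function name, the DRUSH_BIN and DRUPAL_ROOT variables, and the default path /var/www/webroot are assumptions for illustration; --root is Drush's option for pointing at a Drupal installation when the command is not run from inside it.

```shell
#!/bin/bash
# Hypothetical wrapper: run the Pathauto bulk update through Drush.
# DRUSH_BIN and DRUPAL_ROOT are assumed names with assumed defaults.
pathauto_bulk_update() {
  "${DRUSH_BIN:-drush}" --root="${DRUPAL_ROOT:-/var/www/webroot}" \
    php-eval '_pathauto_include(); node_pathauto_bulkupdate();'
}

# Call pathauto_bulk_update at the end of the script that cron runs.
```

Cron usually runs with a minimal PATH, so setting DRUSH_BIN to the full path of the drush executable is likely to be necessary.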

Performance tips

Pathauto's speed in Drupal 6 is directly tied to the creation of tokens. In Drupal 6, when a module uses tokens, all of the related tokens are calculated first, and only then are the ones that are actually needed used. In Drupal 7, only the tokens being used are calculated.

Disable CCK token generation for fields you don't use

If you have nodes with a large number of CCK fields it can be particularly slow to generate tokens (and therefore slow to calculate the Pathauto aliases).

One helpful tip is to work out which CCK fields you do not need as tokens and prevent token generation for those fields. In your admin interface, browse to Administer › Content management › Content types and click "manage fields" for each content type. Click the "Display fields" tab at the top of the page and then click the Token sub-tab. The URL should be something like /admin/content/node-type/food/display/token. Review the fields on this list and click "Exclude" for any fields which are not used for token generation on your site (note that modules other than Pathauto might use these tokens).

Disable modules that create tokens you don't need

You can review which modules are creating tokens, again using Drush php-eval, by printing the list of modules that implement hook_token_values.

greggles@biff:~/d6$ drush php-eval "print_r(module_implements('token_values'));"
Array
(
    [0] => content
    [1] => text
    [2] => hrules
    [3] => og
    [4] => signup
    [5] => single_field_viewer
    [6] => token
    [7] => rules
    [8] => crazy_token_generator_of_doom
)
greggles@biff:~/d6$

If you see a module in that list which you don't really need on your site, you could disable it and see whether the bulk update runs any faster. If you need the module but not its tokens, consider working with the module maintainer to make the token-related portions of the code optional (e.g. with a variable in the admin interface or by simply moving the token code to a sub-module).
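To judge whether disabling a module helped, it is useful to time a bulk update run before and after the change. The helper below is a minimal sketch with a hypothetical name; it reports whole seconds only and writes the result to standard error so the command's own output is left untouched.

```shell
#!/bin/bash
# Print how many whole seconds a command takes, then return its exit status.
time_run() {
  local start end status
  start=$(date +%s)
  "$@"
  status=$?
  end=$(date +%s)
  echo "elapsed: $((end - start))s (exit status $status)" >&2
  return "$status"
}

# Example, run before and after disabling a module:
#   time_run drush php-eval '_pathauto_include(); node_pathauto_bulkupdate();'
```

Compare runs that alias the same number of nodes, since each run processes a different batch and timings can otherwise vary for unrelated reasons.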

Tweaking command line memory

If you run this from the command line and set "Maximum number of objects to alias in a bulk update" to a large number, you are likely to run into PHP's memory limit; if the command dies partway through, that limit is the likely cause. The command-line PHP uses a separate php.ini configuration file from the PHP running inside the web server. On Ubuntu that file is stored in /etc/php5/cli/php.ini. You can modify this file to set the memory_limit parameter to a high value such as 512M (512 megabytes), or set it to -1 to remove the memory limit.
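As an alternative to editing php.ini, the PHP command-line binary accepts -d to set an ini value for a single run, and php --ini reports which configuration files it loads. The sketch below wraps the -d form; the function name and the PHP_BIN variable are assumptions for illustration.

```shell
#!/bin/bash
# Run a PHP CLI script with a one-off memory limit, leaving php.ini untouched.
# PHP_BIN is an assumed variable name; it defaults to whatever "php" is on PATH.
run_with_memory_limit() {
  local limit="$1"
  shift
  "${PHP_BIN:-php}" -d "memory_limit=${limit}" "$@"
}

# Examples:
#   php --ini                                          # show the CLI's php.ini location
#   run_with_memory_limit 512M cron-update-pathauto.php
```

This keeps the higher limit scoped to the bulk update instead of raising it for every command-line PHP script on the machine.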
