Certain patterns can be harmful or dangerous for your site.

Aliases that begin at the root

If you configure Pathauto to create aliases such as [title-raw] for a type of content that users can create, you allow them to create pages like "http://www.example.com/google1234567.html", which are used to authenticate with Google or Yahoo! webmaster tools. This allows the user to authenticate as the owner of the site and control the entries inside those tools (including removing your site from the index, switching between the www and no-www versions of your site, and telling the search bot to visit less frequently). This is not the default and is not recommended. If you want short aliases, at a minimum make your pattern something like "c/[title-raw]" to put the character "c" before every title.

Index Aliases

For example, if you create a pattern like forum/[title] and have "Create Index Aliases" turned on, Pathauto will create an alias at example.com/forum/ that overrides the default example.com/forum page. To fix this problem, remove that entry from the URL aliases, either via the GUI at example.com/admin/path or directly in the database table. A better prefix would be forums/ or discussions/, so that the names don't collide.

Transliteration Problems

Another problem occurred when a site owner created the alias pattern user/[user] for user accounts. A user then registered with a name whose characters were not in the transliteration table, so nothing was left of the name after cleaning and Pathauto created an alias at user, which broke the "My Account" page.
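The failure described above can be reproduced with a small sketch. The functions sketch_clean_token() and sketch_build_alias() and the two-entry transliteration table are hypothetical stand-ins written for this illustration; they are not Pathauto's actual code, and only assume that characters missing from the table are dropped and that trailing separators are trimmed.

```php
<?php
// Hypothetical, simplified stand-in for alias cleaning. NOT Pathauto's code.
function sketch_clean_token($value, array $table) {
  // Replace characters that have an entry in the (assumed) transliteration table.
  $value = strtr($value, $table);
  // Drop everything that is still not an ASCII letter, digit, or space.
  $value = preg_replace('/[^A-Za-z0-9 ]/', '', $value);
  // Lowercase, trim, and turn runs of spaces into a single hyphen.
  $value = strtolower(trim($value));
  return preg_replace('/ +/', '-', $value);
}

// Hypothetical helper: fills the [user] token and trims stray slashes.
function sketch_build_alias($pattern, $username, array $table) {
  $alias = str_replace('[user]', sketch_clean_token($username, $table), $pattern);
  // With an empty token only the static prefix is left.
  return trim($alias, '/');
}

// Assumed, deliberately tiny transliteration table.
$table = array('é' => 'e', 'ü' => 'u');

echo sketch_build_alias('user/[user]', 'René', $table), "\n";
echo sketch_build_alias('user/[user]', '名前', $table), "\n";
echo sketch_build_alias('users/[user]', '名前', $table), "\n";
```

With the pattern user/[user], the second name collapses to the bare path user, which is exactly the path Drupal uses for the account page. With users/[user], the same name collapses to users, which is an ugly alias but does not shadow a core path.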

Lack of a Prefix

You can also run into problems using an alias that may not get any prefix. So, for example, if you leave the default node alias of [title] and then a user creates a node with the title "admin" you will get an alias at www.example.com/admin/ which points to the node instead of the administration panel. Similar problems could occur with [cat]/[title] aliases and a post in the "admin" category titled "settings". The opportunities for problems are enormous.

Recommendations

To avoid these problems, you should create aliases that start with static text that is not a reserved Drupal path, callback, or directory. For example, user aliases should probably start with users/ so that they cannot conflict with the user/ set of paths.
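As a rough illustration of this recommendation, the sketch below checks whether a pattern begins with a static segment that is not on a reserved list. The function sketch_pattern_has_safe_prefix() is hypothetical and not part of Pathauto or Drupal, and the $reserved array is only a small illustrative subset; a real site would need to list its own menu paths and top-level directories.

```php
<?php
// Illustrative subset only. A real site has more reserved paths and directories.
$reserved = array('admin', 'user', 'node', 'forum', 'files', 'modules', 'themes', 'sites', 'misc', 'includes');

/**
 * Hypothetical check: TRUE when the pattern starts with a static segment
 * (no [token]) that is not reserved and is followed by another segment.
 */
function sketch_pattern_has_safe_prefix($pattern, array $reserved) {
  $segments = explode('/', ltrim($pattern, '/'));
  $first = strtolower($segments[0]);
  // No prefix at all, a token in first position, or nothing after the prefix.
  if ($first === '' || strpos($first, '[') !== FALSE || count($segments) < 2) {
    return FALSE;
  }
  return !in_array($first, $reserved, TRUE);
}

$patterns = array('[title-raw]', 'c/[title-raw]', 'user/[user]', 'users/[user]', 'forum/[title]', 'forums/[title]', '[cat]/[title]');
foreach ($patterns as $pattern) {
  echo $pattern, ': ', sketch_pattern_has_safe_prefix($pattern, $reserved) ? 'ok' : 'risky', "\n";
}
```

Under these assumptions the patterns discussed on this page that start with a token or with a reserved word (such as [title-raw], user/[user], forum/[title], and [cat]/[title]) are flagged as risky, while c/[title-raw], users/[user], and forums/[title] pass.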

Updated Recommendation

The Pathauto "Punctuation Settings" at admin/build/path/pathauto support removing periods from the URL, which makes the Google Webmaster Tools verification issue above easy to avoid for any node whose alias Pathauto controls.

Comments

kkobashi

I ran into a problem creating social bookmark links to digg, delicious, stumbleupon, technorati, etc., where pathauto was prefacing the anchor tag hrefs with the base URL of my website. What you need to do is run urlencode on the portions of those URLs that need to be encoded, not on the entire href.

In the code below, $node_url and $title are Drupal template variables available at the theme node level (node.tpl.php).

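  // Encode the full target URL and the title as query values; leave the rest of the href alone.
  $urlpart = urlencode('http://www.site.com' . $node_url);
  $titlepart = urlencode($title);
  // Use &amp; so the query separator is valid inside the HTML attribute.
  $href = "http://www.stumbleupon.com/submit?url={$urlpart}&amp;title={$titlepart}";
  $link = "<a href='{$href}'><img border='0' src='http://cdn.stumble-upon.com/images/24x24_su.gif' alt='StumbleUpon logo'/>Stumble It!</a>";
  echo $link;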

You can read more about generating social bookmarks with pathauto.

Kerry Kobashi
http://www.kobashicomputing.com
http://www.kerryonworld.com

mcsolas

I am getting a page not found message when I try and read more about the social bookmarks.

I'm having a similar problem with pathauto. It would be nice to see a little more info on this subject.

gmclelland

FYI...I believe you can solve the problem of URLs without a prefix with the path_blacklist module.