Initial work for the pathauto scalability discussion started by Adrian.

This patch:
1. Adds the path field to the node table
2. Updates the node table (via update.php) with default aliases from url_alias
3. Modifies l() to accept a $node object in place of $path, and uses $node->path when given an object
4. Uses default values when loading a node (i.e. uses node/nid as the path if none is specified)
5. Updates references to 'node/' . $node->nid to $node->path when appropriate and known to work.
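
Items 3 and 4 above can be sketched roughly as follows. This is an illustrative model written in Python rather than the patch's PHP, and it is not taken from the patch itself; `Node`, `node_load_defaults`, and `link_path` are hypothetical names, not Drupal functions.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Node:
    """Hypothetical stand-in for a loaded node row."""
    nid: int
    path: Optional[str] = None  # primary alias stored on the node row, if any


def node_load_defaults(node: Node) -> Node:
    """Fill in the default path (node/<nid>) when no alias was specified."""
    if not node.path:
        node.path = f"node/{node.nid}"
    return node


def link_path(target: Union[Node, str]) -> str:
    """Return the path to link to: a node's own path, or the string as given."""
    if isinstance(target, Node):
        return node_load_defaults(target).path
    return target
```

Under these assumptions, a node with an alias links to the alias, a node without one falls back to node/nid, and plain string paths pass through unchanged.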

Still to do:
1. Delete from the url_alias table any aliases that now exist in the node table (remove duplicates, leave secondary aliases)
2. Update pathauto.module to use node table for primary alias
3. Update path.inc drupal_lookup_path() to use node table by default, then url_alias if not found.
4. Update path.module to stop using url_alias for primary path values.
5. Check $node->path exists before using in l()?
6. Menus not fully addressed
7. URL actions (/edit, /delete, etc.) are not yet addressed
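
The lookup order proposed in to-do item 3 (node table first, url_alias as a fallback for secondary aliases) might look something like the sketch below. It is a Python model with in-memory dictionaries standing in for the two tables; the sample rows and the name `lookup_source` are made up for illustration and are not part of path.inc.

```python
from typing import Dict, Optional

# Hypothetical in-memory stand-ins for the two tables (alias -> system path).
node_paths: Dict[str, str] = {"first-node": "node/1"}  # primary aliases on node rows
url_alias: Dict[str, str] = {
    "about": "node/1",              # a secondary alias for the same node
    "tags/php": "taxonomy/term/3",  # an alias for something that is not a node
}


def lookup_source(alias: str) -> Optional[str]:
    """Resolve an incoming alias: node table first, url_alias as fallback."""
    source = node_paths.get(alias)
    if source is None:
        source = url_alias.get(alias)
    return source
```

An alias found in neither place returns None, which corresponds to the request falling through to a 404.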

Waiting for help/direction from Adrian to continue.


Comments

adrian’s picture

Title: Start of pathauto scalability. » pathauto scalability - add path column and possibly _path hook.
FileSize
42.48 KB

It's looking good so far, Kitt. I've updated the code to remove the duplication of values in the url_alias table and to fix the conflicts with the links patch. What I haven't done is update the _update_ file to remove the node/X aliases from the url_alias table.

I also removed the nodeapi hook from the path module, as it is no longer necessary (node_save and node_validate handle it now), although I have left the form_alter which places the field on the node form.

I also made the incoming links (with drupal_lookup_path) work with the node table, but this is not ideal. I think what will be necessary (especially for validation) is a hook_path that allows modules such as taxonomy to define their own path columns (and bundle them with the same model).
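
One possible shape for such a hook, sketched in Python purely to illustrate the idea: each module contributes a resolver for its own path column, and both incoming-link lookup and alias validation go through the same list. Everything here is an assumption (`register_path_resolver`, `resolve`, `alias_is_free`, and the sample data are hypothetical), and real Drupal hooks are discovered by function naming rather than by explicit registration.

```python
from typing import Callable, Dict, List, Optional

# A resolver maps an alias to a system path, or returns None if it has no match.
PathResolver = Callable[[str], Optional[str]]
_resolvers: List[PathResolver] = []


def register_path_resolver(resolver: PathResolver) -> None:
    """Stand-in for a module implementing the proposed path hook."""
    _resolvers.append(resolver)


def resolve(alias: str) -> Optional[str]:
    """Ask each registered module in turn; the first match wins."""
    for resolver in _resolvers:
        source = resolver(alias)
        if source is not None:
            return source
    return None


def alias_is_free(alias: str) -> bool:
    """Validation helper: an alias is usable only if no module already owns it."""
    return resolve(alias) is None


# Hypothetical per-module path columns, modelled as dictionaries.
node_table: Dict[str, str] = {"first-node": "node/1"}
term_table: Dict[str, str] = {"tags/php": "taxonomy/term/3"}
register_path_resolver(node_table.get)
register_path_resolver(term_table.get)
```

Routing validation through the same resolvers is what would keep a taxonomy alias from colliding with a node alias, which seems to be the concern raised above.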

It's looking good so far. Does anyone else have any comments?

beginner’s picture

I just tested the patch.

1) The patch includes your settings.php info :)
2) Should the path.module be enabled by default, now?
3) I enabled path.module. I created a first node, and entered a path ('first-node'), set the menu under primary links. After saving, the primary link links to node/1, not first-node.

beginner’s picture

I just saw the note that menus are not fully addressed.
Also, it won't accept incoming links yet. In my example, directly accessing the page ?q=first-node gives me a 404.

beginner’s picture

FileSize
44.06 KB

fixes incoming link.
doesn't patch settings.php.

doesn't fix menus.

beginner’s picture

FileSize
44.45 KB

fixes menu aliasing.

All my changes in the last two patches are in the function drupal_lookup_path in includes/path.inc.
It works, but you may want to look at it to see whether the code/logic can be improved.

FiReaNGeL’s picture

Any developments on this? I would be interested in seeing if we can get this into a functional state, as I'll soon have a site with more than 15 million nodes (and I'm pretty sure it'll get ugly then).

greggles’s picture

Title: pathauto scalability - add path column and possibly _path hook. » path alias scalability - add path column and possibly _path hook.

Changing title to be more descriptive of the problem. It's not a pathauto problem, it's just that pathauto does a wonderful job of highlighting this problem.

m3avrck’s picture

Version: x.y.z » 6.x-dev

This would be most excellent for Drupal 6, especially as we work on a new menu system.

One area to really increase performance would be *de-normalizing* paths and actually storing them in the node, menu, and comment tables. This would be the fastest and is what EZPublish currently does to solve this very same issue.

moshe weitzman’s picture

anyone working on this? would be nice.

moshe weitzman’s picture

any volunteers?

Wim Leers’s picture

+1

Subscribing.

catch’s picture

Version: 6.x-dev » 7.x-dev

catch’s picture

Status: Needs work » Closed (won't fix)

We have path whitelisting and per-page caching in core now, so marking as won't fix.