Core hierarchical page structuring module

Last updated on
24 April 2025

This proposal builds upon item #2 from a recent list of suggestions by Dries for SoC core improvements.

Proposed by: Peter (pwolanin)

Potential co-mentors:
pwolanin
add1sun
webchick

The Problem

Drupal's current core book module has a number of problems:

  • It is too closely tied to a single, module-defined content type
  • It has (I suspect) quite poor performance in generating the navigation menu, navigation elements, etc.
  • The UI for moving a page within a hierarchy is very frustrating since the drop-down list can be hundreds of entries (or more) long
  • There is no support for versioning or replication of a hierarchy
  • There is no support for mass operations (e.g. publish, add taxonomy, set authorship) on sections of the hierarchy
  • There is no interaction with the path module to generate hierarchical paths automatically
  • ???

Goals

from Dries:

Remove the "book page" node type, and turn the book module into a module that is specialized at creating hierarchical content trees. The module should automatically extract a menu from the node hierarchy, set the correct breadcrumbs, help you define a hierarchical URL structure with clean URLs, etc. This module will be the de facto standard to structure pages and to create hierarchical navigation schemes in Drupal.
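As a rough illustration of what "extract a menu from the node hierarchy" and "set the correct breadcrumbs" could mean in practice, here is a minimal Python sketch. It is not Drupal code: the parent map, the titles, and the function names breadcrumb() and menu_tree() are all hypothetical, and a real module would read this data from the database.

```python
# Hypothetical data: nid -> parent nid (0 marks the top of a hierarchy),
# and nid -> page title. Both are made up for illustration.
PARENT = {23: 0, 25: 23, 34: 23, 76: 23, 44: 76, 12: 76}
TITLE = {23: "Handbook", 25: "Install", 34: "Theme",
         76: "Develop", 44: "Hooks", 12: "Menus"}


def breadcrumb(nid):
    """Return the titles of all ancestors of nid, top-most first."""
    trail = []
    current = PARENT[nid]
    while current != 0:
        trail.append(TITLE[current])
        current = PARENT[current]
    return list(reversed(trail))


def menu_tree(root):
    """Return a nested (title, children) tuple for the sub-tree at root."""
    children = sorted(n for n, p in PARENT.items() if p == root)
    return (TITLE[root], [menu_tree(child) for child in children])
```

Calling breadcrumb(44) walks up through node 76 to node 23, and menu_tree(76) nests the two children of node 76 under it. Both functions need nothing beyond the parent pointer, which is why the choice of storage structure mainly affects how cheaply this data can be fetched.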

Proposal

The following tasks are proposed for a Summer of Code student:

  • Define the optimal features for end users and those adding content
  • Investigate and select a robust data structure (e.g. is a single table better than one per hierarchy, how can sub-trees be efficiently selected and cached)
  • The data structure should lend itself to efficient but also understandable algorithms (to facilitate module maintenance)
  • Define and implement a migration path for data from the current book module
  • Define and implement an improved UI for common tasks
  • Benchmark one or more implementations against the current book module
  • Investigate and implement caching of tree or sub-trees as appropriate to improve performance (especially for authenticated users)
  • Improve the breadcrumb handling as needed
  • Add support for hierarchical path generation, or consider integration with the pathauto API
  • Provide clear doxygen comments in the code
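The hierarchical path generation task above could work roughly as follows. This is a Python sketch under stated assumptions, not the pathauto API or any existing Drupal function: slugify() and hierarchical_alias() are hypothetical names, and the slug rules (lower-case, hyphens for everything that is not a letter or digit) are just one plausible choice.

```python
import re


def slugify(title):
    """Lower-case a title and turn each run of non-alphanumerics into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def hierarchical_alias(nid, parent, title):
    """Join the slugs of nid and all of its ancestors, top-most first.

    parent maps nid -> parent nid (0 for a top-level node);
    title maps nid -> page title. Both are hypothetical inputs.
    """
    parts = []
    while nid != 0:
        parts.append(slugify(title[nid]))
        nid = parent[nid]
    return "/".join(reversed(parts))
```

For a page titled "Writing hooks (6.x)" under "Module Guide" under "Developer Handbook", this yields developer-handbook/module-guide/writing-hooks-6-x. Note that moving a page would change the alias of every page beneath it, so a real implementation would also need to regenerate aliases for the moved sub-tree.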

Use Cases

  • The Drupal.org Handbooks/documentation
  • Online versions of books
  • Collaborative documentation efforts (wiki-like)

Thoughts and resources

An existing contrib module: http://drupal.org/project/relativity
This implements more general tree structures and could provide useful code and ideas.

Discussions of a "relationship API" on g.d.o:
http://groups.drupal.org/node/1966
http://groups.drupal.org/node/1323

Issue where some proposals/patches were offered previously for the book module: http://drupal.org/node/81226

Code in my sandbox which should work under Drupal HEAD/6.x-dev

Some articles on storing/traversing hierarchies in a database table:

http://www.sitepoint.com/article/hierarchical-data-database
http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
http://www.artfulsoftware.com/mysqlbook/sampler/mysqled1ch20.html

The book by Joe Celko "Trees and Hierarchies in SQL for Smarties" (available in paper or e-book).

My sandbox code currently stores the nid of the highest-level node (i.e. the one with parent == 0) in a table column, which allows one to easily select all nodes in that hierarchy. Would it be better to store the full path to the top, to allow more efficient selection of sub-trees?

For example, imagine a hierarchy like this (nid values shown):

     23
   / |  \
 25  34  76
 |   |   |  \ 
... ...  44  12

Then a query like: "SELECT * FROM {outline_hierarchy} WHERE path LIKE '23:76:%'"
would return the sub-tree rooted at node/76, and 23:76: could be the cache ID for caching this result.
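To make the comparison concrete, here is a small runnable sketch using Python's built-in sqlite3 module rather than Drupal's database layer (so the table name appears without the curly-brace prefixing). The column names nid, parent, root and path are assumptions for illustration, and each path is assumed to end with a colon so that a prefix such as '23:7' cannot accidentally match '23:76:'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE outline_hierarchy ("
    "nid INTEGER PRIMARY KEY, parent INTEGER, root INTEGER, path TEXT)"
)
# The example tree: (nid, parent, root, path). The root column is the
# top-level nid; path is the colon-terminated chain of nids from the top.
rows = [
    (23, 0, 23, "23:"),
    (25, 23, 23, "23:25:"),
    (34, 23, 23, "23:34:"),
    (76, 23, 23, "23:76:"),
    (44, 76, 23, "23:76:44:"),
    (12, 76, 23, "23:76:12:"),
]
conn.executemany("INSERT INTO outline_hierarchy VALUES (?, ?, ?, ?)", rows)


def whole_hierarchy(root):
    """Select every node in a hierarchy using only the root column."""
    cur = conn.execute(
        "SELECT nid FROM outline_hierarchy WHERE root = ? ORDER BY nid", (root,)
    )
    return [row[0] for row in cur]


def subtree(path_prefix):
    """Select a node and all its descendants with a single LIKE on the path."""
    cur = conn.execute(
        "SELECT nid FROM outline_hierarchy WHERE path LIKE ? ORDER BY path",
        (path_prefix + "%",),
    )
    return [row[0] for row in cur]
```

Here whole_hierarchy(23) returns all six nodes, while subtree("23:76:") returns only nodes 76, 12 and 44 in one query. With the root column alone, the same sub-tree would need either one query per level or filtering in code.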

If a new node was added as a child of node/76, the cached subtrees 23:76: and 23: would need to be expired/deleted but the cached subtrees 23:25:... and 23:34:... would still be valid.
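The expiry rule just described amounts to "expire the cache entry for every prefix of the parent's path". A minimal Python sketch, assuming colon-terminated paths are used as cache IDs and using a plain dict as a stand-in for a real cache (both the dict and the function names are hypothetical):

```python
def cache_ids_to_expire(parent_path):
    """Return the cache IDs invalidated by adding a child under parent_path.

    These are the parent's own path and every shorter prefix of it,
    e.g. "23:76:" gives ["23:76:", "23:"].
    """
    segments = [s for s in parent_path.split(":") if s]
    return [":".join(segments[:i]) + ":" for i in range(len(segments), 0, -1)]


def expire_for_new_child(cache, parent_path):
    """Drop every cached sub-tree that contains the new child."""
    for cid in cache_ids_to_expire(parent_path):
        cache.pop(cid, None)
```

Starting from a cache holding "23:", "23:25:", "23:34:" and "23:76:", calling expire_for_new_child(cache, "23:76:") leaves only the entries for "23:25:" and "23:34:", matching the behaviour described above.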

(edit) here's another link that suggests a very similar technique of storing the path to the root: http://www.sqlteam.com/item.asp?ItemID=8866

Comments on this proposal

First of all: I think this
Frando - March 17, 2007 - 18:56

First of all: I think this is a very important topic.
I thought about whether it would be a good idea or not to use the (new) menu system as a base for all this, because the menu system already provides the basic data structures to construct a hierarchy.
If the menu system were the base, things could look like this:

* each 'thing' in the tree is represented by a menu item
* the module takes care to keep nodes and their menu items synchronized
* the module provides a simple form that can be added to other forms, for example to the node creation form (we have this already, but the UI could be improved) but also to e.g. the views creation form, to create a new menu item
* the module provides a per-root-menu option whether navigation links (like in the book module) shall be displayed or not
* Breadcrumbs are taken care of by the menu system

The advantage would be that you would then have a page hierarchy that can include all types of pages and not only nodes.

I'm not yet totally sure whether this could work out, I just wanted to throw in the idea...
======================================================
======================================================

That is an interesting idea.
pwolanin - March 18, 2007 - 15:26

That is an interesting idea. I just looked at the way data is stored in the new 6.x menu table, and indeed it is already built to store a hierarchy very much as I suggest above.
======================================================
======================================================

More ideas
egfrith - March 29, 2007 - 16:31

I think this is an important topic too. As I've mentioned in a previous comment, I know of a number of people who need hierarchical organisation of pages. There are two features that would be nice:
* The ability to give create, edit and delete permissions to particular users for a subtree of a hierarchy. Looking at the projects page, there are two modules (menu per role, path access) that might implement this sort of thing already with the book module, but perhaps it would be advantageous to think about this feature when designing the new system.
* The ability to insert the book (hierarchy) menu into another menu, so that a separate menu is not needed. Sympal book menu does this, but it might be nice to incorporate it.

Like at least one other in this thread, I have used the category module, and would be keen for some of its functionality to enter the core. For me, the key feature it offers beyond this proposal is the ability to use pages in the hierarchy as terms to categorise nodes by. The example I posted in the previous comment explains more. It would be great if this type of scenario could be considered in the design.
======================================================
======================================================

Solution given at
vito_swat - March 20, 2007 - 13:21

The solution given at http://www.sqlteam.com/item.asp?ItemID=8866 is very easy and quite fast. I used it in one project and it works very well. Beware that each nid in the full path field has to be 0-padded so that every segment in the full path is the same size. This means that the depth of your tree is limited by the varchar size in the database.

Vito
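A short sketch of the padding point, in Python. This is an illustration and not code from the linked article: the colon separator is carried over from the proposal above, the segment width of 10 digits is an arbitrary assumption, and "sorting correctly" is offered as one likely reason for fixed-width segments rather than the article's stated rationale.

```python
def padded_path(nids, width=10):
    """Join nids into a path of fixed-width, zero-padded segments."""
    return "".join(f"{nid:0{width}d}:" for nid in nids)


def max_depth(column_length, width=10):
    """How many segments (digits plus separator) fit in a varchar column."""
    return column_length // (width + 1)


# Unpadded paths sort as text, so node 10 lands before node 9.
unpadded = sorted(["23:9:", "23:10:"])
# Padded paths sort in numeric order.
padded = sorted([padded_path([23, 10], width=4), padded_path([23, 9], width=4)])
```

With 10-digit segments plus a separator, a varchar(255) column holds at most max_depth(255) levels, which is the depth limit mentioned above. Narrower segments allow deeper trees but cap the largest nid that can be stored.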
======================================================
======================================================

Nested sets
chx - March 29, 2007 - 10:18

Please note that core already uses an implementation of materialized path in menu.inc, namely nested sets; it should be reused.
--
The news is Now Public | Drupal development: making the world better, one patch at a time. | A bedroom without a teddy is like a face without a smile. | Blog about life in Hungary
======================================================
======================================================

Hi pwolanin Thanks for
TheWhippinpost - March 29, 2007 - 15:41

Hi pwolanin

Thanks for asking me to pop here in furtherance to this comment.

Upfront Confession: I come from a strong SEO-influenced background so I am particularly anal about paths.

REF: ModX

--------------
Overview

As a quick overview for other interested parties reading this:

Last year - after having played with Civicspace (CS) the year before - I decided to move an existing hierarchically-structured site to Drupal (4.7). What I hadn't appreciated when playing with CS, was the enormous battles I would need to overcome - not only in recreating a hierarchical structure that mimicked the existing site, but with the overall path system in general, which presented obstacles at what seemed like every turn.

Eventually, through a not inconsiderable amount of work (and stubbornness), I managed to launch the site. But I have to say, although I can (and do) add content, I have to approach the task standing on eggshells because of any potential quirky side-effects that might happen after a certain action is performed, which makes me "risk-averse" to trying new things (with that site).

Just as that site went live, Drupal 5 emerged. I didn't have the stomach to go through all the battles again, so I watched from the sidelines.

I had hoped Drupal 5 would have the general path issue fixed, but after subscribing to Pathauto's issue forum, I instead witnessed a new wave of various issues flooding my inbox (Greggles is doing an extreme amount of superb work there BTW, much much respect).

I note (and have tried) the various attempts to implement a hierarchical solution - Greenash's Category springs to mind - but ultimately, none was without problems.

From various communications and reading, I get the impression that the real "problem" lies with the core Path module and its way of doing things itself. It seems as though all the various attempts to output a clean, reliable path structure by well-meaning module authors have struggled against the workings of the Path module (indeed, some of them are attempts to "put right" what the Path module should (seemingly) do natively).

ModX

Recently, I had call for another CMS-type platform for another project. I was resisting because of my above experiences. Quite by chance, I happened to try out ModX over at opensourcecms.com

I'm not one for bloat, so am not easily impressed with all this Ajax widgetisation stuff. But I was impressed with both the implementation and responsiveness of it in admin - The contributors have really worked hard at serving things right to you, quickly, and with minimal clicks.

But my war-wounds dictated my needs for a reliable Path system, with folders (and sub-folders) serving as category containers, as my first priority.

It turned out to be very easy: I just create a new document, declare it as a container, and add new documents beneath it.

Because I can create content in the same window as my site's folder and document tree (visible in left frame), I can - if I so choose - also easily insert links to other documents within my content, by just specifying the documents' ID (printed next to each document in the directory tree for easy reference).

And d'ya know what? It works. All internal linking (within content), and site-wide navigational links, are faithfully printed, however deep.

I have no idea how they have chosen to implement their solution, but that's how it should be.

I'm not here to sell ModX - I have tried to stick to the topic in-hand - but I am currently sold.

If you can pull this off pwolanin, then I believe you would be making one of the most important core Drupal contributions. You asked me to comment, but by doing so, you put me in the potential position of evangelising ModX, but I'm not for one minute suggesting you follow ModX's example, just that you (or someone), get it right - For me, all the other fancy widgetisation stuff is second to this fundamental.

I'm not confident this is what you were after pwolanin, but feel free to ask more specific questions.

Mike
------------------------------------------------------------------------------------------
A simple thanks to those that help, a price worth paying for future wealth.
======================================================
======================================================

like the foo_attach modules?
drewish - March 30, 2007 - 21:55

humm, modx's whole attaching documents to containers paradigm seems like what people are trying to get working with the image/audio_attach modules (which are horribly broken IMHO). it'd be interesting to see this done in core for the book module because i think it would do what the image_gallery module is trying (and again failing) to do.
======================================================
======================================================

This idea is right on
kirilius - June 14, 2007 - 14:09

This idea is right on target. I have been evaluating Drupal for some time trying to find a solution to a very basic problem: create a hierarchical node type.

CCK module is great because it is so easy to create new node types but it does not solve the main problem that I have. I want to have a hierarchy of type:

Node Type A
|-- Node Type B
|-- Node Type B
|-- Node Type B
.......

Whether it is done by creating two node types (A and B) and establishing a one-to-many relationship between them or simply allowing admins to create super-node types (let's call them AB), it doesn't matter. The reason I need such a structure is to allow my users to go through the following scenario:
1) Add a new A-node
2) Once the A-node is done, allow them to add a number of B-nodes UNDER it (not create them independently and later on try to establish a relationship)

If they want to create a B-node directly, the UI should prompt them to pick an A-node (that's already created).

Of course there is no reason to limit this hierarchy only on two levels ;-)

Node Relativity solves this problem but the actual implementation of it causes a lot of pain since it doesn't care about views and it is not easy to customize.

Another attempt to solve this is the Category module, which allows the categories to act like a regular content type. Again - the idea is good, but implementing it in my case caused only hours of lost time.
======================================================
======================================================
