If a term has multiple parents (the vocab having hierarchy with multiple parents enabled, of course) then the catpath token will produce a path with the ancestor terms duplicated and all parent terms listed. The termpath token also produces this in many instances with nodes that belong to a term with multiple parents. Example:

taxonomy (term, weight):
A, 0
- B, 0
- C, 1
- - E, 0
- D, 2
- - E, 0

Using [catpath] for term D produces path "A/A/D/C/E"

Using [termpath]/[title] for a node produces the path "A/A/D/C/E/title" when the node is classified with term combinations:
E
B and E
C and E
D and E
C, D and E

Also, term combinations:
B,C and E produce path "A/B/title"
A,C and E produce path "A/title"
A,B,C and E produce path "A/title"

Using Token 1.10, PHP 5.2.0, mysql 5.0.32

Comments

greggles’s picture

Version: 5.x-2.1 » 5.x-2.x-dev

It seems plausible to me that this is a problem. I don't think I tested the catpath or termpath tokens with multiple hierarchies. I don't have time to work on this now but would be happy to apply a patch if someone else can create one.

Rowanw’s picture

Title: Duplication of ancestor terms in catpath/termpath when term has multiple parents » Plolyheirarchy: Duplication of ancestor terms in catpath/termpath when term has multiple parents

I've just been discussing this issue in the office and we came to the conclusion that either you would need Pathauto to generate several aliases for the same terms or simply avoid using the catpath/termpath tokens for that vocabulary.

Remember that the main purpose of a polyhierarchy is to provide multiple ways of finding the exact same content. It doesn't make the E in C (from above) any different than the E in D, so ideally there should only be one path.

I honestly don't know how Pathauto would handle this issue. I'll leave it to someone who knows something about programming and IA to come up with an idea.

Rowanw’s picture

Title: Plolyheirarchy: Duplication of ancestor terms in catpath/termpath when term has multiple parents » Duplication of ancestor terms in catpath/termpath when term has multiple parents

Shouldn't have changed the title.

greggles’s picture

Using [catpath] for term D produces path "A/A/D/C/E"

Really? Did you mean Term D or Term E there?

Now that I re-read this in light of rowanw's post...I'm pretty sure that this is "by design" but I'd like to be sure.

Freso’s picture

Status: Active » Postponed (maintainer needs more info)

It's not "by design" (or shouldn't be, IMHO). However, this is either a feature request for term E in that example to generate both "A/D/E" and "A/C/E", which could get increasingly complex with the complexity of the taxonomy tree, or a duplicate of the mentioned #138674: if multiple placeholders exist create multiple aliases for a node. I haven't read the latter yet, so I can't say right now.
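Generating one alias per lineage, as described above, can be sketched as a recursive walk that follows each parent chain separately. This is only an illustration on the example taxonomy from the original report: the `$parents_of` array and the `term_lineage_paths()` function are hypothetical and are not part of Pathauto or taxonomy.module.

```php
<?php
// Toy copy of the example taxonomy: term => list of its parents.
// (Hypothetical structure; Drupal stores this in {term_hierarchy}.)
$parents_of = array(
  'A' => array(),
  'B' => array('A'),
  'C' => array('A'),
  'D' => array('A'),
  'E' => array('C', 'D'),
);

// Return every root-to-term path, one string per lineage.
function term_lineage_paths($term, $parents_of) {
  if (empty($parents_of[$term])) {
    // A root term is its own (only) path.
    return array($term);
  }
  $paths = array();
  foreach ($parents_of[$term] as $parent) {
    // Extend each path that leads to this parent with the current term.
    foreach (term_lineage_paths($parent, $parents_of) as $prefix) {
      $paths[] = $prefix . '/' . $term;
    }
  }
  return $paths;
}

echo implode("\n", term_lineage_paths('E', $parents_of)), "\n";
// A/C/E
// A/D/E
```

The number of paths for a term is the sum of the path counts of its parents, so it multiplies whenever an ancestor also has several parents. That is where the growing complexity comes from.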

Freso’s picture

After reading #138674: if multiple placeholders exist create multiple aliases for a node, I'd categorise this as a duplicate of that.

greggles’s picture

I just tried to repeat this bug. What I found was:

Taxonomy aliases:
Pattern: "the_category/[vocab-raw]/[termpath-raw]"
For D: the_category/vocab1/a/d (as expected)
For C: the_category/vocab1/a/c (as expected)
For E: the_category/vocab1/a/d/c/e

E is not exactly as I expected it but I'm also not really sure what the appropriate behavior could be. As Freso points out, we might want to handle this along the lines of creating multiple aliases for a node or creating a redirect so that term E would have:

Main alias: the_category/vocab1/a/c/e
Secondary alias: the_category/vocab1/a/d/e

Where the secondary alias is a 301 redirect to the first one. However, I don't think that's actually all that valuable.
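As a rough sketch of the main/secondary idea, the candidate aliases could be split into one main alias and a map of redirects. The `split_aliases()` helper below is hypothetical, assumes the list is already ordered by preference, and only builds the plan; it does not register anything with Drupal.

```php
<?php
// Split candidate aliases into one main alias and a redirect map.
// Assumption: $aliases is ordered by preference, best candidate first.
function split_aliases(array $aliases) {
  $main = array_shift($aliases);
  $redirects = array();
  foreach ($aliases as $secondary) {
    // Each secondary alias would answer with a 301 pointing at the main alias.
    $redirects[$secondary] = $main;
  }
  return array('main' => $main, 'redirects' => $redirects);
}

$plan = split_aliases(array(
  'the_category/vocab1/a/c/e',
  'the_category/vocab1/a/d/e',
));
echo $plan['main'], "\n"; // the_category/vocab1/a/c/e
```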

Node aliases:
Pattern: "[termpath-raw]/[title]"
For E: "a/d/c/e/title" (at least it's consistent)
For C, D, E: "a/d/c/e/title"
For A: "title" (should be "a/title")

So...I confirmed several of the cases you found and agree that some of them are pretty weird and perhaps not the ideal behavior. But, I'm also not sure what the ideal behavior is exactly.

Note: I didn't see the duplication of ancestor terms in either the catpath or the termpath, which is what this issue is mostly about.

Can anyone else confirm the behavior? IMO, this should be a low priority because this is only a problem for multiple hierarchy and multiple select, which is a pretty uncommon case.

Rowanw’s picture

Status: Postponed (maintainer needs more info) » Closed (works as designed)

"Can anyone else confirm the behavior?"

Yes, there's no doubt about what's happening. I'd suggest for anyone who's using a polyhierarchy to avoid using Pathauto for their taxonomy terms, otherwise you'll have very long paths.

I'm marking this as by design, as Pathauto is doing exactly as it's told. If someone comes up with a viable workaround they should make a new issue with their request/patch. This issue is going nowhere fast. :)

greggles’s picture

Sure, but...the duplication of terms (if/when it happens) is wrong.

I.e. the a/c/d/e is OK, in my opinion, but A/A/C/D/E would not be. I just can't get A/A/C/D/E to happen :(
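If the duplicated ancestor does show up, one possible guard is to drop repeated segments from the generated path. The `dedupe_path_segments()` function below is a hypothetical helper, not something Pathauto provides. It would also remove legitimate repeats, for example a city and a county that share a name.

```php
<?php
// Collapse repeated segments in a slash-separated path,
// keeping the first occurrence of each segment.
function dedupe_path_segments($path) {
  return implode('/', array_unique(explode('/', $path)));
}

echo dedupe_path_segments('A/A/D/C/E'), "\n"; // A/D/C/E
```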

pcambra’s picture

Version: 5.x-2.x-dev » 5.x-2.3

Hi.

I am currently getting this weird behavior (A/A/D/C/E paths). Is there a solution or alternative for Pathauto with polyhierarchy?
I think that the right behavior would be creating two separate paths:
A/D/E
A/C/E

Or simply choose the best path.

A/C/D/E would be good too!

Oh, I am using the latest version (28 June).

Thanks

greggles’s picture

Status: Closed (works as designed) » Postponed (maintainer needs more info)

@pcambra - can you provide a consistently reproducible simplified test case? As you can see from the discussion above, that's where we got stuck. I could never reproduce the problem :/

opteronmx’s picture

Hi everyone:

Sorry for length...

I have the same problem. In my case the problem is at the leaves of my tree, where I have children with multiple parents in the hierarchy.
I have a very big taxonomy with all Mexican regions (from states to postal codes > 100k terms !!!), and in this case the same postal code is shared by several neighborhoods.

An example of my tree:

State
--City1
----County1
-----Neighborhood1
-------CP1
--City1
----County1
-----Neighborhood2
-------CP1
--City1
----County1
-----Neighborhood3
-------CP1
--City1
----County1
-----Neighborhood4
-------CP2
etc.

I think it's caused by the way taxonomy.module returns all the parents for a specific leaf term. When pathauto.module tries to get the parents for CP1 through taxonomy_get_parents_all, taxonomy.module gives back an array that lists all parents but WITHOUT explicit hierarchy, and Pathauto doesn't calculate the term's lineage, so the array contains ALL the parents without distinction.
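A small model of that flattening, run on the example taxonomy from the original report, reproduces the reported path. It is a simplified sketch of the behaviour described above, not a copy of taxonomy.module: `flat_parents_all()` and the `$parents_of` array are hypothetical stand-ins, and the parents of each term are assumed to be listed in weight order.

```php
<?php
// Toy copy of the example taxonomy: term => parents, ordered by weight.
$parents_of = array(
  'A' => array(),
  'B' => array('A'),
  'C' => array('A'),
  'D' => array('A'),
  'E' => array('C', 'D'),
);

// Walk upwards and append every parent found to one flat list,
// without recording which child each parent belongs to.
function flat_parents_all($term, $parents_of) {
  $list = array($term);
  $n = 0;
  while ($n < count($list)) {
    foreach ($parents_of[$list[$n]] as $parent) {
      $list[] = $parent;
    }
    $n++;
  }
  return $list;
}

// Reversing the flat list and joining it is what yields the odd alias.
echo implode('/', array_reverse(flat_parents_all('E', $parents_of))), "\n";
// A/A/D/C/E
```

The shared ancestor A is appended once for C and once for D, and the two branches end up interleaved in a single path.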

I think the solution here is to calculate the term's lineage. I tried to patch pathauto.module to accomplish that, but because of the (work) time needed I left it behind and wrote my own code to create the aliases for my particular taxonomy tree.

Here is my code:

<?php
// Load the Drupal environment (run this script from the Drupal root).
require_once './includes/bootstrap.inc';

drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

module_load_all();

//Recursive function to calculate multiple lineages from certain term
function get_term_lineages($tid)
{
	$lineages = array();
	
	//query taken from lineage.module
	$result = db_query("SELECT td.tid, td.name, td.weight, th.parent FROM {term_hierarchy} th LEFT JOIN {term_data} td ON td.tid = th.tid WHERE td.tid = '%d'", $tid);
	while ($term = db_fetch_object($result)) {
		$parents = get_term_lineages($term->parent);
		if (!empty($parents))	{
			$term->parents = $parents;
		}
		unset($term->parent);
		$lineages[] = $term;
	}
	return $lineages;
}

//This function starts the recursive calls to _get_term_paths and cleans the array returned
function get_term_paths($lineages)
{
	$paths=array();
	$data =_get_term_paths($lineages);
	foreach ($data as $item) {
		if(is_string($item))
		{
			$paths[]=$item;
		}
	}
	
	/* This block removes duplicated segments from the generated paths.
	One example alias is: jalisco/guadalajara/guadalajara/tetlan/44820
	Here we could delete the second "guadalajara", but that would break
	the alias structure, which is as follows:
	
	state/county/city/neighborhood/postal-code
	or
	state/county/neighborhood/postal-code
	
	So I commented it out.
	
	foreach($paths as &$path)
	{
		$exploded=explode('/',$path);
		
		$path=implode('/',array_unique($exploded));
	}*/
	
	return $paths;
}

//Recursive function to construct path aliases. It goes through lineages tree getting term name and cleaning it to be used as alias.
//FUNCTION NOT EXTENSIVELY TESTED!!!!! so I don't really know if it works on all scenarios
function _get_term_paths($lineages)
{
	$paths=array();
	foreach ($lineages as $lineage) {
		if (!isset($lineage->parents)) {
			return clean_term_name($lineage->name);
		}
		else {
			$data = _get_term_paths($lineage->parents);
			if (is_array($data)) {
				$paths[] = $data[0] . '/' . clean_term_name($lineage->name);
				$paths[] = $data;
			}
			else if (is_string($data)) {
				$name=str_replace(' ','-',clean_term_name($lineage->name));
				$paths[] = $data . '/' . $name;
			}
		}
	}
	return $paths;
}

//As its name says: cleans a term name so it can be used in an alias.
function clean_term_name($name)
{
	return str_replace(' ','-',strtolower(pathauto_cleanstring($name)));
}

//including Pathauto requirements
_pathauto_include();

//this is my vocabulary ID
$vid=3;

//getting all terms from my vocabulary
$result = db_query("SELECT * FROM {term_data} td WHERE td.vid = '%d'", $vid);

while ($term = db_fetch_object($result))	
{
	//if is a postal code (postal codes start at 20000 here in Mexico)
	if (intval($term->name)>20000) {
		$lineages=get_term_lineages($term->tid);
		$aliases=get_term_paths($lineages);
		foreach($aliases as $alias)
		{
			path_set_alias("taxonomy/term/$term->tid", $alias);
		}
		print "$term->tid\n";
	}
}

Maguar’s picture

Version: 5.x-2.3 » 6.x-1.0
Issue tags: +patha, +multiple parents

Please help solve this problem with multiple parents and Pathauto! Thank you!

Dave Reid’s picture

Version: 6.x-1.0 » 6.x-1.x-dev
Status: Postponed (maintainer needs more info) » Active

asb’s picture

Priority: Normal » Major

According to #915810: Multiple taxonomy terms, behavior in old version vs new, pulling lightest weight parent, Pathauto is supposed to "pull the lightest weight parent" if a term has multiple parents. At least that was the documented behaviour in 5.x. However, with the current release 6.x-1.5 (marked as "recommended") plus Token 6.x-1.16, I get a totally different behaviour. Simple example for illustration:

Taxonomy terms:

  Länder (Countries)
    -- Vietnam
      -- Hanoi
        -- Literaturtempel
  Reisen (Travels)
    --  Vietnam (2011)
        -- Vietnam (2011-08-02)
          -- Literaturtempel

The term "Literaturtempel" has two parents, "Hanoi", and "Vietnam (2011-08-02)". The resulting path is:

./reisen/laender/vietnam_2011/vietnam/vietnam_2011_08_02/hanoi/literaturtempel

This takes the first parent first, then adds the second parent after it, and finally adds the term. This totally breaks the logic behind the hierarchical structure and "invents" very long URL aliases that are unreadable. Another side effect is that semantic paths are cut off because they become too long, resulting in path stubs ending with incrementing numbers. Totally broken and very harmful for SEO because the paths are simply rubbish (and it gets worse if a third parent exists, if a subterm also has multiple parents, etc.).
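For comparison, the "pull the lightest weight parent" rule can be sketched as a walk that picks a single parent at each level. This is only an illustration: the post does not state the term weights, so the ones below are invented, and `lightest_parent_path()` is a hypothetical helper rather than Pathauto or Token code.

```php
<?php
// term => array(parent => weight of that parent).
// Weights are invented for illustration; the post does not give them.
$parents_of = array(
  'Länder' => array(),
  'Reisen' => array(),
  'Vietnam' => array('Länder' => 0),
  'Hanoi' => array('Vietnam' => 0),
  'Vietnam (2011)' => array('Reisen' => 0),
  'Vietnam (2011-08-02)' => array('Vietnam (2011)' => 0),
  'Literaturtempel' => array('Hanoi' => 0, 'Vietnam (2011-08-02)' => 1),
);

// Build one path by following only the lightest parent at each level.
function lightest_parent_path($term, $parents_of) {
  $segments = array($term);
  while (!empty($parents_of[$term])) {
    $best = null;
    foreach ($parents_of[$term] as $parent => $weight) {
      // On a tie, the parent listed first wins.
      if ($best === null || $weight < $parents_of[$term][$best]) {
        $best = $parent;
      }
    }
    array_unshift($segments, $best);
    $term = $best;
  }
  return implode('/', $segments);
}

echo lightest_parent_path('Literaturtempel', $parents_of), "\n";
// Länder/Vietnam/Hanoi/Literaturtempel
```

With these assumed weights the result stays inside one hierarchy instead of mixing the two.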

Path pattern at ./admin/build/path/pathauto for this taxonomy term: [catpath-raw].

Since this also affects the breadcrumb path, there might be other Drupal components involved. This is the breadcrumb path I'm getting:

Startseite » Bildergalerien » Reisen » Länder » Vietnam (2011) » Vietnam » Vietnam (2011-08-02) » Hanoi

This is borked in the same way as the URL alias.

From the user's point of view, the sanest way to deal with this would be to create multiple URL aliases (one for each hierarchy) and to let something like Globalredirect deal with the upcoming problems of duplicate content.

After browsing through the issue queue for a while, I hope that this is the right issue. I believe this is a major issue since it simply breaks all human-readable logic in URL paths for taxonomy terms with multiple parents. If you're using this mechanism to build image galleries, the URL structure becomes totally useless.

Live examples, where the key URL part - the actual term name - is cut off:

http://taxidi.org/reisen/reisen/kambodscha_2011/laender/kambodscha_2011/... (it's supposed to be "Bayon")

http://taxidi.org/reisen/kambodscha_2011/laender/kambodscha_2011/kambods... (it's supposed to be "Pre Rup")

http://taxidi.org/reisen/kambodscha_2011/laender/kambodscha_2011/kambods... (it's supposed to be "Ta Prohm")

Thank you very much for looking into this issue!
-asb

kenorb’s picture

Issue summary: View changes
Status: Active » Closed (won't fix)

Issue too old, therefore not active.