It appears that the list of related taxonomy links is generated randomly from any node that shares at least one taxonomy term. This often leads to poor results.

For example: a story "Barry Bonds Indicted" might get a related link of "Jeff Gordon Wins Daytona 500" because they're both filed under category "Sports." Yet there may be other, more related nodes that share multiple taxonomy terms with this story because they have been tagged as "baseball," "steroids," and "Barry Bonds."

My feature request: eliminate the random aspect and use a simple rule to determine the selection and order of related links:

1. Multiple taxonomy matches get higher ranking
2. If the number of taxonomy matches is equal, the more recent link gets higher ranking

This way, other stories about Barry Bonds and steroids would be at the top of the related links list, while a story about Bonds' performance in a game would get a lower placement, and a story about NASCAR would probably rank too low to show up on most lists.
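
To make the two rules concrete, here is a minimal sketch in Python rather than the module's own code. The `Node` class, the `related_links` function and the `limit` parameter are all hypothetical names made up for illustration; the only thing taken from the request is the ordering: more shared terms first, then the more recent node.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a node: a title, its taxonomy terms, a date."""
    title: str
    terms: frozenset
    created: date


def related_links(current, candidates, limit=5):
    """Rank candidates by shared-term count, breaking ties by recency."""
    scored = []
    for node in candidates:
        if node is current:
            continue  # a node is never related to itself
        matches = len(current.terms & node.terms)
        if matches == 0:
            continue  # no shared term, so not a candidate at all
        scored.append((matches, node.created, node))
    # Rule 1: more matches rank higher. Rule 2: on a tie, newer ranks higher.
    scored.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [node for _, _, node in scored[:limit]]
```

With a small `limit`, a node that shares only "Sports" with the Bonds story falls off the end of the list whenever enough nodes share two or more terms.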

Comments

TomArah’s picture

I very much like this idea. I also think there should be some mechanism to take term weights into account and, ideally, a simple way to exclude matches by vocabulary and to limit matches to the current node type.

colan’s picture

In my version, I've dealt with some of these issues:

  1. Taxonomy links (or any other, for that matter) are no longer listed in random order. They are sorted by weight (of the link type, in this case "Taxonomy"), then by link type, and finally alphanumerically (within the sublist).
  2. Taxonomy vocabularies can be restricted. I've added a checkbox for each available vocabulary, selected by default. Simply uncheck any unnecessary ones.
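
As a rough illustration of the two points above (a sketch, not the actual patch), the sort order and the vocabulary restriction could look like this in Python. The dictionary keys, both function names, and the choice that lower weights sort first are assumptions made for the example.

```python
def sort_links(links, type_weights):
    """Order links by link-type weight, then link type, then title.

    Assumes lower weights come first and that a link type with no
    configured weight counts as 0. Titles are compared case-insensitively.
    """
    return sorted(
        links,
        key=lambda link: (
            type_weights.get(link["type"], 0),
            link["type"],
            link["title"].casefold(),
        ),
    )


def restrict_terms(terms, enabled_vocabularies):
    """Keep only the terms whose vocabulary is still checked."""
    return [t for t in terms if t["vocabulary"] in enabled_vocabularies]
```

Unchecking a vocabulary then amounts to leaving it out of `enabled_vocabularies` before any matching is done.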

Hopefully these are sufficient for now. Thoughts?

TomArah’s picture

Fantastic. The ability to restrict to certain vocabularies in particular is a major advance.

I'm still trying it out, but a couple of points occur to me. Sorting by name is not always going to be desirable. Sorting by date would presumably be easy to offer, and sorting by perceived relevance (i.e. weight of terms or number of matches) would be most useful of all, though I can see that implementing it would be more complicated.

And a small bug: after switching on parsed links, I can't now switch it off.

Thanks for all the work.

Zen’s picture

The random generation of related links was due to my particular use case. I would prefer retaining this feature as an option (and hopefully improving it) and introducing "weighted" links as another (default) option.

Ideally, the weighting of a related link would involve all of the following criteria in an appropriate user-defined order:

  • Taxonomy terms.. as mentioned in the first post.
  • User ID.. nodes created by the same user are more likely to be related.
  • Search index.. make use of the search module's (if enabled) indexing and scoring system.
  • Node ID / date.. avoid stale data by preferring newer nodes.
  • Node type..
    • Due to the above parameters, the block will cease to be a "taxonomy" block as such.
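
One way to read "user-defined order" is as a lexicographic sort whose key is built from whichever criteria the user lists, in the order listed. The sketch below is an illustration under that assumption, not a design for the module: the `CRITERIA` table, the `rank` function and the dictionary fields are hypothetical, and the search-index criterion is left out because its scoring is not described here.

```python
from datetime import date

# Each criterion maps (current node, candidate) to a number; higher is better.
CRITERIA = {
    "taxonomy": lambda cur, n: len(cur["terms"] & n["terms"]),
    "author": lambda cur, n: 1 if cur["uid"] == n["uid"] else 0,
    "recency": lambda cur, n: n["created"].toordinal(),
    "type": lambda cur, n: 1 if cur["type"] == n["type"] else 0,
}


def rank(current, candidates, order):
    """Sort candidates by the named criteria, earliest in `order` first."""
    def key(node):
        return tuple(CRITERIA[name](current, node) for name in order)
    return sorted(candidates, key=key, reverse=True)
```

Calling `rank(current, candidates, ["taxonomy", "recency"])` reproduces the ordering from the first post, while putting `"author"` first would favour nodes by the same user.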

      -K

Zen’s picture

Status: Active » Closed (fixed)

Closed in favour of http://drupal.org/node/91543 .