Node Links Ranking Addition

BlakeLucchesi - May 11, 2008 - 18:19
Project: Drupal
Version: 7.x-dev
Component: search.module
Category: feature request
Priority: normal
Assigned: Unassigned
Status: needs work
Description

This patch builds on Doug Green's patch #145242, which adds support for hook_ranking. It allows nodes that are linked to by other nodes to score higher than nodes with no inbound links. This requires a new table that keeps a count of the links each node has received; the count is updated on cron in the search_index function.

#1

BlakeLucchesi - June 19, 2008 - 07:57
Status: active » needs review

This patch adds a ranking hook definition for the node module so that nodes with more inbound links can receive a higher relevancy factor.

Note: Because of the way the link finder works, you may need to run cron a couple of times to ensure that all links are counted. Links are first inserted into the table search_node_links, and each node that was linked to is then tagged for re-indexing. During this second indexing pass we count the number of links and store the result in the table search_node_links_total.
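
The two-pass behaviour described in the note can be illustrated with a small standalone sketch. This is not the patch's code: it uses Python with an in-memory SQLite database instead of Drupal's API, and the from_sid column, the /node/N link pattern, and all function names are assumptions made for illustration only.

```python
import re
import sqlite3

# Hypothetical link pattern: internal links of the form href="/node/123".
LINK_RE = re.compile(r'href="/node/(\d+)"')


def make_db():
    """Create the two tables named in the note (columns are assumed)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE search_node_links (sid INTEGER, type TEXT, from_sid INTEGER);
        CREATE TABLE search_node_links_total (sid INTEGER PRIMARY KEY, count INTEGER);
    """)
    return db


def index_node(db, nid, body):
    """Index one node and return the nodes that must be tagged for re-indexing."""
    old = {row[0] for row in db.execute(
        "SELECT sid FROM search_node_links WHERE from_sid = ?", (nid,))}
    new = {int(n) for n in LINK_RE.findall(body)}
    # Pass 1: record the outbound links found in this node's body.
    db.execute("DELETE FROM search_node_links WHERE from_sid = ?", (nid,))
    db.executemany(
        "INSERT INTO search_node_links (sid, type, from_sid) VALUES (?, 'node', ?)",
        [(target, nid) for target in new])
    # Pass 2 (for this node): count the inbound links recorded so far.
    (count,) = db.execute(
        "SELECT COUNT(*) FROM search_node_links WHERE type = 'node' AND sid = ?",
        (nid,)).fetchone()
    db.execute(
        "INSERT OR REPLACE INTO search_node_links_total (sid, count) VALUES (?, ?)",
        (nid, count))
    # Only targets whose inbound links changed need another indexing pass.
    return (old ^ new) - {nid}


def cron_run(db, nodes, dirty):
    """Index every dirty node; return the set of nodes tagged for the next run."""
    tagged = set()
    for nid in sorted(dirty):
        if nid in nodes:
            tagged |= index_node(db, nid, nodes[nid])
    return tagged


def link_total(db, nid):
    """Return the stored inbound link total for a node (0 if none stored)."""
    row = db.execute(
        "SELECT count FROM search_node_links_total WHERE sid = ?", (nid,)).fetchone()
    return row[0] if row else 0


# Sample content: nodes 1 and 3 both link to node 2.
NODES = {
    1: 'See <a href="/node/2">two</a>.',
    2: 'No links here.',
    3: 'Also see <a href="/node/2">two</a>.',
}
```

With this sample data, a first cron_run over all three nodes indexes node 2 before node 3's link has been recorded, so the stored total for node 2 is stale and node 2 is tagged again. A second cron_run over the tagged set brings the total up to date, which is why more than one cron run may be needed.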

Attachment: search_link_ranking.patch (4.46 KB)
Testbed results: search_link_ranking.patch failed (Failed to apply patch).

#2

douggreen - June 19, 2008 - 13:05
Status: needs review » needs work

I don't like that there are up to 3 extra queries on every node that gets indexed, especially if this ranking isn't used.

I'd have a look at putting something like this in search_update_totals:

if (variable_get('node_rank_link', 0)) {
  $dirty = search_sid_dirty();
  $placeholders = db_placeholders($dirty);
  db_query("DELETE FROM {search_node_link_totals} WHERE type = 'node' AND sid IN ($placeholders)", $dirty);
  db_query("INSERT INTO {search_node_link_totals} SELECT sid, type, COUNT(*) FROM {search_node_links} WHERE type = 'node' AND sid IN ($placeholders) GROUP BY sid, type", $dirty);
}

The search_sid_dirty() function doesn't exist yet. I suspect it will be useful, so that the DELETE and INSERT only affect a dozen nodes instead of ALL of them.

Notice that I'm only doing the update if node_rank_link is used. Because of this, you'll need to run two more queries when node_rank_link gets set to something other than 0, maybe in a submit handler added through a form_alter:

db_query("DELETE FROM {search_node_link_totals}");
db_query("INSERT INTO {search_node_link_totals} SELECT sid, type, COUNT(*) FROM {search_node_links} WHERE type = 'node' GROUP BY sid, type");
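
The incremental refresh and the full rebuild suggested above can be tried outside Drupal with a rough sketch like the following. It is only an illustration under stated assumptions: Python with in-memory SQLite stands in for db_query(), the table columns are guessed from the queries in this comment, and the function names are hypothetical. The list of dirty ids is passed in by the caller, since search_sid_dirty() does not exist.

```python
import sqlite3


def make_link_tables():
    """Create the two tables used by the queries above (columns are assumed)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE search_node_links (sid INTEGER, type TEXT);
        CREATE TABLE search_node_link_totals (sid INTEGER, type TEXT, count INTEGER);
    """)
    return db


def refresh_link_totals(db, dirty_sids):
    """Recompute totals only for the given ids, leaving all other rows alone."""
    dirty = list(dirty_sids)
    if not dirty:
        return
    # One "?" per id, the same idea as db_placeholders() in the snippet above.
    placeholders = ", ".join("?" for _ in dirty)
    db.execute(
        "DELETE FROM search_node_link_totals "
        f"WHERE type = 'node' AND sid IN ({placeholders})", dirty)
    db.execute(
        "INSERT INTO search_node_link_totals (sid, type, count) "
        "SELECT sid, type, COUNT(*) FROM search_node_links "
        f"WHERE type = 'node' AND sid IN ({placeholders}) "
        "GROUP BY sid, type", dirty)


def rebuild_link_totals(db):
    """Full rebuild, e.g. when the ranking is switched on for the first time."""
    db.execute("DELETE FROM search_node_link_totals")
    db.execute(
        "INSERT INTO search_node_link_totals (sid, type, count) "
        "SELECT sid, type, COUNT(*) FROM search_node_links "
        "WHERE type = 'node' GROUP BY sid, type")


def link_totals(db):
    """Return the stored totals as a {sid: count} dictionary."""
    return dict(db.execute("SELECT sid, count FROM search_node_link_totals"))
```

The point of the split is cost: the refresh touches only the rows for ids that changed since the last run, while the full rebuild scans the whole links table and is only needed when the totals table may be out of date as a whole.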

#3

keith.smith - June 22, 2008 - 02:28

Minor, but at least one code comment does not end in a full stop (period).

#4

keith.smith - June 23, 2008 - 18:04

I have no real comment on the patch regarding post #2; Doug is in a much better position to evaluate its merits.

However, the attached patch makes the following changes from the previous patch:

+ 'description' => t('The number of links that the searchable item has linking to it')
---
+ 'description' => t('The number of links that the searchable item has linking to it.')

+ 'description' => t('The number of links that the searchable item has linking to it')
---
+ 'description' => t('The number of links that the searchable item has linking to it.')

+ // Update node_links_totals count for this node
---
+ // Update node_links_totals count for this node.

#5

cwgordon7 - June 23, 2008 - 20:54

There is no attached patch in #4.

#6

keith.smith - June 24, 2008 - 00:30

Hmmm. D.o eats another patch. It's on my office computer so I'll upload it tomorrow. Thanks for noticing!

#7

keith.smith - June 24, 2008 - 14:55
Status: needs work » needs review

I believe that I meant to upload this patch (or did, and it went somewhere mysterious).

Attachment: search_link_ranking_2.patch (4.55 KB)
Testbed results: search_link_ranking_2.patch failed (Failed to apply patch).

#8

Anonymous (not verified) - November 11, 2008 - 08:05
Status: needs review » needs work

The last submitted patch failed testing.

 
 
