This patch builds on Doug Green's patch #145242, which adds support for hook_ranking. It allows nodes that are linked to by other nodes to score higher than nodes with no links pointing to them. This requires a new table that keeps a count of the links each node has received; the table is updated on cron in the search_index function.

Comments

BlakeLucchesi

Status: Active » Needs review
Attached file (status: new, size: 4.46 KB)

This patch adds a ranking hook definition for the node module so that nodes with more inbound links can receive a higher relevancy factor.

Note: Because of the way the link finder works, you may need to run cron a couple of times to ensure that the links are counted. Links are first inserted into the table search_node_links, and each node that was linked to is then tagged for re-indexing. During this second indexing pass, the links are counted and the totals are stored in the table search_node_links_total.
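To illustrate why a second cron run can be needed, here is a minimal sketch in plain PHP. It is not the patch's code: the arrays are hypothetical in-memory stand-ins for the two tables, and index_node(), run_cron(), and the link pattern are names and assumptions made up for this example.

```php
<?php
// Hypothetical in-memory stand-ins for the two tables named in the note:
// $links plays the role of search_node_links, $totals of search_node_links_total.

function index_node(int $nid, array $bodies, array &$links, array &$totals, array &$dirty): void {
  // Forget the links this node recorded last time, then re-scan its body.
  $links = array_values(array_filter($links, fn($l) => $l['from'] !== $nid));
  preg_match_all('#href="/node/(\d+)"#', $bodies[$nid], $matches);
  foreach (array_unique($matches[1]) as $target) {
    $links[] = ['from' => $nid, 'to' => (int) $target];
    // The target's stored total is now stale, so tag it for re-indexing.
    $dirty[(int) $target] = TRUE;
  }
  // Store the inbound-link total for the node being indexed right now.
  $totals[$nid] = count(array_filter($links, fn($l) => $l['to'] === $nid));
  unset($dirty[$nid]);
}

function run_cron(array $bodies, array &$links, array &$totals, array &$dirty): void {
  // Work from a snapshot of the queue: a node tagged after it was already
  // indexed in this run stays dirty and waits for the next run.
  foreach (array_keys($dirty) as $nid) {
    index_node($nid, $bodies, $links, $totals, $dirty);
  }
}

// Nodes 2 and 3 both link to node 1, but node 1 is indexed first.
$bodies = [
  1 => 'No links here.',
  2 => '<a href="/node/1">one</a>',
  3 => '<a href="/node/1">one again</a>',
];
$links = [];
$totals = [];
$dirty = [1 => TRUE, 2 => TRUE, 3 => TRUE];

run_cron($bodies, $links, $totals, $dirty);
$after_first_run = $totals[1];

run_cron($bodies, $links, $totals, $dirty);
$after_second_run = $totals[1];
```

In this sketch node 1 is indexed before the nodes that link to it, so its total is still 0 after the first run and only becomes 2 once the second run re-indexes it.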

douggreen

Status: Needs review » Needs work

I don't like that there are up to 3 extra queries on every node that gets indexed, especially if this ranking isn't used.

I'd have a look at putting something like this in search_update_totals:

if (variable_get('node_rank_link', 0)) {
  // search_sid_dirty() is hypothetical: it would return the sids touched in this indexing run.
  $dirty = search_sid_dirty();
  if (!empty($dirty)) {
    $placeholders = db_placeholders($dirty);
    // Drop the stale totals, then recount only the dirty nodes.
    db_query("DELETE FROM {search_node_link_totals} WHERE type = 'node' AND sid IN ($placeholders)", $dirty);
    db_query("INSERT INTO {search_node_link_totals} SELECT sid, type, COUNT(*) FROM {search_node_links} WHERE type = 'node' AND sid IN ($placeholders) GROUP BY sid, type", $dirty);
  }
}

The search_sid_dirty() function doesn't exist yet. I suspect it would be useful, so that the DELETE and INSERT only affect a dozen or so nodes instead of ALL of them.

Notice that I'm only doing the update if the node_rank_link is used. Because of this, you'll need to do two more queries when the node_rank_link gets set to something other than 0, maybe in a submit handler from a form_alter?

// Full rebuild: clear every total, then recount all node links.
db_query("DELETE FROM {search_node_link_totals}");
db_query("INSERT INTO {search_node_link_totals} SELECT sid, type, COUNT(*) FROM {search_node_links} WHERE type = 'node' GROUP BY sid, type");
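As a rough illustration of the two cases above (recounting only the dirty sids versus a full rebuild when the ranking is switched on), here is a sketch in plain PHP with arrays standing in for the tables. The function rebuild_link_totals() and the row layout are hypothetical. It assumes, as the queries above do, that grouping the link rows by sid gives the per-item count.

```php
<?php
// Hypothetical stand-in for the DELETE + INSERT ... SELECT COUNT(*) pair.
// $links: rows shaped like ['sid' => int], one row per recorded link.
// $totals: map of sid => stored link count.
// $dirty: list of sids to recount, or NULL for a full rebuild.
function rebuild_link_totals(array $links, array $totals, ?array $dirty = NULL): array {
  if ($dirty === NULL) {
    // Full rebuild: equivalent to DELETE without a WHERE clause.
    $totals = [];
  }
  else {
    // Partial update: drop only the stale rows.
    foreach ($dirty as $sid) {
      unset($totals[$sid]);
    }
  }
  // Recount, mirroring COUNT(*) ... GROUP BY sid over the selected rows.
  foreach ($links as $link) {
    $sid = $link['sid'];
    if ($dirty === NULL || in_array($sid, $dirty, TRUE)) {
      $totals[$sid] = ($totals[$sid] ?? 0) + 1;
    }
  }
  return $totals;
}

$links = [['sid' => 1], ['sid' => 1], ['sid' => 2]];

// Ranking just enabled: rebuild everything from scratch.
$full = rebuild_link_totals($links, []);

// Later cron run: only sids 1 and 3 changed, so only they are recounted.
$partial = rebuild_link_totals($links, [1 => 5, 2 => 1, 3 => 4], [1, 3]);
```

As with the SQL, a dirty sid that no longer has any link rows (sid 3 here) simply ends up with no row in the totals, so a ranking query would need to treat a missing row as zero.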

keith.smith

Minor, but at least one code comment does not end in a full stop (period).

keith.smith

I have no real comment on the patch regarding post #2; Doug is in a much better position to evaluate its merits.

However, the attached patch makes the following changes from the previous patch:

+ 'description' => t('The number of links that the searchable item has linking to it')
---
+ 'description' => t('The number of links that the searchable item has linking to it.')

+ 'description' => t('The number of links that the searchable item has linking to it')
---
+ 'description' => t('The number of links that the searchable item has linking to it.')

+ // Update node_links_totals count for this node
---
+ // Update node_links_totals count for this node.

cwgordon7

There is no attached patch in #4.

keith.smith

Hmmm. D.o eats another patch. It's on my office computer so I'll upload it tomorrow. Thanks for noticing!

keith.smith

Status: Needs work » Needs review
Attached file (status: new, size: 4.55 KB)

I believe that I meant to upload this patch (or did, and it went somewhere mysterious).

Anonymous

Status: Needs review » Needs work

The last submitted patch failed testing.

jhodgdon

Version: 7.x-dev » 8.x-dev

Bumping to 8.x at this point.

jhodgdon

Status: Needs work » Closed (won't fix)

Drupal 8 core search had a totally broken implementation of the node link checking stuff, so when we converted search to a plugin system, we stopped supporting node link checking. This would need to be done in a contrib module now. Sorry...