Update search index when adding tags

rcourtna - January 8, 2008 - 23:39
Project:Community Tags
Version:5.x-1.x-dev
Component:Code
Category:bug report
Priority:normal
Assigned:dropcube
Status:patch (code needs work)
Description

When users add tags to content, the search index is not updated, even after running cron.php. For the index to include the new tags, the node author has to resubmit the node edit form, which rather undermines the value of community_tags.

I was hoping to create a patch and contribute it back; however, I'm having a hard time grokking what I can call from within the module to force the node to be reindexed. Can someone provide some guidance?

#1

rcourtna - January 9, 2008 - 03:45
Status:active» patch (code needs review)

I've fixed this problem by implementing hook_update_index(). The following function can be put anywhere in community_tags.module. It determines which tags are new by comparing their dates against the last index date (node_cron_last), and then reindexes the affected nodes.

<?php
function community_tags_update_index() {
  // Last cron run.
  $last = variable_get('node_cron_last', 0);
  $limit = (int)variable_get('search_cron_limit', 100);

  $sql = "SELECT nid FROM community_tags WHERE date > %d";
  $result = db_query_range($sql, $last, 0, $limit);

  if ($result && db_num_rows($result) > 0) {
    while ($data = db_fetch_object($result)) {
      $node = node_load($data->nid);

      // Build the node body.
      $node = node_build_content($node, FALSE, FALSE);
      $node->body = drupal_render($node->content);

      // Allow modules to modify the fully-built node.
      node_invoke_nodeapi($node, 'alter');

      $text = '<h1>'. check_plain($node->title) .'</h1>'. $node->body;

      // Fetch extra data normally not visible.
      $extra = node_invoke_nodeapi($node, 'update index');
      foreach ($extra as $t) {
        $text .= $t;
      }

      // Update index.
      search_index($node->nid, 'node', $text);
    }
  }
}
?>

Hope this helps someone else and that we can get this committed.

#2

owahab - January 18, 2008 - 13:47
Assigned to:Anonymous» owahab
Status:patch (code needs review)» patch (reviewed & tested by the community)

I think this patch is good to go.

If no one raises an objection within the next week, it will be committed.

#3

owahab - January 25, 2008 - 17:38
Status:patch (reviewed & tested by the community)» fixed

#4

Anonymous (not verified) - February 8, 2008 - 17:41
Status:fixed» closed

Automatically closed -- issue fixed for two weeks with no activity.

#5

jaydub - February 19, 2008 - 10:54
Status:closed» active

This line:

$sql = "SELECT nid FROM community_tags WHERE date > %d";

needs to be changed to allow for different table name prefixes:

$sql = "SELECT nid FROM {community_tags} WHERE date > %d";
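
To illustrate why the curly braces matter, here is a minimal standalone sketch of the idea behind table prefixing. The function name `example_prefix_tables()` and the prefix `site1_` are made up for this illustration; Drupal performs a comparable substitution internally when a query is run, so module code only needs to write `{community_tags}`.

```php
<?php
// Hypothetical stand-in for the substitution Drupal applies to queries:
// "{" is replaced with the configured table prefix and "}" is dropped.
function example_prefix_tables($sql, $prefix) {
  return strtr($sql, array('{' => $prefix, '}' => ''));
}

// With braces, the table name picks up the prefix.
echo example_prefix_tables("SELECT nid FROM {community_tags} WHERE date > %d", 'site1_'), "\n";
// prints: SELECT nid FROM site1_community_tags WHERE date > %d

// Without braces, the table name is left untouched, so the query fails
// on any site that uses a table prefix.
echo example_prefix_tables("SELECT nid FROM community_tags WHERE date > %d", 'site1_'), "\n";
// prints: SELECT nid FROM community_tags WHERE date > %d
```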

#6

owahab - February 19, 2008 - 10:59
Status:active» fixed

Thanks jaydub.

#7

Anonymous (not verified) - March 4, 2008 - 11:04
Status:fixed» closed

Automatically closed -- issue fixed for two weeks with no activity.

#8

dropcube - July 18, 2008 - 04:49
Assigned to:owahab» dropcube
Status:closed» patch (code needs work)

I understand the need to re-index the node content after a user tags it; however, implementing hook_update_index() does not seem to be a good idea. The current implementation of community_tags_update_index() is nearly the same as node_update_index(). Updating the search index is the responsibility of the node and search modules. All we need to do is notify them that a node has been updated or, in the worst case, update the search index for a single node after it is tagged.

- The search module in 6.x has a function, search_touch_node(), which can be called after saving new community tags, so this may be the solution for 6.x.

- For 5.x, calling _node_index_node($node) may be an approach, but we need to evaluate the performance implications it may have.

Thoughts?

#9

dropcube - July 18, 2008 - 05:03

In fact, for 5.x an approach similar to search_touch_node() may work.

#10

dropcube - July 19, 2008 - 17:36

I have removed the hook_update_index() implementation from the 6.x branch; it now uses search_touch_node() as described in #8.

Nodes are marked for re-indexing after adding/removing tags and the tags are included/removed from the search index when the cron runs. So, it's working as expected.
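
As a rough sketch of that approach (not the code actually committed to the 6.x branch), marking a node for re-indexing could look like the following. The helper name `example_community_tags_mark_for_reindex()` is hypothetical; `search_touch_node()` is the Drupal 6 search module function mentioned in #8, and the `function_exists()` guard is an assumption added so the helper does nothing when the search module is not available.

```php
<?php
// Hypothetical helper: call this with the node ID after community tags
// have been added to or removed from a node.
function example_community_tags_mark_for_reindex($nid) {
  // search_touch_node() comes from the Drupal 6 search module. It only
  // flags the node; the actual re-indexing happens on the next cron run.
  if (function_exists('search_touch_node')) {
    search_touch_node($nid);
    return TRUE;
  }
  // Search module not loaded: nothing to mark.
  return FALSE;
}
```

The helper returns TRUE when the node was flagged and FALSE otherwise, so a caller can tell whether the search module was present.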

I still have to work out a similar approach for 5.x.

 
 
