While tracing through the search indexer to see if there was a way to make it faster with large numbers of nodes, I noticed by inspection that if there is a database failure while cron is running, search_update_totals() can fail to keep {search_total} up to date properly.

Consider: node_update_index() is running. Some of the _node_index_node() (node.module line 1787) calls have completed, and some have not. Each of the completed _node_index_node() calls may have modified {search_index} and added some items to search_dirty() (search.module line 577).

Now the database encounters an error, and future database queries for that run of cron.php all fail. I've seen this happen often on shared hosting providers.

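PHP still attempts to run search_update_totals() at this point, because it is registered as a shutdown function (search.module line 269). However, search_update_totals() makes database calls of its own, and if those calls fail (perhaps because the connection to the database has been lost), {search_total} is not updated. The list of changed words kept by search_dirty() lives only in memory for that request, so a later cron run has no record that those totals still need recomputing.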
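To make the sequence concrete, here is a minimal simulation of it. It is written in Python rather than PHP and is not Drupal code: `FlakyDatabase`, `CronRun` and `simulate_failed_cron` are hypothetical names invented for the sketch, and the "total" is just a plain sum of scores. The only thing it is meant to show is that the index rows survive the failure while the in-memory dirty list does not.

```python
class FlakyDatabase:
    """In-memory stand-in for the database; every call fails once it is down."""

    def __init__(self):
        self.search_index = []   # rows of (word, sid, score)
        self.search_total = {}   # word -> summed score
        self.available = True

    def _require_connection(self):
        if not self.available:
            raise ConnectionError("database unavailable")

    def insert_index_row(self, word, sid, score):
        self._require_connection()
        self.search_index.append((word, sid, score))

    def sum_scores(self, word):
        self._require_connection()
        return sum(score for w, _sid, score in self.search_index if w == word)

    def write_total(self, word, total):
        self._require_connection()
        self.search_total[word] = total


class CronRun:
    """One request; the dirty-word list lives only in this object's memory."""

    def __init__(self, db):
        self.db = db
        self.dirty = set()

    def index_node(self, sid, word_scores):
        for word, score in word_scores.items():
            self.db.insert_index_row(word, sid, score)
            self.dirty.add(word)

    def update_totals(self):
        """Shutdown step: recompute totals for dirty words, ignoring failures."""
        for word in sorted(self.dirty):
            try:
                self.db.write_total(word, self.db.sum_scores(word))
            except ConnectionError:
                pass  # the query fails and the total is left stale
        self.dirty.clear()  # the list is gone when the request ends


def simulate_failed_cron():
    db = FlakyDatabase()

    # First cron run: everything works, totals match the index.
    first = CronRun(db)
    first.index_node(1, {"drupal": 2.0})
    first.update_totals()

    # Second cron run: indexing succeeds, then the database goes away
    # before the shutdown step runs.
    second = CronRun(db)
    second.index_node(2, {"drupal": 3.0, "cron": 1.0})
    db.available = False
    second.update_totals()

    # Third cron run: database is back, but nothing is marked dirty,
    # so the stale totals are never repaired.
    db.available = True
    third = CronRun(db)
    third.update_totals()
    return db
```

After `simulate_failed_cron()` the index holds both nodes' rows, but the stored total for "drupal" still reflects only the first node and "cron" has no total at all.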

A {search_total} that is not up to date can:

* cause search hits for "search items" which do not contain the search keys
* cause search misses for "search items" which _do_ contain the search keys
* cause errors in the sort order of results
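The sort-order effect can be illustrated with a small sketch, again in Python and not taken from Drupal. It assumes, from my reading of search.module, that each word's stored value in {search_total} is log10(1 + 1/max(1, total)) and that an item's rank is the sum of its per-word index score times that value; treat both as assumptions. The item names, words and numbers are made up.

```python
import math


def word_weight(total):
    """Assumed weighting: rarer words (small totals) count for more."""
    return math.log10(1 + 1 / max(1, total))


def rank(items, totals):
    """Return item names ordered by descending score for the given totals."""
    scores = {
        name: sum(score * word_weight(totals[word])
                  for word, score in words.items())
        for name, words in items.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Per-item index scores for the two search keys (hypothetical data).
items = {
    "X": {"alpha": 5, "beta": 1},
    "Y": {"alpha": 1, "beta": 3},
}

# "alpha" has become common since the totals were last written successfully.
fresh_totals = {"alpha": 100, "beta": 2}
stale_totals = {"alpha": 2, "beta": 2}

fresh_order = rank(items, fresh_totals)
stale_order = rank(items, stale_totals)
```

With the fresh totals "Y" ranks first because it scores higher on the rare word; with the stale totals "alpha" is still treated as rare and "X" moves to the top.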

Comments

jhodgdon

Version: 6.4 » 7.x-dev

This needs to be fixed in Drupal 7 first, then back-ported to Drupal 6.

pavelsof

Priority: Minor » Normal

I have also seen this problem happen. It resulted in a huge database and a website being shut down by the hosting provider. In other words, this bug can make Drupal unusable.

_sparrow_

+1

jhodgdon

Status: Active » Closed (duplicate)

I think the database problems are caused by
#2032851: Placeholder count limit exceeded in search_update_totals() during cron run
which makes the tables get out of sync. So I believe this issue is a duplicate of that one.

There are probably other ways for the search index to get corrupted by database problems. The only way to recover from those would be to clear the index out completely and start over, and there is already an issue for that:
#326062: Add clear search index functionality

So I think the cause is covered by the first issue and the fix by the second, and therefore we don't need this issue to stay open.