With Mysql >= 5.0.13 search index process does not work correctly.
The reason is a change in GREATEST mysql function. As stated on official Mysql site:
"Before MySQL 5.0.13, GREATEST() returns NULL only if all arguments are NULL. As of 5.0.13, it returns NULL if any argument is NULL."
(from http://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html)
During the search indexing process the node_update_index() function in node.module launch this query:
$result = db_query_range('SELECT GREATEST(c.last_comment_timestamp, n.changed) as last_change, n.nid FROM {node} n LEFT JOIN {node_comment_statistics} c ON n.nid = c.nid WHERE n.status = 1 AND ((GREATEST(n.changed, c.last_comment_timestamp) = %d AND n.nid > %d) OR (n.changed > %d OR c.last_comment_timestamp > %d)) ORDER BY GREATEST(n.changed, c.last_comment_timestamp) ASC, n.nid ASC', $last, $last_nid, $last, $last, $last, 0, $limit);
If c.last_comment_timestamp is NULL the function GREATEST(c.last_comment_timestamp, n.changed) should return n.changed. But with the new GRATEST behavious it returns NULL.
The result is that the node will be skipped (and if more than $limit nodes are in that situation NO NODES WILL BE INDEXED at all).
The query should be changed in this way:
$result = db_query_range('SELECT GREATEST(if(c.last_comment_timestamp is NULL, 0, c.last_comment_timestamp), n.changed) as last_change, n.nid FROM {node} n LEFT JOIN {node_comment_statistics} c ON n.nid = c.nid WHERE n.status = 1 AND ((GREATEST(n.changed, if(c.last_comment_timestamp is NULL, 0, c.last_comment_timestamp)) = %d AND n.nid > %d) OR (n.changed > %d OR c.last_comment_timestamp > %d)) ORDER BY GREATEST(n.changed, if(c.last_comment_timestamp is NULL, 0, c.last_comment_timestamp)) ASC, n.nid ASC', $last, $last_nid, $last, $last, $last, 0, $limit);
| Comment | File | Size | Author |
|---|---|---|---|
| #3 | greatest_5013.patch | 1.45 KB | chx |
Comments
Comment #1
chx commentedComment #2
chx commentedI wanted to change this... mmmm
Comment #3
chx commentedComment #4
profix898 commentedI just tried to review this patch using MySQL 5.0.24 (PHP 5.1.6 on WinXPsp2). But actually indexing works fine even without this patch! Are there any special settings/modules/... required to reproduce this bug?
Comment #5
chx commentedI think you need comment module disabled and a node added in this state. That node won't be indexed.
Comment #6
profix898 commentedThanks, I didnt realize that comment.module is enabled by default ;)
To review I added 95 random nodes, then disabled the comment.module and added another 5 with known (searchable) content. On the search settings page AFTER the cron run the 'Indexing status' was 95%, what (I think) means the 95 random nodes were indexed but the other 5 were not. But actually performing a search (for a word in the known nodes) I got results even from the 5 (not indexed?) nodes.
With this patch the index status AFTER cron is 100% and all content seems indexed.
BUT the patch doesnt work with PostgreSQL!
I'm not a PostgreSQL/SQL expert, so I dont know there is a solution that works with every db. Maybe we need a
switch() {}as in hook_install to get a db-specific solution!?Comment #7
chx commentedYour pgsql install is wrong, there was a time when a bug in system.install caused the functions not install.. Try a fresh setup.
Comment #8
profix898 commented@chx: You are right, I tried with latest code but using an existing db. The
FUNCTION "if"stuff is only added in hook_install, but not in a hook_update_x, what means it is not available after an upgrade to Drupal 5, right?However the patch works nicely with a clean install on both MySQL and PostgreSQL. RTBC?
Comment #9
sammys commentedthe if function was already added to a 4.7 database in database.pgsql and will still be available after an upgrade.
Comment #10
dries commentedAnother solution might be to make last_comment_timestamp default to '0' instead of NULL. Not sure if that has any implications, but from a code's point of view, it might be a tad more elegant.
With that in mind, I decided to look at the database definition and it reads like this:
Wouldn't that mean that GREATEST() can never return NULL? Both changed and last_comment_timestamp should be non-NULL ...
Comment #11
dries commentedStupid me. The NULL comes from the LEFT JOIN, of course.
Comment #12
dries commentedLooks like this change actually improves compatibility with other databases: see http://bugs.mysql.com/bug.php?id=15610.
Reviewed the code and it looks good to me.
Comment #13
dries commentedCommitted to DRUPAL-4-7 and CVS HEAD. Thanks.
Comment #14
dries commentedComment #15
(not verified) commented