Hiya,

There was an issue on my site whereby not all nodes on my site were being index. I was finding prior nodes that had just been indexed were being reindexed (they had not been updated as this was on a dev site where I was testing this out) and suddenly the entire site would be index. When I would check the index stats, I would see 430 nodes had been indexed (the site has 1700ish nodes). When I checked through the following query (line 51 of lucenapi_node.index.inc):

  $sql = 'SELECT GREATEST(IF(c.last_comment_timestamp IS NULL, 0, c.last_comment_timestamp), n.changed) AS last_change, n.nid'
       .' FROM {node} n'
       .' LEFT JOIN {node_comment_statistics} c ON n.nid = c.nid'
       .' WHERE n.status = 1'
       .'   AND ('
       .'     (GREATEST(n.changed, c.last_comment_timestamp) = %d AND n.nid > %d) OR'
       .'     (n.changed > %d OR c.last_comment_timestamp > %d)'
       .'   )'
       .' ORDER BY GREATEST(c.last_comment_timestamp, n.changed) ASC, n.nid ASC';

I found that running the query by itself (set timestamp and nid to 0 in command line query), I would get the nodes not in the sorted order. Then I realized that the GREATEST(c.last_comment_timestamp, n.changed) was actually getting the value null (my site has commenting disabled and I found doing this query as part of the select retrieval showed null as opposed to the max). Changing the query to:

  $sql = 'SELECT GREATEST(IF(c.last_comment_timestamp IS NULL, 0, c.last_comment_timestamp), n.changed) AS last_change, n.nid'
       .' FROM {node} n'
       .' LEFT JOIN {node_comment_statistics} c ON n.nid = c.nid'
       .' WHERE n.status = 1'
       .'   AND ('
       .'     (GREATEST(n.changed, c.last_comment_timestamp) = %d AND n.nid > %d) OR'
       .'     (n.changed > %d OR c.last_comment_timestamp > %d)'
       .'   )'
       .' ORDER BY last_change ASC, n.nid ASC';

seems to fix the issue. This probably needs more testing (and to make sure the variables around this are also alright) but it seems to work so far.

EDIT: I just realized any calls like this for greatest of n.changed and c.last_comment_timestamp aside from the select field list will likely need to be changed to last_change. I have attached a patch that I believe addresses all of this.

Comments

btmash’s picture

Status: Active » Needs review
StatusFileSize
new2.29 KB

Ok, this did not work as well as I thought it would have - attaching new patch with updated queries.

cpliakas’s picture

Issue tags: +6.x-2.5

BTMash,

Thanks for your continued work on this. The query is a relic from D5, so I am not surprised there are issues. Will test out the patch when I have some free time. Also, would appreciate other people's help in testing to make sure it works in all use cases.

Thanks,
Chris

organicwire’s picture

Status: Needs review » Reviewed & tested by the community

Hi BTMash,

I tested your patch successfully. I had the same issue and I'm happy that I can solve it using your patch. My cron runs are now significantly slower though. I suppose this is due to the fact that no nodes are ommitted anymore.

Best

cpliakas’s picture

Thanks for testing!

cpliakas’s picture

Status: Reviewed & tested by the community » Closed (won't fix)