Search performance improvement: Extra join in do_search coming from node_search

robertDouglass - May 11, 2008 - 22:41
Project:Drupal
Version:5.x-dev
Component:node.module
Category:bug report
Priority:normal
Assigned:Unassigned
Status:patch (to be ported)
Description

LIVE FROM THE MINNESOTA SEARCH SPRINT

The old queries:

SELECT SUM(i.score * t.count) AS score FROM search_index i INNER JOIN search_total t
  ON i.word = t.word
  INNER JOIN node n ON n.nid = i.sid
  INNER JOIN users u ON n.uid = u.uid
  WHERE n.status = 1
    AND (i.word = 'drupal' OR i.word = 'rocks')
    AND i.type = 'node' GROUP BY i.type, i.sid
  HAVING COUNT(*) >= 2 ORDER BY score DESC LIMIT 0, 1;           


SELECT i.type, i.sid,
  10 * (3.38941376736 * SUM(i.score * t.count)) +
  5 * POW(2, (GREATEST(MAX(n.created), MAX(n.changed), MAX(c.last_comment_timestamp)) - 1038678891) * 6.43e-8) +
  1 * (2.0 - 2.0 / (1.0 + MAX(c.comment_count) * 0.00271002710027)) AS score
FROM search_index i
INNER JOIN search_total t ON i.word = t.word
INNER JOIN node n ON n.nid = i.sid
INNER JOIN users u ON n.uid = u.uid
LEFT JOIN node_comment_statistics c ON c.nid = i.sid
WHERE n.status = 1
  AND (i.word = 'drupal' OR i.word = 'rocks')
  AND i.type = 'node'
GROUP BY i.type, i.sid HAVING COUNT(*) >= 2
ORDER BY score DESC LIMIT 0, 10

In both cases, the clause INNER JOIN users u ON n.uid = u.uid is superfluous. Taking it out produces identical results and executes faster.

AttachmentSize
extra-join.patch910 bytes

#1

David Lesieur - May 11, 2008 - 22:54

I have also taken this join out in the Faceted Search module, with no known adverse effects.

#2

douggreen - May 11, 2008 - 23:09
Status:needs review» reviewed & tested by the community

I reviewed and agree that this is superfluous. None of the joins use user columns. I checked all of the {node}, {search_index} clauses, the clauses from the node-ranks such as {node_comment_statistics} and {node_count}, and the clauses from the advanced search form such as {term_node}. It passes all existing tests plus the new tests in #145242.

Great catch Robert! Marking as RTBC!

#3

douggreen - May 11, 2008 - 23:14

I've also reviewed 5.x and 6.x and think that this can be backported, given that a few more eyes look at it and do some testing.

#4

earnie - May 12, 2008 - 15:07

I gave the patch a go on my 5.3 version site. Looks good to me.

#5

Dries - May 13, 2008 - 17:41
Version:7.x-dev» 6.x-dev

I've committed this to CVS HEAD. Thanks Robert. If Gabor wants to backport it to D6, he's welcome to do so.

#6

Gábor Hojtsy - May 19, 2008 - 08:08
Status:reviewed & tested by the community» needs work

Does not apply against Drupal 6, needs a reroll.

#7

catch - May 19, 2008 - 12:04
Status:needs work» patch (to be ported)

Changing status.

#8

David Lesieur - May 19, 2008 - 16:12
Status:patch (to be ported)» needs review

Re-rolled for D6.

AttachmentSize
257279_extra_join_d6.patch 930 bytes

#9

robertDouglass - May 19, 2008 - 16:18
Status:needs review» patch (to be ported)
AttachmentSize
extra-join.patch 923 bytes

#10

robertDouglass - May 19, 2008 - 16:18

ha. David and I were working on it simultaneously.

#11

David Lesieur - May 19, 2008 - 16:38
Status:patch (to be ported)» needs review

I'll re-join the Search Sprint chat room... :)

Both patches are identical.

#12

douggreen - May 20, 2008 - 12:39
Status:needs review» reviewed & tested by the community

I confirmed that the patch applies to the DRUPAL-6 tag and I tested search on the 100k node database and the patch returns the expected results.

#13

Gábor Hojtsy - June 24, 2008 - 14:17
Status:reviewed & tested by the community» fixed

Thanks, committed to D6.

#14

robertDouglass - June 25, 2008 - 09:39
Version:6.x-dev» 5.x-dev
Status:fixed» patch (to be ported)

Given that performance for search on D5 is still a hot topic this should get ported further.

 
 

Drupal is a registered trademark of Dries Buytaert.