in a discussion about the load on the DB server that runs drupal.org [1], keiran mentioned:

"5 queries on CVS repositories table are slow."

there are no indexes at all in the cvs.install file. we can probably improve things a lot, simply by adding a few indexes to a few tables. i'm creating an issue so a) i don't forget and b) other folks can post patches here if they are inspired to work on this before i get the chance...

[1] http://lists.drupal.org/archives/infrastructure/2006-06/msg00374.html

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

dww’s picture

ok, whoops, i was mis-reading things. there are a bunch of indexes, after all, since "KEY" works as an index, too. ;)
however, there aren't quite the right keys for all tables, since we're still getting crappy, non-indexed performance on this:

SELECT DISTINCT m.*, r.*, u.name, u.uid FROM cvs_messages m INNER JOIN cvs_files f ON m.cid = f.cid INNER JOIN cvs_repositories r ON m.rid = r.rid INNER JOIN users u ON m.uid = u.uid ORDER BY m.created DESC LIMIT 0, 10

killes ran this via EXPLAIN on drupal.org just now, and got these results:

mysql> EXPLAIN SELECT DISTINCT m.*, r.*, u.name, u.uid FROM cvs_messages m INNER JOIN
    ->              cvs_files f ON m.cid = f.cid INNER JOIN cvs_repositories r ON m.rid = r.rid
    ->              INNER JOIN users u ON m.uid = u.uid ORDER BY m.created DESC LIMIT 0, 10;
+-------+--------+---------------------+---------+---------+-------+-------+---------------------------------+
| table | type   | possible_keys       | key     | key_len | ref   | rows  | Extra                           |
+-------+--------+---------------------+---------+---------+-------+-------+---------------------------------+
| m     | ALL    | PRIMARY,uid,cid,rid | NULL    |    NULL | NULL  | 36203 | Using temporary; Using filesort |
| f     | ref    | cid                 | cid     |       4 | m.cid |     2 | Using index                     |
| r     | ALL    | PRIMARY             | NULL    |    NULL | NULL  |     2 | Using where                     |
| u     | eq_ref | PRIMARY             | PRIMARY |       4 | m.uid |     1 |                                 |
+-------+--------+---------------------+---------+---------+-------+-------+---------------------------------+
4 rows in set (0.02 sec)

so, we're still searching through *ALL* records in the cvs_messages table. :( i'll look more closely at why this is happening in the next few days. keep in mind that the cvs_* tables on drupal.org are different from the ones in cvs.install, since killes added some additional keys at some point (he forgets exactly when). i could provide a copy if anyone else is planning to investigate this...

dww’s picture

Assigned: Unassigned » dww
Status: Active » Needs review
FileSize
1.52 KB

ok, first patch in this thread. this one just gets the KEYs in sync between what's in the d.o schema and what's in cvs.install.

dww’s picture

next optimization (unrelated to the previous patch): killes tells me that all the places we were doing SELECT DISTINCT on m.cid was pointless. this patch removes 4 such needless DISTINCT's.

Gerhard Killesreiter’s picture

mysql> EXPLAIN SELECT DISTINCT m.*, r.*, u.name, u.uid FROM
-> cvs_messages m INNER JOIN cvs_repositories r ON m.rid = r.rid INNER JOIN users
-> u ON m.uid = u.uid ORDER BY m.created DESC LIMIT 10;
+-------+--------+---------------+---------+---------+-------+-------+---------------------------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+-------+--------+---------------+---------+---------+-------+-------+---------------------------------+
| m | ALL | rid,uid | NULL | NULL | NULL | 36924 | Using temporary; Using filesort |
| r | ALL | PRIMARY | NULL | NULL | NULL | 2 | Using where |
| u | eq_ref | PRIMARY | PRIMARY | 4 | m.uid | 1 | |
+-------+--------+---------------+---------+---------+-------+-------+---------------------------------+
3 rows in set (0.00 sec)

mysql> EXPLAIN SELECT DISTINCT m.*, r.*, u.name, u.uid FROM
-> cvs_messages m INNER JOIN users u ON m.uid = u.uid ORDER BY m.created DESC
-> LIMIT 10;
ERROR 1051 (00000): Unknown table 'r'
mysql> EXPLAIN SELECT DISTINCT m.*, u.name, u.uid FROM cvs_messages m INNER JOIN users u ON m.uid = u.uid ORDER BY m.created DESC LIMIT 10;
+-------+--------+---------------+---------+---------+-------+-------+---------------------------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+-------+--------+---------------+---------+---------+-------+-------+---------------------------------+
| m | ALL | uid | NULL | NULL | NULL | 36924 | Using temporary; Using filesort |
| u | eq_ref | PRIMARY | PRIMARY | 4 | m.uid | 1 | |
+-------+--------+---------------+---------+---------+-------+-------+---------------------------------+
2 rows in set (0.00 sec)

dww’s picture

doh! wish he used a <pre> around those tables. ;) if only follow-ups were real comments and you could edit them.... oh yeah, right, i'm supposed to be working on that. ;)

dww’s picture

Status: Needs review » Active

ok, both of my patches are now committed to HEAD and 4.7.

i'm moving this issue back to active once again, since these queries are *still* big and nasty, and it'd be nice to optimize them futher...

dww’s picture

looks like we were overzealous with removing DISTINCT. for a little while there, the cvs message pages for d.o projects were showing each commit message N times for each of the N files in a single commit. we really *do* need distinct on the commit ids, since a given commit id can have N files. ;)

dww’s picture

Status: Active » Needs work

killes already undid this change via revision 1.106.2.16, which is now running on d.o. i gotta forward port this revision to HEAD.

Jeremy’s picture

FileSize
2.99 KB

Attached patch addresses a notorious slow query on drupal.org. Sorry, untested.

dww’s picture

Version: 4.7.x-1.x-dev » 6.x-1.x-dev
Status: Needs work » Fixed

Yup, as per reviews and testing here at the sprint, that works and should majorly improve performance. Committed to HEAD, backported to D5, and deployed on d.o.

http://drupal.org/cvs is awfully snappy now. ;) Yay.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.