Early Bird Registration for DrupalCon Portland 2024 is open! Register by 23:59 PST on 31 March 2024, to get $100 off your ticket.
in a discussion about the load on the DB server that runs drupal.org [1], keiran mentioned:
"5 queries on CVS repositories table are slow."
there are no indexes at all in the cvs.install file. we can probably improve things a lot, simply by adding a few indexes to a few tables. i'm creating an issue so a) i don't forget and b) other folks can post patches here if they are inspired to work on this before i get the chance...
[1] http://lists.drupal.org/archives/infrastructure/2006-06/msg00374.html
Comment | File | Size | Author |
---|---|---|---|
#9 | cvs.module.patch | 2.99 KB | Jeremy |
#3 | cvs_no_pointless_distinct.patch | 2.83 KB | dww |
#2 | cvs_db_keys.patch | 1.52 KB | dww |
Comments
Comment #1
dwwok, whoops, i was mis-reading things. there are a bunch of indexes, after all, since "KEY" works as an index, too. ;)
however, there aren't quite the right keys for all tables, since we're still getting crappy, non-indexed performance on this:
killes ran this via EXPLAIN on drupal.org just now, and got these results:
so, we're still searching through *ALL* records in the cvs_messages table. :( i'll look more closely at why this is happening in the next few days. keep in mind that the cvs_* tables on drupal.org are different from the ones in cvs.install, since killes added some additional keys at some point (he forgets exactly when). i could provide a copy if anyone else is planning to investigate this...
Comment #2
dwwok, first patch in this thread. this one just gets the KEYs in sync between what's in the d.o schema and what's in cvs.install.
Comment #3
dwwnext optimization (unrelated to the previous patch): killes tells me that all the places we were doing SELECT DISTINCT on m.cid was pointless. this patch removes 4 such needless DISTINCT's.
Comment #4
Gerhard Killesreiter CreditAttribution: Gerhard Killesreiter commentedmysql> EXPLAIN SELECT DISTINCT m.*, r.*, u.name, u.uid FROM
-> cvs_messages m INNER JOIN cvs_repositories r ON m.rid = r.rid INNER JOIN users
-> u ON m.uid = u.uid ORDER BY m.created DESC LIMIT 10;
+-------+--------+---------------+---------+---------+-------+-------+---------------------------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+-------+--------+---------------+---------+---------+-------+-------+---------------------------------+
| m | ALL | rid,uid | NULL | NULL | NULL | 36924 | Using temporary; Using filesort |
| r | ALL | PRIMARY | NULL | NULL | NULL | 2 | Using where |
| u | eq_ref | PRIMARY | PRIMARY | 4 | m.uid | 1 | |
+-------+--------+---------------+---------+---------+-------+-------+---------------------------------+
3 rows in set (0.00 sec)
mysql> EXPLAIN SELECT DISTINCT m.*, r.*, u.name, u.uid FROM
-> cvs_messages m INNER JOIN users u ON m.uid = u.uid ORDER BY m.created DESC
-> LIMIT 10;
ERROR 1051 (00000): Unknown table 'r'
mysql> EXPLAIN SELECT DISTINCT m.*, u.name, u.uid FROM cvs_messages m INNER JOIN users u ON m.uid = u.uid ORDER BY m.created DESC LIMIT 10;
+-------+--------+---------------+---------+---------+-------+-------+---------------------------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+-------+--------+---------------+---------+---------+-------+-------+---------------------------------+
| m | ALL | uid | NULL | NULL | NULL | 36924 | Using temporary; Using filesort |
| u | eq_ref | PRIMARY | PRIMARY | 4 | m.uid | 1 | |
+-------+--------+---------------+---------+---------+-------+-------+---------------------------------+
2 rows in set (0.00 sec)
Comment #5
dwwdoh! wish he used a
<pre>
around those tables. ;) if only follow-ups were real comments and you could edit them.... oh yeah, right, i'm supposed to be working on that. ;)Comment #6
dwwok, both of my patches are now committed to HEAD and 4.7.
i'm moving this issue back to active once again, since these queries are *still* big and nasty, and it'd be nice to optimize them futher...
Comment #7
dwwlooks like we were overzealous with removing DISTINCT. for a little while there, the cvs message pages for d.o projects were showing each commit message N times for each of the N files in a single commit. we really *do* need distinct on the commit ids, since a given commit id can have N files. ;)
Comment #8
dwwkilles already undid this change via revision 1.106.2.16, which is now running on d.o. i gotta forward port this revision to HEAD.
Comment #9
Jeremy CreditAttribution: Jeremy commentedAttached patch addresses a notorious slow query on drupal.org. Sorry, untested.
Comment #10
dwwYup, as per reviews and testing here at the sprint, that works and should majorly improve performance. Committed to HEAD, backported to D5, and deployed on d.o.
http://drupal.org/cvs is awfully snappy now. ;) Yay.