Doing bulk deletion of users in a cron job, I was only getting 290-300 users deleted per run (240 seconds). It turned out that 95% of the time was being spent in comment_user(), which updates {comments} and {node_comment_statistics} on their respective uid columns. This is in an environment with no comments, but 138,000 nodes (thus {node_comment_statistics} has 138,000 rows).
The attached patch indexes the respective uid columns (I know {comments}, being empty, didn't contribute to my issue, but adding that as well seems like a no-brainer). After applying the patch and running the update process, my next cron batch deleted 3900 users.
Comments
Comment #1
mikeryanAlways forget to add my changes into the schema hook as well...
Comment #2
frankcarey commentedno comments? bump! any reason not to do this?
Comment #3
robertdouglass commentedIs this applicable to D7?
Comment #4
moshe weitzman commentedHere is a D7 patch. In addition to these 2 indexes, I added foreign keys to user.uid and moved the misplaced comment_update_7005() back into the updates defgroup where it belongs. no changes to comment_update_7005()
Comment #5
catchsubscribing, looks rtbc but could do with EXPLAIN output.
Comment #6
moshe weitzman commentedHere is an EXPLAIN for the delete query in comment_user_cancel() before and then after this patch.
Comment #7
catchComment #8
webchickCommitted to HEAD! Thanks!
Comment #9
moshe weitzman commentedRTBC for D6. see patch in #1
Comment #10
gábor hojtsyThe comment update functions should have a clear indication of when they were added. Ie. before Drupal 6.0, inbetween Drupal 6.0 to 7.0. etc. See http://drupal.org/node/278592#comment-2038486
Comment #11
mikeryanRerolled with @defgroup updates-6.x-extra.
Comment #12
catchJust caught user deletions scanning 21 million rows in node_comment_statistics. This no longer applies, re-roll forthcoming.
Comment #14
catchHere's the re-roll. The index in D7 is called 'comment_uid', updated the patch to reflect this.
Also adding a D7 patch to ensure that upgrade does not attempt to add the index twice.
Upgrading to major, spammers get deleted a lot.
Comment #16
catchla la la.
Comment #17
moshe weitzman commentedLooks like a good backport to me.
Comment #18
gábor hojtsyDid not apply anymore due to #336483: Performance: SELECT MAX(comment_count) FROM node_comment_statistics does full table scan changing indexes also in node comment stats. Needs a reroll.
Comment #19
moshe weitzman commentedRerolled and tested a new install.
Comment #20
gábor hojtsyCommitted to Drupal 6. Needs updates in Drupal 7 to fix update path.
Comment #21
catchRe-rolled the D7 patch.
Comment #22
webchick/me violates the cardinal rule of committing patches from needs review :\
However, it seems we need this to avoid causing problems in the upgrade path from the latest version of D6 to D7. The patch is simply wrapping a part of the upgrade path in in "if exists" check so "should" be safe (famous last words...)
Committed and pushed to 7.x. Thanks!