Forked from #145353: Forums performance improvements.
_forum_update_forum_index() performs the following expensive query on every comment insert, update, publish, or delete, regardless of whether the comment is attached to a forum topic node. In my context, this has a serious impact on migration rollback of comments.
$last_reply = db_query_range('SELECT cid, name, created, uid FROM {comment} WHERE nid = :nid AND status = :status ORDER BY cid DESC', 0, 1, array(
':nid' => $nid,
':status' => COMMENT_PUBLISHED,
))->fetchObject();
This is to setup a db_update on forum_index, which will do nothing anyway if the parent node wasn't a forum topic. So, we want to avoid this query unless we know the node is already in forum_index. I added an inner join to forum_index on the initial comment query in the function, and in a very rough informal test the average time of forum_comment_delete() goes from 54ms to 1ms (this is with 73k comments in the comment table).
I'll freshen my D8 checkout and submit a patch.
Comment | File | Size | Author |
---|---|---|---|
#15 | forum-index_update_performance_d7-1715402-15.patch | 648 bytes | mikeryan |
#12 | forum-index_update_performance_d7-1715402-12.patch | 645 bytes | mikeryan |
#1 | forum-index_update_performance-1715402-1.patch | 666 bytes | mikeryan |
Comments
Comment #1
mikeryanComment #2
mikeryanSee also #1715450: Allow disablement of forum_index updates.
Comment #3
moshe weitzman CreditAttribution: moshe weitzman commentedMakes sense, and code looks good.
Comment #4
chx CreditAttribution: chx commented*blink* if that query is expensive , is it expensive because it's not indexed? If it is not, why doesn't forum add an index on comment table? If it's indexed how it can be expensive?
Comment #5
mikeryanEXPLAIN shows it's using the comment_num_new index - the issue is the sort ("Using where; Using filesort").
At any rate, this query is always going to be more expensive than the join I added to the initial query, and it's an unnecessary query for any non-forum nodes, so this is a big win when dealing with 176k blog comments...
Comment #6
chx CreditAttribution: chx commentedAfter chatting with catch about this it seems #5 is a mis-post and I misunderstood what's going on. While the count query is likely to be near-instant as I suspected the point is not the query in the patch but not running the other two controlled by this count query. So yes, good to go.
Comment #7
chx CreditAttribution: chx commentedAlso, further talking -- shouldnt we change the ORDER in the $last_reply query to be ORDER BY created, cid so the comment_num_new index (nid,status,created,cid,thread) can be used fully? Can be a followup or can be done here. Won't that speed up further?
Comment #8
effulgentsia CreditAttribution: effulgentsia commentedRe #7: that might speed up comment updates on long forum threads, but seems like a separate issue. #1 looks good for this issue, which is to prevent an unnecessary db_query_range() when dealing with a non-forum node.
Comment #9
effulgentsia CreditAttribution: effulgentsia commentedComment #10
webchickCommitted and pushed to 8.x. Thanks! Does this make sense to backport too?
Comment #11
mikeryanYes please! Let me generate the D7 patch...
Comment #12
mikeryanComment #14
droplet CreditAttribution: droplet commentedc.nid=i.nid
a little code style problem ? (in both D7 & 8 patch)
Comment #15
mikeryanAhhh, bad habit I have of compressing my SQL...
Comment #16
moshe weitzman CreditAttribution: moshe weitzman commentedComment #17
webchickCommitted and pushed to 7.x. I also went back to 8.x and fixed the spacing.