| Project: | Drupal core |
| Version: | 7.x-dev |
| Component: | comment.module |
| Category: | bug report |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | closed (fixed) |
| Issue tags: | Performance |
Issue Summary
I am working on a site using flat comment structure, and has lots of comments (app 1.5m in the comments table). The site currently operates on D5, but nevertheless this bug seems to be present in Drupal 7 as well.
Summary: The call of comment_num_replies() becomes very slow for a large number of comments, moreover it is completely unnecessary for flat comments. This function is designed to calculate the number of replies onto a single comment, and is used to determine if the comment is editable.
This patch adds an if wrap around the core of comment_num_replies(), performing the count only when threaded comments are used.
A more detailed workflow of how this is happening:
When a user submits a comment on a certain page, comment_links() gets triggered for all comments listed on that page. This function will try to add the "edit" link to the comment. For this it performs an access check if the user is allowed to post comments on line 964, and does a comment_access() check. comment_access() goes on and calls comment_num_replies() to verify if this comment has been replied to, by counting all replies (a query using WHERE pid = :pid AND status = :status). On my local box for 10 replies on one page from me this meant an app 10 second overall query time.
TODO: We could index the comments table on status + pid to make this query a lot faster for threaded comments, currently there is no index for these fields together. Need a performance check on this.
| Attachment | Size | Status | Test result | Operations |
|---|---|---|---|---|
| flat_comments_reply_count.patch | 1.39 KB | Idle | Unable to apply patch flat_comments_reply_count.patch | View details |
Comments
#1
Flat comments are very common
#2
Rerolling, still needs a benchmark.
#3
Flat comments still store whether they have replies or not depending on if their reply link gets used, this is the reason for flatcomment.module in contrib. Both needs to be fixed IMO, but technically they do actually have replies at the moment - so we should revive that issue (will look for it).
#4
Here it is #81373: Allow option for comments to be completely flat (no threading in database)
#5
Updated the patch - following changes / comments:
Drupal 7 has an index on pid, status, however we were doing a COUNT(*) query in comment_num_replies(), where this should be SELECT 1 ... LIMIT 0, 1 for performance on non-MyISAM storage engines. I've merged that with the current patch.
comment_access() is the only place which uses comment_num_replies(), compound that with the change above and it seemed best to rename this function to comment_has_replies() instead. For backwards compatibility we could also add a completely new function, but I'd probably want to check contrib for usage before doing that since otherwise it's just cruft.
The above patch had the wrong variables for threading, updated those to the D7 one (we don't have collapsed or expanded any more), and they are also per-node type since D6. That required adding an (optional) $node parameter to comment_access() - if it's not there it fills it in itself, but that stops comment_links() triggering a node_load() for every comment on a page, which even with static caching isn't ideal invoking the entity loader up to 300 times for the same node.
So, this saves a database query for every comment where the current user authored it. It also optimizes the same query if threading is enabled. Also I think it's fine to fix this separately from the old flat comments issue since they're separate bugs.
#6
Subscribed
#7
Stupid question maybe, but why do we really need to check whether a comment has replies in an access check? It would be a big win if this check could simply be removed, or if it could be simplified. Given that we save both the creation and the updated date for comments now, we could make it a theme issue for those who care? :P That is, once could mention in the comment, through the theme layer, that the comment was edited (after it has gotten replies).
The current solution strikes me as a bit broken in that it is hard for end-user to understand why some comments, but not other comments, can be edited.
Some other thoughths:
- Instead of counting, wouldn't it be faster to see if there was a comment with my pid?
- I believe Disquz gives you 5 minutes to edit your comment. After 5 minutes, it automatically becomes un-editable. That would be a much faster check.
#8
That's not a stupid question at all, this is yet another 'feature' in comment module which just makes it suck in general.
We could do a couple of things -
1. add an edit own comments permission - no such permission exists.
2. Add the timer - although that would be a more complicated patch since iirc Basecamp, which has the same feature, incorporates a countdown as well so you know when you can edit and when you can't. But yes this would be a very fast check and quite nice to have. It's also something which could be done from contrib since comment links are finally alterable.
On the counting, I've already removed that in favour of a range query, I think the current patch is probably as fast as we'll get in terms of keeping the current functionality but let's kill it nicely altogether if we can.
#9
Here's the "edit own comments" patch. If we switch to this, there'd be no performance penalty in HEAD for either flat or threaded comments, and contrib could add back alternative access if it wants to.
#10
The last submitted patch, edit_own_comments.patch, failed testing.
#11
Awesome!
The edit own comments permission in core would be nice, and it also solves for instance this issue: #290246: Edit own comments permission
The web user in the test needs to have the 'edit own comments' permission in order for the test to pass. What is the best way to approach this? Do we need to do any more altering of the test?
I'm posting the patch with the additional permission added to the test.
#12
better name =)
#13
Fix looks right to me, don't think we need any test changes beyond that assuming the test bot comes back OK. There's a couple of whitespace issues in the patch though, see http://drupal.org/coding-standards
+++ comment.test 30 Dec 2009 21:48:05 -0000@@ -10,7 +10,7 @@
- $this->web_user = $this->drupalCreateUser(array('access comments', 'post comments', 'create article content'));
+ $this->web_user = $this->drupalCreateUser(array('access comments', 'post comments', 'create article content','edit own comments'));
This is missing a space before 'edit own comments'.
+++ comment.test 30 Dec 2009 21:48:05 -0000@@ -290,7 +290,7 @@
$this->assertTrue($this->commentExists($comment), t('Comment found.'));
-
+
Try to avoid adding trailing whitespace.
I'm on crack. Are you, too?
#14
Thank you for the pointers =)
Here's a cleaned up version!
#15
Looks fine now, back to CNR.
#16
The last submitted patch, edit_own_comments_with_test_2.patch, failed testing.
#17
Silly me =)
The Indexes were wrong in the previous patch.
Rerolling with correct paths this time.
#18
Looks good now, nice and green. No API change with this one since comment_num_replies() is left in but only as a utility function which core never calls.
#19
I'm OK with this approach. It would be interesting (but not required) to understand the performance gain this brings.
I'd also be OK with removing
comment_num_replies()as it is now unused.Not sure why this is marked 'critical' as Drupal works without this patch -- moving it back to 'normal'.
#20
@Dries, it's hard to quantify the performance benefit. We issue a count query when displaying any comment to the user who posted the comment. If someone has one visible comment on a thread, that's one query, if they have 100 visible comments (not impossible, can think of d.o and at least one other site with that sort of pattern), then it's 200.
Re-attaching the same patch with comment_num_replies() removed. No other changes so leaving at RTBC.
#21
Makes sense to me. And yay for fewer queries. :) It's a bit of a behaviour change that people may not be fully prepared for, but I think the "edit own comments" permission is a good trade-off. And a contrib module could easily add back the "only show the edit link if they have this permission, and there are no replies, and it's the first Tuesday of the month, and..."
I was going to mark this needs work for tests, but we don't have tests checking any other menu item's access callback, so...
Committed to HEAD.
#22
Automatically closed -- issue fixed for 2 weeks with no activity.