Posted by bdragon on July 14, 2012 at 10:45pm
7 followers
| Project: | Drupal core |
| Version: | 7.x-dev |
| Component: | comment.module |
| Category: | bug report |
| Priority: | normal |
| Assigned: | chx |
| Status: | needs work |
| Issue tags: | needs backport to D7 |
Issue Summary
While analyzing issues on drupal.org, I detected that it is possible for multiple comments on a node to have the same {comment}.thread value.
To ensure data integrity, we should have a unique index on (nid, thread).
Drupal.org had 34 instances where duplication had happened. Most of this apparently happened at times where things were especially flaky. (I manually repaired the damage already.)
Detecting duplicate thread values:
(d6 naming, use 'comment' for a d7+ db)
select distinct l.nid from comments l, comments r WHERE l.nid = r.nid and l.thread = r.thread and l.cid < r.cid;
This takes 5 minutes to run on drupal.org slave.
Comments
#1
Here's a patch to add the unique key.
It's probably faster to attempt to add the key and error out than it is to search for problems.
Whether this is acceptable or not, I don't know.
#2
Doing that will abort the write and lose the comment. Not a good idea IMO.
#3
Here's the diff -b for easier review. Also, D7.
#4
Well, that made it easier for me to spot the wrong prefix setting.
#5
The last submitted patch, 1685170_4.patch, failed testing.
#6
Weird. All the comment tests passed for me. Let's then see without the function removal.
#12
We had a bunch of bot flukes but we are seemingly back on track. So here's a much more complete version, with lots more comment, lock_release and all that.
#13
Tested against postgresql and mysql using tsung.
Here's the configuration I used.
With the patch in I was unable to cause a duplicate.
Without the patch, I was getting a large percentage of duplicates -- some were even 3-way dupes.
#14
This felt like something we should add tests for, but chx explained that it's catching a race condition and we don't have a good way for testing for those. Thanks for the pointer to Tsung; never heard of that before!
Anyway, this looks fairly sane. It's adding checks for locking around this code and failing if they're encountered.
Committed and pushed to 8.x. Not sure if we should backport this to D7 too?
#15
Yes.
#16
Tagging then.
#17
More tagging -- this is important for the drupal.org d7 upgrade.
#18
Assigning to @bdragon for review of @chx's patch in #15
#19
I'm getting duplicates during testing even after applying #15.
Also, typo in comment
+ // We need a lopp here to avoid race conditions: if another process
-- should be 'loop' instead of 'lopp'.
#20
Adding an estimate for closing this out and/or mitigating it on Drupal.org.
#21
This has been mitigated for Drupal.org: #1944448: Drupal.org D7 deployment. We'll run a query to check for duplicates before proceeding.