comment thread IDs grow too fast
| Project: | Drupal |
| Version: | 7.x-dev |
| Component: | comment.module |
| Category: | bug report |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | needs work |
Background
As an introduction, the comments module uses a thread column to properly sort threads not only in timestamp order but also based on what parents the thread has. When a comment has no parents (i.e. is a comment on the posting node) it has just the value, e.g. "148". These codes are generated using the int2vancode and vancode2int functions, and the goal is to have them sort in order as a string. Since strings sort differently than numbers, this works out rather ingeniously.
When the comment is about an existing comment, the thread value gets a dot and a second vancode is generated for what is now one level below the comment. So, lets say comment 148 gets a comment, its thread will be "148.00". If a second comment on 148 is added, it would be "148.01", and so on. Now if a comment on 148.01 is added, its thread would be "148.01.00".
The issue
I have a posting with a large number of comments (hence the vancode "148" in my example since they start at "00") . To be fair this came from my phpBB forum via the phpbb2drupal module I recently patched to fix how it assigned thread IDs.
I had a comment 148, a comment on 148 "148.00", and a comment on 148.00, which generated a "148.00.00".
The next comment made was on the original posting. What thread should it get? My guess is "149", but the software gives it "52mbrwh" instead.
The code should take the first value from before the first dot (if there are any dots) and then increment that, but instead it takes the whole value, dots and extra levels and so on, and increments it.
Am I missing something?
The fallout
This thread field is so large, and the number of threads is probably so few for a given topic, that this is never likely to be a problem. My work on the phpbb2drupal vancode generation problem just happened to have me wandering around the database long enough to notice this.
Should I make a patch and submit it?

#1
Here's a patch that takes care of the problem.
#2
I second the motion. I had similar problems with the new menu code which utilizes vancode, and in there I vancoded per level which is essentially the same as what this patch does. I won't RTBC any patch at 4am but I have a feeling this one is RTBC.
#3
There seems no such issue with Drupal 5.x HEAD version. Attached is a screenshot of the thread row in comments table.
(Have not tested on Drupal 4.7.x.)
#4
Here's the screenshot.
#5
$thread = int2vancode(vancode2int($max) + 1) .'/';vancode2int() should take care taking out the part before the first dot. Have tried it on 3 digit thread values too.
function vancode2int($c = '00') {return base_convert(substr($c, 1), 36, 10);
}
#6
Um, is there an actual problem described here?
The conversion functions exist only as simple casts form and to integers. Properly using them to generate tree structures is the responsibility of the caller.
#7
While I'm happy that this doesn't occur in the 5.0 beta, the original problem in 4.7 remains. That's what the original patch is for.
#8
Tagging as 4.7.4.
#9
Anybody want to vet/commit this for 4.7.4 so we can put this behind us?
#10
Patch seems to work on 4.7 head. Screenshot here:
http://highervisibilitywebsites.com/comment-ids.jpg
Will let someone else RTBC though...
#11
I've done the following:
Created a node
created 500 comments on that node
added followups to one of the comments
added a new comment on the node.
That last comment's thread is 1dx/ and looks perfectly normal.
I think that the import script you mention messed something up.
#12
Killes, you tested with this patch on 4.7.4 or without it?
I tested the scenario described in the initial posting, so I did see a problem.
#13
without the patch
Also, since the code is the same in 4.7 and 5RC the issue should pop up in both versions.
#14
Here is a modified version of generate-content.php which does what I did for my test, e.g create a node with 500 comments.
Beware: It will remove all other content from your site.
#15
considering my extensive testing without any hint towards a bug, I mark this "won't fix".
#16
Finally I've got a very clear description of the problem. This came to me while examining the comment.module in the new 4.7.5 release.
Here's the code that creates a thread ID for a new comment that does not have a parent comment. That is, it is a comment on the node itself.
On Line 583 we see:
if ($edit['pid'] == 0) {
// This is a comment with no parent comment (depth 0): we start
// by retrieving the maximum thread level.
$max = db_result(db_query('SELECT MAX(thread) FROM {comments} WHERE nid = %d', $edit['nid']));
// Strip the "/" from the end of the thread.
$max = rtrim($max, '/');
// Finally, build the thread field for this new comment.
$thread = int2vancode(vancode2int($max) + 1) .'/';
}
This code is wrong in the following circumstance:
- The last comment in a node is a comment-on-a-comment.
- Someone attempts to add a comment on the node.
If the last comment in a node is a comment-on-a-comment, its vancode has a dot in it. For example, "0a" is a thread ID of a comment on a node, while "0a.01" is the thread ID of the second comment on the 0a comment.
The code above increments the whole thread ID, and not the first segment of the thread ID when adding a new comment to the node. This is the bug.
By getting the first segment of the maximum, incrementing it, and using that as the new comment's thread ID in this circumstance, we ensure the next thread ID is correct.
See this in action with this PHP code in a node:
echo 'Current code: 0a<br/>';
echo 'Next: '.int2vancode(vancode2int('0a')+1).'<br/>';
echo 'This is correct!<p/>';
echo 'Current code: 0a.01<br/>';
echo 'Next: '.int2vancode(vancode2int('0a.01')+1).'<br/>';
echo 'This is wrong!<p/>';
which generates this result:
Current code: 0a
Next: 0b
This is correct!
Current code: 0a.01
Next: 2a02
This is wrong!
Incidentally, the comment module also starts comments on a node at vancode "01" instead of "00". The attached patch fixes that as well in the 4.7.5 comment.module.
#17
moving to HEAD to get more reviews. The code is the same in 5 and 4.7.
#18
There isn't any problem with this in 5.x as discussed above. Pointing, in case the code is same.
#19
5.1 seems to still have this problem.
I added a new node.
First comment on the node has a thread of 01/, and that's ok.
First reply on the First comment has a thread of 01.00/, and that's ok.
Second comment on the node has a thread of 2101/. This is too high.
First reply on the Second comment has a thread of 2101.00/, and this seems ok.
Third comment on the node has a thread of 410101/, and this is getting too big.
The following patch fixes it, again.
#20
Paul,
Thanks for persevering with this issue :) However, trying to replicate this issue in 5-dev with your provided steps:
This code is wrong in the following circumstance:- The last comment in a node is a comment-on-a-comment.
- Someone attempts to add a comment on the node.
does not realise any errors.
A dump of my comments table:
mysql> select cid, subject, thread from comments;+-----+----------+-----------+
| cid | subject | thread |
+-----+----------+-----------+
| 1 | c1 | 01/ |
| 2 | c1.1 | 01.00/ |
| 3 | c2 | 02/ |
| 4 | c2.1 | 02.00/ |
| 5 | c2.1.1 | 02.00.00/ |
| 6 | c3 | 03/ |
| 7 | c4 | 04/ |
| 8 | c5 | 05/ |
| 9 | c6 | 06/ |
| 10 | c7 | 07/ |
| 11 | c8 | 08/ |
| 12 | c9 | 09/ |
| 13 | c10 | 0a/ |
| 14 | c8.1 | 08.00/ |
| 15 | c11 | 0b/ |
| 16 | c10.1 | 0a.00/ |
| 17 | c12 | 0c/ |
| 18 | c10.1.1a | 0a.00.00/ |
| 19 | c13 | 0d/ |
| 20 | c2.1.2 | 02.00.01/ |
| 21 | c2.1.3 | 02.00.02/ |
| 22 | c2.2 | 02.01/ |
| 23 | c2.1.1.1 | 02.00.00.00/ |
| 24 | c14 | 0e/ |
| 25 | c13.1 | 0d.00/ |
| 26 | c15 | 0f/ |
+-----+----------+-----------+
Please provide concrete steps to manually reproduce this on a clean database.
-K
#21
I took a look at the code in the 5.x comment.module and it looks identical to my patch. Looks like someone integrated it in without noting it here.
Do I need to wait for whenever enough security holes are found to release a 5.2 to get this into production?
#22
The fix must have been abandoned, because 5.2 is now broken the same old way. The patch is identical for this version, but I've regenerated it anyway.
Pretty please integrate my fix? Please?
Once again, the test case is:
I added a new node.
First comment on the node has a thread of 01/, and that's ok.
First reply on the First comment has a thread of 01.00/, and that's ok.
Second comment on the node has a thread of 2101/. This is too high.
Once I saw the comment ID in the URL for the first comment, I just did this to check it in the database and see what the thread value was being set as:
Broken way:
drupal_farbot=# select cid, subject, thread from comments where cid >= 9227;cid | subject | thread
------+----------------------------+--------
9227 | first comment | 01/
9228 | first comment reply | 01.00/
9229 | second comment on the node | 2101/
(3 rows)
Fixed way:
drupal_farbot=# select cid, subject, thread from comments where cid >= 9230;cid | subject | thread
------+--------------------+--------
9230 | comment 1 | 01/
9231 | reply to comment 1 | 01.00/
9232 | comment 2 | 02/
(3 rows)
#23
the problem still in 5.6
I made another solution
#24
This is still broken on 5.6 and 5.7. I don't know whether to scream or cry. Or both.
gdavid's patch for 5.6 works with 5.6 and 5.7, which is broken in the same way on the same lines. It's a simpler patch than my own previous patches, so maybe this one will make it in.
6.2 is also broken in the same way from the look of the comment module code, but its patch will need to cover different lines.
Who can commit this into the next release so we can move on with our lives?
#25
This is almost certainly broken in 7.x too, so I'm moving it there in order that it get committed first, backported to D6, then to D5.
Patch doesn't apply to D7 though so marking as Code needs work.
#26
This bug is still present in 5.10.
#27
Confirmed as a bug in 7.x dev as of tonight 2008-08-19, here is the patch to fix it.
#28
And this patch is for 5.10 in case anyone else wants it.
#29
I just tried the test in @22 on a drupal HEAD install, and everything seemed to work fine without the patch in @27. Is this related to php or other configuration settings in some way then? I got the same results after applying the patch, and the code does look a little safer, and the query should perform better with large numbers of comments, so I'm still in favor of committing to HEAD.
I added a new node.
First comment on the node has a thread of 01/
First reply on the First comment has a thread of 01.00/
Second comment on the node has a thread of 02/
mysql> SELECT cid, subject, thread FROM comments;+-----+----------------+--------+
| cid | subject | thread |
+-----+----------------+--------+
| 1 | first comment | 01/ |
| 2 | first reply | 01.00/ |
| 3 | second comment | 02/ |
+-----+----------------+--------+
3 rows in set (0.00 sec)
#30
I understand the bug, and agree with the solution (except that I don't get why the cast to int could ever be necessary). But I *can't* reproduce it.
PMunn could you give as much details about your setup as possible?
#31
Thinking about this a little more, I'm not convinced there is a bug here.
'.' is supposed to be < '/', so '01.00' < '02/' because string comparisons are typically done in lexicographical order.
Maybe the string collation you use breaks one of these two assumptions (either '.' > '/' or the comparison is done differently). But this will break the whole thread sorting algorithm anyway, so you have a bigger problem on your hands.
Which database engine and collation do you use?
#32
The full test case is first described above at #16.
The important thing is to do the test in this order:
Create a Node, let's call it NODE-1.
Comment on this node, call it COMMENT-1.
Reply on COMMENT-1, call this COM-1-REP-1.
Comment on the node, call it COMMENT-2.
Observe the Thread IDs in the comments table for the three comments on that one node. Most notably the COMMENT-2 thread ID will be wrong. It will still sort properly, yes, but the thread ID is still wrong.
The patch gets the highest thread ID of all comments without a parent comment, that is, it gets the highest thread ID that is directly pointing to the NODE. It then increments that thread ID. The original, flawed code increments the entire thread ID of the last comment.
I adopted gdavid's patch because it's smaller and seems to work just as well instead of my own patch before his and used it to make the 5p10 and 7dev patches. The tactics in either patch, mine or gdavid's, are the same across all versions of drupal right now. The comment module hasn't changed lately.
If my solution will break the sorting at some point let me know how and we can close this up for good.
#33
@PMunn: can you give us as most specifics as you can about your environment? (that's what I called "setup", above), database engine, database configuration, PHP version, etc.
#34
Some more research:
$ export TEXT="01/\n02/\n01.00/\n01.00.00/\n02.00/"
$ echo -e $TEXT | LC_COLLATE=C sort
01.00.00/
01.00/
01/
02.00/
02/
$ echo -e $TEXT | LC_COLLATE=en_US.ISO-8859-1 sort
01/
01.00/
01.00.00/
02/
02.00/
$ echo -e $TEXT | LC_COLLATE=en_US.UTF-8 sort
01/
01.00/
01.00.00/
02/
02.00/
#35
Okeydoke:
PostgreSQL 8.3.3
PHP 5.2.6
Database encoding is UTF8
Sitting on a Fedora Core 8 box running kernel 2.6.25.14-69.
#36
@PMunn can you confirm that the comment ordering is wrong at display time with your setup (please try both ascending as descending ordering)?
#37
The comment ordering is correct.
This is a bug that doesn't exhibit itself visually to the end user. It's just an observation that the thread IDs are being generated incorrectly internally.
#38
I still think this patch should go in, regardless of being able to reproduce the bug.
The code is clean and the sql query is much more correct with the patch, and will execute faster on sites with a lot of comment threads.
#39
Patch no longer applies.
I, too, can't reproduce this bug. I tried it on a clean HEAD install and got
ID Title Thread1 Comment 1 01/
2 Comment 1.1 01.01/
3 Comment 2 02/
MySQL 5.0.5, PHP 5.2.4, Ubuntu
I do think that
<?php$max = db_result(db_query('SELECT MAX(thread) FROM {comments} WHERE nid = %d', $edit['nid']));
?>
should become
<?php$max = db_result(db_query('SELECT MAX(thread) FROM {comments} WHERE nid = %d AND pid = 0', $edit['nid']));
?>
because that makes more sense (you only need the parentless nodes to determine the max) and is less work when looking at thready comments.
#40
maartenvg, your code change to that line fixes the problem in 5.11.
#41
#42
Noted, moved to http://drupal.org/node/320784
#43
@johnskulski: good catch, that's a regression introduced by DB:TNG... but this has nothing to do with the thread at hand. Please open a new one with your patch.
#44
Here is an updated patch for the 5.12 release. Same code change, but the row it's on is a different row number so some work must've been done on the comment module since the 5.11 release.
#45
The last submitted patch failed testing.
#46
Following HEAD.
#47
The last submitted patch failed testing.
#48
Test failure : #335122: Test clean HEAD after every commit
#49
I think we should do "AND" and not "and" (markind CNW for this) but otherwise looks like a simple code change. Should give the next comment thread the right incremented thread number instead of taking the whole thread structure from the last comment.
#50
BTW I've been pointed to here on the Hungarian Drupal conference this past weekend and got test results from gdavid there (on postgresql), which shows that the third comment on the same node does indeed get a wrong thread identifier.
drupal7=# select * from comments;cid | pid | nid | uid | subject | comment | hostname | timestamp | status | format | thread | name | mail | homepage
-----+-----+-----+-----+---------+---------+-----------+------------+--------+--------+--------+--------+------+----------
1 | 0 | 1 | 1 | egy | egy | 127.0.0.1 | 1226745858 | 1 | 1 | 01/ | gdavid | |
2 | 1 | 1 | 1 | ketto | ketto | 127.0.0.1 | 1226746009 | 1 | 1 | 01.00/ | gdavid | |
3 | 0 | 1 | 1 | harom | harom | 127.0.0.1 | 1226746032 | 1 | 1 | 2101/ | gdavid | |
(3 rows)
#51
#35 also refers to Postgresql, so I think we can safely assume that the problem lies there. And that is also the reason why I (and others) couldn't reproduce it, because most people use MySQL.
Attached patch with capitals for "and" per #49.
#52
Patch looks great. Since this is environment dependent I don't think it's viable to write a test for it (and it's also not harmful, just a bit odd), so marking RTBC.
#53
I discussed this some with DamZ on IRC.
He makes the case that while the patch will fix the symptom of the problem (and is a performance improvement to boot), there is obviously still something funny going on with threading on some systems.
Drupal has 4 threading modes (threaded desc order, unthreaded desc order, threaded asc order, and unthreaded asc order), and while this messed up value in pgsql might not have user-facing consequences when using whichever mode phpbb2drupal uses (probably unthreaded asc), it likely would have user-facing consequences on one or more other modes.
We have a couple very simple tests for threading as part of testing comment replies, but we lack comprehensive tests for all threading modes for comments.
I'm therefore marking this CNW so that we can get those in place, which should hopefully help us figure out the "root" cause of the problem on pgsql at the same time.
#54
On a side note, here's a patch that fixes this for 5.18, where it is still a problem. Patch does the same thing as previous one.
#55
this is still problem in 6.x. It's not too good. This is 3year old problem. And only need " AND pid = 0" text to the module code. Only 12 characters (with spaces). It's ridiculous a little bit. What work need more to accept these changes?
#56
@gdavid - see webchick's comments in #53. To commit this to 6.x, it needs to be committed to 7.x first. To commit it to 7.x, we need tests so that comment threading doesn't break in the future. If you can help write these tests, we can get this issue committed quickly
#57
the root problem is it: (postgresql)
# SELECT '.' > '/' ;
?column?
----------
t
(1 sor)
template1=# SELECT '.' < '/' ;
?column?
----------
f
(1 sor)
#SELECT '123456/' > '123456.01.01/' ;
?column?
----------
f
(1 sor)
# SELECT 'a' > '/';
?column?
----------
t
(1 sor)
# SELECT 'a' > '.' ;
?column?
----------
t
(1 sor)
# SELECT '0' > '.' ;
?column?
----------
t
(1 sor)
Why? look at this:
# SELECT hashchar('.');
hashchar
----------
-47
(1 sor)
# SELECT hashchar('/');
hashchar
----------
-48
(1 sor)
I hope this can help to show the ROOT problem, why working wrong the comment threading in PG.
Thanks.
#58
Are you trying to say that the forward slash is causing the sorting problem? Can we just change the terminator to another character? My results for running those hashchar calls, which doesn't look meaningful to me:
# select hashchar('.');
hashchar
------------
-163135266
(1 row)
# select hashchar('/');
hashchar
------------
1043037474
(1 row)
The code change is needed to create and properly increment the segments of the thread ID. This isn't disputed.
I get it that this seems to have exposed a sorting problem particular to PostgreSQL.
#59
I don't understand. Why dosen't enoght good this syntax?
$max = db_result(db_query('SELECT MAX(thread) FROM {comments} WHERE nid = %d AND pid = 0', $edit['nid']));
why whould be better solution to change the terminator character to another? " AND pid = 0" 12 characters (with spaces) no other script, no transformation. it will run on mysql, postgres, oracle, mssql and others too.
#60
gdavid, I agree this fix is simple and should be implemented.
However, it seems that some have found a sorting problem in PostgreSQL even when all is working properly.
I vote we split that into a separate bug, and fix and close this one.
This bug, again, is simply that the thread ID increases too quickly. The code fix described should iron that out and with more testing perhaps the fixed code can contribute meaningfully to the second bug discussion.
All in favor?