The query is this:

INSERT INTO {node_comment_statistics} (nid, last_comment_timestamp, last_comment_name, last_comment_uid, comment_count) (select nid, c.timestamp, name, uid, comment_count FROM {comments} c INNER JOIN (SELECT MAX(timestamp) AS timestamp, COUNT(*) AS comment_count FROM {comments} WHERE status=%d GROUP BY nid) as c2 ON c.timestamp=c2.timestamp)

The MySql error i'm getting is this:

Duplicate entry '641' for key 1 query: INSERT INTO node_comment_statistics (nid, last_comment_timestamp, last_comment_name, last_comment_uid, comment_count) (select nid, c.timestamp, name, uid, comment_count FROM comments c INNER JOIN (SELECT MAX(timestamp) AS timestamp, COUNT(*) AS comment_count FROM comments WHERE status=0 GROUP BY nid) as c2 ON c.timestamp=c2.timestamp)

The reason is:

This query takes a join between two tables on a non-unique field (timestamp), selects a field from that join that is therefore not guaranteed to be unique (nid) and attempts to insert the results into a unique indexed field (node_comment_statistics.nid).

A nid will show up twice in this join whenever any two comments on any two nodes have the same timestamp, and one of them was the most recent comment on its node. The left-hand table will have N records with that timestamp, each matching the one record in the right-hand table. The INNER-ness of the join doesn't seem to help in MySQL.

Basically if you site is busy enough that you ever get two comments within a second of each other, then this bug can arise.

The solution is to join on a unique key:

INSERT INTO node_comment_statistics (nid, last_comment_timestamp, last_comment_name, last_comment_uid, comment_count) (
SELECT distinct c2.nid, c.timestamp, name, uid, comment_count
FROM comments c
JOIN (
SELECT MAX(timestamp) AS timestamp, COUNT(*) AS comment_count, nid
FROM comments WHERE status=0 GROUP BY nid
) AS c2
ON (c.nid = c2.nid AND c.timestamp = c2.timestamp)
)

... i also added DISTINCT because my database has some duplicate comments, imported from Wordpress. (Hopefully Drupal never has that problem.)

Comments

mykle’s picture

sorry, typo. that should be:

INSERT INTO node_comment_statistics (nid, last_comment_timestamp, last_comment_name, last_comment_uid, comment_count) (
SELECT distinct nid, c.timestamp, name, uid, comment_count
FROM comments c
JOIN (
SELECT MAX(timestamp) AS timestamp, COUNT(*) AS comment_count, nid
FROM comments WHERE status=0 GROUP BY nid
) AS c2
ON (c.nid = c2.nid AND c.timestamp = c2.timestamp)
)

... plus or minus my random indentation. =)

Summit’s picture

SUbscribing, having this error also by inserting a new node!
How can I solve that please?

Greetings,
Martijn

gpk’s picture

Version: 5.x-0.2 » 7.x-1.x-dev

Looks like this is still present in 7.x - in devel.module, see the TODO against devel_rebuild_node_comment_statistics().

Related: #744438: devel_rebuild_node_comment_statistics() should use n.created not n.changed for nodes without comments.

moshe weitzman’s picture

Status: Active » Fixed

Committed to D6 and D7. I can't even bear to open D5 file anymore. Patch would be welcome.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.