I had some spam comments that went right through the spam filter and caused some weird side effects with Drupal's edit functions. The spam filter should have caught the comment by the content of the subject line, by the content of the body, and by the massive number of links -- but it didn't.

I thought they had found some special character or command to bury in the text, so I went looking to see if I could find it and block it.

Observations:

Weirdness in Drupal: the comment appeared in the article, but when you went to edit the comment, the text box was blank, although the other fields (name, email, etc.) were filled in.

So I went to the database and edited the entry. The body content was there, so I copied and pasted it into Kate, my editor.

Kate didn't display it properly and the vertical scroll bar didn't work. I studied it and noticed that it was a single line, 128,528 characters long (no newlines in it), with an <h1> at the beginning and an </h1> at the end of the line. The body was composed solely of hundreds of these:

<a href="different links">"different copy"</a> |

links with the vertical bar used as a spacer between them.

I thought there could still be embedded "somethings" in the text, so I broke it into lines of fewer than 512 characters so that I could scroll down the text and see if anything weird popped out.
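For anyone who wants to repeat that inspection without an editor, here is a rough sketch in Python (not part of Drupal or the spam module). It breaks a long single-line body into chunks of fewer than 512 characters and lists any control or invisible formatting characters it finds. The function names and the choice of Unicode categories to flag are my own assumptions.

```python
import unicodedata


def split_into_chunks(text, width=511):
    """Split one long line into pieces of at most `width` characters."""
    return [text[i:i + width] for i in range(0, len(text), width)]


def find_suspicious_chars(text):
    """Return (offset, code point) pairs for control and format characters.

    Assumption: "weird" means Unicode category Cc (control) or Cf
    (invisible formatting). Note that tabs and newlines are Cc as well.
    """
    hits = []
    for offset, ch in enumerate(text):
        if unicodedata.category(ch) in ("Cc", "Cf"):
            hits.append((offset, f"U+{ord(ch):04X}"))
    return hits
```

An empty list from find_suspicious_chars() would match what I saw by eye: nothing hidden in the text.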

Nothing was there, so I have to assume it was simply the length of the line (128,528 columns) causing all the problems.

A test that rejected a single line that was more than xxx characters long would fix this -- if I am correct that it is the length of the line that is the problem.
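A minimal sketch of the kind of test I mean, written in Python for illustration only (the spam module itself is PHP, and this is not its code). The names are hypothetical, and the limit is left as a required argument because I don't know what the right number is.

```python
def longest_line_length(text):
    """Length of the longest line in `text` (0 for empty text)."""
    return max((len(line) for line in text.splitlines()), default=0)


def exceeds_line_limit(text, max_line_length):
    """True if any single line is longer than `max_line_length`.

    `max_line_length` is deliberately left to the caller; no
    particular threshold is implied here.
    """
    return longest_line_length(text) > max_line_length
```

A comment for which exceeds_line_limit() returns True could then be rejected or held for moderation before it reaches the edit form.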

Comments

jeremy’s picture

I am unable to duplicate your problem. What version of the spam module are you using? Do you see any errors in your web logs (i.e., your Apache logs, not your Drupal logs)? Can you dump the problematic comment from your comment table and attach it here so I can see it and test with it?

Do you have MySQL tuned to allow extra-large packets? When I create large comments like the one you've described, I get MySQL errors (max_allowed_packet) that prevent the comment from being committed to the database. When I create comments that are within the allowable size, the spam module parses them correctly: in this case it recognizes the URLs for what they are and marks the comment as spam for having too many URLs.

I have made two updates to the spam module while looking into your error report:
1) when breaking text into tokens, I include "|" as a delimiter
2) I only grab the first 255 characters of a token (as we're storing it in a varchar(255) database column)
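The effect of those two changes can be sketched roughly as follows. This is an illustrative Python approximation, not the module's actual PHP code. Splitting on whitespace in addition to "|" is an assumption, and the names are hypothetical.

```python
import re

# Assumption: tokens are separated by whitespace; "|" is the newly added delimiter.
TOKEN_DELIMITERS = re.compile(r"[\s|]+")

# Matches the varchar(255) column the tokens are stored in.
MAX_TOKEN_LENGTH = 255


def tokenize(text):
    """Split text into tokens, keeping only the first 255 characters of each."""
    return [
        token[:MAX_TOKEN_LENGTH]
        for token in TOKEN_DELIMITERS.split(text)
        if token  # drop empty strings produced by leading/trailing delimiters
    ]
```

With "|" treated as a delimiter, a body made of links joined by vertical bars breaks into many short tokens instead of a few enormous ones, and no token can overflow the column.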

jeremy’s picture

Status: Active » Fixed

Unable to duplicate, and no further feedback. Marking fixed. If you are still having this problem, please retest with 2.0.12.

Anonymous’s picture

Status: Fixed » Closed (fixed)