Today we ran into the first case where a valid comment was marked as spam. We tried the "mark as not spam" link, but it returns a 404. Hovering over the link, I can see that it points to http://www.bstw.com/spam/comment/280/notspam. The displayed URL for the comment, however, is http://www.bstw.com/content/getting-bumped-out-story#comment-280 since we have the page title module in use. I tried adding /notspam to the end of that URL, but again got a 404. So, the first question is: what should this URL be, and how do we fix it?

I then tried editing the node. Under Administration, I deselected Spam and selected Published. This "works" in the sense that the state of the node is changed, but when it is displayed it still shows UNPUBLISHED above it, even though it is now visible to an anonymous user. The spam total is also still 99. I have both authorized users and the administrator selected under bypass filters. So, the second question is: how do we get the UNPUBLISHED banner to go away?

Also, if marked not spam, shouldn't its score revert to zero?

Finally, I am curious why this one got marked. I had the threshold set at 70 and, from what I can see in the logs, this comment got a score of 70, but it isn't apparent to me why it would have gotten much of a score at all. I have tentatively moved the threshold to 72, but that doesn't actually seem like a solution, if you know what I mean.

The trace looks like this:

comment 280 04/10/2009 - 08:48 marked as spam, score(99) Anonymous trace | detail
comment 280 04/10/2009 - 08:48 final average(70) Anonymous trace | detail
comment 280 04/10/2009 - 08:48 Bayesian filter: total(99) redirect() gain(100) Anonymous trace | detail
comment 280 04/10/2009 - 08:48 total(1485) count(15) probability(99) Anonymous trace | detail
comment 280 04/10/2009 - 08:48 Node age: total(99) redirect() gain(150) Anonymous trace | detail
comment 280 04/10/2009 - 08:48 Custom filter: total(40) redirect() gain(250) Anonymous trace | detail
comment 280 04/10/2009 - 08:48 matched adjusted total of 0 probably-not spam rule(s). Anonymous trace | detail
comment 280 04/10/2009 - 08:48 URL filter: total(0) redirect() gain(250) Anonymous trace | detail
comment 280 04/10/2009 - 08:48 Surbl filter: total(0) redirect() gain(250) Anonymous trace | detail
comment 280 04/10/2009 - 08:48 Duplicate filter: total() redirect() gain(100) Anonymous trace | detail
comment 280 04/10/2009 - 08:48 inserting Anonymous trace | detail
comment 280 04/10/2009 - 08:48 -- Anonymous trace | detail

Could someone explain a bit about which factors contributed? To me it looks like node age, Bayesian, and something in between with no name. It is true that the primary node was from January, but for this site it is quite reasonable that an older node would generate a new comment. So, does that mean I should turn that filter off? I don't know what about this post would have annoyed the Bayesian filter and I don't know what that middle line is about.

Comments

gnassar:

Status: Active » Closed (duplicate)

Duplicate of #352179: bad "Not Spam" link.

The UNPUBLISHED banner you mention must, I imagine, be part of the content of the node -- by default, to my knowledge, Drupal doesn't add that banner (nor does the Spam module).

Yes, if your site tends to have comments added to older content, you want to turn the node age filter off, or reduce its gain if comments on older content are legitimate but less likely than comments on new posts.
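As a rough illustration of how much the node age filter contributes, here is a small Python sketch using the totals and gains from the trace above. The averaging rule is an assumption inferred from the logged numbers, not taken from the Spam module's source: a gain-weighted mean over the filters that reported a non-zero total happens to give 69.5, which rounds to the logged final average of 70. The helper names (`weighted_average`, `without`, `with_gain`) are made up for this example, and the Duplicate filter is left out because its total is empty in the trace.

```python
# Totals and gains copied from the trace for comment 280.
# ASSUMPTION: the averaging rule below is inferred from the logged values,
# not from the Spam module's actual code.
TRACE = [
    ("Bayesian", 99, 100),
    ("Node age", 99, 150),
    ("Custom", 40, 250),
    ("URL", 0, 250),
    ("Surbl", 0, 250),
]


def weighted_average(filters):
    """Gain-weighted mean over filters that reported a non-zero total."""
    scored = [(total, gain) for _, total, gain in filters if total > 0]
    if not scored:
        return 0.0
    weighted = sum(total * gain for total, gain in scored)
    return weighted / sum(gain for _, gain in scored)


def without(filters, name):
    """Return the filter list with one filter switched off (hypothetical helper)."""
    return [f for f in filters if f[0] != name]


def with_gain(filters, name, gain):
    """Return the filter list with one filter's gain changed (hypothetical helper)."""
    return [(n, t, gain if n == name else g) for n, t, g in filters]


if __name__ == "__main__":
    print("as logged:         ", weighted_average(TRACE))
    print("node age off:      ", weighted_average(without(TRACE, "Node age")))
    print("node age gain = 50:", weighted_average(with_gain(TRACE, "Node age", 50)))
```

Under this reading, either turning the node age filter off or lowering its gain pulls the average below the threshold of 70. The trace alone cannot confirm the rule: scaling each total by gain/100 and dividing by the five filters that returned a total yields the same 69.5, so treat the what-if numbers as indicative only.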

Please open a new issue for any further problems.