Marking a comment as "published" should mark it as "not spam" [#1021696]

Comment	File	Size	Author
#9	spam.patch	2.06 KB	bleen
#11	spam.patch	2.06 KB	bleen
#2	spam.patch	495 bytes	bleen
#1	spam.patch	713 bytes	bleen
	spam.patch	497 bytes	bleen

Comment #1

bleen commented 10 January 2011 at 18:55

File	Size
spam.patch	713 bytes

ooops ... minor change to this patch

Comment #2

bleen commented 10 January 2011 at 18:56

File	Size
spam.patch	495 bytes

... and now without the watchdog statement :)

Comment #3

gnassar commented 10 January 2011 at 19:51

Status: Needs review » Reviewed & tested by the community

I really can't come up with an objection to this either.

Anyone else see any reason not to do this? I'll leave it up for another day or two, and if nobody objects I'll commit it.

Comment #4

AlexisWilke commented 10 January 2011 at 21:29

gnassar,

Is "Mark as not spam" supposed to cancel the effects of a "Mark as spam"? As in, a Bayesian computation reversal? I'm wondering about the side effects, although I guess what we're saying here is that the function would be called either way.

Thank you.
Alexis

Comment #5

gnassar commented 10 January 2011 at 23:07

Yes, "mark as not spam" should undo the tallies done in the Bayesian module when the content was first marked as spam.
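
To make the "undo the tallies" idea concrete, here is a minimal sketch of per-token counts being reversed. It is not the Bayesian module's code: the TokenTallies class, its method names, and the spam/ham bookkeeping are all hypothetical and only illustrate the general idea.

```php
<?php
// Hypothetical sketch: none of these names come from the Spam module.
final class TokenTallies {
  // Per-token counts, e.g. ['cheap' => ['spam' => 1, 'ham' => 0]].
  private array $counts = [];

  // Add $delta to the given class count for every token.
  private function adjust(array $tokens, string $class, int $delta): void {
    foreach ($tokens as $token) {
      $this->counts[$token] ??= ['spam' => 0, 'ham' => 0];
      $this->counts[$token][$class] += $delta;
    }
  }

  // Count every token once as spam.
  public function markAsSpam(array $tokens): void {
    $this->adjust($tokens, 'spam', 1);
  }

  // Assumed behaviour: reverse an earlier markAsSpam() for the same
  // tokens and count them as ham (not spam) instead.
  public function markAsNotSpam(array $tokens): void {
    $this->adjust($tokens, 'spam', -1);
    $this->adjust($tokens, 'ham', 1);
  }

  // Return the counts for one token, or zeros if it was never seen.
  public function get(string $token): array {
    return $this->counts[$token] ?? ['spam' => 0, 'ham' => 0];
  }
}
```

In this sketch the reversal is only correct when markAsSpam() was called first for the same tokens. Calling markAsNotSpam() on content that was never marked as spam would skew the counts.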

Comment #6

AlexisWilke commented 10 January 2011 at 23:10

Then you have my RTBC too. 8-)

Comment #7

Jeremy commented 11 January 2011 at 06:49

I agree that this makes sense, and agree it should be committed.

Comment #8

gnassar commented 11 January 2011 at 08:02

Status: Reviewed & tested by the community » Needs work

OK, there's a little bit of a problem.

This should clearly mark content as not spam only if it is currently marked as spam. Otherwise we'd accidentally be doubling up on the Bayesian training for anything that's already not spam and just unpublished. (I like the ability to train_as_spam as much as the next guy, but we sure shouldn't be doing it accidentally.)

So instead of just wanting:

  case 'publish':
    spam_mark_as_not_spam('comment', $comment->cid);
    break;

we want:

  case 'publish':
    if (spam_content_is_spam($comment, 'comment')) {
      spam_mark_as_not_spam('comment', $comment->cid);
    }
    break;

Because as spam_content_is_spam()'s comment says,

/**
 * API call to simply test if content is spam or not.  No action is taken.
 */

The problem is: that's not actually true.

spam.module: spam_content_is_spam(), lines 96-101:

  if ($score >= variable_get('spam_threshold', SPAM_DEFAULT_THRESHOLD)) {
    if ($id) {
      spam_mark_as_spam($type, $id, array('score' => $score));
    }
    $spam = 1;
  }

Obviously that should *not* be there.
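
For illustration only, here is roughly what separating the test from the action could look like. This is a sketch under assumptions, not a patch against spam.module: the example_* function names, the EXAMPLE_THRESHOLD value, and the in-memory $marked array are all made up for the example.

```php
<?php
// Hypothetical sketch: names and the threshold value are assumptions,
// not taken from spam.module.
const EXAMPLE_THRESHOLD = 80; // arbitrary value for the example

// Pure test: compares the score to the threshold and changes nothing.
function example_score_is_spam(int $score, int $threshold = EXAMPLE_THRESHOLD): bool {
  return $score >= $threshold;
}

// Explicit action: callers that want content marked must ask for it.
// $marked stands in for whatever storage really records spam status.
function example_check_and_mark(array &$marked, string $type, int $id, int $score): bool {
  $is_spam = example_score_is_spam($score);
  if ($is_spam) {
    $marked["$type:$id"] = $score;
  }
  return $is_spam;
}
```

With a split like this, a caller such as the 'publish' case could use the pure test without triggering any marking, while code that relies on the current behaviour would call the combined function.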

The problem is, it's been there for a *while*. And I have no idea what breaks if we change that.

My initial temptation is: it's a dev build for a reason; let's fix it so it's right, and then if something else breaks because of it, we fix that independently.

But I'm not so sure that's a great idea. We're very close to a new beta build -- and we really do need one, pretty desperately. So I'm thinking instead that we should just push this entire issue until a beta-1.1 rollup and then we can mess with it.

How does this sound to you guys?