I cant think of a situation where this would be a bad thing ...

CommentFileSizeAuthor
#9 spam.patch2.06 KBbleen
#11 spam.patch2.06 KBbleen
#2 spam.patch495 bytesbleen
#1 spam.patch713 bytesbleen
spam.patch497 bytesbleen
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

bleen’s picture

FileSize
713 bytes

ooops ... minor change to this patch

bleen’s picture

FileSize
495 bytes

... and now without the watchdog statement :)

gnassar’s picture

Status: Needs review » Reviewed & tested by the community

I really can't come up with objection to this either.

Anyone else see any reason not to do this? I'll leave it up for another day or two, and if nobody objects I'll commit it.

AlexisWilke’s picture

gnassar,

Is the "Mark as not spam" supposed to cancel the effects of a "Mark as spam"? As in, a Baysian computation reversal? I'm wondering of the side effects, although I guess what we're saying here is that either way the function would be called.

Thank you.
Alexis

gnassar’s picture

Yes, "mark as not spam" should undo the tallies done in the Bayesian module when the content was first marked as spam.

AlexisWilke’s picture

The you have my RTBC too. 8-)

Jeremy’s picture

I agree that this makes sense, and agree it should be committed.

gnassar’s picture

Status: Reviewed & tested by the community » Needs work

OK, there's a little bit of a problem.

This should clearly only mark as not spam if the content is currently marked as spam. (Since we'll accidentally be effectively doubling up on the Bayesian training if something's already not spam, and just unpublished. I like the ability to train_as_spam as much as the next guy, but we sure shouldn't be doing it accidentally.)

So instead of just wanting:

  case 'publish':
    spam_mark_as_not_spam('comment', $comment->cid);
    break;

we want:

  case 'publish':
    if (spam_content_is_spam($comment, 'comment') {
      spam_mark_as_not_spam('comment', $comment->cid);
    }
    break;

Because as spam_content_is_spam()'s comment says,

/**
 * API call to simply test if content is spam or not.  No action is taken.
 */

The problem is: that's not actually true.

spam.module: spam_content_is_spam(), lines 96-101:

  if ($score >= variable_get('spam_threshold', SPAM_DEFAULT_THRESHOLD)) {
    if ($id) {
      spam_mark_as_spam($type, $id, array('score' => $score));
    }
    $spam = 1;
  }

Obviously that should *not* be there.

The problem is, it's been there for a *while*. And I have no idea what breaks if we change that.

My initial temptation is: it's a dev build for a reason; let's fix it so it's right, and then if something else breaks because of it, we fix that independently.

But I'm not so sure that's a great idea. We're very close to a new beta build -- and we really do need one, pretty desperately. So I'm thinking instead that we should just push this entire issue until a beta-1.1 rollup and then we can mess with it.

How does this sound to you guys?

bleen’s picture

Status: Needs review » Needs work
FileSize
2.06 KB

I havent really tested this ... but what about this to address #8.

Since this function is currently only called in one place, why not move that bit of logic out of the is_spam function?

bleen’s picture

Status: Needs work » Needs review
bleen’s picture

FileSize
2.06 KB

the patch in #9 had some parens ou of place

gnassar’s picture

Because that function would actually also be called for action and trigger code, which we're developing.

Plus, it's its own separate op hook in the API. It should definitely be in its own function.

The real solution here, I think, is fixing the function to do what it's supposed to do.

bleen’s picture

Oh you and you're "lets do things properly" mumbo jumbo... :)