We have our Drupal site configured to require approval before comments are posted. When a new comment is submitted it appears in the comment approval queue as expected. Marking the comment as spam does not move the comment into the spam queue, it just stays in the approval queue. This appears to be a bug and a usability issue since it's not clear which comments are already marked as spam if they're just left in the approval queue.

Comments

mcarbone’s picture

I'm experiencing a similar issue. I'm finding that comment spam is appearing in the general spam admin page (admin/content/spam) but not on the comment-specific spam page (admin/content/comment/list/spam). This seems to be because the comment-specific spam page lists comments with a status field = SPAM_COMMENT (9), whereas the general spam admin page uses the spam tables. Thus, it's looking like marking things as spam via the approval queue is not setting the status field properly for comments.

danny englander’s picture

I am having the same issue. I am wondering what version of Drupal people are using who are also having this issue. I am on 6.15.

drein’s picture

I don't know if this is the same things, anyway, I have a similar problem.
Users make comments, spam module mark them as spam, I see this from the log, but I can't see these comments. The comment page and the spam general page are empty, so I can't show spam module if a comment is really spam or not.
I changed the option from allowing contents but unpublishing them, to "put spam in special queue" but it's the same, I don't see spam comments.
Drupal 6.14.

gnassar’s picture

Will be addressed after issue http://drupal.org/node/544260 is complete.

thepuzzlemaster’s picture

Would just like to add that this issue affects me as well. Would love to see a patch!

RagsToRich’s picture

Yes, this is driving me nuts. Fix will be muchas gracias!

danny englander’s picture

I am now using 6.x-1.x-dev (2010-Feb-02) and still have this issue.

chris.p.bailey’s picture

StatusFileSize
new762 bytes

Here is a workaround which redirects the spam link to the (hidden?) admin/content/spam/list page.

Doesn't move the spam out of the approval queue but makes it easier to bulk delete.

Chris.

chris.p.bailey’s picture

StatusFileSize
new3.29 KB

Update: I realised the admin/content/spam/list page doesn't provide a delete spam function.

Here is an updated patch (made against the 6.x-1.x-dev 2010-Feb-03) which adds this menu option.
Be warned, this patch seems to work on my site. But I'm not guaranteeing it won't delete all your content!

Chris.

gnassar’s picture

Status: Active » Needs review
StatusFileSize
new1.09 KB

Boy, this one took a while to track down.

Turns out the only thing that defines comments as in the approval queue is that they're unpublished -- I wish I'd known that a few hours ago! :)

Spam deals with comments marked as spam by unpublishing them -- it assumed that a published comment was not-spam, and an unpublished comment was spam.

The following patch fixes that assumption in the specific case listed above, but it'd probably be good to see if that issue is echoed anywhere else.

AlexisWilke’s picture

Ah! I see... that explains why I don't see anything in the admin/content/comment/spam/list table.

I have comments that appear as spam in many places, but their status is 1 (COMMENT_NOT_PUBLISHED) and not 9 as expected by the Spam module.

I do not think that the Core comment table status should be messed with, ever. This being said, I guess that the simplest right now. However, if other modules start doing things with the comments, they are not likely to know about the SPAM_COMMENT (9) value.

There are two ways to handle that:

1) create a spam_comment table where we save that spam status of the comment (annoyingly slow)

2) add a column to the Core comment table (much easier, a but on the hacking side)

Also it seems to me that if we have the % info about that one comment somewhere, then we can determine whether the status is "Spam Comment" by checking that % and not by testing a Core value that Core and 3rd party modules is expected to manage.

-----

Also... under admin/content/spam we are showing the list of spam_tracker, aren't we? Why limit the list to only spam? I think it would be of interest to people (like me) to see everything and mark what is considered spam and what isn't.

Is the tracker table being trimmed once in a while? If so, how do you keep track of the spam when the user changes the threshold?

Many questions, I know... 8-}

gnassar’s picture

"I do not think that the Core comment table status should be messed with, ever. This being said, I guess that the simplest right now. However, if other modules start doing things with the comments, they are not likely to know about the SPAM_COMMENT (9) value."

To be honest, Alexis, I don't either. This patch uses SPAM_COMMENT in the comment status because that's what the entire comment module is currently based on. I mentioned that the bug took me a while to track down -- your concern is exactly why. I didn't think we'd be messing with the comment's status in its own table, so I didn't think to look.

The patch fixes the problem within the context of the current module's design, and we'd probably be best to stick with that for this particular bug fix. But I am in complete agreement that we need to talk about how we handle these sorts of things in the first place, as part of a larger conversation. The time I spent digging through code to find this fix raised a lot of concerns in my mind for the current design of content modules -- especially considering as the comment module is supposed to be our exemplar for how contrib modules should design third-party content add-ons.

AlexisWilke’s picture

StatusFileSize
new1.69 KB

There is a patch that fixes the "Mark as spam" and more or less "Mark as not spam". There is a nice problem we can talk about in regard to having that SPAM_COMMENT status.

So... the hook_spamapi() function needs to change the comment status and that's why I could not see them when I was going to the list of spam comments (that list stayed empty all the time.)

The patch here (not including your own change which is good too, I think) will change the status whenever an action is applied to the comments.

Speaking of this... there is a nice side effect to changing the spam status, the approval queue does not display the comments that were marked as spam. Without changing the status directly, it would show up in there. But at this point, when you mark a comment as not spam, we do not know whether you want to publish or unpublish it. (although publishing the comment would mark the comment as not spam within the comment status, but the spam filter may still have a 99% somewhere!)

Thank you.
Alexis Wilke

AlexisWilke’s picture

Ah! On the other hand, the publish and unpublish marking the comment as not spam or spam may be a bit of a stretch (well... with publish, I agree, with unpublish, who knows whether it is spam... it could be inappropriate, wrong timing, etc. not exactly spam.)

gnassar’s picture

lol. We're thinking alike! I actually wrote nearly the exact same code right before I came up with my patch. (By the way, you'd have to change the $extra[] variables in there to $arg4[].)

Here are the reasons I decided not to do it that way, so you can tell me what you think:

1) the spam_mark_as_spam function in spam.module *already* calls spam_unpublish() after it invokes the mark_as_spam hook. So fixing the operation in unpublish solves the problem, and doing it in mark_as_spam just makes it update twice.

2) It requires a redefinition of the spamapi() function call -- which, obviously is supposed to be an API and therefore standard across usages; the comment_spamapi() call shouldn't be different from the node one, etc.

3) (really 2b, in a way) You will find that there is *no* spam content module that answers to the mark_as_spam or mark_as_not_spam hook invocations. They all answer to _publish and _unpublish. mark_as_(not_)spam is used solely (and nearly universally) for the filter modules.

That is why the function calls don't match, and why you had to change the function declaration in 2) above just to get that to work. But that actually speaks to a bigger problem -- and this is one of the "things that concerned me" that I mentioned above.

We are implementing spamapi() two different ways: one for content, and one for filters. I could understand this if they were two separate hooks, but they're not. So the "correct" fix for this is to make the API usage consistent across the board, like an API should be -- that, or have a separate content and filter API, which may be easier to code from the current codebase but probably harder both to plan and to use.

I didn't want to get into that hornet's nest, so I decided to leave it be and put API refactoring in my list of "big things" to save for the future after we've fixed all the straightforward open bug reports. (So far, the list is API refactoring, Forms 2.0 compliance, the spam/admin UI changes, making a workflow map of function calls and refactoring redundancies, and then the big move to D7. I plan to have "tasks" made for these in the issue queue at some point, though some have them already.)

So getting back to the point, _mark_as_spam is right now (for better or for worse) a filter-only API call, which is why the function declarations don't match. If we were to change that and allow its use by both filter and content modules, I'd rather just do the job right and make the API call syntax the same across the board. For now, sticking with the current "standard" and just fixing it at the publish/unpublish level solves the existing problem(s).

What do you think?

AlexisWilke’s picture

Status: Needs review » Reviewed & tested by the community

Hmmm... okay, so that's why it worked, I had your fix in (since $extra would not do much...) So I mark RTBC for your fix in #10 (forget mine!)

I see now that "mark_as_spam" implies "unpublish", which is most certainly what people want 99.9% of the time.

However, I see that "mark_as_not_spam" implies "publish" which is not so good for a node which may first require an additional review process. For comments, I would think that's fine though. But whoever wrote this made it a bit... too specific, I guess. This being said, it's already a good piece of work! 8-)

So... my point of view would be that the "mark_as_not_spam" should not imply "publish" in the content. This means we'd need another signal to the content to say that it isn't spam and "revert" the regular review process (which could be to auto-publish, I guess that's where Rules would rule!)

I guess an API documentation in the DEVELOPER.txt would be a good idea (I did not yet look at it, but that would help tremendously for anyone trying to implement their own filter or content type.)

I would see two ways to make it easier to catch:

1) rename the API operations to 'content_...' and 'filter_...' (weak solution),

2) add a new function to distinguish between the two APIs (spam_filter_invoke_api() and spam_content_invoke_api()) (strong solution)

I would think that either one would be easy enough to accomplish.

If you agree, I could create a patch for (2).

gnassar’s picture

Yes, mark_as_not_spam doesn't make it clear as to whether it should be published or unpublished. Thing is, you never know -- there are many that never use the approval queue at all (including myself, for comments), so there's no utility to "unpublished" comments. Besides, having it be marked as not spam implies that someone reviewed it in the first place -- so why wouldn't they publish it too?

On the other side of that, there are some folks (including myself, for nodes, though I imagine others could do this with any content type) that use workflows to regulate how content gets up on sites. For those, publish/unpublish status shouldn't be defined in the first place, save for how it's used in the workflow.

So, like you said, it'd be best to not touch the core content module status (publish/unpublish), now for two reasons.

I actually think the best fix for that, though, is to actually make content "modules" *real* modules, like filters are right now. Then they can have their own schema that can track the spam status of any content type in any way necessary, and we never have to worry about clobbering useful data in core content. I'm pretty sure that separate tables (aka leaving core tables alone) is the Drupal way to do this.

This will also make it much easier for third parties to contribute content modules -- filter modules right now would look like regular modules that they'd be used to, but to make a content module you would apparently have to make a funny .inc file and drop it in spam's content folder -- that's not flexible or intuitive at all.

As for having separate APIs -- I still don't know that's the best way; that means documenting both separately. And while they have areas of divergence, they also have areas of overlap.

I think I like the idea of having one spamapi, in the end -- if nodeapi can handle all node actions in one call, I'm sure we can manage it too. It just means that we have to pick and standardize on a consistent call structure. And we've never really invested the time in really defining that call structure in the first place (which is why the API isn't documented).

gnassar’s picture

Status: Reviewed & tested by the community » Fixed

Committed patch in #10.

http://drupal.org/cvs?commit=470552

AlexisWilke’s picture

gnassar,

So... would you want to change the node, comment and user in modules now? Instead of waiting any longer. Because that way it would make it a lot easier to create webform contrib... And it should not be too compilcated to accomplish.

If so, I'd create a new issue for this to be done at some point...

Thank you.
Alexis

gnassar’s picture

I'd like to do that as one of the "big things" after bug fixes (and any very necessary feature additions) are complete. For better or for worse, we're pretty familiar with the old codebase, and there are a number of very long-standing issues which our users deserve to have addressed. So I'd like to get to those first, as long as they're important and don't require large rewrites.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.