This is more or less a duplicate of #364575: Support alternative spam rejection methods than a CAPTCHA (e.g. unpublishing), but it's 1) for D6 and 2) totally simplified to the most common use-case. That other issue should focus on D7 and investigate advanced configuration scenarios, possibly integration with other modules, or not, and how to approach a user interface for them.

Here, we just want to allow: If Mollom identifies a post as spam, allow me to configure that I want to put the unpublished post into the moderation queue instead.

Quite simple, but still, quite challenging kung-fu. ;)

CommentFileSizeAuthor
#100 mollom-HEAD.bogus-feedback.100.patch20.7 KBsun
#96 mollom-DRUPAL-6--1.bogus-feedback.96.patch8.28 KBsun
#95 mollom-DRUPAL-6--1.bogus-feedback.94.patch6.85 KBsun
#93 mollom-HEAD.bogus-feedback.92.patch21.05 KBsun
#93 mollom-DRUPAL-6--1.bogus-feedback.93-d6.patch9.9 KBsun
#91 mollom-HEAD.bogus-feedback.91.patch8.69 KBsun
#89 mollom-HEAD.bogus-feedback.89.patch5.55 KBsun
#86 mollom-DRUPAL-6--1.reject.86.patch33.57 KBsun
#84 mollom-DRUPAL-6--1.reject.84.patch32.81 KBsun
#82 spam-handling.jpg180.77 KBDries
#82 spam-handling-HEAD.patch852 bytesDries
#80 mollom-HEAD.reject.80.patch36.54 KBsun
#78 mollom-HEAD.reject.78.patch36.54 KBsun
#77 mollom-HEAD.reject.77.patch36.55 KBsun
#75 mollom-HEAD.reject.75.patch36.55 KBsun
#73 mollom-HEAD.reject.73.patch34.67 KBsun
#71 mollom-HEAD.reject.70.patch33.64 KBsun
#68 mollom-HEAD.reject.68.patch33.63 KBsun
#66 mollom-HEAD.reject.66.patch33.12 KBsun
#64 mollom-HEAD.reject.64.patch33.02 KBsun
#63 mollom-HEAD.reject.62.patch32.76 KBsun
#61 mollom-HEAD.reject.60.patch32.28 KBsun
#59 mollom-HEAD.reject.59.patch31.26 KBsun
#57 mollom-HEAD.reject.56.patch33.17 KBsun
#53 mollom-HEAD.reject.53.patch31.06 KBsun
#51 mollom-HEAD.reject.51.patch31.02 KBsun
#49 mollom-HEAD.reject.48.patch32.73 KBsun
#47 mollom-HEAD.reject.47.patch20.39 KBsun
#44 mollom-HEAD.reject.43.patch16.75 KBsun
#42 mollom-HEAD.reject.42.patch13.34 KBsun
#40 mollom-HEAD.reject.40.patch13.34 KBsun
#35 mollom-HEAD.reject.34.patch12.51 KBsun
#33 mollom-HEAD.reject.33.patch10.25 KBsun
#29 mollom-DRUPAL-6--1.reject.29.patch10.97 KBsun
#24 mollom-DRUPAL-6--1.reject.24.patch17.11 KBsun
#18 mollom-DRUPAL-6--1.reject.18.patch14.07 KBsun
#17 mollom-comment-moderation.jpg204.54 KBDries
#11 mollom-DRUPAL-6--1.reject.11.patch14 KBsun
#8 mollom-DRUPAL-6--1.reject.8.patch13.4 KBsun
mollom-DRUPAL-6--1.reject.0.patch13.92 KBsun
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

sun’s picture

One of the key questions: After Mollom identified a post as spam, what happens if a client requests and checks a CAPTCHA afterwards?

1) Can the server use that additional data somehow or is it totally useless?
2) Or is it actually bad to additionally do a CAPTCHA, and the client should rather just accept the post, if it wants to?
3) Should Mollom servers get to know that we did NOT block the post on the client-side?

Status: Needs review » Needs work

The last submitted patch, mollom-DRUPAL-6--1.reject.0.patch, failed testing.

Dries’s picture

After Mollom identified a post as spam, what happens if a client requests and checks a CAPTCHA afterwards?

Could you elaborate? I'm not sure I understand the question/scenario.

The client only gets a CAPTCHA when the post is 'unsure'. The scenario you describe might not exist.

Dries’s picture

+++ mollom.admin.inc	12 Aug 2010 16:54:16 -0000
@@ -138,6 +138,28 @@ function mollom_admin_configure_form(&$f
+      // @todo Ideally, instead of a checkbox, we'd allow to define the
+      //   "treshold level" (returned 'quality'?), until which posts identified
+      //   as spam would be accepted (but unpublished), allowing site admins to
+      //   configure an individually suitable treshold. But that would require
+      //   to get a 'quality' also for CAPTCHAs.

The challenge with communicating the a score, is that it allows spammers to tweak their content until it is less spammy. They could do that on their own site, and than re-use the content on other sites. It makes more Mollom vulnerable.

+++ mollom.admin.inc	12 Aug 2010 16:54:16 -0000
@@ -138,6 +138,28 @@ function mollom_admin_configure_form(&$f
+      $form['mollom']['reject'] = array(
+        '#type' => 'checkbox',
+        '#title' => t('Unpublish spam posts instead of rejecting'),
+        '#default_value' => (int) !$mollom_form['reject'],
+      );

Shouldn't this only be applied to comments?

+++ mollom.module	12 Aug 2010 17:41:30 -0000
@@ -1353,16 +1356,25 @@ function mollom_validate_analysis(&$form
+      else {
+      }

Redundant?

+++ mollom.module	12 Aug 2010 17:41:30 -0000
@@ -1353,16 +1356,25 @@ function mollom_validate_analysis(&$form
+      if ($result['spam'] = MOLLOM_ANALYSIS_UNSURE) {

This if-test puzzles me. We're already in 'case MOLLOM_ANALYSIS_UNSURE' so the if-test seems redundant. Either way, I don't really understand this change.

Powered by Dreditor.

sun’s picture

Alright, let me clarify what this patch does:

  1. The feature asked for and the goal of this patch is to allow site administrators to override Mollom's default behavior, so as to accept spam, but do not publish it.
  2. Whether spam should be accepted (but unpublished) should be configurable per form, as it is likely that only certain form submissions should be able to slip through the regular spam blocking behavior.
  3. Implementing the option, I thought that it would be illogical to show a CAPTCHA for "unsure" submissions, but not for "spam" submissions. Compared to "unsure", I figured that "spam" can just be a slightly more confident decision, but may not be a 100% confident. If "unsure" is 50%, then "spam" can be anything between >50% and 100%. The client only gets the decision, but not the rating. When not showing a CAPTCHA for to-be-unpublished spam submissions, we would may be showing a CAPTCHA for a 50% confident rating, but not for a potential 55% rating.

    If you absolutely do not feel comfortable with showing a CAPTCHA for "spam" submissions, then we can surely remove that part from the patch.

  4. The concept of a publishing status, potentially also supported by a "moderation queue" for site content administrators, may be supported by many different entities. Just accounting for Drupal core, we already have $node->status, $comment->status, and even $user->status, so there are various different use-cases already. This option can therefore be applied to more than just comments.
  5. When modules are registering forms for Mollom, and they want to allow textual analysis of the submitted form values, then they need to specify a form element mapping, so the Mollom module is able to map certain submitted form values to its special XML-RPC data properties.

    We already have a similarly special mapping property -- 'post_id' is only used internally to let our form processing know where to find the id of an entity in the submitted form values. This patch merely advances on that by saying: If the entity, which belongs to your registered form, supports a publishing status, then you use the 'post_status' property to let us know where to find (and set) that status value in the submitted form values.

    Effectively, only forms that specify the 'post_status' property are able to accept (and unpublish) spam posts (instead of rejecting them).

You additionally asked for a "spam callback", which would allow (other) modules to alter Mollom's default decision (accept/reject) and possibly also do custom changes to the submitted form values. While that would certainly be possible, I'm not sure about concrete use-cases where it would be helpful. Overall, I think this idea rather relates to #364575: Support alternative spam rejection methods than a CAPTCHA (e.g. unpublishing)

Dries’s picture

Implementing the option, I thought that it would be illogical to show a CAPTCHA for "unsure" submissions, but not for "spam" submissions. Compared to "unsure", I figured that "spam" can just be a slightly more confident decision, but may not be a 100% confident. If "unsure" is 50%, then "spam" can be anything between >50% and 100%. The client only gets the decision, but not the rating. When not showing a CAPTCHA for to-be-unpublished spam submissions, we would may be showing a CAPTCHA for a 50% confident rating, but not for a potential 55% rating.

I don't follow this. I'd like to see the following behavior:

  • Ham: publish.
  • Unsure + correct CAPTCHA: publish.
  • Unsure + incorrect CAPTCHA: do not accept.
  • Unsure + no CAPTCHA: do not accept.
  • Spam: store in moderation queue (do not add a CAPTCHA for the time being).

I need to think more about 'post_status' -- it is not a good variable name, IMO. It feels like the code that deals with 'post_status' makes too many assumptions about how $foo->status works. What if I have a complicated workflow, and status has 5 states?

sun’s picture

Ok, let's do it without CAPTCHA then :)

It feels like the code that deals with 'post_status' makes too many assumptions about how $foo->status works. What if I have a complicated workflow, and status has 5 states?

For now, and even more so in Drupal 7, $entity->status is always what we expect here if it exists; a Boolean whether a post is published or not. Re-using or abusing $entity->status to implement workflows is still an open core discussion, which we'll hopefully focus on for D8 or D9, but also, the current discussion rather tends to an approach similar to Flag module, i.e., separate storage columns per workflow state.

Bottom line being for D6 and D7: If your entity supports the concept of a publishing status, then there must be a property which holds that Boolean. Most contributed modules simply follow the $entity->status property of Drupal core entities, which means that it has become an unofficial standard. By using the name 'post_status', we allow developers to directly associate it with the commonly used $entity->status property, just like we do for $entity->id (post_id). So we are purposively making assumptions about how those properties work.

sun’s picture

Status: Needs work » Needs review
FileSize
13.4 KB

Incorporated #6.

sun’s picture

#8: mollom-DRUPAL-6--1.reject.8.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, mollom-DRUPAL-6--1.reject.8.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
14 KB

ok, this one should pass.

sun’s picture

#11: mollom-DRUPAL-6--1.reject.11.patch queued for re-testing.

Dries’s picture

I'm still not convinced I like the post_status handling and the assumptions it makes about the status value. While it might be OK for now, it will backfire at which point we'll have to break our APIs and back out the changes. These assumptions are generally a bad idea because they are based on share coincidence -- in my mind, it doesn't really make sense to make those kind of assumptions.

For example, the next thing I'd like us to work on is more advanced moderation queue. It would be good to separate spam posts from profane posts.

+++ mollom.admin.inc	16 Aug 2010 16:33:44 -0000
@@ -138,6 +138,24 @@ function mollom_admin_configure_form(&$f
+      $form['mollom']['reject'] = array(
+        '#type' => 'checkbox',
+        '#title' => t('Unpublish spam posts instead of rejecting'),
+        '#default_value' => !$mollom_form['reject'],
+      );

Depending on the configuration, ham posts might also being moderated/unpublished. For example, I still moderate all ham posts on my blog.

This solution is a bit of a quick fix -- it doesn't provides the best user experience as there might be different kinds of unpublished posts that need different ways of interacting with them. Ideally, we'd be able to separate moderated spam posts from moderated ham posts by showing them in 2 different queues. When we show them in different queues we can also optimize the operations (e.g. show a quick 'approve' link) as well as optimize the information presented (e.g. show whether a CAPTCHA was required, whether profanity was detected).

We don't need to do that now but we'll certainly want to add this kind of functionality in the future. For any big site that gets hundreds of spam comments a day, mixed with ham comments, we can't just lump things into one queue.

Going forward, we'll also want to add features like "Automatically delete spam comments after x days". Some of our users have hundreds of spam comments a day. Having to manually delete hundreds of spam comments is a real pain.

It feels like the current approach is trying to abstract things too much whereas reality might be that we need to add comment module specific code to create the best experience.

+++ mollom.admin.inc	16 Aug 2010 16:33:44 -0000
@@ -138,6 +138,24 @@ function mollom_admin_configure_form(&$f
+      if (!isset($mollom_form['mapping']['post_status'])) {
+        $form['mollom']['reject']['#disabled'] = TRUE;
+        $form['mollom']['reject']['#description'] = t('This form does not support moderation of unpublished posts.');
+      }

If the form doesn't support it, I'd suggest we don't show the option.

+++ mollom.module	16 Aug 2010 16:34:21 -0000
@@ -1418,6 +1422,28 @@ function mollom_validate_captcha(&$form,
+ * Required, since our individual form validation handlers may not be re-run
+ * after positive validation. Thus, any changes applied to form values may not
+ * persist across multiple submission attempts/form rebuilds.

Not sure I follow this comment. It might make too many assumptions about people's understanding of how things work.

Dries’s picture

Thinking about this some more from an end-user point of view.

+++ mollom.admin.inc	16 Aug 2010 16:33:44 -0000
@@ -138,6 +138,24 @@ function mollom_admin_configure_form(&$f
+      // @todo Is this checkbox limited to spam? (what about profanity?)
+      $form['mollom']['reject'] = array(
+        '#type' => 'checkbox',
+        '#title' => t('Unpublish spam posts instead of rejecting'),
+        '#default_value' => !$mollom_form['reject'],
+      );

Assumption: most new end users expect that we store spam comments in a moderation queue. So instead of 'Unpublish spam posts instead of rejecting' maybe we should have Delete spam comments after [ x ] days and always store spam comments? Just thinking up loud, trying to look at this with a fresh pair of eyes from an end-user point of view.

sun’s picture

Thank you for reviewing and sharing your ideas, Dries!

Trying to address all issues separately:

  1. You are right that it's suboptimal to base the post_status handling on a non-standardized publishing status value. However, for now the solution works (for everything I know), and we can still add a separate "entity status callback" property (or similar) later on without having to break APIs.
  2. We are already able to separate spam posts from profane posts and ham posts for content listings, but as of now, we do not use or display the recorded information anywhere. We need to implement Views support, perhaps ideally integrate with admin_views module to not duplicate the work of replacing the regular administrative content listing pages throughout Drupal. Doing so might be relatively easy. However, this recorded information and its usage is not related to unpublishing spam posts, because, if we accept and store information about a submitted post, then we always store what we know, regardless of publishing status, CAPTCHA, or anything else.
  3. On the configuration side, it is true that this single option is the most simple solution attempt. That is, because any attempt to support

    - Unpublish if spam
    - Unpublish if unsure, after passing CAPTCHA
    - Unpublish if ham
    - Unpublish if profane

    separately will ultimately turn into a star trek trooper cockpit of options and basically means that a proper and future-proof solution attempt follows a trigger/action workflow within a form submission context:

    ( trigger ) => [ action ] => ( trigger ) => ...
    

    Implementing something like this will take some time to get right and could easily duplicate the Rules module, that's loved & hated by many. For now, we need to solve the pressing spam rejection problem for false positives at hand, and may even slightly advance on the functionality later on.

    Note that with regard to Comment module only, "Unpublish if ham" is equal to configuring comments to be unpublished/moderated by default.

  4. Regarding abstraction: The module supports to protect any form, so it would not take long until people would ask for the functionality, if it was limited to and hard-coded for comments only. Also, let's bear in mind that D7 will come with many new entities, which may or may not be used for content similar to comments. Technically, the 'post_status' mapping is really just a way to declare that an entity supports a publishing status (and where to set it). With our Mollom form hooks, we are "describing" various aspects of forms. In D7, there are almost no callbacks left, because Mollom can fully integrate with a form solely based on the registered mappings. That also simplifies Mollom integration for other modules. However, if you feel more comfortable with a callback instead of the 'post_status' mapping, then we can do that, too. Overall, I don't think we should treat any content type or entity special in any way.
  5. #access and PHPDoc will be improved in the next patch.
  6. Always unpublishing instead of rejecting, and "lazy-rejecting" after some time is a brilliant idea! Love that. Though that would likely require to have administrative content listings that take Mollom's ratings into account, so you are able to see the spam/profanity ratings...
Bojhan’s picture

Regarding #14. I do feel its common for users to expect spam to be stored somewhere, rather then rejected - this because its often how spam in terms of e-mail is treated as well from a user p.o.v. Especially with the introduction of profanity filter I see the need for such a listing and option to be important.

I do wish to note that the current indicator of spam is somewhat strange, unpublished sure - but when looking at a list of comments the unpublished background doesn't trigger the "ohh this is spam". Is there anything we can fix with that? Instead of adding a *new* adding *spam* indicator or so?

Dries’s picture

FileSize
204.54 KB

- Bojhan, that is exactly why I think we need to come up with better displays for comment moderation, and possibly different/additional status handling.

- Sun, not sure about using the Views module for that. I'd prefer not to create a dependency on Views at this point. At Acquia, we have 15.000+ Drupal 7 sites that are using Mollom -- we don't have the Views module enabled for those sites.

- Sun, I'd like to see us experiment with an alternative approach to post_status that does not rely on non-standardized publishing status values. It doesn't have to work but it would be great if we could compare and contrast code. Maybe timebox the amount of time invested to 2-3 hours or so to make sure we're not going down a rat hole.

- Here is how I think about the navigation:

mollom-comment-moderation.jpg

sun’s picture

I've reworked this patch to add a "reject callback" instead of a 'post_status' mapping. The behavior is almost identical: if a reject callback is defined for a protected form, then Mollom does not reject bad submissions, and instead leaves it to the defined reject callback to properly change the submitted form values in a way that it won't be published, resp., end up in a moderation queue.

Regardless of which way we choose, it always has to be an optional property in the form definition, because not all implementing modules support the concept of a publishing status.

However, for me, the reject callback design is pretty much code duplication. As mentioned in #5, I still think that a simple form value mapping ('post_status' or similar) is sufficient for most if not all modules in D6 and D7, and we can still add an alternative reject callback, if a module happens to need it. The other major concern I have is that a reject callback would make integration of other modules with Mollom module more complex, and we would be putting lots of (too much) power in the hands of other module authors, while Mollom's own API documentation clearly states that clients should normally reject posts that Mollom evaluates as bad. Thus, while Mollom module might ensure what exactly happens for rejected core entities, we'd be no longer able to ensure the same for other modules/entities, because they can technically do whatever they want in that callback. Compared to that, a simple form value mapping that is either supported or not, keeps the power of intended reject actions in Mollom module's hands.

We should discuss the moderation queue and other UI aspects like the cron purge in a separate issues, as those are entirely different topics having to touch other parts of the module.

P.S.: I think we need a new word, like "bad post" or "bad submission". "Spam", "profanity", etc. doesn't cut it anymore, if we mean a post that is spam or profane or both.

Status: Needs review » Needs work

The last submitted patch, mollom-DRUPAL-6--1.reject.18.patch, failed testing.

Dries’s picture

Status: Needs work » Needs review

I personally like the callback approach better so let's continue down that path for now. It gives us more flexibility and while a bit more verbose, it is actually easier to read and understand.

Here are some thoughts on the latest patch:

+++ mollom.admin.inc	30 Aug 2010 18:11:55 -0000
@@ -138,6 +138,24 @@ function mollom_admin_configure_form(&$f
+      // The negated form widget and value handling looks a bit weird, but users

It would be good if we could explain what you mean with 'negated form widget and value handling'. The code comments are great, but they assume that the reader understands what 'negated form widget and value handling' means.

Note that this became clear after reading the remainder of the code in this patch. A bit more documentation would have eliminated the initial "WTF?", but clearly it is not the end of the world. As we all know, even good code has some WTFs. ;-)

One possible way to get rid of the negated checkbox stuff is by using radio buttons instead of a single checkbox. That could actually be a good thing for the end-user as we can explain the difference a bit more carefully. In addition, it would clean up the code a little bit and eliminate the WTF. Thoughts on that?

+++ mollom.module	31 Aug 2010 10:53:22 -0000
@@ -1902,6 +1921,7 @@ function node_mollom_form_info($form_id)
+    'reject callback' => 'node_mollom_reject',

Is the reject callback only for 'spam', or for both 'spam', 'profanity', etc? Looking at the code, it only seems to be called for spam right now. Hence, I'd suggest renaming the callback 'spam reject callback'.

However, we probably want to have a callback for profanity too. In other words, maybe we should keep it as 'reject callback' but figure out how/when to call it for profanity. In that case, we might also need to reword the newly introduced setting, or add a second configuration setting.

+++ mollom.module	31 Aug 2010 10:53:22 -0000
@@ -632,6 +631,7 @@ function mollom_form_alter(&$form, &$for
+      $form['#validate'][] = 'mollom_validate_status';

I think we're really validating the 'Mollom response' so maybe instead of 'status' we should use 'response'?

Dave Reid’s picture

Would it be possible to use the trigger/action system instead? Then this could be managed by users themselves. Mollom provides a 'Comment is analyzed as spam' trigger that could be linked to the 'Unpublish comment' action. Same for nodes.

sun’s picture

@Dave: That's the amount of complexity I was specifically trying to avoid. #364575: Support alternative spam rejection methods than a CAPTCHA (e.g. unpublishing) is discussing that approach in detail already. Likely results in 100% duplication of Rules module.

Dries’s picture

Integration with Rules modules is beyond the scope of this issue, IMO. It also doesn't provide the kind of user experience that I envision for such a fundamental feature.

sun’s picture

re #20:

1) Although I can understand the idea of using radio buttons, the UX principles are pretty straight here: Two radio buttons is always a checkbox.

2) The D6 module does not implement profanity checking (yet). I don't think there is a difference between spam and profanity for reject callbacks. They can check the response in $form_state, if it is important for them.

3) I've renamed the last new form validation callback into mollom_validate_post(), as that also matches its phpDoc - the validation handler is for post-validation tasks only.

Sorry for also cramming that $GLOBALS refactoring into this patch. It's required, because we'd have to ensure that the global values equal the form state values in some more locations for this issue, so we rather want to change it in the right way directly. Working on this, I recognized that the module's global variable handling can easily break - too easily, actually. Since that global variable code hasn't changed for a long time, I need to look up what actually requires it, and how we can fix it. Separate issue.

EDIT: oh, and I also revamped lots of comments ;)

Dries’s picture

A checkbox usually means 'enabled' or 'disabled'. This situation might be a bit more subtle, and because of that, I think radio buttons could work. We're offering the user two mutually exclusive options (i.e. reject spam or unpublish spam) and we're expecting the user to select exactly one choice. While it could be combined into a single checkbox, it does not have to because it doesn't have to be formulated as 'enabled' or 'disabled'. One can argue that we dilute the two mutually exclusive options by 'cramping' them into a checkbox. That said, I'm OK with a checkbox if you feel strong about keeping it a checkbox.

Status: Needs review » Needs work

The last submitted patch, mollom-DRUPAL-6--1.reject.24.patch, failed testing.

Dries’s picture

Status: Needs work » Needs review

#24: mollom-DRUPAL-6--1.reject.24.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, mollom-DRUPAL-6--1.reject.24.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
10.97 KB

Re-rolled.

sun’s picture

Note, however, that this patch contains no tests yet and that we likely want to postpone a commit to after the next stable release.

Dries’s picture

Also, "0 pass(es)" so the green is not really green.

Status: Needs review » Needs work

The last submitted patch, mollom-DRUPAL-6--1.reject.29.patch, failed testing.

sun’s picture

Version: 6.x-1.x-dev » 7.x-1.x-dev
Status: Needs work » Needs review
FileSize
10.25 KB

Hm. So let's follow the usual procedures and do this for HEAD first; so we get proper test results.

Status: Needs review » Needs work

The last submitted patch, mollom-HEAD.reject.33.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
12.51 KB

The automated deletion of not published posts gets too complex without #909484: Add {mollom}.form_id and store originating form_id for session values, so let's do that first.

Status: Needs review » Needs work

The last submitted patch, mollom-HEAD.reject.34.patch, failed testing.

Dries’s picture

Dries’s picture

Status: Needs work » Needs review

#35: mollom-HEAD.reject.34.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, mollom-HEAD.reject.34.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
13.34 KB

So something like this could work. Didn't have time to test yet.

Status: Needs review » Needs work

The last submitted patch, mollom-HEAD.reject.40.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
13.34 KB

Sorry, merely a update number hiccup.

Status: Needs review » Needs work

The last submitted patch, mollom-HEAD.reject.42.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
16.75 KB

Status: Needs review » Needs work

The last submitted patch, mollom-HEAD.reject.43.patch, failed testing.

sun’s picture

Status: Needs work » Needs review

So the remaining challenge to resolve is this @todo:

+++ mollom.module	27 Sep 2010 16:43:09 -0000
@@ -2099,6 +2165,40 @@ function comment_mollom_form_info($form_
+/**
+ * Implements hook_comment_presave().
+ */
+function mollom_comment_presave($comment) {
+  // If an existing comment is published and we have session data stored for it,
+  // send 'ham' feedback to Mollom.
+  if (!empty($comment->cid) && $comment->status == COMMENT_PUBLISHED) {
+    if ($data = mollom_data_load('comment', $comment->cid)) {
+      _mollom_send_feedback($data->session, 'ham');
+      // Update the stored session data, so this comment is no longer deleted
+      // in upcoming cron runs.
+      // @todo Actually, we should update the stored values for 'spam' and/or
+      //   'profanity', so they no longer appear in content overviews that join
+      //   on {mollom}. However, as of now, the stored values are 1:1 the return
+      //   values of Mollom servers, and just updating them would mean that when
+      //   editing a post, we'd potentially overwrite the "good" values again.
+      //   Hence, we likely need to do both: Update the analysis values AND use
+      //   a new column to prevent existing published content from being deleted.
+      db_update('mollom')
+        ->fields(array('delete' => 0))
+        ->condition('entity', 'comment')
+        ->condition('did', $comment->cid)
+        ->execute();
+    }
+  }
+}
sun’s picture

Issue tags: +Needs tests
FileSize
20.39 KB

oh yay! This one works quite smoothly already. Now going to write some tests ;)

Status: Needs review » Needs work

The last submitted patch, mollom-HEAD.reject.47.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
32.73 KB

So this is the first actually working patch, which includes a first stab at tests. (I purposively left trailing white-space for absolutely required @todos in this patch)

The currently proposed implementation, which we should discuss, is based on the following:

  • Registered forms may opt-in for this new "delayed rejection" functionality. It has to be optional, since there are forms that have no entity associated with them, which means that posts on those forms cannot be rejected later on. For example, the Contact module forms in Drupal core. Additionally, this pattern makes only sense if posts can be moderated or published. For example, although Webform submissions are stored in the database, they do not have a publishing status, so a delayed rejection is not possible.
  • To support delayed rejection, registered forms have to:
    1. specify a 'reject callback', which basically is a form validation handler (like any other), but only invoked from Mollom's form validation handler, if the form was configured for delayed rejection and the post contains either spam or profanity. The reject callback's job is to manipulate the form values in a way that the post goes into a "moderation queue" (which most commonly means to set an "unpublished" flag for the post's 'status' value).
    2. implement a 'presave' hook or other facility (optionally on behalf of mollom.module), which makes 100% sure to call mollom_data_unreject() when an existing post is published/moderated. If this invocation fails for any reason, the post will be rejected (deleted) later on. This is the scary part of the otherwise fantastic delayed rejection idea, so let's make sure we think about ways to prevent us from incorrectly deleting valid posts.
    3. specify a 'entity delete multiple callback', which is very common for modules following the API design of Drupal 7 core. The function takes an array of entity ids to delete. Popular core examples are node_delete_multiple() and comment_delete_multiple().
  • Given all of that, what happens under the hood is:
    1. A spam or profane post is submitted.
    2. The reject callback unpublishes it, so it goes into moderation queue.
    3. If a moderator happens to publish the post (i.e., turns "bad" into "good" [yuck! we still need new terms!]), then the presave hook calls mollom_data_unreject(), which marks the post as ham/not-profane (+ ideally also sends feedback to Mollom) and clears the {mollom}.reject flag of the stored Mollom session data.

      In this case, the following last step does not occur.

    4. When cron runs, it collects the Mollom session data for all posts older than 2 weeks that are either spam or profane, and have the reject flag set. The posts get grouped by their entity, and lastly, the entity delete multiple callback is invoked with the corresponding ids.
  • As also mentioned in a @todo, mollom_data_unreject() needs a better name ;) Also, during writing this summary, I figured that this function can take over the entire Mollom session data handling and should just accept $entity and $id as parameters.
  • Another @todo is that the entity delete multiple callback is not yet validated in the registered form information; only reject callback is. Both have to exist to make delayed rejection work.

Working on this patch made me realize that our current binding on form_ids is not really perfect for all entity-kind of operations. If we would not support entity-less forms, then we could potentially re-use and integrate with parts of the existing entity API in D7. However, I think it is unlikely that we are going to change that, and given D7's limited entity API, it may also be not worth to consider integrating with it. But speaking macro-level, we are facing the challenge of mapping entity data to arbitrary forms and form data, on performing operations on either data type. While we were happy with just caring for form submissions in the past, we now face the need of having to perform actions in situations when no form might be involved.

I hope this summary helps. I'd love to clarify everything that needs clarification.

Status: Needs review » Needs work

The last submitted patch, mollom-HEAD.reject.48.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
31.02 KB

Re-rolled against HEAD.

Status: Needs review » Needs work

The last submitted patch, mollom-HEAD.reject.51.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
31.06 KB

Odd test failures.

Status: Needs review » Needs work
Issue tags: -Needs tests

The last submitted patch, mollom-HEAD.reject.53.patch, failed testing.

sun’s picture

Status: Needs work » Needs review

#53: mollom-HEAD.reject.53.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +Needs tests

The last submitted patch, mollom-HEAD.reject.53.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
33.17 KB

So it turns out that my PHP's error_reporting was bogus, so I didn't see some problems.

Dries’s picture

+++ mollom.admin.inc	6 Oct 2010 15:54:29 -0000
@@ -153,6 +155,24 @@ function mollom_admin_configure_form($fo
+          '#default_value' => $mollom_form['reject'],
+          '#description' => t('Disable to manually moderate all posts.'),
+          // Only possible for forms supporting moderation of unpublished posts.

My personal preference would be to use radio buttons. I think that could make this option more explanatory as people could properly weigh the two alternatives.

It itches me that the description is inverted; i.e. instead of 'Enable' is starts with 'Disable'. This trips people up. I had to read it twice to grok it.

+++ mollom.api.php	6 Oct 2010 15:54:29 -0000
@@ -205,6 +208,16 @@
+ * function im_mollom_reject(&$form, &$form_state) {
+ *   $form_state['values']['status'] = 0;
+ * }

I wonder if there is a better name for the hook_mollom_reject.

You have to implement the reject callback in order not to reject the post. See what I mean? One would think you implement the reject callback to reject the post.

I can't think of a better name though.

Powered by Dreditor.

sun’s picture

FileSize
31.26 KB

Changed a couple of things to use "moderation" and "moderate". The new names make much more sense now.

The last remaining potential issue I see is the administrative UI. Would love to hear some feedback on that! :)

Status: Needs review » Needs work

The last submitted patch, mollom-HEAD.reject.59.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
32.28 KB

Added mollom.sendFeedback to the fake server implementation. Also, since we're using radio buttons now, we have to make sure that a form configuration value for 'reject' is submitted.

Status: Needs review » Needs work

The last submitted patch, mollom-HEAD.reject.60.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
32.76 KB

So this one finally passes for me locally. However, I just realized that I still need to fill in some @todos in the test.

sun’s picture

Issue tags: -Needs tests
FileSize
33.02 KB

Completed the tests.

Status: Needs review » Needs work

The last submitted patch, mollom-HEAD.reject.64.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
33.12 KB

Stupid me :P

Status: Needs review » Needs work

The last submitted patch, mollom-HEAD.reject.66.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
33.63 KB

Sorry for cluttering up this issue. This one should finally pass now.

Dries’s picture

I'm still struggling with the language/terminology.

+++ mollom.admin.inc	8 Oct 2010 15:52:23 -0000
@@ -153,6 +155,24 @@ function mollom_admin_configure_form($fo
+          '#options' => array(
+            1 => t('Block bad posts'),
+            0 => t('Moderate bad posts manually'),

I'm not sure this makes it crystal clear.

The word 'block' is ambiguous and so is the word 'moderate'.

Maybe it should read more like:

- Discard all bad posts and avoid them being in the database
- Keep a copy of all bad posts for moderation or review

I'd also like to figure out a better title.

I also think that 'discard' might be a better name for the database field than 'reject'.

In the future I can actually see us nuke the 'discard' option, and to always go with the 'keep copies' option. With the right automated clean-up, this becomes a no-brainer IMO.

+++ mollom.install	8 Oct 2010 15:58:41 -0000
@@ -109,6 +109,13 @@ function mollom_schema() {
+        'description' => 'Whether the content needs to be moderated.',

The content doesn't /need/ to be moderated. Shouldn't that be /has been/ moderated?

Dries’s picture

The more I think about it, the more I'm leaning towards always keeping a copy in the database, and just adding an auto-discard feature. How do others feel about that?

sun’s picture

FileSize
33.64 KB

Tweaked the admin UI options.

re: {mollom}.moderate, the 0/1 in this column indicates whether a post needs or should go through manual moderation. An alternative would be 'needs_moderation', but I think that is too lengthy. Everything with a 1 in there will have to show up in a moderation UI, or alternatively, will be automatically deleted after a certain time; depending on what we are going to implement next :)

Dries’s picture

The UI looks perfect now. :)

+++ mollom.install	8 Oct 2010 22:08:30 -0000
@@ -174,6 +181,13 @@ function mollom_schema() {
+      'reject' => array(
+        'description' => 'Whether to reject (1) or unpublish (0) bad form submissions.',
+        'type' => 'int',
+        'size' => 'tiny',
+        'not null' => TRUE,
+        'default' => 1,
+      ),

Can we name this field to 'discard' instead of 'reject', and can we update the description to use the words 'discard' and 'retain' instead of 'reject' and 'unpublish'? For example, in the future, the contact module's mails could also be 'retained' (but that isn't necessarily 'unpublished'). Retained is more accurate or more generic.

+++ mollom.module	8 Oct 2010 22:08:30 -0000
@@ -1225,6 +1250,7 @@ function mollom_process_mollom($element,
+      'require_moderation' => FALSE,

I still think this is a poor name. They don't really 'require' moderation, do they? The comments are retained for review. Should this variable be called 'retain'?

+++ mollom.api.php	8 Oct 2010 22:08:30 -0000
@@ -276,6 +289,10 @@ function hook_mollom_form_list() {
+ *   - moderation callback: (optional) A function name to invoke when a form

I wonder if this should be called 'retention callback'.

sun’s picture

FileSize
34.67 KB

Improved the tests, but did not rename the 'reject' column yet.

Status: Needs review » Needs work

The last submitted patch, mollom-HEAD.reject.73.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
36.55 KB

Renamed 'reject' to 'discard'. Also, interesting test failure -- the admin UI does not pre-select the first available protection mode if the registered form does not define one. :)

Status: Needs review » Needs work

The last submitted patch, mollom-HEAD.reject.75.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
36.55 KB

Sorry, I forgot that reset() returns the array value and not the array key.

sun’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
36.54 KB
+++ mollom.install	9 Oct 2010 03:13:19 -0000
@@ -174,6 +181,13 @@ function mollom_schema() {
+      'discard' => array(
+        'description' => 'Whether to discard (1) or unpublish (0) bad form submissions.',

@@ -751,3 +765,27 @@ function mollom_update_7008() {
+    db_add_field('mollom_form', 'discard', array(
+      'description' => 'Whether to discard (1) or moderate (0) bad posts.',

Doesn't match.

But aside from that, I'm really happy with this patch now.

Powered by Dreditor.

Dries’s picture

We're very close!

+++ mollom.admin.inc	9 Oct 2010 13:47:15 -0000
@@ -29,7 +28,10 @@ function mollom_admin_form_list() {
+      t('!protection-mode (@discard)', array(
+        '!protection-mode' => $modes[$mollom_form['mode']],
+        '@discard' => $mollom_form['discard'] ? t('block') : t('moderate'),

We should write 'discard' and not 'block', not?

+++ mollom.module	9 Oct 2010 03:14:59 -0000
@@ -1484,6 +1533,20 @@ function mollom_validate_captcha(&$form,
+  // Unpublish a post instead of discarding it. If 'discard' is FALSE, then the

We should write 'Retain' instead of 'Unpublish'. Whether the post is unpublished is up to the module.

sun’s picture

FileSize
36.54 KB

Awesome, thanks! Additionally changed "moderate" into "retain" in the administrative form configuration overview/list, so we see either "discard" or "retain" there. "moderate" didn't make much sense in that context.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Superb! Committed to CVS HEAD. There is important follow-up work to do, but this will/should make a lot of people happy.

Dries’s picture

Category: feature » bug
Status: Fixed » Needs review
FileSize
852 bytes
180.77 KB

I'm going to re-open this issue because we didn't got it 100% right. See screenshot and attached patch.

spam-handling.jpg

Dries’s picture

Status: Needs review » Fixed

Sorry, this was a non-issue.

sun’s picture

Version: 7.x-1.x-dev » 6.x-1.x-dev
Status: Fixed » Reviewed & tested by the community
FileSize
32.81 KB

Backported to D6, and ran all tests manually.

Dries’s picture

This looks good.

Reviewing the code, I found a couple of things, none of which should hold back a commit.

Looking at the code again after a week or so, it's still clear that the 'retain' vs 'moderation' nomenclature is a bit of a WTF. More detailed feedback below.

+++ mollom.install	16 Oct 2010 16:26:08 -0000
@@ -172,6 +179,13 @@ function mollom_schema() {
+        'description' => 'Whether to discard (1) or moderate (0) bad posts.',

Should be retain instead of moderate?

+++ mollom.install	16 Oct 2010 16:26:08 -0000
@@ -594,3 +608,29 @@ function mollom_update_6116() {
+      'description' => 'Whether to discard (1) or moderate (0) bad posts.',

See above.

+++ mollom.module	16 Oct 2010 16:24:48 -0000
@@ -1667,6 +1707,20 @@ function mollom_pre_render_mollom($eleme
+  // Retain a post instead of discarding it. If 'discard' is FALSE, then the
+  // 'moderation callback' is responsible for altering $form_state in a way that
+  // the post ends up in a moderation queue. Most callbacks will only want to
+  // set or change a value in $form_state.
+  if ($form_state['mollom']['require_moderation']) {
+    $function = $form_state['mollom']['moderation callback'];
+    $function($form, $form_state);
+  }

We don't check 'discard' although the code comments make it sound like we will.

+++ tests/mollom.test	16 Oct 2010 17:04:08 -0000
@@ -2449,6 +2457,137 @@ class MollomDataTestCase extends MollomW
+class MollomAnalysisTestCase extends MollomWebTestCase {

It feels like, in a follow-up patch, we may want to move some existing tests over to this new class?

+++ tests/mollom.test	16 Oct 2010 17:04:08 -0000
@@ -2449,6 +2457,137 @@ class MollomDataTestCase extends MollomW
+  function testDiscardModeration() {

I felt like this should have been called testRetainModeration(). Update: I was once again confused by the moderate vs retain terminology. Nevermind.

+++ tests/mollom_test.info	16 Oct 2010 16:35:24 -0000
@@ -4,3 +4,4 @@ description = Testing module for Mollom 
+php = 5

What's this?

Powered by Dreditor.

sun’s picture

Fixed all of the review issues -- except for the condition that checks for 'require_moderation', since the value of 'discard' is merely the form configuration value. We can revisit that in a follow-up though.

Dries’s picture

Thanks for incorporating those suggestions. Feel free to commit this patch if the testbot is happy with it. I'm at an airport and the firewall won't let me commit it.

PS: I haven't tested the patch myself.

sun’s picture

Status: Reviewed & tested by the community » Fixed

Yes, tests are passing. As always, I think we will manually test the entire code prior to publishing the next stable release.

Thanks for reporting, reviewing, and testing! Committed to D6.

A new development snapshot will be available within the next 12 hours. This improvement will be available in the next official release.

sun’s picture

Version: 6.x-1.x-dev » 7.x-1.x-dev
Status: Fixed » Needs review
FileSize
5.55 KB

I'm still seeing bogus mollom.sendFeedback attempts in my site's logs.

Attached patch adds some extensive test coverage for new, published HAM/UNSURE comments. Actually, they should fail, but they don't want to, and I cannot explain why yet. In HEAD, that is. I'll go ahead and backport + run them on D6 now, which is where I actually see the bug. Perhaps it just doesn't exist in HEAD only.

sun’s picture

To clarify the remaining bug that we need to fix, this is the typical sequence of log messages I see in my logs:

Unsure: 'This post shows the information ...
Unsure: 'This post shows the information ...
Correct CAPTCHA ...
Ham: 'This post shows the information which ...
Error -32700 from ...
Reported 'ham' for session id ...

The Error -32700 can be ignored - looks like the new backend still bails out on 'ham' feedback.

Somehow, I think that the added tests are doing exactly the same. But I also ran them in D6 now, but don't get any failures. :(

Actually, I also understand why this happens -- because comment_save() unconditionally invokes the comment-published hook, regardless of previous comment publishing status, so our code runs for any comment that happens to be published, while we're only interested in comments that transition from unpublished to published -- but we totally need to be able to cleanly reproduce the bug in order to fix it.

Any thoughts or ideas?

sun’s picture

This is how I think we could solve this problem -- at least for D7/HEAD.

The key part is to copy ->status into ->original_status, and comparing that prior to marking anything as moderated.

Status: Needs review » Needs work

The last submitted patch, mollom-HEAD.bogus-feedback.91.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
9.9 KB
21.05 KB

Attached patch fixes the remaining issues with retaining bad posts. To ensure that we are catching the bug, I've backported the additional Comment module integration tests, and they fail now.

Dries’s picture

+++ tests/mollom.test	27 Oct 2010 23:50:49 -0000
@@ -562,6 +562,20 @@ class MollomWebTestCase extends DrupalWe
+  protected function enableTestServer() {

I think 'xxxFakeServer' would be a slightly better name. TestServer got me initially confused with the 'test mode' or 'developer mode'. You're actually using the words 'fake server' in the documentation so why not consistently use it in the API?

+++ tests/mollom.test	27 Oct 2010 23:50:49 -0000
@@ -1928,7 +1942,7 @@ class MollomCommentFormTestCase extends 
-  function testUnprotectedCommentForm() {
+  function xtestUnprotectedCommentForm() {

Oopsie. Remove the 'x'. Happens a few times in the code.

sun’s picture

Version: 7.x-1.x-dev » 6.x-1.x-dev
FileSize
6.85 KB

So we agreed on putting the patch for D7 on hold for now, as we don't know whether and how we will actually use the data yet.

For D6, we just want to remove the ->moderate flag and related functionality. Merely keeping the discard/retain option in the form configuration.

Attached patch does that. Will run tests manually now to verify.

sun’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
8.28 KB

Forgot mollom_test.module. This one passes all tests in D6.

Dries’s picture

Version: 6.x-1.x-dev » 7.x-1.x-dev
Status: Reviewed & tested by the community » Needs work

Committed to DRUPAL-6--1. Moving to Drupal 7 now...

mgriego’s picture

sub

sun’s picture

Category: bug » task
Status: Needs work » Postponed
sun’s picture

Re-rolled against Mollom and Drupal HEAD.

sun’s picture

Status: Postponed » Needs review

This patch is back on the table now, at least for D7.

@Dries needs to decide on whether we want to "revert-revert" the changes for D6 (originally added, rolled back in #96).

Status: Needs review » Needs work

The last submitted patch, mollom-HEAD.bogus-feedback.100.patch, failed testing.

sun’s picture

Category: task » feature
1kenthomas’s picture

Version: 7.x-1.x-dev » 7.x-2.0

Did this get implemented or is it still pending?

The issue I have is that giving users access to Mollom results in the ability to delete posts without an audit trail / record of the posts. That is not an authority which I usually wish to grant to any user but root (if root-- still violates the "preserve all user input" principle).

Thanks/bedankt.

1kenthomas’s picture

Version: 7.x-2.0 » 7.x-2.x-dev

Ooops, did not mean to alter version, but might as well make current -dev.

sun’s picture

This feature is part of 2.0 already. You can configure whether spam posts should be discarded (default) or retained (as unpublished). The option is only available for entities that actually support a notion of a unpublished status (which normally goes hand in hand with a moderation queue).

I need to check which parts of the patch are still to be considered, as it appears to me that some parts have been extracted and committed through other issues/patches in the meantime.

1kenthomas’s picture

Thanks again. I'll install and review.

sun’s picture

Status: Needs work » Fixed

The essentials of the follow-up patch have been committed via other issues to 2.x already.

Therefore, we can finally close this issue as fixed. :)

Thanks all!

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

  • Commit 125fafc on master, fai6, 8.x-2.x, fbajs, actions by sun:
    #881534 by sun: Changed stale instances of "moderate" into "retain".
    
    
  • Commit eba4ff0 on master, fai6, 8.x-2.x, fbajs, actions by Dries:
    - Patch #881534 by sun: allow to unpublish spam posts instead of...

  • Commit 125fafc on master, fai6, 8.x-2.x, fbajs, actions by sun:
    #881534 by sun: Changed stale instances of "moderate" into "retain".
    
    
  • Commit eba4ff0 on master, fai6, 8.x-2.x, fbajs, actions by Dries:
    - Patch #881534 by sun: allow to unpublish spam posts instead of...