Mollom on d.o implementation plan

1) create new role "bypass spam filter"
2) install mollom D6
3) grant "bypass mollom protection" to the bypass spam filter role
4) update the fix-mollomperms drush script (/home/greggles/.drush/mollomperms.drush.inc) to use this new role, and move it somewhere it can run more easily on a daily basis, such as jenkins. The role ID on g.d.o is 13, so this basically means replacing 13 with the ID of the role created in step 1.
5) Provide new text for the "Your content has triggered our spam filter and will not be accepted" message - maybe suggesting they send a mail to webmasters? If they can't get past mollom then asking them to post an issue will be frustrating...

Also:
Start working on spam.module for D7

Later:
1) switch d.o to D7
2) replace mollom by spam.module when it is ready.

Justification and background

I realize that we typically shy away from third party services, but I believe that Mollom is a fantastic fit for Drupal.org, and here's why:

1) It's actively maintained, especially compared to spam.module. It was also one of the few modules to meet the D7CX pledge.
2) Mollom has a much bigger data source than we ever will, so that presumably translates to their filters being more accurate.
3) Spam.module doesn't have an active D7 port in progress (the port to D7 issue has been sitting in "active" for 1.5 years now).
4) Mollom has very smart people working on it for many more hours than I could ever put into spam.module.
5) Adding spam.module to Drupal.org is just that much more code that we need to port every time Drupal.org upgrades. If we use Mollom, we likely will not have to worry about porting that code. Ever. (This is a huge plus in my book)
6) We had offers from sun and Dries in #1293186: Spam - meta: better spam-combating suggestions to

The reasons that I've been given for why Mollom is a no-go are:

1) Our data is in the hands of some third party
2) Drupal.org becomes reliant on a third party service
3) It can be annoying; and
4) It's not open source

Re #1) All of our comments are public anyways.
Re #2) This is a configurable behavior in the Mollom module. Worst case, we can't communicate with mollom for whatever reason and the module just starts letting things through without filtering them. This is exactly what we have now, and while it's not ideal, I'd be willing to do it on a temporary basis while we wait for Mollom to come back up.
Re #3) I'm pretty sure that this behavior has been toned down quite a bit. In fact, http://mollom.com/how-mollom-works says that only 2% of Mollom users should ever be presented with a captcha.
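
The fail-open behavior described in Re #2 can be sketched roughly as follows. This is a language-neutral illustration in Python, not the Mollom module's code; `FallbackMode`, `check_submission`, `unreachable`, and `keyword_filter` are hypothetical names invented for the example.

```python
from enum import Enum
from typing import Callable


class FallbackMode(Enum):
    ACCEPT = "accept"  # fail open: let posts through when the service is down
    BLOCK = "block"    # fail closed: refuse posts when the service is down


def check_submission(text: str,
                     classify_remote: Callable[[str], bool],
                     fallback: FallbackMode) -> bool:
    """Return True if the post should be accepted.

    classify_remote is a stand-in for the call to the third-party service:
    it returns True for spam and raises ConnectionError when unreachable.
    """
    try:
        is_spam = classify_remote(text)
    except ConnectionError:
        # Service unreachable: behave according to the configured fallback.
        return fallback is FallbackMode.ACCEPT
    return not is_spam


def unreachable(_text: str) -> bool:
    """Simulates an outage of the remote service."""
    raise ConnectionError("spam service is down")


def keyword_filter(text: str) -> bool:
    """Toy classifier used only to exercise the happy path."""
    return "cheap shoes" in text.lower()


if __name__ == "__main__":
    print(check_submission("hello", unreachable, FallbackMode.ACCEPT))
    print(check_submission("hello", unreachable, FallbackMode.BLOCK))
```

With the fail-open setting, an outage degrades the site to the unfiltered state it is in today rather than blocking legitimate posts.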

In my mind, it really boils down to idealism. Do we want to go the easy way and use Mollom? Or do we want to do this the hard way?

In my opinion, we should just install Mollom and be done with it. This is not an issue tracker or a code hosting service that we're talking about outsourcing. This is spam filtering. Mollom can do it better than we can, so I say we let them.

This issue is marked major because it's a blocker for everything else in the Spam hitlist. I definitely don't want to spend a bunch of time getting Spam.module up to snuff if we can use Mollom instead.

Please cast your votes, concerns, flames, etc.

Comments

killes@www.drop.org’s picture

I think the non-open-source issue is the one that is important here.

And yes, I am aware that we use GA, I hate it with a passion and would love to see it replaced by e.g. a hosted version of piwik.

cweagans’s picture

I don't think it's an issue though: I know that as an open source community, we should use open source software wherever we can, but only to the extent that it makes sense. Spam filtering is hard to get right.

I'd even spend some time setting up piwik to replace Analytics if we can use Mollom! =D Honestly, piwik is just as good as Google Analytics - there's no reason not to use it. But rolling our own spam solution is just that much more code we need to maintain, more time we need to spend tweaking the filters, etc. Mollom has already offered to handle this for us. We should let them!

One other point: if our spam filters are open source, won't that mean that spammers are able to easily figure out how to game the system?

s.Daniel’s picture

One of Drupal's strengths is the ability to quickly adapt to new trends within the web. With every maintenance task we take time away from the community focusing on what is really important for Drupal to succeed. (Not to say there aren’t maintenance tasks that are 100% necessary!)

Yes, theoretically we could build something more suitable than Mollom.
No, using a closed source product is not a choice we love, but in the end I think we should make a decision based on the question "What's best for Drupal and the Drupal Community?". Not "What is the theoretically best solution?".

So for me the answer is, get the job done with little effort, especially little maintenance "costs". Use Mollom or try another working solution like botcha or hidden_captcha. These are solutions we can switch on and off as we like. E.g. if there is a better solution when d.o gets the next update, let’s go for it.

webchick’s picture

On this issue, I defer to the spam fighting team on what works best for them. I also prefer 100% open source whenever possible, but regaining the time, energy, attention, and focus that spam sucks out of our community every week is something I'd be willing to compromise my ideals on. Since we already have Mollom on g.d.o it might be interesting to pull stats (if possible) between the two sites to attempt to quantify this.

I will say though that there's also an ongoing maintenance factor here in terms of the burden on our infrastructure team, and Mollom is incredibly well maintained, which eases concerns about Yet Another Module We Need To Port Between Releases™ (aka, Yet Another Module The Drupal Community (via the Drupal Association) Needs To Pay For). I pointed this out at #1293186-39: Spam - meta: better spam-combating suggestions, which was in December, and Spam module still doesn't have a Drupal 7 version and usage is still flat-lined. :\

It was pointed out at that time that we could just have nice spam-fighting features for the duration of the 6 release and that would be sufficient, but the sunset of Drupal.org on D6 is visible on the horizon (October 1 was communicated at the last DA board meeting), so dumping a bunch of energy into a D6-only solution at this stage seems foolhardy to me.

WorldFallz’s picture

"I also prefer 100% open source whenever possible, but regaining the time, energy, attention, and focus that spam sucks out of our community every week is something I'd be willing to compromise my ideals on."

I've come to agree with webchick 100% here. The mere fact that the spam problem has lasted this long in spite of the recent and incessant onslaught of spam is proof enough we shouldn't be taking on yet another large and labor intensive functionality that will have to be maintained by the drupal.org team. Keeping up with spam is a full time job-- as evidenced by the existence of companies like mollom, akismet, defensio, etc. It's also an irritatingly wasteful one (who wants to waste precious volunteer time on <profanity of choice> spammers).

Also, there's nothing permanent about using mollom-- there's nothing to prevent us from selecting a different open source method, should one that does exactly what we need and is also well maintained become evident, at some point in the future.

killes@www.drop.org’s picture

In my opinion, nothing is as permanent as a decision that has been made once.

greggles’s picture

"In my opinion, nothing is as permanent as a decision that has been made once."

Hardly.

We had Xapian or Sphinx for a while. It didn't work. We dropped it. We have google analytics. It works well. We keep using it. I don't have a precise memory, but the D6 upgrade involved the removal of several Drupal.org-only modules like http://drupal.org/project/feature

We can change our minds on things if they stop working or a better solution comes along.

As webchick said, it's important that this decision meets the needs of a lot of stakeholders, and perhaps the most important group is those fighting spam and those building tools to fight spam on this site. We've now had a period of several months with at least 3 different people working to create new tools to fight spam inside Drupal and they have all stopped. Enabling mollom is a 10 minute operation and it will then save hours of volunteer time within a week. That's one hell of a return on investment. I think the people (namely, killes) who are opposed to Mollom need to implement a solution soon or let the Mollom solution move forward.

On Mollom in general as g.d.o maintainer

Mollom has been a benefit and a pain for us. It definitely keeps spam content to a minimum and that is great. On the other hand, we get periodic complaints (maybe once every other week) that it is blocking legitimate users. For a while we got those complaints more often (maybe three or more times a week). We worked with the mollom team to send them session information and they made tweaks and that helped. The biggest help to reduce that problem was that we created a role that has "bypass mollom" permission and we grant that to people whose accounts match some criteria that a spammer is unlikely to match. That's a drush command we run periodically and we could obviously run it on d.o as well.

Somewhat obvious disclosure: I do work at Acquia which has close ties to Mollom.

cweagans’s picture

I think that simply installing mollom is not enough. We should use the same criteria that gdo uses for marking accounts as "trusted". This way, Mollom is only going to be used for new users or users who haven't really done anything on the site (or whatever the criteria specifies). I would also be in favor of adding a string override for the "Your content has triggered our spam filter and will not be accepted" message to add a link to post a new issue in the webmaster queue requesting the trusted user role. It should be quite simple to add the trusted user role to somebody manually to prevent that problem from happening again.

I can't really see a spammer spending the time to post such an issue and take the time to ensure that they get the role.

silverwing’s picture

+1 for mollom (from someone, as greggles mentioned, who gave up on spam.module). I'm getting really annoyed that we aren't using a tool that actually works to deal with this problem.

I would make sure any role we have (except for the standard git access one) gets to bypass mollom.

@#8 - "I can't really see a spammer spending the time to post such an issue" - I can.

cweagans’s picture

Maybe we could require them to post their message text as part of the issue?

dddave’s picture

I have said it in various issues but I strongly oppose the deployment of Mollom unless it has made spectacular progress dealing with the following:

1) English input written by non-native speakers. I consider my English pretty solid but my comments get shot down by Mollom constantly (hurray if they get into moderation). My end-user experience of Mollom is pretty poor.
If my posts are already problematic, imagine what will happen to posts from Asians, especially Chinese folks (or morten.dk).

2) Mollom also has problems when you use loose language or "create" words. At least that is what I am experiencing a lot.

But: As long as normal, logged-in users over a certain threshold (membership, number of posts) don't get hassled by Mollom (as suggested by silverwing), I can live with it. Let us make sure that established users never face it. Thanks!

killes@www.drop.org’s picture

So, I suggest that we proceed as follows:

1) install mollom D6
2) Work on spam.module for D7
3) switch d.o to D7
4) replace mollom by spam.module when it is ready.

This saves us from making sure D6 spam works for d.o and gives us greater liberty to rework it for D7 as needed.

tvn’s picture

The plan in #12 sounds good. It is not really the best time to work on anything D6 right now, especially on quite complicated things. That energy better be spent elsewhere.

Same as most everyone in this issue I do prefer open source solutions, but sometimes, as webchick stated, you just have to compromise. What also makes it easier for me to "+1" Mollom on d.o is the fact that it's not just a random closed source product developed by a random company: it was created by Dries, and various other community members are working on it.

Oh and +100 to replace GA with piwik.

WorldFallz’s picture

yep, agreed-- #12 seems a reasonable plan, allows us to do something *now*, while still keeping options open for the d7 drupal.org upgrade.

greggles’s picture

I updated the summary with an implementation plan that takes #12 and incorporates the steps necessary to let trusted users bypass the whole system.

The queries in that drush command may need tuning and I'm willing to do that.

greggles’s picture

Issue summary: View changes

plan

sun’s picture

Note that I intend to do a new 6.x-2.x release in the next days, which will include some fixes and improvements.

I don't know what the drush script is doing exactly - the script was mentioned both here and in #1298402: Automatically grant "bypass mollom" privilege to "trusted" users.. Given its business logic, I could try to implement that as a built-in feature for the Mollom module, presuming it's not doing too crazy things and if there is a way to make it generic. (If all fails, we could also incorporate the drush command only.)

Furthermore, I wonder whether #717874: Provide exportables for Mollom forms is of any interest for drupal.org?

cweagans’s picture

So it sounds like we're more or less in agreement about installing Mollom, at least on a temporary basis. To be absolutely clear, are we talking about installing Mollom on D6, upgrading Mollom as part of the D7 upgrade, and then eventually replacing it with Spam.module?

If that's the case, then I think there's a few things that need to happen:

1) This issue should be RTBC
2) When the new Mollom 6.x-2.x release is done, this issue needs a "needs drupal.org deployment" tag
3) Somebody needs to get the Mollom API keys on behalf of Drupal.org, install mollom, plug in the keys, and configure the module appropriately.

If somebody can provide the API keys, I'm willing to set this up on a dev site and get Mollom configured to facilitate easy deployment.

webchick’s picture

" are we talking about installing Mollom on D6, upgrading Mollom as part of the D7 upgrade, and then eventually replace it with Spam.module?"

Yes. We do not need more blockers in front of the Drupal.org D7 upgrade. :P So Spam module would be a post-upgrade task.

I'll ping Dries about the Mollom API keys.

Dries’s picture

@cweagans: I e-mailed you some keys for d.o

cweagans’s picture

Assigned: Unassigned » cweagans

Thanks, I got 'em. I'll try to spend some time on this tomorrow.

cweagans’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: +needs drupal.org deployment

Mollom deployment steps:

- Download and install the latest Mollom release
- Create a new role: "bypass spam filter"
- Grant "bypass mollom protection" permission to new role
- Grant "access mollom statistics" to site maintainer role
- Run this query:

INSERT INTO mollom_form (form_id, mode, checks, discard, moderation, enabled_fields, strictness, module) VALUES
('comment_form', 2, 'a:1:{i:0;s:4:"spam";}', 0, 0, 'a:1:{i:0;s:7:"comment";}', 'normal', 'comment'),
('contact_mail_user', 2, 'a:1:{i:0;s:4:"spam";}', 1, 0, 'a:2:{i:0;s:7:"subject";i:1;s:7:"message";}', 'normal', 'contact'),
('forum_node_form', 2, 'a:1:{i:0;s:4:"spam";}', 0, 0, 'a:2:{i:0;s:5:"title";i:1;s:4:"body";}', 'normal', 'node');

- Run these drush commands (you can get the public/private mollom keys from the spam dev site on devwww: drush vget mollom ):

drush vset mollom_public_key "KEY FROM THE SPAM DEV SITE"
drush vset mollom_private_key "KEY FROM THE SPAM DEV SITE"
drush vset mollom_fallback 1
drush vset mollom_moderation_redirect 0
drush vset mollom_privacy_link 1
drush vset mollom_testing_mode 0

- Visit admin/settings/mollom/settings to ensure that communication with the Mollom servers is working correctly (which will set the "mollom_status" variable)

This configuration protects the comment form, the personal contact forms, and the forum topic node forms, which are the biggest spam candidates at this point. The last blocker is to get greggles' drush command working.
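
The `checks` and `enabled_fields` values in the query above are stored in PHP's serialize() format, which is hard to read at a glance. As a rough aid for reviewing them, here is a small Python sketch that decodes just the three type markers that appear in those values. `parse_php_value` and `php_list` are hypothetical helper names, not part of Mollom or Drupal, and the parser deliberately ignores every other serialize() type.

```python
def parse_php_value(data: str, pos: int = 0):
    """Parse one value in PHP serialize() format, starting at pos.

    Handles only integers (i), strings (s) and arrays (a).
    Returns (value, next_position). String lengths are counted in
    characters, which matches PHP's byte counts only for ASCII input.
    """
    kind = data[pos]
    if kind == "i":                      # i:<int>;
        end = data.index(";", pos)
        return int(data[pos + 2:end]), end + 1
    if kind == "s":                      # s:<len>:"<chars>";
        colon = data.index(":", pos + 2)
        length = int(data[pos + 2:colon])
        start = colon + 2                # skip the colon and opening quote
        return data[start:start + length], start + length + 2
    if kind == "a":                      # a:<count>:{<key><value>...}
        colon = data.index(":", pos + 2)
        count = int(data[pos + 2:colon])
        pos = colon + 2                  # skip the colon and opening brace
        items = {}
        for _ in range(count):
            key, pos = parse_php_value(data, pos)
            items[key], pos = parse_php_value(data, pos)
        return items, pos + 1            # skip the closing brace
    raise ValueError(f"unsupported type marker {kind!r} at {pos}")


def php_list(serialized: str) -> list:
    """Decode a serialized array and return its values in key order."""
    value, end = parse_php_value(serialized)
    if end != len(serialized):
        raise ValueError("trailing data after serialized value")
    return [value[key] for key in sorted(value)]


# (checks, enabled_fields) copied from the INSERT statement above.
FORMS = {
    "comment_form": ('a:1:{i:0;s:4:"spam";}', 'a:1:{i:0;s:7:"comment";}'),
    "contact_mail_user": ('a:1:{i:0;s:4:"spam";}',
                          'a:2:{i:0;s:7:"subject";i:1;s:7:"message";}'),
    "forum_node_form": ('a:1:{i:0;s:4:"spam";}',
                        'a:2:{i:0;s:5:"title";i:1;s:4:"body";}'),
}

if __name__ == "__main__":
    for form_id, (checks, fields) in FORMS.items():
        print(form_id, php_list(checks), php_list(fields))
```

The numeric `mode` column is left undecoded here because its meaning is not spelled out in this thread.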

greggles, do you want to work with somebody to get your drush command somewhere that both gdo and Drupal.org can use it? If not, can you post the drush command somewhere (or email it to me if it's super secret) so that I can do so?

We don't need to worry about the spam message because users won't see it in this configuration. The comment form and the forum node form are configured to retain posts for manual moderation (see http://drupalcode.org/project/mollom.git/blob/refs/heads/6.x-2.x:/mollom... - that message doesn't get displayed unless the form is set to "discard" mode).

cweagans’s picture

Status: Reviewed & tested by the community » Needs review
Issue tags: -needs drupal.org deployment

Oopsie.

greggles’s picture

The drush command is now in my home directory on util.drupal.org - I think you can read that, no?

cweagans’s picture

I actually don't have access to util :(

drumm’s picture

We should add the Drush command to drupalorg or drupalorg_crosssite so it is public and version controlled along with our other custom code. It looks like it might be nice as something generic, either in Mollom or somewhere else, but that can wait. If the criteria for bypassing Mollom need to be private, let's move that into configuration; which will be nice for changing them.

klonos’s picture

I realize that perhaps I'm jumping in too late here, but I really need to clarify a couple of things. If we go this way...

- install Mollom 6.x on d.o
- upgrade to Mollom 7.x once d.o goes D7
- replace Mollom 7.x with spam 7.x once it's ready

I seriously doubt that any serious work on a 7.x version of the Spam module will be done soon(ish) unless there is need for it. Having Mollom in place reduces the chances for this need to occur, while going this way increases them:

- install spam 6.x in d.o
- work on a 7.x version of spam
- upgrade d.o to D7 once all modules (including spam) are D7-ready

...plus if we go the Mollom way, I don't see usage stats of the Spam module going up anytime soon. I'm sure that this will constantly be giving excuses to people to present that fact as another reason to not replace Mollom 7.x with Spam 7.x even when there is a working D7 version available.

Adding Mollom to d.o provides free marketing for it (some might claim it doesn't need it - I won't argue). The fact is that it will bring more people to Mollom and thus potentially take and keep them away from the Spam module (that AFAIK is the most fitting open-source counterpart of Mollom). Some of these people might even be developers willing to help with maintaining the Spam module. The small user base of Spam + the fact that it was rejected in favor of Mollom might put them off.

I guess what I'm trying to say is that I hate to say "Told you so!", but I'm afraid that even when a D7 version of Spam is made available and we propose to replace Mollom with it, we'll see replies like "Why try fixing something that already works?", "We are working on getting exciting new features in D8 right now - there's no manpower to work on this", "Mollom has a larger user base than Spam" etc.

In the meantime, we'll be providing a free "testing ground" for Mollom to evolve as it handles our spam. They will be using this data/experience gained to improve their closed-source services and better serve their customers. Once (if ever) replaced by Spam module, will we at least get the chance to train our solution from Mollom data in return? I seriously doubt it (since it's closed-source/commercial and all).

So, resisting closed-source is not merely an ethical stance that we stubbornly insist on taking as stated above in certain comments - there are serious practical reasons behind all these arguments/concerns we present.

The argument that spam.module doesn't have a D7 port and thus shouldn't be considered for d.o is ridiculous IMNSHO. On the contrary, as I've already said, getting it deployed on d.o will definitely "push" for a faster port to D7. Anyways, is d.o running on D7 as we speak? We waited for more than 18 months for it to be upgraded to D7. What's another month or so? Or do you believe that if people were actually working on the Spam D7 port it would take a considerably longer time than that?

...from Cameron's comment in #1690134-5: Allow the URL filter to check against a whitelist:

... Spam.module is not a complicated beast, so it really shouldn't take too long to port. ...

Same goes for the claim that the benefit of deploying Mollom on d.o will be greater for the community. We've been having the spam issues for years now. What's another month or so? Pushing for a 7.x version of Spam (since it will be required for the D7 upgrade of d.o) will be what will greatly benefit the community I say.

Lastly, as far as it goes for:

- taking too long to upgrade d.o to D7
- the need to use solutions that "work now"
- the fact that crafting our own ones is a hard task to undertake
- the fact that using 3rd party solutions removes maintenance burden

...I didn't see any proposals for switching to Trac or Bugzilla for example instead of undertaking the whole (huge) project* D7 porting task. Nope, we decided to stick to our custom solution and work on it no matter how complicated it was. One of the main reasons was that the community would benefit from this (either by the resulting 7.x version of project/project_issue or by the things we learn during the process).

Anyways, as I said I'm jumping here a bit too late I guess :/

PS/BTW: +1 on replacing GA with Piwik too! I've been using it in all of my installations when there's need for web analytics and I've never had complaints from end customers nor any requests to replace it with GA. Is there an issue filed for this?

cweagans’s picture

There's a lot of speculation there. The bottom line is that Mollom is an easy, quick fix to a big problem. As stated in the issue summary, this is nowhere close to outsourcing our issue tracker: this is spam filtering. Project/issue tracking is relatively easy. Spam fighting is not, as evidenced by the fact that we've had three different people pick up the torch for a better spam fighting solution over the last year or so, and we still don't have anything that works.

We are absolutely not going to delay the Drupal 7 upgrade for a spam.module upgrade. I'm going to continue working on spam.module when I have the time, but that seems to be in short supply these days, so progress will be slow.

I'm afraid that even when a D7 version of Spam is made available and we propose to replace Mollom with it, we'll see replies like "Why try fixing something that already works?", "We are working on getting exciting new features in D8 right now - there's no manpower to work on this", "Mollom has a larger user base than Spam" etc.

Sorry, but this is FUD. When/if those comments happen, we can say: "Well, replacing Mollom was always part of the plan, per #1694494: Install Mollom on Drupal.org. The rest of the infra team signed off on it, and it makes sense for us to do it".

If there's a release of Spam.module for Drupal 7 in the next month or two, then we can stick to the plan and deploy it once Drupal.org is on Drupal 7. Installing Mollom is a 10 minute operation and is a good band-aid while we get a real solution for D7 in place.

In short: nobody is saying that Mollom is the "optimal" solution, but it's one that works right away that will allow contributors to stop wasting time on dealing with spammers.

WorldFallz’s picture

"We've been having the spam issues for years now. What's another month or so?"

wow... thanks for being so generous and thoughtful about the time of those that actually waste precious volunteer time keeping this site clean of spam. Time wasted on spam is time taken away from other, meaningful, drupal activities.

So please, unless you're planning to take over spam patrol in the meantime (what's a month or so, right?), please don't throw a hand grenade into the proposed solution.

klonos’s picture

"Sorry, but this is FUD ..."

Yes it is! That's exactly what I feel about this and that's not good. That (coupled with the fact that #1378456: Install Spam module on drupal.org was wontfixed instead of postponed) is why I wrote my thoughts in the first place and I'm still not at ease, but I'll live with it.

"wow... thanks for being so generous and thoughtful about the time of those that actually waste precious volunteer time keeping this site clean of spam."

I honestly don't see how this conclusion can come out of my comment when clearly all I say is that I believe we can all tolerate spam (that is going to be removed eventually anyways) for another month or so if we knew that a solid anti-spam solution was being worked on. As an example, we still see new "subscribe" comments even today. I personally don't go crazy about it - I simply do what most do (for years now) ...ignore them and read on. Similarly we'd just have to do the same with spam.

"So please, unless you're planning to take over spam patrol in the meantime..."

I am willing to do it - even for 4 months till the end of this year (provided we switch our efforts towards getting a 7.x version of Spam instead of using Mollom of course). Where do I apply for it and who do I talk to about the specifics?

cweagans’s picture

"coupled with the fact that #1378456: Install Spam module on drupal.org was wontfixed instead of postponed"

That was before the testing issue was reopened, which is part of the reason that I wanted to try to consolidate the spam issues earlier this year. Maintaining multiple issues for one task is not a fun task.

We can install Mollom for now, and then klonos can help me with the Spam module port using the immense amount of time he'd spend hunting spam. That seems like a better allocation of resources to me. I am unwilling to agree on any solution that involves simply tolerating more spam for any length of time, especially when Mollom deployment is so mind-numbingly simple and will free up time for other people to work on Spam.module.

klonos, some of the responses to your posts may come across as fairly combative, but I believe that's because you really don't understand the gravity of what you're suggesting. #1382008: Ongoing Vietnamese forum spam reports is full of people spending WAY too much time on spam and it's awful. This is why we're not willing to just wait until spam.module is ready. We need something *now*.

silverwing’s picture

To a lot of people it may not even seem like spam is a problem here. But the *vast* majority of spam we delete isn't reported in issues, and is the Obvious Spam that any half-decent filter would catch. (I just deleted 10 pieces of spam that wouldn't have gotten through Mollom/spam.module. And a few days ago I got about 600 pieces of spam.)

klonos’s picture

I underestimate neither the huge volume of spam that people currently clean up on d.o nor the time I'll have to spend on the task should I be approved to join the spam squad. My offer still stands (ping me at #1378456: Install Spam module on drupal.org), but we have to choose to go the Spam module way. It is a solution that works *now* for D6.

IMHO, one of the main reasons why we get so much of that Vietnamese spam is because we currently have no anti-spam measures in place - we rely on hard, time-consuming, manual work by volunteers. I've had pretty good results using the Spam module in D6 (though I admit that none of my projects was a high-traffic site). Sure, we'll have to give the Bayesian filter some time till it starts working for us, but I'm confident it'll soon start having results. So, when I said we'll all have to tolerate spam, I meant only what would bypass the filters during the filter training and tweaking period - by no means did I imply to leave d.o to its fate. The only thing I implied that people could object to I guess was holding back the d.o upgrade to D7 till spam 7.x is ready. I'll take the heat for that one.
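
To illustrate why a Bayesian filter needs a training period before it is useful, here is a minimal naive Bayes sketch in Python. It is not Spam module's implementation; the class name `TinyBayesFilter`, the tokenizer, and the sample messages are all assumptions made for this example.

```python
import math
import re
from collections import Counter


class TinyBayesFilter:
    """Word-level naive Bayes classifier with add-one smoothing."""

    def __init__(self) -> None:
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text: str) -> list:
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text: str, label: str) -> None:
        """Record one moderated post; label is 'spam' or 'ham'."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(self.tokenize(text))

    def spam_probability(self, text: str) -> float:
        total_docs = sum(self.doc_counts.values())
        if total_docs == 0:
            return 0.5  # untrained: no evidence either way
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed class prior, so an unseen class never yields log(0).
            prior = (self.doc_counts[label] + 1) / (total_docs + 2)
            score = math.log(prior)
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into P(spam | text), clamped to avoid overflow.
        diff = max(min(log_score["ham"] - log_score["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


if __name__ == "__main__":
    spam_filter = TinyBayesFilter()
    print(spam_filter.spam_probability("cheap wedding dress"))  # untrained
    spam_filter.train("cheap wedding dress sale", "spam")
    spam_filter.train("patch needs review for views module", "ham")
    print(spam_filter.spam_probability("cheap wedding dress"))  # after training
```

Before any posts have been labeled the filter can only return a neutral score, which is the "training and tweaking period" during which some spam would still get through.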

cweagans’s picture

"we have to choose to go the Spam module way. It is a solution that works *now* for D6."

No, we don't and we shouldn't at this point. Spam.module works for Drupal 6 right now, but not Drupal 7. Deploying anything to Drupal.org right now requires that there is an upgrade path for Drupal 7. Spam.module does not fit that requirement. Mollom does.

Look, I like spam.module a LOT more than Mollom. We're on the same page, believe me. But we can't deploy it right now because it doesn't have a D7 port. I opened this issue because I don't believe that I'll have time to port it before d.o is running on D7 and we need a solution yesterday.

This solution has been +1'd by ~10 people in this thread. I realize you're against using a closed source solution, but as I said, we need a solution that works now while we get a better one in place. On top of that, you have not been a part of the countless discussions that have taken place about this issue on IRC, in person, etc, so you coming in at the last moment when pretty much everyone is in agreement about our game plan is kind of aggravating.

So I ask you: please put your efforts toward a D7 port of Spam.module and not derailing this issue any further. I'd be happy to spend some time helping you understand how the module is built, creating issues, etc., but arguing about a solution that will work until spam.module is ready is pointless.

klonos’s picture

Yes, I understand all this (see how my first post in this issue starts with "I realize that perhaps I'm jumping in too late here..."). My intent was not to derail this issue or attempt to revert the decision already taken. I just didn't want to let some disturbing claims that a closed source solution is better than what we have in contrib go unanswered. Furthermore, I wanted to make sure that we had clear statements that Mollom was chosen as a temporary solution. That's all.

...other than that, fair enough!

Heine’s picture

Status: Needs review » Needs work

I believe it is more appropriate to mark this as 'needs work', per #26. Sun's also asked for the script to see if the mollom module could contain this functionality. This would remove yet another burden from the d.o. maintainers.

That said, the spam situation is getting worse for the webmasters. I used to remove a bunch of spam during my morning coffee, now it's nearly every time I hit /tracker that I have to remove something.

To illustrate: The day before leaving for Drupalcon, I spent nearly 1.5 hours _continuously_ deleting spam posts and blocking users. I'm near the breaking point.

greggles’s picture

I don't like the idea in #26 nor Sun's proposal because it would make what is a very simple level of protection public - my belief is that spammers will start to abuse the criteria used so that they can bypass it.

That said, if making it public is a blocker to getting this deployed, then I don't see keeping it private as critical. Anyone who has access to util.drupal.org can now read /tmp/mollomperms.drush.inc and is welcome to make the script public.

webchick’s picture

Without giving away the secret sauce entirely, the way to move what /tmp/mollomperms.drush.inc does to configuration would be for Mollom module to expose a number of criteria (most likely a hook for this) for modules to inject a configurable amount of _somethings_, above which we were fairly sure the person was legit.

For example:

- Account more than X days old
- More than X nodes/comments on the site
- More than X user points
- More than X "up votes" on their content

and of course:

- Role with the "bypass mollom" permission.

Then the values for X + the role could be encoded in the $conf array, and we could keep our special sauce secret by keeping private exactly which criteria, and how many of each of them, factored into the role assignment.
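A minimal sketch of the criteria check described above, written in Python for illustration rather than as Drupal code. The names `BypassThresholds`, `UserStats`, and `qualifies_for_bypass`, the threshold numbers, and the rule that a user must meet a minimum number of criteria are all assumptions; the thread does not specify any of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BypassThresholds:
    """Hypothetical private settings (the "values for X"); numbers are made up."""
    min_account_age_days: int = 90
    min_posts: int = 10
    min_user_points: int = 50
    min_up_votes: int = 5
    min_criteria_met: int = 2  # assumed rule: how many criteria must pass


@dataclass(frozen=True)
class UserStats:
    """Hypothetical per-user figures the criteria are checked against."""
    account_age_days: int
    posts: int
    user_points: int
    up_votes: int
    has_bypass_role: bool = False


def qualifies_for_bypass(user: UserStats, cfg: BypassThresholds) -> bool:
    """Return True if the user should skip the spam filter."""
    # A role that already carries the bypass permission always wins.
    if user.has_bypass_role:
        return True
    # The list says "more than X", so the comparisons are strict.
    checks = [
        user.account_age_days > cfg.min_account_age_days,
        user.posts > cfg.min_posts,
        user.user_points > cfg.min_user_points,
        user.up_votes > cfg.min_up_votes,
    ]
    return sum(checks) >= cfg.min_criteria_met
```

Keeping the thresholds in one settings object mirrors the idea of storing them in private configuration: the checking logic can be public while the actual numbers stay out of sight.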

Given Heine's statement in #36 though, I would heavily recommend that we deploy this more or less immediately, and make this abstraction of bypassification a second step.

WorldFallz’s picture

Yeah, it definitely appears to be getting worse, though only from 2 sources. There are the occasional wedding dress, Christian Louboutin, and running shoe spams that occur 1 at a time (or at most, a handful). The barrages of hundreds of posts come from the Vietnam spammers (there's another issue already tracking those) and sports streaming spam from Bangladesh.

I would think there would be a way to teach the spam algorithm to weigh those more heavily but I don't know enough about it to say for sure.

catch’s picture

- Account more than X days old
- More than X nodes/comments on the site
- More than X user points
- More than X "up votes" on their content

I've done a couple of these for clients with custom code. Things to be careful about:

- make sure you only count published nodes/comments
- make sure that either userpoints are moderated, or they're only awarded for published nodes/comments (the latter isn't an option with the userpoints_nc module but probably rules would do it, or it's a tiny change to use hook_comment_publish() in your own implementation).
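The first caution can be sketched as follows, in Python for illustration rather than as Drupal code. The `Post` record and the `count_published` helper are hypothetical names; the point is only that unpublished (for example, spam-moderated) content must not count toward a trust threshold.

```python
from typing import Iterable, NamedTuple


class Post(NamedTuple):
    """Hypothetical stand-in for a node or comment."""
    author_id: int
    published: bool


def count_published(posts: Iterable[Post], author_id: int) -> int:
    """Count only the published posts written by one author."""
    return sum(1 for p in posts if p.author_id == author_id and p.published)
```

Counting every post regardless of status would let a spammer reach the threshold with content that had already been unpublished by moderators.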

Heine’s picture

This may be a bit less urgent because of #1759272: Test honeypot module on http://drupal.org . Not sure if it's psychological, but the only spam deletion run I did was early this morning.

catch’s picture

Issue summary: View changes

small modification regarding spam module work.

mvc’s picture

My 2¢: I personally am fine with using Mollom together with honeypot to filter public data, with the caveat that it has serious accessibility issues that need resolving (see #273964: Accessibility problems with audio CAPTCHA). On the plus side, an Acquia employee has been making excellent progress on that issue lately. Once that's resolved and been properly tested it will be a much better fit for *.d.o.

geerlingguy’s picture

I also think Mollom includes a honeypot field as well (but no time-based protection), so there's that to consider.

dddave’s picture

I still oppose the deployment of Mollom as it will hound any non-native speakers. Has anyone cared to try Mollom with Indian commenters, for example?
Additionally, Mollom on gdo is a pretty big fail.

dddave’s picture

Could we please set this to won't fix? The performance of Mollom on gdo is an embarrassment and all the "fixes" they deploy continue to fail.

joshuami’s picture

We are going to start testing Mollom for users below the role of trusted (not a spammer). No data will be rejected to start, but we will be looking at the logs to see what sort of configuration could work for Drupal.org.

dddave’s picture

I highly recommend investigating in particular the impact on non-native speakers and even Indian community members. I don't know how much Mollom has evolved, but not long ago it was hounding anybody whose English was less than picture-perfect school English.

webchick’s picture

On Drupal.org that should not be an issue, since we don't support non-English content here (this is even in the terms of service). While nothing stops you from writing in Arabic in issue comments or forum posts or whatever, no one else will be able to communicate with you. ;)

greggles’s picture

re #48: I think dddave's point was about atypical English word choice and sentence structure that result from a non-native speaker writing in English. I don't have that experience but I can imagine it happening for sure.

FWIW, comment #45 mentions that g.d.o's use of mollom has been frustrating. I believe that for the past 6 months it has worked fairly well (I believe as a result of a mix of updates to the module and configuration changes).

The approach of logging first makes a ton of sense to me.

dddave’s picture

Yup Greggles, that's exactly what I meant.
And during the last months Mollom indeed worked much better on gdo after a long time of essentially being a constant failure. That is encouraging.

geerlingguy’s picture

Just a quick Q: has Honeypot been less effective recently? According to Heine's comments a couple years ago, incorporating Honeypot with some custom tweaks to make it a little more strict was doing a pretty good job on its own. I'm just wondering what's prompting reconsideration of adding Mollom at this time.

killes@www.drop.org’s picture

dddave’s picture

Related: What is the (real money) cost of going the Mollom route?

killes@www.drop.org’s picture

Re: "impact on non-native speakers"

Still seems to be an issue. This post

https://www.drupal.org/node/2395877

by that user:

https://www.drupal.org/u/thisisguan

was marked as spam.

joshuami’s picture

@killes, we are monitoring. We pulled back from using Mollom on all our content forms and are just using it for profile spam at the moment. We've seen over 1,400 spammer accounts blocked over the weekend. In checking the logs, it is not triggering false positives, but we are seeing some spam accounts that are still getting created. It's not perfect, but it is significantly better.

@dddave, there is no additional cost at this time. Our account with Mollom should be able to cover the additional submissions because Mollom only charges for legitimate posts, not spam blocked. We are keeping an eye on this as well.

So far, the testing is positive based on our targeted roll out.

tvn’s picture

Component: Spam » Site organization
Assigned: cweagans » Unassigned
Status: Needs work » Fixed

I opened #2408359: Configure Mollom on Drupal.org to detail the configuration we are testing. Since Mollom is already installed on Drupal.org I am going to close this issue and we can continue discussing configuration in #2408359.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.