Install Spam module on drupal.org [#1378456]

Comment #1

dman commented 21 December 2011 at 07:42

+10
This month has been really dirty

Comment #2

gerhard killesreiter commented 17 January 2012 at 09:06

see http://drupal.org/node/1293186#comment-5478346

Should we mark this as a duplicate?

Comment #3

klonos commented 17 January 2012 at 09:38

I think not. That issue over there is more of a generic discussion and its goal shifts all the time with a lot of different opinions being expressed every now and then. This one here is specific and to the point.

Comment #4

cweagans commented 15 March 2012 at 02:35

I think Spam.module fits our needs pretty well - a filter that learns seems to be the ideal tool for us. Looking at some of the filters that come with the spam module, I think there are some pretty cool tools that we could use effectively:

Included filters:
Bayesian filter - auto-learns, performing statistical analysis on the words in new content
Custom filter - regexp/plain text matching.
URL limiter - auto-learns spammer websites and blocks content linking to these URLs
SURBL - blacklist of URLs that commonly occur in spam (Third Party).
Node age filter - treats comments on old content as likely spam
Duplicate filter - blocks duplicate posts and bans associated IPs
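
To make the "filter that learns" idea concrete, here is a minimal naive Bayes sketch in Python. It is not Spam.module's actual code (the module is PHP and its internals are not shown in this thread); the class name, tokenizer and add-one smoothing are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesSpamFilter:
    """Hypothetical illustration of a word-statistics filter; not Spam.module's code."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one example; label must be "spam" or "ham"."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        """Return the estimated probability (0..1) that text is spam."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Class prior, smoothed so an untrained filter does not divide by zero.
            log_score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                # Add-one smoothing keeps unseen words from zeroing the score.
                log_score += math.log((count + 1) / (total_words + len(vocabulary) + 1))
            log_scores[label] = log_score
        # Convert the two log scores into a normalised probability.
        highest = max(log_scores.values())
        weights = {k: math.exp(v - highest) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])
```

Each call to train() with moderator-confirmed spam or legitimate content shifts the word statistics, which is the sense in which such a filter "auto-learns".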

I'd like to look at using that custom filter to find things that say 'rel="dofollow"' and automatically mark that as spam. I wish we had a database of all the spam that has been posted on Drupal.org - it'd be an unpleasant task, but we could analyze that and figure out some of the common patterns. Perhaps we could even train the bayesian filter with it.
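
A rough sketch of what such a custom rule could look like, written in Python rather than as an actual Spam.module custom-filter entry (whose configuration format is not shown here). The pattern and function name are assumptions for illustration.

```python
import re

# Hypothetical pattern: matches rel="dofollow", rel='dofollow' or rel=dofollow,
# ignoring case and optional whitespace around the equals sign.
DOFOLLOW_PATTERN = re.compile(r"""rel\s*=\s*["']?\s*dofollow""", re.IGNORECASE)


def looks_like_dofollow_spam(body):
    """Return True if the post body contains a rel="dofollow" attribute."""
    return DOFOLLOW_PATTERN.search(body) is not None
```

Since "dofollow" is not a real rel value, legitimate posts have little reason to contain it, which is what makes it a plausible low-risk rule to start with.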

Comment #5

cweagans commented 3 April 2012 at 15:05

Another thing that we could add is a flood control module. That is, with each subsequent post in a given timeframe by a given user, the more likely it is that the post is spam.
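
One way to read that proposal is as a sliding-window score, sketched below in Python. The class name, window length and the linear scaling are assumptions for illustration only; no existing flood control module is being described.

```python
from collections import defaultdict, deque


class FloodScorer:
    """Hypothetical sketch: the more recent posts a user has, the higher the score."""

    def __init__(self, window_seconds=600, posts_for_full_score=10):
        self.window_seconds = window_seconds
        self.posts_for_full_score = posts_for_full_score
        self._recent = defaultdict(deque)  # user id -> timestamps of recent posts

    def record_post(self, user_id, timestamp):
        """Record a post and return a 0.0..1.0 score based on earlier posts in the window."""
        recent = self._recent[user_id]
        # Forget posts that have fallen out of the time window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        # Score grows linearly with the number of earlier posts still in the window.
        score = min(1.0, len(recent) / self.posts_for_full_score)
        recent.append(timestamp)
        return score
```

The returned value is meant to be combined with other signals rather than used as a hard block on its own.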

Comment #6

klonos commented 3 April 2012 at 21:42

Yeah, but there should be a way for legitimate users (TBD) to be excluded so that we don't accidentally block them if they happen to post a series of successive comments from time to time.

Instead of having to calculate such things on the fly for each new comment, I believe it would be wise performance-wise (no pun intended) to simply check for a specific role assigned to the user posting the comment(s). We already have flag deployed in d.o, so another way to do this would be to auto-assign a "newcomer" flag to new user accounts and have it so that only comments from such accounts are checked by this flood control routine. That'd save us some CPU time and decrease the possibility of false positives.

The thought behind my proposal is that it is highly unlikely for a legitimate user like you or me to suddenly start spamming. So, why even bother checking those accounts in the first place? The only thing we need to figure out is what account properties define a "legitimate user" and how to flag those accounts as such.
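
A minimal sketch of that short-circuit, in Python. The role name "trusted contributor" and both function names are hypothetical; which roles or flags would actually count as "legitimate" is exactly the open question above.

```python
# Hypothetical role name; the real criteria for a "legitimate user" are still TBD.
TRUSTED_ROLES = frozenset({"trusted contributor"})


def needs_spam_check(user_roles, trusted_roles=TRUSTED_ROLES):
    """Return True only when the user holds none of the trusted roles."""
    return trusted_roles.isdisjoint(user_roles)


def is_flagged(user_roles, body, spam_filters):
    """Run the filters only for untrusted users; trusted users skip them entirely."""
    if not needs_spam_check(user_roles):
        return False
    return any(spam_filter(body) for spam_filter in spam_filters)
```

The role lookup is a cheap set operation, so the more expensive filters never run for accounts that already hold a trusted role.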

Comment #7

cweagans commented 3 April 2012 at 21:53

How about we focus on the functionality first, and optimize later where needed? We can sit here and talk forever about what needs to be checked where, but in the end, spam.module will solve more problems than it creates for us. Let's just get it deployed, get the filters trained, and then we can start talking about skipping spam checks for users with a certain role or length of membership or whatever.

Comment #8

silverwing commented 3 April 2012 at 22:00

@klonos - I would have any roles (including 'vetted git user') bypass the filters.

As for flag.module, there's a JOIN argument that freaks out killes and others a bit, so using it any more on d.o without more investigation into its performance implications probably won't happen.

@cweagans - killes mentioned a database table that he'd be worried about before deployment, so I'm looking into that.

Comment #9

klonos commented 3 April 2012 at 22:01

Yes, definitely! By no means was my comment intended as a show-stopper argument. Let's deploy now and tweak as we go.

Comment #10

andypost commented 6 May 2012 at 14:50

With the flag module already installed, I suppose it would be much easier to add a flag for reporting spam.

Comment #11

cweagans commented 6 May 2012 at 21:57

Status: Active » Closed (won't fix)

The idea is to reduce the amount of manual intervention required for taking care of spam. We want to automate it. If people flag things as spam, somebody still has to go through and review the flagged content. In addition, the testing issue was marked as won't fix, so I'm going to won't fix this as well: Spam module doesn't seem to do what we want (or it wasn't configured properly or something). Spamicide was mentioned as a possible solution.

Comment #12

klonos commented 5 August 2012 at 16:09

Status: Closed (won't fix) » Postponed

...from #226678-53: Add a "Report spam/abuse" link to forum/issue comments (next to the "edit" & "reply" links).:

...Right now, we're going to install Mollom. ... This is a temporary solution and will only be used until there is a working port of spam.module and report_spam.module for Drupal 7, at which point we'll start using those.

So this is not a wontfix but postponed on: #1063524: Port spam module to Drupal 7 and #1714302: Port Report Spam module to Drupal 7 I guess. Right?

Comment #13

cweagans commented 5 August 2012 at 22:47

yep

Comment #14

klonos commented 6 August 2012 at 00:56

...that's a relief. Thanx.

Comment #15

klonos commented 6 August 2012 at 09:16

When we get back to this task, perhaps we should consider implementing a way to stop spammers from creating an account in d.o in the first place. That should considerably reduce the amount of work spam.module would need to do. There is such a solution available and it has a 7.x version available too: http://drupal.org/project/spambot (it uses www.stopforumspam.com)

Comment #16

killes@www.drop.org commented 6 August 2012 at 09:21

We've had bad experiences with IP-based blocks when we had the http:bl module enabled. Some countries are only connected to the net through a small number of external IPs.

Comment #17

klonos commented 6 August 2012 at 09:38

Hmm, I wasn't aware of that situation. Still, we can use an external service and, instead of blocking registration completely for blacklisted IPs, simply give the user accounts created from them a certain number of "spaminess" points to begin with. Besides, from what I see, http://www.stopforumspam.com/ doesn't log only IPs, but the combination of IP, username and email used to register. The spambot module, for its part, takes that into account:

Checks (username, email, ip address) data against the www.stopforumspam.com blacklist. Blacklisting can be based on either of email, username or IP address (with configurable thresholds).

That should be safe enough I guess.
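
A sketch of the "spaminess points" idea in Python, assuming the blacklist has already been loaded into three sets. The Blacklist type, the point weights and the function name are all hypothetical and are not the spambot module's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blacklist:
    """Hypothetical in-memory blacklist; entries are assumed to be lowercase."""
    usernames: frozenset
    emails: frozenset
    ips: frozenset


# Assumed weights: an IP match alone stays below a full block, since many
# legitimate users can share one external IP.
POINTS = {"username": 1, "email": 3, "ip": 2}


def initial_spaminess(username, email, ip, blacklist, points=POINTS):
    """Return starting spaminess points for a new account instead of blocking it."""
    score = 0
    if username.lower() in blacklist.usernames:
        score += points["username"]
    if email.lower() in blacklist.emails:
        score += points["email"]
    if ip in blacklist.ips:
        score += points["ip"]
    return score
```

An account whose only match is a shared IP then starts with a small handicap rather than being refused registration outright.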

Comment #18

killes@www.drop.org commented 6 August 2012 at 09:40

The issue is that we would then need to send our users' email addresses to a third-party service, which is one of the issues with Mollom.

Comment #19

klonos commented 6 August 2012 at 11:34

Yes, I know, but thankfully http://www.stopforumspam.com/ not only offers an API to check things against their db, they also provide their db data in various downloadable formats!! No need to send out any data at all - just set a cron job to download their hourly/daily ip/email/username files and store them locally. This way I guess the check will be faster too. Perhaps, in return for the benefits of using their data, we should implement a way to send back to http://www.stopforumspam.com/ only the data of registered users that are indeed deemed spammers by our Bayesian filter or by manual cleanup. Alternatively, we could consider donating a certain amount each year ;)
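
A sketch of the local-lookup half of that idea, in Python. It assumes the downloaded file has already been fetched by the cron job and contains one entry per line; that layout, the "#" comment handling and the function names are assumptions, not a description of the actual stopforumspam.com file formats.

```python
def load_entries(lines):
    """Build a lookup set from an iterable of lines (for example an open file).

    Assumes one entry per line; blank lines and lines starting with "#" are skipped.
    """
    entries = set()
    for line in lines:
        value = line.strip().lower()
        if value and not value.startswith("#"):
            entries.add(value)
    return entries


def is_listed(value, entries):
    """Check a username, email or IP against the locally stored set."""
    return value.strip().lower() in entries
```

With the set held locally, each registration check is a single in-memory lookup and no user data leaves the site.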

Comment #20

anarcat commented 17 December 2012 at 00:39

For the record, I have had numerous problems with spam.module on my blog, I can't imagine installing this on something of the scale of Drupal.org. Spam.module does a lot of things, and it's not always clear which part marks which post as spam. And to get an idea why, you need to crank up debugging which will yield too much data here (see #1118442: Trace option in spam module just shows blank page for a discussion about this).

So some caveat... I am not sure spam.module even works anymore... Right now my situation is that I am on the verge of disabling it because it's marking *everything* as spam right now...

Comment #21

killes@www.drop.org commented 17 December 2012 at 01:18

Status: Postponed » Closed (won't fix)

I think we are reasonably happy with honeypot & co.
