Hidden Spam In Comments

Robert Castelo - March 23, 2004 - 15:00

Just had an unsettling experience.

Someone just posted a comment on one of my Drupal sites:

"The info here is very good, thanks!"

This looked fine, and encouraging, but for some reason I was curious and clicked on the "edit comment" link...

Good thing I did: it turned out they had added a div after the comment text, used CSS to make it invisible (display:none), and hidden a page full of links to porn sites inside it!

When I viewed details for this comment, I saw the hostname was: 80.97.89.12

All the hidden links started with www.segravi.com

I've updated my filter tags to disallow div, but I suppose this trick can also be done with most other tags.

Anyone have any ideas how to stop this?
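
For illustration only, here is a minimal Python sketch of how a comment could be checked for the trick described above. It is not Drupal code; the names HiddenStyleFinder and find_hidden_tags are made up for this example, and it only looks at inline style attributes, so hiding done through a class or a stylesheet would slip past it.

```python
import re
from html.parser import HTMLParser

# Inline-style declarations that hide an element from readers.
HIDING_STYLE = re.compile(
    r"(display\s*:\s*none|visibility\s*:\s*hidden)", re.IGNORECASE
)


class HiddenStyleFinder(HTMLParser):
    """Collects the names of tags whose inline style hides them."""

    def __init__(self):
        super().__init__()
        self.hidden_tags = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "style" and value and HIDING_STYLE.search(value):
                self.hidden_tags.append(tag)


def find_hidden_tags(comment_html):
    """Return the tags in a comment that are hidden via inline CSS."""
    finder = HiddenStyleFinder()
    finder.feed(comment_html)
    finder.close()
    return finder.hidden_tags
```

A comment for which find_hidden_tags returns a non-empty list could be held for moderation instead of being published.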

Yikes!

Ernest - March 23, 2004 - 16:02

Thanks for the tip.

Search engine spamming

jxs2151 - March 23, 2004 - 16:58

What you experienced is known as "search engine spamming" or "bulletin board spamming". Google's PageRank algorithm is very dependent on the number (and quality) of links pointing to a site. What this person was doing was increasing the number of links pointing to his filth sites. Once he posted his comment and Google indexed your site, Google would list your site as a backlink to his site and would increase his PageRank.

Lower than scum if you ask me ...

Me, too

cel4145 - March 23, 2004 - 21:31

Thanks for the tip. I got 'em too, today.

CSS Filter

Robert Castelo - March 23, 2004 - 23:07

Just wondering if we could filter for CSS. I can't imagine any situation in which I would want visibility:hidden or display:none included in a comment.

Attribute filters

Dries - March 23, 2004 - 23:14

On the filter configuration page (home » administer » configuration » filters) you can instruct Drupal to strip style attributes from user-contributed content. Would that be sufficient?
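
To make the idea concrete, here is a rough Python sketch of what stripping style attributes amounts to. It is an assumption-laden illustration, not how Drupal's filter is implemented: StyleStripper and strip_style_attributes are hypothetical names, and this version also drops HTML comments and doctype declarations as a side effect.

```python
from html import escape
from html.parser import HTMLParser


class StyleStripper(HTMLParser):
    """Rebuilds the markup it is fed, leaving out every style attribute."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def _open_tag(self, tag, attrs, closing=">"):
        kept = [tag]
        for name, value in attrs:
            if name == "style":
                continue  # this is the attribute being stripped
            if value is None:
                kept.append(name)
            else:
                kept.append('%s="%s"' % (name, escape(value, quote=True)))
        return "<" + " ".join(kept) + closing

    def handle_starttag(self, tag, attrs):
        self.parts.append(self._open_tag(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.parts.append(self._open_tag(tag, attrs, " />"))

    def handle_endtag(self, tag):
        self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append("&%s;" % name)

    def handle_charref(self, name):
        self.parts.append("&#%s;" % name)


def strip_style_attributes(markup):
    """Return the markup with all inline style attributes removed."""
    stripper = StyleStripper()
    stripper.feed(markup)
    stripper.close()
    return "".join(stripper.parts)
```

With the style attribute gone, a div carrying display:none becomes a plain, visible div, so the hidden links show up on the page where a moderator can spot them.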

Worked

Robert Castelo - March 24, 2004 - 00:29

Thanks. That did it.

I saw that option, but it didn't register what it meant. Maybe "strip CSS" would be more immediate. Or maybe it's just me being slow :-?

Ooops

jxs2151 - March 24, 2004 - 03:25

I turned on "strip tags" and now images in my static pages don't appear. What tags besides the obvious would you recommend allowing that would solve the spamming problem yet allow my content to appear?

- www.harvesterchurch.org

What I think...

Stefan Nagtegaal - March 25, 2004 - 11:44

What I was thinking about was turning the site-wide 'Strip tags' option into a permission like 'Allowed to use HTML inside posting'.
Then, for every role, we should be able to set the allowed HTML tags which could be used.
This way, we can control all HTML posted by the users...
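
A per-role whitelist like this could look roughly as follows. This is a Python sketch for illustration only: the role names, the tag sets and the function filter_tags_for_role are all invented here, and regex-based tag stripping is too crude to rely on as the only line of defence.

```python
import re

# Hypothetical roles and tag whitelists; a real site would make these configurable.
ALLOWED_TAGS_BY_ROLE = {
    "anonymous": set(),
    "authenticated": {"a", "em", "strong", "code"},
    "editor": {"a", "em", "strong", "code", "img", "ul", "ol", "li"},
}

# Matches an opening or closing tag and captures the tag name.
TAG = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


def filter_tags_for_role(text, role):
    """Drop every tag that the given role is not allowed to post."""
    allowed = ALLOWED_TAGS_BY_ROLE.get(role, set())  # unknown roles get no tags

    def keep_or_drop(match):
        return match.group(0) if match.group(1).lower() in allowed else ""

    return TAG.sub(keep_or_drop, text)
```

In this sketch anonymous visitors lose all markup, including links, while an editor role keeps img so that images on static pages still appear.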

Don't Filter Code In Code Tags

Brian@brianpucc... - March 25, 2004 - 14:17

Drupal shouldn't filter anything inside of a code tag though. No reason to and doing so makes discussing CSS and HTML inline very hard to do.

Make a filter

Steven - April 1, 2004 - 07:22

In that case, you'll want to make a filter which escapes everything inside of <code> tags, so you can write <code><b></b></code> and have the b-tags show up literally.

Check the documentation for filtering in 4.4+, especially the prepare vs process mechanism. It was made exactly for this.
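
As a rough illustration of the escaping step (in Python rather than as an actual Drupal filter, and with a made-up function name), the idea is to escape whatever sits between the code tags before any tag stripping runs:

```python
import re
from html import escape

# Non-greedy match so that each <code>...</code> pair is handled on its own.
CODE_BLOCK = re.compile(r"<code>(.*?)</code>", re.DOTALL | re.IGNORECASE)


def escape_code_blocks(text):
    """Escape <, > and & inside <code> tags so the markup displays literally."""

    def escape_inner(match):
        return "<code>" + escape(match.group(1), quote=False) + "</code>"

    return CODE_BLOCK.sub(escape_inner, text)
```

Text outside the code tags is left alone, so the normal tag filter can still deal with it afterwards. An unclosed code tag is simply not matched by this sketch.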

Edit Please

Brian@brianpucc... - April 1, 2004 - 13:11

I'm just making this post to close his <code> tag.

I think Drupal should make use of the HTML Corrector module.

rampant spam elsewhere & solutions

heather - March 31, 2004 - 23:13

the comment spam problem is well known in the movabletype community. it doesn't matter if you don't allow CSS or HTML, the spammers are cunning. in the MT example, a user's name has a URL link, and this is often used, in addition to posting links in the body of the text.

you may be aware of their proposed solution:
http://www.typekey.com/

this allows for centralised comment-registration. funny enough, drupal has this built in.

other solutions MT users have done are:

- a tool which filters spam and posts a centralized blacklist of IP addys:
mt-blacklist

- a comment manager tool, allowing for bulk managing of comments on a site (can you imagine doing that with drupal.org? eek.)

would the simplest option in drupal be requiring registration? or is that open to another kind of abuse?

----------------------
the illusion of progress
http://nearlythere.com/

Registration Won't Stop Them

Brian@brianpucc... - April 1, 2004 - 00:19

Requiring users to be registered before posting comments won't stop them, as all they have to do is sign up in one place and then use distributed authentication to log in everywhere else.

The mass editing of comments would be nice, as well as IP banning and such, though that could be done in .htaccess.

registration to comment

miasmo - April 28, 2004 - 16:29

Sorry this is off topic, but...

I am new to drupal (just installed it yesterday.) Is it possible to allow people to post comments without registering? I really would prefer that if it is possible.

Comment spam is real

garym@teledyn.com - October 24, 2004 - 00:41

Two weeks ago, I would have agreed with the sentiment that Drupal had time to think about comment spam ... then I ported my prime spam-target MT personal blog to Drupal, and within an hour had collected 17 Levitra ads, and currently get over two dozen 'enhancement' ads a day.

What impressed me, if you can call it that, was how quickly the spammers adapted to the new rules -- they obviously had my URL in their database of MT targets, and during the transition to Drupal I had to spend more time than usual monitoring the Apache error log; I could see hits to mt-tb.cgi and mt-comment.cgi pouring in dozens of times per minute. No wonder my webhost has 'issues' with my website ;)

And here's what happened: Once I completed my rewrite rules such that the old URLs were now redirected to /node/NNN addresses, the first few spams appeared, seemingly done as normal comments (i.e. through the form at the base of the node page) and probably cut-and-paste (since they were all identical). Within 20 minutes, they had figured out that the true address to hit was /node/reply/NNN and from that moment the frequency escalated to several per minute.

The next thing I discovered was just how defenseless Drupal is to such attacks ...

Since mt-blacklist stores its list of perl expressions in an external file, perhaps a quick but reasonably effective solution might be to have a filter applied to comments where any match to any expression in the blacklist file (or table) results in a polite rejection. Later we can emulate Paul's other features such as auto-extraction of the base hostname from tagged comments and an online editor for the perl expressions list, maybe even allow Drupal sites to export these lists back to Paul's central repository of blacklist expressions.
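
A minimal sketch of that kind of filter, in Python for illustration. The file layout assumed here (one expression per line, with # starting a comment) and the function names are assumptions of this example, and Perl and Python regular expressions differ in places, so a real list might need some translation.

```python
import re


def load_blacklist(lines):
    """Compile one case-insensitive pattern per non-empty, non-comment line."""
    patterns = []
    for line in lines:
        # Assumed format: anything after '#' is a comment. A pattern that
        # itself contains '#' would need a smarter parser than this.
        entry = line.split("#", 1)[0].strip()
        if entry:
            patterns.append(re.compile(entry, re.IGNORECASE))
    return patterns


def first_blacklist_match(comment, patterns):
    """Return the first expression that matches the comment, or None."""
    for pattern in patterns:
        if pattern.search(comment):
            return pattern.pattern
    return None
```

A comment for which first_blacklist_match returns something other than None would get the polite rejection; the matched expression can be logged to see which entries are doing the work.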

Indexing

candygenius - October 24, 2004 - 11:56

They are using indexers to find the comment links. I blocked the indexers as well as the originating spammer and cut down on the spam 99%. This block in .htaccess got rid of the latest and most persistent one I have seen.

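# Answer matching requests with 403 Forbidden.
# No OR on the final condition: a trailing OR would make the rule match every request.
RewriteCond %{HTTP_REFERER} 12\.163\.72\.13 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Fetch\ API\ Request) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (Microsoft\ Scheduled\ Cache\ Content\ Download\ Service) [NC]
RewriteRule .* - [F]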

Doesn't stop Google or anyone legitimate from indexing.

Spamassassin module

javanaut - April 28, 2004 - 15:37

I'm considering building a module that uses Spamassassin rules to filter incoming content. Below are some of my notes on the topic. Note that this assumes that the reader is familiar with how Spamassassin works:

Create a Drupal module that uses spamassassin to filter content.

Create a rule to apply to each role to determine whether filter applies.

Specify which node types/comments get filtered. Possibly have a per-node config that allows poster to turn off comment spam filter.

Specify a spam score above which content post gets filtered.

Options for filtering actions include message being:
1. denied
2. denied quietly (appear to be accepted)
3. queued for moderation
4. queued for moderation quietly

Spamassassin will need to be modified to apply only body tests, as all header tests will be inapplicable. Consider the performance trade-offs between using spamassassin as a separate process vs. using the spamd client/server configuration.

Rule sets will need to be customized for each community. Consider developing an interface that allows moderators to adjust scores for rules on a per-message-acceptance basis using a "This is not spam" button in addition to an "Approve" button. For example, if a blackjack gaming community posts 25 posts all containing the word 'Casino' and the CASINO spam rule by default adds 2.8 points to the overall spam score, each message that gets accepted reduces the spam score for rule CASINO by some predetermined value (like 0.1). After moderators have accepted all 25 posts in this way, the revised spam score for rule CASINO will be 0.3 (most likely below spam threshold).

Alternatively, if a post gets published that is obvious spam (to the human eye), the moderator can specify so. This action will increase the points for each matching spam rule that applied to the post. This is not very accurate, as it may be only one rule that was insufficient, or a new rule needs to be created. Perhaps that level of detail should be offered (TBD).
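
The score adjustment described above can be sketched in a few lines of Python. Everything here is hypothetical (the RuleScores class, the 0.1 step, the threshold passed to action_for); it only models the bookkeeping and does not talk to Spamassassin at all.

```python
class RuleScores:
    """Per-rule spam scores that moderators nudge up or down."""

    def __init__(self, scores, step=0.1):
        self.scores = dict(scores)
        self.step = step

    def score(self, matched_rules):
        """Total spam score of a post that matched the given rules."""
        return sum(self.scores[rule] for rule in matched_rules)

    def mark_not_spam(self, matched_rules):
        """'This is not spam': lower each matching rule, never below zero."""
        for rule in matched_rules:
            self.scores[rule] = max(0.0, self.scores[rule] - self.step)

    def mark_spam(self, matched_rules):
        """Moderator flags a published post as spam: raise each matching rule."""
        for rule in matched_rules:
            self.scores[rule] += self.step


def action_for(score, threshold, spam_action="queued for moderation"):
    """Pick the configured filtering action once the score passes the threshold."""
    return spam_action if score > threshold else "accepted"
```

Starting from a CASINO score of 2.8, 25 calls to mark_not_spam(["CASINO"]) bring it down to about 0.3, matching the arithmetic above (floating point makes it only approximately 0.3).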

Now that I've written all of this out, I'll probably x-post it as a new forum topic :)

Update: Please defer commentary on this post to the new topic discussion in the Module Development forum section:
http://drupal.org/node/view/7446

re: Spamassassin module

Jeremy - December 10, 2004 - 21:48

You might consider the spam module. Additional discussion here.

This is very common

mike3k - December 10, 2004 - 21:30

I always check all comments for any suspicious URLs, since the majority of comment spam I've seen consists of random or innocuous text with a URL to some offending site.

I just added a default input filter which disallows the <a> tag so anonymous users can't post any links. Only registered users can select the full filtered HTML format.

--
Mike Cohen, http://www.mcdevzone.com/

I've noticed a change

charybdis - December 10, 2004 - 22:37

I've previously been getting the 'massive list of links' approach, but recently they've moved towards the old anti-Bayesian system of an English (but gibberish) post with a link on the end, or trying to hide the massive list of links with instantly thwarted CSS.

re: I've noticed a change

Jeremy - December 10, 2004 - 23:33

I've noticed this same trend, and thus made some very recent changes to the 4.5 spam.module. The Bayesian filter will now blacklist all domain names from URLs that it finds in spam posts -- you still have to train it, but this should prove quite helpful.

There's an administrative interface where you can manually add domains, too. So, if you can find a list of spammer domains, you could manually add them into the filter.
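
For anyone curious what domain blacklisting boils down to, here is a small Python sketch. It is not the spam.module code; the function names are invented for this example, and stripping a leading "www." is just one simple way of normalising hostnames.

```python
import re
from urllib.parse import urlparse

# Rough URL matcher: stops at whitespace, quotes and angle brackets.
URL = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def domains_in(text):
    """Return the set of hostnames found in the URLs of a post."""
    domains = set()
    for url in URL.findall(text):
        try:
            host = urlparse(url).hostname
        except ValueError:
            continue  # malformed URL, nothing usable to extract
        if host:
            if host.startswith("www."):
                host = host[4:]
            domains.add(host)
    return domains


def is_blacklisted(text, blacklist):
    """True if the post links to any domain already on the blacklist."""
    return not domains_in(text).isdisjoint(blacklist)
```

When a post is marked as spam, its domains_in result is added to the blacklist set; later posts are checked with is_blacklisted before they are published.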

I just upgraded an hour ago

charybdis - December 10, 2004 - 23:44

Saw a post about the URL limiter and thought "Wait a minute, WHAT URL limiter?!" One CVS later.... ;-)

 
 
