Some content not filtered
Jeremy - July 26, 2009 - 21:50
| Project: | Spam |
| Version: | 6.x-1.x-dev |
| Component: | Code |
| Category: | bug report |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | closed |
Description
On KernelTrap, I'm finding that some postings made by anonymous users are getting a score of 0, as if they had not been filtered at all. It seems to be happening to forum posts, the majority of which are properly filtered. This lets through spam that would otherwise be caught.
I need to turn up the debug level and see if I can collect more information about why this is happening. Has anyone else experienced this?

#1
I managed to collect some debug output. We get as far as invoking the first filter, and then processing seems to simply exit, so the content is approved. Read the logs from bottom to top (the trace feature is evidently not working correctly):
node 24323 Jul 28 2009 - 12:13 invoking Duplicate filter [1], gain = 100 Anonymous
node 24323 Jul 28 2009 - 12:13 invoking content filters Anonymous
node 24323 Jul 28 2009 - 12:13 inserting Anonymous
node 24323 Jul 28 2009 - 12:13 -- Anonymous
Here is the 'detail' view from the final log entry:
Content type node
Node ID 24323
Date Tuesday, July 28, 2009 - 12:13pm
User Anonymous
Spam module function spam_content_filter
Message invoking Duplicate filter [1], gain = 100
Hostname 79.127.169.164
Options trace
I do not find any PHP errors or anything else odd in the Apache logs around the same time.
Here are the logs when I manually unpublish this content - as you can see there does not appear to be anything special or fancy about this spam content:
node 24323 Jul 28 2009 - 12:34 updating Jeremy
node 24323 Jul 28 2009 - 12:34 unpublished Jeremy
node 24323 Jul 28 2009 - 12:34 update token(goodnano-av.com) class(url) yes(64) no(0) prob(99):... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(goodnano-av.com) class(url) yes(63) no(0) prob(99):... Jeremy
node 24323 Jul 28 2009 - 12:34 mark_as_spam Jeremy
node 24323 Jul 28 2009 - 12:34 update token(avcom) class(spam) yes(64) no(0) prob(99): added ye... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(cargoodnano) class(spam) yes(64) no(0) prob(99): ad... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(games) class(spam) yes(620) no(12) prob(98): added ... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(car) class(spam) yes(7650) no(7) prob(100): added y... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(online) class(spam) yes(35535) no(6) prob(100): add... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(and) class(spam) yes(111375) no(1976) prob(98): add... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(mover) class(spam) yes(33) no(0) prob(99): added ye... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(car) class(spam) yes(7649) no(7) prob(100): added y... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(dallas) class(spam) yes(98) no(0) prob(99): added y... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(loan) class(spam) yes(325) no(0) prob(99): added ye... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(car) class(spam) yes(7648) no(7) prob(100): added y... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(car) class(spam) yes(7647) no(7) prob(100): added y... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(day) class(spam) yes(2340) no(30) prob(99): added y... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(the) class(spam) yes(186227) no(4816) prob(97): add... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(reviews) class(spam) yes(235) no(0) prob(99): added... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(best) class(spam) yes(14409) no(49) prob(100): adde... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(break) class(spam) yes(811) no(23) prob(97): added ... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(avcom) class(spam) yes(63) no(0) prob(99): added ye... Jeremy
node 24323 Jul 28 2009 - 12:34 update token(cargoodnano) class(spam) yes(63) no(0) prob(99): ad... Jeremy
node 24323 Jul 28 2009 - 12:34 mark_as_spam Jeremy
node 24323 Jul 28 2009 - 12:34 marked as spam, score(99) Jeremy
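For anyone reading the yes/no/prob columns above, here is a rough Python sketch of how the logged prob values relate to the counts. It is inferred purely from the numbers in these log lines, not from the module's source: the function name `token_probability` is hypothetical, and the rule that a token with no "no" votes is reported as 99 rather than 100 is an assumption that merely happens to match every line shown.

```python
def token_probability(yes: int, no: int) -> int:
    """Estimate the logged prob() value from the yes()/no() counts.

    Assumption: inferred from the log lines in this thread only,
    not taken from the Spam module's code.
    """
    if yes == 0 and no == 0:
        raise ValueError("token has never been seen")
    if no == 0:
        # Assumption: tokens never seen in non-spam are reported as 99,
        # as in token(avcom) yes(64) no(0) prob(99).
        return 99
    # Otherwise the logged value matches the rounded percentage of
    # spam sightings among all sightings of the token.
    return round(100 * yes / (yes + no))


# Values taken from the log lines above.
print(token_probability(64, 0))         # 99
print(token_probability(620, 12))       # 98
print(token_probability(186227, 4816))  # 97
```

Note that very common words such as "the" and "and" still come out at 97 to 98 here, which reflects how heavily the counts on this site skew toward spam.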
#2
Reviewing the actual node, I do note that the content has an empty teaser -- we need to test to be sure that this doesn't cause it to slip by somehow. The subject is also a URL, though I'm sure I've tested that many times.
Another oddity -- it's a forum post, but it's not assigned to any forum, which seems quite peculiar.
#3
Ugh -- I somehow reproduced this late last night and ran out of energy for debugging. This morning I sat down to debug, and now it's not happening even though I believe I'm doing the same thing. Evidently it's something intermittent that I'm not yet seeing.
An empty teaser does not affect anything -- this content is filtered as normal on my dev server.
#4
I reinstalled the spam module and once again thought I might be reproducing this bug. Unfortunately I was instead running into this:
#541876: Spam filters are not enabled until you visit filter settings page
Has anyone else run into an issue where content is slipping through the filters with a score of 0, as if it was not filtered at all? This is happening very frequently on KernelTrap.org, and I've yet to figure out why.
I'm adding additional debug to try and figure out where things are going wrong:
http://drupal.org/cvs?commit=247532
#5
I finally tracked this down, thanks to this fix:
#541950: spam content not prevented
With this fix applied, we see this in the logs:
--Duplicate filter: total(99) redirect(duplicate/denied/ip) gain(100)
Spam score [], redirect to: duplicate/denied/ip
So the nodes that were slipping through were being told that their IP was denied, but they had already been posted. Thus there's still a bug here: we should be sure to unpublish any content that we determine is spam.
#6
Actually, looking closer at that section of the code, it works as designed. I'm committing a small change so that the score is actually calculated, but otherwise it will indeed mark content as spam if it needs to.
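To make the intended behaviour concrete, here is a minimal Python sketch of a filter loop in which a redirect from one filter stops further filtering but the score is still calculated and used to decide publication. This is not the module's PHP code. All names (`FilterResult`, `Verdict`, `run_filters`), the gain-weighted average, and the threshold value are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

SPAM_THRESHOLD = 86  # hypothetical cutoff, not taken from this thread


@dataclass
class FilterResult:
    total: int                      # filter's spam estimate, 0-100
    gain: int                       # weight given to this filter
    redirect: Optional[str] = None  # e.g. "duplicate/denied/ip"


@dataclass
class Verdict:
    score: int
    redirect: Optional[str]
    published: bool


def run_filters(content: dict,
                filters: List[Callable[[dict], FilterResult]]) -> Verdict:
    weighted = 0
    gain_sum = 0
    redirect = None
    for content_filter in filters:
        result = content_filter(content)
        weighted += result.total * result.gain
        gain_sum += result.gain
        if result.redirect:
            # A redirect ends filtering early, but we fall through to
            # the score calculation instead of returning with no score.
            redirect = result.redirect
            break
    # Assumed scoring rule: gain-weighted average of the filter totals.
    score = round(weighted / gain_sum) if gain_sum else 0
    return Verdict(score=score, redirect=redirect,
                   published=score < SPAM_THRESHOLD)


def duplicate_filter(content: dict) -> FilterResult:
    # Mirrors the logged line: total(99) redirect(duplicate/denied/ip) gain(100)
    return FilterResult(total=99, gain=100, redirect="duplicate/denied/ip")


verdict = run_filters({"nid": 24323}, [duplicate_filter])
print(verdict)
```

With the single duplicate filter above, the verdict carries a score of 99 together with the redirect, so the content ends up unpublished rather than keeping a score of 0.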
It seems that fixing #541950 has fixed this bug too -- I no longer have any spam slipping through my filters on KernelTrap. Hooray!
One final minor change to fix logging, then closing this bug:
http://drupal.org/project/cvs/11104
#7
Automatically closed -- issue fixed for 2 weeks with no activity.