I'm considering building a module that uses SpamAssassin rules to filter incoming content. Below are some of my notes on the topic. They assume the reader is somewhat familiar with how SpamAssassin works:
Create a Drupal module that uses SpamAssassin to filter content.
For each role, configure whether the filter applies to that role's posts.
Specify which node types and comments get filtered. Possibly add a per-node setting that lets the poster turn the comment spam filter on or off.
Specify a spam score threshold above which a post gets filtered.
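The settings above could be sketched roughly as follows. This is only an illustration in Python rather than Drupal code; `FilterConfig`, `should_scan`, `is_spam`, the role and type names, and the default threshold of 5.0 are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FilterConfig:
    # Hypothetical defaults: which roles and content types are filtered,
    # and the score above which a post counts as spam.
    filtered_roles: set = field(default_factory=lambda: {"anonymous user"})
    filtered_types: set = field(default_factory=lambda: {"comment", "forum"})
    threshold: float = 5.0


def should_scan(config, user_roles, content_type, node_filter_enabled=True):
    """Return True if this post should be passed to the spam scorer."""
    # Per-node switch: the poster turned the filter off for this node.
    if not node_filter_enabled:
        return False
    if content_type not in config.filtered_types:
        return False
    # Scan only if every role the user holds is filtered, so holding a
    # single unfiltered (trusted) role exempts the user.
    return all(role in config.filtered_roles for role in user_roles)


def is_spam(config, score):
    """A post is filtered only when its score is above the threshold."""
    return score > config.threshold
```

Exempting a user who holds any unfiltered role is one possible policy; the opposite (scan if any role is filtered) would be just as easy to express.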
Options for filtering actions include the message being:
1. denied
2. denied quietly (the post appears to the poster to have been accepted)
3. queued for moderation
4. queued for moderation quietly
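One way to picture the four actions is as a mapping from the chosen action to what actually happens to the post and what the poster is told. The sketch below is in Python for illustration; `Action`, `Outcome`, and `apply_action` are made-up names, and the message strings are placeholders.

```python
from enum import Enum
from typing import NamedTuple


class Action(Enum):
    DENY = "deny"
    DENY_QUIETLY = "deny_quietly"
    MODERATE = "moderate"
    MODERATE_QUIETLY = "moderate_quietly"


class Outcome(NamedTuple):
    stored: bool       # is the post kept at all?
    published: bool    # is it visible to other users?
    poster_sees: str   # what the poster is told


def apply_action(score, threshold, action):
    """Decide the fate of a post given its spam score and the configured action."""
    if score <= threshold:
        return Outcome(True, True, "accepted")
    if action is Action.DENY:
        return Outcome(False, False, "rejected")
    if action is Action.DENY_QUIETLY:
        # Discarded, but the poster is told it went through.
        return Outcome(False, False, "accepted")
    if action is Action.MODERATE:
        return Outcome(True, False, "awaiting moderation")
    # MODERATE_QUIETLY: held for review without telling the poster.
    return Outcome(True, False, "accepted")
```

The "quiet" variants differ from their counterparts only in the `poster_sees` field, which is what keeps a spammer from learning that the filter caught the post.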
SpamAssassin will need to be configured or modified to apply only body tests, since all header tests will be inapplicable. Consider the performance trade-offs between running spamassassin as a separate process for each post and using the spamc/spamd client/server configuration.
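As a rough sketch of the client/server route, the Python below pipes a post to an already running spamd through the spamc client, using its `-c` (check only) mode, which prints a `score/threshold` line. The helper names and the placeholder Subject header are my own assumptions, and wrapping the post in a minimal message does not by itself switch off header tests; those would still need to be disabled or zero-scored in the SpamAssassin configuration.

```python
import subprocess
from email.message import EmailMessage


def wrap_as_message(body):
    """Wrap posted text in a minimal mail message, since SpamAssassin expects mail."""
    msg = EmailMessage()
    msg["Subject"] = "content post"  # placeholder header (assumption)
    msg.set_content(body)
    return msg.as_bytes()


def parse_check_output(output):
    """Parse the 'score/threshold' line printed by `spamc -c`."""
    score, threshold = output.strip().split("/")
    return float(score), float(threshold)


def score_with_spamd(body, timeout=10):
    """Send the wrapped post to spamd via spamc and return (score, threshold)."""
    # No check=True here: `spamc -c` exits with status 1 when the message is spam.
    result = subprocess.run(
        ["spamc", "-c"],
        input=wrap_as_message(body),
        capture_output=True,
        timeout=timeout,
    )
    return parse_check_output(result.stdout.decode())
```

The attraction of spamd is that the rule sets are loaded once by the daemon, whereas starting the standalone spamassassin script for every post pays that startup cost each time.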
Rule sets will need to be customized for each community. Consider developing an interface that lets moderators adjust rule scores as they accept messages, using a "This is not spam" button in addition to an "Approve" button. For example, suppose a blackjack gaming community makes 25 posts that all contain the word 'Casino', and the CASINO spam rule adds 2.8 points to the overall spam score by default. Each message accepted with "This is not spam" reduces the score for rule CASINO by some predetermined value (say 0.1). After moderators have accepted all 25 posts this way, the revised score for rule CASINO will be 2.8 - 25 × 0.1 = 0.3 (most likely below the spam threshold).
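The CASINO arithmetic can be checked with a few lines of Python. `adjusted_score` is a hypothetical helper, and the floor of 0 is my own assumption so that a rule never ends up rewarding the word it was meant to penalize; `Decimal` is used to avoid floating-point drift when subtracting 0.1 repeatedly.

```python
from decimal import Decimal


def adjusted_score(default_score, step, accepted_count, floor=Decimal("0")):
    """Rule score after `accepted_count` 'This is not spam' approvals."""
    score = Decimal(default_score) - Decimal(step) * accepted_count
    # Assumed floor: never let the adjusted score drop below zero.
    return max(score, floor)


# 25 accepted posts, each lowering the CASINO rule by 0.1 from its default 2.8
print(adjusted_score("2.8", "0.1", 25))  # 0.3
```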