Hi,

I'd like to post up some thoughts on an idea for a reverse bounty and gather some preliminary opinions from other people. Depending on the response, I may follow through, or postpone the idea.

The reverse bounty proposal is a module to handle bounced emails, and optionally to integrate statistics on bounced emails. For convenience, let's call the proposed module EBH for now.

I've been aware for the last year or so that Drupal core doesn't have any facilities to handle bounced emails. See, for example, a previous post of mine which didn't elicit anything. If I am incorrect in this observation, please let me know.

The only thing that might have been relevant was Bounced Email, but that looks like a dead module. More on Bounced Email below.

Functionality

EBH will handle the following main use cases:

o Privileged users can view a list of all bounced emails resulting from emails sent by the website.
o For each account, privileged users can view a count and a list of bounced emails associated with that account.
o EBH can also display individual bounce emails.

EBH may optionally handle the following secondary use cases:

o For some modules which also store and use email addresses (e.g. simplenews), EBH can display a list of the bounced emails associated with each email address. So, for example, privileged users can view the bounce counts for simplenews subscribers, who may not necessarily have user accounts.
o Statistics are to be provided on bounced emails.
o EBH will provide a way to list users with more than 3 bounced emails (so administrators can take the appropriate action)

EBH will not:

o Tag each bounced email with the original email which caused that bounce. This is because I would prefer that EBH not store every single email sent out by the website.

Of course, comments, suggestions, and opinions on the feature set are welcome.

Technical Notes

o Emails sent out by Drupal will have a return-path address set to a preconfigured mailbox. EBH periodically (via cron) opens this mailbox and processes the emails within it.
o EBH will require a working email server and an accessible mailbox, probably using IMAP(s) or POP(s).
o EBH can be a replacement for the current Bounced Email module. (Maybe we should even use the same module name?)
o EBH will first be developed for Drupal 5.x, with a 6.x version to follow.
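
To make the cron step above a bit more concrete, here is a minimal sketch, written in Python purely for illustration (the module itself would be PHP). It assumes an IMAP mailbox reachable over SSL; the host, the credentials, and the function names fetch_bounces and summarize are hypothetical and not part of any existing module.

```python
import email
import imaplib
from typing import Iterable, Iterator, List, Tuple


def fetch_bounces(host: str, user: str, password: str) -> Iterator[bytes]:
    """Yield the raw bytes of every message in the bounce mailbox's INBOX."""
    conn = imaplib.IMAP4_SSL(host)
    try:
        conn.login(user, password)
        conn.select("INBOX")
        _status, data = conn.search(None, "ALL")
        for num in data[0].split():
            _status, parts = conn.fetch(num, "(RFC822)")
            yield parts[0][1]
    finally:
        conn.logout()


def summarize(raw_messages: Iterable[bytes]) -> List[Tuple[str, str]]:
    """Return (To, Subject) for each message, e.g. for an admin listing."""
    rows = []
    for raw in raw_messages:
        msg = email.message_from_bytes(raw)
        rows.append((msg.get("To", ""), msg.get("Subject", "")))
    return rows
```

A cron run would then amount to something like summarize(fetch_bounces(host, user, password)), with the actual bounce analysis slotted in where the sketch only reads two headers.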

Thoughts on the actual text processing of bounced emails

(Warning: this is potentially controversial)

I'm seriously considering using the bounce email processing logic from Mailman. Mailman has the most comprehensive bounce logic that I'm aware of, and it has special cases to handle bounce emails from Yahoo, Outlook etc. It also means I don't have to reinvent the wheel (redoing text processing is something I very much want to avoid).

Whenever Mailman updates its bounce processing logic, I can just pick up the changes.

(BTW, the email processing in the module Bounced Email is very simplistic, and I have doubts about its robustness)

To take this further, I'm also considering having Mailman as a dependency, and having EBH call into Mailman. Last time I looked, Mailman has a standalone command line script which can execute the bounce processing and output the results. I can integrate EBH with Mailman by spawning this script.

Else I'd have to come up with some sort of python/php bridge (Mailman is written in python). While this might sound like fun to do, I don't think it'll be possible anytime soon, so let's not go there.

Concluding thoughts and questions

So, please let me know what you think. Suggestions, comments, criticisms, flames, are all welcome.

In particular, I'd like to get enough feedback to form an opinion on:

How much demand is there for EBH?
Are people willing to pay for this functionality?
How much funding do people think I should ask for?
Is it okay to ask that Mailman be required for EBH to work?

Thanks for reading.

Comments

neurojavi’s picture

This module would be very useful. The following use case "EBH will provide a way to list users with more than X bounced emails" would be the most used by me and many others so I wish it wasn't an optional one.

I think that mailman dependency isn't a good idea. It would prevent the use of this module by a lot of users.

Thanks.-

salvis’s picture

EBH would be a great companion to Subscriptions, because bounces are a pain for every Subscriptions installation. I'd be very interested in providing some integration, for example suspending subscriptions to bouncing addresses.

I wrote Subscriptions (with chx) for free and would not recommend a module that comes with a license fee, but you might be able to find funding for its development. Putting up a solid project and offering services might be a better strategy in the long run though.

A dependency on Mailman would likely be a turn off for many potential users.

Please try to talk to the Bounced Email maintainer about taking over his project. He's still active, but he hasn't touched Bounced Email in two and a half years. He'll probably turn it over to you and you can start from scratch in the proper slot.

chx’s picture

While PiP itself has not seen a release in six years, the PECL extension had one relatively recently. Might be worth a look: http://pecl.php.net/package/python
--
The news is Now Public | Drupal development: making the world better, one patch at a time. | A bedroom without a teddy is like a face without a smile. |

bengtan’s picture

Looks like mailman dependency is a no-go. However, I still believe in using the mailman bounce processing logic (as I definitely don't want to re-invent the wheel), so I'll probably fork the relevant portion out of mailman and do something with it.

Salvis: Thanks for that thought. I'll keep it in mind.

Chx: Thanks for that link, I'll go have a look.

EmanueleQuinto’s picture

CiviMail (http://civicrm.org/civimail), a component (add-on) of CiviCRM, handles bounces. We used version 1.8 on Drupal 4.7 and, apart from some minor issues in a multiserver environment, it works quite well. It can work either with MTA access (amavis) or with IMAP. We tested it with Gmail for domains (IMAP support since September 2007) without problems.

ema

bengtan’s picture

Yeah, but don't you have to install civicrm to use civimail? If so, I think it's a bit heavy to add civicrm to a site if you only want to use a part of a civicrm add-on.

criznach’s picture

And... I believe the bounce handling and mail processing has gotten better/easier since then. But yes, it's a bit heavy.

Dave Cohen’s picture

You could also look to http://www.phplist.com/ for an example of bounce handling code. I nearly used phplist instead of simplenews for one site because they wanted bounce handling, but it was too painful so we went without the bounce handling. In short, I'd use this module if it existed.

Please add to your requirements that it fire actions each time it processes a bounce. So that third-party modules can automatically act on the bounce.

bengtan’s picture

Please add to your requirements that it fire actions each time it processes a bounce. So that third-party modules can automatically act on the bounce.

Shall do. I was thinking along those lines of inserting hooks for third parties as well.
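
As a rough illustration of the idea, sketched in Python for brevity (a Drupal module would presumably expose a PHP hook instead), a tiny callback registry could look like this. The event name "bounce_processed" and the functions register and fire are made up for the example.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

BounceHandler = Callable[[Dict[str, str]], None]

# Maps an event name to the third-party callbacks registered for it.
_handlers: DefaultDict[str, List[BounceHandler]] = defaultdict(list)


def register(event: str, handler: BounceHandler) -> None:
    """Let a third-party module subscribe to an event."""
    _handlers[event].append(handler)


def fire(event: str, bounce: Dict[str, str]) -> int:
    """Call every handler registered for the event; return how many ran."""
    handlers = _handlers[event]
    for handler in handlers:
        handler(bounce)
    return len(handlers)
```

A subscriptions module could then register a handler that suspends the subscription for bounce["address"], and the bounce processor would call fire("bounce_processed", {...}) once per processed bounce.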

sammys’s picture

Since everyone thinks their preferred brand of "engine" is better, and because there are different transports (e.g. IMAP, POP), my only request is that the design is pluggable, i.e. you can completely rewrite the engine module and replace the one used by default. I can assist with the design if you want to sound things out.

This will make it possible for your engine module implementation to be dependent on mailman yet other people can make theirs dependent on say enemies-of-carlotta instead.

--
Sammy Spets
Synerger Pty Ltd
http://synerger.com

sutharsan’s picture

As the maintainer of Simplenews I can say that there is a fair demand for EBH.
EBH should ideally be independent of any mail backend and have an API which various systems can use. But I realize this will not be easy.

Recently AjK started an issue with code to make an EBH for simplenews based on VERP. See the issue for more info: http://drupal.org/node/242137
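
For readers unfamiliar with VERP (variable envelope return path): the recipient's address is encoded into the return-path of each outgoing message, so a bounce identifies the failing recipient by the address it arrives at, with no need to parse the bounce body. The sketch below is only an illustration in Python and is not taken from the linked issue; the "+" and "=" encoding is one common convention, and it assumes the mail server delivers address extensions such as bounces+anything@ to the bounces mailbox.

```python
from typing import Optional


def verp_encode(recipient: str, bounce_local: str, bounce_domain: str) -> str:
    """Build a per-recipient return-path, e.g. bounces+jane=example.com@site.example."""
    local, _, domain = recipient.rpartition("@")
    return f"{bounce_local}+{local}={domain}@{bounce_domain}"


def verp_decode(return_path: str, bounce_local: str) -> Optional[str]:
    """Recover the original recipient from a VERP address, or None if it is not one."""
    local, _, _domain = return_path.rpartition("@")
    prefix = bounce_local + "+"
    if not local.startswith(prefix):
        return None
    encoded = local[len(prefix):]
    # The domain part cannot contain "=", so split on the last one.
    user, sep, domain = encoded.rpartition("=")
    if not sep or not user or not domain:
        return None
    return f"{user}@{domain}"
```

The trade-off is that every message needs its own envelope sender, so mail has to go out one recipient at a time.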

-- Erik

bengtan’s picture

Hi,

Thanks everyone for their opinions.

It certainly seems this module will be useful, and I have got a few very helpful suggestions that I will most likely follow.

I would very much like to go ahead and write this module, but it comes down to how to justify doing so, and then scheduling it into the rest of my activities.

Please keep the feedback coming!

Thanks,
Beng

--
ThinkLeft
http://thinkleft.com.au

jonathan_hunt’s picture

I'd be keen to see this. I have a forthcoming project that will use email lists extensively so better bounce handling would be useful. I can test your functionality, and perhaps contribute some code, and a few $. Can I suggest you follow up the suggestion from salvis and look to take over the Bounced Email module?

markus_petrux’s picture

There may be sites that heavily use email, newsletters, notifications, etc., so I think it would be nice to have an option to process bounces.

On the other hand... it would be nice if there was a way to promote some kind of standard for bounce notifications.

I'm currently dealing with a site with more than 300000 registered users, not on Drupal yet, but it will be someday. For bounces I created a small PHP script that simply deletes everything that arrives at a bounce-catcher email address used for the purpose. This is executed daily via cron. I can still check via webmail what is there in case this is needed, but only what accumulates between cron runs. We get around 20000 or more bounces daily, but not all of them mean permanent errors. Many are temporary: quota exceeded and the like. The problem is that it is impossible to automate any kind of analysis. Every email provider has its own format for the messages it sends, and you have to read them manually to get an idea of what they are trying to tell you... so that's why my process removes all bounces. Of course, this is neither a scalable solution nor good practice, but it works. At least, until a better solution can be applied.
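
On the standards point: a machine-readable format does exist, the delivery status notification (DSN) defined in RFC 3464, with status codes from RFC 3463 in which class 4 means a transient failure and class 5 a permanent one. Not every provider sends compliant DSNs, so this is only a partial answer. A minimal Python sketch of telling temporary from permanent bounces for the messages that do comply (the function name classify_dsn is made up for the example):

```python
import email
from typing import List, Tuple


def classify_dsn(raw: bytes) -> List[Tuple[str, str]]:
    """Return (recipient, 'permanent' | 'temporary' | 'unknown') for each
    recipient block found in a message/delivery-status part."""
    msg = email.message_from_bytes(raw)
    results = []
    for part in msg.walk():
        if part.get_content_type() != "message/delivery-status":
            continue
        # Python parses this part into a list of header blocks:
        # one per-message block, then one block per recipient.
        for block in part.get_payload():
            status = block.get("Status")
            recipient = block.get("Final-Recipient")
            if not status or not recipient:
                continue  # the per-message block has neither field
            address = recipient.split(";", 1)[-1].strip()
            kinds = {"5": "permanent", "4": "temporary"}
            results.append((address, kinds.get(status.strip()[:1], "unknown")))
    return results
```

Bounces that yield an empty list would still need provider-specific pattern matching of the kind discussed elsewhere in this thread.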

Doubt is the beginning, not the end of wisdom.

zooki’s picture

Hi what happened with this???

ragnarkurm’s picture

I'm also looking for a "bounce detection" solution in Drupal.
Or actually it could be universal to benefit all :)

Mailman and CiviMail are indeed good ideas for code reuse.

Googling turned up two more aspects which could be considered:
* loop detection
* sending probes

I also found some resources that haven't been mentioned
in this thread before. Maybe useful for someone.

OpenPSA
http://www.openpsa.org/version2/documentation/bounce-detection/

Lamson Project
http://lamsonproject.net/docs/bounce_detection.html

Email Bounce Detector
http://bouncedetector.riaforge.org/

Perl Module: Mail-DeliveryStatus-BounceParser
http://search.cpan.org/dist/Mail-DeliveryStatus-BounceParser/

PHP List
http://docs.phplist.com/PhpListConfigBounces

Moodle
http://docs.moodle.org/en/Email_processing

EZMLM (written in C)
http://www.ezmlm.org/faq/How-bounces-are-handled.html#How-bounces-are-ha...

Majordomo
(No good "bounce detection" link)

aidanlis’s picture

I'll develop it if someone wants to put an actual bounty on it.

ragnarkurm’s picture

As I need a real solution for bounce handling, I took a step forward.
I guess Mailman is the most mass-tested, mass-developed and mature software to re-use.
Today I took a look at Mailman and CiviMail.
Here are my findings.

As I understand it, processing is done in 2 layers:
1) Identifying email address and bounce reason
2) Decision algorithm that decides when to disable user
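
To illustrate what the second layer might do, here is a minimal Python sketch of a decision rule. It is not Mailman's or CiviMail's actual algorithm; the class name BouncePolicy, the threshold of 3, the 30-day window, and the choice to ignore temporary bounces entirely are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass
class BouncePolicy:
    threshold: int = 3                        # assumed: disable at 3 bounces
    window: timedelta = timedelta(days=30)    # assumed: only recent ones count
    history: Dict[str, List[datetime]] = field(default_factory=dict)

    def record(self, address: str, when: datetime, permanent: bool) -> bool:
        """Record a bounce; return True if the address should now be disabled."""
        if not permanent:
            return False  # full mailbox, vacation, etc. are ignored here
        hits = self.history.setdefault(address, [])
        hits.append(when)
        cutoff = when - self.window
        hits[:] = [t for t in hits if t >= cutoff]  # forget old bounces
        return len(hits) >= self.threshold
```

The first layer would supply the address and whether the failure looks permanent; this layer only decides when enough evidence has piled up.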

Mailman

Pros

  • Has 2-layer system
  • First layer has a MUCH richer and more diverse pattern set for extracting the email address from a bounce
  • Second layer is VERY well developed for handling temporary delivery problems
  • It's very easy to "rip out" the first layer processing for your own needs. Tested. (see code later)
  • Mass-tested, mass-developed and mature software
  • Code written in Python. It's a pro because the code is more strict.

Cons

  • Code written in Python. It's a con because the system then needs 2 different languages (PHP and Python), and hence some interfacing/bridging.
  • Second layer is probably more deeply rooted in the framework

CiviMail

Pros

  • Written in PHP, simpler to deploy/integrate.
  • Probably possible to "rip out" first layer functionality. Not tested.

Cons

  • Has 1-layer system. Almost.
  • Keeps simple patterns in database - more complexity
  • CiviCRM bounce processing (L2) usually just disables a user after 1 bounce. This is very annoying, as you constantly have to re-enable users whose mailboxes were full for a moment or who were on vacation (notification), etc.
  • A lot of different bounce reasons (vacation, mailbox full, etc.), so more code is needed to decide for yourself how to handle each case.

Dummy list

There could be one more option to handle bounce detection and handling/processing.

  • Install almost any list software whose code is freely accessible.
  • Create a dummy list containing all the email addresses.
  • No email is ever sent to the list, and all notifications are turned off.
  • Send mails to users from custom code as needed.
  • Feed all bounces to the dummy list.
  • From time to time, synchronize the enabled/disabled status between the list and the Drupal system.

Mailman bounce detection system (L1)

Proof of concept.

I have a Debian Lenny system.
Mailman version 1:2.1.11-11+lenny1.
Installation is at /var/lib/mailman

The proof-of-concept directory contains the following links:

email -> /var/lib/mailman/pythonlib/email
paths.py -> /usr/lib/mailman/bin/paths.py

They make it easier to import the needed files.
In the same directory, create the following Python script (a modification of /var/lib/mailman/tests/onebounce.py):

#! /usr/bin/env python

import sys
import email
import getopt
import paths
from Mailman.Bouncers import BouncerAPI

# print usage and stop if no files were given
if len(sys.argv) <= 1:
        print "Usage:",sys.argv[0],"<bouncefilename1> ..."
        sys.exit(1)

# loop through arguments
for file in sys.argv[1:]:

        print 'Processing',file

        # read in bounced message from file
        fp = open(file)
        msg = email.message_from_file(fp)
        fp.close()

        # loop it through different specs
        # (Yahoo, Qmail, Postfix, Exim, ...)
        # we could actually break at first match
        # but for demo's sake check all of them
        for module in BouncerAPI.BOUNCE_PIPELINE:

            modname='Mailman.Bouncers.' + module
            __import__(modname)
            addrs = sys.modules[modname].process(msg)

            # sometimes we get None instead of []
            # these modules just have different return values
            # when not matching
            if addrs is None:
                addrs = []

            # report analysis result
            if addrs is BouncerAPI.Stop:
                print module, 'warning, not a bounce'
            elif len(addrs) == 0:
                # pass
                print module, '...'
            else:
                print module, 'bounce', ', '.join(addrs)

        print

Keeping the indentation intact is important for Python.
Make the script executable (chmod 755).
It treats its command-line arguments as filenames.
It reads each file and reports the bounced addresses, or that no further processing is needed.
Output:

user@host:~/mailman-bounce-handler% ./proof-of-concept.py /var/lib/mailman/tests/bounces/postfix_01.txt
Processing /var/lib/mailman/tests/bounces/postfix_01.txt
Postfix bounce xxxxx@local.ie

user@host:~/mailman-bounce-handler%

Actually, /var/lib/mailman/tests/bounces is full of test cases,
so to test you could run:

./proof-of-concept.py /var/lib/mailman/tests/bounces/*
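
If the script were spawned from another program, its output would have to be read back in. The following is a minimal sketch of such a reader, written for Python 3 (unlike the Mailman 2.1 script above) and assuming the output format is exactly what the print statements produce: a "Processing <file>" line followed by lines of the form "<module> bounce <addr>, <addr>", "<module> ..." or "<module> warning, not a bounce". The function name parse_poc_output is hypothetical.

```python
from typing import Dict, Optional, Set


def parse_poc_output(text: str) -> Dict[str, Set[str]]:
    """Map each processed file to the set of bounced addresses reported for it."""
    results: Dict[str, Set[str]] = {}
    current: Optional[str] = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Processing "):
            current = line[len("Processing "):]
            results[current] = set()
        elif current is not None:
            # "<module> bounce a@x, b@y" is the only line type carrying addresses.
            parts = line.split(" ", 2)
            if len(parts) == 3 and parts[1] == "bounce":
                results[current].update(a.strip() for a in parts[2].split(","))
    return results
```

The text could come from something like subprocess.run([...], capture_output=True, text=True).stdout; a file that matched no pattern simply maps to an empty set.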

CiviCRM stuff

This case was not so interesting to me,
so I didn't bother to create proof-of-concept code.

Here are just some starting hints
in case somebody wants to develop something on their own.

Important file: sites/all/modules/civicrm/CRM/Mailing/BAO/BouncePattern.php
Important class: CRM_Mailing_BAO_BouncePattern
Important function: match(&$message)

Important db table: civicrm_mailing_bounce_pattern
It holds a handful of the most important bounce detection patterns.

Important db table: civicrm_mailing_bounce_type:

+----------+-----------------------------------------------+
| name     | description                                   |
+----------+-----------------------------------------------+
| AOL      | AOL Terms of Service complaint                |
| Away     | Recipient is on vacation                      |
| DNS      | Unable to resolve recipient domain            |
| Host     | Unable to deliver to destintation mail server |
| Inactive | User account is no longer active              |
| Invalid  | Email address is not valid                    |
| Loop     | Mail routing error                            |
| Quota    | User inbox is full                            |
| Relay    | Unable to reach destination mail server       |
| Spam     | Message caught by a content filter            |
| Syntax   | Error in SMTP transaction                     |
+----------+-----------------------------------------------+

exratione’s picture

I've written a general purpose and extensible bounce handling module for Drupal 7. See:

http://drupal.org/project/bounce