Migrating MailMan archives to Drupal forums

cyberchucktx - August 6, 2005 - 00:18

All:

I'm about to tackle a migration from MailMan archives to Drupal forums.

We have already got MailMan->forum posting working (via mailhandler/listhandler, with
some mods) and forum->MailMan via the same method(s).

Now we are planning to import our existing MailMan archives to the corresponding Drupal forum(s).

I've done a lot of work on understanding mailhandler/listhandler. Since a single MailMan submission posts just fine to forums (including threading, which is cool), it should be relatively straightforward to write a bulk import procedure.

Questions:

  1. Anyone interested in this capability at all?
  2. Has anyone else done this already?
  3. When I finish it where should I post? I estimate about two weeks to get it running smoothly (may take less time, but I *do* have a full-time job :-)

Charlie (aka "cyberchucktx")

I would be interested in this

toraji - October 31, 2005 - 18:03

I have got the MailMan->forum and back again working, but was scratching my head trying to figure out how to migrate my MailMan archives. Do you have any more info?

TIA!

I would be very very interested in this

marknewlyn - October 31, 2005 - 18:07

Hi

I have a project which we are trying to migrate to a web-based structure and we have 3 years of mailman archives which I don't want to throw away.

Importing them would really help make the transition smoother.

I can help but only from a tinkering point of view - no guru on this stuff yet - just learning and playing around right now.

Mark

Recommendation:

handelaar - November 6, 2005 - 14:27

This is on my to-do list for evolt.org, in fact.

I've already decided that the thing to do is start from the mbox archives which Mailman stores. A shell script or shell-invoked bit of PHP would break these files into separate email messages and throw them at listhandler's mail account.
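
For anyone wanting to try the splitting step, here is a minimal sketch in Python (not the shell/PHP script described above, and not code from mailhandler or listhandler). It uses the standard-library mailbox module to write each message of an mbox file to its own .eml file. The function name split_mbox and the output file naming are made up for illustration. Delivering the resulting files to listhandler's mail account is left out, since that depends on your mail setup.

```python
import mailbox
from pathlib import Path


def split_mbox(mbox_path, out_dir):
    """Write each message in an mbox file to its own .eml file.

    Hypothetical helper: returns the list of files written, in archive order.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    # create=False makes a missing archive an error instead of an empty file.
    box = mailbox.mbox(str(mbox_path), create=False)
    try:
        for index, message in enumerate(box):
            target = out / f"{index:06d}.eml"
            target.write_bytes(message.as_bytes())
            written.append(target)
    finally:
        box.close()
    return written
```

Something like split_mbox("list.mbox", "messages") would then leave one numbered file per message in the messages directory.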

Let me know if you get to it before I do :)

jh

I'm interested

Michelle - November 6, 2005 - 16:07

I'd like to know how you did the Mailman to the forum. I just yesterday took a look at the mailing list capabilities I have with my web host because I wanted to do something that I couldn't do in Drupal. My host uses the Mailman program so I'm interested in how people have that integrated with Drupal. Having it go into the forums sounds kinda cool. I could use that for one site where the members are used to email and kind of scared of the forums. LOL

Michelle

So?

reece146 - January 30, 2006 - 16:57

Did anything come of this?

I'm just about to migrate 7 years of mailman archives to drupal and would appreciate any pointers.

Was thinking I'd give a swipe at copying the archives to one big mbox and use the mailhandler module.

Way to attempt? Better way?

Right. what you can do is to

killes@www.drop.org - January 30, 2006 - 17:15

Right. What you can do is to define the mbox files of your archive as a mailhandler inbox and read them from there. Be sure not to import too many at once. If you already have forum posts from that mailing list, the threading for recent posts might be a bit off. I'd recommend doing this on a test site first. :)
--
Drupal services
My Drupal services

Sounds reasonable. The

reece146 - January 30, 2006 - 17:23

Sounds reasonable. The existing site is PN and I'm migrating it to drupal (second domain this month!).

I'm just doing the grunt work of config'ing the module(s) currently; will report back with results.

Btw, what do you deem "too much" for import at a time? The obvious breakpoints are per month I suppose. Just a point of curiosity. Fwiw, I've used m2f to migrate from mailman to phpbb2 in one large import in the recent past and it worked fine. It's this same dataset so...

T!
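
Whatever the right number per import turns out to be, cutting the archive into fixed-size pieces could look like the sketch below. It is Python using only the standard-library mailbox module, and it copies an archive into numbered mbox files of at most batch_size messages, so each file can be read as a separate inbox. The name write_batches and the file naming are assumptions for illustration, and no particular batch size is implied.

```python
import mailbox
from pathlib import Path


def write_batches(mbox_path, out_dir, batch_size):
    """Copy an mbox archive into numbered mbox files of at most batch_size messages.

    Hypothetical helper: returns the batch file paths in archive order.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    source = mailbox.mbox(str(mbox_path), create=False)
    paths = []
    batch = None
    try:
        for index, message in enumerate(source):
            if index % batch_size == 0:
                # Start a new batch file; close (and flush) the previous one.
                if batch is not None:
                    batch.close()
                path = out / f"batch-{index // batch_size:04d}.mbox"
                batch = mailbox.mbox(str(path), create=True)
                paths.append(path)
            batch.add(message)
    finally:
        if batch is not None:
            batch.close()
        source.close()
    return paths
```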

"Too much" really depends on

killes@www.drop.org - January 30, 2006 - 17:46

"Too much" really depends on your server. If you increase the timeout value in cron.php to several minutes and your hardware is not from a museum you probably can import several hundreds of mails per cron run. You really need to find out what is right for your setup.

Progress

reece146 - January 30, 2006 - 22:18

Ok, made good progress today. Just wiped the DB to start over in the AM with a fresh set of data and such.

One little series of quirks...

If you use the mailhandler module to import your mailman data and the user (identified by email addr) is not in your user database then the mailman message will not be imported...

In fact, the email addr is sent a message that their email could not be added to drupal (not able to create node blah, blah...).

Is there a way to turn this off temporarily? Bulk import of seven years worth of messages means that some of the email addresses will not even exist any more... if there was a work around it would save me the task of doing a sed/awk of my mailman archives for consistency against the drupal database...

Can I take non-valid email addrs and stick them in the mail alias field (mailalias.module) for the user? That was my plan of attack for the AM after doing some sed/awk work. I've got about 140 different addresses that need to be mapped to 60 active addresses... good thing it is a high volume, low user count mailing list and not the other way around. :)
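
Before doing the sed/awk pass, it may help to list which sender addresses in the archive are not yet known to the site. The sketch below is one way to do that in Python with the standard library. The function name unknown_senders is made up, and known_addresses is assumed to be a list of addresses you have exported from the Drupal user table yourself; nothing here talks to Drupal or to mailalias.module.

```python
import mailbox
from collections import Counter
from email.utils import parseaddr


def unknown_senders(mbox_path, known_addresses):
    """Count messages per From address that is not in known_addresses.

    Hypothetical helper: comparison is case-insensitive.
    """
    known = {addr.strip().lower() for addr in known_addresses}
    counts = Counter()
    box = mailbox.mbox(str(mbox_path), create=False)
    try:
        for message in box:
            # parseaddr splits 'Name <addr>' into (name, addr).
            _, addr = parseaddr(str(message.get("From", "")))
            addr = addr.lower()
            if addr and addr not in known:
                counts[addr] += 1
    finally:
        box.close()
    return counts
```

The addresses that come back are the ones that would need either a user account or an alias before the import.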

Also, a hint for anyone else that stumbles across this thread: in order to map your mail list to your forum you need to stick something like:

type: forum
taxonomy: [My Forum]
promote: 0
status: 1

in your mailhandler default command field.

(from http://drupal.org/node/38943)

Sorry if this is obvious, took me some time to stumble around and find it.

No threading?

reece146 - February 1, 2006 - 18:36

I've done a bulk import of a "bazillion" mailman messages after cleaning up the user database and ensuring that the mail aliases (mailalias.module) are fully populated with every old address the user has had since 1999 (royal PITA awk/sed work).

Anyway, I've bulk imported the entire mailman archive (~1500 messages per minute) and there is no threading?

Is this a failing of the mailhandler module or my archive?

I know that I can migrate my mailman archive into phpBB2 using mail2forum and get threading so I'm thinking the former.

I can always move my archive from mailman -> phpBB2 -> drupal to get the threading, since that seems to work, but I'm kinda concerned that once I'm up and running there will be no threading for new messages posted after the import. Or does listhandler add that extra piece? (Haven't gotten that far yet.)

Any responses and insights appreciated. I'm SO close! :)

r@m

There should be threading,

killes@www.drop.org - February 2, 2006 - 10:12

There should be threading, it is implemented.

Clarification Please

reece146 - February 2, 2006 - 14:38

How does the module handle threading from inbound email? Is it by comparing the subject line with recent forum posts or by msg-id?

If the latter, that is probably why my mailman archive doesn't work.

Also, after playing with the forum module a little bit it seems that the way drupal does forums is that the root of a topic is a node and then any subsequent posts to the topic are comments. All my imported mailman messages from the archive became separate nodes, not comments.

Is this correct or is there something wonky with my config or import?
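
One way to check whether an archive carries the headers that msg-id threading would need is to count them. This is a small Python sketch using the standard-library mailbox module; the function name threading_header_stats is made up, and it only reports what is in the mbox file, not what mailhandler or listhandler do with it.

```python
import mailbox


def threading_header_stats(mbox_path):
    """Count messages that carry Message-ID and reply headers.

    Hypothetical helper: returns a dict with 'total', 'message_id'
    and 'reply_headers' (In-Reply-To or References present).
    """
    stats = {"total": 0, "message_id": 0, "reply_headers": 0}
    box = mailbox.mbox(str(mbox_path), create=False)
    try:
        for message in box:
            stats["total"] += 1
            if message.get("Message-ID"):
                stats["message_id"] += 1
            if message.get("In-Reply-To") or message.get("References"):
                stats["reply_headers"] += 1
    finally:
        box.close()
    return stats
```

If reply_headers comes back near zero for a list that clearly had long discussions, the archive itself is the more likely culprit.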

There is something going

killes@www.drop.org - February 2, 2006 - 18:17

There is something going wrong.

The module uses msg-ids, as it should. If all your followups end up as nodes rather than comments, something is really going wrong.

The module first uses msg-ids; if that fails, it tries to find matching titles, and failing that it posts a new node. See listhandler_find_parent.
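
The fallback order described above can be sketched as follows. This is an illustration in Python, not the actual listhandler_find_parent code (which is PHP and may differ in detail). The data layout, the helper normalize_subject, and the choice to attach a title match to the oldest matching post are all assumptions made for the example.

```python
import re


def normalize_subject(subject):
    """Strip leading 'Re:' prefixes and collapse whitespace for title matching."""
    text = subject or ""
    text = re.sub(r"^(\s*re\s*:\s*)+", "", text, flags=re.IGNORECASE)
    return " ".join(text.split()).lower()


def find_parent(message, posts):
    """Return the id of the post this message replies to, or None for a new topic.

    message: dict with optional 'in_reply_to', 'references' (list), 'subject'.
    posts: list of dicts with 'id', 'message_id', 'subject', oldest first.
    """
    by_msgid = {p["message_id"]: p["id"] for p in posts if p.get("message_id")}

    # Step 1: message ids, In-Reply-To first, then References newest to oldest.
    candidates = [message.get("in_reply_to")]
    candidates += list(reversed(message.get("references") or []))
    for msgid in candidates:
        if msgid in by_msgid:
            return by_msgid[msgid]

    # Step 2: fall back to a matching title (oldest match, by assumption).
    wanted = normalize_subject(message.get("subject"))
    if wanted:
        for post in posts:
            if normalize_subject(post.get("subject")) == wanted:
                return post["id"]

    # Step 3: nothing matched, so the caller would post a new node.
    return None
```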

Happy, Happy, Joy, Joy

reece146 - February 2, 2006 - 20:44

Things are working well...

Minor quibble...

New forum entries are given the import date as the creation date. Normally this would be OK, but since I'm importing old content, is there a quick way to make the creation date the message timestamp? Any pointer to where to change it in the source, and what to change it to?

:)

Hmm, no this isn't

killes@www.drop.org - February 2, 2006 - 21:16

Hmm, no this isn't implemented. The mailhandler module would be the place to look at. imap_header() is used and returns the date in unix time. So a patch would be possible.
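
To illustrate what such a patch would need to compute, here is the same conversion sketched in Python: taking a message's Date header and turning it into Unix time that could be stored as the creation date. This is not mailhandler code and does not use imap_header(); the function name message_timestamp is made up for the example.

```python
from email.utils import parsedate_to_datetime


def message_timestamp(date_header):
    """Convert an RFC 2822 Date header to Unix time, or None if unparseable.

    Hypothetical helper. Headers without a usable timezone are
    interpreted in the local timezone by datetime.timestamp().
    """
    if not date_header:
        return None
    try:
        parsed = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        # Older Pythons raise TypeError, newer ones ValueError, on bad input.
        return None
    return int(parsed.timestamp())
```

A caller would fall back to the import time whenever this returns None.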

Update

killes@www.drop.org - April 30, 2008 - 16:45

This has been implemented in mailhandler module during the last two years.

Importing INBOXen through the mailhandler/listhandler combination should work, including threading and creation of user accounts for unknown addresses.

Folder vs IMAP/POP

reece146 - February 2, 2006 - 15:21

Btw, messages are retrieved via a folder (populated by a user procmail recipe, :0c).

Would this make any difference wrt threading?

New (active mailing list) messages sent via mailman to the forums get there but are not threaded. (no longer worried about archives)

Comments added to threads are not sent out via listhandler it seems. Nothing happens.

What am I missing?

Not the issue

reece146 - February 2, 2006 - 16:55

I've got this working now. I forgot to create the list handler role.

Also, are blog.module and its associated permissions required for these modules to work? Things started working once I installed that.

I noticed user posts were producing "unable to create blog content" errors, and once blog.module was installed the problems all went away.

Still need to revisit the threading from bulk import. May be related.

Create the list handler role?

arturoramos - February 3, 2006 - 05:54

Can you give us some more detail here? I have been trying to do a very similar thing: importing about three years of YahooGroups messages into Drupal and then setting up a system much like YahooGroups, where someone can email to a list and it gets posted and emailed back out to anyone on the list...

I have read through all of the documentation for listhandler and mailhandler and cannot figure out how this works other than the posting to the forums.

I am also getting tons of "unable to post to closed node" messages when I attempt my import from the MBOX file into the forums.

My question is...what does this mean?

I forgot to create the list handler role...

What is the second address supposed to be in the mailhandler program?

Listhandler role stuff

reece146 - February 3, 2006 - 14:44

Make sure you read and implement points 7 and 8 in the listhandler INSTALL doc (i.e. /var/www/$drupal_web_root/modules/listhandler/INSTALL).

I've got the second address pointed at the mailman list's posting address.

I haven't seen the closed node error. In your mailhandler commands are you setting the node as closed when importing? Bullet point 6 in the INSTALL doc gives the absolute minimum of what is required for your mailhandler config and commands.

HTH

p.s. works like a charm once everything is setup/configured properly. Very much worth the effort.

i'm interested

mpamphile - February 2, 2006 - 15:15

i'm interested
