My site is set up to import e-mails into nodes, placing the From, Reply-To and Subject headers into CCK fields.
I recently received an e-mail where all three of these headers were MIME-Encoded (e.g. "=?utf-8?q?Hello_World?=" instead of "Hello World"). I was pleasantly surprised to see that the Subject header was correctly decoded in the node that was created, but puzzled that the From and Reply-To headers were not.
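For reference, the encoded form quoted above can be decoded in isolation with PHP's `iconv_mime_decode()`, the same function mailhandler uses for the subject. This is only a minimal sketch run outside Drupal; the variable names are illustrative and not taken from mailhandler.

```php
<?php
// The MIME encoded-word quoted in the report above.
$encoded = '=?utf-8?q?Hello_World?=';

// Mode 0 and the "UTF-8" target charset mirror the call mailhandler
// already makes for the Subject header.
$decoded = iconv_mime_decode($encoded, 0, 'UTF-8');

// In Q-encoding an underscore stands for a space, so this prints "Hello World".
print $decoded . "\n";
```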
After looking into mailhandler's code, I found the following in mailhandler/plugins/mailhandler/commands/MailhandlerCommandsHeaders.class.php, inside the loop that parses each line of the header in the function process():

    if (in_array($key, array('Subject', 'subject'))) {
      $message[$key] = iconv_mime_decode($value, 0, "UTF-8");
    }
    else {
      $message[$key] = $value;
    }

This code clearly explains the behaviour I'm seeing. Is there any reason why only the subject header is MIME-decoded, and not the other headers? Changing the code to the following has fixed the problem I was seeing, but I'm not sure if MIME-decoding should be applied to all headers, or selectively. This change only includes the headers I needed for my site, but, for example, the "To" header in the e-mail in question was also MIME-encoded, so should really be dealt with like this as well. I'm not sure which other headers would also need MIME-decoding.

    if (in_array($key, array('Subject', 'subject', 'fromaddress', 'reply_toaddress'))) {
      $message[$key] = iconv_mime_decode($value, 0, "UTF-8");
    } else if (in_array($key, array('from', 'reply_to'))) {
      $message[$key] = array();
      foreach ($value as $array_key => $array_object) {
        $array_object->personal = iconv_mime_decode($array_object->personal, 0, "UTF-8");
        $message[$key][$array_key] = $array_object;
      }
    } else {
      $message[$key] = $value;
    }

A patch with this change (against 6.x-2.4) is attached; as far as I can see this should apply to 6.x-2.x-dev and 7.x-2.x-dev as well.
| Comment | File | Size | Author |
|---|---|---|---|
| #2 | mailhandler-mime-all-headers.patch | 1.26 KB | penguin25 |
| | mailhandler-mime-from-replyto-headers.patch | 1.32 KB | penguin25 |
Comments
Comment #1
danepowell commented
AFAIK, the only reason that the subject is decoded is due to this issue: #1312694: Clean ISO-8859-1 embedded in mail subjects or filenames
Just like in this case, the submitter only had a problem with one header field (the subject), so that's the only field that got decoded :)
So yes, we should probably just decode all of the headers. Thanks for reporting.
Comment #2
penguin25 commented
Attached is a modified patch that does the decoding for all headers.
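As a rough illustration of what "decoding for all headers" could look like, here is a sketch that walks a header value recursively, so that plain strings (such as `fromaddress`) and arrays of address objects (such as `from`, with its `personal` property) are handled by the same code. This is not the attached patch: the function name `example_mime_decode_value()` is hypothetical, and using `ICONV_MIME_DECODE_CONTINUE_ON_ERROR` instead of mode `0` is an assumption made here so that malformed headers are less likely to be dropped.

```php
<?php
/**
 * Hypothetical helper: MIME-decode a header value of any shape.
 *
 * Strings are decoded, arrays and objects are walked recursively, and
 * anything else is returned unchanged.
 */
function example_mime_decode_value($value) {
  if (is_string($value)) {
    $decoded = iconv_mime_decode($value, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, 'UTF-8');
    // iconv_mime_decode() returns FALSE on failure; keep the raw value then.
    return $decoded === FALSE ? $value : $decoded;
  }
  if (is_array($value)) {
    return array_map('example_mime_decode_value', $value);
  }
  if (is_object($value)) {
    // Work on a copy so the caller's object is not modified in place.
    $copy = clone $value;
    foreach (get_object_vars($copy) as $property => $property_value) {
      $copy->$property = example_mime_decode_value($property_value);
    }
    return $copy;
  }
  return $value;
}

// Usage with a made-up address object shaped like the entries in 'from'.
$address = new stdClass();
$address->personal = '=?utf-8?q?Hello_World?=';
$address->mailbox = 'someone';
$address->host = 'example.com';

$headers = array('from' => array($address), 'subject' => '=?utf-8?q?Hello_World?=');
$decoded_headers = array_map('example_mime_decode_value', $headers);
print $decoded_headers['from'][0]->personal . "\n";
```

Inside the loop in `process()`, the equivalent would presumably be a single assignment of the decoded value to `$message[$key]` for every key, with no list of header names to maintain.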
Comment #3
danepowell commented
Comment #5
danepowell commented
Marked #1391336: Garbled fromaddress with certain encodings as dupe
Comment #6
danepowell commented
Disabled automated testing...
Comment #7
danepowell commented
I fixed this in 7.x and just need to port it back to 6.x. There was a slight bug with your patch in #2 that I fixed.
http://drupalcode.org/project/mailhandler.git/commit/5e7cd6d
I also added a test for MIME-encoded headers:
http://drupalcode.org/project/mailhandler.git/commit/068810a
Comment #8
danepowell commented
http://drupalcode.org/project/mailhandler.git/commit/49738a1
http://drupalcode.org/project/mailhandler.git/commit/55d6778