configurable encoding patch
fr - June 4, 2008 - 13:01
| Project: | vCard |
| Version: | 5.x-1.1 |
| Component: | Code |
| Category: | task |
| Priority: | normal |
| Assigned: | fr |
| Status: | closed |
Description
I noticed that Outlook does not display umlauts correctly because it lacks UTF-8 support.
This patch makes Drupal export an ISO-8859-1 encoded vCard, which works around the problem.
| Attachment | Size |
|---|---|
| vcard-utf8-outlook.patch | 486 bytes |

#1
What about characters that are not defined in ISO-8859-1?
#2
From my point of view, this is a bug in Outlook rather than in vCard.
So what will happen when Korean characters like 의 계정 세부사항 are included in a vCard?
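
To make the question concrete, here is a minimal sketch of a check for whether a UTF-8 string can be represented in ISO-8859-1 at all. The helper name `vcard_fits_latin1()` is hypothetical and not part of the vCard module; the check relies on ISO-8859-1 covering only the code points U+0000 to U+00FF.

```php
<?php
/**
 * Hypothetical helper (not part of the vCard module): returns TRUE when every
 * character of a UTF-8 string has a code point inside the ISO-8859-1 range.
 */
function vcard_fits_latin1($utf8_text) {
  // With the /u modifier the class matches any code point above U+00FF.
  // preg_match() returns 0 when no such character is present; it returns
  // FALSE for invalid UTF-8, which this sketch also treats as "does not fit".
  return preg_match('/[^\x{00}-\x{FF}]/u', $utf8_text) === 0;
}

var_dump(vcard_fits_latin1("Müller"));           // bool(true): umlauts fit
var_dump(vcard_fits_latin1("의 계정 세부사항")); // bool(false): Korean does not
```

Under this assumption, the Korean example has no ISO-8859-1 representation, so a plain conversion would have to drop or replace those characters.
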
#3
Well, other characters won't work anymore. As I said, it's meant to be a workaround.
Of course, the bug is in Outlook. But imagine the situation: the customer uses Outlook and wants a solution.
I guess you don't want to tell him that he's the problem.
I suppose this mainly affects people outside of Europe, who would have to replace ISO-8859-1 with their local encoding.
Another, more generic solution could be to implement Quoted-Printable encoding. How about that?
#4
Sorry, but as long as we haven't come to an agreement, this issue is not fixed.
And it is not a bug just because Microsoft is apparently too stupid to handle encodings other than ISO-8859-1.
Drupal uses UTF-8 all over the place, so I won't include this patch as it is now.
However, I could imagine piping the $vcard->fetch(); call on line 198 through a themeable function, so anyone interested in limiting the character set could do so by overriding the function in the theme's template.php.
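
A rough sketch of what such a themeable function could look like, assuming Drupal 5's naming convention in which a theme override such as `phptemplate_foo()` takes precedence over the module's `theme_foo()`. The hook name `vcard_output` is hypothetical, and the small `theme()` stand-in exists only so the sketch runs outside Drupal.

```php
<?php
// Stand-in for Drupal's theme() dispatcher, only so this sketch runs on its own.
if (!function_exists('theme')) {
  function theme($hook) {
    $args = array_slice(func_get_args(), 1);
    // Prefer the theme override, then fall back to the module default.
    foreach (array('phptemplate_' . $hook, 'theme_' . $hook) as $candidate) {
      if (function_exists($candidate)) {
        return call_user_func_array($candidate, $args);
      }
    }
    return NULL;
  }
}

// Hypothetical module default: pass the vCard text through unchanged (UTF-8).
function theme_vcard_output($vcard_text) {
  return $vcard_text;
}

// Hypothetical override in a theme's template.php: limit output to ISO-8859-1.
function phptemplate_vcard_output($vcard_text) {
  $converted = @iconv('UTF-8', 'ISO-8859-1', $vcard_text);
  // Keep the UTF-8 text if the conversion fails.
  return $converted === FALSE ? $vcard_text : $converted;
}

// In the module, the export would then read something like:
// $output = theme('vcard_output', $vcard->fetch());
```

With this split, sites that need no workaround keep the UTF-8 default, and only themes that define the override change the character set.
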
#5
I never said it's a Drupal bug.
You did not answer my second question - I also recommended another, probably more generic, solution.
I think your approach would also be fine, but it has the disadvantage of requiring theming effort.
#6
Again, this is not a bug in vCard.
If you ask Google, this problem has been well known to MS for a long time.
And yes, you would have to tell your users that this is a bug in MS software.
--
According to [1], the only valid encodings since V3 of the vCard specification are 8BIT for strings and Binary for files.
Quoted-Printable is only available in the vCard specification up to V2.1.
So a possible solution is to provide additional settings on admin/settings/vcard where the admin may choose to generate V3 or V2.1 vCards.
If V2.1 has been chosen, one may choose to encode in Quoted-Printable.
In vcard_fetch() we need to add a charset param for each element, and the contents need to be encoded in Quoted-Printable.
So, how to encode in Quoted-Printable?
There's no function in PHP that I know of.
[1] http://www.ietf.org/rfc/rfc2426.txt
#7
Yes, but UTF-8 uses more than 8 bits, so the vCard module seems to violate the RFC.
The characters have to be encoded in ASCII.
Here is a function for Quoted-Printable encoding:

    // PHP 5.3 and later ship a built-in quoted_printable_encode(), so only
    // define this version where no such function exists yet.
    if (!function_exists('quoted_printable_encode')) {
      function quoted_printable_encode($input, $line_max = 76) {
        $hex = array('0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F');
        $lines = preg_split("/(?:\r\n|\r|\n)/", $input);
        $eol = "\r\n";
        $linebreak = "=0D=0A";
        $escape = "=";
        $output = "";
        for ($j = 0; $j < count($lines); $j++) {
          $line = $lines[$j];
          $linlen = strlen($line);
          $newline = "";
          for ($i = 0; $i < $linlen; $i++) {
            $c = substr($line, $i, 1);
            $dec = ord($c);
            if (($dec == 32) && ($i == ($linlen - 1))) { // convert space at eol only
              $c = "=20";
            }
            elseif (($dec == 61) || ($dec < 32) || ($dec > 126)) { // always encode "\t", which is *not* required
              $h2 = (int) floor($dec / 16); // high nibble
              $h1 = $dec % 16;              // low nibble
              $c = $escape . $hex[$h2] . $hex[$h1];
            }
            if ((strlen($newline) + strlen($c)) >= $line_max) { // CRLF is not counted
              $output .= $newline . $escape . $eol; // soft line break; " =\r\n" is okay
              $newline = " ";
            }
            $newline .= $c;
          }
          $output .= $newline;
          if ($j < count($lines) - 1) {
            $output .= $linebreak;
          }
        }
        return trim($output);
      }
    }

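
As a usage sketch for this kind of encoder, the following shows how a single vCard 2.1 property could carry a Quoted-Printable value together with its charset parameter. The helper name `vcard21_qp_property()` is hypothetical. To stay self-contained, the sketch calls PHP's built-in `quoted_printable_encode()`, which exists from PHP 5.3 on, and assumes the value is UTF-8.

```php
<?php
/**
 * Hypothetical helper: formats one vCard 2.1 property whose value is
 * Quoted-Printable encoded and labelled with its charset.
 */
function vcard21_qp_property($name, $utf8_value) {
  // The built-in encoder escapes "=" and every byte outside printable ASCII.
  return $name . ';CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:'
    . quoted_printable_encode($utf8_value);
}

echo vcard21_qp_property('FN', 'Jörg Müller'), "\r\n";
// FN;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:J=C3=B6rg M=C3=BCller
```

The encoder wraps long values on its own, without counting the property name and parameters in front of the value, so long fields would need extra care with line lengths.
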
Everybody knows it and that's not the point.
The question is: how can you give users a way to work around that without modifying Drupal core?
Sure, but that doesn't help them, since they just want something that simply works.
How about being less ignorant and providing a way to solve this issue so that everyone is happy?
Yes, this would be cool.
If you can agree to this, I am going to release a new patch soon that makes the vCard module's encoding configurable in the admin menu.
#8
If you provide a valid solution, be sure it'll be incorporated.
But it mustn't reduce the number of available characters from 1,114,112 to 191, as proposed in your first post!
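
The thread does not say how these two figures were counted. One plausible reading, shown here as a small arithmetic sketch, is the total number of Unicode code points against the printable characters of ISO-8859-1.

```php
<?php
// Unicode code points run from U+0000 to U+10FFFF.
$unicode_code_points = 0x10FFFF + 1;

// Printable ISO-8859-1 (assumed counting): 0x20-0x7E (ASCII) plus 0xA0-0xFF.
$latin1_printable = (0x7E - 0x20 + 1) + (0xFF - 0xA0 + 1);

echo $unicode_code_points, ' vs ', $latin1_printable, "\n"; // 1114112 vs 191
```
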
And there's nothing about religion in this.
By the way, would you please file a bug against Microsoft Outlook [1], so that they know people want UTF-8 support in it?
[1] http://weblog.timaltman.com/archive/2006/03/22/reporting-bugs-microsoft
#9
I just uploaded a new patch that makes the encoding configurable and covers common charsets (using iconv).
It defaults to UTF-8, even though this is not RFC-compliant.
I haven't implemented Quoted-Printable encoding or the ability to change the vCard version yet... stay tuned.
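
The patch itself is not reproduced in this thread. As a minimal sketch of the idea only, a configurable charset with a UTF-8 default could look like the following. The function name and the `vcard_charset` setting mentioned in the comment are assumptions, not taken from the patch.

```php
<?php
/**
 * Hypothetical sketch of a configurable export charset (not the actual patch).
 * In the module, $charset would come from an admin setting, for example
 * variable_get('vcard_charset', 'UTF-8'), where the variable name is assumed.
 */
function vcard_convert_charset($utf8_text, $charset = 'UTF-8') {
  if (strtoupper($charset) === 'UTF-8') {
    // Default: leave Drupal's UTF-8 output untouched.
    return $utf8_text;
  }
  // iconv() typically returns FALSE when a character has no equivalent in the
  // target charset; in that case this sketch falls back to the UTF-8 text.
  $converted = @iconv('UTF-8', $charset, $utf8_text);
  return $converted === FALSE ? $utf8_text : $converted;
}

// Example: export for a client that expects ISO-8859-1.
$latin1 = vcard_convert_charset('Jörg Müller', 'ISO-8859-1');
```

Falling back to UTF-8 on failure is one possible design choice; a module could equally report the problem to the admin instead.
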
Yes, I'll do that ;)
#10
Automatically closed -- issue fixed for two weeks with no activity.