When this filter is enabled in an input format and the text contains unencoded special characters, such as the registered trademark symbol or em dashes, extra characters appear around them, whether or not they're inside links.

Examples: Â® rather than just ®, or â rather than an em dash.

The problem occurs even when ELF is the only filter enabled, so it doesn't seem to be a conflict with other filters. When I turn ELF off, the extra characters disappear.

Drupal 6.16, ELF 2010/04/27

Comment | File | Size | Author
#3 | elf-787980-3.patch | 701 bytes | fuerst

Comments

Anonymous commented:

I have the same problem.

fuerst commented:

That seems to be a problem with the underlying libxml2 library used by PHP.

A workaround is to convert the text to ISO-8859-1 with utf8_decode (http://de2.php.net/utf8_decode) before loading it: $doc->loadHTML(utf8_decode($text));

Not an ideal solution, though. Perhaps PHP's DOMDocument isn't the way to go for this module? As far as I know, this class isn't used anywhere else in Drupal, maybe for good reason.
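
To illustrate why the utf8_decode workaround is lossy, here is a minimal sketch. The variable names are made up for this example and it is not taken from the module. It relies on the documented behaviour that utf8_decode replaces any character outside ISO-8859-1 with "?". Note that utf8_decode is deprecated as of PHP 8.2.

```php
<?php
// Illustrative only: these variable names are hypothetical.
$registered = "\xC2\xAE";      // "®" (U+00AE) in UTF-8, representable in ISO-8859-1
$em_dash    = "\xE2\x80\x94";  // em dash (U+2014) in UTF-8, NOT in ISO-8859-1

// utf8_decode() converts UTF-8 to ISO-8859-1.
$decoded_registered = utf8_decode($registered);
$decoded_em_dash    = utf8_decode($em_dash);

// The registered sign survives as the single Latin-1 byte 0xAE.
echo bin2hex($decoded_registered), "\n"; // ae

// The em dash has no ISO-8859-1 equivalent and is replaced by "?" (0x3F).
echo bin2hex($decoded_em_dash), "\n";    // 3f
```

So the workaround would fix the registered symbol but silently turn em dashes, curly quotes and any non-Latin-1 text into question marks.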

fuerst commented:

Status: Active » Needs review
Attachment status: new, size: 701 bytes

Hm, it looks like the solution is easy and libxml is not to blame: loadHTML expects the encoding to be declared in the <head> section of the HTML document. See the first comment at http://de2.php.net/manual/en/domdocument.loadhtml.php.

Adding <head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"/></head> to the text before loading it works for me.

Patch attached.
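
For readers without the patch at hand, the idea described above can be sketched roughly as follows. This is an assumption-laden illustration, not the contents of elf-787980-3.patch: the function name example_load_utf8_fragment and the sample text are hypothetical.

```php
<?php
/**
 * Hypothetical helper (not the module's actual code): load a UTF-8 HTML
 * fragment into a DOMDocument with the encoding declared up front.
 */
function example_load_utf8_fragment($text) {
  // Declare the charset in a <head> so loadHTML() parses the bytes as UTF-8.
  $head = '<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"/></head>';
  $doc = new DOMDocument();
  $doc->loadHTML($head . $text);
  return $doc;
}

// Sample fragment containing a UTF-8 encoded registered sign (U+00AE).
$doc = example_load_utf8_fragment("<p>Acme\xC2\xAE</p>");
$paragraph = $doc->getElementsByTagName('p')->item(0);

// DOM strings are returned as UTF-8, so the character should round-trip
// unchanged instead of picking up an extra leading byte.
$content = $paragraph->textContent;
echo bin2hex($content), "\n";
```

Unlike the utf8_decode workaround, this keeps characters outside ISO-8859-1 intact, because the text is never converted to a narrower encoding.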

troybthompson commented:

Excellent, that worked for me.

fuerst commented:

In that case, you can set this issue's status to "reviewed & tested by the community" to let the maintainer know. Thanks!

troybthompson commented:

Status: Needs review » Reviewed & tested by the community

xano commented:

Status: Reviewed & tested by the community » Fixed

Thanks a lot!

Fixed and committed.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

vitok-dupe commented:

Version: 6.x-3.x-dev » 7.x-3.x-dev
Status: Closed (fixed) » Active

There is now an encoding problem in the D7 version. If I save a node in English, everything is fine, but when I save one in Russian, I get a mess of garbled characters. On the edit page, all the Russian letters display normally.

xano commented:

Status: Active » Postponed (maintainer needs more info)

Can you give step-by-step instructions to reproduce the problem?

xano commented:

Status: Postponed (maintainer needs more info) » Closed (cannot reproduce)

Please open this issue again if you can provide step-by-step instructions on how to reproduce the problem using a clean installation and the latest dev versions of Drupal and External Links Filter.