"Þ" is being replaced with "&amp;THORN;" instead of "&THORN;".
I reported TinyMCE incorrectly messing up my Icelandic texts some time ago (http://drupal.org/node/21060). At that time I gave up on WYSIWYG editors and stuck to HTML. Now I tried FCKeditor and had the same thing happen.
After many hours of reading through JavaScript code, convinced that FCKeditor was at fault, I discovered that with all Input modules disabled for the node, everything was fine.
In filter.module there is a line:
Line 997: $chunk = preg_replace('/&([^#])(?![a-z]{1,8};)/', '&amp;$1', $chunk);
This line assumes that no entity names contain upper-case letters. Capital thorn (THORN), however, does, and that's why this only messed up this SINGLE character.
I changed it to:
$chunk = preg_replace('/&([^#])(?![a-zA-Z]{1,8};)/', '&amp;$1', $chunk);
Now everything works fine. I'd post a patch, but I have no idea how at the moment.
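To make the difference concrete, here is a minimal sketch that runs both patterns on a few sample strings. The function names escape_original() and escape_patched() are hypothetical and exist only for this illustration; the replacement string is assumed to be '&amp;$1', which matches the behaviour described in this report.

```php
<?php
// Hypothetical helpers for illustration only; they are not part of filter.module.

// The pattern as it appears on the reported line: lower-case entity names only.
function escape_original($chunk) {
  return preg_replace('/&([^#])(?![a-z]{1,8};)/', '&amp;$1', $chunk);
}

// The proposed pattern: upper-case letters are allowed in entity names too.
function escape_patched($chunk) {
  return preg_replace('/&([^#])(?![a-zA-Z]{1,8};)/', '&amp;$1', $chunk);
}

// A named entity in upper case, one in lower case, one with only a leading
// capital, a bare ampersand, and a numeric entity.
$samples = array('&THORN;', '&thorn;', '&Aacute;', 'AT&T', '&#222;');

foreach ($samples as $sample) {
  printf("%-10s original: %-14s patched: %s\n",
    $sample, escape_original($sample), escape_patched($sample));
}
```

Only the first sample should differ between the two columns: the original pattern turns &THORN; into &amp;THORN;, while the patched one leaves it alone. &Aacute; survives both because the first character after the ampersand is consumed by the capture group before the lookahead runs.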
Comments
Comment #1
imerlin commented
Input filter swallowed the first line... I'll try to explain.
* WYSIWYG is translating "Þ" to "& THORN;" (remove space)
* Drupal's input module is translating "& THORN;" to "& amp; THORN;"
This messes up a lot of Icelandic texts. I'm not sure whether other countries use this character, but it's in the W3C doc.
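One rough way to check which other named entities are affected is to run every entity PHP knows about through both patterns. This is only a sketch: it assumes PHP's built-in HTML 4.01 table (via get_html_translation_table()) is a fair stand-in for the W3C list, and affected_entities() is a hypothetical helper name.

```php
<?php
// Assumption: PHP's HTML 4.01 entity table approximates the W3C entity list.
$table = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// Hypothetical helper: returns the entities that the given pattern would escape.
function affected_entities(array $entities, $pattern) {
  $affected = array();
  foreach ($entities as $entity) {
    // An entity is affected if the filter rewrites its leading ampersand.
    if (preg_replace($pattern, '&amp;$1', $entity) !== $entity) {
      $affected[] = $entity;
    }
  }
  return $affected;
}

$original = affected_entities($table, '/&([^#])(?![a-z]{1,8};)/');
$patched = affected_entities($table, '/&([^#])(?![a-zA-Z]{1,8};)/');

print "Escaped by the original pattern: " . implode(' ', $original) . "\n";
print "Still escaped by the patched pattern: " . implode(' ', $patched) . "\n";
```

Any name with a capital letter after its first character trips the original pattern. Names containing digits, such as &frac12;, are not matched by [a-zA-Z] either, so the patched pattern would still escape those.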
Comment #2
Steven commented
This has been fixed in 4.6/HEAD recently.
Comment #3
dries commented