(I posted this to the drupal-support mailing list and was advised to post it here. I don't know whether it belongs as a bug report or as a feature request.)
I am setting up a Spanish-language site using Drupal 4.4.0 and have run into problems when using the Textile filter. If I don't use any filters, content with accents shows up correctly:
"La Asesoría Académica es uno de los centros oficiales de información"
But if I use the Textile filter (latest version, with either of the two possible modes, Textile 1 or Textile v2b), I get the following undesired result:
"La AsesorÃa Académica es uno de los centros oficiales de informaciÃ"
I think that UTF-8 encoding has something to do with it... I would like to know if anyone has run into similar problems and if there's a solution.
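The garbled output has the shape you get when UTF-8 bytes are read one byte at a time as ISO-8859-1. A minimal Python sketch of that effect, for illustration only (it is not the Textile filter's code, and the helper name `misread_as_latin1` is made up here):

```python
def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode each byte as ISO-8859-1."""
    return text.encode("utf-8").decode("latin-1")


original = "Asesoría"
garbled = misread_as_latin1(original)

# "í" is the two UTF-8 bytes 0xC3 0xAD; read as ISO-8859-1 they become
# "Ã" followed by an invisible soft hyphen (U+00AD).
print(repr(garbled))

# The damage is reversible as long as no bytes were dropped along the way.
restored = garbled.encode("latin-1").decode("utf-8")
print(restored == original)
```

If a filter processes the text byte by byte under a single-byte assumption, each accented character can be split like this, which would be consistent with the "Ã" seen above.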
Another question: is there any way for Drupal to add line-break tags automatically when you insert breaks in the editing area?
Thanks in advance,
álvaro
Comments
Comment #1
magnestyuk commented
The same here, except I modified all header charsets to iso-8859-2. Things worked fine until I enabled Textile, which deleted all accented characters from my posts.
I looked into the Textile PHP files and, after some reading up on the topic, I modified this code in textile1.php around line 130 after "# entify everything":
I commented out that piece of code and inserted
So, although the function that is responsible for the "bad" conversion (encode_high) is still in the script, it is not invoked. Instead, the conversion above (taken from php.net's manual) is used, which, as I understand it, leaves alone characters not in the iso-8859-1 character set.
I have limited knowledge of PHP and this is not the best solution, but it worked for me. Also note that I use only Textile 1, because that's enough for me, so I left textile2.php alone.
I hope this helps somewhat.
Comment #2
magnestyuk commented
In fact, I just discovered that the above modification did not actually leave my accented characters alone but changed them to HTML entities. This is not what I wanted. I would love to see the Textile module be considerate of non-Western character sets and truly exclude those non-Western characters from filtering altogether.
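To make the difference concrete: an entity-encoding step rewrites every non-ASCII character as a numeric HTML reference, while a pass-through leaves the text untouched. The Python sketch below is only an illustration of that difference; it is not the code from textile1.php, and the function name `entify` is hypothetical.

```python
def entify(text: str) -> str:
    """Replace every non-ASCII character with a numeric HTML entity."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


sample = "Asesoría"
print(entify(sample))  # the accented character becomes a numeric reference
print(sample)          # pass-through keeps the character exactly as typed
```

A browser renders both forms the same way, but the entity form changes the stored text, which matters when the content is later edited, searched, or served in a different charset.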
I'm also on the brink of going public with my Drupal site, but I'm having tons of issues with non-Western encoding, this being one of them.
I would really appreciate it if someone took the time to look into this.
Thanks.
Comment #3
jhriggs commented
Please try again with the latest release of the Textile module. It now uses a completely different Textile engine, a PHP port [1] of Brad Choate's Textile.pm Perl module [2]. It may help with this, or there may be some new fixes we can try now.
[1] http://jimandlissa.com/project/textilephp
[2] http://bradchoate.com/mt-plugins/textile
Comment #4
pz commented
That didn't work for me with the 4.4.0 release (downloaded 2004-07-11). My workaround was to set the default value for options['char_encoding'] to 0 instead of 1, which seems to work fine for my site.
Comment #5
Gabriel R. commented
If you are using UTF-8, changing the char_encoding option is recommended, and it works well.
Comment #6
jhriggs commented
http://drupal.org/node/11357