Greetings,
I'm working on an English-language site, but in a few of the articles we need to quote a few words of Greek. The Greek text I have is from a Unicode font, and I can paste it into any application on my computer, change the font, etc. and it always shows up as the right characters.
When I put it in the "body" field in a node, it is converted to all question marks. This happens regardless of input filter, whether filtered or full HTML. When I go back to the node editing form, the text has permanently turned into question marks.
HOWEVER
When I paste that same Greek text into one of my custom CCK fields, the Greek comes up just fine in the final product, and this also seems to be unrelated to the input format specified.
Does anyone have any idea how the text processing in the body field might be different from that of a CCK field? Any ideas on how I might fix this problem? I'd rather not add a replacement body field with CCK, since that would involve converting a lot of nodes...
Comments
First off, this should work
First off, this should work flawlessly and painlessly.
Assuming you are using the same theme to view all pages and assuming you are viewing all pages in the same browser (you might check to see if the browser somehow uses a different encoding on different pages of your site, which would probably be a theme issue), it sounds like there is some kind of filtering going on with your nodes that doesn't happen with the CCK fields.
To test this, I would suggest entering the Greek text as part of the title. If that works, I would suspect that you have a WYSIWYG editor that is filtering your code and goofing things up for you, in which case there is likely a configuration change you can make in it to prevent it from changing your UTF-8 characters to escaped character codes.
If that doesn't work, you should look at any modules that are part of your input filtering. This should work with Full HTML or Filtered HTML, but you may have a module that tries to substitute characters, clean things up, or do other processing.
Beyond that, it would probably be possible to do what you are talking about with javascript, but unless you have been monkeying around with javascript or a module that uses javascript, you should be safe. Even at that, it sounds pretty far out.
Hello, and thanks for
Hello, and thanks for responding.
Here's what I've done for further testing:
I have a Drupal installation on a different server... I'll test this out there just to further narrow down what the problem might be.
More results: Tested Greek
More results:
I guess the next thing to check, as far as I can think of, is to upgrade those two other sites to 5.7 and see if the problem comes into play once 5.7 comes on the scene.
Another thing to consider is
Another thing to consider is the encoding of your database. It should be UTF-8. However, that doesn't sound like the issue here.
What sounds more likely is that you have specified a font (probably in your CSS file) that is available on your system but that does not include Greek characters.
Other than that, you may have just hit a really weird bug. You could try entering text from other languages or different text in Greek to see what happens.
It really can't be the fonts
It really can't be the fonts since the font works on CCK fields, but not on body or title fields.
I don't think it is the database encoding, since Greek works in CCK fields and on other websites with the same host (and thus created with the same configuration).
I tried entering some Unicode Hebrew, and got the same question mark problem.
Right now I'm going to upgrade one of those pre-5.7 sites and see if the bug arises when 5.7 is on the scene.
It would be possible to have
It would be possible to have your CSS apply different styles and different fonts to different content types. (Of course I am talking about how it is displayed after it is saved, not how it looks when you first enter it.)
Using multiple languages, and even mixing them in one page, is something that a lot of people do with Drupal and that has been implemented well for a long time. I really don't think you have found a bug in Drupal core that will be fixed when you upgrade. It might be possible, but I would say very unlikely.
I think the issue is probably with your theme or font. Your DTD probably says the article is English which might prevent the browser from trying to substitute an appropriate font. I would suggest trying the default theme with a default stylesheet to see what happens.
Another thing you can check for clues to your problem is to open up the database with phpMyAdmin, or whatever you use, and view the text that is actually saved there to see whether it is stored as Greek characters or as question marks.
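Why that check is useful can be shown with a minimal Python sketch. It assumes the text is being squeezed into a latin1 column and that characters the charset cannot represent are replaced with "?" on the way in. The `store_in_column` helper is hypothetical and only imitates that lossy conversion; it is not how MySQL or Drupal actually store anything.

```python
def store_in_column(text: str, column_charset: str) -> bytes:
    """Hypothetical stand-in for saving text into a column with a given charset.

    Characters the charset cannot represent are replaced with "?",
    which imitates a lossy conversion at save time.
    """
    return text.encode(column_charset, errors="replace")


greek = "\u03b1\u03b2\u03b3"  # the three Greek letters alpha, beta, gamma

latin1_bytes = store_in_column(greek, "latin-1")
utf8_bytes = store_in_column(greek, "utf-8")

print(latin1_bytes)                         # b'???'
print(utf8_bytes.decode("utf-8") == greek)  # True
```

If the stored text is already literal question marks, the characters were lost when the row was written, and no font or theme change can bring them back. If the stored text is still Greek, the problem is on the display side.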
Fixed It
OK, I went into the database and found that the question marks were there too. I looked, and a lot of my tables were set to latin1_swedish_ci or something like that. How that happened, I have no idea. All I know is that this database originated a while back from a 4.7 installation. So, since I know nothing about bulk SQL queries and executions, I just dumped the entire database using mysqldump, opened the file in TextMate, and did a find & replace, taking "latin1" and replacing it with "utf8". Then I loaded the edited dump back in, and everything seems to be working like a charm now.
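The same find & replace can be scripted. The following is a minimal Python sketch, not the exact steps used above: it assumes the dump declares its encoding in the usual `CHARSET=`, `CHARACTER SET` and `COLLATE` clauses, and the function name `latin1_to_utf8` and the sample statement are made up for illustration. Unlike a blanket replace in a text editor, it only touches "latin1" where it follows one of those keywords.

```python
import re

# Match "latin1" only where it follows a charset or collation keyword,
# so a bare "latin1" elsewhere in the dump is left alone.
_DECLARATION = re.compile(
    r"(CHARSET=|CHARACTER SET\s+|COLLATE[=\s]\s*)latin1",
    re.IGNORECASE,
)


def latin1_to_utf8(dump_sql: str) -> str:
    """Return the dump text with latin1 declarations rewritten to utf8."""
    return _DECLARATION.sub(r"\1utf8", dump_sql)


# Hypothetical one-line excerpt of a dump, for illustration only.
sample = "CREATE TABLE node_revisions (body longtext) DEFAULT CHARSET=latin1;"
print(latin1_to_utf8(sample))
```

As with the manual edit, this only changes what the tables are declared as when the dump is re-imported. Text that was already saved as question marks stays that way and has to be entered again.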
Thanks so much for helping me get to the bottom of this.