When I try to restore a database dump from my production server to my development site, Unicode characters get munged. In production I see "Björn", but when the same database is loaded locally I get "BjÃ¶rn".
I checked the dump file by opening it in vim, confirming that it was UTF-8 encoded (:set fenc), and verifying that I could see "Björn". So I think the dump file is properly encoded.
All of the CREATE TABLE statements in the dump include DEFAULT CHARSET=utf8.
I checked the record in question with the mysql command line client against my development database, after I had restored the dump, with SELECT * FROM comments WHERE cid=... and I see "Björn". So I think the restore is good too.
The status report on both boxes shows the Unicode library as green: "PHP Mbstring Extension".
Which leaves...what, exactly? I'm stumped! I get very nervous when I can't easily restore my production backups. Can anyone throw me a clue?
In case it matters, my dev box is Mac OS X, Drupal 6.9, MySQL 5.0.27, PHP 5.2.6. My production box is running Linux, MySQL 5.0.32, PHP 4.4.6.
Many thanks!
Comments
ping
Allow me one ping to catch some west coast US traffic. I've continued to look into it, but am still completely stumped.
same here
I'm experiencing the same problem - the hosting company promises to get back in 48 hrs.
I noticed different results from this query on the development site and the live site:
SHOW VARIABLES LIKE 'char%';
(live)
Variable_name Value
character_set_client: latin1
character_set_connection: latin1
character_set_database: latin1
character_set_filesystem: binary
character_set_results: latin1
character_set_server: latin1
character_set_system: utf8
character_sets_dir: /usr/local/mysql-5.0.67-linux-i686/share/mysql/charsets
(development)
Variable_name Value
character_set_client: utf8
character_set_connection: utf8
character_set_database: latin1
character_set_filesystem: binary
character_set_results: utf8
character_set_server: latin1
character_set_system: utf8
character_sets_dir: /usr/share/mysql/charsets/
I suspect the MySQL server configuration is mangling the character encodings. I have no idea how to fix it, though.
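For anyone wondering why a latin1 connection would produce this kind of garbage, here is a minimal Python sketch of the byte-level effect. It is not MySQL itself, only an illustration, and it assumes the UTF-8 bytes are being read one byte per character as latin1 somewhere along the way:

```python
# Illustration only: UTF-8 bytes misread as latin1, then repaired.
original = "Björn"

# "ö" is stored as two bytes in UTF-8.
utf8_bytes = original.encode("utf-8")

# Reading those bytes as latin1 turns each byte into its own character.
garbled = utf8_bytes.decode("latin-1")
print(garbled)

# Undoing the same mistake recovers the original text.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)
```

If the garbage on your site looks like the first line printed, a charset mismatch on the connection is a likely suspect.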
The solution was...
I had to add the argument --default-character-set=utf8 to the mysql command I used to restore the dump (even though the dump file was UTF8 encoded, and even though every CREATE TABLE statement has a CHARSET=utf8).
Hope that helps you too!
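On a shell that comes down to something like mysql --default-character-set=utf8 -u USER -p DBNAME < dump.sql (USER, DBNAME and dump.sql are placeholders). If you script your restores, a rough Python sketch of the same call might look like this; the function names and arguments are made up for illustration:

```python
import subprocess


def build_restore_command(database, user):
    """Build the mysql client call with the connection charset forced to utf8."""
    return ["mysql", "--default-character-set=utf8", "-u", user, "-p", database]


def restore(dump_path, database, user):
    """Feed the dump file to the mysql client on standard input."""
    with open(dump_path, "rb") as dump:
        subprocess.run(build_restore_command(database, user), stdin=dump, check=True)
```

Calling restore("dump.sql", "drupal_dev", "me") would prompt for the password and load the dump; those three values are hypothetical.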
my solution
was to switch web hosts ;-)
I couldn't have added that argument even if I had known about it, because of the crippled version of phpMyAdmin provided. The tech support guys were no help at all. My advice to people looking for web hosting: stay away from Sasktel.
I had exactly the same
I had exactly the same situation with an Amazon RDS instance. I fixed it by changing the DB parameters using the API.
It's very important, if someone has a UTF-8 CSV file and tries to load the data into a table that is also UTF-8, to run SET NAMES utf8 first and then check the result with a SELECT. If characters are not translated correctly, use SHOW VARIABLES LIKE 'char%'; to check the values; you need them all to be utf8.
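As a rough way to read the SHOW VARIABLES output, here is a small Python sketch that flags the settings that are not utf8. The values are copied from the live and development listings earlier in this thread; the function name and the choice of which variables to check are my own assumptions (character_set_filesystem is skipped because it reports binary on both boxes):

```python
# Variables assumed to matter for connection and storage encoding.
CHECKED = (
    "character_set_client",
    "character_set_connection",
    "character_set_database",
    "character_set_results",
    "character_set_server",
)


def non_utf8(variables):
    """Return the checked variables whose value is not utf8 (utf8mb4 also passes)."""
    return {
        name: value
        for name, value in variables.items()
        if name in CHECKED and not value.startswith("utf8")
    }


live = {
    "character_set_client": "latin1",
    "character_set_connection": "latin1",
    "character_set_database": "latin1",
    "character_set_filesystem": "binary",
    "character_set_results": "latin1",
    "character_set_server": "latin1",
    "character_set_system": "utf8",
}
development = dict(
    live,
    character_set_client="utf8",
    character_set_connection="utf8",
    character_set_results="utf8",
)

print(sorted(non_utf8(live)))
print(sorted(non_utf8(development)))
```

On the live box every checked variable is flagged; on the development box only the database and server defaults are.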
this may or may not help
...but I discovered that I had this problem too, only after my client had done a fair amount of work on the site. This meant that restoring my old copy of the database wasn't going to work, since that copy didn't contain the client's new nodes. My only option, that I knew of, was to search and replace the garbage characters. I developed this collection of Drupal-specific SQL statements and ran them all in the "SQL" tab of PHPMyAdmin, and they've fixed the problem for me:
You should do the same thing
You should do the same thing for these tables & columns, I discovered (the hard way):
term_data: name
blocks: title
boxes: body
view_view: page_title, page_header, page_empty, page_footer
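For anyone who would rather generate such statements than type the garbage characters by hand, here is a hedged Python sketch. It is not the original set of statements from this thread. It assumes the damage is plain UTF-8 read as Windows-1252 (which is close to what MySQL calls latin1), it uses the tables and columns listed above, and the two sample characters are only examples. Back up first and try it on a copy.

```python
def mojibake(text):
    """Return text as it looks after its UTF-8 bytes are read as Windows-1252."""
    # Python's cp1252 codec rejects a few bytes (such as 0x81), so some
    # characters would need special handling that this sketch leaves out.
    return text.encode("utf-8").decode("cp1252")


def sql_literal(value):
    """Quote a string for MySQL, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def replace_statement(table, column, good):
    """Build one UPDATE that swaps the garbled form of a character for the real one."""
    bad = mojibake(good)
    return (
        f"UPDATE {table} SET {column} = "
        f"REPLACE({column}, {sql_literal(bad)}, {sql_literal(good)});"
    )


# Tables and columns named in this thread.
TARGETS = [
    ("term_data", "name"),
    ("blocks", "title"),
    ("boxes", "body"),
    ("view_view", "page_title"),
    ("view_view", "page_header"),
    ("view_view", "page_empty"),
    ("view_view", "page_footer"),
]

# Example characters only; extend with whatever shows up garbled on your site.
CHARACTERS = ["ö", "’"]

statements = [
    replace_statement(table, column, ch)
    for table, column in TARGETS
    for ch in CHARACTERS
]
for statement in statements:
    print(statement)
```

Writing the output to a UTF-8 file and loading that file with a utf8 connection avoids having to paste the odd characters into a terminal.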
Brother, you saved my bacon.
I've been struggling with this for a week. I knew this way would work, but I could not get the character strings right. Frustrated would be an understatement.
Thank you
Thank you
Thank ’ !!!!!
Glad to be of help :) Mainly
Glad to be of help :)
Mainly for my own reference, here's that set of SQL statements, updated for Drupal 6 (minus the Views stuff...too involved for the moment I'm afraid):
More update for Drupal 6
Thanks a lot, bdimaggio
In Drupal 6 you also have to
UPDATE node SET title = REPLACE(title,...
and replace
UPDATE menu SET title = REPLACE(title,...
with
UPDATE menu_links SET link_title = REPLACE(link_title,...
I need to replace those
I need to replace those characters in the url (clean) as well.
What would be the proper term for that?
Thanks.
Edit: Figured it out.
D7 Unicode Character Fix
This thread really helped me out! I also had lots of unicode characters and ended up making some modifications to the code here to work with D7.
Found another char
Also for D7:
this seems like it could be
This seems like it could be very helpful, but I'm having trouble getting the characters to be replaced onto my SQL command line, whether by cut and paste or by download. In PuTTY I tried setting the translation to UTF-8 and to Win1250, also to no avail. Hmmm.