Hi,

After migrating from one host to another I found out one interesting thing:
DRUPAL DATA IS STORED IN MYSQL TABLES IN THE LATIN1 (!) CHARSET.

I don't know why it is so, but I know that if searching in my native language (not english)
the SEARCH is CASE-SESNITIVE, which is totally unaccepable (becase on other sites it's working fine).

If any way to conver my current sql tables from latin1 to utf-8 and will it help wich seach and may be something else?

thank you.

Comments

Avm’s picture

This means you have some problems with connection between MySQL and Drupal.

If you have MySQl > 4.1 then you are lucky and this could be fixed easily.

You should do two things.

1. Convert you data stored in tables from LATIN1 to UTF8.
2. Tell Drupal ti should connect to base using UTF8, in includes/database.mysql.inc:

$connection = mysql_connect($url['host'], $url['user'], $url['pass'], TRUE) or die(mysql_error());

mysql_query('SET NAMES utf8');
mysql_query('SET CHARACTER_SET utf8');

mysql_select_db(substr($url['path'], 1), $connection) or die('unable to select database');

And your base should have collation=utf8_general_ci

Continental’s picture

1. do you want me to run those php script? I'm afraid... =)
2. why I didn't have this before by default?

My sql version was always and now > 4.1

Avm’s picture

1. If you have console access you can dump you database:
mysqldump --default-character-set=utf8 db_name > dump.sql
May be you need set default-character-set to latin1 , you will see it from dump-file

After it drop existing DB and recreate it with right collation, for example: utf8_general_ci, then import your data (check for proper SET NAMES in dump file).

2. This should no be a problem for proper configured MySQL. In /etc/mysql.cnf under [mysqld] should be added one string: init-connect="SET NAMES utf8" Thus every connection, which not defined explicit, will be utf8 connection.

Continental’s picture

I still don't understand the magic. My collation in the old base is latin1. So, it's logically (?) to make dump:
mysqldump --default-character-set=latin1 db_name > dump.sql
After this SET NAMES=latin1 in the dump.
I drop tables in the database and go to phpmyadmin and make:
ALTER DATABASE `dbname` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci
Then, mysql db_name < dump and all my tables again latin1.

Afterwars, I made
mysqldump --default-character-set=utf8 db_name > dump.sql
my dump increased from 5Mb(latin1) to 7Mb(utf8), set names = utf-8, but tables charset = latin1. I again dropped database, changed collation for utf8, and made import:
mysql db_name < dump and nothing changed! Tables are still latin1, and site it still working... ;-) (without any hacks).

When I migrated to a new host before, my host asked me to make dump with default-character-set=latin1 (because collation is latin1, so it's correct dump). And we simply imported it in latin1 again. But I want UTF-8.

How I can convert this dump to utf-8 and what else should I change (to import this dump approrerly in the utf-8 database)???