Closed (fixed)
Project: Drupal core
Version: 4.6.3
Component: database system
Priority: Normal
Category: Bug report
Assigned: Unassigned
Reporter: Anonymous (not verified)
Created: 17 Jan 2005 at 21:39 UTC
Updated: 22 Aug 2006 at 18:39 UTC
Drupal isn't setting the connection character set when creating and using a MySQL 4.1 database. Because all of Drupal is UTF-8, it should issue "SET CHARACTER SET utf8" at the beginning of each MySQL connection if it detects that it's connecting to a MySQL 4.1 database. We also need to instruct users to create the database with CREATE DATABASE drupal CHARACTER SET utf8.
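The check proposed above could look roughly like the following. This is only a sketch in Python, not Drupal's actual code (Drupal itself is PHP): the function names are hypothetical, and it assumes the server reports a version string such as "4.1.12-log". It returns the statements to run right after connecting rather than talking to a real server.

```python
import re
from typing import List, Tuple


def parse_mysql_version(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '4.1.12-log'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised MySQL version: %r" % version)
    return int(match.group(1)), int(match.group(2))


def connection_init_statements(server_version: str) -> List[str]:
    """Statements to issue at the start of each connection (hypothetical helper)."""
    # Character set support arrived in MySQL 4.1; older servers get nothing.
    if parse_mysql_version(server_version) >= (4, 1):
        return ["SET CHARACTER SET utf8"]
    return []


for version in ("4.0.24", "4.1.12-log"):
    print(version, connection_init_statements(version))
```

Keeping the version check separate from the statement list makes it easy to swap in a different statement later without touching the detection logic.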
Comments
Comment #1
jvandyk commented
This still applies to CVS. INSTALL.txt and database.mysql.inc do not reflect these changes.
Comment #2
damien tournoud commented
Until all the modules use utf8-compliant functions, we need to use "binary" collation on several columns, at least in the "cache" and "search_*" tables. That's because, in a utf8 connection, not every byte string is a valid character string.
Until this is fixed, it's probably better to use a latin1 charset connection, and to modify the database creation script to use "latin1" collation.
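The point that not every byte string is valid UTF-8 can be illustrated outside MySQL. A minimal Python sketch: `is_valid_utf8` is a hypothetical helper, and the sample blob is made-up data standing in for binary content such as a cache entry.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Ordinary text encoded as UTF-8 is valid.
text = "Bratislava, č".encode("utf-8")
# Arbitrary binary data usually is not: 0x80 and 0xff cannot start a UTF-8 sequence.
blob = b"\x80\xff\x00binary"

print(is_valid_utf8(text), is_valid_utf8(blob))
```

Columns that may hold data like `blob` are the ones the comment suggests giving a binary collation.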
Comment #3
morbus iff commented
damz: are those two the only tables that "matter"?
Besides the connection string that anonymous mentions (which I know little about), we can convert the default encoding of the database and tables using the following commands in MySQL 4.1.
More talk about this issue here: http://www.cesspit.net/drupal/node/897.
Since this particular bug can corrupt backups, I'm setting it to critical. A broken backup is very very bad.
Comment #4
morbus iff commented
Changing back to normal. Further testing is needed.
Comment #5
jozef commented
Here are the results of my experiment. I am a Drupal newbie, so correct me if I am wrong.
1. My ISP has MySQL 4.1 set to default values:
collation_database=cp1250_general_ci
collation_server=cp1250_general_ci
character_set_database=cp1250
character_set_server=cp1250
2. Drupal 4.6.3 works with the Slovak language, but
3. cron.php completes successfully while the search_ tables stay empty, so the search module does not work.
4. trip_search works well, including with Slovak characters.
5. ALTER DATABASE database CHARACTER SET utf8 COLLATE utf8_general_ci;
solved the cron troubles, but
6. it is no longer possible to enter Slovak characters into content (page, story, ...).
7. Summary: I will use an 8-bit encoding and trip_search with MySQL 4.1.
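The thread does not establish why step 6 fails. One possible contributing factor, offered purely as an assumption, is a mismatch between the UTF-8 bytes Drupal sends and a connection that is still treated as cp1250. The Python sketch below shows only the general effect of such a mismatch and does not reproduce MySQL's behaviour.

```python
original = "č"  # a Slovak letter

utf8_bytes = original.encode("utf-8")      # what a UTF-8 application sends
cp1250_bytes = original.encode("cp1250")   # what a cp1250 client would send

# Reading the UTF-8 bytes as if they were cp1250 gives different characters.
misread = utf8_bytes.decode("cp1250")

print(len(utf8_bytes), len(cp1250_bytes), misread == original)
```

If this is what happens on the server, declaring the connection character set, as the original report proposes, would be the matching fix. That link is not confirmed anywhere in the thread.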
Comment #6
magico commented
I agree with #4, and because nobody else has complained about this particular problem, I'm closing it.
It's critical to have a usable and recoverable database, but this seems to have been one person's case.