I'm having a big problem with mysql and drupal, hope you can help me...

My drupal 4.7 website is hosted on a linux RH box (not mine: Atjeu.com). Drupal was installed upon mysql 4.0.27 and utf8 was the encoding.

The server has been rebuilt because it has been heavily hacked, and the technicians have fully restored the backups.

Restoring the server, the ISP upgraded mysql to 4.1.x; actually I don't know if this is the real culprit, but my web pages scambled all accented letters. I opened the database with phpmyadmin and saw that every table had a latin1 charset and swedish_general_ci collation, instead of ut8_general_ci

The ISP doesn't seem to be able to fix the issue, I opened a support ticket three days ago! I need your help!

It seems that the db actually has utf8 data in it, but for some reason it reads it as latin1.
I have already tried to:

1) alter the tables and change them to utf8 and utf8_general_ci
to no avail, all accenter letters were still garbled on my web pages

2) follow the directions on http://textsnippets.com/posts/show/84 :

# mysqldump --user=usr --password=passwd --default-character-set=latin1 --skip-set-charset mysql_db > mysql_db.sql

# sed -r 's/latin1/utf8/g' ./mysql_db.sql > mysql_db_utf8.sql

# mysql --user=usr --password=passwd --execute="DROP DATABASE mysql_db; CREATE DATABASE mysql_db CHARACTER SET utf8 COLLATE utf8_general_ci;"

# mysql --user=usr --password=passwd --default-character-set=utf8 mysql_db < mysql_db_utf8.sql

But this last command retuened with:

ERROR 1062 (23000) at line 1406: Duplicate entry 'perchè' for key 1

I'm getting frustrated… anyone can help me?? :-(

Comments

jsenich’s picture

I can't remember exactly where I found this so I don't have a link, but I've successfully used it several times.

<?php
// put in your username, password
$conn = mysql_connect("localhost", "user", "pass");

//change this to false to alter on the fly
$printonly=false; 
$charset="utf8";
$collate="utf8_general_ci";

$altertablecharset=true;
$alterdatabasecharser=true;


// put here your databases ...
$currentDBArray[] = "db name or array of db names";

function PMA_getDbCollation($db)
{
        $sq='SHOW CREATE DATABASE `'.$db.'`;';
        $res = mysql_query($sq);

        if(!$res) echo "\n\n".$sq."\n".mysql_error()."\n\n"; else

        if($row = mysql_fetch_assoc($res))
        {
                $tokenized = explode(' ', $row[1]);
                unset($row, $res, $sql_query);

                for ($i = 1; $i + 3 < count($tokenized); $i++)
                {
                        if ($tokenized[$i] == 'DEFAULT' && $tokenized[$i + 1] == 'CHARACTER' && $tokenized[$i + 2] == 'SET')
                        {
                                if (isset($tokenized[$i + 5]) && $tokenized[$i + 4] == 'COLLATE')
                                {
                                         return array($tokenized [$i + 3],$tokenized[$i + 5]); // We found the collation!
                                }
                                else
                                {
                                        return array($tokenized [$i + 3]);
                                }
                        }
                } 
        }
        return '';
}
 

$rs2 = mysql_query("SHOW DATABASES"); 

if(!$rs2)
        echo "\n\n".$sq."\n".mysql_error()."\n\n";
else
        while ($data2 = mysql_fetch_row($rs2))
        {
                $db=$data2[0];
                $db_cha=PMA_getDbCollation($db);
                if ( in_array ( $db, $currentDBArray ) )
                        if ( substr($db_cha[0],0,4)!='utf8' ) // limit to charset
                        {
                                mysql_select_db($db);
                                $rs = mysql_query("SHOW TABLES"); 

                                if(!$rs)
                                        echo "\n\n".$sq."\n".mysql_error()."\n\n";
                                else
                                        while ($data = mysql_fetch_row($rs))
                                        {
                                                //jayif ( substr ( $data[0], 0,2 ) == "ez" )
                                                if ( 1==1 )
                                                {
                                                        $rs1 = mysql_query("show FULL columns from $data[0]");
                                                        


                                                        if(!$rs1)
                                                                echo "\n\n".$sq."\n".mysql_error()."\n\n";
                                                        else
                                                                while ($data1 = mysql_fetch_assoc($rs1))
                                                                {
                                                                        if(in_array(array_shift(split("\\(",$data1['Type'],2)),array(
                                                                                                                                  //'national char',
                                                                                                                                                //'nchar',
                                                                                                                                                //'national varchar',
                                                                                                                                                //'nvarchar',
                                                                                                                                                'char',
                                                                                                                                                'varchar',
                                                                                                                                                'tinytext',
                                                                                                                                                'text',
                                                                                                                                                'mediumtext',
                                                                                                                                                'longtext',
                                                                                                                                                'enum',
                                                                                                                                                'set'
                                                                                                                                                 ))) 

                                                                         {
                                                                                if(substr($data1['Collation'],0,4)!='utf8') // limit to charset
                                                                                {
                                                                                        $sq="ALTER TABLE `$data[0]` CHANGE `".$data1['Field'].'` `'.$data1['Field'].'` '.$data1['Type'].' CHARACTER SET binary '.($data1['Default']==''?'':($data1['Default']=='NULL'?' DEFAULT NULL':' DEFAULT \''.mysql_escape_string($data1['Default']).'\'')).($data1['Null']=='YES'?' NULL ':' NOT NULL').';';
                                                                                        if(!$printonly&&!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n"; 
                                                                                        else
                                                                                        {
                                                                                                echo ($sq."\n") ; 
                                                                                                $sq="ALTER TABLE `$data[0]` CHANGE `".$data1['Field'].'` `'.$data1['Field'].'` '.$data1['Type']." CHARACTER SET $charset ".($collate==''?'':"COLLATE $collate").($data1['Default']==''?'':($data1['Default']=='NULL'?' DEFAULT NULL':' DEFAULT \''.mysql_escape_string($data1['Default']).'\'')).($data1['Null']=='YES'?' NULL ':' NOT NULL').($data1['Comment']==''?'':' COMMENT \''.mysql_escape_string($data1['Comment']).'\'').';';

                                                                                                if(!$printonly&&!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n"; 

                                                                                                else echo ($sq."\n") ; 
                                                                                        }
                                                                                }
                                                                        }
                                                                }

                                                                if($altertablecharset)
                                                                {
                                                                        /*
                                                                         $sq='ALTER TABLE `'.$data[0]."` DEFAULT CHARACTER SET binary";
                                                                         echo ($sq."\n") ; 
                                                                        if(!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
                                                                        */
                                                                        $sq='ALTER TABLE `'.$data[0]."` DEFAULT CHARACTER SET $charset ".($collate==''?'':"COLLATE $collate");
                                                                        echo ($sq."\n") ; 
                                                                        if(!$printonly)
                                                                                if(!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
                                                                }
                                                }
                                                else
                                                        echo $data[0] . " nicht geändert.\n";
                                                if( $alterdatabasecharser )
                                                {
                                                 /*
                                                 $sq='ALTER DATABASE `'.$data2[0]."` DEFAULT CHARACTER SET binary";
                                                 echo ($sq."\n") ; 
                                                 if(!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
                                                 */ 
                                                 $sq='ALTER DATABASE `'.$data2[0]."` DEFAULT CHARACTER SET $charset ".($collate==''?'':"COLLATE $collate");

                                                 echo ($sq."\n") ; 

                                                        if(!$printonly)
                                                                if(!mysql_query($sq)) echo "\n\n".$sq."\n".mysql_error()."\n\n";
                                                }
                                        }
                                }
                        }

?>
gabble’s picture

Thanks for your time, but it didn't work.
The script changed the table charset to utf8 and collationto utf8_general_ci
but mysql still interprets utf8 data as latin1

:-(

jerboa7@drupal.ru’s picture

My hoster had upgraded MySQL from 4.0.26 to 5.0.x and my Drupal's sites (Drupal 4.7, 5.0) were changed encoding. Opened with PHPmyAdmin databases had charset cp1251 instead utf8. It was very unplesent.
I made dump of database and used editor Vi in Shell. I wrote :$s/ DEFAULT CHARSET=cp1251/ DEFAULT CHARSET=utf8/g