By halfer on
Hi all
I have had trouble with unusual characters in my articles - probably down to issues in my TinyMCE editor - and unfortunately I now have a whole series of errors in my large number of articles. What would be the most convenient way to search and replace eg "’" for an apostrophe?
I am using MySQL and PhpMyAdmin, but if there is a module that might assist, that'd be great.
Thanks :)
Comments
Hi, just pinging this
Hi, just pinging this thread. I am a new Drupaller, and could use some assistance :)
Is the best approach for me here to write a standalone PHP snippet that reads in records from the relevant Drupal table, modifies them, and writes them back?
Suggested code
Well, I went ahead and created a standalone snippet. Perhaps this might be of use to someone. It's not particularly good at dealing with db difficulties, but of course anyone is welcome to tidy it up. Recommend that users backup their node_revisions before use. Be aware that more specific searches should be run before less specific ones, lest the wrong thing be replaced!
I think its a collation problem..
I think its actually a collation problem rather than tinyMCE or someother module..
No idea why. Its just happened on mine... have a feeling its because the collation is all mixed - mine is in latin1_swedish - no idea why..
You're right
You're absolutely right, it was that. Should have reset my database to UTF-8, and after having to amend a lot of documents, including some hand correction, I've learnt my lesson ;)
My search-n-replace code should be useful for other things too, I hope - and perhaps one day this feature will be offered as a Drupal module. I know I'd find it useful.
some changes
in my case, the characters were even more messed up. to this end, i've made some modifications.
in some cases (fantastico installs?) the password in your sites file will be scrambled. in that case make sure that the pwd is pasted from that file or it will not run.
Almost Worked
After updating to the newest version of mysql, I ran this script to fix all of the whacked out characters that were inserted into my content. The script ran successfully, but it only appeared to replace the characters on five pages of my site (out of hundreds).
Any suggestions on why this could have happened and how to apply the script to my entire site?
Thanks.
dg
It might be that the
It might be that the problematic character sequences you are experiencing are not catered for in the scripts here. Try identifying each sequence of 'bad characters' and add in your own search/replace elements to deal with them. If you are unsure of the consequences, comment out the database write operations until you are confident that it will make good changes (and back up your database in any case).
Or run SQL script
SQL script example can be found here
http://drupal.org/node/369961
this replaces ’ – — “ †¿ á ó ñ ň é ú Ã
with appropriate UTF8 replacements
I found that by SEARCHing from phpMyAdmin, one can find the affected tables and columns, then one can modify the script easily and run it from phpMyAdmin SQL tab.
If this is still an issue...
I somehow didn't find this thread when i was running into charset issue on my sites. But for my jobs, which was combining different nodes with different content encodings, i found it much easier to do this from the command line using character set conversion utilities, such as the unix command
iconv, and from php using themb_convert_encodingfunction:http://www.php.net/manual/en/function.mb-convert-encoding.php
Be careful though - I also got my character sets extra mangled by running mysqldump on a host (WestHost) whose command-line utilities are very unfriendly to many character sets - their version of
sedclobbered some of the high-ascii characters. Bad news. Doing it in PHP avoids that.possibly easiest are the character set conversion tools inside MySQL itself - http://dev.mysql.com/doc/refman/5.1/en/charset-conversion.html - these work if you are converting from tables all in one character set.. That wasn't the case for me, sadly.
A few more tips
Thanks to all for the answers in this post. It helped me a lot.
A few issues I had that might help the next person with the same problem are:
1. The Right Smart Quote appears in this post missing the last character.
'â€' => '"',is missing a third symbol that you need to find to get it to work.2. If you are updating CCK nodes, the changes made to the database will not be reflected unless the CCK content cache is cleared. I used
cache_clear_all('content:'.$n.':'.$n, 'cache_content'); // Clears the CCK cache so update worksHope this helps.