Given
PostgreSQL 7.4.2
Drupal 4.6.3

The Drupal database was created with UNICODE encoding to support French and Russian.

When I try to open the http://mypage/cron.php page, I get:

warning: pg_query(): Query failed: ERROR:  Unicode characters greater than or equal to 0x10000 are not supported in /home/mypage/includes/database.pgsql.inc on line 45.

user error: 
query: INSERT INTO search_index (word, sid, type, score) VALUES ('������', 1, 'node', 22) in /home/mypage/includes/database.pgsql.inc on line 62.

When I search for something in English it works (well, not exactly: I can't run cron to index my site, which is why it shows the tongue-in-cheek phrase "Your search yielded no results"), but when I search in Russian, the search engine says:

warning: pg_query(): Query failed: ERROR:  Unicode characters greater than or equal to 0x10000 are not supported in /home/mypage/includes/database.pgsql.inc on line 45.

user error: 
query: SELECT DISTINCT i.sid, i.type FROM search_index i INNER JOIN node n ON n.nid = i.sid  INNER JOIN users u ON n.uid = u.uid WHERE n.status = 1 AND (i.word = '���') in /home/mypage/includes/database.pgsql.inc on line 62.

And when I try to search in French:

warning: pg_query(): Query failed: ERROR:  invalid byte sequence for encoding "UNICODE": 0xe3a962 in /home/mypage/includes/database.pgsql.inc on line 45.

user error: 
query: SELECT DISTINCT i.sid, i.type FROM search_index i INNER JOIN node n ON n.nid = i.sid  INNER JOIN users u ON n.uid = u.uid WHERE n.status = 1 AND (i.word = 'd�but') in /home/mypage/includes/database.pgsql.inc on line 62.

What should I do?

Comments

Steven

There is a call to strtolower() in search.module which corrupts UTF-8 data in 4.6. This is a known issue that has already been fixed in the development version. In the meantime you could just remove the call, though you will lose case-insensitive searching.
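
The byte values in the error messages above are consistent with a byte-wise lowercase applied in a single-byte locale. Here is a minimal sketch of that mechanism, written in Python rather than PHP and not taken from Drupal's code. It assumes the server was running with a Latin-1-style locale, which the thread does not state. Under that assumption the UTF-8 lead byte 0xC3 is treated as "Ã" and lowered to 0xE3, and the Cyrillic lead bytes 0xD0/0xD1 become 0xF0/0xF1, which a UTF-8 decoder reads as the start of a character at or above 0x10000. The function names and the Russian sample word are made up for illustration.

```python
def bytewise_lower_latin1(text: str) -> bytes:
    """Simulate a single-byte (Latin-1) lowercase applied to UTF-8 bytes.

    Assumption: this mimics a locale-dependent, byte-wise lowercase;
    it is an illustration, not the code from search.module.
    """
    raw = text.encode("utf-8")
    # Treat every byte as one Latin-1 character, as a single-byte
    # locale would, then lowercase and map back to bytes.
    as_latin1 = raw.decode("latin-1")
    return as_latin1.lower().encode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


french = bytewise_lower_latin1("début")
russian = bytewise_lower_latin1("поиск")  # arbitrary sample word

# Show the resulting bytes and whether they are still valid UTF-8.
print(french.hex(), is_valid_utf8(french))
print(russian.hex(), is_valid_utf8(russian))
```

For "début" the sketch produces the bytes e3 a9 62 in the middle of the word, the same sequence PostgreSQL reports in the French error.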

If you have the mbstring extension installed for PHP, you could replace it with mb_strtolower(), which can handle UTF-8.
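
As a rough illustration of the difference, again sketched in Python rather than PHP: lowercasing the decoded text is what an encoding-aware function such as mb_strtolower() is meant to do for UTF-8, while lowercasing only the ASCII letters leaves multi-byte sequences intact. The second variant is a possible middle ground when mbstring is not available; it is my suggestion, not something proposed in this thread. Both helper names are hypothetical.

```python
def utf8_aware_lower(data: bytes) -> bytes:
    """Decode as UTF-8, lowercase whole characters, encode again."""
    return data.decode("utf-8").lower().encode("utf-8")


def ascii_only_lower(data: bytes) -> bytes:
    """Lowercase only ASCII A-Z; bytes >= 0x80 are left untouched."""
    # bytes.lower() in Python maps ASCII letters only.
    return data.lower()


word = "DÉBUT".encode("utf-8")

# Encoding-aware: every letter is lowercased.
print(utf8_aware_lower(word).decode("utf-8"))
# ASCII-only: still valid UTF-8, but the accented letter keeps its case.
print(ascii_only_lower(word).decode("utf-8"))
```

In PHP, the ASCII-only variant could probably be approximated with strtr() mapping 'A'-'Z' to 'a'-'z'. Searches would then be case-insensitive for unaccented Latin letters only.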


Doctor Moo

PHP is a rather unearthly language. :-) I can't reconfigure PHP on my remote server to include mbstring, so I had to make my search case-sensitive by removing strtolower. Nevertheless, it works. :-) Thanks.

I hope this will only be a stopgap measure. I wish you patience. ;-)