Search in Drupal works with non-English words only under the following conditions:

1) the PHP mbstring module is enabled
2) PHP is configured with mbstring.func_overload = 7 (or at least 2)
3) PHP is configured with mbstring.internal_encoding "UTF-8"

Cause: search.module uses strtolower(), which handles UTF-8 incorrectly unless mbstring is configured as above.

Settings 2 and 3 can be configured in .htaccess. Alternatively, they can be skipped by replacing every strtolower() call in search.module with mb_strtolower().
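
As a rough sketch, the two settings might look like this in .htaccess. This assumes PHP runs as an Apache module (so that php_value directives are honoured) and that the server's AllowOverride setting permits them; neither assumption is confirmed in this report.

```
# Assumption: mod_php is in use and AllowOverride allows php_value here.
# 7 overloads the mail (1), string (2) and regex (4) functions; 2 overloads string functions only.
php_value mbstring.func_overload 7
php_value mbstring.internal_encoding UTF-8
```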

This "feature" is also present in 4.6 RC1.

IMHO it must be fixed in some way or DOCUMENTED IN INSTALL.TXT AND IN HANDBOOK (in "installation" section).

Note: with settings 2 and 3 enabled, Drupal produces some warnings:

1) in includes/menu.inc:910 (see: http://drupal.org/node/11758)
2) in modules/user.module:214. To make it work correctly I replaced:

    if (ereg("[^\x80-\xF7 [:alnum:]@_.-]", $name)) return t('The username contains an illegal character.');

with:

    if (ereg("[^\\x80-\\xF7 [:alnum:]@_.-]", $name)) return t('The username contains an illegal character.');

Comments

Steven’s picture

The 4.6 search.module is significantly different from 4.5.x and supports UTF-8 even without mbstring (though you will get better results on non-Latin languages with it). Marking as won't fix, because this was the main point of this issue.

I fixed the user notice.

Steven’s picture

It is the intention of this regular expression to be byte based, not character based. Is there a way to force us to use the plain version of this function?

I suppose preg_replace is overridden as well? Stupid PHP... when will they learn that bytes are not the same as characters :(

What happens when you try the following:

    preg_replace("/[^\x80-\xF7 [:alnum:]@_.-]/", $name)

killes@www.drop.org’s picture

Status: Active » Closed (won't fix)

no user feedback

hip’s picture

Title: search.module doesn't search non-english words without mbstring php-module » search.module finds non-english words BUT doesn't highlight them

(Sorry, I feel quite 'unsure' about posting to the bug reports)

Hi,

Does 'won't fix' mean the solution to this problem will be abandoned on 4.6?

I mean, I'm developing a site in Spanish right now, based on the latest stable Drupal (4.6). The search.module does find accented and other Latin characters, but it doesn't highlight them. I've been looking in the CVS repository, but development of the 4.6 module seems to have been abandoned (?) at version 1.123.2.4 (2006/01/05). Newer versions must rely on some other module updates or on the core modules of the future 4.7 release.

(BTW, preview for threads to bugs is disabled :-( )

Do I have to wait for the official 4.7 release and upgrade to it in order to get this feature, or might there be some newer versions of search.module for 4.6? Is this a case where 'some light patching' is recommended?

Thanx (and sorry for any annoyance),
hip
