make it not case sensitive (or make that a possibility)

gusgsm1 - August 11, 2006 - 21:25
Project:Search config
Version:HEAD
Component:Code
Category:feature request
Priority:normal
Assigned:Unassigned
Status:closed
Description

Using Drupal (4.7.2) with a non-English language makes Drupal's search very lame, as many Latin-derived languages (French or Spanish, for instance) use a lot of diacritics, not to mention the case sensitivity of the search option.

People using those languages tend to leave out diacritics when searching the net. Google, for example, does a good job there, making no distinction between "crítico" (Spanish for "critical"), "Crítico", "Critico", "CRITICO" and so on...
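To make the idea concrete, here is a minimal sketch of that kind of folding. It is written in Python purely for illustration (it is not Drupal code), and the function name `fold` is made up for this example. It assumes that decomposing each character and dropping the combining marks is an acceptable way to ignore diacritics.

```python
import unicodedata

def fold(text):
    """Return text casefolded, with combining diacritics removed."""
    # NFD splits "í" into "i" plus U+0301 (combining acute accent).
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep everything else.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() removes case distinctions.
    return stripped.casefold()

variants = ["crítico", "Crítico", "Critico", "CRITICO"]
print({fold(v) for v in variants})  # {'critico'}
```

Note that this sketch folds every decomposable letter, so "ñ" becomes "n" as well, which may or may not be wanted.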

Being able to filter the search input this way would skyrocket the value of search in those languages. And English speakers would benefit from it as well.

A tick box in a site's general options could enable or disable case/diacritics sensitivity.

Thank you :)

#1

gusgsm1 - August 14, 2006 - 10:01

I see now that Drupal 4.7 is not case sensitive (I hadn't reindexed my site, sorry).

But the request for Search to ignore diacritics, or to regroup them, still stands...

If somebody with enough knowledge could take care of it (a module, as somebody hinted in a forum thread?).

Thank you :)

#2

canen - August 14, 2006 - 17:26

I really don't know a lot when it comes to i18n, but if it's as simple as stated in the link you provided I wouldn't mind doing a simple module. I may actually have to use it for a site that I'm currently working on.

Do you know where I can find a list of 'diacritics' and their English equivalents? By the way, have you checked whether this functionality is already provided by another module?

Thanks.

#3

canen - August 14, 2006 - 19:56

Found this function from WordPress. It seems to do what is required. I would just make a direct copy if doing a module.

#4

canen - August 14, 2006 - 20:34

OK. It seemed simple enough so I did a module. Tell me if it works for you. I did a basic test here and it seems to work.

If it is useful I can create a project for it. I didn't do much work on this; thank Stephen for the link above and the WordPress guys.

Attachment: accents.module (8.1 KB)

#5

canen - August 14, 2006 - 20:36

Oh, you'll need to re-index your search.
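A rough illustration of why the re-index matters, as a toy Python sketch rather than anything from Drupal or the attached module. The names `fold` and `build_index` are hypothetical, and the assumption is that the search index stores words after preprocessing, so entries written before the folding was enabled still carry their accents.

```python
import unicodedata
from collections import defaultdict

def fold(text):
    """Casefold text and strip combining diacritics."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()

def build_index(docs, preprocess):
    """Map each preprocessed word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, body in docs.items():
        for word in body.split():
            index[preprocess(word)].add(doc_id)
    return index

docs = {1: "Un artículo crítico", 2: "Critico sin acento"}

old_index = build_index(docs, str.lower)  # built before folding was enabled
new_index = build_index(docs, fold)       # rebuilt with folding

query = fold("crítico")  # the query itself is folded to "critico"
print(sorted(old_index.get(query, set())))  # [2]
print(sorted(new_index.get(query, set())))  # [1, 2]
```

In this toy setup the old index only finds the page that was already written without the accent; the rebuilt one finds both.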

#6

gusgsm1 - August 15, 2006 - 13:11

canen,

Thanks a lot!! I'll take a look today to see if it works ok.

As for the requested list, I think it 'should/could' be summed up with two simple rules:

  1. Any vowel with a diacritic reverts to its diacriticless version; e.g. "á â Å ã" are all "a". This option has only advantages, as these diacritics are always typed with two keystrokes: one for the accent mark and one for the vowel. So users tend to leave them out (even if they know better).
  2. Any consonant with a diacritic gets a synonym; e.g. año (Spanish for "year") is searched as both "año" and "ano". This option, however, has some disadvantages: languages with those diacritics tend to have a single keystroke assigned to them, and the pairs can differ quite a lot in sound and meaning (in the example above, Spanish "año" is "year" and "ano" is "anus").
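The two rules could be sketched roughly like this. It is Python for illustration only, not the module's code; `base_letter` and `search_terms` are made-up names, and the vowel set is a simplifying assumption that covers only a, e, i, o, u.

```python
import unicodedata

VOWELS = set("aeiou")  # assumption: only these count as vowels

def base_letter(ch):
    """Return ch without combining marks, e.g. "á" -> "a", "ñ" -> "n"."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def search_terms(word):
    """Return the set of terms to search for, following the two rules."""
    # Work on lowercase, precomposed characters.
    word = unicodedata.normalize("NFC", word.lower())
    # Rule 1: accented vowels always revert to the plain vowel.
    kept = "".join(
        base_letter(ch) if base_letter(ch) in VOWELS else ch
        for ch in word
    )
    # Rule 2: accented consonants stay, and the fully stripped form is a synonym.
    stripped = "".join(base_letter(ch) for ch in kept)
    return {kept, stripped}

print(sorted(search_terms("año")))       # ['ano', 'año']
print(sorted(search_terms("Imágenes")))  # ['imagenes']
```

Keeping "año" next to "ano" is what rule 2 asks for: the accented consonant is still searchable as typed, and the stripped form is only an extra synonym.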

So, again, there is the possibility of an "ignore diacritics when searching" tick box. But I guess that is much heavier work than just the first rule, which is extremely important by itself and, for a speaker of a diacritic-loaded language, extremely helpful.

A happy Gustavo :)

PS. If we are talking about diacritics in UTF-8, I guess the canonical source is Unicode.org, perhaps?

#7

gusgsm1 - August 15, 2006 - 20:00
Title:Search: make it not case sensitive (or make that a possibility)» It works!

It works. After loading it I reindexed and ran cron, and now "imagenes", "Imágenes", "imágenes", etc. (Spanish for "images", in all its variants) give back the same results.

That's (from the usability point of view) absolutely splendid.

Thanks. IOU a pint (or 2... ) ;)

PS. I guess this would be of interest for other Spanish, French, Portuguese, etc... users.

#8

ax - August 15, 2006 - 20:56
Title:It works!» make it not case sensitive (or make that a possibility)

please do not change issue titles - thanks!

#9

canen - August 16, 2006 - 02:03
Status:active» closed

Created a project here: http://drupal.org/project/accents. It may take a while to show up.

 
 
