There are many character variants for Chinese (Simplified/Traditional), Japanese (Kanji), and Korean (Hanja) characters:

http://hkiug.ln.edu.hk/unicode/hkiug_tsvcc_table-UnicodeVersion-1.0.html

They do not necessarily have a one-to-one relationship. I'm trying to accurately portray old materials, books, and publications, but still allow people to search using standard keyboard input.

If someone types in 台灣, the results should contain 台灣, 臺灣, 台湾.

Would I be able to modify the Drupal search module so that searching on one variant also returns results for the others? Possibly something via regular expressions?

Bonus if it also works with search result highlighting.
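To make the idea concrete, here is a rough sketch of the regular-expression approach, including highlighting. It is plain Python rather than Drupal code, and the names `VARIANT_GROUPS`, `variant_pattern`, and `highlight` are made up for illustration. The two variant groups are only the ones needed for the 台灣 example, not a real table.

```python
import re

# Hypothetical variant groups; a real table would be built from a full
# variant list such as the one linked above.
VARIANT_GROUPS = [
    {"台", "臺"},
    {"灣", "湾"},
]

# Map every character to the sorted list of characters in its group.
VARIANTS = {ch: sorted(group) for group in VARIANT_GROUPS for ch in group}


def variant_pattern(query):
    """Compile a regex in which each character matches any of its variants."""
    parts = []
    for ch in query:
        group = VARIANTS.get(ch)
        if group:
            # Character class such as [台臺] covering all known variants.
            parts.append("[" + "".join(group) + "]")
        else:
            # Characters without variants match only themselves.
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def highlight(text, query, open_tag="<strong>", close_tag="</strong>"):
    """Wrap every variant spelling of the query found in text."""
    if not query:
        return text
    pattern = variant_pattern(query)
    return pattern.sub(lambda m: open_tag + m.group(0) + close_tag, text)
```

Because the pattern matches whatever spelling appears in the text, the highlighted output keeps the original characters of the old material instead of replacing them with what the user typed.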

Comments

Anonymous’s picture

I've put the character variants into a table so that combinations can be generated from a user's query. Is there any way I can run a combination function and then pass an "OR", or use regular expression brackets [], into the regular Drupal search function?
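The combination function I have in mind would look roughly like the sketch below. It is standalone Python, not Drupal code. `VARIANT_TABLE`, `expand_query`, and `or_query` are hypothetical names, and the table only holds the entries for the 台灣 example. Whether the resulting string can be handed to the Drupal search as-is is an assumption I have not verified.

```python
from itertools import product

# Hypothetical lookup table: each character maps to all of its variants,
# itself included. A real table would come from the variant database.
VARIANT_TABLE = {
    "台": ["台", "臺"],
    "臺": ["台", "臺"],
    "灣": ["灣", "湾"],
    "湾": ["灣", "湾"],
}


def expand_query(query):
    """Return every spelling of the query obtainable by swapping variants."""
    # Characters that are not in the table only stand for themselves.
    choices = [VARIANT_TABLE.get(ch, [ch]) for ch in query]
    return ["".join(combo) for combo in product(*choices)]


def or_query(query):
    """Join all spellings into one OR expression."""
    return " OR ".join(expand_query(query))


print(or_query("台灣"))
```

The number of combinations multiplies with every character that has variants, so a long query could produce a very large OR list. That is one reason a bracket-style pattern, or normalizing both the index and the query to one canonical variant, may scale better.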

alexii99’s picture

Hi PeiPei,

I would like to know more about your solution from years ago, and whether there is an updated solution for this.

Thanks!

Anonymous’s picture

I never found a solution. I did post the question to D.tw here: http://drupaltaiwan.org/forum/20110317/4991 It's above my skill set, unfortunately, though I've been meaning to play around with it.

alexii99’s picture

Hi Pei Pei,

Thanks for your reply.
I have tried this one, http://drupal.org/project/ccsearch, but it is only for Drupal 6.
My Drupal skills are not good; after I installed the module and its dependencies, the search didn't work either. Too bad...

nancydru’s picture

Has anyone looked at this page in IE8? The characters don't render. They are fine in FF.

Anonymous’s picture

They should look fine in Unicode UTF-8. I'm looking at it in IE9.

Still interested if anyone's got a solution for this problem. :)

nancydru’s picture

Below IE 9, one needed to install special graphics software (from MS). It's included in IE9.

alexii99’s picture

Good to reactivate this post; I hope something good happens!