I have just dealt with an issue on a client's site, which uses Solr search through a Views exposed form: searching for "マルチェロ　ブラック" (two words separated by an ideographic space, \u3000) returned no results, while searching for "マルチェロ ブラック" (the same two words separated by a normal space) returned the expected results. The expected behavior is that the former query, with the ideographic space, returns the same set of results as the query with a normal space.

The cause is that SearchApiQuery::parseKeys() explodes the query string on the normal space only, ignoring other possible whitespace characters. One option would be to change it to use preg_split() on a generic whitespace character class with the u (PCRE_UTF8) modifier.
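To illustrate the difference, here is a minimal sketch, assuming PHP 7 or later. The two helper functions are hypothetical names made up for this example and are not the actual SearchApiQuery::parseKeys() code or the attached patch. The pattern lists \p{Z} explicitly next to \s so that it does not rely on how \s is interpreted under the u modifier.

```php
<?php

// Hypothetical helper: split on the ASCII space only, as described above.
function split_on_ascii_space(string $keys): array {
  return explode(' ', $keys);
}

// Hypothetical helper: split on any run of whitespace, including
// Unicode separators such as the ideographic space (U+3000).
function split_on_unicode_whitespace(string $keys): array {
  // \s covers ASCII whitespace, \p{Z} covers Unicode separator characters.
  // PREG_SPLIT_NO_EMPTY drops empty pieces from leading/trailing whitespace.
  $parts = preg_split('/[\s\p{Z}]+/u', $keys, -1, PREG_SPLIT_NO_EMPTY);
  // preg_split() returns FALSE on failure (e.g. invalid UTF-8 input).
  return $parts === FALSE ? [] : $parts;
}

$ideographic = "マルチェロ\u{3000}ブラック";
$normal = "マルチェロ ブラック";

// The ASCII-only split leaves the ideographic query as a single term.
var_dump(count(split_on_ascii_space($ideographic)));
// The Unicode-aware split yields the same two terms for both queries.
var_dump(split_on_unicode_whitespace($ideographic) === split_on_unicode_whitespace($normal));
```

With the ASCII-only split the whole ideographic query is treated as one term, which would explain the empty result set.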

Suggested patch to follow.

Comments

maciej.zgadzaj

Status: Active » Needs review
Attached file: status new, size 479 bytes

Patch attached.

drunken monkey

Category: feature » bug

Thanks for reporting this and already providing a patch!

Yes, we should of course fix this. I'm amazed we aren't doing this already; normally I always make sure to consider different languages as well.

Anyway, the patch seems to work very well, it's a trivial change, and I'm about to create a new release, so I just committed this right away.
Thanks again!

drunken monkey

Status: Needs review » Fixed

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.