Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
In SearchApiQuery::parseKeys drupal string functions are not used consequently. As a result when a quoted term contains multibyte characters, a closing quote character is not detected properly. This patch fixes the problem by converting all string manipulation functions within parseKeys to their drupal_* equivalents.
Comment | File | Size | Author |
---|---|---|---|
#4 | 1468678-3-remove-drupal-strlen.patch | 1.69 KB | znerol |
#1 | 1468678--drupal_multibyte_wrappers-1.patch | 25.3 KB | drunken monkey |
0001-Prevent-incomplete-quoting-by-using-drupal-string-fu.patch | 1.73 KB | znerol | |
Comments
Comment #1
drunken monkeyThanks for reporting this, should have fixed this a long time ago … Fact is I often forget to use the proper wrapper functions and that's the result. However, while we're at it, let's fix this in the whole module.
The attached patch does just that.
Comment #2
znerol CreditAttribution: znerol commentedSorry, no. It does not work that way. You frequently use something like
$value[drupal_strlen($value) - 1]
in order to get the last character of a string. However you may not use the result ofdrupal_strlen
as an index into a string. For multibyte strings the value will be too low. Rather you should usedrupal_substr($value, -1)
in order to get the last character in a multibyte as demonstrated in my patch.Comment #3
znerol CreditAttribution: znerol commentedI'll post an updated patch soon.
Comment #4
znerol CreditAttribution: znerol commentedOk, I reviewed the docs and the code. The API docs for drupal_substr says that:
Therefore it is safe to use the plain PHP functions
strpos
andstrlen
in most situations. The issue at hand however is thatdrupal_strlen
is used inquery.inc
where plainstrlen
would be appropriate.Using grep and coder-review I analyzed all occurences of drupal-prefixed string functions vs non-prefixed string functions and IMHO the only problematic combination is the one in
query.inc
which caused me problems with multibyte search keys when they were enclosed in quotes.Therefore I propose to fix the problem by removing the drupal-prefixed functions in favor of the plain
strlen
functions and leaving all the rest as it is now. A patch is attached which also adds an appropriate test tocheckQueryParseKeys
test case.Where the drupal-prefixed string functions were used
IMHO there is no need to use drupal_strtolower here. Operator is not expected to hold multibyte data.
Ok.
Ok.
Not ok. Output of drupal_strlen is number of characters and not number of bytes.
IMHO there is no need to use drupal_strtolower here. Multibyte characters will be stripped anyway.
Ok.
Where coder-review suggests drupal-prefixed string functions
We deal with controlled/static strings in all of the instances in this file. No action required.
I'm not sure about that. This is probably a separate issue. Test case: Try to boost img-tags. Then examine score of content having multibyte string in an attribute of the img-tag (e.g.
<img title="Schöne Grüsse>
)We deal with controlled/static strings in all of the instances in this file (sort order and sql operators). No action required. Off-by one error is fixed by converting the drupal-prefixed strlen functions to the plain versions.
We deal with controlled/static strings in all of the instances in this file (field names may not contain multibyte characters). No action required.
We deal with controlled/static strings in all of the instances in this file (field names may not contain multibyte characters). No action required.
We deal with controlled/static strings in all of the instances in this file (field names may not contain multibyte characters). No action required.
Most probably there are no problems here because the non-prefixed string functions are used consequently here, i.e. no intermixing of drupal_strlen with index-access.
Most probably there are no problems here because the non-prefixed string functions are used consequently here, i.e. no intermixing of drupal_strlen with index-access.
These operate on table names and field names. The drupal-string functions are definitely not necessary here.
This instance only operates on a table name. No problems.
Most probably there are no problems here because the non-prefixed string functions are used consequently here, i.e. no intermixing of drupal_strlen with index-access.
This instance only operates on a table name. No problems.
This instance only operates on a table name. No problems.
This instance only operates on a table name. No problems.
We deal with controlled/static strings in all of the instances in this file (field names may not contain multibyte characters). No action required.
Comment #5
drunken monkeyWow, thanks for your efforts! You even added a test case – exemplary! ;)
The patch works fine, the test checks out … committed! Thanks again!