A search for "front" doesn't pull up http://drupal.org/node/78487 but "Front" does (no quotes on either). Is that intended? I don't think having search be case-sensitive is that usable. Is this known, intended, for performance reasons?

Comments

hunmonk’s picture

Status: Active » Needs review
StatusFileSize
new927 bytes

this is a bit of a rat's nest. here's what i've found so far:

  • we're simply using the 'LIKE' operator for our comparisions.
  • mysql (and most other databases, it seems) does case-insensitive text searches by default. so if you're using mysql and having this problem, you'll need to correct that with your database setup.
  • postgres does case-sensitive text searches by default. there are a number of ways make the search case-insensitive, though.

it looks to me like the simplest solution is to use the postgres custom ILIKE operator for these searches, in the case where postgres is the database. we could try using the 'lower' function here in all cases, but i'm not sure of the performance implications, and i'm also not sure it's implementation is consistent across databases.

attached patch uses the ILIKE approach.

hunmonk’s picture

Project: Project » Project issue tracking
Version: 5.x-1.0 » 5.x-2.x-dev

moving to correct queue

john morahan’s picture

This problem appears on drupal.org - doesn't that use MySQL?

Compare:
http://drupal.org/project/issues/project_issue?text=Front (3 results right now)
http://drupal.org/project/issues/project_issue?text=front (no results)

(despite the fact that this very node contains "front" in lowercase, so maybe case isn't the real problem here?)

dww’s picture

Status: Needs review » Needs work

Right. I don't think hunmonk's patch would fix the problem, since it's definitely present in mysql (or at least, d.o), as well. I've long suspected that d.o's search indexes are busted, leading to all sorts of trouble with issue advanced search -- sadly, rebuilding them would be a hugely expensive task, and I'm not 100% certain that's really the source of the trouble. Not sure what to do about that. I also don't have time now to investigate more closely and see if there's something broken in the code, or just the d.o indexes themselves...

hunmonk’s picture

Status: Needs work » Needs review

i believe my patch is part of fixing this issue, at least on postgres installs. please review. :)

dww’s picture

I did review, which is why I said it needs work. ;) It doesn't solve the reported problem on d.o, which is the main issue here. I agree ILIKE is preferable to LOWER() -- behold the pain and suffering at http://drupal.org/node/83738.

But, whatever, I won't get in a shoving match about the status of this issue. I just don't think it helps to do a partial, pgsql-only solution when we don't understand what's really wrong yet.

hunmonk’s picture

i've uncovered the postgres issue while exploring the report, and i see it as _part_ of the fix (obviously not the only fix) for what's been reported. if you really want, i can break it out into another issue, but this issue already has all of the description for the postgres-related problem.

dww’s picture

Sorry if I wasn't clear. Your fix is an improvement, and worth doing. It just doesn't *solve* the issue here, which is why the patch still needs work. Just because something still needs work doesn't mean it should be thrown out, or that it's all wrong. It might just be incomplete. That's all I'm saying.

aclight’s picture

I did a bit of limited testing of project_issue's advance search on my site, and I didn't have any difficulties with searches being case sensitive. The site is using MySQL.

I suspect, as dww pointed out, that the problem is deeper than just the project_issue code.

If, in fact, the d.o search index is messed up, then there's not much that us mere mortals can do. I'm not sure hot project.drupal.org is set up, but is it possible to rebuild the search index on that instance and see what happens?

Fixing this might help cut down on duplicate issues being filed against projects, as it's happened more than once to me that I've used advanced search and didn't find a relevant issue, but later discovered an issue that should have turned up given my search terms, etc.

john morahan’s picture

Oooooooh.... I think I see the problem!

$sql[] = ($like && $field != 'p.pid') ? "$field LIKE '%$value%'" : "$field = '$value'";

Shouldn't that be:

$sql[] = ($like && $field != 'p.pid') ? "$field LIKE '%%$value%%'" : "$field = '$value'";

?

john morahan’s picture

Title: Advanced search is case sensitive » Advanced search doesn't escape % in LIKE matches (and is also case sensitive in PostgreSQL)
StatusFileSize
new929 bytes

I updated hunmonk's patch with the above. But now I'm thinking this really should be two issues after all.

hunmonk’s picture

Status: Needs review » Needs work

@john: let's do two separate patches, please -- so if you could roll another one for just the fix you spotted, that would be great. i'm fine with reviewing/committing both patches from this one issue if it's not a problem for anybody else -- although that's unorthodox, we do have all the information for both problems already in this issue. :)

john morahan’s picture

Status: Needs work » Needs review
StatusFileSize
new688 bytes

Sure.

hunmonk’s picture

gave the patch in #13 a pretty good workover:

  • searching using words that start with letters used in string substitution work (ie, 'start' now works)
  • searching by starting the search phrase with a % sign still leads to some weirdness (ie, searching for %start gives you (n.title LIKE '%''tart%') admittedly, this is an edge case, but if we can figure out a simple way to handle that it would be nice
  • even with this patch, queries on postgres still have a problem w/ case sensitivity

should we just commit this fix, or try to find an approach that handles the weird edge case as well?

john morahan’s picture

StatusFileSize
new7.14 KB

I guess the ideal way to fix this would be to replace the whole thing with '%%%s%%' and use db_query's built-in substitution.

Unfortunately that would seem to be a somewhat bigger task. Here's a first attempt.

john morahan’s picture

StatusFileSize
new718 bytes

Alternatively, if you want a simple workaround and don't object to slightly evil hacks, then maybe something like this would be sufficient.

hunmonk’s picture

Title: Advanced search doesn't escape % in LIKE matches (and is also case sensitive in PostgreSQL) » Advanced search doesn't escape % in LIKE matches

moved the postgres bug to http://drupal.org/node/192156

hunmonk’s picture

Status: Needs review » Fixed

committed the patch in #16, and installed on d.o. also applied to 4.7.x-2.x and 5.x-1.x. this is an admittedly ugly hack, but this whole section of code sucks, and will be replaced in the near future -- so it seems appropriate given the circumstances.

Anonymous’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.