Attached is a basic test case for the full text engine in search.module. It indexes some content and then runs a set of hardcoded queries with given results. It also checks the relevancy scores for normalization.

All 105 tests pass on Drupal 5.x-dev. On Drupal 6.x-dev, 18 tests fail due to bugs in the new code.

Comment | File | Size | Author
#3 | search_match.test | 5.06 KB | Rok Žlender
 | search_match.test_.txt | 4.92 KB | Steven

Comments

douggreen

Thanks for the tests. #205795 fixes the normalization, which was causing the scoring range problems.
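To make the "scoring range" point concrete, here is a minimal Python sketch of the kind of check involved. It assumes that "normalized" means each raw score is divided by the highest raw score, so every result lands in the range (0, 1]. The function names and sample scores are hypothetical and are not taken from search.module or from the attached test.

```python
def normalize(raw_scores):
    """Scale raw relevancy scores so the best match scores exactly 1.0.

    Assumption: raw scores are positive numbers keyed by item id.
    """
    if not raw_scores:
        return {}
    top = max(raw_scores.values())
    return {item_id: score / top for item_id, score in raw_scores.items()}


def scores_in_range(scores):
    """Return True if every score lies in the half-open range (0, 1]."""
    return all(0.0 < score <= 1.0 for score in scores.values())


# Hypothetical raw scores for two indexed items.
RAW = {2: 3.0, 7: 12.0}
NORMALIZED = normalize(RAW)
```

With these sample values, the raw scores fail the range check and the normalized ones pass it, which is the property a range assertion in a test would look for.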

After applying #205795 there are only 5 failed tests. Since the simpletest sets the minimum word length to 3, shouldn't a query for "dolore eu" return both 2 and 7, not just 7? "dolore xx" would also return 2 and 7, not the empty set. "ut minim" and "xx minim" have similar failures, but this time the results I would expect are 5, 6, and 7. The final failure is on "enim veniam am minim ut", which I'd also expect to match 5 and 6.

The new code is clearly handling AND searches differently when they contain short words (shorter than the minimum word length). To my eye, the results it returns are what I'd expect. Please help me understand why search should return the results you've coded, and why that makes sense, so that we can either fix the new search code or fix the simpletest. Thanks!

Steven

The minimum word length only applies to the search_index / search_total tables, not to the matching in general.

If all short words were ignored, then "dolore xx" (and thus "dolore yy") would indeed be treated as "dolore" and match items 2 and 7. Then you'd expect "dolore xx OR yy" to return the same results (since it is equivalent to "(dolore xx) OR (dolore yy)"). But this isn't so, and there is a test for it, which passes on 6.x.

The restriction has always been that a query needs only one long word in an AND context. This translates to being able to eliminate items on the first pass, and thus not having to do a full table scan on the second pass.
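The rule described above can be sketched as a toy two-pass matcher in Python. This is only an illustration under stated assumptions, not the search.module implementation. The corpus text, the names `build_index` and `search_and`, and whole-word matching in the second pass are all hypothetical. The minimum word length of 3 is the value mentioned earlier in this thread.

```python
import re

MIN_WORD_LENGTH = 3  # assumed setting, as mentioned in the thread

# Hypothetical corpus; not the text used by the attached test.
DOCS = {
    2: "ut labore et dolore magna aliqua",
    7: "esse cillum dolore eu fugiat nulla pariatur",
}


def words(text):
    """Split text into lowercase words."""
    return re.findall(r"\w+", text.lower())


def build_index(docs):
    """Index only words that meet the minimum length (the search_index role)."""
    index = {}
    for doc_id, text in docs.items():
        for word in words(text):
            if len(word) >= MIN_WORD_LENGTH:
                index.setdefault(word, set()).add(doc_id)
    return index


def search_and(query, docs, index):
    """AND search: narrow by long words first, then check every word."""
    terms = words(query)
    long_terms = [t for t in terms if len(t) >= MIN_WORD_LENGTH]
    if not long_terms:
        raise ValueError("an AND query needs at least one long word")
    # First pass: the index eliminates items using the long words only.
    candidates = set.intersection(*(index.get(t, set()) for t in long_terms))
    # Second pass: short words still have to match, but only the
    # surviving candidates are scanned, never the whole table.
    return sorted(
        doc_id for doc_id in candidates
        if all(t in words(docs[doc_id]) for t in terms)
    )


INDEX = build_index(DOCS)
```

In this toy corpus, `search_and("dolore", DOCS, INDEX)` matches items 2 and 7, `"dolore eu"` matches only item 7, and `"dolore xx"` matches nothing, because the short word is not dropped from the query. A query made up only of short words is rejected, since the first pass would have nothing to narrow on.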

Rok Žlender

Status: new
FileSize: 5.06 KB

I made the following changes:
- Removed the $text variable from the search_wipe call; it produced a bunch of notices and I can't find any use for it.
- Added setUp and tearDown calls, which are standard for simpletest; setUp also enables the search module if it is not already enabled.

Other than that, D5 works great and, as you said, D6 produces 18 fails. If you are OK with the first change, I'll commit to the D5 and D6 simpletest.

douggreen

Status: Needs review » Reviewed & tested by the community

The remaining 6.x simpletest failures are fixed with #205920.

I think that this simpletest is good to commit as-is. Steven did a great job of creating thorough test cases. If additional test cases come to light, we should add them in subsequent patches.

Rok Žlender

Status: Reviewed & tested by the community » Fixed

Committed to HEAD and to the D5 version of simpletest. Thanks, Steven.

Anonymous

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.