I want a text field containing "barfoobar" to show up in the search results when searching for the term "foo". Currently it only shows up if you search for the term "barfoobar". Is this supposed to work, and am I missing some setting?

I posted the issue here because I have got no possibility to test it with Solr.

Comments

luksak’s picture

Title: Fulltext search does not use the method use in MySQL's LIKE '%foo%' » Fulltext search does not use the method used in MySQL's LIKE '%foo%'
Project: Search API Database Search » Search API
Component: Code » Framework

I managed to test it with Solr and it doesn't work either. Changing the project.

zambrey’s picture

Subscribe, just want to know if fulltext search is somehow possible.

drunken monkey’s picture

Title: Fulltext search does not use the method used in MySQL's LIKE '%foo%' » Add option for partial matching
Project: Search API » Search API Database Search
Component: Framework » Code
Category: support » feature

This behaviour is actually not defined in the Search API, it's up to the individual backends on how the text is searchable, whether partial matches are supported, etc. Currently, as far as I know no search backend provides this out-of-the-box, though.
While it's a simple matter of changing a config file for Solr, it's much more complicated for the database backend. We'd probably have to index (and search for) n-grams and provide this as an option along with the current behaviour.
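
To make the n-gram idea concrete, here is a minimal Python sketch. It is not Search API code: the function names `ngrams` and `build_index` and the in-memory index are assumptions made purely for illustration. Every substring of each word is indexed, so a lookup for "foo" becomes an exact match against the stored n-grams.

```python
from collections import defaultdict


def ngrams(word, min_len=2, max_len=25):
    """Return every substring of word whose length is in [min_len, max_len]."""
    grams = set()
    for start in range(len(word)):
        for end in range(start + min_len, min(start + max_len, len(word)) + 1):
            grams.add(word[start:end])
    return grams


def build_index(items):
    """Map each n-gram to the set of item IDs whose text contains it."""
    index = defaultdict(set)
    for item_id, text in items.items():
        for word in text.lower().split():
            for gram in ngrams(word):
                index[gram].add(item_id)
    return index


items = {1: "barfoobar", 2: "foo", 3: "something else"}
index = build_index(items)
print(sorted(index.get("foo", set())))        # [1, 2]
print(sorted(index.get("barfoobar", set())))  # [1]
```

The trade-off is index size: a word of length n produces on the order of n² entries, which is presumably why this would have to be an option rather than the default.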

In any case, having this in the DB backend-specific issue queue was right.

luksak’s picture

Thank you for the feedback.

This is going to be necessary I guess, isn't it? Is it going to require a lot of effort?

Could you point me in the right direction with Solr approach? I have got no experience in using it.

Lukas

drunken monkey’s picture

This is going to be necessary I guess, isn't it? Is it going to require a lot of effort?

Depends on what „this“ is.

Could you point me in the right direction with Solr approach? I have got no experience in using it.

See, e.g., #1056018: Better document Solr config customization options and #1307784: Fuzzy Search.

luksak’s picture

Depends on what „this“ is.

Good search engines (e.g. Google) also show you results if a string only partially matches. This functionality should be in the Search API and also be usable by people who do not have Solr. As a first step, the functionality I initially described (MySQL's LIKE '%foo%') would already satisfy me.

For now I will go for the Solr integration. But I will have projects where I do not have the possibility to install Solr on the server but still need a good search.

Edit:
The Solr approach worked perfectly. I had to change the schema.xml: Inside the node

<fieldType name="text" ...

I added two new filters

<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25" />

and uncommented the line

<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>

Edit 2:
I commented out the line

<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>

again because somehow it removed the exact matches. Maybe I will understand Solr one day ;)

luksak’s picture

I guess this is the solution for this problem: fuzzysearch. I have not tested it since I use Solr now.

btmash’s picture

Wanted to add a clarification to this regarding apachesolr. EdgeNGramFilterFactory (<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25" />) only creates n-grams from the beginning or end of an input token. If you want more fuzzy matches, you should try NGramFilterFactory (<filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="25" />) instead.
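
The difference between the two filters can be sketched in a few lines of Python. This only imitates the idea, it is not Solr code, and `edge_ngrams` / `all_ngrams` are illustrative names; the edge variant here works from the front of the token only.

```python
def edge_ngrams(token, min_size=2, max_size=25):
    """Prefixes of token, like an edge n-gram filter working from the front."""
    return [token[:n] for n in range(min_size, min(max_size, len(token)) + 1)]


def all_ngrams(token, min_size=2, max_size=25):
    """Every substring of token with a length between min_size and max_size."""
    return [
        token[i:i + n]
        for n in range(min_size, min(max_size, len(token)) + 1)
        for i in range(len(token) - n + 1)
    ]


print("foo" in edge_ngrams("barfoobar"))  # False: "foo" is not a prefix
print("foo" in all_ngrams("barfoobar"))   # True: "foo" occurs in the middle
print(edge_ngrams("foobar"))              # ['fo', 'foo', 'foob', 'fooba', 'foobar']
```

So for the original "barfoobar" / "foo" case, only the full n-gram variant produces a matching token.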

drunken monkey’s picture

I guess this is the solution for this problem: fuzzysearch. I have not tested it since I use Solr now.

Thanks for the link, I didn't know that module! Seems great, though. I've added it to the module list on the Search API project page.

Wanted to add a clarification to this regarding apachesolr. EdgeNGramFilterFactory (<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25" />) only creates n-grams from the beginning or end of an input token. If you want more fuzzy matches, you should try NGramFilterFactory (<filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="25" />) instead.

Wow, thanks a lot for that! Seems like I've been giving wrong advice to people for over half a year …
Good to know, though.

Although recently I implemented this in Solr (on a private project, not cleanly enough for the official one) by adding wildcards to all words' starts and ends and using the "edismax" request handler (available since 3.5, I think), which also worked very well.
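
The wildcard approach was not posted, so the following is only a guess at its core step, written as a Python sketch with a made-up helper name (`add_wildcards`). A real implementation would also have to escape the query parser's special characters, which is left out here.

```python
def add_wildcards(keys):
    """Wrap every whitespace-separated word in '*' wildcards."""
    return " ".join("*" + word + "*" for word in keys.split())


print(add_wildcards("foo bar"))  # *foo* *bar*
```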

Anyway, still leaving this open as we might one day want to implement this functionality here, too. Maybe by joining efforts with the "Fuzzy Search" maintainer.

luksak’s picture

It's great to hear that there is some progress.

How did you use edismax?

I do not understand why there are two DB search server modules. You really should join those two modules.

attiks’s picture

semiaddict’s picture

I was able to get partial word searches on fulltext fields by slightly modifying the generated query.
A patch is attached with those modifications.

Note: I am not sure if this has any negative impact on any other portions (facets, sorts, etc).

robmc’s picture

Thanks for the patch! I can confirm this works. We have testing scheduled which should expose issues. I'll report if any are found.

Cheers,

Rob McCrea

agoradesign’s picture

Thanks for the patch! I can also confirm that it works as proposed. This behaviour definitely offers a far better user experience!

Anonymous’s picture

#12 seems to be working (I don't see why it shouldn't ^^).

achton’s picture

I'd be curious to hear the module maintainer's opinion on the approach in #12. The search_api_string_filter module does not work for fulltext fields, so is not an option in my case.

mrharolda’s picture

Status: Active » Needs work

The patch in #12 works, but IMHO it's best to be able to select what word matching you'd like on your site: exact (= ...) or fuzzy (LIKE %...%).

Does the search_api support engine-specific options?

Edit: yes it does, but only on the server part, not for queries ... :(
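
As a rough illustration of such a switch, the Python/SQLite sketch below toggles between exact and substring matching. The table layout, the `search` function and the `partial` flag are assumptions for the example, not the module's code; `escape_like` plays the role that db_like() has in the patches.

```python
import sqlite3


def escape_like(value, escape="\\"):
    """Escape the LIKE wildcards % and _ (and the escape character itself)."""
    for char in (escape, "%", "_"):
        value = value.replace(char, escape + char)
    return value


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (item_id INTEGER, word TEXT)")
conn.executemany(
    "INSERT INTO words VALUES (?, ?)",
    [(1, "barfoobar"), (2, "foo"), (3, "100%"), (4, "1000")],
)


def search(keyword, partial):
    """Return matching item IDs using either exact or substring matching."""
    if partial:
        sql = "SELECT item_id FROM words WHERE word LIKE ? ESCAPE '\\' ORDER BY item_id"
        param = "%" + escape_like(keyword) + "%"
    else:
        sql = "SELECT item_id FROM words WHERE word = ? ORDER BY item_id"
        param = keyword
    return [row[0] for row in conn.execute(sql, (param,))]


print(search("foo", partial=False))  # [2]
print(search("foo", partial=True))   # [1, 2]
print(search("0%", partial=True))    # [3]
```

The last call shows why the escaping matters: without it, the user's "%" would act as a wildcard and "1000" would match as well.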

mrharolda’s picture

Slightly improved patch with better input filtering ... still looking into making this optional.

achton’s picture

It seems this approach breaks the "search keys" feature, i.e. searching for multiple words.
My testing suggests that only the first term is used for fetching results if the patch in #18 is applied.

Can anyone confirm this?

mrharolda’s picture

@achton Can't confirm your issue.

I did however find out that nodes with multiple hits show up multiple times in your results. A search on 'tes' with a node with both 'test' and 'testing' in it will give duplicate results. :(
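
The duplicate effect is easy to reproduce outside Drupal. The SQLite sketch below assumes a simplified index table with one row per item and word (the real table has more columns): a substring match hits both "test" and "testing" for the same item, and grouping by item is one possible way to collapse the rows again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (item_id INTEGER, word TEXT, score REAL)")
conn.executemany(
    "INSERT INTO words VALUES (?, ?, ?)",
    [(1, "test", 1.0), (1, "testing", 1.0), (2, "contest", 1.0)],
)

# One result row per matching word, so item 1 shows up twice.
dupes = conn.execute(
    "SELECT item_id FROM words WHERE word LIKE '%tes%' ORDER BY item_id"
).fetchall()
print(dupes)  # [(1,), (1,), (2,)]

# Grouping by item yields one row per item and adds up the word scores.
grouped = conn.execute(
    "SELECT item_id, SUM(score) FROM words WHERE word LIKE '%tes%' "
    "GROUP BY item_id ORDER BY item_id"
).fetchall()
print(grouped)  # [(1, 2.0), (2, 1.0)]
```

Note that summing the scores changes the ranking compared to exact matching, so this is a sketch of the problem rather than a recommended fix.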

colan’s picture

Status: Needs work » Reviewed & tested by the community

That last patch looks good (no duplicate results), and doesn't noticeably harm performance. I RTBCed it, but maybe that's premature if we want to make it an option? I don't see a problem with leaving it as is, but maybe we could create another issue to make it an option if folks want to turn it off?

drunken monkey’s picture

Status: Reviewed & tested by the community » Needs work

I did however find out that nodes with multiple hits show up multiple times in your results. A search on 'tes' with a node with both 'test' and 'testing' in it will give duplicate results. :(

Seems logical, I guess, that's the problem with using "LIKE".

I'm also pretty sure it will perform worse than my proposed solution of giving the user the option to index n-grams instead of only whole words. However, this one is of course much easier to implement (save for the little problem mentioned above), and if we let users decide if they want this behaviour (and the associated performance losses) it should be OK, I guess.

So, if you fix the above bug and add the option for users (as you say, including it on the server and using it on all queries is the only way here, regrettably – #1720348: Add the concept of query extenders might change this in the future, though), I would commit this. The option should probably contain a short warning regarding the performance impact.

yenidem’s picture

Thank you very much @MrHaroldA and @semiaddict #12, this patch worked. I have spent a loooot of time on this issue.

semei’s picture

Will this get committed? I personally also feel like partial matching is absolutely indispensable.

drunken monkey’s picture

As said in #22, the current patch still has two problems. If those get addressed, I'll commit the patch. (We should probably add some tests, too, to ensure that behavior doesn't pop up later.)

Anonymous’s picture

Is it a problem to add distinct() ??

damienmckenna’s picture

Status: Needs work » Needs review

drunken monkey’s picture

Status: Needs review » Needs work

Why should this suddenly not need work? If you think just adding distinct() will work, then please, just do that.

damienmckenna’s picture

@drunken: I just changed the status to 'needs review' to trigger testbot; background: I'm helping maintain a site and discovered the above patch had been applied, I wanted to get a quick status check on it, sorry for adding some confusion.

drunken monkey’s picture

Ah, OK. Please just note that next time.
At least it's good to know our current tests already catch the problems with this patch.

Johnny vd Laar’s picture

For people who don't want to hack search api db, here is a piece of code that you can place in your own module that does the same as the patch.

/**
 * Implements hook_query_TAG_alter() for search_api_db_search.
 * @see https://drupal.org/node/1299238#comment-7757405
 */
function X_query_search_api_db_search_alter(QueryAlterableInterface $query) {
  $to_like = function(&$conditions) use (&$to_like) {
    foreach($conditions as $i => $condition) {
      if (is_int($i)) {
        if (isset($condition['field']) && is_object($condition['field'])) {
          $sub_conditions = &$conditions[$i]['field']->conditions();
          $to_like($sub_conditions);
        }
        elseif (isset($condition['value']) && is_object($condition['value'])) {
          $sub_conditions = &$conditions[$i]['value']->conditions();
          $to_like($sub_conditions);
        }
        elseif ($conditions[$i]['operator'] != 'LIKE' && !is_object($conditions[$i]['value'])) {
          $conditions[$i]['value'] = '%' . db_like($conditions[$i]['value']) . '%';
          $conditions[$i]['operator'] = 'LIKE';
        }
      }
    }
  };
  $query->distinct();

  $tables = &$query->getTables();
  if (isset($tables['t']['table']) && in_array($tables['t']['table'], array('search_api_db_node_search_api_viewed', 'search_api_db_user_search_api_language'))) {
    $conditions = &$query->conditions();
  }
  elseif (isset($tables['t']['table']) && is_object($tables['t']['table'])) {
    $conditions = &$tables['t']['table']->conditions();
  }

  if (isset($conditions)) {
    $to_like($conditions);
  }
}

geerlingguy’s picture

From #22:

So, if you fix the above bug and add the option for users (as you say, including it on the server and using it on all queries is the only way here, regrettably – #1720348: Add the concept of query extenders might change this in the future, though), I would commit this. The option should probably contain a short warning regarding the performance impact.

Could you say what, exactly, is required to get this patch through? I'm in agreement with #24:

I personally also feel like partial matching is absolutely indispensable.

Without partial matching, search is pretty useless in most use cases I've encountered... for me, if I need to worry about performance at all, there's no way I'm going to use database-backed search anyways. At that point, I'll switch to solr for the indexing/searching.

But I'd really like to see this patch (or something like it) committed, so I'm willing to work on whatever improvements are required.

drunken monkey’s picture

Could you say what, exactly, is required to get this patch through? I'm in agreement with #24:

- Eliminate the duplicate results when multiple words match, and make all tests pass.
- Add a server option to switch this behavior on or off (default to off), with a note that this might have a negative impact on performance.

geerlingguy’s picture

Okay, I'll try to get to this soon.

Anonymous’s picture

#31 If you don't work with nodes, this code:

  if (isset($tables['t']['table']) && in_array($tables['t']['table'], array('search_api_db_node_search_api_viewed', 'search_api_db_user_search_api_language'))) {
    $conditions = &$query->conditions();
  }

basically makes the whole hook inoperative.

So it's better to use:

  if (isset($tables['t']['table'])) {
    $conditions = &$query->conditions();
  }

gilsbert’s picture

Hi.
Nice news.
+1 waiting for the patch.

styrbaek’s picture

The Views pager disappears when using the code in #31.

styrbaek’s picture

Issue summary: View changes

typo

stopshinal’s picture

What is the status on this? I also need partial matching.

#31: I tried this code and it didn't seem to have an effect.

jon pugh’s picture

Issue summary: View changes
Status: Needs work » Needs review
Attachment (new): 719 bytes

Re-rolled on 7.x as of Nov 19th.

mrharolda’s picture

Status: Needs review » Needs work

The patch above still hasn't got an option to enable/disable partial matching, nor does it handle duplicates in the result.

Edit: the db_like() filtering was also stripped from the patch!

markplindsay’s picture

I was able to get #31's module code working with some modifications. The idea is that you want to get all conditionals using LIKE.

It looks like a Search API query on many fulltext fields (picked out in the Search API UI's Fields tab) is formed using a series of UNIONs. In my case, I had fulltext enabled on some taxonomy term names so users could search by tags. But the LIKE conditionals were not being extended to these taxonomy term name UNIONs. So searching for partial tag names wasn't working.

By running $to_like on the UNION conditionals as well, I was able to get partial matches working with these other fulltext fields. My code isn't perfect and may not accommodate your use case, but maybe it can give you a start.

/**
 * Implements hook_query_TAG_alter() for search_api_db_search.
 * @see https://drupal.org/node/1299238#comment-7757405
 */
function X_query_search_api_db_search_alter(QueryAlterableInterface $query) {
  $to_like = function(&$conditions) use(&$to_like) {
    foreach($conditions as $i => $condition) {
      if(is_int($i)) {
        if(isset($condition['field']) && is_object($condition['field'])) {
          $sub_conditions = &$conditions[$i]['field']->conditions();
          $to_like($sub_conditions);
        }
        elseif(isset($condition['value']) && is_object($condition['value'])) { 
          $sub_conditions = &$conditions[$i]['value']->conditions();
          $to_like($sub_conditions);
        }
        elseif($conditions[$i]['operator'] != 'LIKE' && !is_object($conditions[$i]['value'])) { 
          $conditions[$i]['value'] = db_like($conditions[$i]['value']) . '%';
          $conditions[$i]['operator'] = 'LIKE';
        }
      }
    }
  };
  $query->distinct();
  $tables = &$query->getTables();  
  // as https://drupal.org/comment/7916105#comment-7916105 noted, the first
  // if statement involving search_api_db_node_search_api_viewed in the original 
  // code didn't seem to be useful so i removed it.
  if(isset($tables['t']['table']) && is_object($tables['t']['table'])) {
    $conditions = &$tables['t']['table']->conditions();
    if(isset($conditions)) {
      $to_like($conditions);
    }
    // get the union queries from the $tables['t']['table'] selectquery object
    $unions = &$tables['t']['table']->getUnion();
    if(isset($unions)) {
      foreach($unions as $delta => $union) {
        $union_conditions = &$unions[$delta]['query']->conditions();
        if(isset($union_conditions)) {
          $to_like($union_conditions);
        }
      }
    }
  }
}

jncruces’s picture

The previous comment worked perfectly for me. I only added the percent sign before the search text to simulate a "contains any word" search.

/**
 * Implements hook_query_TAG_alter() for search_api_db_search.
 * @see https://drupal.org/node/1299238#comment-7757405
 */
function X_query_search_api_db_search_alter(QueryAlterableInterface $query) {
  $to_like = function(&$conditions) use(&$to_like) {
    foreach($conditions as $i => $condition) {
      if(is_int($i)) {
        if(isset($condition['field']) && is_object($condition['field'])) {
          $sub_conditions = &$conditions[$i]['field']->conditions();
          $to_like($sub_conditions);
        }
        elseif(isset($condition['value']) && is_object($condition['value'])) {
          $sub_conditions = &$conditions[$i]['value']->conditions();
          $to_like($sub_conditions);
        }
        elseif($conditions[$i]['operator'] != 'LIKE' && !is_object($conditions[$i]['value'])) {
          $conditions[$i]['value'] = '%' . db_like($conditions[$i]['value']) . '%'; // Here is my change
          $conditions[$i]['operator'] = 'LIKE';
        }
      }
    }
  };
  $query->distinct();
  $tables = &$query->getTables();
  // as https://drupal.org/comment/7916105#comment-7916105 noted, the first
  // if statement involving search_api_db_node_search_api_viewed in the original
  // code didn't seem to be useful so i removed it.
  if(isset($tables['t']['table']) && is_object($tables['t']['table'])) {
    $conditions = &$tables['t']['table']->conditions();
    if(isset($conditions)) {
      $to_like($conditions);
    }
    // get the union queries from the $tables['t']['table'] selectquery object
    $unions = &$tables['t']['table']->getUnion();
    if(isset($unions)) {
      foreach($unions as $delta => $union) {
        $union_conditions = &$unions[$delta]['query']->conditions();
        if(isset($union_conditions)) {
          $to_like($union_conditions);
        }
      }
    }
  }
}

FranciscoLuz’s picture

#42 works!

FranciscoLuz’s picture

Title: Add option for partial matching » Search API Database Search sub-string ( partial words ) searching match

Anonymous’s picture

There should be also "starts with" and "ends with" options available in addition to "contains".

Anonymous’s picture

#42 works, but only if the condition is_object($tables['t']['table']) is met, which is not always the case. I had my search set up to only index the title field, in which case $tables['t']['table'] is just a string and not an object.

Also, the Views pager disappears. Removing the $query->distinct(); line makes it appear again. Duplicates don't seem to appear in my case, but haven't tested properly.

jncruces’s picture

#46 is right. Removing $query->distinct(); solved the pager problem.

Thanks.

Johnny vd Laar’s picture

Status: Needs work » Needs review
Attachment (new): 2.55 KB

Attached patch works with the new db structure and also provides a server option to switch on / off this partial search behavior.

Status: Needs review » Needs work

The last submitted patch, 48: search_api_db-partial-fulltext-search-1299238-48.patch, failed testing.

Johnny vd Laar’s picture

Status: Needs work » Needs review
Attachment (new): 2.97 KB

Ok I think I missed a var somewhere. Here is a fix for the notice.

FranciscoLuz’s picture

The patch at #50 worked as advertised.

I am having an issue, though, with Portuguese characters like ç, ã, é and so on.

Say for instance I am searching for tração but type tracao instead, it won't return the results containing tração.

Does anyone know how I could fix this issue?
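
One common way to handle this is to fold accents in the same way at index time and at search time. The Python sketch below shows the folding step only; it is an illustration with a made-up helper name (`strip_accents`), not what Search API or any related issue actually does.

```python
import unicodedata


def strip_accents(text):
    """Decompose characters and drop the combining marks (accents)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("tração"))                             # tracao
print(strip_accents("tração") == strip_accents("tracao"))  # True
```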

Anonymous’s picture

#51: see #2100665

drunken monkey’s picture

Status: Needs review » Needs work

Thanks, that looks great! And when the test bot is happy, it's pretty probable that it will also work correctly – at least when the option is not enabled. However, it would be great if you could also add some tests for this option, so we have some proof that it really works (and can assure it keeps working).

Also, this part is very confusing (though rather brilliant I have to say, after finally having understood it) and thus in dire need of some comment explaining it:

@@ -1378,16 +1387,36 @@ class SearchApiDbService extends SearchApiAbstractService {
+          if ($mult_fields) {
+            $db_query->addExpression("t.word LIKE '%" . db_like($word) . "%'", $alias);
+            $db_query->groupBy($alias);
+          }

Apart from that, as said, it's great! Thanks a lot again!

caesius’s picture

Patch needs updating for the latest dev commit which touched the same parts of service.inc.

Johnny vd Laar’s picture

Status: Needs work » Needs review
Attachment (new): 6.46 KB

I've updated my patch with the latest commit, added some comments and added test scripts. Let's see what testbot thinks of it.

drunken monkey’s picture

Attachment (new): 7.13 KB

Wow, excellent work, thanks! I wish all contributors to my modules were as good as you …
Anyways, there were still two tiny faults with your patch:

  1. +++ b/search_api_db.test
    @@ -341,6 +345,39 @@ class SearchApiDbTest extends DrupalWebTestCase {
    +    $results = $this->buildSearch('foo')->execute();
    +    $this->assertEqual($results['result count'], 5, 'Partial search for »foo« returned correct number of results.');
    +    $this->assertEqual(array_keys($results['results']), array(1, 2, 4, 3, 5), 'Partial search for »foo« returned correct result.');
    

    This search is sorted by score, but 1/2/4 and 3/5 have actually the same score (5 and 1, respectively). Therefore, this test passing relies on the database sorting by ID for identical scores, which is not always the case (and should in any case not be relied on here).
    Therefore, I added an explicit id ASC sort.

  2. +++ b/service.inc
    @@ -1340,6 +1348,7 @@ class SearchApiDbService extends SearchApiAbstractService {
    +    $partial = $this->options['partial_string_search'];
    

    Existing servers won't have this setting, so there should be an empty() around it.
    Also, since you only use it once, we don't really need a variable for it, I'd say. (Even though the temptation to keep it and rename it to $match_parts is quite large …)

I also changed some strings and added basic method documentation. Please see the attached patch, I hope it still passes. If it's also still alright with you, I think we can finally commit this!
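
The ordering problem from point 1 can be shown with a few lines of Python, using the IDs and scores quoted above (1/2/4 with score 5, 3/5 with score 1). Sorting by score alone leaves the order within each tie up to the database; adding the ID as a second sort key makes the expected result deterministic. The list-of-dicts structure is just an assumption for the example.

```python
results = [
    {"id": 3, "score": 1},
    {"id": 1, "score": 5},
    {"id": 5, "score": 1},
    {"id": 4, "score": 5},
    {"id": 2, "score": 5},
]

# Score descending first, then ID ascending as an explicit tie-breaker.
ordered = sorted(results, key=lambda r: (-r["score"], r["id"]))
print([r["id"] for r in ordered])  # [1, 2, 4, 3, 5]
```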

drunken monkey’s picture

Attachment (new): 6.86 KB

Oops, that's the right one.

Poieo’s picture

The partial matching seems to be working well. However, now I'm getting the following error using today's dev with this patch.

Error message
SQLSTATE[21000]: Cardinality violation: 1241 Operand should contain 1 column(s)

Poieo’s picture

A little more info that may help... I'm using the Full Text search field, and the only time the issue appears is if the search term is a single word. If I use multiple words, the results, including partial ones, work great.

Johnny vd Laar’s picture

I didn't encounter that error, can you perhaps post the query that search api db generated?

Poieo’s picture

I tried to use View's settings to show the query but I only get the following along with the error message: Query No query was run.

Is there anything else I can do to provide you with this information?

Poieo’s picture

Under Fulltext search settings in views, under 'Use as', if Search keys is selected I do not get the error, but if Search filter is selected I get the error...this may be unavoidable. I'm pretty sure I need to have Search keys selected anyway.

I tried both settings on a clean install of Drupal Commerce and the error does occur for Search filter.

Johnny vd Laar’s picture

I can't seem to reproduce this error. I have:

  1. Search api db with partial search enabled (patch in #57)
  2. View with full text search "Search filter – use as a single phrase that restricts the result set but doesn't influence relevance."

Do you also have the error when you didn't apply the patch?

drunken monkey’s picture

Attachment (new): 11.74 KB

The partial matching seems to be working well. However, now I'm getting the following error using today's dev with this patch.

Thanks a lot for reporting this problem! After some playing around, I could also reproduce this.
However, getting a reproducible test case, finding the root cause of the problem and then fixing it took several hours. But now I'm done and the attached patch should hopefully solve this and some other problems with specific setups. I've also included tests for the problems I've found (there were several).

Please see if this patch now (or still) works for you!
It's really complex so I'd like to make sure it breaks nobody's setup.

Johnny vd Laar’s picture

Last patch works for me, but I didn't get the error in the first place. Does it work for you Poieo?

caesius’s picture

Patch no longer applies; service.inc has since been updated.

jeroent’s picture

Attachment (new): 11.73 KB

Created reroll of this patch.

caesius’s picture

Works for me. I was previously having an issue with multiple-word searches not working, but this patch fixes that.

drunken monkey’s picture

Status: Needs review » Fixed

OK, great!
Committed (finally).

Thanks again to everyone working on this, especially Johnny!

Johnny vd Laar’s picture

you're welcome! thanks for committing.

jnorell’s picture

Trying to get this feature to work, what am I missing? I'm using the default search box and results page (not custom view).

  • I upgraded search_api_db to 7.x-1.3, and ran upgrade.php.
  • I edit my search api server, enable the Search on parts of a word checkbox.
  • I delete all indexed data and reindex the site.

I then search for known partial terms in title or body (and longer than min length), with no results found.

Thanks...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

drunken monkey’s picture

Title: Search API Database Search sub-string ( partial words ) searching match » Add option for partial matching
Status: Closed (fixed) » Needs work

#2286329: Incorrect facet counts in multi-word search reports a regression caused by this patch, it seems we accidentally reverted the commit from #1403916-4: Multi word search results sets incorrect count for search_api_facets when Facets is in use.

+++ b/service.inc
@@ -1746,6 +1763,10 @@ class SearchApiDbService extends SearchApiAbstractService {
+
+    $group_by = &$db_query->getGroupBy();
+    $group_by = array();
+

These added lines were introduced in #55 – do you still know why you added them, Johnny, or can you reconstruct it? I can't find anything wrong with removing those, and the tests also don't fail.

@jnorell: Please don't comment in fixed issues, otherwise your comments are likely to be overlooked (as happened here). Either re-open them when commenting, or open a new issue.
Have you been able to get this to work in the meantime? What you describe looks like it should work, and re-indexing shouldn't be required either. Is the index maybe lying on the wrong server, or did you make some other mistake in the setup?

Johnny vd Laar’s picture

Hmm I don't remember anymore why I've done that.

drunken monkey’s picture

OK, to be expected, I guess.
Could you maybe test the patch in #2286329-4: Incorrect facet counts in multi-word search and see whether partial searches still work for you after this? Then I'd just commit and we'll assume this somehow got in from an earlier version or something.

drunken monkey’s picture

Status: Needs work » Closed (fixed)

Closing here again since we committed the other issue.

brunorios1’s picture

Hi,

I have two nodes with titles: 111222333 and 111-222-333

A search for 111-222-333 returns the 2 nodes as results.
But a search for 111222333 only returns the node with exact match title 111222333.

Is this expected?
How can I create a search that works in both cases?

Thanks!

mrharolda’s picture

@brunorios1: only Solr supports that kind of search.

Check this for more info: https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#Tokenizers

brunorios1’s picture

Thanks @MrHaroldA,
I'll take a look.

brunorios1’s picture

@MrHaroldA,

I managed to do this with search_api_db using the search_api Tokenizer processor with the dash in the Ignorable Characters field.
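
Why this helps can be sketched in Python. The `tokenize` function below is a simplified stand-in for a tokenizer with an "ignorable characters" setting, not the processor's actual code: ignorable characters are removed before the text is split, so "111-222-333" is indexed as a single token instead of three.

```python
import re


def tokenize(text, ignorable="-"):
    """Drop ignorable characters, then split on anything that is not a word character."""
    for char in ignorable:
        text = text.replace(char, "")
    return [token for token in re.split(r"\W+", text.lower()) if token]


print(tokenize("111-222-333"))                # ['111222333']
print(tokenize("111-222-333", ignorable=""))  # ['111', '222', '333']
```

With the dash treated as a separator, the indexed tokens are "111", "222" and "333", none of which contains "111222333", which would explain the behaviour described in the question.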

Thanks again.

brunorios1’s picture

Can you please confirm if there's a "starts with" option available as mentioned in #45?

My server is configured to search parts of words.

I have an exposed filter (Indexed node: Title) in my view, but I only see "contains" and "doesn't contain" available in the operator options.

Thanks!

Anonymous’s picture

How/where do I enable the partial match/search? I don't see anything in the UI.
Overlooked this is D7.

guilopes’s picture

Attachment (new): 3.22 KB

Hello @brunorios1, this patch adds an option to search by "Starts with".

drunken monkey’s picture

Project: Search API Database Search » Search API
Version: 7.x-1.x-dev » 8.x-1.x-dev
Component: Code » Database backend
Status: Closed (fixed) » Patch (to be ported)

Seems I overlooked this while identifying issues that need to be ported to D8. This never made it in there, only three places where, for some reason, the D7 code for this shows through.

drunken monkey’s picture

borisson_’s picture

Status: Needs review » Reviewed & tested by the community

The code looks great and I think the tests for this are very readable and as far as I can see they cover all the changes.

  • drunken monkey committed 5b279f9 on 8.x-1.x
    Issue #1299238 by drunken monkey: Added an option for partial matching...

drunken monkey’s picture

Status: Reviewed & tested by the community » Fixed

Great to hear, thanks for reviewing!
Committed.

drunken monkey’s picture

Status: Fixed » Needs review
Attachment (new): 4.27 KB

Damn, this seems to have broken tests for some reason. Or the test bot is in "random fails" mode again, which actually seems more likely to me, given the test results.
But let's see.

Status: Needs review » Needs work

The last submitted patch, 90: 1299238-90--revert_test.patch, failed testing.

drunken monkey’s picture

Status: Needs work » Fixed

OK, it seems the test fails are actually unrelated to this issue, but most likely to a change in Core. And that's when I'd hoped with the stable release this sh*t would be over …
See #2668908: Fix new test fails.

nyariv’s picture

I am getting an sql error on multiple keyword search for #86 patch using fulltext search contextual filter:

SQLSTATE[42S22]: Column not found: 1054 Unknown column 't.word' in 'having clause': SELECT t.item_id AS item_id, SUM(t.score) AS score FROM (SELECT t.item_id AS item_id, t.score AS score, t.word LIKE '%malta%' AS w0, t.word LIKE '%test%' AS w1 FROM {search_api_db_pnx_entities_text} t WHERE ( (t.word LIKE :db_condition_placeholder_0 ESCAPE '\\') OR (t.word LIKE :db_condition_placeholder_1 ESCAPE '\\') )AND (field_name IN (:db_condition_placeholder_2)) GROUP BY item_id, score, w0, w1) t GROUP BY t.item_id HAVING (COUNT(DISTINCT t.word) >= :subs2) ORDER BY score DESC; Array ( [:subs2] => 2 [:db_condition_placeholder_0] => %malta% [:db_condition_placeholder_1] => %test% [:db_condition_placeholder_2] => name )

nyariv’s picture

Status: Fixed » Needs work

drunken monkey’s picture

Can you reproduce this with a clean installation, using the latest versions of both Core and this module?
If so, please list the steps to reproduce this problem in more detail.

nyariv’s picture

Tried retracing the steps in my environment but could not reproduce it. Looking at the code in Database.php, if either line 1814

unset($columns['word']);

or 1872

$db_query->having('COUNT(DISTINCT t.word) >= ' . $var, array($var => $subs));

are removed, then the error disappears. It seems the issue occurs when the $not_nested flag is false, multiple words are searched for, and partial matching is enabled.
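
A minimal sketch of the failure mode described above, assuming SQLite (via Python's sqlite3) as a stand-in for MySQL and a hypothetical table idx_text in place of the real index table. The inner query exposes only the match flags w0 and w1, so an outer HAVING on t.word has nothing to resolve against. The second query, which counts matched keywords through the flags, is just one possible rewrite for illustration and not necessarily what the module ended up doing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the per-index text table.
conn.execute("CREATE TABLE idx_text (item_id INTEGER, word TEXT, score REAL)")
conn.executemany(
    "INSERT INTO idx_text VALUES (?, ?, ?)",
    [(1, "malta", 1.0), (1, "testing", 1.0), (2, "malta", 1.0)],
)

# Same shape as the reported query: `word` is not selected, only the flags.
inner = """
    SELECT t.item_id AS item_id, t.score AS score,
           t.word LIKE '%malta%' AS w0, t.word LIKE '%test%' AS w1
    FROM idx_text t
    WHERE t.word LIKE '%malta%' OR t.word LIKE '%test%'
    GROUP BY item_id, score, w0, w1
"""

# The outer HAVING refers to t.word, which the subquery does not expose.
broken = (
    f"SELECT t.item_id AS item_id, SUM(t.score) AS score FROM ({inner}) t "
    "GROUP BY t.item_id HAVING COUNT(DISTINCT t.word) >= 2"
)
try:
    conn.execute(broken).fetchall()
    broken_error = None
except sqlite3.OperationalError as exc:
    broken_error = str(exc)

# Illustrative alternative: require that every keyword flag matched at least once.
fixed = (
    f"SELECT t.item_id AS item_id, SUM(t.score) AS score FROM ({inner}) t "
    "GROUP BY t.item_id HAVING MAX(t.w0) + MAX(t.w1) >= 2 ORDER BY score DESC"
)
fixed_rows = conn.execute(fixed).fetchall()

print(broken_error)
print(fixed_rows)
```

Only item 1 contains a match for both keywords, so it is the only row the second query keeps.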

nyariv’s picture

OK, I managed to reproduce it on a clean install. The steps are:

1) Install the standard profile with Search API 8.x-1.x and enable display of all logging messages
2) Install the Database Search Defaults module
3) Change the database server setting 'minimum word length' to 1 and enable 'search parts of word'
4) Create a search index view and add a search fulltext contextual filter
5) Enter any two different words as arguments and update the preview

drunken monkey’s picture

Can't reproduce it, sorry. Are you maybe using a database software other than MySQL – e.g., Postgres?
Sorry, forgot to ask that in my previous mail.

drunken monkey’s picture

Status: Needs work » Needs review
Attachment: new, 1.37 KB

Sorry, disregard the above – it seems I had error display disabled, for some strange reason.
Actually, this really is broken, as the attached patch should show. Setting to "Needs review" for the test bot.

Status: Needs review » Needs work

The last submitted patch, 99: 1299238-100--partial_matching_followup--tests_only.patch, failed testing.

drunken monkey’s picture

Status: Needs work » Needs review
Attachments: new, 3.23 KB; new, 7.41 KB

I think/hope I managed to fix it. Please test/review!
Also, as far as I can see, the minimum character count doesn't influence this at all.

drunken monkey’s picture

Issue tags: +Needs backport to D7

Also, at least the test needs to be backported.

The last submitted patch, 102: 1299238-102--partial_matching_followup--tests_only.patch, failed testing.

borisson_’s picture

Status: Needs review » Needs work

I really like the tests, they are very expressive @drunken_monkey++

I can't help but nitpick at least a little bit.

  1. +++ b/search_api_db/src/Plugin/search_api/backend/Database.php
    @@ -1772,14 +1774,15 @@ protected function createKeysQuery($keys, array $fields, array $all_fields, Inde
    +      $mul_words = $word_count > 1;
    

    I think this'd be more readable if it'd have the surrounding parentheses because I didn't pick up on the fact that this is a short if statement.

  2. +++ b/search_api_db/src/Plugin/search_api/backend/Database.php
    @@ -1858,11 +1877,18 @@ protected function createKeysQuery($keys, array $fields, array $all_fields, Inde
    +          if ($mul_words) {
    +              $db_query->having('COUNT(DISTINCT t.word) >= ' . $var, array($var => $subs));
    +          }
    

    Too much indentation here, I think.

  3. +++ b/search_api_db/tests/src/Kernel/BackendTest.php
    @@ -239,13 +239,13 @@ protected function disableHtmlFilter() {
    +   * @param array|null $fields
    

    Can we make this annotation more specific? Is string[]|null correct?

drunken monkey’s picture

Status: Needs work » Needs review
Attachment: new, 1.89 KB

I can't help but nitpick at least a little bit.

I would have expected nothing less. ;)
Should all be fixed with the attached updated patch.

I think this'd be more readable if it'd have the surrounding parentheses because I didn't pick up on the fact that this is a short if statement.

Not really a "short if statement", just an assignment of a boolean value. But sure, we can add parentheses.

drunken monkey’s picture

Drupal.org didn't want my patch …

borisson_’s picture

Status: Needs review » Reviewed & tested by the community

Looks great!

nyariv’s picture

Tested #108, seems to work well.

  • drunken monkey committed f290966 on 8.x-1.x
    Follow-up to #1299238 by drunken monkey: Fixed partial matching with...

drunken monkey’s picture

Great, thanks for reviewing and testing!
Committed.

drunken monkey’s picture

Project: Search API » Search API Database Search
Version: 8.x-1.x-dev » 7.x-1.x-dev
Component: Database backend » Code
Status: Reviewed & tested by the community » Patch (to be ported)
Issue tags: -Needs backport to D7

ricdeters’s picture

I'm new to Drupal. What is required to get this functionality in Drupal 7?

drunken monkey’s picture

The functionality is already present in Drupal 7; you just need to enable it in the search server settings (admin/config/search/search_api/server/SERVER_ID/edit). What still needs to be ported is a small bug fix for a problem that this option can cause in some situations.

If you are a developer and want to port the patch, start by adding the assertions from the patch to D7's SearchApiDbTest::searchSuccessPartial() and see if they pass. If not, try to apply the same fix as in the D8 patch (the code is largely the same or similar).

mikemadison’s picture

I know this is mostly about D8 at this point, but a quick throwback all the way to #6 for D7...

We are using Search API Solr and were running into incomplete partial matching on a Views-based search page. It took me a while to understand why this wasn't working, as my schema.xml file already contained the suggested code above...

<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25" />

However, the filter wasn't being applied to a text field. Once I added the filter class to the text field definition in schema.xml, reloaded the Solr core, and then re-indexed, it worked great.
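
To make concrete what that filter does: an edge n-gram filter indexes only prefixes of each token. Below is a rough Python sketch of the idea, using the minGramSize and maxGramSize values from the snippet above. The function names are hypothetical and this is not Solr's implementation. It also suggests why this setup gives prefix matching ("barfoo" finds "barfoobar") rather than the infix matching asked for at the top of the thread; a full n-gram filter such as solr.NGramFilterFactory would presumably be the closer analogue to LIKE '%foo%'.

```python
def edge_ngrams(token: str, min_gram: int = 2, max_gram: int = 25) -> list[str]:
    """Prefixes of `token` between min_gram and max_gram characters long."""
    upper = min(max_gram, len(token))
    return [token[:size] for size in range(min_gram, upper + 1)]


def all_ngrams(token: str, min_gram: int = 2, max_gram: int = 25) -> list[str]:
    """Every substring of `token` between min_gram and max_gram characters long."""
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size > len(token):
                break  # longer grams from this start would run past the end
            grams.append(token[start:start + size])
    return grams


edge = edge_ngrams("barfoobar")
full = all_ngrams("barfoobar")

print(edge)
print("barfoo" in edge, "foo" in edge, "foo" in full)
```

With prefixes only, a search for "foo" has no matching indexed gram, while the full n-gram list does contain it, at the cost of a noticeably larger index.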

nikolay borisov’s picture