Problem/Motivation

In the SOLR schema.xml you can define dynamic field types for different purposes to enable things such as filters etc. This allows you to do more with your searches, such as using EdgeNGramFilterFactory to enable partial matching in fulltext searches. However, there are currently limitations in Search API Solr (or perharps Search API itself) that prevent you from being able to do this.

Proposed resolution

The ideal would be able to set up a dynamic field type in the schema.xml for Solr and then be able to select this as an option in the index (ie an alternative to fulltext, string etc). You can define these types by implementing hook_search_api_data_type_info(), including providing a fallback for servers that support the implementation. However, it may be better to provide a hook in Search API Solr that allows you to define the additional bits of information required for Solr (such as prefixes, support etc) which Solr then declares in it's own implementation of hook_search_api_data_type_info().

So the way I imagine it working would be:

  1. Developer adds the new field type and dynamic field to their schema.xml with the settings that they want. For example:
    <dynamicField name="te_*" type="edge_n2_kw_text" termVectors="true" />
    Will set any fields named "te_*" to use the Edge N Gram definition that is already in the schema.xml.
  2. Developer then (in a custom module) implements something like:
    /**
     * Implements hook_search_api_solr_dynamic_field_info().
     */
    function mymodule_search_api_solr_dynamic_field_info() {
      return array(
        'edge_n2_kw_text' => array(
          // Definitions for hook_search_api_data_type_info().
          'name' => t('Fulltext (with partial matching)'),
          'fallback' => 'text',
          // Additional definitions for Search API Solr.
          'prefix' => 'te',
          'always multiValued' => TRUE,
        ),
      );
    }
    

The issues

  1. To use a custom data type with a fallback, the server needs to declare that is supports the data type. SearchApiIndex::index() does this by calling the server's implementation of SearchApiAbstractService::supportsFeature() with the feature being called 'search_api_data_type_' . $custom_type (search_api/includes/index_entity.inc line 430). To allow custom dynamic data types, Search API Solr would need to return TRUE for these custom types, which it currently doesn't.
  2. In SearchApiSolrService::getFieldNames() we get hold of the type using $field['type'] (search_api_solr/service.inc line 347). However, custom fields that provide a fallback store the fallback in 'type' but the real type in 'real_type'. In SearchApiIndex::index() they use the following code to decide what type to use:
          if (isset($info['real_type'])) {
            $custom_type = search_api_extract_inner_type($info['real_type']);
            if ($this->server()->supportsFeature('search_api_data_type_' . $custom_type)) {
              $fields[$field]['type'] = $info['real_type'];
              ...
            }
          }
    

    I think SearchApiSolrService::getFieldNames() needs to do the same process so that it knows what type it should be using.

  3. SearchApiSolrService::getFieldNames() then needs to figure out the naming convention for the field. First off it gets hold of the prefix. This list of prefixes is hard coded as part of SearchApiSolrService, so any dynamic fields would need to be able to be added in. After that, it then works out whether the field is a single or multiple:
            if ($pref != 't') {
              $pref .= $type == $inner_type ? 's' : 'm';
            }
    

    This hard coding then means it's impossible to declare other text based field types, as they would have a different prefix.

I'm up for doing most of the leg work, but would definitely appreciate some reviewing of it to make sure I've not missed anything obvious or if there is a better solution to the problem!

Comments

andrewbelcher’s picture

Status: Active » Needs review
StatusFileSize
new5.85 KB

Ok, here is a patch that works smoothly for me!

What I've done

  • Added a new hook, hook_search_api_solr_dynamic_field_info() (and hook_search_api_solr_dynamic_field_info_alter()) to define how Search API Solr should index different Search API Data Types.
  • Added an info getter (search_api_solr_get_dynamic_field_info()) for retrieving the information.
  • Updated service.inc to use the definitions from the new hook rather than the previously hard-coded variables.
  • Implemented hook_search_api_solr_dynamic_field_info_alter() to give the definitions for the types that were previously hard-coded.

How to use it

Here is an example of how to set up an alternative fulltext using the EdgeNGramFilter as already defined in the schema.xml Search API Solr ships with:

mymodule.module

/**
 * Implements hook_search_api_data_type_info().
 *
 * Declare our new type to Search API so it can be selected for a field.
 */
function cns_solr_search_api_data_type_info() {
  return array(
    'edge_n2_kw_text' => array(
      'name' => t('Fulltext (partial)'),
      'fallback' => 'text',
    ),
  );
  return $types;
}

/**
 * Implements hook_search_api_solr_dynamic_field_info().
 *
 * Tell Search API Solr how to index our new data type.
 */
function cns_solr_search_api_solr_dynamic_field_info() {
  return array(
    'edge_n2_kw_text' => array(
      'prefix' => 'tem',
      'always multiValued' => TRUE,
    ),
  );
}
bachbach’s picture

Hi Andrew,
patch #1 works for me against 1.0-rc3, thanks a lot for that, i use it to setup an autocomplete field based on solr, this solution is far better than this one

andrewbelcher’s picture

I've fixed some of the field type definitions that were missing the prefix key and re-rolled it against the latest dev version. Have also attached a patch for rc3.

drunken monkey’s picture

StatusFileSize
new6.12 KB

Thanks for the feature request and the patch! The feature really does make sense, there currently doesn't seem to be an easy way to do this. (I hope you know, though, that for custom web sites you can just make the specific field use a certain type, instead of using a dynamic field. So it's still easily possible to do this, it's just admittedly far less comfortable than when you can add the field type and later just pick the fields to use it in the UI.)
Your patch also looks great, I rarely get patches this good … Sorry it took me so long to get back to you!

Anyways, I'd still like to discuss this a bit before committing:

  1. Shouldn't we spare users from implementing two hooks to add such a type? We could either implement hook_search_api_data_type_info() ourselves in this module and there just pass on the information from the new hook_search_api_solr_dynamic_field_info(); or we could let users specify the additional, Solr-specific information right in hook_search_api_data_type_info() – including additional keys isn't forbidden (we should probably just prefix them accordingly), other modules would just ignore them but we would look for them and automatically add support for these types.
  2. In both cases it would be necessary to move the specification of the types we support by default outside of a hook implementation, which I think we should do in any case. We could still let people alter them.
  3. Another idea: we could only let people alter them (i.e., our type-prefix mappings). Wouldn't that be enough? We'd then only have to change supportsFeature() to return TRUE for all types that have a mapping.
    Of course, even though functionally probably equivalent it would be a bit less comfortable, having to both add the data type and alter the prefix mappings. Still, in the current patch there's two hooks to implement, too.
  4. Do we need the always multiValued flag? People could just specify both (e.g.) tes_* and tem_* as multi-valued fields. If we document this properly (we should add a short explanation on how to add dynamic fields anyways), that's probably still more comfortable than having a special case for that flag. Well, or not. However, with the previous idea of only allowing altering of the prefix mappings, having the flag wouldn't be possible anyways.

OK, that's about all of my thoughts so far. Sorry it's a bit much, writing stuff down helps me think. ;)
Generally, I like low-impact solutions, but making this as comfortable as possible for developers is of course important, too. Given that, I guess one of the two variants in item 1 would be the way to go.
What do you think?

andrewbelcher’s picture

drunken monkey, no worries about the time! Thanks for your feedback!

  1. I did initially have them all as part of the existing hook... I can't remember what my reasoning was, I'll see if I've got my history of how the patch got to where it is now and see if I can spot why I did that. I'm wondering if the best place would be to 'define' the additional keys for hook_search_api_data_type_info() and then use hook_search_api_data_type_info_alter() to add our info to the core Search API types...
  2. If we're bringing it all into hook_search_api_data_type_info()/hook_search_api_data_type_info_alter(), is there any reason to move the bits we provide out of a hook? We can just implement it in the same way we'd recommend others to?
  3. Not quite sure what you mean here. I'm not 100% sure I understand everything about the supportsFeature() method, so my inclination here would be for me to update the patch with the above and then let you tell me if I've done the right thing with supportsFeature() :P
  4. Erm... possibly not. I think the reason for it was that what was previously there:
    @@ -456,8 +442,9 @@ class SearchApiSolrService extends SearchApiAbstractService {
             }
     
             $inner_type = search_api_extract_inner_type($type);
    -        $pref = isset(self::$type_prefixes[$inner_type]) ? self::$type_prefixes[$inner_type] : '';
    -        if ($pref != 'tm') {
    +        $type_info = search_api_solr_get_dynamic_field_info($inner_type);
    +        $pref = isset($type_info['prefix']) ? $type_info['prefix']: '';
    +        if (empty($type_info['always multiValued'])) {
               $pref .= $type == $inner_type ? 's' : 'm';
             }
             $name = $pref . '_' . $key;

    So it was already doing a manual check for tm before deciding whether to treat things as multi valued or not. Obviously leaving it hard-coded for only tm doesn't make sense, so I brought it into a parameter. But if everything can be single/multi valued, then that check could just go and we can ignore the always mulitValued stuff. Again, you probably know better than me why it was forcing tm to be multi valued, so I'll leave that one to your call...

Things my end are manic atm, so may be a couple weeks before I've got the time to take another look at this, but it will happen!

drunken monkey’s picture

Status: Needs review » Needs work

I'm wondering if the best place would be to 'define' the additional keys for hook_search_api_data_type_info() and then use hook_search_api_data_type_info_alter() to add our info to the core Search API types...

You're right, good idea!
It feels a bit hack-ish, but I guess it could really be the best alternative.

If we're bringing it all into hook_search_api_data_type_info()/hook_search_api_data_type_info_alter(), is there any reason to move the bits we provide out of a hook? We can just implement it in the same way we'd recommend others to?

You're right, I guess there isn't, and it might make things clearer for others.

Not quite sure what you mean here. I'm not 100% sure I understand everything about the supportsFeature() method, so my inclination here would be for me to update the patch with the above and then let you tell me if I've done the right thing with supportsFeature() :P

OK. ;)

Erm... possibly not. I think the reason for it was that what was previously there:
[[CODE]]
So it was already doing a manual check for tm before deciding whether to treat things as multi valued or not. Obviously leaving it hard-coded for only tm doesn't make sense, so I brought it into a parameter. But if everything can be single/multi valued, then that check could just go and we can ignore the always mulitValued stuff. Again, you probably know better than me why it was forcing tm to be multi valued, so I'll leave that one to your call...

The problem here is two-fold: first, in the Search API, fulltext data can be passed to the service class already tokenized. Also, Solr will tokenize the data anyways, so having it as single-valued (and using implode() to force tokenized data to be single-valued) doesn't give us any advantage. Therefore, this service class always uses a multi-valued field for fulltext data.
However, since we're using common configurations, we can't just define ts_* to be multi-valued, too. Therefore, we just always force the field to have the tm_* prefix, no matter whether the Search API field is single-valued.

However, for custom-added fields there really is no reason why we shouldn't let users define both XXs_* and XXm_* as multi-valued (as long as we document this properly), and it saves us from having to have this additional key and changing all of these checks.

However, as I write this up I'm becoming less and less certain this is the right way, maybe having the key will be more comfortable and/or less error-prone, or maybe there is a hidden assumption somewhere I didn't take into consideration.
I guess we'll just have to see how well this works in the implementation, and re-evaluate whether we want to have the multiValued key after all.
But for now, I think we should try to do this without it and see how well it works.

(Oh, in any case it would be great if you could add this use case to the developer documentation once this patch is committed.)

smithmilner’s picture

This looks pretty interesting. What kind of performance hits might there be for integrating EdgeNGramFilterFactory into regular text fields?

danielnolde’s picture

StatusFileSize
new6.18 KB

Rerolled patch #4 against current dev to make use of it. Works well for me so let's think about whether committing this is better than further discussion. My 2cents: Commit this, it works as intended, solves a problem well, looks clean in terms of code and architecture (to be as sapi noob at least).

drunken monkey’s picture

Status: Needs work » Needs review
andrewbelcher’s picture

Sorry I've not got round to re-working as per the suggestions, things have been absolutely manic! I should have more space in the next couple weeks, so I will try my best to get an update patch up!

Alternatively if you want to commit it create some follow up issues and I can pick them up.

drunken monkey’s picture

Status: Needs review » Needs work

Absolutely no problem, happens to me all the time, too. ;)

Follow-up issues are rarely a good idea, especially outside of core, so I'll just leave this for now until you have the time to work on it. Please don't feel pressured, though, lack of time is very common among Drupal developers. ;)

drclaw’s picture

Hi All,

I thought I might take a stab at it as I need this functionality as well. I've attached a patch that I think addresses previous discussions:

  1. We're using hook_search_api_data_type_info_alter() now instead of a custom hook to add our additional data type info to default data types (as well as location, geohash and tokens -- more on this at the end)
  2. We're also now using search_api_get_data_type_info() in our service class getFieldNames() method to get field info
  3. SupportsFeature has been modified to include any defined custom data types
  4. I left the always multivalued key in there, but I lowercased it to keep it drupal-esque

There was one thing I wasn't sure about. When I first started working on this, I added our extra data type info for the tokens, location and geohash data types. However, this was causing them to show up in the type list on the fields admin page in search api. The seemed incorrect to me since I couldn't choose those before as a type. So, I changed it to only add the extra data if the type had already been defined, presumably by another module (e.g. https://drupal.org/project/search_api_location for location fields)... Is this correct?

Thanks!
drclaw

drclaw’s picture

Status: Needs work » Needs review

Oops, needs review. =)

nick_vh’s picture

@@ -113,6 +113,95 @@ function search_api_solr_search_api_server_update(SearchApiServer $server) {
+      'always multivalued' => FALSE,

I'd change this to cardinality, having a big string that says always multiple is a bit of redundant information because if it's set in code it will always be always ;-)

Also, this is more in line with how drupal defines its structure of telling is a field is singular or multiple (cardinality). See https://api.drupal.org/api/drupal/modules!field!field.module/group/field/7

nick_vh’s picture

Status: Needs review » Needs work
drclaw’s picture

I don't think Cardinality is the right terminology in this case. We're not determining the number of elements of the field. We just want to know if it is or isn't a multivalue field...

drunken monkey’s picture

Status: Needs work » Needs review

I agree with drclaw, we don't want to know the exact number of items. And "multivalued" would convey the wrong impression that FALSE means that the field will always be single-valued. "always multivalued" is really a bit long, but I can't think of a better option.

Sorry for not having had the time yet to do a review on this, I have way too many open issues and way too little time. I'll get back to you, hopefully soon!

nick_vh’s picture

So, in Drupal 6 we used the terminology of "multiple", in Drupal 7 this changed to "cardinality". In the apachesolr module we use this field from Drupal to define if a field is single of multivalued. And each multivalued field we also index the first value in a single valued field in order to allow dynamic switch backs if these fields would ever change.

Now, I think it's a good idea to use the existing structure in Drupal to define if a field is single or multiple, certainly in a module that is so Drupal/Entity agnostic like Search API.

If not "cardinality", you should probably use "multiple" and omit the "always" as always is not clear enough.

drclaw’s picture

Yep, I see your point. I don't really like that "always" in there either. However, the terminology we're trying to match here is Solr's, not Drupal's. I think I made a mistake by lower casing the term "multiValued" because that is the how it's spelled in the dynamicField tag of Solr's schema.xml (I've updated the patch). Also, the purpose of the flag is not to decide if a field is single or multiple. The purpose is to indicate whether or not SearchAPI Solr should automatically detect whether or not the field is multiValued, or if it should just always consider it multiValued.

I'm not opposed to changing the phrasing of the flag, but I want to make sure it conveys the correct information.

nick_vh’s picture

ok I got it.
So you are trying to define which fields you have available in Solr and tell Search API a specific field is not available in single or multiValued type.

In order to do so, we can probably have the property single and multiple, default both of them to true when loading the definitions and then in the hook you can set single to FALSE or multiple to FALSE depending on the field definition?

That sounds like a more understandable data structure.

drunken monkey’s picture

Status: Needs review » Needs work

So you are trying to define which fields you have available in Solr and tell Search API a specific field is not available in single or multiValued type.

It's only half of that: either it's available for both single and multiple values, or it's always multi-valued (or multiValued). This just comes from my initial decision to make text fields always multi-valued (as you might recall, I previously only had the t_* prefix for fulltext) – to avoid larger changes in the code, we have to be able to define whether new dynamic fields should behave like fulltext fields – i.e., always be multiValued in Solr. It doesn't say anything about whether the original data is single- or multi-valued.

I now know that this initial decision was probably the wrong one, but now we're stuck with it and have to make do. You are welcome to get rid of this mess in D8. ;)

As for the patch, I now reviewed it. See my comments below. Functionally, it seems to work fine, at least as far as the existing functionality is concerned. But I don't see why it shouldn't work for new dynamic fields, too.

  1. +++ b/includes/service.inc
    @@ -298,9 +277,16 @@ class SearchApiSolrService extends SearchApiAbstractService {
    +    // Custom data types.
    +    foreach (search_api_get_data_type_info() as $type => $info) {
    +      $supported[] = 'search_api_data_type_' . $type;
    +    }
    

    You can't do it this way – Solr doesn't automatically support all custom data types, and there are no search_api_data_type_* features for the built-in types (although that of course shouldn't lead to any errors).

    This is needed to support the new, "dynamic fields" data types, right? In that case, I'd probably define the static ones as before, then check against those and return TRUE for a match. And only if there is none and the feature starts with the right prefix, get the data type information – and only return TRUE if the one in question has the Solr prefix defined.

  2. +++ b/search_api_solr.module
    @@ -113,6 +113,95 @@ function search_api_solr_search_api_server_update(SearchApiServer $server) {
    + *
    + * @see hook_search_api_data_type_info()
    

    Since you reference thatabove already, I don't think the additional @see is necessary.

  3. +++ b/search_api_solr.module
    @@ -113,6 +113,95 @@ function search_api_solr_search_api_server_update(SearchApiServer $server) {
    +      'name' => t('Fulltext'),
    +      'prefix' => 'tm',
    +      'always multiValued' => TRUE,
    

    I wouldn't override the names here, and also I think for 'always multiValued' == FALSE we can just leave this blank (we have to use empty() everywhere anyways).

  4. +++ b/search_api_solr.module
    @@ -113,6 +113,95 @@ function search_api_solr_search_api_server_update(SearchApiServer $server) {
    +  $extra_types_info = array(
    +    'tokens' => array(
    +      'prefix' => 'tm',
    +      'always multiValued' => TRUE,
    +    ),
    

    tokens is a special case – it's actually defined by the Search API, so it should always be available, but it's no "official" data type (= none that can be selected in the UI). It'S used internally to mark already tokenized fulltext data.

    So, I guess it's necessary to somehow add this type to our Solr-internal type collection, but not to the "official" one via hook_search_api_data_type_info_alter().
    Probably you should wrap all search_api_get_data_type_info() calls (apart from the one in supportsFeature()) in a custom method which adds tokens to the array.

drclaw’s picture

Status: Needs work » Needs review
StatusFileSize
new6.15 KB

Alright, patch round #2 (at least, round 2 from me). I incorporated all your feedback. Hope I understood everything correctly. In doing #4 I realized we don't even need to implement hook_search_api_data_type_info() or hook_search_api_data_type_info_alter(). Just the wrapper function is enough. Because of this, I added an example of a hook_search_api_data_type_info() implementation with our new keys to search_api_solr.api.php. Hope that makes sense!

Thanks!
drclaw

drunken monkey’s picture

StatusFileSize
new7.79 KB

Thanks for revising the patch, great work! It already looks pretty good, I just found a few remaining issues (largely nit-picking, and nothing functional):

  1. +++ b/includes/service.inc
    @@ -290,6 +268,7 @@ class SearchApiSolrService extends SearchApiAbstractService {
    +    // Search API features.
    

    Comments should be full sentences ending with a period.

  2. +++ b/includes/service.inc
    @@ -301,7 +280,22 @@ class SearchApiSolrService extends SearchApiAbstractService {
    +    else {
    +      // Check custom data types for custom dynamic fields.
    +      foreach (search_api_get_data_type_info() as $type => $info) {
    +        // Only consider it supported if the "prefix" key is in the info
    +        if (array_key_exists('prefix', $info) && $feature == 'search_api_data_type_' . $type) {
    +          return TRUE;
    +        }
    +      }
    +    }
    

    First, you don't need the else if you return in the if.
    Second, this code is too complicated: you only need to check a single key – just check it, no need to loop. Also, you only need to check this at all if $feature has the right prefix.

    Looping through all the types would only be useful if you then (statically) cache that information (the first time you need it) so further lookups for any data types will go faster. However, since search_api_get_data_type_info() is already statically cached, there probably wouldn't be much performance to gain here.

  3. +++ b/includes/service.inc
    @@ -495,8 +489,10 @@ class SearchApiSolrService extends SearchApiAbstractService {
    +        // dpm($type_info);
    

    This should probably not be committed. ;)

  4. +++ b/search_api_solr.api.php
    @@ -103,3 +103,26 @@ function hook_search_api_solr_multi_query_alter(array &$call_args, SearchApiMult
    +
    +/**
    + * Implements hook_search_api_data_type_info()
    + *
    + * This is an example of how you might define a custom dynamic field in solr.
    + */
    +function example_search_search_api_data_type_info() {
    

    Great that you included this example!
    However, while I'm not completely sure whether there's a standard or best practice for this, I'd use the documentation method the Entity API uses with entity_crud_hook_entity_info().
    That way, primarily, we can properly document how the additional keys should look.

  5. +++ b/search_api_solr.module
    @@ -113,6 +113,87 @@ function search_api_solr_search_api_server_update(SearchApiServer $server) {
     /**
    + * Returns the search api data type info for both the default search api data
    + * types and custom data types defined by hook_search_api_data_type_info().
    + *
    

    The first sentence of a documentation block should be on a single line of at most 80 characters.

  6. +++ b/search_api_solr.module
    @@ -113,6 +113,87 @@ function search_api_solr_search_api_server_update(SearchApiServer $server) {
    + */
    +function search_api_solr_get_data_type_info($type = NULL) {
    +
    +  // Grab the stock search_api data types
    +  $infos = search_api_get_data_type_info();
    

    This is called often enough that we should definitely statically cache it.

I already included all of these in the attached patch, so as far as I'm concerned this would be good to go. Please test/review!

drclaw’s picture

Status: Needs review » Reviewed & tested by the community

Thanks for the notes (and re-roll). All good points. Your docblock explanations are much better than mine too. =P

Tested it out and it works a treat! Am I being too ambitious by setting it RTBC?

drunken monkey’s picture

Issue summary: View changes
Status: Reviewed & tested by the community » Fixed

No, you aren't. Thanks for reviewing, and always good when people then actually set the issue to RTBC!
So, committed. Thanks again for your work, everyone, especially andrewbelcher for the initial idea!

pol’s picture

Hi all,

I'm looking for a bit of help on this.
I'm looking for a way to use the 'edge_n2_kw_text' type for the title of a node, so a user can find a substring in the Title of a node easily.

To do this, according to this commit, I have to implement two hooks:

/**
 * Implements hook_search_api_data_type_info().
 *
 * Declare our new type to Search API so it can be selected for a field.
 */
function mymodule_search_api_data_type_info() {
  return array(
    'edge_n2_kw_text' => array(
      'name' => t('Fulltext (partial)'),
      'fallback' => 'text',
    ),
  );
  return $types;
}

/**
 * Implements hook_search_api_solr_dynamic_field_info().
 *
 * Tell Search API Solr how to index our new data type.
 */
function mymodule_search_api_solr_dynamic_field_info() {
  return array(
    'edge_n2_kw_text' => array(
      'prefix' => 'tem',
      'always multiValued' => TRUE,
    ),
  );
}

Then, in the fields of my index, I select "Fulltext (partial)" for the Title field. Right ?

I did it, but still, the 'label' of my node is still using the type 'text'.

What am I doing wrong ?

drunken monkey’s picture

The label field in Solr, if your are talking about that, isn't used at all by the Search API Solr module, and it's also not possible to change its type (since that's hardcoded). tem_title should be the Solr field containing your node titles.

pol’s picture

Hi,

Thanks for your insights, I did more research and I'm still not able to get it working.
When I select "Fulltext (partial)" for the Title field, it should use in Solr the tem_title type.

Using that code in #26, the Title field is still using the tm_title type.

If I select the "String" type for the Title field, the schema browser in Solr finds it under ss_title, which is right.

In the search_api_solr.api.php, there's also an example on how to use it, I copied it and I'm able to see the "Fulltext (w/ partial matching)" option in the select box, when I use it for the title field, I'm unable to find it in the schema browser of Solr. It's always using tm_title.

function mymodule_search_api_data_type_info() {
  return array(
    'edge_n2_kw_text' => array(
      'name' => t('Fulltext (w/ partial matching)'),
      'fallback' => 'text',
      'prefix' => 'te',
      'always multiValued' => TRUE,
    ),
  );
}

I guess I'm doing something wrong, but I don't know what I'm missing.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

drclaw’s picture

@Pol,

I think maybe we made a mistake in the documentation for using the 'edge_n2_kw_text' field that comes preconfigured in the Schema.xml provided by search_api_solr. When a field is configured as "always multiValued", you need to provide the full dynamicField name prefix as it's defined in Schema.xml. For the edge_n2_kw_text field, there are two dynamicFields defined. One that is multivalued and one that isn't:

<dynamicField name="tes_*" type="edge_n2_kw_text" indexed="true" stored="true" multiValued="false" omitTermFreqAndPositions="true" />
<dynamicField name="tem_*" type="edge_n2_kw_text" indexed="true" stored="true" multiValued="true" omitTermFreqAndPositions="true" />

Since the field is always multiValued, the 'm' and 's' part isn't automatically added to the prefix for you so you have to do it yourself:

function mymodule_search_api_data_type_info() {
  return array(
    'edge_n2_kw_text' => array(
      'name' => t('Fulltext (w/ partial matching)'),
      'fallback' => 'text',
      'prefix' => 'tem',
      'always multiValued' => TRUE,
    ),
  );
}

Our example needs to be update to include the above "m" in the prefix. Also it's possible that the hook documentation itself could go into more detail of how the "m" and "s" letters are automatically added for you if you don't specify a field as always multiValued. Maybe @DrunkenMonkey has some thoughts?

pol’s picture

Oh cool, that 'm' missing makes my day :)

Indeed, I think the documentation should be updated as well :)

Thanks for the heads up drclaw, it's fantastic, it works pretty good !!!

sgdev’s picture

@drunken monkey, I fixed the documentation issue with a patch over 2 years ago, and @mkalkbrenner continues to close it and recommend that I contact you if it needs to be added.

Please include it in the 7.x branch. Thank you.

https://www.drupal.org/project/search_api_solr/issues/3004407