Using docValues should be the default for all string fields for performance reasons. But there is an important edge case:

String fields use SORTED_SET docValues, so multiple identical entries are collapsed into a single value and the values come back sorted. Thus if I insert the values 4, 5, 2, 4, 1, I get back 1, 2, 4, 5 when docValues are enabled.

If you need to preserve the order and duplicate entries, consider storing the values as zm_* (twice). To do so, select the "solr_string_storage" custom field type.
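
The collapsing behaviour described above can be illustrated with a small Python sketch. This is not Solr or Lucene code; the function name `sorted_set_view` is made up for illustration, and it only mimics the observable effect (deduplicate, then sort) on a list of string values.

```python
def sorted_set_view(values):
    """Mimic what a SORTED_SET docValues field returns for a multi-valued
    string field: duplicates collapse to one entry and the remaining
    values come back in sorted order, not in insertion order."""
    return sorted(set(values))


# The values from the summary, in the order they were inserted.
inserted = ["4", "5", "2", "4", "1"]
returned = sorted_set_view(inserted)
print(returned)  # ['1', '2', '4', '5']
```

Because the values are strings, the sort is lexical, so a value such as "10" would sort before "2".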

Comment  File           Size    Author
#2       3024554.patch  1.6 KB  mkalkbrenner

Comments

mkalkbrenner created an issue. See original summary.

mkalkbrenner commented:

Status: Active » Needs review

  • mkalkbrenner committed 2676d84 on 8.x-3.x
    Issue #3024554 by mkalkbrenner: Solr 7.x: use docValues="true" for sm_*
    
mkalkbrenner commented:

Status: Needs review » Fixed

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Jancy Christopher commented:

Hi mkalkbrenner,

As mentioned above:

"If you need to preserve the order and duplicate entries, consider storing the values as zm_* (twice). To do so, select the "solr_string_storage" custom field type."

Can you explain how to select the solr_string_storage custom field type?

Thanks for your feedback.

Jancy Christopher commented:

Hi mkalkbrenner,

I have upgraded my Search API Solr module from 8.x-2.0-alpha2 to 8.x-3.3.
I can see there is a difference in how the best bet fields are indexed.

The following fields were added to the index:
field exclude - data type boolean
field query_text - data type string

With 8.x-2.0-alpha2, the best bet values were indexed as below.

"response":{"numFound":1,"start":0,"docs":[
      {
        "bm_exclude":[false,
          false,
          false,
          false,
          false,
          true],
        "sm_query_text":["partners",
          "partner",
          "clients",
          "investors",
          "ecosystem",
          "excludeprop"]}]}

Now the bm_exclude values no longer preserve duplicates or order, and the sm_query_text values come back sorted.
The document is displayed as below.

"response":{"numFound":1,"start":0,"docs":[
      {
        "bm_exclude":[false,
          true],
        "id":"8cdkf4-cgi_index-entity:node/57362:en",
        "sm_query_text":["clients",
          "ecosystem",
          "excludeprop",
          "investors",
          "partner",
          "partners"]}]}

Any suggestions on the above would be appreciated. Thanks.

mkalkbrenner commented:

You must not use docValues in this case. You can modify the field type using the YAML file or, even better, provide a dedicated type for it. The best solution would be for that module to provide its own dedicated Solr field type.

Jancy Christopher commented:

@mkalkbrenner, thanks for your feedback.

To fix this, I removed docValues for the boolean field in the schema.xml file.
For the string field, I changed the field type to storage-only (the solr_string_storage custom field type option available in search_api_solr).

This preserves both duplicates and order.
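
For anyone checking their own schema.xml for the same problem, here is a rough Python sketch that lists the multi-valued fields with docValues enabled, since those are the ones where duplicates and order can get lost. The schema snippet is a made-up example, not the actual file generated by search_api_solr, and the function name `multivalued_docvalues_fields` is hypothetical. It only looks at attributes set directly on `field` and `dynamicField` elements, so a docValues setting inherited from the field type would not be detected.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema excerpt for illustration only; the real schema.xml
# generated by search_api_solr may define these fields differently.
SCHEMA_SNIPPET = """
<schema name="example" version="1.6">
  <dynamicField name="bs_*" type="boolean" indexed="true" multiValued="false" docValues="true"/>
  <dynamicField name="bm_*" type="boolean" indexed="true" multiValued="true" docValues="true"/>
  <dynamicField name="sm_*" type="string" indexed="true" multiValued="true" docValues="true"/>
  <dynamicField name="zm_*" type="string" indexed="false" stored="true" multiValued="true" docValues="false"/>
</schema>
"""


def multivalued_docvalues_fields(schema_xml):
    """Return names of fields that are multi-valued AND have docValues
    enabled directly on the field definition."""
    root = ET.fromstring(schema_xml)
    names = []
    for tag in ("field", "dynamicField"):
        for node in root.iter(tag):
            if node.get("multiValued") == "true" and node.get("docValues") == "true":
                names.append(node.get("name"))
    return names


at_risk = multivalued_docvalues_fields(SCHEMA_SNIPPET)
print(at_risk)  # ['bm_*', 'sm_*']
```

Keep in mind that after changing docValues on a field, the existing content needs to be reindexed before the change is visible in search results.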