Last updated March 7, 2014. Created by drunken monkey on June 1, 2013.
Edited by ressa.

The Solr configuration files packaged with this module are structured to make customizing as easy as possible. The “core files” with the base configuration for the Solr server are schema.xml and solrconfig.xml. These should never be edited directly, as they will have to be updated whenever a future version of the Search API Solr search module changes them (though this shouldn't happen too often).

The other files, however, only contain some default settings or only documentation, to help you customize your Solr server. These files will only rarely change, and when they do it should either be unnecessary to update your copies, or trivial to do so. Therefore, you can fill and edit them with custom settings specific to your site's needs. For the format of these files and what you can do with them, see the documentation comments included in them, or the official Solr wiki. The three *_extra*.xml files are included into schema.xml and solrconfig.xml when they are read, thus allowing you to easily add settings to them.

Remember: After changing any configuration, you will always have to restart your Solr server for the changes to take effect!

A few examples for possible customizations follow.

Changing the Solr type of a field

The schema.xml file contains several alternatives for most data types that aren't used by default. For example, for fulltext fields there are text (the default), text_ws, text_und and edge_n2_kw_text; for (long) integers, there are long (used by default), slong and tlong.

If you want to use such a type for one of your indexed fields, it's pretty easy: you first have to find out the internal name Solr uses for the field. This will be the internal Search API field name (can be seen, e.g., in the source code of the index's Fields tab) prefixed by a few letters and an underscore. For example, the Solr field name for the body text of a node is tm_body:value.
Then just put the following inside of schema_extra_fields.xml:

<fields>
   <field name="FIELD" type="TYPE" indexed="true" stored="true" multiValued="(true|false)" />
</fields>

For the right multiValued (and perhaps other) settings, it's easiest to look inside the schema.xml file for the <dynamicField> declaration with the prefix matching your field, and copy all its settings except for name and type.

So, for example, to change the Solr type of the node's body text to text_ws, use:

<fields>
   <field name="tm_body:value" type="text_ws" indexed="true" stored="true" multiValued="true" termVectors="true" />
</fields>
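
The same pattern applies to non-fulltext fields. As a rough sketch, assuming an indexed integer field whose Solr name is is_comment_count, and assuming the matching is_* <dynamicField> in your schema.xml is single-valued, switching it to the tlong type might look like this. The attribute values are assumptions for illustration; copy the real ones from your schema.xml as described above.

```xml
<fields>
   <!-- Attribute values are assumed; copy them from the is_* dynamicField in schema.xml. -->
   <field name="is_comment_count" type="tlong" indexed="true" stored="true" multiValued="false" />
</fields>
```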

Sadly, due to restrictions of Solr itself, replacing the Solr type used for a certain Search API data type altogether is not possible. If you want to do that, you will have to change it manually for all fields of that type. Dynamic fields can help at least a little, though: e.g., if you want to replace the type for the fields is_comment_count and is_category, you can just use <dynamicField name="is_c*" type="TYPE" … /> (provided there is no other field with that prefix which you don't want to change, which will always be the case when changing a type completely).

Changing the language of a fulltext field

By default, all text fields in Solr will use English stemming. If you want to use stemming for a different language (or make other modifications), you'll have to create a new type with these settings and then configure the relevant fields to be indexed with this type. (How the latter is done was already explained above: just add field definitions for some or all fields with the tm_* prefix, using your newly added custom type.)

For adding the custom text type, just copy the definition of the text type in schema.xml to schema_extra_types.xml. The type definition is the block starting with <fieldType name="text" and ending with the next </fieldType> (about 54 lines in total). Then edit the copy in schema_extra_types.xml to your liking.
First, change the identifier (in name="text" right at the beginning) to some other, not already used one – e.g., text_fr for French text. (An example for German is already included in the schema_extra_types.xml file – just remove the comment to use it.) You can use any identifier you like, though, so iwflksxf is also fine.

Then replace the two occurrences of "English" in the definition with the language of your choice – see this Solr wiki page for a list of supported languages.
If you want to use several languages at once on this Solr server, and therefore can't just fill synonyms.txt, protwords.txt, etc., with settings for your language, you can also set new, language-specific files for these default files here. Just replace the respective file names in the definition.

To add more than one type, just copy one or more additional type definitions after the closing </fieldType> of the first one.

Finally, just add the <field> definitions using the new type(s) to schema_extra_fields.xml as described above. Remember to change type="text_ws" to type="text_de", or to whatever identifier you used in the name attribute of your type in schema_extra_types.xml.
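
To illustrate the steps in this section, here is a heavily abbreviated sketch of what a French text type could look like. It is not the module's actual text definition (which is much longer and should be your real starting point). The identifier text_fr and the file name protwords_fr.txt are assumptions chosen for this example.

```xml
<!-- Goes inside the existing root element of schema_extra_types.xml. -->
<fieldType name="text_fr" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <!-- First occurrence of the language: changed from "English" to "French". -->
    <filter class="solr.SnowballPorterFilterFactory" language="French" protected="protwords_fr.txt" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <!-- Second occurrence of the language: changed as well. -->
    <filter class="solr.SnowballPorterFilterFactory" language="French" protected="protwords_fr.txt" />
  </analyzer>
</fieldType>
```

A field would then pick up the new type through an entry in schema_extra_fields.xml, for example <field name="tm_body:value" type="text_fr" … /> with the remaining attributes copied from the tm_* <dynamicField> in schema.xml.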

Creating a text type for partial matching

(For actually using that type for your fields, again, see above.)

By default, the Solr search module doesn't support partial (or substring) matching. E.g., when searching for "break", items containing "breakpoint" (or "unbreakable") aren't found. This default was selected since it returns more reliable results that don't just contain the search keys by accident, and since it will perform better for larger data sets. Also, stemming already takes care of some of these queries (see also Solr's notes about stemming).
However, on many sites users will expect partial matches to be returned. Luckily, Solr already comes equipped with text analysis tools to easily implement this for your server: the solr.NGramFilterFactory and the solr.EdgeNGramFilterFactory filters. The difference is that, with the latter, only partial matches at the beginning (or, optionally, at the end) of words will be found, while the former will find all substrings contained in a word. Which of these you want to use depends on your specific use case / site. The procedure is nearly identical in both cases, though:

First, copy a text type definition to schema_extra_types.xml and change the identifier, as described above.
Then, add the following line to the type definition after the first occurrence of "solr.SnowballPorterFilterFactory" (inside of the <analyzer type="index"> element; not after the second occurrence):
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25" />

If you want partial matches inside of words to be found, too, simply remove the "Edge" part from that line. In this case, you should also remove both occurrences of the solr.WordDelimiterFilterFactory filter: for each one, remove everything from the <filter that precedes that string to the "greater than" sign (>) coming after it.

Now, after also adding the field definitions, re-starting your Solr server and re-indexing your content, partial matches should be found with searches on your site.
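
As a sketch of where the new filter ends up, the following abbreviated type shows the n-gram filter placed after the stemmer in the index analyzer only. The real text definition contains more filters than shown here, and the identifier text_partial is an assumption made up for this example.

```xml
<!-- Goes inside the existing root element of schema_extra_types.xml. -->
<fieldType name="text_partial" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt" />
    <!-- Added line: indexes the leading 2 to 25 characters of every token. -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <!-- No n-gram filter here: search keys are matched as typed. -->
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt" />
  </analyzer>
</fieldType>
```

Keeping the n-gram filter out of the query analyzer means a search for "break" is compared against the indexed prefixes of "breakpoint" rather than being split into fragments itself.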

Adding a new Search API data type

This requires a bit of custom code in addition to configuration changes, but it can make custom additions a lot more user-friendly and easier to use. With Search API Solr Search, you can easily add new data types to the Search API index's "Fields" form. That way, you can make all of the above changes by simply selecting the data type for the field on the "Fields" form as usual. Everything is displayed right in the admin UI, which also makes it easier to remember which fields have custom changes made. Also, you don't need a field's Solr identifier to make changes to its type.

To add the new type, first add it in Solr (as described above) and also add a dynamic field for it. (If it is a non-fulltext type, instead create two dynamic fields: a single-valued one whose prefix ends in s_* and a multi-valued one whose prefix ends in m_*.) Then, create a custom module (or use an existing one) and implement hook_search_api_data_type_info() (documented in the Search API's search_api.api.php file) with the extensions documented with search_api_solr_hook_search_api_data_type_info() (in this module's search_api_solr.api.php file).

For example, if you have created two new types, super_integer for integers and super_text for fulltext, you could use the following for the dynamic fields:

<dynamicField name="superi_s_*"  type="super_integer"  indexed="true"  stored="true" multiValued="false" />
<dynamicField name="superi_m_*"  type="super_integer"  indexed="true"  stored="true" multiValued="true" />
<dynamicField name="supert_*"  type="super_text"    indexed="true"  stored="true" multiValued="true" termVectors="true" />

Then your hook implementation should look like this (with MODULE being your custom module's name):

<?php
function MODULE_search_api_data_type_info() {
  return array(
    // You can use any identifier you want here, but it makes sense to use the
    // field type name from schema.xml.
    'super_text' => array(
      'name' => t('Super fulltext'),
      'fallback' => 'text',
      // Dynamic field "supert_*".
      'prefix' => 'supert',
      // Fulltext types are always multi-valued.
      'always multiValued' => TRUE,
    ),
    'super_integer' => array(
      'name' => t('Super integer'),
      'fallback' => 'integer',
      // Dynamic fields with name="superi_s_*" and name="superi_m_*".
      'prefix' => 'superi_',
    ),
  );
}
?>

Using the correct Lucene version

Starting with Solr 3.x, it is possible (and mandatory) to specify the version of Lucene your Solr server should use. Since the module developers cannot know what version of Solr their users will be running, the default config files contain defaults for Solr 3.5 or Solr 4.0 (depending on config version), which will also work for all later versions (of the same major version, i.e., 3 or 4).

However, for best performance, the latest bug fixes, etc., you should definitely use the latest version available to your server, which will be the version of Solr itself. This setting can easily be changed in the solrcore.properties file provided with the config files. Just change the value after the equals sign in the line that starts with solr.luceneMatchVersion=. The format to use is as follows: first LUCENE_, then the major and minor version number you want to specify, without anything in between. So, for example, if you are using a Solr 4.2 server, the line in solrcore.properties should look like this:
solr.luceneMatchVersion=LUCENE_42
Never use versions higher than that of your Solr server, as Solr will then refuse to start.

Caution: You should also keep in mind that for some minor version updates, the format of config files can change. This is especially the case for Solr 3.6. This means that you cannot use versions of 3.6 or later for this setting and still use the default config files provided with this module. That's also why the default setting for the 3.x configs is 3.5: it is the latest version that will work with the provided 3.x config files.
If you are using Solr 3.6 or higher (but still 3.x), you should either leave the setting unchanged at LUCENE_35; or try to upgrade to Solr 4.x; or, if you are an advanced Solr user, use the correct Solr version and adapt the config files accordingly.
