Last updated February 25, 2014. Created by drunken monkey on May 20, 2013.

This page documents special features you can use with Solr servers in the Search API, as well as some common pitfalls when using Solr.

Peculiarities / Specifics
Solr's commit behavior
- Problems with maxWarmingSearchers
Multiple sites with a single Solr server
Non-English and multilingual sites
Supported server features
Hidden variables
Keeping Solr data for dev servers separate
Customizing your Solr server

Peculiarities / Specifics

Because Solr itself handles tokenizing, stemming and other preprocessing tasks, activating processors for these tasks in a search index's settings is usually unnecessary and can even get in the way. When adding an index to a Solr server, you should therefore disable all processors that handle such classic preprocessing tasks. Enabling the HTML filter can be useful, though, as the default config files included in this module don't strip HTML tags.

All Search API datatypes are supported by using appropriate Solr datatypes for indexing them. By default, "String"/"URI" and "Integer"/"Duration" are defined equivalently. However, you can change this as needed by manually editing the config files, which also makes it possible to use your own Solr extensions.

The "direct" parse mode for queries will result in the keys being directly used as the query to Solr (see the Lucene query syntax and Solr's additions for details). This module uses the extended dismax query handler by default, so syntax errors will also be handled gracefully (though, depending on the type of error, no results might then be returned).

Solr's commit behavior

Solr has a concept called "committing" when indexing data. This means that data sent to Solr for indexing will not become available immediately when searching, but only after the next "commit" operation occurs.
By default (with the configuration files provided with this module), Solr will automatically commit indexed items after 120 seconds, or when 10,000 items have been indexed since the last commit. Before that happens, newly indexed items will not show up in searches, even though the index's Status page will already count them as indexed.

To alleviate the negative consequences this might have, there is one workaround implemented in the Solr service class: When you have the "Index items immediately" option enabled for an index, the Solr service class will automatically commit after every indexing operation.
However, this does not work if the option is disabled and you index an item directly through other means, e.g., with Rules.
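
The commit behavior described above can be pictured with a small toy model. This is only an illustrative sketch in Python, not code from the module or from Solr: the class and method names are hypothetical, the two thresholds are the defaults quoted above, and it assumes the 120 seconds are counted from the first uncommitted item.

```python
from dataclasses import dataclass
from typing import Optional

# Default thresholds quoted in the text above.
MAX_DOCS = 10_000
MAX_SECONDS = 120.0


@dataclass
class AutoCommitModel:
    """Hypothetical toy model: pending items are indexed but not yet searchable."""

    pending: int = 0
    searchable: int = 0
    first_pending_at: Optional[float] = None

    def index(self, count: int, now: float) -> None:
        """Send `count` items for indexing at time `now` (in seconds)."""
        if self.pending == 0:
            # Assumption: the timer starts with the first uncommitted item.
            self.first_pending_at = now
        self.pending += count
        if self.pending >= MAX_DOCS:
            self.commit()

    def tick(self, now: float) -> None:
        """Let time pass; commit if the oldest pending item is too old."""
        if self.pending and now - self.first_pending_at >= MAX_SECONDS:
            self.commit()

    def commit(self) -> None:
        """Make all pending items visible to searches."""
        self.searchable += self.pending
        self.pending = 0
        self.first_pending_at = None
```

In this model, calling commit() right after every index() call corresponds to what the "Index items immediately" workaround does.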

Problems with maxWarmingSearchers

Note: This section mainly concerns very early versions of the module (before 7.x-1.0 Beta 4), but could also be relevant for specific problems in newer versions.

Especially when using very early versions of this module, you might come across the following error message when trying to index:
Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again later.
This occurs because, upon committing, Solr has to create new "searchers" (threads for searching) that can search the newly committed state of the index. Before these searchers are used, they are first "warmed" (prepared by executing a few searches) so that subsequent real searches won't take too long.
However, when several commits occur within a very short time, new searchers for later commits can be created while earlier ones are still warming. Solr prevents this from spinning out of control with the maxWarmingSearchers setting: once the number of warming searchers given in that setting is reached, all subsequent commits will fail with an exception until some searchers have finished warming up.
Since module versions before 7.x-1.0 Beta 4 committed very frequently when indexing larger numbers of items, this occurs often with larger sites. As a workaround, you can try to increase the index's cron limit and make sure to deactivate "Index items immediately" (at least for bulk item updates), but the strongly recommended way to deal with this is to upgrade to a newer version of the module, or at least to Beta 4.

When this error occurs regularly on your site with a newer module version, you might want to consider deactivating the "Index items immediately" setting for your index – or using Solr 4 and the latest module version, where this problem is mitigated further.
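
To see why frequent commits trigger this error, the following Python sketch simulates the limit. It is a deliberately simplified model, not Solr's actual implementation: the function name and the fixed warming time are assumptions made for illustration.

```python
def simulate_commits(commit_times, warm_seconds, max_warming=2):
    """Return the commit times that would be rejected.

    Hypothetical model: every accepted commit starts one searcher that
    warms for `warm_seconds`; a commit is rejected while `max_warming`
    searchers are still warming.
    """
    warming_until = []  # finish times of searchers that are still warming
    rejected = []
    for t in sorted(commit_times):
        # Drop searchers that have finished warming by time t.
        warming_until = [end for end in warming_until if end > t]
        if len(warming_until) >= max_warming:
            rejected.append(t)
        else:
            warming_until.append(t + warm_seconds)
    return rejected
```

With a warming time of 10 seconds, commits at seconds 0, 1, 2 and 3 leave the last two rejected, while commits spaced 15 seconds apart all succeed. This is why committing less often (for example, by deactivating "Index items immediately") avoids the error.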

Multiple sites with a single Solr server

Using a single Solr instance for several sites with Search API works pretty well out of the box. However, since the documents in the Solr server are identified by the search index's machine name, you have to make sure that no two sites sharing the server use search indexes with the same machine name. Alternatively, use the search_api_solr_index_prefix variable, if it is available in your module version (see your README.txt).
#1776534: Add support for using a Solr server with multiple sites, once committed, should remove this restriction so that even using the same server for search indexes with the same machine name won't lead to clashes.

If you also want to be able to search items from other sites whose contents are indexed in the same Solr server (a multi-site search, like here on Drupal.org), this is a lot harder to accomplish. There is a sandbox module, Search API Site, that would accomplish this, but it is not in a usable state at the time of writing (Nov 2013). Some tips are in #2034831: Is it possible to search Site A from Site B?, but these are more workarounds than real solutions.

Non-English and multilingual sites

Note: There is now a new project for dealing with multilingual sites with Search API Solr, called Search API Entity Translation Solr search. It's only a sandbox at the time of writing, but you should definitely give it a try before going the much harder route of implementing this on your own.
If the module works for you, you can ignore all the rest of this subsection.

Note 2: Laurens Meurs has posted a great hands-on tutorial about how he got Search API Solr Search working correctly on a multilingual site in #2146077-4: How to get Search API Solr work with i18n content translation. This might also be worth checking out (before or after reading through the following explanations).

The configuration files provided with this module will configure Solr by default to use English stemming for all text fields. If your site has another (single) language, you can either just replace the two occurrences of "English" in the schema.xml file or, if you want to avoid making direct changes to the config files (to make later updates easier), follow the method described on the next page to change the language for all your fulltext fields.

However, for multilingual sites this becomes even harder. First off, the process depends on whether you're using content translation or entity translation. For the latter, you should use the Search API Entity Translation module to transform the indexed items to the same schema as used for content translation (i.e., one item per language), as otherwise there will be virtually no language support. Then, the biggest remaining problem is that, as explained above, all content will by default be stemmed using an English language stemmer.

The easiest way to deal with this is to just completely remove stemming from the Solr configuration – not really ideal, but at least content won't be stemmed in the wrong language (which is worse than no stemming, in most cases). As above, editing the schema.xml is the easiest way, but can lead to problems later, during module updates. If you want to do it properly, there is already the text_und type defined for indexing text without stemming and you'd only have to make your fulltext fields use that type. For that, you can either change the prefix key of Search API's text type (see search_api_solr_hook_search_api_data_type_info()), or add a new type with the same method, or use the method described on the next handbook page for customizing field types.

However, if you want to do it completely right and stem the content of each node using the correct language, this gets a lot harder still and requires writing some custom code. You have to change the indexed Solr documents (using hook_search_api_solr_documents_alter()) and move all fulltext content to use a prefix for a dynamic field you have defined for that language. Then, at query time, you have to use hook_search_api_solr_query_alter() to query all language-specific versions of a field instead of (just) the default one. You can, of course, also restrict it to a certain subset of languages (or a single one), depending on your use case. (If the user has set the site to Spanish, they probably don't want to find Greek content.)
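
The two steps described above (moving fulltext content at index time, expanding the queried fields at query time) can be sketched as follows. This is only a language-neutral illustration in Python; the real work would be done in PHP inside the two hooks. The "t_" default prefix, the "t_<langcode>_" naming scheme and both function names are assumptions made for this example, and the language-specific dynamic fields would have to be defined in your own Solr schema.

```python
DEFAULT_PREFIX = "t_"  # assumed prefix of the default fulltext fields


def language_prefix(langcode):
    """Hypothetical dynamic-field prefix for one language, e.g. 't_es_'."""
    return f"t_{langcode}_"


def move_fulltext_fields(document, langcode):
    """Index time: rename default fulltext fields to language-specific ones."""
    moved = {}
    for name, value in document.items():
        if name.startswith(DEFAULT_PREFIX):
            name = language_prefix(langcode) + name[len(DEFAULT_PREFIX):]
        moved[name] = value
    return moved


def expand_query_fields(fields, langcodes, include_default=True):
    """Query time: search the language-specific versions of each field."""
    expanded = []
    for name in fields:
        if not name.startswith(DEFAULT_PREFIX):
            expanded.append(name)  # not a fulltext field, keep unchanged
            continue
        if include_default:
            expanded.append(name)
        base = name[len(DEFAULT_PREFIX):]
        expanded.extend(language_prefix(lc) + base for lc in langcodes)
    return expanded
```

Passing only the user's current language as langcodes corresponds to restricting the search to a single language, as mentioned above.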

Supported server features

search_api_autocomplete
Introduced by module: search_api_autocomplete
Lets you add autocompletion capabilities to search forms on the site. (See also Hidden variables below for Solr-specific customization.)
search_api_facets
Introduced by module: search_api_facetapi
Allows you to create faceted searches for dynamically filtering search results.
search_api_facets_operator_or
Introduced by module: search_api_facetapi
Allows the creation of OR facets.
search_api_mlt
Introduced by module: search_api_views
Lets you display items that are similar to a given one. Use, e.g., to create a "More like this" block for node pages.
Due to a regression in Solr itself, "More like this" doesn't work with integer, float and date fields in Solr 4 (see #2004596: MLT not working with Solr 4.x). As a workaround, you can index the fields (or copies of them) as string values.
search_api_multi
Introduced by module: search_api_multi
Allows you to search multiple indexes at once, as long as they are on the same server. You can use this to let users simultaneously search all content on the site – nodes, comments, user profiles, etc.
search_api_spellcheck
Introduced by module: search_api_spellcheck
Gives the option to display automatic spellchecking for searches.
search_api_data_type_location
Introduced by module: search_api_location
Lets you index, filter and sort on location fields. Note, however, that only single-valued fields are currently supported for Solr 3.x, and that the option isn't supported at all in Solr 1.4.
search_api_grouping
Introduced by module: search_api_grouping
Lets you group search results based on indexed fields. For further information see the FieldCollapsing documentation.

Hidden variables

search_api_solr_autocomplete_max_occurrences
By default, keywords that occur in more than 90% of results are ignored for autocomplete suggestions. This setting lets you modify that behaviour by providing your own ratio. Use 1 or greater to use all suggestions. Defaults to 0.9.
search_api_solr_index_prefix
By default, the index ID in the Solr server is the same as the index's machine name in Drupal. This setting will let you specify a prefix for the index IDs on this Drupal installation. Only use alphanumeric characters and underscores. Since changing the prefix makes the currently indexed data inaccessible, you should change this variable only when no indexes are currently on any Solr servers.
search_api_solr_index_prefix_INDEX_ID
Same as above, but a per-index prefix. Substitute the index's machine name for "INDEX_ID" in the variable name. Per-index prefixing is done before the global prefix is added, so the global prefix will come first in the final name:
(GLOBAL_PREFIX)(INDEX_PREFIX)(INDEX_ID)
The same rules as above apply for setting the prefix.
search_api_solr_http_get_max_length
The maximum number of bytes that can be handled as an HTTP GET query when the Solr server's HTTP method is set to "AUTO". Typically Solr can handle up to 65355 bytes, but Tomcat and Jetty will error at slightly less than 4096 bytes. The setting therefore defaults to 4000, which should be safe in nearly all cases but will still use "GET" for searching most of the time.
search_api_solr_cron_action
This module can automatically execute some upkeep operations daily during cron runs. This variable determines what particular operation is carried out. The options are:
spellcheck
The "default" spellcheck dictionary used by Solr will be rebuilt so that spellchecking reflects the latest index state.
optimize
An "optimize" operation is executed on the Solr server. As a result of this, all spellcheck dictionaries (that have "buildOnOptimize" set to "true") will be rebuilt, too. (This was the default in version 1.4 and earlier of the module, which didn't have this variable.)
none
No action is executed.

The default is "spellcheck". However, if an unrecognized value is explicitly set, it will be interpreted as "none".

search_api_solr_site_hash
A unique hash specific to the local site, created the first time it is needed. Only change this if you want to display another site's results and you know what you are doing. Old indexed items will be lost when the hash is changed and all items will have to be reindexed. Can only contain alphanumeric characters.
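
The rules for three of the variables above can be summarized in a short Python sketch. It only restates the documented behaviour and is not the module's PHP code: the function names are hypothetical, and the exact handling of a query that is precisely at the length limit is an assumption.

```python
def solr_index_id(index_id, global_prefix="", index_prefix=""):
    """Compose the index ID as (GLOBAL_PREFIX)(INDEX_PREFIX)(INDEX_ID)."""
    return f"{global_prefix}{index_prefix}{index_id}"


def choose_http_method(query_string, max_get_length=4000):
    """Model of the "AUTO" HTTP method: GET for short queries, POST otherwise.

    Assumption: a query of exactly max_get_length bytes still uses GET.
    """
    size = len(query_string.encode("utf-8"))
    return "GET" if size <= max_get_length else "POST"


def resolve_cron_action(value=None):
    """Map the search_api_solr_cron_action value to the action carried out.

    An unset variable falls back to "spellcheck"; an unrecognized value
    that was explicitly set is treated as "none".
    """
    if value is None:
        return "spellcheck"
    return value if value in ("spellcheck", "optimize") else "none"
```

For example, with a global prefix of "dev_" and a per-index prefix of "blog_", an index with the machine name "node_index" would be stored under "dev_blog_node_index".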

Keeping Solr data for dev servers separate

If you are using Features to keep your configuration synced across multiple servers (development, staging, production), it becomes a bit tricky to ensure that the development site won't index content on the production site's Solr server (showing invalid links or Lorem Ipsum content to site visitors). Probably the best way is to use the Search API Override module, which allows you to override server settings in settings.php. That way, you can just use separate Solr instances or cores for each server, and use settings.php to point each to the correct one.

Customizing your Solr server

The configuration files contain extensive comments on how to add additional features or modify behaviour, e.g., for adding a language-specific stemmer or a stopword list. To add your own configuration, please use the schema_extra_fields.xml, schema_extra_types.xml and solrconfig_extra.xml files. Typically, only the schema.xml and solrconfig.xml files of the module will be updated, so making your changes in the *_extra* files allows you to easily update the configs to newer versions of the module.
If you are interested in further customizing your Solr server to your needs, see the Solr wiki for documentation.

When editing the configuration files, please only edit the copies in the Solr configuration directory, not the originals provided with this module. Otherwise, your changes might easily get lost when updating the module.

You'll have to restart your Solr server after making any configuration changes, for them to take effect. This also applies when updating the config files to those of a new release.

See the next section of this handbook for detailed descriptions and examples.
