Common pitfalls

Last updated on 30 July 2022

Drupal 7 will no longer be supported after January 5, 2025.

This page lists the pitfalls most commonly encountered by new users, so that fewer people fall into them in the future.

Disable other search modules
Creating fulltext searches in Views
Creating a search block with Views
Avoid showing all results when no keywords are entered
Views relationships to multi-valued fields
Problems with non-ASCII characters
Changes in related entities don't lead to re-indexing
Having "Index items immediately" disabled can lead to leaks of confidential data
Unpublished content showing in Search API results
Indexing of broken references
Different processor settings for fulltext fields
The “Content access” processor doesn't work for some custom access mechanisms
Problems with fields re-used across content types/bundles with different settings

Disable other search modules

While Drupal core provides its own "Search" module, the "Search API" suite of modules is completely independent of that. Therefore, if you are using the Search API for searches on your site, you should uninstall all unrelated search modules (unless you're sure you need them), especially Drupal core's "Search" module and the Apache Solr Search module (which you might have installed).

Keeping the "Search" module enabled will harm performance, since indexing for that module will still occur even though you are not using it anymore. (You might also get inferior search pages from it still accessible to users on your site.)

Enabling two independent Apache Solr modules on one site can lead to data loss, compatibility problems and decreased performance (since items are indexed twice).

Creating fulltext searches in Views

When you want to use Views to create a search page with a fulltext search, only use the "Search: Fulltext search" filter (or contextual filter) for the keywords input! Specifically for the filter, also make sure that "Use as" is switched to "Search keys", not to "Search filter".

"Search keywords" are special in the Search API, compared to normal filters, in that they are parsed into separate words (unless you are using the "Single term" parse mode) that will all be searched separately. Normal filters, even on fulltext fields (= fields indexed as type "Fulltext"), will search for entered phrases as a whole, as if the keywords were put in quotes. Furthermore, only proper keywords will influence the relevance of results, if you are using this mechanism for sorting – filters won't do that.

So, even if you only want fulltext searches on a single field – if you want "normal" fulltext search behavior, use the "Search: Fulltext search" filter!
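
Outside of Views, the same distinction shows up when building a query in code. The following is only a sketch, not the module's recommended usage: it assumes $query is a Search API query object (for instance, one created with search_api_query() for an index of your own), and the function names and the "title" field are hypothetical.

```php
<?php
/**
 * Sketch: sets search keys, which are parsed into words and affect relevance.
 *
 * $query is assumed to be a Search API query object; "title" is a
 * hypothetical fulltext field on the index.
 */
function example_keywords_search($query, $keywords) {
  // Keys are split into separate words (depending on the parse mode).
  $query->keys($keywords);
  // Optionally restrict the fulltext search to specific fulltext fields.
  $query->fields(array('title'));
  return $query;
}

/**
 * Sketch: adds a normal filter, which matches the entered phrase as a whole.
 */
function example_phrase_filter($query, $phrase) {
  // A condition on a fulltext field does not influence relevance.
  $query->condition('title', $phrase);
  return $query;
}
```

The first function corresponds to the "Search: Fulltext search" filter, the second to an ordinary filter on a fulltext field.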

Creating a search block with Views

Setting up a search block that redirects users to your search view is not simple if you're not a Views expert, especially if you don't want all exposed filters present in this block. However, unless you want some preprocessing done on the form (most notably, adding autocompletion), you can easily circumvent this by creating a custom block and putting the HTML for the form in there (make sure to set the text format to "Full HTML" or something similar):

<form action="/path/to/view" method="get">
  <label for="search-keywords">Search</label>
  <input id="search-keywords" maxlength="128" name="search_api_views_fulltext" size="20" type="text" value="" />
  <input class="form-submit" name="" type="submit" value="Search" />
</form>

(If you've changed the parameter key used for the "Search: Fulltext search" filter, change "search_api_views_fulltext" accordingly.)

Avoid showing all results when no keywords are entered

Just check "Required" in the settings of the "Search: Fulltext search" exposed filter and the view will be blank before keywords are entered. Alternately, set the "Exposed form style" to "Input required".

Views relationships to multi-valued fields

In general, all fields and relationships in the Views integration are provided by the Entity API. Therefore, see that module's handbook for details, and also report problems with this part in the Entity API issue queue.

One problem or restriction commonly encountered is that, due to internal technical limitations, it is not possible to correctly use a relationship to a multi-valued property in Views. If you add such a relationship to your view, only the first value of the field will be used.

To work around this, you could try the Views Field View module, using the raw field values as arguments for the nested view. However, this might have a severe performance impact for larger views or datasets, as each row in the original view will execute an additional view.

The other way of working around this problem is to use custom code on your site and add all needed properties from the relationship to the entity itself using hook_entity_property_info_alter().
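
A minimal sketch of that second workaround follows. It assumes a hypothetical multi-valued Entity reference field "field_related" on nodes that points to other nodes; the property name "related_titles" and the getter function name are likewise made up for illustration. Replace MODULE with your module's name and adapt the field handling to your setup.

```php
<?php
/**
 * Implements hook_entity_property_info_alter().
 *
 * Adds the titles of all referenced nodes as a property of the node itself.
 * "field_related" and "related_titles" are hypothetical names.
 */
function MODULE_entity_property_info_alter(&$info) {
  $info['node']['properties']['related_titles'] = array(
    'label' => t('Titles of related content'),
    'type' => 'list<text>',
    'getter callback' => 'MODULE_get_related_titles',
    'computed' => TRUE,
  );
}

/**
 * Getter callback: collects the titles of all nodes in "field_related".
 *
 * Assumes an Entity reference field, whose items store "target_id".
 */
function MODULE_get_related_titles($node, array $options, $name, $type, $info) {
  $titles = array();
  $items = field_get_items('node', $node, 'field_related');
  if ($items) {
    $nids = array();
    foreach ($items as $item) {
      $nids[] = $item['target_id'];
    }
    foreach (node_load_multiple($nids) as $related) {
      $titles[] = $related->title;
    }
  }
  return $titles;
}
```

After clearing caches, the new property should be selectable like any other property of the node, without needing a relationship.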

Problems with non-ASCII characters

These problems are highly backend-specific, as the Search API itself doesn't specify or implement any constraints or special treatments of characters. (However, if you have the Transliteration module enabled, a Transliteration processor will become available which can help alleviate most problems, regardless of service class. Any data returned from the search server, like facets, would then be in transliterated form, too, though.)

For the Database search service class, #1144620: Fix character collation problems contains a discussion on that topic. In the issue's course, a patch was also committed to the module which should solve the problem on MySQL servers. For other types of SQL servers, patches are still needed, however.

For the Solr search, Solr should already treat non-ASCII characters properly. However, English stemming is applied by default, which might lead to wrong results for other languages. See the Solr module's documentation for how to fix this.

Changes in related entities don't lead to re-indexing

Through the use of the "Add related entities" form on an index's Fields tab, it is possible to index the fields of entities (or other structures) related to your indexed items. For example, you could index the names of taxonomy terms contained in a node's taxonomy reference field. Or – e.g., for access control – the user roles of the node's author.

However, when you now change the name of a taxonomy term (or the roles of a user), you'll notice that the nodes that reference that term (or user) aren't marked as "dirty" and, subsequently, re-indexed. As a result, the fields containing related data become stale. Sadly, this is very hard to solve in the Search API, so a solution to this problem could still take a while. (See #2007692: Changes in related entities don't lead to proper re-indexing for a discussion of this issue.)

There are a few custom workarounds available which you can use for your site:

Probably the easiest and most comfortable to implement would be to use the Rules integration of the Search API to automatically re-index (or mark as "dirty") items when their related entities change. (The rules to create for this of course depend on your specific setup.)
If you are (or employ) a developer: Use custom code to do the same. In hook_X_update(), just call search_api_track_item_change() with the appropriate (indexed) item type and IDs. (See below for an example.)
If such changes occur only very rarely, and if the site is rather small and only maintained by you, you can also just manually re-save all affected items if such a change occurs.

An example of doing this in custom code (when you have the names of related taxonomy terms indexed for a node index) follows:

/**
 * Implements hook_taxonomy_term_update().
 */
function MODULE_taxonomy_term_update($term) {
  if ($term->name !== $term->original->name) {
    search_api_track_item_change('node', taxonomy_select_nodes($term->tid, FALSE));
  }
}

Placing the above into a module file (and replacing MODULE with the module's identifier) will automatically mark nodes which reference a term as "dirty" when the term's name changes.
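
The same pattern might be applied to the other case mentioned above, where the roles of a node's author are indexed. The following is a sketch under assumptions: it supposes that both $account->roles and $account->original->roles are keyed by role ID, and the helper function name is hypothetical. A false positive in the comparison would only cause some unnecessary re-indexing.

```php
<?php
/**
 * Helper (hypothetical name): checks whether the account's role IDs changed.
 */
function MODULE_user_roles_changed($account) {
  $new = array_keys($account->roles);
  $old = array_keys($account->original->roles);
  // Sort both lists so that a different order doesn't count as a change.
  sort($new);
  sort($old);
  return $new !== $old;
}

/**
 * Implements hook_user_update().
 */
function MODULE_user_update(&$edit, $account, $category) {
  if (!MODULE_user_roles_changed($account)) {
    return;
  }
  // Find all nodes authored by this user.
  $nids = db_select('node', 'n')
    ->fields('n', array('nid'))
    ->condition('n.uid', $account->uid)
    ->execute()
    ->fetchCol();
  if ($nids) {
    // Mark those nodes as "dirty" so they are re-indexed.
    search_api_track_item_change('node', $nids);
  }
}
```

For users with a very large number of nodes, it may be preferable to pass the IDs on in smaller batches.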

Having "Index items immediately" disabled can lead to leaks of confidential data

The "Node access" data alteration, which automatically filters out node results that the current user shouldn't be able to access, works with the indexed state of the entity. The same is true for manually set filters (e.g., in Views on the "Published" field) and most other access control mechanisms.
However, if the index's "Index items immediately" setting is disabled, changed items will (usually) not be indexed until the next cron run, which means the data in the index will be outdated until then. Since the data of the results shown to the user usually comes from the database, not from the search index, data which the user shouldn't see might be displayed to them in search results. However, this will be the case only for very specific setups, in which all of the following hold:

The item must have been accessible previously and only later become inaccessible.
When the item becomes inaccessible, some data must be added that end users shouldn't see. (Otherwise, only data they could see before anyway will be shown to them.)
The "secret" data must be in a field that will be displayed in the search results (or could end up in an excerpt shown with the results).

If this setup applies to your site, it is very much recommended that you enable the "Index items immediately" option for the index in question. (Using Rules to immediately index items only if such a change occurs is also possible, if the load on the server would otherwise be too high. However, keep in mind that Solr's commit behavior might prevent this from working as expected.)

If you are using Solr, enabling the server's "Retrieve result data from Solr" option might also be a way to prevent this from happening, since the search will then show the old data while the new one isn't indexed, not the one with the confidential content added. This will not work for data in Field API fields shown in Views results (due to restrictions imposed by the Field API) – so this is only an option if you aren't using Views, or if the confidential data won't be in a Field API field.

Unpublished content showing in Search API results

Edit your node index (found under /admin/config/search/search_api/index/) and check the "Filters" tab.
Normally you want "Exclude unpublished nodes" to be on. If it's off, you may see unwanted content showing up in the search results (though access to the full page will still be restricted).
After changing the filter settings, you need to trigger a re-index.

Indexing of broken references

Due to the Core bug #1281114: Database records not deleted for Term Reference Fields after Term is Deleted, references to taxonomy terms aren't removed from, e.g., nodes when the referenced term is deleted. As a result, the Search API will sometimes index term references pointing to a nonexistent taxonomy term. This means, for example, that facets listing those terms will just display the term ID, not the term label.

To resolve this, either help fix the Core bug or use a module like Field reference delete for fixing this problem in contrib.

Different processor settings for fulltext fields

The Search API currently doesn't support separate processing for fulltext search keywords based on the searched field. Therefore, enabling processors like "Ignore case" or "Stopwords" for only some fulltext fields will usually not work as intended: while only the values of the selected fields are processed during indexing, the search keywords for all fields are processed by the processor when searching.

There is currently no proper solution for this problem, so it is advised that you always enable or disable such processors for all fulltext fields.
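
To illustrate the mismatch, here is a toy model in plain PHP. It is not Search API code: the function names are hypothetical, and lowercasing stands in for an "Ignore case" processor that is enabled for the "title" field only.

```php
<?php
/**
 * Toy indexing step: lowercases only the fields the processor is enabled for.
 */
function example_index_item(array $item, array $lowercased_fields) {
  $indexed = array();
  foreach ($item as $field => $value) {
    // Index-time processing only touches the selected fields.
    $indexed[$field] = in_array($field, $lowercased_fields, TRUE) ? strtolower($value) : $value;
  }
  return $indexed;
}

/**
 * Toy search step: returns the names of the fields containing the keyword.
 */
function example_search(array $indexed, $keyword, $processor_enabled) {
  // Search-time processing is applied to the keyword once, for all fields.
  if ($processor_enabled) {
    $keyword = strtolower($keyword);
  }
  $matches = array();
  foreach ($indexed as $field => $value) {
    if (in_array($keyword, explode(' ', $value), TRUE)) {
      $matches[] = $field;
    }
  }
  return $matches;
}

$item = array('title' => 'Drupal Search', 'body' => 'Drupal rocks');
$indexed = example_index_item($item, array('title'));
$matches = example_search($indexed, 'Drupal', TRUE);
// $matches contains only "title": the keyword was lowercased for both
// fields, but "body" still holds "Drupal" with a capital D.
```

In this model, the "body" field can no longer be matched by the keyword at all, which is why applying such processors to all fulltext fields (or to none) is the safer choice.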

The “Content access” processor doesn't work for some custom access mechanisms

While Drupal provides a pretty flexible node access system out-of-the-box, it is unfortunately not completely generic, especially when using a completely separate implementation (in this case: in the Search API). Therefore, though the “Content access” processor tries its best to account for all grants and access records in the node access system, this is unfortunately not enough to support all custom node access solutions/modules.
One popular module that we cannot support, for instance, is the view_unpublished module.

We have unfortunately not found any generic solution for this problem yet, if one even exists. If you want correct node access with a custom node access module for which the “Content access” processor doesn't work, you'll need to write your own variant of that processor.
For a more detailed discussion, see #1617794: Make "Node access" compatible with additional contrib modules and the issues linked from there.

However, when the processor doesn't completely work, it should currently always err on the side of caution. We haven't received any complaints so far about users seeing content that they shouldn't have access to.

Problems with fields re-used across content types/bundles with different settings

See the description in #2863551-6: Custom text fields disappear from the indexable fields: in the example of “Long text” fields, setting different values for “Text processing” in different bundles results in the field’s sub-properties not being available for indexing on the “Fields” tab. There is no known solution at the moment – you are advised to not reuse the field in such a case, if integration with Search API is important.
