This module implements the Porter stemming algorithm to improve English-language searching with the Drupal built-in Search module.

The process of stemming reduces each word in the search index to its basic root or stem (e.g. 'blogging' to 'blog') so that variations on a word ('blogs', 'blogger', 'blogging', 'blog') are considered equivalent when searching. This generally results in more relevant search results.
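
As a rough illustration of the idea, here is a minimal sketch of suffix stripping in Python. It is not the Porter algorithm and not this module's code: the `toy_stem` function and its `SUFFIXES` rule list are hypothetical names invented for this example, and real stemmers apply many more rules and conditions.

```python
# Toy suffix stripper for illustration only; NOT the Porter algorithm.
# SUFFIXES and toy_stem are hypothetical names made up for this sketch.
SUFFIXES = ("ing", "ed", "s")  # checked in this order


def toy_stem(word: str) -> str:
    word = word.lower()
    for suffix in SUFFIXES:
        # Strip the suffix only if at least three letters would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Undo a doubled final consonant, e.g. "blogg" -> "blog".
            if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in "aeioulsz":
                stem = stem[:-1]
            return stem
    return word


words = ["blog", "blogs", "blogging"]
stems = {w: toy_stem(w) for w in words}
print(stems)  # every variant maps to the same stem, "blog"
```

Because all three variants reduce to the same stem, a search index keyed on stems would treat them as the same term.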

This module implements Version 2 of the Porter stemming algorithm, which is the version that Porter currently recommends using for live applications.

Note that although the Porter stemming algorithm is not specific to American English, some British spellings will not be fully stemmed. Most notably, -ise word endings are not stemmed as well as -ize, due to technical issues in the algorithm.

Installation and usage

This module only needs to be enabled for it to work with Drupal search. There is no configuration necessary.

After installing and enabling this module (in the usual way), you will need to rebuild the search index. To do this:

  1. Visit Administer > Site configuration > Search settings, and click on "Re-index site".
  2. Ensure that cron has run enough times that the Search settings page shows the site as 100% indexed. You can run cron manually by visiting Reports > Status report and clicking on the "Run cron manually" link.

Limitations and Notes

  • The Porter stemming algorithm has a few parts that work better with American English than British English, so some British spellings will not be stemmed correctly. It is also definitely English-specific, and non-English content will not be stemmed correctly.
  • The core Search module in Drupal 7.x does not provide a way for a stemming module (such as Porter Stemmer) to know the language of content or search terms during searching or search indexing. So, if you have a multilingual Drupal 7 site and enable the Porter Stemmer module, it will unfortunately try to apply its stemming algorithm to all the content on your site, regardless of language. See this issue for details: #363336: Add option for Porter-stemmer to only stem English or language neutral content for a multi-language site. Drupal 8 does not have this problem.
  • The Porter stemming algorithm attempts to reduce words to their linguistic root words -- it does not do general substring matching. So, for instance, it should make "walk", "walking", "walked", and "walks" all match in searching, but it will not make "walking" a match for "king".
  • In Drupal 7.x, there is an issue with excerpts in Porter Stemmer (see: #437084: Excerpt fails to find stemmed keyword). For example, if a page contains the word "walking" and someone searches for "walk", that page will be included in the search results, but the search excerpt will not display the portion of text containing "walking" (it will probably just display the first paragraph of text on that page). This should not be a problem in Drupal 8.x.
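
The difference between stem matching and substring matching noted above can be sketched with a small stem-keyed index in Python. This is only an illustration under stated assumptions: `toy_stem`, `build_index`, `search`, and the sample documents are hypothetical, and the stemmer is a toy suffix stripper rather than the Porter algorithm or Drupal's search code.

```python
from collections import defaultdict


def toy_stem(word: str) -> str:
    # Hypothetical toy rule set for illustration; NOT the Porter algorithm.
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        # Strip the suffix only if at least three letters would remain,
        # so "king" is left alone while "walking" becomes "walk".
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    # Map each stem to the set of document ids containing a word with that stem.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[toy_stem(token)].add(doc_id)
    return index


def search(index, query):
    # Stem the query the same way the indexed words were stemmed.
    return index.get(toy_stem(query), set())


docs = {
    1: "She was walking home",
    2: "The king walked alone",
    3: "He walks daily",
}
index = build_index(docs)

print(sorted(search(index, "walk")))  # [1, 2, 3]
print(sorted(search(index, "king")))  # [2]

# Plain substring matching, by contrast, also hits "walking" in document 1.
print([doc_id for doc_id, text in docs.items() if "king" in text.lower()])  # [1, 2]
```

A query for "walk" finds all three documents because the indexed words share a stem, while a query for "king" finds only the document that actually contains that word.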

Maintainers

The Porter Stemmer module is currently co-maintained by jhodgdon and mark_fullmer. If you have questions or comments about this module, please communicate with the maintainers by posting an issue (see issues area in sidebar of this page). That way, others can benefit from the answers as well.
