Last updated April 9, 2011.

For Drupal 8, we've been discussing overhauling the Search module so that it has a flexible plugin architecture with reference plugin implementations. A possible goal is to make it a toolkit (e.g., providing a facet UI, possibly a listener for entity updates, etc.). Previous discussions and reference information:

This page collects the current ideas for the requirements, architecture, and implementation of this module. Feel free to edit!


Overarching Ideas

  1. Core Search module is a minimal framework, with a reference/default implementation provided, plus a toolkit to enable more richly featured search back-ends to share the same presentation code and other generic code.
  2. Modularized framework - separate pieces/plugins for:
    • Indexing
    • Facet selection and display
    • Search filters (date range etc.) -- are these facets or something different?
    • Searching/querying (with facets)
    • Display/UI (modularized into blocks)/URL
    • Logging of indexing and searches
    • Results ranking (depending on the back end)
    • Pre-processing at index time and search time (depending on the back end, see below)
    • Deciding what to index and when. The "things" to search might not be nodes or even entities -- they could be Views pages, for instance, which might need to be reindexed on a schedule. Or the search index might have nothing to do with Drupal at all, so nothing from the site gets added to it. [As a note, the Search API module only works with entities.]
    • Deciding what text to index for each indexed item (e.g., render in the theme, call node_view(), convert a PDF to text, etc.)

    Given this, the reference implementation would probably be several modules.

  3. Ideally, the ability to mix together different "things" in searches.
  4. Pluggable preprocessing (for things like stemming, dealing with punctuation, etc.). This should probably be a filter chain, in which any preprocessing module/filter can also provide a way to highlight found words in the search results (because if the search terms and the search index have been preprocessed, the search terms will not necessarily be exact matches to the text in the result). However, some back ends will want to do their own preprocessing.
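
To make the filter-chain idea in item 4 concrete, here is a minimal, language-neutral sketch written in Python rather than Drupal PHP. Every name in it (FilterChain, lowercase, strip_punctuation, naive_stem) is hypothetical and invented for illustration; nothing here is an existing or agreed-upon API, and the stemmer is a toy. The point it illustrates is that highlighting has to run each word of the result text through the same chain as the search terms, because the preprocessed terms are not necessarily exact matches to the displayed text.

```python
import re
from typing import Callable, List

# Hypothetical filter type: maps one raw token to its normalized form.
TokenFilter = Callable[[str], str]


def lowercase(token: str) -> str:
    return token.lower()


def strip_punctuation(token: str) -> str:
    # Drop everything that is not a word character.
    return re.sub(r"[^\w]", "", token)


def naive_stem(token: str) -> str:
    # Toy stemmer for illustration only; a real plugin would use a proper algorithm.
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


class FilterChain:
    """Applies preprocessing filters in order (assumed design, not an agreed API)."""

    def __init__(self, filters: List[TokenFilter]) -> None:
        self.filters = filters

    def process(self, token: str) -> str:
        for apply_filter in self.filters:
            token = apply_filter(token)
        return token

    def highlight(self, text: str, query: str) -> str:
        # Preprocess the search terms once, then compare each word of the
        # result text after running it through the same chain.
        wanted = {self.process(term) for term in query.split()}
        wanted.discard("")

        def mark(match: "re.Match[str]") -> str:
            word = match.group(0)
            if self.process(word) in wanted:
                return "<strong>" + word + "</strong>"
            return word

        return re.sub(r"\w+", mark, text)


chain = FilterChain([lowercase, strip_punctuation, naive_stem])
print(chain.highlight("Drupal searches nodes.", "searching"))
```

A back end that does its own preprocessing could be handed an empty chain, FilterChain([]), which passes tokens through unchanged.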

Implementation Notes

  1. Want a plugin architecture rather than using hooks, because:
    • Can use or not use plugins from each module
    • Plugins can inherit from each other
    • For now, use newer OO plugin style in CTools; eventually: Butler?
  2. What to include in the reference core plugins:
    • Search nodes and users -- and other entities?
    • Can we use an external PHP library such as Zend Lucene that we don't have to maintain, and is it appropriate to do so?
    • Base classes for all plugins that are independent of Drupal core
    • Subclasses of these base classes to make the Drupal reference implementation
    • Facets: Basic UI in core. Reference implementation: user vs. node, node type, and author?
    • Search - simple keyword search, not Boolean, but make sure a plugin could do Boolean search; the basic implementation also needs phrase search
    • User interface - display should highlight found keywords
  3. Support for different types of filters - plain text entry, selects, auto-complete, sliders, etc.
  4. Ability for site builder to define multiple search "environments" with different plugins (back end, display, etc.) at different URLs
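
As a rough illustration of items 1, 2, and 4 above (plugins that inherit from each other, core-independent base classes, and multiple search "environments" at different URLs), here is a small sketch in Python rather than Drupal PHP. All class names (SearchBackend, InMemoryBackend, PhraseBackend, SearchEnvironment, EnvironmentRegistry) and the paths are hypothetical, made up for this example; they are not CTools, Search API, or core APIs, and the matching logic is deliberately simplistic.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


class SearchBackend(ABC):
    """Hypothetical core-independent base class for back-end plugins."""

    @abstractmethod
    def index(self, item_id: str, text: str) -> None:
        ...

    @abstractmethod
    def search(self, keywords: str) -> List[str]:
        ...


class InMemoryBackend(SearchBackend):
    """Toy reference back end: simple (non-Boolean) all-keywords matching."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def index(self, item_id: str, text: str) -> None:
        self._docs[item_id] = text

    def search(self, keywords: str) -> List[str]:
        terms = keywords.lower().split()
        return sorted(
            item_id
            for item_id, text in self._docs.items()
            if all(term in text.lower().split() for term in terms)
        )


class PhraseBackend(InMemoryBackend):
    """Inherits indexing; adds naive phrase search for "quoted" input."""

    def search(self, keywords: str) -> List[str]:
        if len(keywords) >= 2 and keywords.startswith('"') and keywords.endswith('"'):
            phrase = keywords[1:-1].lower()
            # Plain substring match: good enough for a sketch, not for real use.
            return sorted(
                item_id
                for item_id, text in self._docs.items()
                if phrase in text.lower()
            )
        return super().search(keywords)


@dataclass
class SearchEnvironment:
    """One site-builder-defined environment: a path plus its chosen plugins."""
    path: str
    backend: SearchBackend


class EnvironmentRegistry:
    def __init__(self) -> None:
        self._by_path: Dict[str, SearchEnvironment] = {}

    def add(self, env: SearchEnvironment) -> None:
        self._by_path[env.path] = env

    def search(self, path: str, keywords: str) -> List[str]:
        return self._by_path[path].backend.search(keywords)


registry = EnvironmentRegistry()

content = PhraseBackend()
content.index("node/1", "Drupal search module overhaul")
content.index("node/2", "module search for Drupal")
registry.add(SearchEnvironment("search/content", content))

users = InMemoryBackend()
users.index("user/1", "alice")
registry.add(SearchEnvironment("search/users", users))

print(registry.search("search/content", "search module"))
print(registry.search("search/content", '"search module"'))
```

Here each environment only carries a back end; display, facet, and filter plugins could presumably be attached to an environment in the same way.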