The Search module in Drupal 7 supported languages only at the node level and indexed only the main (original) language version of a node. This means that if you used field translation (implemented as an API in Drupal core, with a user interface supplied by the contributed entity_translation project), your node translations were never indexed.
In Drupal 8, we introduced indexing for all language variants of a node as separate instances. This means the following API changes were necessary:
- search_index() now takes a language code, and its parameter order has changed. Instead of search_index($sid, $module, $text), the signature is now search_index($type, $sid, $langcode, $text).
- search_reindex() was also changed, and was later removed and split into a few functions. See change notice https://www.drupal.org/node/2326575 for details.
- hook_node_update_index() changed from hook_node_update_index(Drupal\node\Node $node) to hook_node_update_index(Drupal\node\Node $node, $langcode). The language code is provided by the search system, so implementations can (and should) return extra data specific to that language.
Example hook implementation change:
Drupal 7:
function comment_node_update_index($node) {
  // ...
  // Build the comments for this node.
  $build = comment_view_multiple($comments, $node);
  return drupal_render($build);
  // ...
}
Drupal 8:
function comment_node_update_index($node, $langcode) {
  // ...
  // Build the comments for this node only for the needed language.
  $build = comment_view_multiple($comments, $node, $langcode);
  return drupal_render($build);
  // ...
}
The search index may now store multiple entries per node (one per indexed language), so code must no longer assume that there is exactly one entry per node.
Drupal 7 data structure (note only the original node language is indexed, no language is recorded in search index):
search_dataset: (sid: 123, type: node, data: good morning, ...)
search_index: (word: morning, sid: 123, type: node, ...)
Drupal 8 data structure (note multiple entries for the node are indexed for all field/property languages, language is recorded in search index):
search_dataset: (sid: 123, type: node, langcode: en, data: good morning, ...)
search_index: (word: morning, sid: 123, langcode: en, type: node, ...)
search_dataset: (sid: 123, type: node, langcode: de, data: guten morgen, ...)
search_index: (word: morgen, sid: 123, langcode: de, type: node, ...)
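The practical consequence of the structures above can be sketched in plain PHP. This is only an illustration under stated assumptions: the $search_dataset array is an in-memory stand-in for the table rows shown above, and example_rows_for_item() and example_row_for_language() are hypothetical helpers, not part of the Search module API. It shows that looking up an item by type and sid alone can return several rows, while adding the language code identifies a single row.

```php
<?php

// Hypothetical in-memory stand-in for search_dataset rows. The values mirror
// the example rows above; nothing is read from a real database.
$search_dataset = [
  ['sid' => 123, 'type' => 'node', 'langcode' => 'en', 'data' => 'good morning'],
  ['sid' => 123, 'type' => 'node', 'langcode' => 'de', 'data' => 'guten morgen'],
];

/**
 * Returns all rows for an item, one per indexed language (hypothetical helper).
 */
function example_rows_for_item(array $rows, $type, $sid) {
  return array_values(array_filter($rows, function ($row) use ($type, $sid) {
    return $row['type'] === $type && $row['sid'] === $sid;
  }));
}

/**
 * Returns the row for an item in one language, or NULL (hypothetical helper).
 */
function example_row_for_language(array $rows, $type, $sid, $langcode) {
  foreach ($rows as $row) {
    if ($row['type'] === $type && $row['sid'] === $sid && $row['langcode'] === $langcode) {
      return $row;
    }
  }
  return NULL;
}

// Matching on type and sid alone now yields one row per language.
$all = example_rows_for_item($search_dataset, 'node', 123);

// Adding the language code narrows the match to a single row.
$german = example_row_for_language($search_dataset, 'node', 123, 'de');

print count($all) . "\n";
print $german['data'] . "\n";
```

Code that previously treated (type, sid) as a unique key would, under this model, need to either include the language code in the lookup or be prepared to handle several rows for the same node.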