#1669876: Add missing language functionality in search module added the langcode field to search_dataset and search_index but there is no upgrade path.

Also we need to work out how to upgrade the existing entries in search_dataset and search_index - or if we need to? Should we just wipe it and warn the user that they need to rebuild?

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

alexpott’s picture

Issue tags: +D8MI

Tagging

Gábor Hojtsy’s picture

Wow, how could we have missed the upgrade path?! Wow. Thanks for this.

I'm not sure we need to update the existing data, it sounds more useful to reindex the site on the upgrade. Although both search index and search dataset have a sid which theoretically is the nid of the node indexed, it is designed to be used for other things as well. Also, the new feature supports entity translated nodes, and Drupal 7 sites could just as well have lots of non-indexed content that way. So best to initiate a full reindex.

I'm wondering why the fail does not actually fail?

Gábor Hojtsy’s picture

Status: Needs review » Needs work

As for why it does not fail search_reindex() seems to be a bunch of db_delete() calls and if there is no langcode condition it should work on the upgraded db as far as I see. See http://api.drupal.org/api/drupal/core%21modules%21search%21search.module...

catch’s picture

Status: Needs work » Needs review

search_reindex() only deletes from those tables, so it doesn't care about the missing columns. search_reindex() then search_cron() might fail?

Dropping and re-indexing is best yeah.

alexpott’s picture

Gábor Hojtsy’s picture

Status: Needs review » Reviewed & tested by the community

Looks good to me. It deletes all search index data without calling out to possibly fragile API functions for it to be recreated later. Fine for me.

webchick’s picture

Status: Reviewed & tested by the community » Needs review

Hm. I'm not sure about this. This is going to introduce what seems like buggy behaviour on all D7 => D8 upgrades; namely sites will be without proper search results until cron's run a few (dozen, or hundred in the case of sites like Drupal.org) times, right?

Can we not keep the data in the table without a language associated with it and then gradually re-index content with a language? I don't get catch's statement in #4, I guess.

alexpott’s picture

The issue here is that node_search_execute() expects the search tables to have a valid langcode see http://api.drupal.org/api/drupal/core%21modules%21node%21node.module/fun...

Maybe we could do something like this in the update function...

UPDATE {search_index} s, {node} n SET s.langcode = n.langcode WHERE s.type = 'node' and s.sid = n.nid 

But this very messy cause we don't have the vid and we're joining on nid :(

Also on big sites we might run into serious performance issues here. Even if we don't do the above update. Without the truncates rebuilding the indexes might take some time.

This is quitevery ugly... :( looking at node_search_execute() we can't even easily just change $item->langcode to NULL if it equals spaces (as it would after the update if we don't do the truncate).

Gábor Hojtsy’s picture

@webchick: I'm not sure reindexing the site is such a big problem. Granted there were no other changes to the search or indexing logic, so there seems to be no other reason to do that. We can definitely put out a dsm in the update informing the user that the index needs to be rebuilt. Otherwise as @alexpott explained, we either risk a serious performance problem in the upgrade or extra code in the runtime that will need to be there forever to be compatible with a search database not ready for Drupal 8 yet.

catch’s picture

Status: Needs review » Reviewed & tested by the community

This is going to introduce what seems like buggy behaviour on all D7 => D8 upgrades; namely sites will be without proper search results until cron's run a few (dozen, or hundred in the case of sites like Drupal.org) times, right?

That's entirely true but iirc the Drupal.org 7.x upgrade has been timed to take in the region of 20 hours to complete all the updates, because many of those are poorly optimized. Not sure on the current status of that but a fast upgrade then the option to stay in maintainance mode while running cron, vs. just a very slow upgrade in the first place seems like the better place to have the time spent. If sites care about this they can just run drush search-index a few times before opening things back up, if they're as big as Drupal.org they should be using Solr or something else anyway.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Ok, fair enough. I guess this is the lesser of two evils in that case.

Not sure about a dsm(). My guess is there could be be 100s of dsm()s by the time we're all done, so it's probably going to be missed. Might be something we need to add to the release notes or a FAQ handbook page somewhere, though... but let's see how people react.

Committed and pushed to 8.x. Thanks!

Automatically closed -- issue fixed for 2 weeks with no activity.