Over in #2650986: Fix entity loading in SearchApiFieldTrait, we try to get the Views fields to multi-load field entities wherever possible, including multi-loading the result objects/entities in case they haven't been set before.
However, this is counteracted by \Drupal\search_api\Plugin\views\query\SearchApiQuery::addResults(), which has hidden hard-coded object loads (each individually) in several places:

  • Unless the “Skip item access checks” option was enabled, it will call \Drupal\search_api\Item\Item::checkAccess() on every result item – which, in turn, has to load the result object for the check, of course.
  • Even worse, though, it will then also call \Drupal\search_api\Item\Item::getLanguage() for every result item to set the search_api_language property on the result row.

The former is unfortunate, but there's also hardly a way around this – we have to check access at that point, and we can't do that without the object. This could at least use multi-loading, though.
However, the latter is really aweful, for a number of reasons:

  • It can't be avoided at all (other can be skipped with a query option).
  • The property is not actually needed in a lot of cases, and could easily be retrieved later if it is actually needed. Having it there basically just helps for the actual "search_api_language" field handler, which is probably rarely used and could easily be coded to not rely on the field's presence.
  • Specifically for our content entity datasource, it's also vexing that we could actually tell the item language without loading it, as the language code is part of the ID. Fixing this would need an API change or addition, though, and wouldn't really be necessary if we just removed the call and adapted the rest of the Views integration. (Just fixing this part would also be possible, but then cause the same performance problems with other datasources, which can't tell the language by looking at the ID.)
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

drunken monkey created an issue. See original summary.

justanothermark’s picture

I encountered this recently while working on (admittedly, quite unusual) requirements. I had multiple sites with read access to an index (only one with write access). The read-only sites would read the index and render a minimal version of the node and link back to the original site.

However, the access check and getLanguage() call would fail on trying to access the original object as it did not exist on the current site. As described, I used the query option to avoid the access check but the getLanguage() call was unavoidable even though the language variable was unused.

Attached is the patch I used so that getLanguage() would extract the language from the ID instead of requiring the original object. This is NOT a fix for the original issue for optimising loading and should not go into the module. It is only provided as a stop-gap for anybody else who wants to avoid accessing the object unnecessarily.

Renrhaf’s picture

Hitting same issue with a custom datasource to fetch & index data into an ElasticSearch backend. Maybe we could use a closure here instead of a loaded value to be able to use it only when needed...

drunken monkey’s picture

Status: Active » Needs work
FileSize
880 bytes

Hitting same issue with a custom datasource to fetch & index data into an ElasticSearch backend. Maybe we could use a closure here instead of a loaded value to be able to use it only when needed...

Would also be an option, yes, but I don't think that's necessary. Everything to fix the issue is already in the IS, I think.
Here is a patch for the easy part, at least: multi-loading when checking the access.

Renrhaf’s picture

Thanks, do you think we can also avoid loading the item for the "getLanguage()" part ?

drunken monkey’s picture

Thanks, do you think we can also avoid loading the item for the "getLanguage()" part ?

Yes, would just need to be implemented.

Renrhaf’s picture

Hello Drunken monkey,

What do you think of a patch link this ?

It means changing the DatasourceInterface and I'm not sure if it is something that can be done.
Maybe the use of events could be useful here too, or a system where the language is automatically added to indexed data and retrieved from indexed content directly.

I don't know which path to go...

Then I'm still having issues with entities being loaded in the SearchApiFieldTrait::preLoadResultItems() method.
Even when using a "rendered item" only in the view. I though this would allow me to skip entity loading having all HTML already generated and indexed.

drunken monkey’s picture

It means changing the DatasourceInterface and I'm not sure if it is something that can be done.

No, sorry, that's not possible as it would break all existing datasource plugins. Adding a new method to the interface might be acceptable, if we also add it to the plugin base class, but I'd rather avoid that, too.

Anyways, since this is only part of the problem, I think the better solution for this issue is just avoiding to set the search_api_language when extracting the Views results. It's not really needed (if someone needs it, they can very easily get it anyways) but causes lots of problems.
If you additionally want to fix getting the language from an item ID, I think I'd prefer adding a new optional datasource interface for that with a dedicated method to do that, and implement it with the entity datasource. But, as said, I wouldn't see that as the priority here, since the problem would still remain for all/most other datasources.

Renrhaf’s picture

Hello, something as simple as that ? It's fixing my issue but won't this cause other issues on multilingual sites ?
In all cases thanks for your suggestions ! I've manager to avoid calling again my API calls when displaying indexed fields and that is a huge performance optimization for my custom datasource.

drunken monkey’s picture

Hello, something as simple as that ?

Not quite, no – I'm pretty sure we'd also need a new dedicated Views field plugin for the "Item language" field so it keeps working after that change. But should also be pretty simple.

bijumon.ta’s picture

Hi,

I am using multilingual functionality in Drupal 8 . I have the same issue . When we checked the performance , its calling the function getLanguage() is calling around 400 times. Is there any issue if we comment the line from getLanguage() on SearchApiQuery.php:556 .My website in English and French and using the interface language for filtering results in same view.

drunken monkey’s picture

I am using multilingual functionality in Drupal 8 . I have the same issue . When we checked the performance , its calling the function getLanguage() is calling around 400 times. Is there any issue if we comment the line from getLanguage() on SearchApiQuery.php:556 .My website in English and French and using the interface language for filtering results in same view.

Just try it out, I’d say?
After actually checking the code again, it seems like this might cause additional problems aside from the “Item language” field, though, that would have to be fixed. But it seems that if you don’t use any Views fields, but, for instance, the “Rendered entity” row display, then this workaround (i.e., the patch in #9) might already work for you without any additional adaptions.

But it seems to be a longer way towards making this work for the general use case than previously anticipated.

weseze’s picture

Could this issue be responsible for terribly slow searches?
I'm seeing load times in excess of 10seconds on cold cache...

I tried all of the above patches an nothing seemed to work? Is any extra config needed to make them work?

drunken monkey’s picture

Could this issue be responsible for terribly slow searches?

It could be, but probably isn’t. You’ll have to profile this some way yourself on your site to find out (e.g., using XHProf, New Relic, …).

rp7’s picture

Specifically for our content entity datasource, it's also vexing that we could actually tell the item language without loading it, as the language code is part of the ID. Fixing this would need an API change or addition, though, and wouldn't really be necessary if we just removed the call and adapted the rest of the Views integration. (Just fixing this part would also be possible, but then cause the same performance problems with other datasources, which can't tell the language by looking at the ID.)

Based on this piece of info & the fact that my project only uses datasources with the language code in the item ID, I've come up with this temporary fix:

**
 * Implements hook_search_api_results_alter().
 */
function mymodule_search_api_results_alter(ResultSetInterface &$results) {
  $items = $results->getResultItems();
  foreach ($items as $key => $item) {
    list(,, $language) = explode(':', $key);
    $item->setLanguage($language);
  }
}

This circumvents the \Drupal\search_api\Item\Item::getLanguage() issue on my project. As I'm using https://www.drupal.org/project/external_entities, 1 external API call vs 25 (in my case) does provide a little performance boost.

drunken monkey’s picture

Priority: Normal » Major
Status: Needs work » Needs review
FileSize
6.03 KB

After a bit of consideration, it’s probably best to just provide a ResultRow sub-class that lazy-loads those “magic” properties.
(Other option of just providing our own sub-class of \Drupal\views\Plugin\views\field\LanguageField would also be very simple, but might break sites with niche use cases and is not as pretty (imho).)

rp7’s picture

Patch in #16 works for me!

legolasbo’s picture

Status: Needs review » Reviewed & tested by the community

Looking at this I think that the patch surely improves the issue. I am curious about the performance improvement when the lazyloaded properties are actually used. My guess is that it would still result in multiple loads, but that's what is currently the case so no regression there. All in all I think this is a step forward! :)

drunken monkey’s picture

Thanks a lot for reviewing!
The one lazy-loaded property that will actually load the object (search_api_language) will only rarely be used (only when the “Item language” field is added to the view). And even in that case, as you say, the performance will be (bad) as before.

However, while we’re at it, we should probably also fix the other source of single-loads in the method, as stated in the IS. Patch attached.

The revised patch also fixes the coding standards violation by just going and changing the standards – we already have the properties defined with a leading underscore, and adding them explicitly to the class seems the best way still, so we hardly have a choice there.
Or is there a way to exclude a certain standard just for a specific file?

drunken monkey’s picture

Went and answered my own question a minute after posting.
I really should stop doing this (posting prematurely, that is – not doing proper research).

legolasbo’s picture

Status: Needs review » Needs work

Nitpicks really, but I do feel that the empty try/catch needs some work.

  1. +++ b/src/Plugin/views/query/SearchApiQuery.php
    @@ -603,6 +603,9 @@ protected function addResults(array $results, ViewExecutable $view) {
    +      // If search items are not loaded them, pre-load them now in bulk to avoid
    +      // them being individually loaded inside checkAccess().
    

    /s/loaded them,/loaded already

  2. +++ b/src/Plugin/views/query/SearchApiQuery.php
    @@ -653,6 +651,35 @@ protected function addResults(array $results, ViewExecutable $view) {
    +   * Loads all "original objects" not already loaded of the given search items.
    

    The wording of this line feels off. Maybe this is better?

    Loads all not already loaded "original objects" of the given search items.

  3. +++ b/src/Plugin/views/query/SearchApiQuery.php
    @@ -653,6 +651,35 @@ protected function addResults(array $results, ViewExecutable $view) {
    +        // Can't actually be thrown here, but catch for the static analyzer's
    +        // sake.
    

    It won't be thrown in the current codebase, but maybe it will be thrown in a future iteration? I think we should at least generate a warning or log the exception.

drunken monkey’s picture

Status: Needs work » Needs review

Thanks a lot for the review!
I agree with the first two remarks – fixed in the attached patch.

It won't be thrown in the current codebase, but maybe it will be thrown in a future iteration? I think we should at least generate a warning or log the exception.

That would violate both the interface and logic: if $load is FALSE, the method is a simple getter – how would that throw any exception?

Lastly, I figured that the preLoadSearchItems() might actually be a handy helper method in other circumstances, too, so why not move it to the ResultSet class?

legolasbo’s picture

Status: Needs review » Needs work

You forgot to attach the patch @drunken_monkey.

Lastly, I figured that the preLoadSearchItems() might actually be a handy helper method in other circumstances, too, so why not move it to the ResultSet class?

This actually makes a lot of sense. Good idea!

drunken monkey’s picture

Oops, thanks for pointing this out!

legolasbo’s picture

Status: Needs review » Needs work
+++ b/src/Plugin/views/query/SearchApiQuery.php
@@ -625,6 +628,9 @@ protected function addResults(array $results, ViewExecutable $view) {
+      // If search items are not loaded already, pre-load them now in bulk to
+      // avoid them being individually loaded inside checkAccess().
+      $result_set->preLoadResultItems();

Can't we have the ResultSet do this whenever the first result is being accessed? That way people using it don't have to know about this logic at all and get the better performance by default.

drunken monkey’s picture

Status: Needs work » Needs review

Can't we have the ResultSet do this whenever the first result is being accessed? That way people using it don't have to know about this logic at all and get the better performance by default.

As we can’t know whether the caller actually needs the original objects, this would actually decrease performance in some cases, leading to unnecessary loads (the very issue we want to fix here). Or do I misunderstand?

  • drunken monkey committed 271f0f0 on 8.x-1.x
    Issue #2898327 by drunken monkey, Renrhaf, justanothermark, legolasbo:...
drunken monkey’s picture

Status: Needs review » Fixed

Alright then, let’s finally commit this, I’d say.
Thanks a lot again, everyone!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.