I have a node with over 100k revisions and metatag_entity_load takes over 8 seconds to work. I believe this is because metatag_metatags_load_multiple will load up every revision_id.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

mikeytown2’s picture

More info on this issue here: https://groups.drupal.org/node/401538

DamienMcKenna’s picture

Have you tried using EntityCache?

gielfeldt’s picture

@DamienMcKenna Whatever happened to root-cause analysis? Besides, EntityCache suffers from a race-condition if you're using another cache backend than the Database.

Back on topic:

Some consumers want the metatags for all revisions for some reason.

We could at least handle the cases where revision_ids are provided:

(Warning: untested).

$query = db_select('metatag', 'm')
  ->fields('m', array('entity_id', 'revision_id', 'language', 'data'))
  ->condition('m.entity_type', $entity_type)
  ->condition('m.entity_id', $entity_ids, 'IN')
  ->orderBy('entity_id')
  ->orderby('revision_id');

if (!empty($revision_ids)) {
  $query->condition('m.revision_id', $revision_ids, 'IN');
}

$result = $query->execute();

That should result in a much more efficient query.

And for the rest: Two consumers in the metatag module fetches metatags for all revisions.
1.) Metatags feeds processor
2.) Bulk delete/revert

#1 I'm not so sure why it fetches metatags for all revisions. Perhaps it could settle for fetching only the current revision?
#2 shouldn't be a problem. This is an administrative batch operation, and IMO does not require real-time priority.

gielfeldt’s picture

Another thing to note is that PDO and MySQL are not very good friends. The quoting of the entity_ids in the query causes mysql to do a filesort (at least on my setup). But I've seen this many times before though. Nothing to do about it, short of bypassing PDO's placeholder mechanism and write the query manually.

mikeytown2’s picture

Looking into this more and I think the issue could be worse than first reported (as you need a lot less revisions for the slowdown to occur). The node in question on our setup is a test one. We make sure things are working as they should with our non-drupal api by changing a value on drupal and saving it. This happens every 5 minutes. Our slowdown was gradual and looking at the metatag table there are only a little over 2k entries in it for this node. This corresponds with us pushing out the new version of metatag (beta 9) on the 20th (near the start of week 04).

So rather than needing 100k revisions for this slowdown to be noticeable, I would bet that it would be noticeable after several hundred revisions. This bug only affects new revisions so it makes since that an automated test would be the first one to pick this up. I'll add in the condition to the query as this looks like it should dramatically speed this up.

On an unrelated note we do recognize that saving 100k revisions is not needed. But this is still a valid bug as everything else runs hook_*_load in the millisecond range.

As for using entity cache we can't do that in our case as we merge in some of the field data from the nondrupal API via bf_field_mapper (the API is the master of the data in our case, not the drupal database).

Edit: The test runs every 5 minutes = 12 per hour = 288 per day = 2016 per week.

mikeytown2’s picture

Status: Active » Needs review
FileSize
1.03 KB

I can confirm the filesort issue when the nid is quoted; although the speed hit is minimal as this is only a couple thousand rows. I commented out the revision_id to test this.

SELECT 
  m.entity_id AS entity_id, 
  m.revision_id AS revision_id, 
  m.language AS language, 
  m.data AS data
FROM metatag m
WHERE (m.entity_type = 'node') 
AND (m.entity_id IN  ('1082')) 
-- AND (m.revision_id IN  (4562064))
ORDER BY entity_id ASC, revision_id ASC

Attached is the patch to switch this over to use $revision_ids if it is given.

DamienMcKenna’s picture

Have you tried out your patch with the 100k records? I'm a little surprised this is faster, given DBTNG is supposed to be slower than a straight up query.

mikeytown2’s picture

I did. Here is the output

hook_entity_load metatag in 0.0261 seconds
hook_entity_load rdf in 0 seconds
hook_TYPE_load webform in 0.061 seconds
hook_TYPE_load bf_field_mapper in 0.024 seconds
hook_TYPE_load comment in 0.0124 seconds
hook_TYPE_load print in 0.0161 seconds
hook_TYPE_load scheduler in 0.016 seconds
hook_TYPE_load themekey_ui in 0.0237 seconds
hook_TYPE_load user in 0.0165 seconds
hook_TYPE_load print_mail in 0.016 seconds
hook_TYPE_load print_pdf in 0.0159 seconds

Main reason it's faster is that the query is only returning 1 row vs over 2k. Adding an additional condition to a db_select is a lot cleaner looking than adding an extra AND to a db_query. You could use db_query if you really wanted to and get the same speed up; just be sure to add in the additional logic with !empty($revision_ids)

DamienMcKenna’s picture

Issue tags: +Performance

Tag.

gielfeldt’s picture

@mikeytown2 In the patch you also included my bad casing of the second orderby() :-)

mikeytown2’s picture

juampynr’s picture

Part of the patch at https://drupal.org/comment/8310313#comment-8310313 already addresses this issue. Even though you call metatag_metatags_load_multiple() with a huge list of nids and revision_ids, it will perform in milliseconds. This is already part of metatag-7.x-1.0-beta9.

I think we can close this issue as a won't fix but I will let @DamienMcKenna to decide.

mikeytown2’s picture

@juampy

This issue is still present. Create over 1k revisions on a single node to repo. This issue showed up after 7.x-1.0-beta9.

DamienMcKenna’s picture

Status: Needs review » Needs work
Parent issue: » #2175021: META: Plan for Metatag 7.x-1.0-rc1 release

Clearly some improvements are needed on the revision handling, and possible some API changes to accommodate the need.. so that makes it a blocker for 1.0-rc1.

mikeytown2’s picture

@DamienMcKenna
What is wrong with the patch in #6? It's what we are using to fix this issue.

gielfeldt’s picture

I'm also curious. What improvements are "clearly" needed?

DamienMcKenna’s picture

Status: Needs work » Needs review
FileSize
1.51 KB

Well, there's an erroneous IF statement, for one ;-)

juampynr’s picture

The last patch will still be very heavy for a large amount of nodes being loaded since it has two where conditions.

Here is an example. Given the following code:

// test.php
$nids = db_query('select nid from node order by nid DESC limit 0,500')->fetchCol();
node_load_multiple($nids);

If you run it with drush scr test.php, it will call to metatag_entity_load(array_of_500_nids), which finally calls metatag_metatags_load_multiple('node', array_of_500_nids, array_of_500_rids). This query will take a few seconds to run.

While debugging this with @q0rban, he suggested to simply filter by revision_ids when they are present and, if they aren't, filter by nids. This makes the query to run in just a few milliseconds.

DamienMcKenna’s picture

@juampy: Thanks, I'd been thinking of the same thing. I'm going to test it a little more before committing, but it seems pretty good so far.

juampynr’s picture

Edited. Wrong patch. See comment #22.

Status: Needs review » Needs work

The last submitted patch, 20: metatag-beta7-optimize-loading-2183203-20.patch, failed testing.

juampynr’s picture

Just for the record: for sites using metatag-7.x-1.0-beta7 which are having performance issues when loading metatags.
Here is the adapted patch to address it.

MXT’s picture

@juampy for #22 Sorry I don't understand: metatag module is in beta 9 (that I'm using with BIG performance issues) and you have provided a patch for beta 7 ?

Is your patch already applied to beta 9 ? Or should I have to apply it in any case

Thank you for clarifications

juampynr’s picture

@MTX, you should try patch at comment #18.

juampynr’s picture

Status: Needs work » Needs review

Setting back to needs review since the patch that applies on HEAD is the one at comment #18.

Status: Needs review » Needs work

The last submitted patch, 22: metatag-beta7-optimize-loading-2183203-21.patch, failed testing.

DamienMcKenna’s picture

Status: Needs work » Fixed

I've committed patch #18, thank you all for your help with this.

  • DamienMcKenna committed 9804f83 on 7.x-1.x
    Issue #2183203 by mikeytown2, juampy, DamienMcKenna: Improved queries in...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

davidwbarratt’s picture

gielfeldt,

EntityCache suffers from a race-condition if you're using another cache backend than the Database.

Is there an open issue for this? I couldn't find one:
https://www.drupal.org/project/issues/entitycache

gielfeldt’s picture

drumm’s picture

A followup to this issue is at #2835614: metatag_metatags_load_multiple() shouldn’t need to sort. I don’t see how the orderBy is being used, it might be able to be removed.