Problem/Motivation

Generating the sitemap empties the persistent entity cache.

Steps to reproduce

Proposed resolution

The reason that $storage->resetCache([$data_set['id']]); is called when generating the site map is because we want to ensure that the entities don't take up masses of memory due to the static cache. This is very reasonable. However calling this code also results in the persistent cache being cleared. This results in the unexpected result of generating a sitemap clearing the entity cache.

The module needs to disable the static caching of entities during sitemap generation rather than calling resetCache(). Or it could clear the entity static cache directly.

Remaining tasks

User interface changes

API changes

Data model changes

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

alexpott created an issue. See original summary.

gbyte’s picture

As you mentioned, we added this to avoid memory buildup and exhaustion. See #3170261: Entity static cache can cause memory exhaustion during sitemap regeneration.

I'm happy to review patches to improve this.

alexpott’s picture

Realised that related entities such as paragraphs might end up getting stuck in the static cache if they are loaded when we load the entity.

@gbyte just found #3170261: Entity static cache can cause memory exhaustion during sitemap regeneration and saw that the original solution there was pretty much what I'd recommend. Going to upload something very similar to #3170261-2: Entity static cache can cause memory exhaustion during sitemap regeneration. Please give @mstef issue credit too.

alexpott’s picture

Status: Active » Needs review

Hopefully at some point in the future one of:

will land and then \Drupal\simple_sitemap\Plugin\simple_sitemap\UrlGenerator\EntityUrlGenerator::generate() can be removed because core will handle it for you.

Created a MR to clear the entity memory cache after generating the url instead of resetting the entity cache. This will result in sitemap generation doing less queries to the db cache backend because it's stop clearing it :D

alexpott’s picture

I've generated 5000 nodes with paragraph fields and set the content type to be indexed see - https://gist.github.com/alexpott/874a0480b7097d7d97322c053dec6634

I'm generating the sitemap doing vendor/bin/drush ssg -v

The results show that subsequent runs are quicker because sitemap generation is no longer obliterating the entity cache.

With patch

Run 1

>  [notice] 3000 out of 5001 total items have been processed. memory: 54.5 MB, non-peak mem: 52.5 MB
>  [notice] 5001 out of 5001 total items have been processed. memory: 56.5 MB, non-peak mem: 54.5 MB

________________________________________________________
Executed in   16.84 secs   fish           external
   usr time    7.89 secs  106.00 micros    7.89 secs
   sys time    0.96 secs  690.00 micros    0.96 secs

Run 2

>  [notice] 5001 out of 5001 total items have been processed. memory: 54.5 MB, non-peak mem: 54.5 MB

________________________________________________________
Executed in    9.55 secs   fish           external
   usr time    5.16 secs  113.00 micros    5.16 secs
   sys time    0.62 secs  703.00 micros    0.62 secs

Without the patch

Run 1

>  [notice] 3311 out of 5001 total items have been processed. memory: 54.5 MB, non-peak mem: 52.5 MB
>  [notice] 5001 out of 5001 total items have been processed. memory: 56.5 MB, non-peak mem: 54.5 MB

________________________________________________________
Executed in   15.79 secs   fish           external
   usr time    7.12 secs  111.00 micros    7.12 secs
   sys time    0.96 secs  677.00 micros    0.96 secs

Run 2

>  [notice] 4487 out of 5001 total items have been processed. memory: 54.5 MB, non-peak mem: 54.5 MB
>  [notice] 5001 out of 5001 total items have been processed. memory: 54.5 MB, non-peak mem: 54.5 MB

________________________________________________________
Executed in   11.48 secs   fish           external
   usr time    5.36 secs  116.00 micros    5.36 secs
   sys time    0.72 secs  782.00 micros    0.71 secs
alexpott’s picture

The reason for the speed up in subsequent runs is because it does way way less queries. With the patch subsequent runs is do around 25000 queries and without the patch we're doing around 50000 queries!

alexpott’s picture

I've created #3202290: Add performance test script to make performance testing simpler in the future. It feels worth it because this module often has to deal with huge amounts of data and small changes can have a very big impact.

gbyte’s picture

Status: Needs review » Active
alexpott’s picture

Status: Active » Reviewed & tested by the community

I've answered the review question. I think there's a misunderstanding of what deleteAll() is clearing. It is only clearing the entities from the memory used by the current process.

  • gbyte committed 116eea5 on 8.x-3.x authored by alexpott
    Only clear memory cacheIssue #3202233 by alexpott, gbyte: Stop clearing...
gbyte’s picture

Status: Reviewed & tested by the community » Fixed

Thanks for the contribution & clarification.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.