Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
I was testing a big upgraded with lots of fields (290 fields and 500 instances or so, 1600 yml files in total with entity displays, image styles and other things like that).
One major performance issue that I noticed (apart from 300+ loaded config files, duplicated to 600 by locale.module) was that glob() in buildQuery() gets really slow with that amount of configuration.
I displayed a simple view on the frontpage and glob() took up almost 10% of the whole request time, with 100 calls to buildQuery().
Comment | File | Size | Author |
---|---|---|---|
#99 | 1971158-deleteTags-99.patch | 2.04 KB | Anonymous (not verified) |
#92 | 1971158-deleteTags-92.patch | 1.67 KB | alexpott |
#90 | 1971158-deleteTags.patch | 1.54 KB | Anonymous (not verified) |
#84 | 81-84-interdiff.txt | 761 bytes | alexpott |
#84 | config-load-multiple-1971158-84.patch | 23.8 KB | alexpott |
Comments
Comment #1
Damien Tournoud CreditAttribution: Damien Tournoud commentedCan you identify why do we run a query on configuration entities here to begin with? This sounds like something you should only need to do rarely, and only in an admin context.
Comment #2
BerdirI think buildQuery() is just called like that because that's also how the method is called in the database storage controller, it's called from load().
Wondering about that, why do we need this in the first place? Maybe it was necessary when we had to support $conditions but now, we can simply loop over the id's and do a config($prefix . $id) on them, no?. @yched suggested in #1880766: Preload configuration objects that we'd add support for loading multiple config files at once, mostly because we could then also do a getMultiple() against the config storage cache.
Comment #3
BerdirOh, I misread it, that listAll()/glob() business only happens when you load pass NULL and load all entities of a certain type... Wondering who's doing that...
Comment #4
BerdirAdded debug($this->entityType); in that if and got dozens of 'editor''s and a bunch of 'field_config''s.
Looks like we need support for a fullyLoaded flag so that we don't run through this multiple times. Also, we could combine that with a id map and load all at once from the cache or something like that.
@timplunkett also found #1971174: BreakpointGroup::loadAllBreakpoints() is expensive
Comment #5
Damien Tournoud CreditAttribution: Damien Tournoud commentedThe culprit in the Editor module is
editor_load()
, which look like a premature optimization anyway.For the field module, those calls should be cached already (in the
field_info:instances
cache), not sure why it's not working for you.The main problem is that the storage structure is really stupid. Browsing files like that from the filesystem cannot possibly scale. We need to either start hashing this directory or move the store to a proper database (like SQLite or a db* file).
Comment #6
BerdirYeah, the problem with it is that it assumes that the entity_load_multiple('editor') call is somehow cached and initially faster, which isn't the case.. The optimization makes sense I think because it seems to be called for all formats of a textfield, so if it's called, then it really is called for all of them. If we can improve the listAll() and do a config getMultiple() then this would be true.
The problem with fields seems to be that it assumes that there is a call to field_info_instances($entity, $bundle), where it would pre-fill the internal field by id static cache. That call never seems to happen, possibly due to the EntityNG conversion of nodes so each field is loaded separately.
Also related #1880766: Preload configuration objects. Moving the config entity caching logic to config storage controller might allow #1880766: Preload configuration objects to focus on only the non-entity config files, of which there should only be a limited amount.
Comment #7
quicksketchI'm 100% okay with removing that optimization, though it sounds like there's a deeper bug that is causing something that should be faster to be slower instead.
Comment #8
Wim Leersquicksketch and I worked on
editor.module
, he wroteeditor_load()
specifically; I talked to him and he's fine with changing that to not doing the multiple loading, especially now that anonymous and authenticated users now have one format anyway.I don't mind writing that patch, but berdir is saying we can (should?) fix
listAll()
and introduce$configStorage->getMultiple()
to make it faster.So… which should we do? Fix editor+field or improve "all entities" loading?
Comment #9
BerdirI think those are three different issues. This issue should attempt to make the load all use case as fast as possible, which is currently used quite often for config entities (that's how EFQ works, for example).
Once this is fixed then the editor_load() issue depends on what we expect in terms of actually using/seeing all editors for most users or not and how much memory overhead this would mean. If we try to be intelligent, we could also load them based on the editors a user would be allowed to see?
And the field case sounds like yet another separate bug. There is specific code to avoid those fields being loaded separately and that doesn't work right now.
Comment #10
yched CreditAttribution: yched commentedRegarding fields & FieldInfo cache :
My hope/target is to be able to ditch FieldInfo's persistent cache of Field and FieldInstance config entities, and rely solely on the config() cache on raw config data. Caching ConfigEntities on top of it sounds like a recipe for hair loss.
The "static" cache in FieldInfo would stay of course, this would be where Field and Fieldinstance objects are persisted across the request once they've been loaded (and we usually need them at several places in the callstack within a request).
The flow would be:
- FieldInfo uses config_get_storage_names_with_prefix() (which currently does listAll() / glob()) to build a "field map" (which fields / instances are present on which entity types / bundles), and persistently caches it.
- FieldInfo::getBundleInstances() figures out a list of config names to load based on the "field map", multiple-loads the config items (hitting config persistent cache), and statically persists them.
This relies on being able to multiple-load config items, and on the assumption that instantiating a ConfigEntity from raw config data is not too expensive and does not need to be persistently cached...
Comment #11
BerdirHere's a proof of concept that introduces ConfigFactory::loadMultiple() and a listAll() cache. Seeing some nice performance improvements (700 instead of 800 queries) but I doubt it will get through the tests just yet ;)
Notes:
- loadMultiple() is a bit tricky because ConfigFactory/Config are specifically built to support lazy loading but here we actually don't want that and want to load the upfront. Note that lazyloading is pointless for config entities anyway as we do a get() right after loading them anyway, so this is no no way a regression.
- Unlike get(), loadMultiple() only returns existing, successfully loaded configuration objects
- Note sure about listAll() being a separate or a single cache entry. A single one might make sense as we can do more intelligent cache clears and only have to do a single cache get for them.
Comment #13
BerdirStupid mistake :(
Comment #15
BerdirThis should fix the installer. looks like there are multiple cached storage objects floating around, so changed to use a static property.
Comment #17
BerdirOpened #1975486: Add a persistent cache in DatabaseStorageController::getFieldDefinitions() for the missing entity field definition cache.
Comment #18
amateescu CreditAttribution: amateescu commentedWhile this issue has derailed a bit from its title, I think we can still have a really quick win for cold caches by replacing glob() with GlobIterator, at least until someone takes on the last paragraph from #5 :/
I got a nice ~5ms improvement on a test site with 600 config files, and ~10ms with 1200 files, basically for free.
Comment #20
BerdirOk, figured out what's causing those test fails.
The reason we're doing a isNew() check here isn't just because this could theoretically create a new config, it's also because deleted configuration is kept in CacheFactory::$cache as a new config object, as that's what Config::delete() is doing.
I just moved that check one level down as it's a problem within the config system and not config entities. Seems like deleted configuration should get removed from there.
Comment #22
yched CreditAttribution: yched commentedAnxiously waiting for this to get green so that I can fix FieldInfo caching ;-)
Comment #23
BerdirOk, I figured out the problem but not yet a solution.
During installation, we're re-creating the container many times. As we're using the memory backend cache, each time we get a new CachedStorage(), we also get a new MemoryBackend, with a separate storage. Which means that in same cases, we call deleteTags() on one memory backend and load from another that still has the cache, which leads to inconsistent results.
This is also something that's happening in #1885830: Enable static caching for config entities.
Tried to make the cache use static but that broke things as well, maybe drupal_static() would work but that within a class is ugly as well.
Any ideas?
Comment #24
BerdirComment #25
BerdirCould it be that easy?
The config factory has the persisted tag while the cached storage doesn't. So direct access to the storage will result in a different instance being used. So it seems only obvious to make this persisted as well.
The comment preview tests seem to pass, let's see...
Comment #27
BerdirOk, let's see how this works. Added a method that allows to clear the static list cache from within CacheFactory::reset() which is called in resetVariables().
Comment #28
BerdirComment #29
Anonymous (not verified) CreditAttribution: Anonymous commentedthis looks great!
can we make this something like:
i'm ok with that being a follow up, and i don't really care about the interface name.
Comment #30
BerdirSomething like this?
Named it StorageCacheInterface, in case we want to add more methods that aren't listAll() related
Completely untested.
Comment #31
Anonymous (not verified) CreditAttribution: Anonymous commentedyay, i think this is ready to go.
Comment #32
alexpottWe need to test this stuff
Comment #33
Anonymous (not verified) CreditAttribution: Anonymous commentedadded some tests for CachedStorage, and the new behaviour we've added for loading multiple.
Comment #35
subson CreditAttribution: subson commentedIn progress... Drupalcon sprint, May 24, 2013
Comment #36
Anonymous (not verified) CreditAttribution: Anonymous commented#33: config-load-multiple-1971158-33.patch queued for re-testing.
Comment #38
Anonymous (not verified) CreditAttribution: Anonymous commentedrerolled, let's see if the bot will run it this time.
Comment #40
Berdir#38: config-load-multiple-1971158-38.patch queued for re-testing.
Comment #41
Anonymous (not verified) CreditAttribution: Anonymous commentedadded some tests for the listAll static caching.
Comment #42
BerdirNot sure I understand this test. It talks about primed cache but then expects a call to the storage?
If we set a cache entry, then that shouldn't happen?
Comment #44
Anonymous (not verified) CreditAttribution: Anonymous commentedderp derp derp durpal.
sorry, i'll reroll and fix that. i was playing with something else in that test and forgot to restore it.
Comment #45
Anonymous (not verified) CreditAttribution: Anonymous commentednew patch, minus the derp derp to address #42.
Comment #46
Anonymous (not verified) CreditAttribution: Anonymous commentedderp derp derp.
Comment #47
BerdirWe should probably use a bit more serious naes :)
We have randomName() as well, so you could use that.
I think it would also not hurt to have some basic assertions on the returned values, so that this part is also checked.
Then we just need to find someone to RTBC the implementation and tests :)
Comment #48
BerdirSomething like this.
Comment #49
msonnabaum CreditAttribution: msonnabaum commentedTests look great in #48.
Comment #50
Anonymous (not verified) CreditAttribution: Anonymous commentedyay, this looks RTBC to me. berdir++
Comment #51
BerdirComment #52
BerdirNoticed some things that need to be cleaned up after @alexpott pointed me to one of them.
Needs tidying up, should also use {@inheritdoc}.
We still don't have a solution for inheritdoc + additional stuff, that said, this actually seems wrong (I copied it). Seems to me that a class that throws exceptions that are not defined in the interface violates that interface but that's not related to this issue.
That's the pirate version of an array ;)
This shouldn't specify bool I think.
Might have time tonight, but if someone else wants to do this, go for it...
Comment #53
BerdirHere's the promised cleanup.
Comment #54
Anonymous (not verified) CreditAttribution: Anonymous commentedlooks good!
Comment #55
alexpottAfter installing this patch and query cache_config table I noticed we have some duplicate keys due to a inconsistency with how we pass in the prefix for config entities eg.
list:views.view.
andlist:views.view
Patch attached cleans this up... more cache hits ftw
Doing a standard profile install and viewing the frontpage as user 1
Before
After
Comment #56
BerdirHm, that's quite a list. Wondering if it would be safe to cache this in a single cache key or if the data could grow too big.
Comment #57
alexpottOkay so lets not cache when a prefix is not provided. This is the first step in refactoring the storageinterface to have separate
findAll()
andfindByPrefix()
instead of a do-it-alllistAll()
method. But rather than do that here lets keep the scope creep down and do this is a followup.Comment #58
alexpottActually doing it this way round is cleaner... implementing the findByPrefix() method rather than the findAll()...
Comment #59
alexpottOkay we have tests but the Performance tag should still be there...
Comment #60
Anonymous (not verified) CreditAttribution: Anonymous commentedthis looks good to go. again ;-)
Comment #61
catchWhy does this have to be a static as opposed to a class variable? I'd expect this to only get instantiated once per request, or at least in such a way that the cache shouldn't be shared.
Shouldn't the variable be renamed now the cache belongs to this method?
Comment #62
BerdirI made it static because there were many different config storage objects floating around during installation and module enabling. Maybe that's no longer the case thanks to the persist tag, I need to check that.
Comment #63
BerdirLet's see if the testbot likes this.
Comment #65
BerdirLooks like he does, that's nice!
This should fix the php unit test.
Also had a quick look at doing a single cache, but it looks like there aren't that many different listAll() calls on the same page, so let's keep it separate for now.
What I did see was an insane amount of entity constructing for field config entities (again) as in 12k config entities insane ;). We'll need to look into that after this is in, I thought I fixed it with the field definition cache, but something changed it again it seems...
Comment #66
yched CreditAttribution: yched commented@Berdir: not sure what you mean exactly in your last paragraph, can you expand a bit ? (although maybe this doesn't belong in this issue ?)
Comment #68
Berdir#65: config-load-multiple-1971158-65.patch queued for re-testing.
Comment #69
Berdir#65: config-load-multiple-1971158-65.patch queued for re-testing.
Comment #71
Berdir#65: config-load-multiple-1971158-65.patch queued for re-testing.
Comment #73
BerdirThis uses the injected config factory and refactors that phpunit test to mock load() and not call down into mocked config objects. This doesn't doesn't need to bother.
Comment #75
BerdirRe-rolled, something conflicted with that role test.
Comment #77
BerdirWould work better if would use the factory ;)
Comment #79
BerdirOk, this turned out to be a tag checksum static cache problem. When we delete a tag multiple times on page requests, then the parent (test) process is not being updated and still uses the deletions count from the static cache. Very surprised that this does not cause failures anywhere else.
Comment #80
catchOverall this looks great. I found a couple of nitpicks.
$names isn't used anywhere else in the method, so no need for the temporary variable. Also they aren't 'remaining' until after the getMultiple() call.
I had to read this three times, because I initially read it as "when items are successfully loaded, the cache backend removes them from the cache". Could we just say "Items loaded from cache are removed from the list of $names to fetch" or similar?
Should the cache tag be configFindByPrefix? or 'configPrefixes'? Not sure it should but the tag name jumped out a little given it's a verb - almost looks like it's an options array to deleteTags() to find the tags by prefix (which would be horrifying).
This seems easy to get wrong in contrib modules too, not sure there's anything we can do about it except fix the cases in core though.
Comment #81
alexpott1. Removed and added a comment
2. This comment is removed
3. So I've created a class constant for the cache tag name... this means one we can change it after API freeze and two that we can clear the cache items using this tag reliably from other code.
4. I don't think there is anything we can do about @catch last comment in #80 at this stage.
Comment #82
alexpottForget to change status... Shows how often I roll patches these days...
Comment #84
alexpottSo #2024833: Adopt load() and loadMultiple() on entity storage controllers changed a bunch of things but missed out changing the mock in
\Drupal\user\Tests\Views\Argument\RolesRidTest
this currently works in head because the call to loadMultiple \Drupal\user\Plugin\views\argument\RolesRid::title_query() just depends on the filesystem and entity loading ... this patch introduces a dependency on the cache system.Comment #85
Anonymous (not verified) CreditAttribution: Anonymous commentedi think this is ready to go.
Comment #86
catchLooks great except:
Needs to be Cache::deleteTags() - these aren't the same thing.
#918538: Decouple cache tags from cache bins is related.
Comment #87
Anonymous (not verified) CreditAttribution: Anonymous commentedso i guess this means we need a test that shows that $this->cache->deleteTags(array('listAll' => TRUE)); breaks stuff?
@catch: are you ok with that being a follow up?
Comment #88
alexpott@catch actually committed this... see 8ceb6e1.
So I'm marking this a fixed and yep #86/#87 ... this needs to be a followup.
Comment #89
catchThat won't actually break anything at the moment, because the tag is only used for one bin, but it's wrong API usage (which shows we should do the split). I think we could commit the one-liner here so re-opening.
Comment #90
Anonymous (not verified) CreditAttribution: Anonymous commentedhere ya go.
Comment #92
alexpottMissing a use statement...
Comment #94
Berdir#92: 1971158-deleteTags-92.patch queued for re-testing.
Comment #96
catch#92: 1971158-deleteTags-92.patch queued for re-testing.
Comment #98
Anonymous (not verified) CreditAttribution: Anonymous commentedwow. so that change really breaks things. shows you how much we collectively actually get how this cache tags stuff works.
digging around now.
Comment #99
Anonymous (not verified) CreditAttribution: Anonymous commentedberdir pointed out in IRC that we should persist the 'cache.config' service, attached patch does that, and more tests pass.
i have NFI why that works, but that is above my D8 pay grade.
Comment #100
Anonymous (not verified) CreditAttribution: Anonymous commentedderp derp durpal.
Comment #101
yched CreditAttribution: yched commented#99: 1971158-deleteTags-99.patch queued for re-testing.
Comment #102
catchLooks good.
Comment #103
jibran#99: 1971158-deleteTags-99.patch queued for re-testing.
Comment #104
catchCommitted/pushed to 8.x, thanks!