Early Bird Registration for DrupalCon Portland 2024 is open! Register by 23:59 PST on 31 March 2024, to get $100 off your ticket.
We rebuild the registry a lot during the installation process, and most of the time in this rebuild is currently spent calling hash_file()
on the content of the files all over again.
I suggest we ignore the files we already have parsed previously in the same request. In my testing, this significantly improves the performance of the installation process (executed through Drush), and it is likely to increase the performance of the automated testing too.
Comments
Comment #1
Damien Tournoud CreditAttribution: Damien Tournoud commentedProposed patch.
Comment #2
Damien Tournoud CreditAttribution: Damien Tournoud commentedComment #3
klausiNot sure we should develop this against D8 since the registry will go away soon, see also #1541674: Remove the registry.
The approach looks sane and I can see that every call to module_enable() will trigger the parsing of all files, which is quite ridiculous.
Could you post some performance benchmarks for your use case? Would be interesting to see how much faster an installation of commerce kickstart or recruiter can be.
RTBC to me.
Comment #4
klausiTagging.
Comment #5
catchPlease post xdebug or xhprof output for the installer, or even
time drush si
would do if it's significantly different.Typo for subsequent.
The static flag then calling out to registry_get_parsed_files() meant I had to read the patch twice as well as digging out the function to see what was going on, wouldn't it be simpler to keep a static array of files that have already been parsed? If the answer is "no it's not" then I don't mind though.
NIce to see performance work happening on the installer again.
Comment #6
Damien Tournoud CreditAttribution: Damien Tournoud commentedA typical installation of Commerce Kickstart using
drush site-install commerce_kickstart --verbose
.Before:
After:
Attached the XHProf traces of the two Drush sessions.
Comment #7
Damien Tournoud CreditAttribution: Damien Tournoud commentedReroll with the typo spotted in #5 fixed.
It might, but I want to change the structure of this function as less as possible. Every change we make we risk breaking stuff, and there is no point in fine tuning a component that is going away anyway. This patch is purely self-contained and cannot possibly break anything.
Comment #9
amateescu CreditAttribution: amateescu commentedThe wrong patch was rerolled in #7, this one should be better :)
Comment #10
sunNice one. Good to go, I think.
Comment #11
sunoh, and this likely makes WebTestBase::preloadRegistry() obsolete...? (We can figure that out in a separate follow-up issue)
Comment #12
Anonymous (not verified) CreditAttribution: Anonymous commentedthis looks good to me. i tried to think of how this would break, but came up with nothing.
Comment #13
catchThanks. This was RTBC already so I've gone ahead and committed/pushed to 8.x, moving to 7.x for backport.
Comment #14
tim.plunkettRerolled.
Comment #15
amateescu CreditAttribution: amateescu commentedNice and clean reroll :)
Comment #16
David_Rothstein CreditAttribution: David_Rothstein commentedHas anyone tested this patch with Drush commands that download/update new code? For example:
I don't know much about the internals of how that command works, but if it's downloading new versions of a module and then trying to execute the updated code in the same request, it could run into problems with this patch, no?
Or actually even some of the simpler Drush commands also: If there's any Drush code that does "rebuild registry + download new code + rebuild registry again" in one request, that seems like this patch would leave the registry out of date until the next cache clear.
Besides the above, I suppose this is pretty safe.
Comment #17
webchickThat sounds like a "needs review."
Comment #18
pounardThis patch could be improved a bit: the static variable could be called
$first_call
for example, and aif ($first_call) {}
could surround the first 5 lines (requiring database stuff), which would avoid some unnecessaryrequire_once
and staticDatabase
object calls.Comment #19
mgifford#14: drupal-1470656-14.patch queued for re-testing.
Comment #20
manuelBS CreditAttribution: manuelBS commentedAny progress here? Patch seams to work very nice.
Comment #21
mgifford14: drupal-1470656-14.patch queued for re-testing.
Comment #22
killua99 CreditAttribution: killua99 commentedI'm working with this patch since .... I can't remember.
This patch has been installed in over more than 4 site (for me) with none problem at all.
My 4 sites are from smallers sites to biggers sites.
None problem at all.
Comment #23
manuelBS CreditAttribution: manuelBS commentedFor me this patch also works in 3 sites. Ready to commit I guess.
Comment #25
David_Rothstein CreditAttribution: David_Rothstein commentedLooks like that was caused by a testbot fluke, but #16 still needs an answer (and maybe #18 too).
Comment #26
killua99 CreditAttribution: killua99 commentedWhat about this approach. Looks like take in consideration comment #18 and the #16 maybe the pm-update is really sensitive command.
We can start test from this patch and so.
Comment #27
Fabianx CreditAttribution: Fabianx commentedI think this should have a $seen array, which is static but keyed by filename.
Because while modules could be enabled in the same request, it is very very unlikely that any class file changes dynamically.
Additionally to support updates however we likely should add a drupal_static instead and have it be cleared by the module system.
Then its _really_ safe.
Comment #28
stefan.r CreditAttribution: stefan.r commentedComment #30
joseph.olstadwe're still using patch 14
Comment #33
Fabianx CreditAttribution: Fabianx at Tag1 Consulting commentedComment #34
joseph.olstadThis is the exact same patch as patch 14 by @tim.plunkett, and exactly the same changes that were committed to D8 core a while back. The patch was originally for D7 , committed to D8 verbatim.
Need to re-trigger the testbot.
I can't see why this wouldn't pass tests, so lets trigger the testbot.
We've been using patch #14 for a long time now and on D7 7.50 as well as 7.44 and previous releases.
From my review of the patch, it is identical to what was committed (see above commits) into D8 registry.inc
renamed the patch.
Comment #35
joseph.olstadComment #37
joseph.olstadRTBC.
Jenkins tests were re-run, php 5.6 passes on second test
failures were just glitches by the testbot.
Comment #38
Fabianx CreditAttribution: Fabianx at Tag1 Consulting commentedI am still of the opinion that we should use $seen array instead of a global variable. It might be a little more memory needed, but it definitely is safer.
Also #16 still needs an answer.
Overall we might be able to get away with just checking timestamps instead of the whole hash.
Comment #39
joseph.olstad@Fabianx , I noticed this was already committed to 8.0.x, 8.1.x , 8.2.x, 8.3.x and 9.x without the $seen array.
Commit 42e17a1 on 9.x, 8.1.x, 8.0.x, 8.2.x, 8.3.x
by catch
As far as I understand the fact that this was already committed 'as is' / 'verbatim' seems logical to me that the next step would be to commit it to D7.
Just trying to figure this out where this fits with the backport policy "if there are good reasons to work on a fix independently (for example divergence between the 7.x and 8.x code bases or a different approach being needed), then the 7.x issue may be worked on and fixed independently."
In this situation the patch was written for D7 and the reroll was a verbatim no code change for D8 and was committed to D8. I'm of the opinion that a new child issue should be created for the new $seen array idea. Otherwise how many years will it take to get this into D7? D7 is going to be around for quite a few more years and it'd be nice to get performance fixes in like this so that the 1 million or so installations can enjoy improved performance a few years before D7 support is dropped.
Comment #40
Fabianx CreditAttribution: Fabianx at Tag1 Consulting commented#39: The backport policy does not apply to code however that was removed from Drupal 8 again after commit, which is true for all registry related things. Therefore we can (and must) treat it as a new patch.
I am not opposed to the patch, but also don't want to break lots of installations in certain cases, which is why answering #16 is important.
Comment #41
joseph.olstadOh wow, comment #31 caused me some confusion, it says 27 days ago, however the commit it'self mentioned in the comment was actually done in 2012 .
It appeared to me like commit was recently made 27 days ago but it was done 4 years ago...
Wow, took almost 4 years for that commit message to show up.
This caused me a bit of confusion.
Comment #42
Fabianx CreditAttribution: Fabianx at Tag1 Consulting commented#41: No, the commit message is from when the 8.x-3.x branch was opened, which makes it a new commit on that branch.
But the functionality has been reworked in Drupal 8 by removing the registry in favor of composers autoloader.
Therefore the code that was committed is no longer present even though it had not been reverted.
Introducing that code now in D7 has such a different risk than usual backports, because the functionality was never really tested on D8 and especially not on the released version.
Therefore in this case the backport policy does not apply.
Comment #43
joseph.olstad@fabianx , I consider this patch to be low-lying fruit, low risk, its on joelpittets list of patches for low risk, we've been using it for a long time on Drupal 7.38 , 7.39 , 7.40 , 7.44 and now 7.50 , it's greatly improved installation times which is important for our CI testing on travis and also for simplytest.me when our distro is spun up (wetkit distro) for demos.
We dropped our installation times by more than half (the bigger the distro install profile is the more noticeable improvement this patch makes ) by including this patch in the wetkit distro and included a few other of joelpittets favourites , some of which were brought into 7.50 , but I consider this patch to be easy pickings, easy commit credits to drupal core glory for maintainers and adds a lot of value to Drupal , we have millions invested in the development of the wetkit distribution which is still used by many of our government clients and is still being implemented for some new government clients. Although we have less than 100 reported installs of our distribution it is installed by really big clients that have deep pockets and big budgets. One of those sites has over one quarter million pieces of content. There is also a province who's entire site is using a version of this distro.
While currently my main client (a very high profile client) does not use a current version of this distro we do use this patch.
The success of Drupal 8 is depends on the success of Drupal 7 , holding back on improving Drupal 7 may not necessarily convert directly into new Drupal 8 installations and might actually do the opposite and sour developer relations.
Comment #44
Fabianx CreditAttribution: Fabianx at Tag1 Consulting commented#43: Thanks for giving feedback that it works for you.
However #16 is still not answered.
And it needs to be first.
Comment #45
joseph.olstadRTBC++
To answer @david_rothsteins question about drush use case in comment #16
Using this patch did the following update to the entity API module using drush pm-update , WORKS GREAT
drush pm-update entity
Finished performing updates. [ok]
Checked the system table, everything looks as expected, although there were no schema changes between 1.6 and 1.7 so I tried another module, next up: custom_search
Code updates will be made to the following projects: Custom Search [custom_search-7.x-1.20]
Custom_search 7104 Store vids - machine names association.
Before upgrading the schema record in the system table was at 7103
After , it is 7104 (AS EXPECTED) (see query and result)
RTBC++
Comment #46
Fabianx CreditAttribution: Fabianx at Tag1 Consulting commented#45: For a true test case however you need e.g. a renamed class within the same file as that is what we are testing here.
Comment #47
joseph.olstad@fabianx, how about your $seen array idea in comment #27 , do you have an example of where this has been done in the past? I guess I could search through the core code.
Actually a better drush pm-update test would be an update of many modules rather than the one-at-a-time that I did.
Comment #48
joseph.olstadwhile my tests all passed, and this patch does really make a difference on installation speed, I would have to do a LOT more testing and debugging and searching to really be able to fully answer these questions brought up and investigate the $seen array suggestion.
Comment #51
joseph.olstadOk, if its good enough for 8.4.x , it should be good enough for 7.x. We're still using this patch and have been for ages.
Passes all testing, what more can you ask for?
Comment #52
dsutter CreditAttribution: dsutter as a volunteer commentedRTBC+ patch #34
Comment #53
gdaw CreditAttribution: gdaw as a volunteer commentedPatch 34 RTBC +1,
let's get this in for D7 !
Comment #54
joseph.olstadstill using this with 7.56
it's been committed to D8 , there's nothing left to review.
Do it like they do on the discovery channel.
Comment #56
joseph.olstadreturning status to RTBC.
I triggered some re-tests on 7.57
except one, all of the tests I triggered passed, including php 7.1
It appears that the php 7 fail was a glitch in the test bot. Retriggering php 7 mysql test
Comment #57
PolI'm testing this patch on my local setup since a small week and I haven't found any issue, and indeed, it's faster.
I'm all for it in D7.
Comment #58
PolComment #59
joseph.olstadtests all pass, any fails were due to a temporary glitch/config regression in the jenkins / drupal.org testbot automation which has since been fixed.
Yet another win here.
Comment #60
joseph.olstadBumping to 7.61, this didn't make it into 7.60
Comment #61
joseph.olstadComment #62
joseph.olstadComment #63
PolUpdating credits.
Comment #64
Fabianx CreditAttribution: Fabianx at Tag1 Consulting commentedI am okay to include this as is, because Drupal 8 has the same patch, so deviating from it does not seem wise.
Comment #67
Fabianx CreditAttribution: Fabianx at Tag1 Consulting commentedUnfortunately after commit I realised that we break the lookup cache with this.
Proposed change:
Here we should add the file to an $unchanged_files array.
Later we should combine the files with the unchanged files again.
and do the lookup then.
Unrelated, but small performance gain: Use isset() instead of in_array($file, array_keys()) for the lookup in $files.
The main thing this skips is to call _registry_parse_files() on all $files again, but the lookup cache should not break as a result of that.
--
Unrelated follow-up:
module_implements('', FALSE, TRUE);
_registry_check_code(REGISTRY_RESET_LOOKUP_CACHE);
should be _outside_ of the transaction not inside. Can someone open a follow-up please?
Comment #68
PolUpdated patch with @fabian 's comment.
Comment #69
PolIssue with moving out some calls out of the try/catch block has been created: #3028364: Update function _registry_update() and move module_implements() and _registry_check_code() calls out of the try/catch.
Comment #70
PolComment #71
Fabianx CreditAttribution: Fabianx at Tag1 Consulting commentedBack to RTBC, the reason this is safe and #16 is no problem is because how this works:
- We just do not check the hash of the files again -- new files (from newly enabled modules) will be picked up and hashed correctly, as they cannot be present in registry_get_parsed_files(), which is what $files is compared against
- Existing files would only be affected _if_ the hash changed during the request, which is the performance gain that is used here
- Old files are deleted correctly -- by the else block -- as they are present in registry_get_parsed_files(), but not in $files [unchanged case]
Comment #73
Fabianx CreditAttribution: Fabianx at Tag1 Consulting commentedCommitted and pushed to 7.x - Thanks all!
Comment #74
joseph.olstadThe cheer leading squad has arrived!
backflip
Comment #75
PolLOL ! :-)