If locale finds a new string, it immediately clears the locale cache so the next request will rebuild it and cache the fact that the string has been found (with or without a translation).
If you have several strings being found on a page, this can lead to the locale cache being cleared over and over. This is reasonably expensive with memcache at the moment since it does several operations during a wildcard clear, DELETE FROM {cache} WHERE cid LIKE 'foo%'; isn't free either.
Instead of doing it every time, we can just record the fact that a string has been found (in $conf or drupal_static() or somewhere), then add a hook_exit() to locale module to clear the cache once for the end of the request.
No patch yet, I came across this issue with the i18nstrings module, but it looks like the code in there was adapted from core. #1080954: Clear textgroup caches once per request is that issue (with a D6 patch).
Comment | File | Size | Author |
---|---|---|---|
#66 | locale-lookup-1080964-63.patch | 6.51 KB | DamienMcKenna |
#63 | locale-lookup-1080964-63.patch | 6.51 KB | Berdir |
#63 | locale-lookup-1080964-63-interdiff.txt | 1.25 KB | Berdir |
#60 | drupal-1080964-60.patch | 6.4 KB | tim.plunkett |
#58 | drupal-1080964-locale-58-d7-do-not-test.patch | 6.4 KB | disasm |
Comments
Comment #1
catchHere's a patch.
Instead of clearing the cache every time we find a string, it collects a list of a newly found cacheable strings in the locale static. Then in locale_exit(), if any new strings are found, adds these to the existing cached strings that are already there.
I'm not set up to profile this , but it should mean requests look like:
instead of
Didn't run tests locally so see what the bot says.
Comment #2
catchFix that notice.
Comment #4
Steven Jones CreditAttribution: Steven Jones commentedLooks to me like that foreach is just iterating over the arrays incorrectly. Patch attached that should fix that.
I've not reviewed if this patch actually works mind you, are their tests for that already?
Powered by Dreditor.
Comment #5
Steven Jones CreditAttribution: Steven Jones commentedMinor change to use a
$cid
variable we went to the trouble of creating.Comment #7
Steven Jones CreditAttribution: Steven Jones commented#5: locale_1080964.patch queued for re-testing.
Comment #8
Steven Jones CreditAttribution: Steven Jones commentedLooks like pirf broke testing that patch.
Comment #9
Steven Jones CreditAttribution: Steven Jones commentedLet's try this again shall we testbot?
Comment #10
Steven Jones CreditAttribution: Steven Jones commentedIf you follow the link through then the tests for #9 are actually passing. Testbot is a little fragile it would seem.
Comment #11
drewish CreditAttribution: drewish commentedThis comment doesn't make much sense:
Comment #12
catchThat's true, but it's the same as the current comment minus what felt like a redundant sentence, I wasn't sure what it said so didn't try to rewrite it completely, but very open to suggestions :)
Comment #13
drewish CreditAttribution: drewish commentedI cleaned up some of the comments.
Does it really make sense to write this out at the end of every session? Why not just use register_shutdown_function() to set the callback when we notice it's changed?
Comment #14
drewish CreditAttribution: drewish commentedI'm also now concerned that the write out needs to use the same lock that we use when doing the initial cache loading to avoid race conditions.
Comment #15
DanPir CreditAttribution: DanPir commentedsubscribe
Comment #16
DanPir CreditAttribution: DanPir commentedsubscribe
Comment #17
catchI like using register_shutdown_function for this, there are other places this pattern is used in core that could use the same thing.
Also added the lock - if we fail to acquire the lock, we just don't write to the cache - the next request can do that.
Haven't run tests.
Comment #18
catchTags.
Comment #19
catchThe 8.x patch should apply to 7.x, here's one for D6 as well. I should hopefully be able to test the D6 patch on a client code base in the next day or so.
Comment #20
catchHere's before and after screenshots from Drupal 6. You can see 246 calls to cache_clear_all() (this was done on the french version of the front page of a client site, after running TRUNCATE locales_source), compared to zero with the patch.
The way I'm profiling doesn't pick up the _locale_write_cache() but this is just a cache_get(), foreach and cache_set() so shouldn't be much.
Comment #21
catchFixes a bug in the D6 patch.
Comment #22
catchumm, both screenshots in #20 were 'before', here's the actual after.
Comment #23
catch#17: locale_1080964.patch queued for re-testing.
Comment #24
catchAdding DrupalCacheArray tag, this is worth looking at for converting to that.
Comment #25
catchHere's a DrupalCacheArray version.
Current situation in 8.x.
- each time you upgrade core to a minor release, each translated string on the site that is found will trigger a clear of the locale cache in that language, if you have 15,000 rows in {locales_source} (I'm looking at a 6.x sites with 18,000), that's 15,000 cache clears assuming they're all found at some point.
- each time the cache is cleared, the next request will get a cache miss and acquire a lock, then front load the cache with all previously found strings under a certain length. All other requests in that language until the cache item is set / lock released will need to individually query each string, however they're all capable of clearing the cache again if they find new strings.
- strings are cached based on current usage, and also whether the string is shorter than 75 chars to save memory.
With the patch:
- finding strings in new versions will not trigger a cache clear, it'll just trigger a locale cache write at the end of the request
- there's a write lock to prevent the criss/cross issue of different paths overwriting the cache with different sets of strings built into CacheArray. However the locking operates differently from current 8.x - with CacheArray the first request to try to cache some strings wins, rather than the first request to get a cache miss.
- all strings get cached regardless of length. This would mean a larger cache item when lots of strings have been found. However, the patch also segments the cache by system path root (i.e. 'node', 'admin', 'user'), so for example strings that are only ever found on admin pages won't get into the cache for other path roots and vice versa.
One drawback at the moment is that CacheArray doesn't handle multi-dimensional arrays at all, so I'm currently creating a new one per-context - if there were lots of contexts on a page, then you could end up with lots of cache items. Currently we're barely using msgctxt in core so it's in practice not going to make a lot of difference, but it could if it starts to be used more widely. If that's a problem we'd likely need to go for a custom class that has a language in the constructor, then something like function get($context, $string); instead of CacheArray.
beejeebus suggested caching only the keys in irc (i.e. the lookup string, not the translation), then we could the translations in __construct() or somewhere with an IN query each request, with the idea we'd not clear the cache when saving new translations at all.
If we want the IN() query to only look for strings that we know have translations though, then we still need to store whether there's a translation or not, which means that saving translated strings will still need to clear the cache or force an update, so I didn't implement this for now but that's probably worth discussing more.
Patch passed a couple of locale tests locally but I didn't run the whole suite or anything.
Comment #26
Gábor HojtsyYou can get a list of contexts used by all of contrib on localize.drupal.org, just go to a translation team's Translate tab and hit the context dropdown. There are not a lot of contexts (and not very consistent):
Comment #28
catchOK given that list I think it's OK to have a cache entry per context so we can use CacheArray as is as opposed to adding extra custom logic. I'll look at fixing up the test failures.
Comment #29
catchThis should fix at least some of those test failures.
Comment #31
catchOne more then.
Comment #32
catchHere's xhprof output comparing current HEAD vs. the patch. To get this I did the following:
- added a new language
- visited 8.localhost/fr
- cleared the cache
- upped the version in bootstrap.inc
The main change is that 69 calls to cache()-deletePrefix() are now gone from that page.
The other change would be that the next request to that page would have to rebuild the cache from scratch, whereas with the patch the cache has already been primed (since we remove the need to clear the cache each time a new string is found). Once the version stuff is settled down it's the usual trade-off: HEAD will rebuild the whole cache the first request that gets a miss, the patch does the usual CacheArray pattern of adding stuff as it finds it.
There's two after screenshots, that's because a lot of the code in resolveCacheMiss() used to be in locale() itself, so I've taken the function detail from there too for easier comparison.
Comment #33
Anonymous (not verified) CreditAttribution: Anonymous commentedhere's another approach, based on talking to catch in IRC.
Comment #35
Anonymous (not verified) CreditAttribution: Anonymous commentedfixed dumb dup locale function.
changed LocalCache to LocaleLookup.
implemented catch's suggestion to keep less data in memory in LocaleLookup.
Comment #37
Anonymous (not verified) CreditAttribution: Anonymous commentedfixed typos in LocaleLookup.
Comment #39
Anonymous (not verified) CreditAttribution: Anonymous commented#37: 1080964-37-locale-lookup.patch queued for re-testing.
Comment #41
Anonymous (not verified) CreditAttribution: Anonymous commentedtrying that again.
Comment #43
catchI like the lack of locking in beejeebus' patch and the fact it's not necessary to clear caches when adding/updating a translation, but I'm concerned about the trade-off from the extra cache sets (one for each string) as well as the cache()->get() plus cache()->getMultiple() compared to a single cache()->get(). Would be good to get some more opinions on this, but I'm starting to also think we should start a new issue to look at 1. introducing an interface which can cleanly handle both approaches 2. deciding on the preferred approach.
Here's an update based on #31 that switches from per path-root caching to per-role caching - we realised when discussing this in irc that per-path-root caching doesn't help if for example you have admin_menu installed showing the full admin tree on every page, whereas per-role should mean not too many potential combinations as well as anonymous and low level authenticated users getting a much smaller cache item than they would otherwise.
Comment #44
aspilicious CreditAttribution: aspilicious commentedTrailing whitespace
Same
15 days to next Drupal core point release.
Comment #46
catchThanks! Fixed the whitespace issues, I had vim set up to show those, but apparently no longer.
Also updated the use statement and some stale docs with the class rename.
Comment #47
Anonymous (not verified) CreditAttribution: Anonymous commented#46 looks good to go to me, and those benchmarks show it
i agree that my approach is more experimental, so it's probably worth moving forward with this and creating follow ups for that.
Comment #48
Gábor HojtsyPutting on D8MI sprint for tracking.
Comment #49
catchHere's a D6 backport. It can't be committed due to PHP 5.2 requirements, but I'm (very) slowly trying to get these into Pressflow.
Comment #50
aspilicious CreditAttribution: aspilicious commentedCheck your environment again, it doesn't show you trailing whitespaces anymore ;)
13 days to next Drupal core point release.
Comment #51
aspilicious CreditAttribution: aspilicious commentedDidn't ment to unset the rtbc of #46.
Comment #52
webchickI confess to not totally understanding everything in this patch, however it has sign-off from the right people and doesn't seem to do anything too contentious.
Committed and pushed to 8.x, with a small comment tweak per catch:
change
to:
Marking 7.x and patch (to be ported) as a result, and also since David will probably want a closer look at this for 7.x.
Comment #53
Gábor HojtsyShould we have a changelog entry for this?
Comment #54
Gábor HojtsyRemoving off Drupal 8 multilingual sprint issues. Work applies to Drupal 7 now.
Comment #55
nerdsteinI am adding in the backport to D7
Comment #56
nerdsteinThe patch did not apply cleanly. I manually went through the patch and made edits to locale.module. When running the D7 tests, it fails. I wanted to upload it for feedback.
Comment #57
aspilicious CreditAttribution: aspilicious commentedComment #58
disasm CreditAttribution: disasm commentedattached is a patch fixing the spacing in nerdstein's patch as well as fixing some issues between d7 and d6.
Comment #59
tim.plunkett.
Comment #60
tim.plunkettJust renamed, no credit.
Comment #62
mgiffordJust missing this one test from the looks of it:
Default context is working. Other locale.test 1046 LocaleImportFunctionalTest->testLanguageContext()
Comment #63
BerdirThis should work better, the patch did not correctly consider the context stuff. That should have resulted in more than just one fail :)
Comment #64
mgifford63: locale-lookup-1080964-63.patch queued for re-testing.
Comment #65
mgiffordPatch from @Berdir still applies nicely.
Comment #66
DamienMcKennaThe patch still applies. Re-uploading the same file so the new testbots can have a look.
Comment #71
Silicon.Valet CreditAttribution: Silicon.Valet commentedreviving the zombie thread, would be interested in seeing this merged. Hi @DamienMcKenna from the past!
Comment #72
Silicon.Valet CreditAttribution: Silicon.Valet at Red Hat commentedWe've been running this code in production without any issues now for 5 months. Patch also looks good, but I'm lost on some of the subtleties.
Comment #73
joseph.olstadA diamond in the rough here, reminder, lets make a seperate issue for D7 so that we do not have to ping the D8 people every time we update this ticket. Put a link to the new D7 issue here, and then set this issue back to 8.x and mark this issue as 'Fixed'
Comment #75
alexdmccabeFWIW: We're still using the patch from #66 in production.