When using i18n the system will produce a load of sql queries just to lookup translated text elements.
This is perfect. However this issue is about finding an optmium caching strategy to reduce query amount on complex sites - which is a performance improvement.
Currently i18nstrings not even using a standard drupal cache (cache_set, cache_get, ...). It simply uses a local static cache and the sql query to fetch data.
Several pluggable caching modules allow massive speedup in sql-less (much more performant key based) caching implementations. If none of these modules are available, each cache lookup will result in a cache select query. This situation wouldn't be what we want regarding implementation.
Additionally caches need to be flushed on every update.
Moving each single string into the cache will have a lot of buckets. This can speed down the caches (but might be perfectly OK) and leverates their memory requirements.
However, aggregating the cache to combined serialized objects (e.g. all translated elements per URL) would result in data duplication (and even more RAM usage).
So we're looking for a clean caching strategy to have very few high speed queries which is pluggable for the keybased non-sql lookups.
Any thoughts?
Comments
Comment #1
ducdebreme CreditAttribution: ducdebreme commentedThat's very good idea. We were suffering a lot from the bunch of sql queries generated from i18nstrings, because we have localized vocabularies with hundreds of terms. E.g. the generation of a form caused around 2500 sql queries.
So I have checked the function i18nstrings_cache() and I thought, it would be simple to add a simple cache. The static $strings is replaced by a global $i18nstring cache that is lazy-loaded from cache, populated when needed and stored into cache, if additional translations were added.
This patch reduced the number of queries from 2500 to 1000.
Remarks: we are using D6.19 / I18N 6.x-1.7
Comment #2
miro_dietiker@ducdebreme: Note that this solution creates one global growing cache element.
While this is limited in case of trivial programmed keybased strings, it is not when using i18n to translate entities / field contents - and will fail.
Thus PHP gets very heavy - and e.g. if you add many many modules (and even data) each PHP request cycle will consume additional MegaBytes of memory.
Instead of writing the cache again and again on each page load i'd much more create it initially as a cache then containing simply everything from i18n.
Much more i see here the need to implement a keybased cache that is based on smaller get_cache chunks and could thus rely on e.g. memcached / cacherouter and similar. DB load will disappear and scaling is better possible.
Comment #3
ppcc CreditAttribution: ppcc commentedsubscribing.
Comment #4
ducdebreme CreditAttribution: ducdebreme commentedYes, it's true, the cached translation lookup table is global and will increase in size. But please take a closer look to the implementation: the translation lookup table is kept in the cache and is only be beeing updated, if new translations were requested. If the object has not been changed for hours, it expires and gets recreated on the next page request.
I have chosen this solution to ensure that the cache gets notified of translation changes.
If you thing of breaking up the cache in smaller chunks, e.g. we can cache the translation lookup table per language, so instead of
we could implement it like
I don't think, it is a good idea to further reduce the chunk sizes, because not all sites do use cacherouter/memcache. If they use SQL-based cache instead, they won't benefit from the solution.
Comment #5
das-peter CreditAttribution: das-peter commentedHere's a fist attempt to introduce such caches. I expect the patch to fail, but let's see what happens ;)
Comment #7
das-peter CreditAttribution: das-peter commentedWell, that couldn't work. Updated patch.
Comment #9
das-peter CreditAttribution: das-peter commentedHmm, had to move around some parts of the caching.
The whole thing looks like it could use a refactoring anyway ;)
It's a bit a muddling through between i18n_string_object and i18n_string_textgroup_default.
Let's see how this works now.
Comment #11
das-peter CreditAttribution: das-peter commentedI've reworked the whole approach to avoid caching objects but only their payload.
So the cache added in
i18n_string_object
is the main cache.i18n_string_textgroup_default
maintains persistent caches of metadata to use the cache ofi18n_string_object
.I'm still trying to figure out what's the matter with the menu link tests.
Comment #13
das-peter CreditAttribution: das-peter commentedI've found the issue with the menu link tests and now all tests pass locally.
Let's see how it goes here.
Comment #14
das-peter CreditAttribution: das-peter commentedFound an issue during project release test-phase.
Updated patch attached.
Comment #15
Jose Reyero CreditAttribution: Jose Reyero commentedThough I like the idea of having some good caching for i18n strings, and the patch looks good, I don't think this is the better approach.
If you consider textgroups may have an unlimited number of strings -unlike locale strings-, just enforcing cache there is like having a cache for all taxonomy terms, all menu items, etc... I mean this may work for some use cases, may not work when you have too many of them.
Actually the i18n_string system is built with that option in mind, if you see different 'textgroup' classes can be used for different text groups, see i18n_string_textgroup() function -possibly it is my fault the 'class' parameter is not properly documented in the api.php file-. The idea behind that was having testgroup handler classes that implemented caching or specific performance improvements for different text groups.
So I think the way to to this is creating some
class i18n_string_textgroup_cached extends i18n_string_textgroup_default {}
and then use that for selected 'safe' text groups, like fields that are supposed to be a more limited set of strings, still allowing anyone to use that for any other text group, just by implementing hook_i18n_string_info_alter()
What I mean is cahing is cool, it will just not work for some sites so enforcing it for every text group is not ok.
Comment #16
das-peter CreditAttribution: das-peter commented@Jose Reyero Thanks, awesome to have feedback on this. Actually I started of with building a custom (project specific) wrapper class first. Similar to your
class i18n_string_textgroup_cached extends i18n_string_textgroup_default {}
. But then I thought that I really should provide an official patch and thus started modifying the original class.However, the current approach should work pretty well in a dedicated class. So I'll re-write that and post the patch later.
Comment #17
das-peter CreditAttribution: das-peter commentedHere we go. The attached patch introduces
i18n_string_textgroup_cached
. i18n_field uses the new caching class.I also started to write tests for the caching - we could test way more but it's a good start.
No interdiff this time due the changed approach and thus a lot of changes.
Comment #18
das-peter CreditAttribution: das-peter commentedFound an issue while testing the new approach.
Comment #19
das-peter CreditAttribution: das-peter commentedAnother fix for
i18n_string_textgroup_cached::multiple_cache_get()
.Comment #20
Jose Reyero CreditAttribution: Jose Reyero commentedPretty impressive one!
This looks great. Committed. Thanks.
Now wondering what is the best strategy for letting everybody enjoy the performance boost without breaking any existing site for which so much caching may be overwhelming..
Should we add a enable/disable option? Should it be per text group? Should we enable the full caching for new installs maybe having an option to disable it?
Comment #21
MXTHi guys, I'm here because my query logs shows me that there are TONS of "i18n_string_textgroup_default::load_translation" queries in my site, so I upgraded i18n to latest dev to test this commit in #20 but I'm not so expert in performance/caching issues so probably I'm missing something:
After i18n upgrade my queries passed approximately from 800 per page to 2120 (1920 queries generated by i18n_string_textgroup_default::load_translation)
What did I do wrong? I have to do some settings / resave after upgrading i18n ?
Thank you for helping me.
Comment #22
das-peter CreditAttribution: das-peter commented@MXT
The newly introduced caching mechanism is just used by
i18n_field
for now.And there shouldn't be anything necessary to make the caching working.
Atm. I can't provide any further information since I don't know your setup nor the other changes that were introduced by the update you made.
Comment #23
Dean Reilly CreditAttribution: Dean Reilly commentedI might be able to explain what's happening. The change introduced in #13 breaks caching (even static caching) of strings which don't have a translation. Such strings are stored as (bool)FALSE and thus empty() returns TRUE resulting in load_translation() being called and another database hit.
I've written a test which shows the issue and a patch to change the empty() back to !isset(). I was not able to recreate the issue which prompted the change in #13 so I assume that whatever was going on there has been fixed elsewhere.
Comment #26
Dean Reilly CreditAttribution: Dean Reilly commentedFailures seem unrelated to my change so moving back to review.
We should probably bump the variable dependency from 7.x-1.2 to 7.x-2.5 to fix all of the notices it's currently firing.
Comment #28
das-peter CreditAttribution: das-peter commentedThat looks absolutely reasonable and there are even new tests for it!
Awesome!
However, the exceptions make me wonder what is going wrong there :|
I'm not bold enough to set it to RTBC because of those exceptions and the fact that I just did a visual review.
Comment #29
Dean Reilly CreditAttribution: Dean Reilly commentedI think the exceptions are coming from it using the wrong version of the variable module. It looks like there might be a bug in the testbot. I've added to a report here #2249403: Testbot not picking up recommended version.
Comment #30
rfay23: i18n_strings-add-persistent-caching-776146-23.test-only.patch queued for re-testing.
Comment #31
rfay23: i18n_strings-add-persistent-caching-776146-23.patch queued for re-testing.
Comment #32
BerdirI did a re-test of the branch which is green again, so this should come back green as well I guess.
Comment #34
das-peter CreditAttribution: das-peter commentedOkay, now I'm bold enough to call that RTBC ;)
Comment #35
Jose Reyero CreditAttribution: Jose Reyero commentedThanks, we love new tests :-)
Committed.
Comment #37
Jose Reyero CreditAttribution: Jose Reyero commented@das-peter,
This one looks related to this patch, would you take a look at it please?
#2252221: Error on string translation