Problem/Motivation
Problem: The static $custom_strings
variable cannot be manipulated once it's initialized in t()
Proposed resolution
Rationale:"The benefit of having this is that we can have translations injected in after the bootstrap has been initialized. To the user, this means that we can have contextual translations based on certain criteria that could be evaluated post-bootstrap...."(Rob Loach, #22)
Rationale: "...there are challenges in making the interface feel like it's been designed with the end-user in mind. Being able to manipulate strings conditionally will be a big asset in making Drupal more user friendly...."(chiddicks, #23)
Proposed Solutions:
- "keep the current static"
- "use the drupal_static_fast pattern"
- "remove the static caching altogether and just use variable_get()"
Final Solution:
Remove the static caching altogether since it's already handled in variable_get()
for us.
Submitted Patches:
Remaining tasks
- Commit the Drupal 8 patch at #32
- Move the issue to the Drupal 7 queue.
- Commit the Drupal 7 patch at #32
- Move the issue to the Drupal 6 queue.
- Evaluate whether this is needed for Drupal 6.
User interface changes
none
API changes
None
Original report by [Rob Loach]
The static $custom_strings
variable cannot be manipulated once it's initialized in t(). Making use of the drupal_static static cache system would allow such manipulations.
Comment | File | Size | Author |
---|---|---|---|
#40 | dynamic_strings_updated-1135950-40.patch | 3.21 KB | chiddicks |
#39 | dynamic_strings-1135950-39.patch | 1.28 KB | chiddicks |
#32 | 1135950-32.patch | 1.57 KB | RobLoach |
#32 | 1135950-32-d7.patch | 1.45 KB | RobLoach |
#32 | 1135950-32-d6.patch | 1.04 KB | RobLoach |
Comments
Comment #1
RobLoachComment #2
chiddicks CreditAttribution: chiddicks commentedI'll add my support for this patch by doing some review and performance testing. I notice that there is a "fast pattern" for drupal_static(); if there is one function that gets called a million times, I'm sure it's t(). I will do some testing of this patch, with and without the fast pattern as described here - http://api.drupal.org/api/drupal/includes--bootstrap.inc/function/drupal...
More to follow.
Comment #3
chiddicks CreditAttribution: chiddicks commentedI did some performance testing, and here are the results.
Setup: vanilla Drupal 8.x-dev install, latest nightly build. Clicked around until I found a page that called t() a lot.
Results:
Here's the code I used to test the fast pattern:
I'm going to look at my methodology tomorrow to see if there are any holes. It'd be good to have a maintainer weigh in. Even with the fast pattern, the performance hit is about 50% for this change. Especially with t() being called so often, we've got to decide if the performance hit it worth the added flexibility. It's also probably useful to note that the overall page-load time for the test ranged between 140-180ms, meaning that the variable retrieval in t() accounted for approximately 2% of execution on the page (at most) and 1% at least.
Comment #4
RobLoachDidn't realize you could do fast static caching like that! Fancy.
Comment #5
catchIt looks like the static just avoids calling variable_get(), it may be faster to remove it altogether in this case. Would you mind running the same test with that change?
Comment #6
chiddicks CreditAttribution: chiddicks commentedMy timing method was flawed; the Drupal timer functions round to the nearest hundredth of a millisecond each time the timer is stopped, which means that it is automatically incremented at least 0.01ms or so per function call. This is not accurate enough. I've just implemented a basic timer myself using a couple of global variables:
So, I've rerun the tests with the more accurate timer, and found it makes a big difference. Timing over pageload with 304 calls to t(), average of 5 pageloads.
Existing implementation:
2.2 ns/retrieval
Rob's latest patch, 1135950.patch:
2.7 ns/retrieval
Original patch, drupal_static_t.patch:
4.6 ns/retrieval
catch, your request is a little difficult because it would involve testing the performance of the whole t() function, because eliminating the static variable means that the data is no longer cached and the custom strings are retrieved from the database on each function call. The static retrieval time goes to zero, but I'm sure the overall function runtime for t() would jump considerably. I'm not sure I have time to do that test, and besides, I think it's reasonable to assume that caching a variable is going to be well worth your while over a couple of hundred function calls - especially since variable_get is querying the DB.
In conclusion, I'm really liking the looks of Rob's latest patch. My performance concerns were largely based on inaccurate information. While the latest patch does introduce a small performance hit, the flexibility trade-off is well worth it in my view.
Comment #7
catchvariable_get() does not retrieve from the database, it accesses a global. See http://api.drupal.org/api/drupal/includes--bootstrap.inc/function/variab...
Here's a quick untested patch using variable_get(), it should be fine to run the benchmarks again with the timer stop just before:
drupal_static_fast might be the best pattern here, those 300 function calls are worth saving, but since variable_get() is cheap too (not a database query), let's at least know how much of a gain it is. There is no question we want to use drupal_static_fast compared to regular drupal_static, agreed on that, but that's not the only option.
So for the benchmarks to be comprehensive, we need the following:
current 8.x (static)
variable_get() with no static
drupal_static
drupal_static fast.
Since the code execution path is slightly different in variable_get() compared to whether custom_strings is set, we should do these with and without something in $custom_strings.
Micro-optimizations like this are exactly where we need to balance performance vs. complexity. There is quite a lot of resistance to drupal_static_fast due to ugliness, if it turned out to be not needed here, or very marginal, then that will only add to the FUD - making it harder to use when it really is necessary. If it's a still a decent improvement compared to variable_get(), then we'll be able to point to this issue the next time this comes up.
Apologies for being a pain in the arse on this, I'm asking for this because this is not the first issue where:
- we had a static in a critical path function, that was optimizing something quite cheap (but called very often).
- someone has posted a patch to change this to drupal_static() for flexibility
- that patch would have completely undone the performance gain, in addition to adding more complex code to the function (compared to no static, or just normal static caching).
It is good that the drupal_static_fast issue got caught before this was committed (not the case in other issues unfortunately), but we are only guessing on whether there's still a gain. My own guess would be the variable_get() will come in slower than drupal_static_fast, but faster than drupal_static, but we should know if this is the case or not, and how far in each direction.
Comment #9
chiddicks CreditAttribution: chiddicks commented#7: custom_strings_variable_get.patch queued for re-testing.
Comment #10
chiddicks CreditAttribution: chiddicks commentedYou're not being a pain in the arse at all, you're completely right. I hadn't even looked at the variable_get() function itself, I just assumed it was a database query because of the variables table. So, here's the data:
(Average of 5 pageloads, same as before, but the timer runs from the beginning of t() to the immediately before string is returned)
existing implementation, static: 7.6ns/function call
drupal_static fast: 8.3ns/function call
drupal_static: 9.9ns/function call
no static, custom_strings_variable_get.patch: 9.1ns/function call
I'd love it if there was a standard for performance testing and that it was automatically done on testing patches. I feel my performance testing is probably fairly accurate, but a little haphazard. catch, I see you're already on that: #638078: Automated performance testing for core
Your guess was right about the relative performance of each method. Good to know that we've made progress from the original patch. I should add that the original reason for this patch was to have the ability to manipulate the custom strings per page-load, which is why using a static variable in some form is ideal.
Comment #11
catchThanks for doing those! I need to get back to the automated performance testing issue...
So looking at those numbers, the original static makes t() 16.5% faster, drupal_static makes it 9% slower, drupal_static_fast makes it 9% faster.
I just realised why this patch brought back memories, we had more or less this exact same issue two years ago :( http://drupal.org/node/523694#comment-1826970
One last thing, can you confirm whether you have the xdebug extension enabled or not? Nnot just profiling, whether it's enabled at all.
If you do, please do one last round of benchmarks with it disabled - xdebug will favour drupal_static_fast and static disproportionately since it adds overhead to every function call, but not so much for other stuff.
Comment #12
chiddicks CreditAttribution: chiddicks commentedThe xdebug extension was not enabled for any of the benchmarks.
Comment #13
catchAlright, back to RTBC, thanks again!
Comment #14
catchComment #15
RobLoachThe patch applies cleanly against both Drupal 7.x and 8.x.... And works.
This is catch's patch, as well as catch's patch against the 7.x branch.
Comment #16
catchSorry I meant to RTBC chiddick's patch from #4, not my own patch from #7.
To be clear, the choices are between:
- keep the current static, mark this won't fix - this would prevent tricky stuff from contrib (tricky stuff from contrib is the best reason for drupal_static()
- use the drupal_static_fast pattern - this is a tiny bit slower than a straight static but not much.
- remove the static caching altogether and just use variable_get() - this is twice as slow as using the drupal_static_fast pattern. variable_get() allows you to tweak $conf instead of drupal_static() trickery, so functionally much the same. I just realised there would be a slight memory saving there, if you had a lot of custom strings.
Marking back to CNR since I'm now thinking just using variable_get() here might end up best, but that is my own patch so I shouldn't mark RTBC (even if by accident).
Comment #17
RobLoachcatch and I talked briefly on IRC and decided that the third option is best. We'll be removing the static caching altogether, as it's already cached globally in
$conf
and retrieved in thevariable_get()
call. This means the patch goes back to RTBC with the catch's patch at #15 (originally from #7).Thanks!
Comment #18
chiddicks CreditAttribution: chiddicks commentedSo this would allow a module to modify the global $conf during the page-load, have t() use some custom-er custom strings when rendering the page, and then have the regular custom strings reloaded by variable_initialize() function on the next page-load?
So I guess my question would be: since we know that there is a performance improvement in using a static variable, what's the disadvantage of using the drupal_static fast patch (#4)? Is it enough to warrant the performance regression of eliminating the static?
Comment #19
chiddicks CreditAttribution: chiddicks commentedUgh, form caching.
Comment #20
catch@chiddicks variable_get() is already doing very little, drupal_static_fast we need to keep for things that absolutely need static caching, but are also called so many times that normal drupal_static() is a performance hit - this one fails the first part of that.
Comment #21
webchickHm. What's being advocated here is taking the most commonly called function in all of Drupal and slowing it down. By 17%. What's not clear here is what the use case for doing that is. It better be good. :)
(Thanks for all the awesome benchmarks, chiddicks!)
Comment #22
RobLoachThe benefit of having this is that we can have translations injected in after the bootstrap has been initialized. To the user, this means that we can have contextual translations based on certain criteria that could be evaluated post-bootstrap. Using String Overrides and the Context module, for example, one could have
t('Create %nodetype')
translate to "Submit Review" on Review nodes, and then "Take Quiz" on Quiz nodes.Something that's usually hacked into the theme layer is now brought into the translation system.
Comment #23
chiddicks CreditAttribution: chiddicks commentedI must add my voice of support for this change. Drupal is a very powerful tool, but often there are challenges in making the interface feel like it's been designed with the end-user in mind. Being able to manipulate strings conditionally will be a big asset in making Drupal more user friendly. While it is a performance regression, it opens the door a great deal. In my opinion, it's a worthy trade-off.
Comment #24
RobLoach#15: 1135950.patch queued for re-testing.
Comment #25
RobLoachComment #25.0
solomojo CreditAttribution: solomojo commentedIssue Summary
Comment #26
RobLoachTesting bot!
Comment #26.0
RobLoachEdit Issue Summary
Comment #27
xjmSummary added by solomojo.
Comment #28
xjmRe: #26 -- I think you need to request the re-test with the link if the patch has been tested before.
Comment #29
xjm#15: 1135950.patch queued for re-testing.
Comment #30
xjmRerolled for core/.
Comment #31
xjmWeird... let's try that again.
Comment #32
RobLoachThe attached patches are for both Drupal 8 and 7. Works in both versions. Didn't change any of the code from xjm's above patch.
Not sure if this applies to Drupal 6 since there isn't a concept of contextual localization there, but here's a potential patch for it anyway. I'll update the issue summary.
Comment #32.0
RobLoachedit issue summary
Comment #33
Dries CreditAttribution: Dries commentedThe performance impact of this patch could be a bit of a problem as the function is called a lot.
Catch was very involved with this patch, and has looked at the performance numbers, so I'll leave it up to catch to commit this.
Comment #34
catchSounds like we should be adding the content type machine name as context to the t() call to allow this, then this could be done with straight translation rather than manipulating a global. Could we add a follow-up to discuss that?
It's very likely that the new config API is going to make this code irrelevant (either way), since I doubt it is going to allow runtime manipulation of configuration variables - that's a good reason to look at use cases like this and see if there's another way to do it.
I am not very concerned about the performance regression since it was a small optimization in the first place as part of a desperate attempt to get D7 performance under control, but also can't get excited about the 'feature' either, so need to ponder this a bit more.
Comment #35
RobLoachThe patch that I submitted was based on your code at #7. Looking at #6 for the performance numbers, would using the drupal_static fast pattern be a good compromise? I understand it would be fantastic to wait on the config API to drop in, but I don't see that coming in anytime shortly. Maybe just have this in for Drupal 7 then and wait for the config API for Drupal 8? It is a bug and currently requires a hack in Drupal core to get around.
The point of the specific example I provided was that it's a translation that does not have a context associated with it. There could be any context required with any given string, and it's impossible to know what kind of context we want out of it. We don't want to hack core to inject the desired context whenever we run into a contextual translation, so having the ability to change the
static
variable here is required.So think using the drupal_static fast pattern here is the way to go? It's almost twice as fast as the latest submitted patch, and is only 0.5 ns per 300 calls to the function.
Comment #36
catchI don't mean waiting for the context/config APIs, rather that this is unlikely to have arbitrary contextual configuration, so you won't be able to do this hack any more with $conf once it's in.
Given that, can we look at whether it's worth adding that context to the t() string, since that's the existing mechanism we have already for translating the same string in different ways based on context. I disagree it's 'impossible to know' what context is needed, that's why we have an API for this stuff at all. I'm not familiar with the criteria used for adding context to strings though so it could be a mis-use, but we should at least look into that before committing workarounds, and that would have the advantage of still existing post-CMI.
If I do commit this I'd rather do the straight variable_get() than drupal_static_fast(), I just think the use-case is fragile and I'm not convinced it's the only way to workaround this (at least with that example).
Comment #37
chiddicks CreditAttribution: chiddicks commentedBased on my limited reading about Config API and its purpose, it seems to me like there will need to be some thought put into how Strings will fit into that system. One of the things it is aiming to solve is the unnecessary loading of all variables on each page request, something that certainly applies to Strings. Perhaps it would be wise to talk about what an ideal solution to Strings storage and retrieval would be given the Config API and work backwards. We know there is a use-case for runtime String replacement, but we also know that this will be a tiny minority among String retrievals. A sort of 'String Registry' might be useful, as a means of letting t() know whether it should go through the costly process of loading that String from a source other than the cache. It would function somewhat like the Theme Registry in that it would itself be static (and therefore fast) but would allow for a string to generated by a dynamic piece of code where desired.
I'm not sure how much relevance it has in D7 given that Config API is a D8 thing, but talking about how we might improve this system overall might yield a more favourable solution here.
Information about Config API: http://groups.drupal.org/node/155559
Comment #38
sunThis could use a review from Gabor.
Comment #39
chiddicks CreditAttribution: chiddicks commentedOk, I'm back at it. I think we can have our cache and eat it to. So there is a string registry of sorts already, in the $conf['locale_custom_strings_language'] variables. As it happens, this is just an array where the key is the original string, and the value is the replacement string. We can simply add an optional array instead of the replacement value directly, wherein a type and function name can be specified rather than just the replacement string. This way, at runtime, a function can be called to specify the string rather than being locked into a static string throughout the pageload. It can also be backwards compatible by adding in a check to determine if the replacement value is a string or an array. For example, something a developer would add to the settings.php:
Modifying the original statement in t():
To accommodate the dynamic option:
Basically, I think this gives us the option to specify strings at runtime without any performance regression at all. See, it only performs the additional calls if the string has been "registered" in the custom_strings array, a small minority of strings. It provides modules with the ability to modify the string at runtime, given that prior to runtime it registers the custom string in $conf. It can then manipulate the string based on any number of criteria, for example, a variable value:
I ran some performance test based on what I was doing previously in this thread, but the difference in average retrieval value is negligible, as you might anticipate given that the caching is still happening as before.
Comment #40
chiddicks CreditAttribution: chiddicks commentedNew patch (not cumulative) with some updates to bring it in line with Drupal standards. Also adding changes to default.settings.php to explain changes and provide examples.
Comment #41
RobLoachAlthough I like the idea, I fear that we'll run into the same problem with catch's performance concerns above. If we just expose the static $custom_strings variable to contrib, then we could handle all of thing externally without having to check is_array() on each custom string.
-10 days to next Drupal core point release.
Comment #42
chiddicks CreditAttribution: chiddicks commentedThe is_array call is purely for backwards compatibility. If that was deemed unimportant for D8, it could be reduced further, but would probably need to be included in any backports.
It's my position that this is a better balance between performance and function than not caching $custom_strings at all. Allowing a custom string to use a callback enables the functionality we'd like but doesn't require changes to the $custom_strings variable during runtime. It's potentially more compatible with the Config API coming down the line as well. All-in-all, it does not introduce a performance regression beyond potentially making custom string retrievals longer by virtue of calling an additional function, in some cases.
It would be great if catch could weigh in.
Comment #42.0
chiddicks CreditAttribution: chiddicks commentedUpdated
Comment #43
deggertsen CreditAttribution: deggertsen commentedShould the priority of this issue be changed?
Is the summary accurate in that all that needs to happen is for the patch to be applied or do we need a new patch? If so seems like this should be marked needs review.
Is catch still working on this at all or should this be reassigned?
Comment #44
mgiffordUnassigning stale issue. Hopefully someone else will pursue this.
Comment #53
catcht() no longer has static caching, and locale_custom_strings comes from $settings. Closing this as outdated. If there's a specific use-case the current architecture doesn't allow for, would suggest opening a new task.
Comment #54
catch