Problem
- Some info hooks require you to declare human-readable values with t(). Some others don't. Getting it wrong potentially leads to security issues.
- Info hook values for low-level stuff defined in code and similar things are always the same, but Drupal creates different, language-specific caches for them.
Goal
- Restore sanity.
- Drop language-specific caches. Only translate/localize strings on presentation.
Details
- #503550: Translated strings are cached in _info() hooks and #813370: Hook_menu_alter() not supported by potx made us (inconsistently) introduce two patterns:
- Info hooks that define things using t() for human-readable values
- Info hooks that define things NOT using t() for human-readable values
Nonsense? Nonsense.
- Some (most) info hooks require you to use t() now, and we cache (the same) information for each language, just translated.
- The mere reason for doing so is that we're not able to determine translatable/localizable strings through our potx template extractor otherwise.
- Lacking a proper mechanism for doing so, we helped ourselves by introducing weird magic for functions like
watchdog()
that requires you to put the log message string within the function arguments on the same line in order to be identified by potx' string extractor.
Proposal
- Introduce
tl()
, standing for "translate later/lazy", [bikeshed], to signify a string to be extracted for string translation. - Remove language-specific caches for info hooks, everywhere.
Comment | File | Size | Author |
---|---|---|---|
#31 | interdiff.txt | 659 bytes | almaudoh |
#31 | translation_no_op-31.patch | 3.1 KB | almaudoh |
#29 | plenty_t.zip | 658 bytes | almaudoh |
#29 | translation_no_op-29.patch | 2.45 KB | almaudoh |
#27 | translation_no_op.patch | 1.75 KB | almaudoh |
Comments
Comment #1
sunComment #2
BerdirRestoring sanity sounds nice :)
+1
Would we use that in hook_menu() as well?
This would also be helpful for contrib which have similar patterns (not just info hooks, but e.g. writing untranslated strings to the db and translate on display like menu/watchdog).
However, wouldn't we also need something like t_dear_potx_i_know_what_i_am_doing($variable) to prevent it from throwing those warnings that it currently does if you do t($variable)?
Comment #3
sunPerhaps we'd just remove the warning for t($variable) for D8?
Attached patch implements tl().
Of course, there will be a lot to convert after introducing this. I'd suggest to handle every single subsystem/API on its own in separate issues, since each of those patches will have to:
Comment #4
BerdirAnother thing that might make use of this are exception messages.
Those are currently a huge mess in core and there are open issues for that. Simply using tl() won't be enough for them because they might have arguments which need to be passed through (and either used when the exception is displayed and/or passed to watchdog() to save them).
Docblock in the patch looks nice, probably makes sense to add a reference to t() as well?
Comment #5
Gábor HojtsyOne of the main reasons this was not done before is performance. Why introduce a function call even for the most innocent thing as to return a string, which would slow down the page a bit. If we only use these with info hooks, it would of course only be called when the info caches are regenerated but it is a small hit nonetheless. Magic comments were suggested before, but those could be ugly too. Think /* t( */ and /* )t */ or something along those lines.
Also consider that we'd need to support format_plural() type singular/plural pairs as well. These are stored and retrieved in pair, so they not equal to t($singular) and t($plural). They were not equal in D7 and much less so in D8 that we changed their storage model too earlier. So with that in mind, you'd need format_plurall() too (note double l).
Finally, if this is all only needed for potx, and we can agree on a code convention that does not require API changes (no new function), it is easily introduced in any Drupal version. Potx is a contrib module and it can react to anything in the code so long as we code it in :) So we might not need a core change at all to make this happen. We might only need a coding style change.
(This is a recurring discussion, there are likely many places in issues and g.d.o where you can find previous instances of the same discussion. The above is my short recollection of what we discussed before in previous instances).
Comment #6
Gábor HojtsyAlso not sure why this would be a critical, especially that it can be solved outside core even.
Comment #7
sun@Gábor Hojtsy: I filed this as critical, because of the first and foremost problem mentioned in the OP:
As mentioned in #3, a tl() facility cannot be simply tacked on the current code. The actual API/module code needs to actively support and be aware of it. If hook implementations don't return translated strings anymore, then the strings need to be translated later on (on presentation, when translations are actually needed).
I don't consider performance to be an issue. It's only used in cases of info hook definitions (of which all are cached), and this is actually the typical no-op function being used in performance benchmarks. Even if you'd call this a million of times, like so:
then the total time is negligible:
On format_plural(), eek. ;) I'm not sure whether that use-case actually exists within the tl() scope. I'm only aware of one issue that wanted to add format_plural() support to watchdog() strings, but I think I marked that won't fix, because no one really cares whether log messages use proper plural language rules.
Comment #8
chx CreditAttribution: chx commentedDrop language-specific caches. Only translate/localize strings on presentation.
That's ... not practical? Good caching involves storing pieces of rendered HTML, you can't really disassemble that into the source strings to translate that on the fly.
Comment #9
Crell CreditAttribution: Crell commented-1 from me on using a noop function as a marker. That has performance implications as well as hard dependency problems. Can we not use annotations somehow?
How does anyone else do this? Or do they simply use keyed lookup files (something I know we considered but decided against for some reason...)?
Comment #10
xjmWhatever solution is implemented should then be used for annotations' Translation objects. (See #1683644: Use Annotations for plugin discovery).
Comment #11
fagoComing back to this one from #1853096: Integrate symfony validation violations with Drupal translation. Over there, we need a way to pass on a translated message such that afterwards we can output the translation + the untranslated message, e.g. to log it to watchdog. Problem being, our regular way to do this (pass on message template + args) does not cope with format_plural(), what we need absolutely for validation violation messages.
Thus, I was thinking about introducing a simple class holding translatable messages
Once we'd have such a simple value-object we could easily
- write potx support for that
- pass it on to functions that need to postpone translation like watchdog() or violation constraints.
- put it into our caches (that's why I'm posting it here ;)
Not sure whether having those objects in caches is a performance concern - that would obviously require benchmarks. But it would be a simple way to unify per-language caches and solve potx issues. Actually, I'm surprised I've not found any talk about a solution like this one here?
Comment #12
fagoI did a quick patch and performance test. I converted t() calls from system_data_type_info() and tested deserialization performance via that code snippet.
I was not able to get a measurable performance difference, e.g.
Translatable classes
1424.64ms for 10000 runs.
0.142464ms per run
with pre-translated strings
1439.51ms for 10000 runs.
0.143951ms per run
Simple patch attached. Note, that we could use that class from plugin annotations would result in a better DX.
Comment #13
Gábor Hojtsy@fago: the patch at #1813762: Introduce unified interfaces, use dependency injection for interface translation proposes to swap most of what is under t() out to be pluggable, so I think there are probably lots of opportunities to work together here, instead of making things pluggable on different levels in parallel, somehow figure out the best way :)
Comment #14
fagoThanks for the pointer I was not aware of that issue, it doesn't seem to address the problem of postponing translation though?
Comment #15
Gábor HojtsyNo it does not do that.
Comment #17
Crell CreditAttribution: Crell commentedAlso potentially relevant: #1843798: [meta] Refactor Render API to be OO
That issue isn't targeting translation per se, but the parallels are strong.
Comment #18
Gábor HojtsyDoes not seem to have any chance to happen in Drupal 8.
Comment #18.0
Gábor HojtsyUpdated issue summary.
Comment #19
Gábor HojtsyComment #20
catchThis would solve some outstanding critical issues in 8.x and it might even be a duplicate.
Comment #21
Gábor HojtsyIn one of our criticals at #2363099: Using translated field definition descriptions in entity schema results in different schema definitions, resulting in update.php changes this came up but then got discarded in favour of special casing these method calls for the potx parser.
Comment #22
stefan.r CreditAttribution: stefan.r commentedJust bumping this because this is coming up again in another issue and will surely come up in contrib as well.
@Crell what kind of annotations were you thinking of? If we duplicate the string in the annotation we'd need to police that it actually exists in the file because it's so easy for it to deviate.
Maybe something like this could work, where we just take the default value of the property?
Comment #23
almaudoh CreditAttribution: almaudoh commentedThis potentially would work for constraint violation messages and other cases where the translatable string is stored as a class property. But are there other cases where the translatable string is not stored in a class property?
Comment #24
stefan.r CreditAttribution: stefan.r commentedHmm maybe that's a limitation we can live with?
Otherwise maybe an annotation right above a regular variable assignment, do we know if any other PHP projects do this?
Comment #25
Gábor HojtsyComment #26
stefan.r CreditAttribution: stefan.r commentedA $module.messages.yml file would make sense, though I think @Crell mentioned in #9 that was decided against at some point, does anyone remember why?
Comment #27
almaudoh CreditAttribution: almaudoh commentedLooking through this again, I think a no-op version of t() or $this->t() would still be the best option. Perhaps a fourth optional argument to specify that it's a no-op. This would need profiling though for the extra
if
statement.Comment #29
almaudoh CreditAttribution: almaudoh commentedFixed the test fails.
Also did some basic profiling with the patch using ab with xdebug disabled. Disabled internal page cache module. Installed attached
plenty_t
module which callst()
10000 times.HEAD:
Run1:
Run2:
patch:
Run1:
Run2:
This indicates that there is really no overhead to the if-statement needed by the no_op approach.
I will look at using xhprof maybe later.
Comment #31
almaudoh CreditAttribution: almaudoh commentedComment #32
Crell CreditAttribution: Crell at Palantir.net commentedRe #26: I don't recall details. There was discussion between D8MI and CMI back in 2011 and early 2012 regarding using placeholders in code, with a lookup file for the actual strings. There are other projects that work that way, but not all of them. I *think* the reason it was decided against was that it was hinder DX for module authors, but I don't recall for sure. Perhaps Gabor or Heyrocker would remember.
For my part, my main concern is avoiding runtime dependencies. A no-op function call in a random class somewhere may have minimal performance impact, but it's still a function that must be defined at runtime and therefore hurts testing, code separation, etc. That's why, eg, we're not translating exception messages; it would mean any class that might throw an exception always needs the translation system, which is not acceptable (especially given the potentially numerous circular dependencies it could cause). Some way of denoting a translatable string that does not have a runtime requirement is fine; what that is, I don't care as much. :-)
Comment #33
stefan.r CreditAttribution: stefan.r commentedWell it'd be the n-th way of translating something but I don't think the DX hindrance is that horrible. If we can't do a no-op function likely anything we pick will be a DX hindrance, but this seems like an important enough feature to make us want to pick at least something.
Maybe annotations would be the least "controversial" bet? We could just parse any variable assignment in the line underneath the annotation:
Comment #45
smustgrave CreditAttribution: smustgrave at Mobomo commentedWondering if this is still relevant in 9.5 or 10
Comment #46
andypostyes, it's still relevant for potx future