Many of the translation po-files that can be downloaded from localize.drupal.org contain strings that cannot be imported due to the function locale_string_is_safe()
in locale.inc which sanitizes translation strings using a whitelist of allowed HTML tags. This results in error messages being displayed and logged during the importing of po-files. These error messages don't give any indication as to which specific strings could not be imported however. This is bad since it leaves the user completely in the dark.
The most recent patch in comment #20 does the following:
- it adds specific watchdog entries for each single string which could not be imported
- it amends the error messages to direct users who encounter these errors to the log at /admin/reports/dblog to find out what went wrong so they can act on this information.
Note: the most common reason for strings not being imported is not some actually nefarious HTML tags but stems from the usage of strings in core like t('<none>')
or t('<Hidden>')
. These strings get translated by translators to something like <keine>
in German, whereupon the import fails, since <keine>
is of course not valid HTML.
Comment | File | Size | Author |
---|---|---|---|
#55 | D7-log_failed_imports-1020842-50.patch | 3.48 KB | joseph.olstad |
#50 | clip.PNG | 19.71 KB | LarsKramer |
#23 | log-failed-imports-1020842-23.patch | 3.25 KB | reglogge |
#20 | log-failed-imports-1020842-20.patch | 3.25 KB | reglogge |
#14 | html-strings-1020842-14.patch | 5.17 KB | reglogge |
Comments
Comment #2
hansfn CreditAttribution: hansfn commentedWe have the same problem with "Hidden" in modules/field_ui/field_ui.admin.inc.
Comment #3
droplet CreditAttribution: droplet commentedsubscribe
Comment #4
plachInterface translation issues belong to the locale queue.
Comment #5
droplet CreditAttribution: droplet commentedduplicated: #1136294: Forbidden HTML in translation strings
for some reason, it worked for me now .. (welcome to reopen)
Comment #6
hansfn CreditAttribution: hansfn commentedAre you sure? Go to admin/config/regional/translate/translate, search for the string "<none>" and try to add a translation "<ingen>". I get the warning:
The submitted string contains disallowed HTML: <ingen>
Comment #7
droplet CreditAttribution: droplet commented@hansfn,
oh..your right.
I found that it works for non-latin language (could be security problem ?)
<にほんご>
<無>
Comment #11
reglogge CreditAttribution: reglogge commentedI guess this should be fixed in Drupal 8 first and then possibly backported. After all strings are frozen for Drupal 7.
The error message and failed import of translated strings which contain e.g.
<keine>
as a translation of<none>
stem from the functionlocale_string_is_safe()
in locale.inc which sanitizes translation strings using a whitelist of allowed HTML tags.<none>
or<Hidden>
and<keine>
are of course not in this whitelist.Attached is a patch that changes all occurences of
t('<none>')
tot('(none)')
and the sole occurence oft('<Hidden>')
tot('Hidden')
(without brackets).To me that makes sense.
Sidenote:
<none>
can of course be translated, but the translation cannot be in the format of an HTML-tag. Translating<none>
to e.g.< keine >
(with spaces between the brackets an the text) works just fine. It's very counterintuitive however.Comment #12
reglogge CreditAttribution: reglogge commentedAfter further consideration:
The previous title of this issue was misleading. Of course
<none>
can be translated: The problem is that translation strings containing text within angled brackets cannot be imported. So<none>
could be translated as anything BUT<keine>
. The error message "n translation strings were skipped because they contain disallowed HTML" is also not helpful at all since it doesn't say which specific strings are problematic.The attached patch builds on the previous one and does the following things:
Comment #14
reglogge CreditAttribution: reglogge commentedModified the failed test so that it checks for the correct message text.
Comment #16
Gábor HojtsyImprovement suggestions to error messages and logging of individual errors looks like a very good idea.
Comment #17
Gábor HojtsyMarked #994226: Translation strings with disallowed HTML and #514926: translated strings shouldn't be validated as duplicates.
Comment #18
reglogge CreditAttribution: reglogge commented#14: html-strings-1020842-14.patch queued for re-testing.
Comment #20
reglogge CreditAttribution: reglogge commentedRerolled the patch against current head with only the additional error and watchdog messages.
Replacing the
<none>
and similar strings should probably be done in a separate issue?Comment #21
reglogge CreditAttribution: reglogge commentedRenaming the issue to make the purpose clearer and crosstagging #1317884: Remove all instances of <none>, <Hidden> and <br/> from translatable strings because they lead to import errors where strings like
<none>
,<Hidden>
and<br/>
are corrected so unsuspecting translators don't directly translate them to strings that are not importable.Comment #22
Gábor HojtsyI still think this is a very good improvement. Two minor comments:
Dot missing at the end. Also, "HTML in the string" => "HTML in the translation"?. This would avoid repeating the same word plus the issue is with the translation not the source string, right?
Comment #23
reglogge CreditAttribution: reglogge commentedYou are of course right. New patch attached.
Comment #24
Gábor Hojtsy@reglogge: I think functionality-wise this looks good. I've asked @yoroy to look at the text to figure out if we can convey the same meaning with shorter text to reduce stuff that needs to be understood by users.
Comment #25
reglogge CreditAttribution: reglogge commented@Gábor: Cool.
In the meantime I attach a new patch with the same text but an additional check for the existence of dblog module. If it's not enabled, it makes no sense to direct users to the log, right? The test doesn't have to be modified since dblog is always enabled during tests.
Comment #26
yoroy CreditAttribution: yoroy commentedGabor asked for a review of the wording of the messages.
Comment #27
reglogge CreditAttribution: reglogge commentedThanks yoroy for going into this! My feedback:
First string:
The first rewrite is best, I think. It's nicely shorter while still giving *some* indication as to the nature of the problem. Just abbreviating it to 'HTML issues' in the second rewrite makes things more obscure, I think. *issues* can be anything...
Second string:
I wouldn't change it since the patch also doesn't change it and already translated strings are not broken that way.
Third string:
The second rewrite is wrong, because it's not the translation of %string that's been skipped, but the %string. The first rewrite is good, but I would modify it to:
"Import of string "%string" was skipped because of disallowed or malformed HTML."
to have the same wording as in the error message from the first string (just noticed this...)
Comment #28
yoroy CreditAttribution: yoroy commentedThanks. Lets ignore that second string indeed, I make a mistake looking at the patch there.
I'm not really sure if 'disallowed or malformed' really gives more useful info than the simpler 'issues' and they both are quite ugly words. 'Issues' *is* vague though and messages should be concrete so lets keep the 'disallowed or malformed'.
And yes, consistency is good here, so agreed on your suggested rewrite for that third string.
Comment #29
reglogge CreditAttribution: reglogge commentedHere is the patch with the wording agreed upon by yoroy.
On checking for the existence of dblog module: I realized that this is more complex. There are actually four distinct possibilities:
- dblog is active
- syslog is active
- dblog and syslog are active
- none is active
Since checking for all of these conditions and changing the resulting error message would result in several different error messages that then would also have to be translated, I think it's best to just omit the link to the logs in the error message.
So now the error message just states "See the log for details" and leaves it up to he user to find the correct log if one exists.
Comment #30
Gábor HojtsyWhat do other modules do for "see the log" type of references?
Comment #31
reglogge CreditAttribution: reglogge commentedAfter a quick search it seems to be a mixed bag. Outside of dblog module itself, there is only one instance of the path /admin/reports/dblog in Drupal core, and that's in update.php. Here there is a check for the existence of dblog but not for syslog:
I couldn't find any instances of 'admin/reports/dblog' in a list of contrib modules such as views, panels, features and about 40 more
/modules/update/update.authorize.inc just outputs 'See the log' without any checks:
Comment #32
yoroy CreditAttribution: yoroy commentedIf logs files are mentioned I would personally need a pointer to where I might find them or this would be annoying info without direction. If we don't do some of that, I'm not sure if we're being helpful :)
Comment #33
yoroy CreditAttribution: yoroy commentedAh, crossposted. Not sure then if this is something to fix in here or in a followup. Gabor?
Comment #34
Gábor HojtsyWell, I don't know how would we solve it in a followup either. Certainly having numerous different messages based on the logging used (especially since Drupal has pluggable logging), sounds like a pretty bad approach. However, we should somehow tell users that we logged the errors, even if we cannot link to the logs IMHO.
Comment #35
reglogge CreditAttribution: reglogge commented@Gabor: agreed regarding the different messages according to enabled logging mechanism.
The sentence 'See the log for details.' is still in the error messages in the current patch, however, so we give users a pointer in the right direction.
The main benefit to the patch IMHO is the fact that we log the individual failed strings so people can look them up and corerct them in their installs s well as submit actually importable translations to their language groups on l.d.o.
Comment #36
yoroy CreditAttribution: yoroy commentedQuick summary of chat with Gabor in IRC:
Paraphrasing Gabor:
Logging is pluggable, so we can't link to specific log files without doing checks.
In practice people probably have dblog, less likely syslog, (both core modules, where syslog has no UI) and pretty unlikely anything else.
My thoughts being:
For me, the unknowing site configurator it would be good to at least scope the message to make it clear that these are drupal specific logs, so that I don't have to hunt for apache or php logs or something.
We decided to make the trade-off as follows:
Check for dblog and link to it if enabled. In all other cases, we don't offer a link but still suggest to go look for Drupal log files.
Comment #37
Gábor HojtsyOr in short let's re-introduce the logic from #25 and mention logging in the second case too. (But keep the shorter messages that you worked out since :).
Comment #38
reglogge CreditAttribution: reglogge commentedOk, that sounds sensible.
Patch attached.
Comment #39
Gábor HojtsyCan we make the 'See the log' text a link instead of just log. Log looks like going to be hard to click :)
Comment #40
reglogge CreditAttribution: reglogge commentedRight. New patch attached.
I also replaced the url in the error message with a placeholder using the url()-function so people without clean urls enabled get a workable path to the logs.
Comment #41
Gábor HojtsyI think this is great now. It satisfies all concerns raised above both by @yoroy and I and it is an important improvement to make the exact error messages available for review. Using the logs for that is useful for cases when the import was automatic (l10n_update and soon in Drupal 8 core) and when you navigate away but want to resolve the issues anyway. I think this is good to go.
Comment #42
Gábor HojtsyIMHO should also be backported to D7 once lands in D8.
Comment #43
yoroy CreditAttribution: yoroy commentedI too feel we made the right trade-offs here, so agreed on rtbc.
Comment #44
catchHmm, it'd be good if we had a better way to link to the logs, but it seems worth the check here to make the actual message useful, and it's not adding any strange interdependencies since dblog is very low level. I opened #1325736: Create a consistent method to link to watchdog log reports to discuss it some more though in case we can come up with something.
Comment #45
reglogge CreditAttribution: reglogge commentedA straight backport to D7 would involve string changes. Specifically, it's about adding the sentence "Check the log for details" in the existing user-facing error message set when a string cannot be imported.
Alternative:
Just adding the new logging for individual failed strings would be adding a new string, not changing an existing one. We wouldn't have the message alerting the user, but still at least something in the logs.
What to do?
Comment #46
xjmTo clarify, it looks like this has been committed to 8.x in:
http://drupal.org/commitlog/commit/2/a9420816540375a1672d1236788f27b2ab9...
Comment #47
reglogge CreditAttribution: reglogge commentedSetting to needs work.
Comment #48
Gábor HojtsyAdd UI translation tag.
Comment #49
RodrigoBalest CreditAttribution: RodrigoBalest commentedIs there any patches on this for version 7.22?
I tried to use the patches above, but it seems that the file gettext.inc was replaced.
Comment #49.0
RodrigoBalest CreditAttribution: RodrigoBalest commentedUpdated issue summary.
Comment #50
LarsKramer CreditAttribution: LarsKramer commentedNot sure if this is the right place for the following. When adding Spanish language to a Drupal 8.1.3 installation, 1 string "was skipped because of disallowed or malformed HTML" (see attached screenshot of log message). I can see the string containing the error, but it would be more useful with some information about how to correct it. Is there some webpage where I can correct the error in the translation? Couldn't this error check be implemented already when the translator uploads the translation?
Comment #51
LarsKramer CreditAttribution: LarsKramer commentedOK, I now managed to locate the bad string on https://localize.drupal.org/ and make a suggestion. If this is the way to do it, could it perhaps be explained in the log message? Or even better: have some automatic checking on https://localize.drupal.org/ to prevent the error in the first place?
Comment #54
joseph.olstadsubscribing D7 backport
Comment #55
joseph.olstadFirst attempt