Many of the translation po-files that can be downloaded from localize.drupal.org contain strings that cannot be imported due to the function locale_string_is_safe() in locale.inc which sanitizes translation strings using a whitelist of allowed HTML tags. This results in error messages being displayed and logged during the importing of po-files. These error messages don't give any indication as to which specific strings could not be imported however. This is bad since it leaves the user completely in the dark.

The most recent patch in comment #20 does the following:

  • it adds specific watchdog entries for each single string which could not be imported
  • it amends the error messages to direct users who encounter these errors to the log at /admin/reports/dblog to find out what went wrong so they can act on this information.

Note: the most common reason for strings not being imported is not some actually nefarious HTML tags but stems from the usage of strings in core like t('<none>') or t('<Hidden>'). These strings get translated by translators to something like <keine> in German, whereupon the import fails, since <keine> is of course not valid HTML.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

hansfn’s picture

We have the same problem with "Hidden" in modules/field_ui/field_ui.admin.inc.

droplet’s picture

Version: 7.0 » 7.x-dev
Component: base system » language system

subscribe

plach’s picture

Component: language system » locale.module

Interface translation issues belong to the locale queue.

droplet’s picture

Status: Active » Closed (won't fix)

duplicated: #1136294: Forbidden HTML in translation strings

for some reason, it worked for me now .. (welcome to reopen)

hansfn’s picture

Status: Closed (won't fix) » Active

Are you sure? Go to admin/config/regional/translate/translate, search for the string "<none>" and try to add a translation "<ingen>". I get the warning:
The submitted string contains disallowed HTML: <ingen>

droplet’s picture

@hansfn,
oh..your right.

I found that it works for non-latin language (could be security problem ?)

<にほんご>
<無>

reglogge’s picture

Version: 7.x-dev » 8.x-dev
Status: Active » Needs review
FileSize
1.89 KB

I guess this should be fixed in Drupal 8 first and then possibly backported. After all strings are frozen for Drupal 7.

The error message and failed import of translated strings which contain e.g. <keine> as a translation of <none> stem from the function locale_string_is_safe() in locale.inc which sanitizes translation strings using a whitelist of allowed HTML tags. <none> or <Hidden> and <keine> are of course not in this whitelist.

Attached is a patch that changes all occurences of t('<none>') to t('(none)') and the sole occurence of t('<Hidden>') to t('Hidden') (without brackets).

  • I retained the brackets around 'none' (but changed them to rounded brackets) because it denotes the absence of any defined image styles or number formatters in image.module or number.module.
  • I removed the brackets around the term 'Hidden' because it logically is on the same level as 'Inline' or 'Above' when used to select the diplay options for field labels in field_ui.admin.inc.

To me that makes sense.

Sidenote: <none> can of course be translated, but the translation cannot be in the format of an HTML-tag. Translating <none> to e.g. < keine > (with spaces between the brackets an the text) works just fine. It's very counterintuitive however.

reglogge’s picture

Title: "<none>" can't be translated » Translation strings with text in angled brackets cannot be imported
FileSize
3.94 KB

After further consideration:

The previous title of this issue was misleading. Of course <none> can be translated: The problem is that translation strings containing text within angled brackets cannot be imported. So <none> could be translated as anything BUT <keine>. The error message "n translation strings were skipped because they contain disallowed HTML" is also not helpful at all since it doesn't say which specific strings are problematic.

The attached patch builds on the previous one and does the following things:

  • Adds a watchdog message for each specific string that was skipped to make it easier to identify defective translation strings.
  • Formats the message "n translation strings were skipped because they contain disallowed HTML" properly as an error message.
  • Rewording of the message to also mention the possibility of malformed HTML and adding of a link to /admin/reports/dblog

Status: Needs review » Needs work

The last submitted patch, html-strings-1020842-12.patch, failed testing.

reglogge’s picture

Status: Needs work » Needs review
FileSize
5.17 KB

Modified the failed test so that it checks for the correct message text.

Status: Needs review » Needs work

The last submitted patch, html-strings-1020842-14.patch, failed testing.

Gábor Hojtsy’s picture

Improvement suggestions to error messages and logging of individual errors looks like a very good idea.

Gábor Hojtsy’s picture

reglogge’s picture

Status: Needs work » Needs review

#14: html-strings-1020842-14.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, html-strings-1020842-14.patch, failed testing.

reglogge’s picture

Status: Needs work » Needs review
FileSize
3.25 KB

Rerolled the patch against current head with only the additional error and watchdog messages.

Replacing the <none> and similar strings should probably be done in a separate issue?

reglogge’s picture

Title: Translation strings with text in angled brackets cannot be imported » Error messages when importing translated strings fails are not helpful at all

Renaming the issue to make the purpose clearer and crosstagging #1317884: Remove all instances of <none>, <Hidden> and <br/> from translatable strings because they lead to import errors where strings like <none>, <Hidden> and <br/> are corrected so unsuspecting translators don't directly translate them to strings that are not importable.

Gábor Hojtsy’s picture

Status: Needs review » Needs work
Issue tags: +D8MI

I still think this is a very good improvement. Two minor comments:

String "%string" was not imported due to potentially dangerous or malformed HTML in the string

Dot missing at the end. Also, "HTML in the string" => "HTML in the translation"?. This would avoid repeating the same word plus the issue is with the translation not the source string, right?

reglogge’s picture

Status: Needs work » Needs review
FileSize
3.25 KB

You are of course right. New patch attached.

Gábor Hojtsy’s picture

@reglogge: I think functionality-wise this looks good. I've asked @yoroy to look at the text to figure out if we can convey the same meaning with shorter text to reduce stuff that needs to be understood by users.

reglogge’s picture

@Gábor: Cool.

In the meantime I attach a new patch with the same text but an additional check for the existence of dblog module. If it's not enabled, it makes no sense to direct users to the log, right? The test doesn't have to be modified since dblog is always enabled during tests.

yoroy’s picture

Gabor asked for a review of the wording of the messages.

## Current patch:  
One translation string was skipped because it contains disallowed or malformed HTML. See the log for details on the skipped string.

## First rewrite:  
A translation string was skipped because of disallowed or malformed HTML. See the log for details.

## Second rewrite:  
Skipped a translation string because of HTML issues. See the log for details.

-- 

## Current patch:  
The translation file was successfully imported.

## First rewrite:  
Translation file imported succesfully.

## Second rewrite:  
Translation imported succesfully.

## Third rewrite:
Translation imported.

--

## Current patch:  
String "%string" was not imported due to potentially dangerous or malformed HTML in the translation.

## First rewrite:  
Import of string "%string" was skipped because of potentially dangerous or malformed HTML.

## Second rewrite:  
Translation of "%string" was not imported because of HTML issues.
reglogge’s picture

Thanks yoroy for going into this! My feedback:

First string:
The first rewrite is best, I think. It's nicely shorter while still giving *some* indication as to the nature of the problem. Just abbreviating it to 'HTML issues' in the second rewrite makes things more obscure, I think. *issues* can be anything...

Second string:
I wouldn't change it since the patch also doesn't change it and already translated strings are not broken that way.

Third string:
The second rewrite is wrong, because it's not the translation of %string that's been skipped, but the %string. The first rewrite is good, but I would modify it to:
"Import of string "%string" was skipped because of disallowed or malformed HTML."
to have the same wording as in the error message from the first string (just noticed this...)

yoroy’s picture

Status: Needs review » Needs work

Thanks. Lets ignore that second string indeed, I make a mistake looking at the patch there.

I'm not really sure if 'disallowed or malformed' really gives more useful info than the simpler 'issues' and they both are quite ugly words. 'Issues' *is* vague though and messages should be concrete so lets keep the 'disallowed or malformed'.

And yes, consistency is good here, so agreed on your suggested rewrite for that third string.

reglogge’s picture

Status: Needs work » Needs review
FileSize
2.97 KB

Here is the patch with the wording agreed upon by yoroy.

On checking for the existence of dblog module: I realized that this is more complex. There are actually four distinct possibilities:
- dblog is active
- syslog is active
- dblog and syslog are active
- none is active
Since checking for all of these conditions and changing the resulting error message would result in several different error messages that then would also have to be translated, I think it's best to just omit the link to the logs in the error message.

So now the error message just states "See the log for details" and leaves it up to he user to find the correct log if one exists.

Gábor Hojtsy’s picture

What do other modules do for "see the log" type of references?

reglogge’s picture

After a quick search it seems to be a mixed bag. Outside of dblog module itself, there is only one instance of the path /admin/reports/dblog in Drupal core, and that's in update.php. Here there is a check for the existence of dblog but not for syslog:

function update_results_page() {
  drupal_set_title('Drupal database update');
  $links = update_helpful_links();

  update_task_list();
  // Report end result.
  if (module_exists('dblog')) {
    $log_message = ' All errors have been <a href="' . base_path() . '?q=admin/reports/dblog">logged</a>.';
  }
  else {
    $log_message = ' All errors have been logged.';
  }

I couldn't find any instances of 'admin/reports/dblog' in a list of contrib modules such as views, panels, features and about 40 more

/modules/update/update.authorize.inc just outputs 'See the log' without any checks:

function update_authorize_update_batch_finished($success, $results) {
  foreach ($results['log'] as $project => $messages) {
    if (!empty($messages['#abort'])) {
      $success = FALSE;
    }
  }
  $offline = variable_get('maintenance_mode', FALSE);
  if ($success) {
    // Now that the update completed, we need to clear the cache of available
    // update data and recompute our status, so prevent show bogus results.
    _update_authorize_clear_update_status();

    // Take the site out of maintenance mode if it was previously that way.
    if ($offline && isset($_SESSION['maintenance_mode']) && $_SESSION['maintenance_mode'] == FALSE) {
      variable_set('maintenance_mode', FALSE);
      $page_message = array(
        'message' => t('Update was completed successfully. Your site has been taken out of maintenance mode.'),
        'type' => 'status',
      );
    }
    else {
      $page_message = array(
        'message' => t('Update was completed successfully.'),
        'type' => 'status',
      );
    }
  }
  elseif (!$offline) {
    $page_message = array(
      'message' => t('Update failed! See the log below for more information.'),
      'type' => 'error',
    );
  }
  else {
    $page_message = array(
      'message' => t('Update failed! See the log below for more information. Your site is still in maintenance mode.'),
      'type' => 'error',
    );
  }
yoroy’s picture

If logs files are mentioned I would personally need a pointer to where I might find them or this would be annoying info without direction. If we don't do some of that, I'm not sure if we're being helpful :)

yoroy’s picture

Ah, crossposted. Not sure then if this is something to fix in here or in a followup. Gabor?

Gábor Hojtsy’s picture

Well, I don't know how would we solve it in a followup either. Certainly having numerous different messages based on the logging used (especially since Drupal has pluggable logging), sounds like a pretty bad approach. However, we should somehow tell users that we logged the errors, even if we cannot link to the logs IMHO.

reglogge’s picture

@Gabor: agreed regarding the different messages according to enabled logging mechanism.

The sentence 'See the log for details.' is still in the error messages in the current patch, however, so we give users a pointer in the right direction.

The main benefit to the patch IMHO is the fact that we log the individual failed strings so people can look them up and corerct them in their installs s well as submit actually importable translations to their language groups on l.d.o.

yoroy’s picture

Quick summary of chat with Gabor in IRC:

Paraphrasing Gabor:
Logging is pluggable, so we can't link to specific log files without doing checks.
In practice people probably have dblog, less likely syslog, (both core modules, where syslog has no UI) and pretty unlikely anything else.

My thoughts being:
For me, the unknowing site configurator it would be good to at least scope the message to make it clear that these are drupal specific logs, so that I don't have to hunt for apache or php logs or something.

We decided to make the trade-off as follows:

Check for dblog and link to it if enabled. In all other cases, we don't offer a link but still suggest to go look for Drupal log files.

Gábor Hojtsy’s picture

Or in short let's re-introduce the logic from #25 and mention logging in the second case too. (But keep the shorter messages that you worked out since :).

reglogge’s picture

Ok, that sounds sensible.

Patch attached.

Gábor Hojtsy’s picture

Status: Needs review » Needs work

Can we make the 'See the log' text a link instead of just log. Log looks like going to be hard to click :)

reglogge’s picture

Status: Needs work » Needs review
FileSize
3.44 KB

Right. New patch attached.

I also replaced the url in the error message with a placeholder using the url()-function so people without clean urls enabled get a workable path to the logs.

Gábor Hojtsy’s picture

Status: Needs review » Reviewed & tested by the community

I think this is great now. It satisfies all concerns raised above both by @yoroy and I and it is an important improvement to make the exact error messages available for review. Using the logs for that is useful for cases when the import was automatic (l10n_update and soon in Drupal 8 core) and when you navigate away but want to resolve the issues anyway. I think this is good to go.

Gábor Hojtsy’s picture

Issue tags: +Needs backport to D7

IMHO should also be backported to D7 once lands in D8.

yoroy’s picture

I too feel we made the right trade-offs here, so agreed on rtbc.

catch’s picture

Version: 8.x-dev » 7.x-dev
Status: Reviewed & tested by the community » Needs review

Hmm, it'd be good if we had a better way to link to the logs, but it seems worth the check here to make the actual message useful, and it's not adding any strange interdependencies since dblog is very low level. I opened #1325736: Create a consistent method to link to watchdog log reports to discuss it some more though in case we can come up with something.

reglogge’s picture

A straight backport to D7 would involve string changes. Specifically, it's about adding the sentence "Check the log for details" in the existing user-facing error message set when a string cannot be imported.

Alternative:
Just adding the new logging for individual failed strings would be adding a new string, not changing an existing one. We wouldn't have the message alerting the user, but still at least something in the logs.

What to do?

xjm’s picture

To clarify, it looks like this has been committed to 8.x in:
http://drupal.org/commitlog/commit/2/a9420816540375a1672d1236788f27b2ab9...

reglogge’s picture

Status: Needs review » Needs work

Setting to needs work.

Gábor Hojtsy’s picture

Issue tags: +language-ui

Add UI translation tag.

RodrigoBalest’s picture

Is there any patches on this for version 7.22?

I tried to use the patches above, but it seems that the file gettext.inc was replaced.

RodrigoBalest’s picture

Issue summary: View changes

Updated issue summary.

LarsKramer’s picture

Issue summary: View changes
FileSize
19.71 KB

Not sure if this is the right place for the following. When adding Spanish language to a Drupal 8.1.3 installation, 1 string "was skipped because of disallowed or malformed HTML" (see attached screenshot of log message). I can see the string containing the error, but it would be more useful with some information about how to correct it. Is there some webpage where I can correct the error in the translation? Couldn't this error check be implemented already when the translator uploads the translation?

LarsKramer’s picture

OK, I now managed to locate the bad string on https://localize.drupal.org/ and make a suggestion. If this is the way to do it, could it perhaps be explained in the log message? Or even better: have some automatic checking on https://localize.drupal.org/ to prevent the error in the first place?

  • catch committed a942081 on 8.3.x
    Issue #1020842 by reglogge, yoroy: Fixed Error messages when importing...

  • catch committed a942081 on 8.3.x
    Issue #1020842 by reglogge, yoroy: Fixed Error messages when importing...
joseph.olstad’s picture

subscribing D7 backport

joseph.olstad’s picture

First attempt

  • catch committed a942081 on 8.4.x
    Issue #1020842 by reglogge, yoroy: Fixed Error messages when importing...

  • catch committed a942081 on 8.4.x
    Issue #1020842 by reglogge, yoroy: Fixed Error messages when importing...

  • catch committed a942081 on 9.1.x
    Issue #1020842 by reglogge, yoroy: Fixed Error messages when importing...