After importing a translation file (.po) into my Drupal installation, I get the error mentioned in the subject (' translation strings were skipped because they contain disallowed HTML').

Could someone tell me which errors to look for in the translation file? Which tags or other elements are disallowed? Is every opening tag checked for a matching closing tag (<p> and </p>, for example)?

How can I find the strings containing disallowed HTML (if the problem goes beyond specific illegal tags)?

Comments

FrankT’s picture

Status: Active » Fixed

I found the cause by comparing an extracted translation file with the imported one: the file (and already the original source strings) contained <br/> tags instead of <br /> (missing space).

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.

FrankT’s picture

Version: 6.8 » 6.9
Status: Closed (fixed) » Active

The message has appeared again...

One of the strings causing problems is
Die Breite der Zeitachse (in den Einheiten em, px oder %), z.B. 600px oder 90%. Leer lassen um den Standardwert zu verwenden.
So I kept shortening the string until it was finally just Leer, which certainly is not disallowed HTML.

Does the error have to do with the source string The width of the timeline (in units of em, px or %), e.g. 600px or 90%. Leave blank to use default value.?

Please help me out!

ainigma32’s picture

Status: Active » Postponed (maintainer needs more info)

The function that tests if the translation string is safe can be found in /includes/locale.inc on line 828:

/**
 * Check that a string is safe to be added or imported as a translation.
 *
 * This test can be used to detect possibly bad translation strings. It should
 * not have any false positives. But it is only a test, not a transformation,
 * as it destroys valid HTML. We cannot reliably filter translation strings
 * on import because some strings are irreversibly corrupted. For example,
 * a &amp; in the translation would get encoded to &amp;amp; by filter_xss()
 * before being put in the database, and thus would be displayed incorrectly.
 *
 * The allowed tag list is like filter_xss_admin(), but omitting div and img as
 * not needed for translation and likely to cause layout issues (div) or a
 * possible attack vector (img).
 */
function locale_string_is_safe($string) {
  return decode_entities($string) == decode_entities(filter_xss($string, array('a', 'abbr', 'acronym', 'address', 'b', 'bdo', 'big', 'blockquote', 'br', 'caption', 'cite', 'code', 'col', 'colgroup', 'dd', 'del', 'dfn', 'dl', 'dt', 'em', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'hr', 'i', 'ins', 'kbd', 'li', 'ol', 'p', 'pre', 'q', 'samp', 'small', 'span', 'strong', 'sub', 'sup', 'table', 'tbody', 'td', 'tfoot', 'th', 'thead', 'tr', 'tt', 'ul', 'var')));
}

I tested the strings you provided (both the German and the English) in a PHP block, and both return true. So it looks like these are not the strings causing the problem.

Please post back how this works out for you.

- Arie

FrankT’s picture

Status: Postponed (maintainer needs more info) » Active

Thanks, the list may help me in future cases. Does the list you posted contain the allowed tags, so that any other tags (strings within < and >) are disallowed?

Currently I cannot reproduce the problem any more, although the file is still the same. To me it looks more like a temporary problem with importing translations.

ainigma32’s picture

Status: Active » Fixed

Yes, these are the allowed tags; all other tags are disallowed.
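For anyone who wants to pre-check strings outside Drupal, here is a rough Python sketch of that rule. The tag list is copied from the locale_string_is_safe() call quoted above; the helper name disallowed_tags and the regular expression are my own assumptions. It only looks at tag names and is not a port of filter_xss(), so a string that passes here can still be rejected on import.

```python
import re

# Tag names copied from the locale_string_is_safe() call quoted in this thread.
ALLOWED_TAGS = set(
    "a abbr acronym address b bdo big blockquote br caption cite code col "
    "colgroup dd del dfn dl dt em h1 h2 h3 h4 h5 h6 hr i ins kbd li ol p pre "
    "q samp small span strong sub sup table tbody td tfoot th thead tr tt ul "
    "var".split()
)

# Matches the name of an opening or closing tag, e.g. "div" in "<div>" or "</div>".
TAG_NAME = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)")


def disallowed_tags(text):
    """Return the sorted tag names in text that are not in ALLOWED_TAGS.

    Hypothetical helper: it approximates the allowed-tag rule only and does
    not reproduce the other checks that filter_xss() performs.
    """
    found = {match.group(1).lower() for match in TAG_NAME.finditer(text)}
    return sorted(found - ALLOWED_TAGS)
```

Calling disallowed_tags() on a translation returns an empty list when every tag name is on the list, and the offending names otherwise.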

Since you can't reproduce the problem I'm setting this to fixed for now.

Feel free to reopen if the problem occurs again.

- Arie

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

xamanu’s picture

Just in case somebody stumbles upon this:

I can reproduce this error with "iso-8859-1" encoded .po files that contain German umlauts (Ä, Ü, Ö). Just changed to UTF-8 encoding and everything gets nicely imported.
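If no editor with an encoding switch is at hand, the conversion can be sketched in a few lines of Python. This assumes the file really is ISO-8859-1 when it is not valid UTF-8; the function name to_utf8 is made up for illustration. It also rewrites the first charset= declaration it finds, on the assumption that this is the Content-Type line of the .po header.

```python
import re


def to_utf8(raw, source_encoding="iso-8859-1"):
    """Return the bytes of a .po file re-encoded as UTF-8.

    Bytes that already decode as UTF-8 are returned unchanged. Otherwise the
    data is assumed to be in source_encoding (an assumption, not detected).
    """
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        pass
    text = raw.decode(source_encoding)
    # Update the first charset declaration so the header matches the new encoding.
    text = re.sub(r"(charset=)[\w-]+", r"\g<1>UTF-8", text, count=1)
    return text.encode("utf-8")
```

Usage would be along the lines of reading the file in binary mode, passing the bytes through to_utf8(), and writing the result to a new file before importing it.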

dutchound’s picture

Just changed to UTF-8 encoding and everything gets nicely imported.

That comment just saved my posterior from trouble.

Before, it said: 1811 translation strings were skipped... disallowed HTML.

After, it said: 5 translation strings were skipped

Thanks!

JimP’s picture

I encountered this problem when importing the French 6.15 translation file, with 1650 or so strings skipped. I am running under Windows. Changing the file's carriage returns to UNIX format fixed this problem for me; only 4 strings are problematic now.

ledut’s picture

To JimP:

I have run into the same issue.
How do you change the file's carriage returns to UNIX format?

EDIT: Solved. On Windows, use the PsPad freeware (http://www.pspad.com/).
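For those who prefer a script over an editor, converting Windows line endings to UNIX ones can be sketched in Python as below. The function name is hypothetical, and the sketch works on raw bytes so the file's encoding is left untouched.

```python
def to_unix_line_endings(raw):
    """Replace Windows (CRLF) and old Mac (CR) line endings with UNIX (LF)."""
    # Handle CRLF first so that a CRLF pair does not turn into two newlines.
    return raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

Read the .po file in binary mode, pass the bytes through the function, and write the result back in binary mode so Python does not translate the line endings again.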

Anonymous’s picture

Just chiming in to say that when working with a de.po file I got "8 translation strings were skipped because they contain disallowed HTML.". The file was Unix format / UTF8. Not bad for a 12000+ line file, but it would still be helpful to know how to find those 8 strings without scanning through every line hoping they jump out at me ;)
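One way to narrow the search is to list only the translations that contain markup-like characters, together with their line numbers. The Python sketch below is a minimal hand-rolled reader, not a full .po parser: it leaves escape sequences as they are, and the function names are made up for illustration. It cannot tell which strings Drupal actually rejects; it only shrinks the list of candidates to inspect by hand.

```python
import re

# Captures the text between the first and last double quote on a line.
QUOTED = re.compile(r'"(.*)"\s*$')


def msgstr_entries(lines):
    """Yield (line_number, text) for each msgstr, joining continuation lines."""
    start, parts = None, []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if stripped.startswith("msgstr"):
            if start is not None:
                yield start, "".join(parts)
            start, parts = number, []
            match = QUOTED.search(stripped)
            if match:
                parts.append(match.group(1))
        elif stripped.startswith('"') and start is not None:
            match = QUOTED.search(stripped)
            if match:
                parts.append(match.group(1))
        elif start is not None:
            # Any other line (blank, comment, msgid) ends the current msgstr.
            yield start, "".join(parts)
            start, parts = None, []
    if start is not None:
        yield start, "".join(parts)


def candidates(lines):
    """Return the msgstr entries that contain '<' or '&' and deserve a closer look."""
    return [(n, text) for n, text in msgstr_entries(lines) if "<" in text or "&" in text]
```

Feeding it the lines of the .po file gives a short list of (line number, translation) pairs to check against the causes reported in this thread.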

Drupal 6.16

ash0815’s picture

I have the same problem. My translation contains HTML code and Drupal says

x translation strings were skipped because they contain disallowed HTML.

I deleted the HTML tags in my translation, and some strings are still not imported :-(

All formatting is lost, even in help texts where code appears :-(

Example from login_destination:
URL: (<b>IMPORTANT! If using a WYSWYG editor - ensure that you use its plain text mode! There is a link below the text box.</b> )

My translation without HTML tags :-(
URL: (WARNUNG! Wenn Sie einen WYSIWYG-Editor verwenden - stellen Sie sicher, dass dieser auf Reintext gestellt ist! Dort ist ein Link unterhalb des Textfeldes)

I can't use this part:
Es sollte entweder ein String-Wert oder ein Array wie in: return array('path' => 'node/add/video or alias', 'query' => 'param1=100&param2=200'); geben.

That's a pity :-(

uno’s picture

Version: 6.9 » 7.0-beta3

On admin/modules/install there is an original string that contains <br/> (missing space).
If you are having problems with the translation, make sure to write the tag as <br />.

Source code is:

To install a new module or theme, either enter the URL of an archive file you wish to install, or upload the archive file that you have downloaded. You can find <a href="@module_url">modules</a> and <a href="@theme_url">themes</a> at <a href="@drupal_org_url">http://drupal.org</a>.<br/>The following archive extensions are supported: %extensions.

zfreaker’s picture

In my case it also complained when I used the XHTML break: <br/>
When I switched to <br> it worked fine.

justindodge’s picture

@#15 - The <br/> tag will work XHTML style, but it needs a space before the closing character like so: <br />
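If a translation file contains many such tags, the fix can be applied in bulk. Below is a small Python sketch, with a made-up function name, that rewrites <br/> to <br /> and leaves <br> and existing <br /> tags alone. It reflects only what was reported in this thread, namely that the form without the space was rejected on import.

```python
import re

# Matches "<br/>" and variants with extra whitespace before the slash, e.g. "<br  />".
BR_TAG = re.compile(r"<br\s*/>")


def normalize_br(text):
    """Rewrite self-closing br tags in the form '<br />' (with one space)."""
    return BR_TAG.sub("<br />", text)
```

Plain <br> is not touched, since it was reported above to import without problems.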