Follow-up to #2042143: [META] Fix unsafe core translatable strings

Problem/Motivation

The locale_string_is_safe function is used by the Locale module to ensure that translations are safe before saving them in the database. It does so by comparing an Xss::filter-ed version to the original version so nothing serious can happen.

The problem is that some translatable strings could include tokens being wrongly filtered by the Xss::filter method.
For example, the locale module contains the following string that is not translatable because of the token inside :
This page allows you to translate the user interface or modify existing translations. If you have installed your site initially in English, you must first add another language on the <a href="[site:url]/admin/config/regional/language">Languages page</a>, in order to use this page.

When filtered, <a href="[site:url]/admin/config/regional/language"> becomes <a href="url]/admin/config/regional/language"> so the locale_string_is_safe function fails.

Beta phase evaluation

Reference: https://www.drupal.org/core/beta-changes
Issue category Bug because it doesn't allow to translate a string if it has tokens in it.
Issue priority Not critical because it only affect the translation of strings.
Prioritized changes This is a prioritized change, because it's a bug.
The issue is tagged as security, as it's related to translation's security, but this is not a security vulnerability itself.
Disruption Not disruptive for core/contributed and custom modules/themes because the affected function is not widely used. Only on the translation section of admin. Contrib and custom modules shouldn't use it directly.

Proposed resolution

As seen with dawehner and chx, we should remove really safe tokens from the string before passing it to Xss::filter.
We chose this option as it is far easier and safer than trying to change Xss::filter. A specific issue has been opened to keep a track of this problem. See #2372127: Prevent Xss::filter from stripping a token at the beginning of an html attribute.

Remaining tasks

Contributor tasks needed
Task Novice task? Contributor instructions Complete?
Create a patch Instructions Done in #2
Add automated tests Instructions Done in #2
Review patch to ensure that it fixes the issue, stays within scope, is properly documented, and follows coding standards Instructions
Review patch to ensure that it does not introduce a security issue

User interface changes

None

API changes

None

CommentFileSizeAuthor
#67 2371861-followup-67.patch2.19 KBDavid_Rothstein
#57 2371861-drupal-locale-allow_tokens_in_href_in_translatable_strings-test-only-d7.patch1.88 KBpietmarcus
#56 2371861-drupal-locale-allow_tokens_in_href_in_translatable_strings-d7.patch3.84 KBpietmarcus
#56 2371861-drupal-locale-allow_tokens_in_href_in_translatable_strings-test-only-d7.patch3.84 KBpietmarcus
#42 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-42-tests-only.patch6.23 KBDuaelFr
#42 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-42.patch8.44 KBDuaelFr
#42 interdiff.2371861.33.42.txt1.07 KBDuaelFr
#37 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-33.patch8.43 KBGábor Hojtsy
#37 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-33-tests-only.patch6.21 KBGábor Hojtsy
#33 interdiff.2371861.29.33.txt4.54 KBDuaelFr
#33 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-33.patch8.43 KBDuaelFr
#33 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-33-tests-only.patch6.21 KBDuaelFr
#31 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-29-tests-only.patch6.29 KBYesCT
#29 interdiff.2371861.20.29.txt1.02 KBtucho
#29 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-29.patch8.51 KBtucho
#20 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-20.patch8.5 KBDuaelFr
#20 interdiff.2371861.17.20.txt1 KBDuaelFr
#17 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-17.patch8.5 KBDuaelFr
#17 interdiff.2371861.15.17.patch1.61 KBDuaelFr
#15 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-15.patch8.34 KBYesCT
#15 interdiff.2371861.14.15.txt2.69 KBYesCT
#14 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-14.patch8.29 KBDuaelFr
#14 interdiff.2371861.13.14.txt3.69 KBDuaelFr
#13 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-13.patch7.84 KBDuaelFr
#13 interdiff.2371861.11.13.txt5.3 KBDuaelFr
#11 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-11.patch3.74 KBYesCT
#11 interdiff.2371861.9.11.txt3.08 KBYesCT
#9 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-9.patch3.46 KBDuaelFr
#9 interdiff.2371861.6.9.txt1.93 KBDuaelFr
#6 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-6.patch2.59 KBYesCT
#6 interdiff.2371861.5.6.txt1.05 KBYesCT
#5 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-5.patch2.58 KBYesCT
#5 interdiff.2371861.2.5.txt2.13 KBYesCT
#2 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-2.patch2.51 KBDuaelFr
#2 drupal-locale-allow_tokens_in_href_in_translatable_strings-tests_only-2371861-2.patch1.54 KBDuaelFr
#45 interdiff.2371861.42.45.txt745 bytesDuaelFr
#45 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-45.patch8.46 KBDuaelFr
#45 drupal-locale-allow_tokens_in_href_in_translatable_strings-2371861-45-tests-only.patch6.24 KBDuaelFr
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

DuaelFr’s picture

Issue summary: View changes
DuaelFr’s picture

YesCT’s picture

Issue summary: View changes
Issue tags: +Security
+++ b/core/modules/locale/locale.module
@@ -961,6 +961,8 @@ function locale_translation_use_remote_source() {
+  // Remove safe tokens to avoid them causing trouble during the Xss:filter call.
+  $string = preg_replace('#\[[a-z0-9:_-]+\]#i', '', $string);

how do we know this is the pattern that would only match safe things?

YesCT’s picture

  1. +++ b/core/modules/locale/locale.module
    @@ -961,6 +961,8 @@ function locale_translation_use_remote_source() {
    +  // Remove safe tokens to avoid them causing trouble during the Xss:filter call.
    

    nit, needs to wrap at 80 chars.

    Lines containing comments (including docblocks) must wrap as close to 80 characters as possible without going over

    https://www.drupal.org/coding-standards/docs#drupal

  2. +++ b/core/modules/locale/src/Tests/LocaleStringIsSafeTest.php
    @@ -0,0 +1,50 @@
    + * Contains \Drupal\locale\Tests\LocaleTranslationProjectsTest.
    

    probably a copy and paste error. should match the name of the file.

  3. +++ b/core/modules/locale/src/Tests/LocaleStringIsSafeTest.php
    @@ -0,0 +1,50 @@
    + * Tests locale translation project handling.
    

    this seems more limited in scope given the file name.

  4. +++ b/core/modules/locale/src/Tests/LocaleStringIsSafeTest.php
    @@ -0,0 +1,50 @@
    +  /**
    +   * {@inheritdoc}
    +   */
    

    I'm not sure what is more usual, to {@inheritdoc} on this property, or just regular doc.

    did a quick guess/check:
    ag " extends KernelTestBase" -C7 | grep inheritdoc | wc -l
    is 9

    ag " extends KernelTestBase" -C7 | grep "Modules to enable" | wc -l
    is 30

    so.. not clear and not blocking this issue either way.

fixed those little things.

this still needs a review.

YesCT’s picture

oops forgot the first thing. here it is.

YesCT’s picture

Status: Needs review » Needs work
+++ b/core/modules/locale/locale.module
@@ -961,6 +961,9 @@ function locale_translation_use_remote_source() {
+  // Remove safe tokens to avoid them causing trouble during the Xss:filter
+  // call.

mentioned in the hallway, but documenting feedback here.

1.
we should be more specific in the comments about what trouble is.
let's talk about why they are not translatable, etc.

2.
Also, I think we should open an issue to fix this in Xss::filter, and add @todo comment linking to that issue.

3.
also, in the summary of this issue, we should explain why we are doing this fix before the call to Xss:filter and why we are not just fixing it there directly.

...

4.
and, I wonder if this pattern is too liberal. maybe there are some strings we should not remove before calling Xss::filter. dawehner mentioned something about javascript: ... ??

dawehner’s picture

+++ b/core/modules/locale/locale.module
@@ -961,6 +961,9 @@ function locale_translation_use_remote_source() {
+  // Remove safe tokens to avoid them causing trouble during the Xss:filter
+  // call.
+  $string = preg_replace('#\[[a-z0-9:_-]+\]#i', '', $string);
   return String::decodeEntities($string) == String::decodeEntities(Xss::filter($string, array('a', 'abbr', 'acronym', 'address', 'b', 'bdo', 'big', 'blockquote', 'br', 'caption', 'cite', 'code', 'col', 'colgroup', 'dd', 'del', 'dfn', 'dl', 'dt', 'em', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'hr', 'i', 'ins', 'kbd', 'li', 'ol', 'p', 'pre', 'q', 'samp', 'small', 'span', 'strong', 'sub', 'sup', 'table', 'tbody', 'td', 'tfoot', 'th', 'thead', 'tr', 'tt', 'ul', 'var')));

... we should describe here what the troubles are. It should also kinda describe that replacing tokens is not a security problem ... and why.

DuaelFr’s picture

1.
Done. I hope my english is not as confuse as I think it is.

2.
Done #2372127: Prevent Xss::filter from stripping a token at the beginning of an html attribute

3.
Done.

4.
I tried to give an explanation in the comment. Tell me if it seems good for you.

DuaelFr’s picture

Status: Needs work » Needs review
YesCT’s picture

Sat with @DuaelFr to reword the comment.

This is now ready for someone to check the whole thing and see if the approach and the docs are good.

dawehner’s picture

  1. +++ b/core/modules/locale/locale.module
    @@ -961,6 +961,30 @@ function locale_translation_use_remote_source() {
    +  // Some strings have tokens in them. For tokens in the first part of href or
    

    It would be great to give some examples ...

  2. +++ b/core/modules/locale/src/Tests/LocaleStringIsSafeTest.php
    @@ -0,0 +1,52 @@
    +class LocaleStringIsSafeTest extends KernelTestBase {
    

    I would love to see a dedicated test which ensures that t and then token replace passed through a template does NOT call a security issue. Try to use a typical <script>alert('moeeeep');</script> in there.

DuaelFr’s picture

@dawehner Happy ? ;)

More seriously, I'm not sure about the way I made that new test. Can you have a look?

DuaelFr’s picture

Some improvements:
- test made more readable
- comments fixed

YesCT’s picture

little rewording and newline.
this test is easier for me to understand than #13. Very nice.

Status: Needs review » Needs work
DuaelFr’s picture

Fixes the issue in the test introduced by the newline at the end of the template.

The last submitted patch, 17: interdiff.2371861.15.17.patch, failed testing.

The last submitted patch, 17: interdiff.2371861.15.17.patch, failed testing.

DuaelFr’s picture

Just changed the regexp separators to be more consistent with other regexp in core.

dawehner’s picture

For me the test coverage looks fairly reasonable and the regex limits quite good.
On the other hand this should have a dedicated review from someone from the security review and maybe gabor itself.

Gábor Hojtsy’s picture

Status: Needs review » Needs work

I don't understand the problem itself. Why do we use tokens in those strings and not @url with a t() replacement later?! Think of our poor translators who need to understand each different format we make up for source strings and which parts they are supposed to touch / not touch.

DuaelFr’s picture

@Gábor Hojtsy:

First, these strings are in a yml file used to configure the Tour module so I don't know how we could process the placeholders as the t() function call is automated. Should we introduce a new configuration concept the add placeholders in translatable strings? It could be really complicated.

Then, we might not forbid contrib developpers to include tokens in their translatable strings. So, to keep these strings translatable, we need to fix this issue (or the related one dealing with Xss::filter).

Gábor Hojtsy’s picture

Ok I can see that tour module wants to use tokens instead of exposing its own translation-standard replacement patterns like @url to be used in the string. That makes sense if a whole set of tokens are supported in an extensible way. Is that the case? Where is that evaluated?

DuaelFr’s picture

The code replacing tokens in the body string is

// \Drupal\tour\Plugin\tour\tip\TipPluginText::getOutput()
Xss::filterAdmin($this->token->replace($this->getBody()));

where $this->token is a \Drupal\Core\Utility\Token instance.

According to the Token::replace() method's documentation, as the second parameter has been left empty, only the tokens that are not related to a particular context might be available. In the Core, that's the case of the following tokens :

  • site (name, slogan, mail, url, url-brief, login-url)
  • date (short, medium, long, custom, since, raw)
  • current-user (uid, name, mail, url, edit-url, last-login, created)

The thing is that, even if we could try to find another solution, we have to fix the issue at least to avoid contributed modules that would like to use a token in a translatable string to become untranslatables.

In D7 a common case was to transform account creation emails to use an HTML format then try to translate them. As it has been converted to Configuration Translation in D8 that is not a problem anymore but we may encounter that issue otherwise.

Gábor Hojtsy’s picture

Status: Needs work » Needs review

All right, let's fix this then. Thanks for the background info. I am sad different replacement styles are used but we seem to be putting developers ahead of translators again here which we did so many times in D8.

tucho’s picture

Issue summary: View changes
tucho’s picture

Issue summary: View changes
tucho’s picture

I have made a little improvement to the RegEx that removes the tokens from the string.

The original RegEx matches strings that are not valid tokens, like the following:

[:]
[site]
[site:]
DuaelFr’s picture

It was made on purpose. If we wanted to match all possible tokens, that RegEx should have been far more complicated as about every character are allowed in a token except colon and closing square brackets. You patch is still valid and still answer the need, though.

YesCT’s picture

here is a tests only to see if this fails in the way we want it to.

Status: Needs review » Needs work
DuaelFr’s picture

Patches rerolled and new coding standards applied to the new code.

DuaelFr’s picture

Issue tags: +Needs backport to D7

I'm becoming tired of this issue. I makes half of Commerce untranslatable and a lot of other modules that relies on tokens.
What's missing to make it move forward, please?

(The patch still applies)

Gábor Hojtsy’s picture

Wim Leers’s picture

<a href="[site:url]/admin/config/regional/language"> becomes <a href="url]/admin/config/regional/language">

I know you're talking about the general token problem. But, for URLs specifically, you should use internal:/admin/config/regional/languages.

DuaelFr’s picture

@Wim Leers I can see 3 occurences of this string, all in the tour module config files. Drupal Commerce 7.x contains loads of these strings too. All unstranslatables.

Status: Needs review » Needs work
DuaelFr’s picture

DuaelFr’s picture

Gábor Hojtsy’s picture

Priority: Normal » Major

There are 5 affected strings in Drupal 8 core that are not translatable due to this bug. https://localize.drupal.org/translate/languages/hu/translate?project=dru...

andypost’s picture

Status: Needs review » Reviewed & tested by the community

Closed as duplicate #2640886: Unable to import translation, HTML restriction

Testing manually a patch it works fine, also covered with tests

catch’s picture

Title: Allow strings including tokens in href or src attributes to be translated » Strings including tokens in href or src attributes cannot be translated due to safeness check incompatibilities

The patch is straightforward enough but the title kept confusing me.

catch’s picture

Status: Reviewed & tested by the community » Fixed

Committed/pushed to 8.1.x and cherry-picked to 8.0.x. Thanks!

  • catch committed 4b70319 on 8.1.x
    Issue #2371861 by DuaelFr, YesCT, Gábor Hojtsy, tucho: Strings including...

  • catch committed 410afdc on
    Issue #2371861 by DuaelFr, YesCT, Gábor Hojtsy, tucho: Strings including...
Gábor Hojtsy’s picture

Version: 8.0.x-dev » 7.x-dev
Status: Fixed » Patch (to be ported)
Issue tags: -sprint

Amazing, thanks all! Should now be backported to Drupal 7 though.

pietmarcus’s picture

Assigned: Unassigned » pietmarcus

I'll try to fix it for D7. The patch itself isn't a problem, but I've got some work to with the testscripts.

pietmarcus’s picture

I have backported the fix to Drupal 7. Since Drupal 7 doesn't use Twig and strings aren't considered safe by translating them, I didn't include the testLocalizedTokenizedString-testcase. The testLocaleStringIsSafe is included and should fail without the patch.

Please review and let me know if I missed anything or there should be more/better tests.

pietmarcus’s picture

Oops, meant to post a test-only patch, but forgot to remove the fix... This one should fail.

Status: Needs review » Needs work
pietmarcus’s picture

Status: Needs work » Needs review

As predicted the test only patch failed. The patch with the fix was succesfull. Please review.

DuaelFr’s picture

Status: Needs review » Reviewed & tested by the community

The patch in #56 can still be applied on 7.x and is working properly.
Thank you @pietmarcus for that port.

Fabianx’s picture

Issue tags: +Pending Drupal 7 commit

This is an easy one.

Marking for commit.

stefan.r’s picture

+1 to this, it fixes a long-standing very pesky issue, nice work!

stefan.r’s picture

Assigned: pietmarcus » Unassigned

  • stefan.r committed 3a2b4f5 on 7.x
    Issue #2371861 by DuaelFr, YesCT, pietmarcus, Gábor Hojtsy, tucho:...
stefan.r’s picture

Status: Reviewed & tested by the community » Fixed
Issue tags: -Needs backport to D7, -Pending Drupal 7 commit +7.50 release notes

Committed and pushed to 7.x, thanks!

David_Rothstein’s picture

Status: Fixed » Needs review
FileSize
2.19 KB

Here's a quick followup for some documentation fixes (hopefully a quick enough fix that it's not worth a separate issue).

The main thing I did here was remove a paragraph that I think was only relevant for Drupal 8... since we don't mark strings safe/unsafe in Drupal 7.

stefan.r’s picture

Status: Needs review » Reviewed & tested by the community

I actually don't think that bit is relevant to Drupal 8 anymore either :)

Fabianx’s picture

Issue tags: +Pending Drupal 7 commit

Marking for commit (not at a computer where I can commit right now).

  • stefan.r committed 3677ac5 on 7.x
    Issue #2371861 followup by David_Rothstein: Strings including tokens in...
stefan.r’s picture

Status: Reviewed & tested by the community » Fixed
Issue tags: -Pending Drupal 7 commit

Committed and pushed the follow-up as well, thanks!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.