Since #2399037: String::format() marks a resulting string as safe even when passed an unsafe passthrough argument was committed there are several cases in core that use ! placeholders on already-safe (but not directly-sanitized-by-Drupal) text and that therefore are now showing escaped HTML in the user interface.

Here is one (from core/modules/image/src/ImageStyleListBuilder.php):

    $build['#empty'] = $this->t('There are currently no styles. <a href="!url">Add a new one</a>.', array(
      '!url' => $this->urlGenerator->generateFromPath('admin/config/media/image-styles/add'),
    ));

And sure enough if you go to admin/config/media/image-styles and delete all existing image styles, you'll see this message with escaped HTML on the page:

There are currently no styles. <a href="/admin/config/media/image-styles/add">Add a new one</a>.

A good way to find many of these is to grep the codebase for "!url". However, many such examples are in hook_help() which apparently does not go through the SafeMarkup system before printing, and thus the bug isn't visible for those.

I guess the most developer-friendly solution is just to change these all to use @url. Although it's a little silly to sanitize something that is 100% guaranteed to be already safe, i.e. examples like t('Go to <a href="!url">this page on drupal.org</a> for more information.', array('!url' => 'https://www.drupal.org/some/documentation/page')).

CommentFileSizeAuthor
#14 2409477-13.patch9.51 KBbès
#6 2409477-6.patch13.4 KBeffulgentsia

Comments

David_Rothstein’s picture

Cheet’s picture

So looking into this issue I noticed that simply changing the "!url" to "@url" totally mangles any GET request that you can make.

For example:
"admin/config/media/image-styles/add?a=b&b=c"

becomes:
"admin/config/media/image-styles/add%3Fa%3Db%26b%3Dc"

It looks like the string is marked as "unsafe" and escaped. Perhaps there should be a way to mark variables as urls or some special use case for urls specifically. However for now, setting the escape character to "@" in a t() string is NOT the way to go.

David_Rothstein’s picture

I can't reproduce that at all - @url works fine for me even with a query string and doesn't mangle anything.

The @ placeholder simply calls String::checkPlain(), and String::checkPlain() is actually already used on many URLs throughout Drupal, including existing strings that use @url placeholders, anything that calls the l() function or the check_url() function, etc... so if there is a problem wouldn't it show up many other places already?

joelpittet’s picture

@Cheet what exactly did you enter to get that? I've seen that in the batch API too.

But quick test:
drush ev "echo t('@url', ['@url' => 'admin/config/media/image-styles/add?a=b&b=c']);"
admin/config/media/image-styles/add?a=b&amp;b=c
vs
drush ev "echo t('\!url', ['\!url' => 'admin/config/media/image-styles/add?a=b&b=c']);"
(Which for me was giving placeholder.
<em class="placeholder">admin/config/media/image-styles/add?a=b&amp;b=c</em>

jhodgdon’s picture

Priority: Normal » Major

As a note, hook_help() is probably immune because it is using the #markup render element, which I believe is immune to escaping. (And phew! I would hate to have to fix up all the hook_help().)

So... Using ! for URLs in t() calls is the right thing to do, given what our formatting system does -- you do not want to check_plain URLs, because it will convert & to an HTML entity. So I think this is more major than you think it is.

effulgentsia’s picture

Title: Various ! placeholders throughout core are now showing escaped HTML (especially <a href="!url"> examples) » Various ! placeholders throughout core are now showing escaped HTML (especially <a href="!..."> examples)
Status: Active » Needs review
StatusFileSize
new13.4 KB

So... Using ! for URLs in t() calls is the right thing to do, given what our formatting system does -- you do not want to check_plain URLs, because it will convert & to an HTML entity.

I disagree. Per #3, LinkGenerator::generate() checkPlain()s the URL before assigning it as the href value, so why would it be wrong to do it when generating an anchor tag outside of LinkGenerator::generate()? Yes, it converts the & to &amp;, but it does that inside an HTML string, and HTML agents know to decode that.

hook_help() is probably immune

Yes, hook_help() is special because it's concatenating $output anyway, so regardless of whether the individual pieces are marked safe or not, the end result isn't. It's getting away without being marked safe because it's being assigned as a #markup value. Once we fix #2273925: Ensure #markup is XSS escaped in Renderer::doRender() to not whitelist all #markup values, we might still choose to whitelist the return values of hook_help().

While I think it would be more proper to fix all hook_help() implementations to use @ instead of !, since per above that isn't strictly needed, and would be a lot of work, here's a patch that fixes all other places. I searched for href="!, but did not convert parameters that were the result of a check_url(), since that does its own check_plain().

joelpittet’s picture

@effulgentsia I asked this of @Cheet during the sprint. Is there any case where htmlspecialchars() would be detrimental to a URL? I couldn't think of anything specific, & is fine for a URL but maybe not for a JSON response, or in the CLI (which placeholder % is already a pain to <em>look at</em>).

I think I was running into problems with this for the batch api... but got a bit frustrated trying to debug it over the weekend and gave up... will return to the battle.

jhodgdon’s picture

Regarding #6, really? In September of 2013, we decided to change all the hook_help() implementations, which previously used ... <a href="@url">something</a> to change them to !url, specifically so that the URLs (which we knew were safe) would not be run through check_plain(). Now you're saying we should go through and convert them all back? We'd better discuss this a bit more thoroughly before we convert everything.

And then we'll need to update the "how to write help" standards on https://www.drupal.org/node/632280 and several other patches in the pipeline that are children of #1908570: [meta] Update or create hook_help() texts for D8 core modules.

effulgentsia’s picture

Looks like that hook_help() decision was made in #1876906-21: Implement hook_help() for views_ui.module? From a consistency standpoint, I disagree with it. IMO, check_plain() is *always* appropriate to use on a string that is plain-text and needs to be inserted into an HTML document. Even if you already know it to be safe. Having to think about when ! is safe vs. when @ is necessary seems worse to me than always using @ by default, and only using ! when necessary.

But, per #6, given that we do/can/will always whitelist hook_help() output, it doesn't actually matter if we use @ or ! there. So, while from a consistency with the rest of core standpoint, I see no reason for hook_help() to be the sole place where we don't check_plain() an attribute value, I also see no need to reverse the 2013 decision and create a lot of work for people. Let's only revisit that if/when we're making other sweeping changes to hook_help() standards.

So, let's keep this issue limited to only changing the places where it does matter, which is what the #6 patch does.

Is there any case where htmlspecialchars() would be detrimental to a URL? I couldn't think of anything specific, & is fine for a URL but maybe not for a JSON response

I can't think of any case where it creates a problem to check_plain() a plain-text string when inserting the result into an HTML attribute value. Even if the HTML string is then assigned to the value of a JSON variable. Any agent that deals with HTML needs to follow HTML parsing rules.

or in the CLI (which placeholder % is already a pain to look at)

CLI being a poor interface for viewing HTML is a bigger problem than just what's covered by this issue.

effulgentsia’s picture

I think I was running into problems with this for the batch api

I believe there was a time when batch API was not properly persisting which strings have already been marked safe from one batch step to the next. That was leading to multiple double-escaping bugs. AFAIK, those are all now fixed. Or, if not, should be a separate bug report of batch API.

joelpittet’s picture

@effulgentsia I'll open another bug report once I can narrow down what is going on there, a bit tricky because I'm not sure how batch_process() is supposed to redirect on an existing submit handler. Though this was happening on Sunday where I noticed a suspicious URL escaping while stepping through the Module Installer to try and repair it.

subhojit777’s picture

bès’s picture

Assigned: Unassigned » bès

Rerolling the patch to the latest head.

bès’s picture

StatusFileSize
new9.51 KB

Here is the reroll of the patch.

Now, there is still others !url in the code; but i'm not sure what we should do with them.

bès’s picture

Assigned: bès » Unassigned

Status: Needs review » Needs work

The last submitted patch, 14: 2409477-13.patch, failed testing.

bès’s picture

Status: Needs work » Needs review

Reroll patch

star-szr’s picture

Possibly some overlap with the child issues of this other one.

subhojit777’s picture

According to issue description I have deleted all image styles from admin/config/media/image-styles but I cannot see the error. Can anyone please tell where else I can reproduce the error.

subhojit777 queued 14: 2409477-13.patch for re-testing.

Status: Needs review » Needs work

The last submitted patch, 14: 2409477-13.patch, failed testing.

subhojit777’s picture

Assigned: Unassigned » subhojit777
willzyx’s picture

subhojit777’s picture

Assigned: subhojit777 » Unassigned
Status: Needs work » Closed (duplicate)
Issue tags: +SafeMarkup
subhojit777’s picture