Since #2399037: String::format() marks a resulting string as safe even when passed an unsafe passthrough argument was committed there are several cases in core that use ! placeholders on already-safe (but not directly-sanitized-by-Drupal) text and that therefore are now showing escaped HTML in the user interface.
Here is one (from core/modules/image/src/ImageStyleListBuilder.php):
$build['#empty'] = $this->t('There are currently no styles. <a href="!url">Add a new one</a>.', array(
'!url' => $this->urlGenerator->generateFromPath('admin/config/media/image-styles/add'),
));
And sure enough if you go to admin/config/media/image-styles and delete all existing image styles, you'll see this message with escaped HTML on the page:
There are currently no styles. <a href="/admin/config/media/image-styles/add">Add a new one</a>.
A good way to find many of these is to grep the codebase for "!url". However, many such examples are in hook_help() which apparently does not go through the SafeMarkup system before printing, and thus the bug isn't visible for those.
I guess the most developer-friendly solution is just to change these all to use @url. Although it's a little silly to sanitize something that is 100% guaranteed to be already safe, i.e. examples like t('Go to <a href="!url">this page on drupal.org</a> for more information.', array('!url' => 'https://www.drupal.org/some/documentation/page')).
| Comment | File | Size | Author |
|---|---|---|---|
| #14 | 2409477-13.patch | 9.51 KB | bès |
| #6 | 2409477-6.patch | 13.4 KB | effulgentsia |
Comments
Comment #1
David_Rothstein commentedComment #2
Cheet commentedSo looking into this issue I noticed that simply changing the "!url" to "@url" totally mangles any GET request that you can make.
For example:
"admin/config/media/image-styles/add?a=b&b=c"
becomes:
"admin/config/media/image-styles/add%3Fa%3Db%26b%3Dc"
It looks like the string is marked as "unsafe" and escaped. Perhaps there should be a way to mark variables as urls or some special use case for urls specifically. However for now, setting the escape character to "@" in a t() string is NOT the way to go.
Comment #3
David_Rothstein commentedI can't reproduce that at all - @url works fine for me even with a query string and doesn't mangle anything.
The @ placeholder simply calls String::checkPlain(), and String::checkPlain() is actually already used on many URLs throughout Drupal, including existing strings that use @url placeholders, anything that calls the l() function or the check_url() function, etc... so if there is a problem wouldn't it show up many other places already?
Comment #4
joelpittet@Cheet what exactly did you enter to get that? I've seen that in the batch API too.
But quick test:
drush ev "echo t('@url', ['@url' => 'admin/config/media/image-styles/add?a=b&b=c']);"
admin/config/media/image-styles/add?a=b&b=cvs
drush ev "echo t('\!url', ['\!url' => 'admin/config/media/image-styles/add?a=b&b=c']);"
(Which for me was giving placeholder.
<em class="placeholder">admin/config/media/image-styles/add?a=b&b=c</em>Comment #5
jhodgdonAs a note, hook_help() is probably immune because it is using the #markup render element, which I believe is immune to escaping. (And phew! I would hate to have to fix up all the hook_help().)
So... Using ! for URLs in t() calls is the right thing to do, given what our formatting system does -- you do not want to check_plain URLs, because it will convert & to an HTML entity. So I think this is more major than you think it is.
Comment #6
effulgentsia commentedI disagree. Per #3,
LinkGenerator::generate()checkPlain()s the URL before assigning it as the href value, so why would it be wrong to do it when generating an anchor tag outside of LinkGenerator::generate()? Yes, it converts the&to&, but it does that inside an HTML string, and HTML agents know to decode that.Yes, hook_help() is special because it's concatenating $output anyway, so regardless of whether the individual pieces are marked safe or not, the end result isn't. It's getting away without being marked safe because it's being assigned as a #markup value. Once we fix #2273925: Ensure #markup is XSS escaped in Renderer::doRender() to not whitelist all #markup values, we might still choose to whitelist the return values of hook_help().
While I think it would be more proper to fix all hook_help() implementations to use @ instead of !, since per above that isn't strictly needed, and would be a lot of work, here's a patch that fixes all other places. I searched for
href="!, but did not convert parameters that were the result of a check_url(), since that does its own check_plain().Comment #7
joelpittet@effulgentsia I asked this of @Cheet during the sprint. Is there any case where htmlspecialchars() would be detrimental to a URL? I couldn't think of anything specific, & is fine for a URL but maybe not for a JSON response, or in the CLI (which placeholder % is already a pain to
<em>look at</em>).I think I was running into problems with this for the batch api... but got a bit frustrated trying to debug it over the weekend and gave up... will return to the battle.
Comment #8
jhodgdonRegarding #6, really? In September of 2013, we decided to change all the hook_help() implementations, which previously used
... <a href="@url">something</a>to change them to !url, specifically so that the URLs (which we knew were safe) would not be run through check_plain(). Now you're saying we should go through and convert them all back? We'd better discuss this a bit more thoroughly before we convert everything.And then we'll need to update the "how to write help" standards on https://www.drupal.org/node/632280 and several other patches in the pipeline that are children of #1908570: [meta] Update or create hook_help() texts for D8 core modules.
Comment #9
effulgentsia commentedLooks like that hook_help() decision was made in #1876906-21: Implement hook_help() for views_ui.module? From a consistency standpoint, I disagree with it. IMO, check_plain() is *always* appropriate to use on a string that is plain-text and needs to be inserted into an HTML document. Even if you already know it to be safe. Having to think about when
!is safe vs. when@is necessary seems worse to me than always using@by default, and only using!when necessary.But, per #6, given that we do/can/will always whitelist hook_help() output, it doesn't actually matter if we use
@or!there. So, while from a consistency with the rest of core standpoint, I see no reason for hook_help() to be the sole place where we don't check_plain() an attribute value, I also see no need to reverse the 2013 decision and create a lot of work for people. Let's only revisit that if/when we're making other sweeping changes to hook_help() standards.So, let's keep this issue limited to only changing the places where it does matter, which is what the #6 patch does.
I can't think of any case where it creates a problem to check_plain() a plain-text string when inserting the result into an HTML attribute value. Even if the HTML string is then assigned to the value of a JSON variable. Any agent that deals with HTML needs to follow HTML parsing rules.
CLI being a poor interface for viewing HTML is a bigger problem than just what's covered by this issue.
Comment #10
effulgentsia commentedI believe there was a time when batch API was not properly persisting which strings have already been marked safe from one batch step to the next. That was leading to multiple double-escaping bugs. AFAIK, those are all now fixed. Or, if not, should be a separate bug report of batch API.
Comment #11
joelpittet@effulgentsia I'll open another bug report once I can narrow down what is going on there, a bit tricky because I'm not sure how batch_process() is supposed to redirect on an existing submit handler. Though this was happening on Sunday where I noticed a suspicious URL escaping while stepping through the Module Installer to try and repair it.
Comment #12
subhojit777Comment #13
bès commentedRerolling the patch to the latest head.
Comment #14
bès commentedHere is the reroll of the patch.
Now, there is still others !url in the code; but i'm not sure what we should do with them.
Comment #15
bès commentedComment #17
bès commentedReroll patch
Comment #18
star-szrPossibly some overlap with the child issues of this other one.
Comment #19
subhojit777According to issue description I have deleted all image styles from
admin/config/media/image-stylesbut I cannot see the error. Can anyone please tell where else I can reproduce the error.Comment #22
subhojit777Comment #23
willzyx commented@subhojit777 you cannot see the error because of #2490290: Subclasses of EntityListBuilder incorrectly override the #empty message. Moreover, I think we can close this issue as a duplicate of #2506445: Replace !placeholder with @placeholder in t() and format_string() for non-URLs in tests
Comment #24
subhojit777There are two duplicate issues related to this #2506427: [meta] !placeholder causes strings to be escaped and makes the sanitization API harder to understand and #2506445: Replace !placeholder with @placeholder in t() and format_string() for non-URLs in tests. Most work has been done in #2506445: Replace !placeholder with @placeholder in t() and format_string() for non-URLs in tests, we should continue work on that issue and close the others.
Comment #25
subhojit777Closed on behalf of #2506445: Replace !placeholder with @placeholder in t() and format_string() for non-URLs in tests
Comment #26
subhojit777