TL;DR

We need a distinct "XSS Filtering For Editors" mechanism to prevent assistive editors like CKEditor from becoming an XSS attack vector. We also need a mechanism for specific assistive editing implementations (say, a Geshi code syntax highlighter for fields that store CSS) to override this step if they need special behavior.

The Details

  1. Currently, we filter all content on output. That allows us to protect site visitors from XSS attacks while preserving content in its "raw" form. In addition, it allows us to enforce markup standards ("No bold tags! No divs!") and apply transformations ("Curly quotes! Replace tokens with rich data!")
  2. When a user edits content-with-an-input-format, we send the raw content to them for display in a form. This isn't a problem, because browsers don't actually parse markup that's inside of an edit box.
  3. HOWEVER, because some assistive editors (CKEditor included) actually parse and display the raw content, XSS exploits in the raw markup can be triggered by the editor itself. (CKEditor can prevent simple 'paste a script and it gets executed' scenarios using its ACF framework, but if unsafe data actually gets saved to the DB, then a privileged user opens an edit window and changes its format to one with an Assistive Editor, the unsafe raw content could be executed.)
  4. We can't simply apply the normal output filtering before sending the content to the editor; that would also trigger the markup enforcement and transformational filters that shouldn't affect the editing process.
  5. We also can't simply run the current XSS filter function on raw content before sending it to the editor. Our current XSS Filtering mechanism uses a whitelist of tags, stripping out markup that should actually be preserved for editing purposes, applies a blacklist of attributes to any remaining tags, and applies a whitelist of URI protocols to any remaining attributes. This is fine for output, since 'Allowed Tags' lists can be set up for output, but it's is more restrictive than we need to protect the assistive editor from XSS attacks, and will interfere with many valid and safe use cases (like using an <iframe> tag, or using inline CSS styles on an element).

The Solution

Fortunately, extended dialogue at the sprint in Prague has revealed what looks like a good solution:

  1. Rather than relying on the current input filter system, we will define a new XSS-Protection-For-Editors mechanism that only attempts to scrub XSS attack vectors. It should blacklist a small set of 'inherently unsafe' tags like <SCRIPT>, blacklist attributes that are known XSS vectors like onClick and style, and whitelist known-to-be-safe URI protocols in attributes.
  2. This XSS-Protection-For-Editors scrubbing operation will be applied before content is sent from the server to the client, whenever an assistive editor like CKEditor is part of the active input format. "Dumb" unassisted text fields would remain unscrubbed, because they're not at risk.
  3. Because some assistive editors take their own responsibility for certain steps in that scrubbing process, this XSS-Protection-For-Editors operation would need to be overridable by individual assistive editors. For example, the depending on its configuration, CKEditor may want to allow a whitelist of certain inline style attributes through. A code syntax highlighter would qualify as an assistive editor, but would always want the full raw content, and would be performing highlighting operations rather than parsing the content and displaying the actual HTML.
  4. In those situations, CKEditor's Drupal integration would provide a scrubber plugin that inherits core's XSS-Protection-For-Editors plugin and overrides just the 'inline styles' handling. A hypothetical Syntax Highlighter assistive editor would override all of the steps to let everything pass through.

The Impact

If we implement the plan, where does that leave us?

  1. Raw text fields using an input format without an assistive editor would remain unchanged. They're raw. They're dumb. If you want to paste in crazy stuff and know it'll never be touched, this is the way to go.
  2. Fields using an input format with an assistive editor configured would always have the XSS-Protection-For-Editors scrubbing operation performed before content is sent to the client for editing. This would keep assistive editors save from XSS exploits.
  3. Assistive Editors that can safely sidestep some of those protections (like a syntax highlighter that doesn't actually allow the raw markup to be parsed during editing) can subclass and override the XSS-Protection-For-Editors plugin.
  4. Because the mechanism for sidestepping the XSS-Protection-For-Editors step is an explicit class, it would be easy for the security team to scan for any modules implementing it and subject them to extra scrutiny.
  5. End result: minimum impact on user content, maximum protection from XSS when using an assistive editor, and flexibility to override for assistive editors that realllllly need raw access to certain potentially dangerous bits.

There is one situation where content entered by a user could be "scrubbed out" accidentally. If they save XSS-ish data (say, a <SCRIPT> tag) using an input format that doesn't have an assistive editor, reopen the content, change the input format to on that does have an assistive editor, then save, the markup that's removed by the XSS-Protection-For-Editors scrubber would be lost permanently. I think this is acceptable: if you need to paste <SCRIPT> tags and jam onClick() events into your <A> tags, don't use an assistive editor, or use one that's explicitly designed for editing code and uses the override mechanism described in step 3 of 'The Impact.'

The Patch

During the Prague Sprint, WimLeers and members of the CKEditor team produced working proof of concept code for this approach. The patch will be coming shortly.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

eaton’s picture

Component: edit.module » editor.module
nod_’s picture

Issue tags: +JavaScript

and of course, tag.

webchick’s picture

Issue tags: +sprint, +Spark

Tagging.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
49.74 KB

Thank you so much for that very clear, concise-as-possible writeup, eaton! :)

Also special thanks to mr.baileys from the Drupal security team, who we disturbed numerous times on that day to ask for his guidance :)


I want to highlight a few important aspects of this patch:

Reuse of existing XSS protection logic
  • \Drupal\Component\Utility\Xss::filter() now has a new, optional $whitelist = TRUE parameter (API addition, not change!).
  • By slightly expanding the API of \Drupal\Component\Utility\Xss, all existing test coverage is leveraged while solving this issue; only a few additional test cases had to be added. So no only do we prevent code bloat/duplication, we also guarantee this at least as secure as our existing XSS protection.
New (code) interface
API addition: \Drupal\editor\EditorXssFilterInterface:
/**
 * Defines an interface for text editor XSS filters.
 */
interface EditorXssFilterInterface {

  /**
   * Filters HTML to prevent XSS attacks when a user edits it in a text editor.
   *
   * Should filter as minimally as possible, only to remove XSS attack vectors.
   *
   * @param string $html
   *   The HTML to be filtered.
   * @param \Drupal\filter\Entity\FilterFormat $format
   *   The text format configuration entity. Provides context based upon which
   *   one may want to adjust the filtering.
   * @param \Drupal\filter\Entity\FilterFormat $original_format|NULL
   *   The original text format configuration entity (when switching text
   *   formats/editors). Also for context.
   *
   * @return string
   *   The filtered HTML that cannot cause any XSSes anymore.
   */
  public static function filterXss($html, FilterFormat $format, FilterFormat $original_format = NULL);

}
Not using the plugin system
Initially it seemed a wise idea to use the plugin system, but now that I've implemented it, it seems completely overkill. Essentially we want a default function to do "XSS-Protection-for-Editors", and that function should be overridable. Drupal 8 plugins implies instances of classes, and since this is stateless, there's no need for all that. We just need to know on which class to call the public static filterXss() method, period. No need to create yet another plugin type, plugin manager and yet another thing in the DIC for that.
If I'm wrong, and there are good reasons to use the plugin system, I of course won't mind to make the necessary changes.
Extensive test coverage
  • Extended XssTest (unit test), run like this: cd core; php vendor/bin/phpunit tests/Drupal/Tests/Component/Utility/XssTest; cd ..
  • New StandardTest (unit test), run like this; cd core; php vendor/bin/phpunit modules/editor/lib/Drupal/editor/Tests/editor/EditorXssFilter/StandardTest; cd .. — this has very minimal test coverage only, since this class just reuses the Xss class, which has extensive test coverage in XssTest.
  • New EditorSecurityTest (integration test). Tests a total of 22 scenarios to ensure all cases are covered.
Leveraging filter types, and making them slightly more explicit
  • A long, long time ago in the Drupal 8 cycle, we introduced the concept of "filter types". We again use that here. We only apply the text editor XSS filter if a text format uses >=1 "XSS-preventing filter" (FILTER_TYPE_HTML_RESTRICTOR). I've also made the documentation of that filter type more explicit: it now specifically mentions XSS prevention as a requirement for a filter to be classified like this.
  • I'm not sure why, but the FilterHtmlImageSecure filter was classified as FILTER_TYPE_HTML_RESTRICTOR. Probably because it wasn't very clear where it should fit in. By making the requirements to be categorized as that type more explicit ("must prevent XSS"), it's now also clear that the current type for this filter is wrong. I've changed it to FILTER_TYPE_TRANSFORM_IRREVERSIBLE: the stripping of the original image and replacing it with an "error placeholder image" is irreversible.
  • Similarly, FilterHtmlCorrector was also classified as FILTER_TYPE_HTML_RESTRICTOR, but it doesn't really apply security fixes; it merely tries to improve the HTML. I've reclassified it as FILTER_TYPE_TRANSFORM_IRREVERSIBLE as well, because once incorrect HTML is fixed, it's impossible to know how to go back to the previous string representation.
    (It was originally marked as a security filter because there's no harm in running it; it only causes less things to break due to broken HTML. However, considering that this breaks the ability to cleanly reason about filter types, I feel this is better.)
  • FilterHtml (whitelist some tags), FilterHtmlEscape (blacklist all tags), FilterNull (disallow all content) are the remaining ones that are marked as being of the FILTER_TYPE_HTML_RESTRICTOR type, and all of them are very clearly correct.
  • This is a long chunk of bulleted text, but it only amounts to 3 lines being changed. I just want to be very clear about this.

Assigned: Unassigned » Wim Leers
Status: Active » Needs work
Issue tags: +Security, +API addition, +Needs security review, +CKEditor in core

The last submitted patch, editor_xss_protection-2099741-4.patch, failed testing.

Wim Leers’s picture

Looks like that's only a syntax error on PHP 5.3, which is why I didn't run into it.

eaton’s picture

I may have missed something, but:

A long, long time ago in the Drupal 8 cycle, we introduced the concept of "filter types". We again use that here. We only apply the text editor XSS filter if a text format uses >=1 "XSS-preventing filter" (FILTER_TYPE_HTML_RESTRICTOR). I've also made the documentation of that filter type more explicit: it now specifically mentions XSS prevention as a requirement for a filter to be classified like this.

Just to be clear on the reasons for this: if you've created a 'raw' input format that doesn't do any XSS prevention on output, the system won't apply any XSS prevention in the editor. Is that correct?

Status: Needs review » Needs work

The last submitted patch, editor_xss_protection-2099741-6.patch, failed testing.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
3.28 KB
53.03 KB

Just to be clear on the reasons for this: if you've created a 'raw' input format that doesn't do any XSS prevention on output, the system won't apply any XSS prevention in the editor. Is that correct?

Yes, that is correct.
If we wouldn't do it this way, then it would be impossible to use style="color:pink", or <script> tags in Drupal 8's default Full HTML text format.


Reroll to make all tests pass now that an additional bit of metadata is being passed through drupalSettings, which is why the tests were failing.

Wim Leers’s picture

Fixed two non-security bugs:

  1. wwalc (Wiktor Walc, CKEditor CTO) reported a bug in the patches above.
    The patches above track if you made any changes. If you did not make any changes, and then switch formats, it will not use the XSS-filtered result, but the original value in the DB, and then — if necessary — apply the XSS filtering to switch to your desired format. The rationale behind this is that when switching to a text format without text editor before making changes, you still have the original content as it lives in the DB, without the text editor XSS filtering. I did this so that this relatively common use case is completely unaffected by this patch. (See the data-editor-value-is-changed attribute handling in editor.js.)
    The problem he found was when you start with a format without a text editor (e.g. Restricted HTML), type some changes, then switch to a format with a text editor (e.g. Basic HTML), then switch back to the first format without making changes, now your original changes will have been undone, because there was no change tracking yet when you don't have a text editor. That's fixed now.
  2. When switching text formats and no XSS filtering is necessary, FALSE is returned as the response, yet editor.js did not yet properly handle this.
Wim Leers’s picture

… and here's the patch for #10.

mr.baileys’s picture

Status: Needs review » Needs work

Only a quick review, I'll need some more time for a full review :)

+++ b/core/modules/editor/editor.moduleundefined
@@ -386,10 +387,101 @@ function editor_pre_render_format($element) {
+ *   The HTML string that will be presented to the end user in a text editor.

Technically, the assistive editor will parse this string prior to display, so what is presented to the end user can differ from this HTML string?

+++ b/core/modules/editor/editor.moduleundefined
@@ -386,10 +387,101 @@ function editor_pre_render_format($element) {
+ *   The text format we're switching to.

editor_filter_xss() is always called prior to rendering the text editor, not just when switching between formats, so this description is misleading.

Content replaced with "false"

At some point while playing around with the editor, my content got replaced with '<p>false</p>'.

+++ b/core/modules/editor/js/editor.jsundefined
@@ -130,7 +142,48 @@ Drupal.editorDetach = function (field, format, trigger) {
+  // Otherwise, ensure XSS safety: let the server XSS filter this value.
+  else {
+    $.ajax({
+      url: Drupal.url('editor/filter_xss/' + format.format),
+      type: 'POST',
+      data: {
+        'value': field.value,
+        'original_format': originalFormatID
+      },
+      dataType: 'json',
+      success: function (xssFilteredValue) {
+        field.value = xssFilteredValue;
+        callback();
+      }
+    });

xssFilteredValue can == json False, in which case the content gets overwritten with <p>false</p>

Steps to reproduce:

  • Create a Full HTML node.
  • Edit the node.
  • Switch to Restricted
  • Switch back to Full HTML
Wim Leers’s picture

Status: Needs work » Needs review
FileSize
898 bytes
53.76 KB

#12: Thanks! I updated the two comments you rightfully found misleading. The "false" problem was fixed in the patch in #11 already, you were probably already reviewing the patch by then :)

mr.baileys’s picture

+++ b/core/lib/Drupal/Component/Utility/Xss.phpundefined
@@ -111,17 +115,21 @@ public static function filterAdmin($string) {
+    static $allowed_html, $whitelist_mode;

@@ -150,8 +158,12 @@ protected static function split($matches, $store = FALSE) {
+    // When in whitelist mode, an element is disallowed when not listed.
+    if ($whitelist_mode && !isset($allowed_html[strtolower($elem)])) {
+      return '';
+    }
+    // When in blacklist mode, an element is disallowed when listed.
+    else if (!$whitelist_mode && isset($allowed_html[strtolower($elem)])) {
       return '';

$allowed_html no longer makes sense as an argument/variable name since behavior now depends on the value of $whitelist, and could actually contain disallowed tags. Maybe rename $allowed_html (multiple occurences) to just $html_tags?

The same goes for $allowed_tags used elsewhere.

Sorry for the piecemeal review, I'll try and do a more thorough job soon.

mr.baileys’s picture

I have spent quite some time trying to sneak an exploit past the filtering, but was unable to do so, which means the patch seems to do a good job!

Some smaller, mostly cosmetic issues (together with the comments from #14):

+++ b/core/lib/Drupal/Component/Utility/Xss.phpundefined
@@ -35,10 +35,14 @@ class Xss {
+   * @param bool $whitelist
+   *   Defaults to TRUE, so the $allowed_tags is used as a whitelist of allowed
+   *   tags. When set to FALSE, the list of allowed tags becomes a list of

Should be marked as "(optional)".

+++ b/core/lib/Drupal/Component/Utility/Xss.phpundefined
@@ -111,17 +115,21 @@ public static function filterAdmin($string) {
+   * @param bool $whitelist
+   *   Ignored when $store is FALSE, otherwise used to determine whether
+   *   $matches is a list of allowed or disallowed HTML tags.

Should be marked as "(optional)".

+++ b/core/modules/editor/editor.api.phpundefined
@@ -104,5 +106,27 @@ function hook_editor_js_settings_alter(array &$settings, array $formats) {
+ * @param string &$editor_xss_filter_class
+ *   The text editor XSS filter class that will be used.

Should this also mention that the class should implement EditorXssFilterInterface?

+++ b/core/modules/editor/editor.api.phpundefined
@@ -104,5 +106,27 @@ function hook_editor_js_settings_alter(array &$settings, array $formats) {
+ *   formats/editors). Also for context.

"Also for context." needs to be a proper sentence, or can be removed altogether?

+++ b/core/modules/editor/editor.api.phpundefined
@@ -104,5 +106,27 @@ function hook_editor_js_settings_alter(array &$settings, array $formats) {
+function hook_editor_xss_filter_alter(&$editor_xss_filter_class, FilterFormat $format, FilterFormat $original_format = NULL) {

Type hinting for $format & $original_format should be FilterFormatInterface rather than FilterFormat (a couple of occurences throughout the patch).

(See https://drupal.org/node/608152#hinting)

+++ b/core/modules/editor/editor.moduleundefined
@@ -386,10 +387,101 @@ function editor_pre_render_format($element) {
+ * Apply text editor XSS filtering.

Should start with a third-person verb.

+++ b/core/modules/editor/editor.moduleundefined
@@ -386,10 +387,101 @@ function editor_pre_render_format($element) {
+ * @param FilterFormat $original_format|NULL
+ *   The original text format (i.e. when switching text formats, $format is the
+ *   text format that is going to be used, $original_format is the one that was
+ *   being used initially, the one that is stored in the database when editing).

Should be marked as (optional)

+++ b/core/modules/editor/editor.moduleundefined
@@ -386,10 +387,101 @@ function editor_pre_render_format($element) {
+function editor_filter_xss($html, FilterFormat $format, FilterFormat $original_format = NULL) {

Same comment about type hinting, I think FilterFormat should be FilterFormatInterface.

+++ b/core/modules/editor/js/editor.jsundefined
@@ -27,12 +27,23 @@ Drupal.behaviors.editor = {
+        $(field).on('change.editor keyup.editor', function () {

Is "keyup.editor" necessary here, since we are already acting on "change.editor"?

+++ b/core/modules/editor/lib/Drupal/editor/EditorController.phpundefined
@@ -49,4 +51,33 @@ public function getUntransformedText(EntityInterface $entity, $field_name, $lang
+      throw new NotFoundHttpException();

This is actually triggering a PHP Fatal error rather than throwing a NotFoundHttpException:
PHP Fatal error: Class 'Drupal\\editor\\NotFoundHttpException' not found

+++ b/core/modules/editor/lib/Drupal/editor/Plugin/EditorBase.phpundefined
@@ -24,14 +24,18 @@
- * - module: The name of the module providing the plugin.

Was this line removed intentionally? It seems unrelated to this patch.

Wim Leers’s picture

Removing "needs security review" because #15 indicates no flaws were found. I still welcome others to try to find holes though :)


#14 & #15: Thanks! :) All fixed, with a few comment replies here:

Type hinting for $format & $original_format should be FilterFormatInterface rather than FilterFormat (a couple of occurences throughout the patch).

D'oh, of course — my bad, sorry! Too much copy/pasting. Very glad you caught this.

Is "keyup.editor" necessary here, since we are already acting on "change.editor"?

Yes, because the "change" event doesn't fire right away. In any case, it doesn't hurt.
I improved it slightly now though: as soon as one change/keyup event happened, we stop listening for changes, because we know enough then: we just need to know whether the contents were changed at all or not.

Was this line removed intentionally? It seems unrelated to this patch.

You're right that it's unrelated, but since I had to update the docs there anyway, I noticed this mistake and it's better to just correct it here.


What's still needed to get this to RTBC?

webchick’s picture

I was referred to this issue by a tap-dancing ladybug man. :)

In seriousness, you may want to reach out to David Rothstein. He's done detailed review of filter system patches in the past.

dawehner’s picture

This is a classical drive-by review, so don't expect much.

Additional I noted some @file Definition at least in one place.

  1. +++ b/core/lib/Drupal/Component/Utility/Xss.php
    @@ -35,10 +35,14 @@ class Xss {
    +   * @param array $html_tags
    +   *   An array of HTML tags.
    
    @@ -48,14 +52,14 @@ class Xss {
    +  public static function filter($string, $allowed_tags = array('a', 'em', 'strong', 'cite', 'blockquote', 'code', 'ul', 'ol', 'li', 'dl', 'dt', 'dd'), $whitelist = TRUE) {
    

    This docs disagree with the actual signature of the function.

  2. +++ b/core/lib/Drupal/Component/Utility/Xss.php
    @@ -111,17 +115,21 @@ public static function filterAdmin($string) {
         if ($store) {
    

    There is another issue which gets rid of this ugly hack by just using an anonymous function which passes along the html tags

  3. +++ b/core/lib/Drupal/Component/Utility/Xss.php
    @@ -150,8 +158,12 @@ protected static function split($matches, $store = FALSE) {
    +    else if (!$whitelist_mode && isset($html_tags[strtolower($elem)])) {
    

    We have a code style which uses "elseif"

  4. +++ b/core/modules/editor/editor.module
    @@ -386,10 +387,102 @@ function editor_pre_render_format($element) {
    +    // Only when the user has access to multiple text formats, we must add data-
    +    // attributes for the original value and change tracking, because they are
    +    // only necessary when the end user can switch between text formats/editors.
    

    I have no idea bout this but this feels like an optimization which limits some possibilities for contrib. Are data attributes meant to be namespaced and so not used in other modules?

  5. +++ b/core/modules/editor/lib/Drupal/editor/Annotation/Editor.php
    @@ -39,4 +39,11 @@ class Editor extends Plugin {
    +   * @var boolean
    

    Afaik we use @var bool

  6. +++ b/core/modules/editor/lib/Drupal/editor/EditorController.php
    @@ -49,4 +52,33 @@ public function getUntransformedText(EntityInterface $entity, $field_name, $lang
    +      $original_format = entity_load('filter_format', $original_format_id);
    

    The ControllerBase class provides an entity manager usable here.

  7. +++ b/core/modules/editor/lib/Drupal/editor/Tests/editor/EditorXssFilter/StandardTest.php
    @@ -0,0 +1,87 @@
    +class StandardTest extends UnitTestCase {
    

    YEAH!

  8. +++ b/core/modules/filter/lib/Drupal/filter/Plugin/Filter/FilterHtmlImageSecure.php
    index 2614c5d..fdcc2fb 100644
    --- a/core/tests/Drupal/Tests/Component/Utility/XssTest.php
    
    --- a/core/tests/Drupal/Tests/Component/Utility/XssTest.php
    +++ b/core/tests/Drupal/Tests/Component/Utility/XssTest.php
    

    +1!

Wim Leers’s picture

#17: I pinged David Rothstein using his d.o contact form :)

#18: thanks — drive-by reviews are actually helpful at this stage, and I'm glad to see there's apparently little left to nitpick :)

  1. Well-spotted, don't understand how I missed that. Thanks!
  2. It's fine if there's another issue, but that doesn't mean this patch should change.
  3. Thanks. I wish it weren't though. In JS, the standard is else if
  4. Yes, data- attributes are meant to be namespaced, to prevent clashes. Other modules can still use these attributes without problems, of course.
  5. Thanks!
  6. Good call, changed.
  7. Thanks!
  8. Thanks!
David_Rothstein’s picture

Thanks for pinging me. I'm probably not going to get to a full review right now, but I tried to look it over mostly at a high level.

  1. +  public static function filterXss($html, FilterFormatInterface $format, FilterFormatInterface $original_format = NULL) {
    +    // Apply XSS filtering, but only blacklist the <script> tag.
    +    return Xss::filter($html, array('script'), FALSE);
    +  }
    

    I was under the impression that blacklisting the script tag is not sufficient to stop XSS attacks? (And in general, blacklisting is fraught with peril...) Some background:

    - #275811-45: Warn about potentially insecure filter configurations as well as earlier comments
    - https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet

    The latter is the basis for some of the tests in \Drupal\Tests\Component\Utility\XssTest.... Would a blacklist filter that only blacklists <script> pass those tests?

  2. +  // If there is no filter preventing XSS attacks in the text format being used,
    +  // then no text editor XSS filtering is needed either. (Because then the
    +  // editing user can already be attacked by merely viewing the content.)
    +  // e.g.: an admin user creates content in Full HTML and then edits it, no text
    +  // format switching happens; in this case, no text editor XSS filtering is
    +  // desirable, because it would strip style attributes, amongst others.
    +  $current_filter_types = filter_get_filter_types_by_format($format->id());
    +  if (!in_array(FILTER_TYPE_HTML_RESTRICTOR, $current_filter_types, TRUE)) {
    

    Since a filter with type FILTER_TYPE_HTML_RESTRICTOR can still be configured insecurely (even intentionally so) the text editor XSS filtering may still sometimes run when it doesn't need to... I'm not sure if that's just a minor inconvenience or a problem.

  3. The new/expanded meaning of FILTER_TYPE_HTML_RESTRICTOR should be documented in the FilterInterface class.
  4. Minor things I noticed looking at the patch:
    +   *     - (optional) The allowed HTML HTML tags array that should be passed to
    

    Double "HTML".

    +   * Xss::filter() has the ability to run in blacklist mode, in which it still
    +   * applies the exact same filtering, with one exception: it no longer works
    +   * with a list of allowed tags, but with a list of disallowed tags, so that it
    +   * is possible to specify
    +   * by default, but on* and style attributes are disallowed.
    

    Something is not right towards the bottom there...

Wim Leers’s picture

#20:
(Trying to formulate a solid response, but I've been out of this for 2 weeks now, so I might be missing something.)

  1. Only a <script> tag is inherently dangerous in and of itself, because it allows one to execute arbitrary logic. Other tags, like <iframe>, are only dangerous when they have malicious attributes. Of course, in the heat of that very, very long discussion, <script> was the only one we could collectively think of. It seems like <style> is another one we should forbid. Which other tags do you think we should blacklist because they're inherently dangerous.
    P.S.: we did look at those links, and http://html5sec.org/ too. If you look at the attacks, almost all of them are about tag malformation or malicious attribute values. That's why we concluded that preventing those kinds of attacks in combination with blocking inherently malicious tags must be sufficient.
  2. Well, that is misconfiguration and AFAICT it's inherently an edge case of an edge case. We specifically went through a lot of trouble to come up with this approach that does explicitly not try to overprotect.
  3. That already has been documented. And it's not an expanded meaning, it's a more narrow meaning. From "HTML tag and attribute restricting filters." to "HTML tag and attribute restricting filters to prevent XSS attacks.".
  4. Good catches — I'll fix those :)
Wim Leers’s picture

This is a very, very important issue to get in Drupal 8. In fact, I think this issue may need to be marked critical.

It'd be great if we could get more insightful reviews, so that we can steadily move towards RTBC.

nod_’s picture

Priority: Major » Critical

Critical bugs either render a system unusable (not being able to create content or upgrade between versions, blocks not displaying, and the like), cause loss of data, or expose security vulnerabilities. These bugs are to be fixed immediately.

Too bad it doesn't make the system unstable, it's just critical because of 2 things :p

effulgentsia’s picture

Just a straight reroll for HEAD drift; no real changes.

jibran’s picture

Status: Needs review » Needs work

The last submitted patch, 24: editor_xss_protection-2099741-24.patch, failed testing.

Wim Leers’s picture

Wim Leers’s picture

Issue tags: +beta blocker

This is a critical security feature. I think this being marked a beta blocker is justified.

larowlan’s picture

  1. +++ b/core/modules/editor/lib/Drupal/editor/Annotation/Editor.php
    @@ -35,8 +35,15 @@ class Editor extends Plugin {
    +  public $is_xss_safe;
    

    Minor: Shouldn't class properties use camel case?

  2. +++ b/core/modules/editor/lib/Drupal/editor/Tests/EditorSecurityTest.php
    @@ -0,0 +1,330 @@
    +    $this->untrusted_user = $this->drupalCreateUser(array('create article content', 'edit any article content', 'use text format restricted_without_editor'));
    ...
    +    $this->privileged_user = $this->drupalCreateUser(array('create article content', 'edit any article content', 'use text format restricted_without_editor', 'use text format restricted_with_editor', 'use text format unrestricted_without_editor', 'use text format unrestricted_with_editor'));
    ...
    +    $this->normal_user = $this->drupalCreateUser(array('create article content', 'edit any article content', 'use text format restricted_with_editor'));
    ...
    +      array('author' => $this->untrusted_user->id(), 'format' => 'restricted_without_editor'),
    +      array('author' => $this->normal_user->id(), 'format' => 'restricted_with_editor'),
    +      array('author' => $this->privileged_user->id(), 'format' => 'unrestricted_without_editor'),
    +      array('author' => $this->privileged_user->id(), 'format' => 'unrestricted_with_editor'),
    

    nitpick: coding standards dictate breaking these into multi-lines.

Having read this in detail, all I can muster are coding standards nitpicks.
I don't think those should block this getting in though.

dstol’s picture

Issue tags: +Security improvements
Gábor Hojtsy’s picture

Architecturally this change looks good to me. Based on the issue summary I don't have better ideas to approach this problem either. I think if we can make the API as evident as possible and windows for failure (such as the remote XSS filtering call) not be a problem, then we should be fine. See more comments below.

  1. +++ b/core/lib/Drupal/Component/Utility/Xss.php
    @@ -35,10 +35,14 @@ class Xss {
    +   * @param array $html_tags
    +   *   An array of HTML tags.
    +   * @param bool $whitelist
    +   *   (optional) Defaults to TRUE, so $html_tags is used as a whitelist of
    +   *   allowed tags. When set to FALSE, the list of HTML tags becomes a list of
    +   *   disallowed tags, i.e. a blacklist.
    

    I needed to read this several times to understand. The $whitelist argument sounds misleading because its not at all a whitelist itself. It specifies if the other argument is a whitelist or blacklist. Maybe if you we name it $operation with allowed values FILTER_ALLOWED, FILTER_DISALLOWED or something like that (defined constants). Or something along those lines would be cleaner. Same later on.

  2. +++ b/core/modules/editor/editor.api.php
    @@ -104,5 +106,30 @@ function hook_editor_js_settings_alter(array &$settings, array $formats) {
    + * @param string &$editor_xss_filter_class
    + *   The text editor XSS filter class that will be used.
    

    This looks to be a strange pattern that you use an alter to alter a string. Other similar places I think use a regular hook to get the value and then take the last value. I don't have a strong preference, but it looks pretty strange to have an alter hook for a string only :D)

  3. +++ b/core/modules/editor/editor.api.php
    @@ -104,5 +106,30 @@ function hook_editor_js_settings_alter(array &$settings, array $formats) {
    + * @see \Drupal\editor\EditorXssFilterInterface
    

    Nice way to document how the filter should interface :) Great :) ++

  4. +++ b/core/modules/editor/js/editor.js
    @@ -130,7 +155,51 @@ Drupal.editorDetach = function (field, format, trigger) {
    +        // If the server returns false, then no XSS filtering is needed.
    +        if (xssFilteredValue !== false) {
    +          field.value = xssFilteredValue;
    +        }
    

    Is this not also happening in an error condition? Ie. let's make sure that the server being inaccessible or something along those lines will not cause an issue. I suspect that is already ok with the strict checking for false, but wanted to make sure I ask :) Sounds like the remote connection may play a role in how the XSS protection is performed. Also is this call blocking (ie. no window for the XSS while this is running)?

  5. +++ b/core/modules/editor/lib/Drupal/editor/EditorController.php
    @@ -49,4 +52,35 @@ public function getUntransformedText(EntityInterface $entity, $field_name, $lang
    +    $original_format_id = $request->request->get('original_format');
    +    $original_format = NULL;
    

    This switch from the route param being the same as the variable name but then not being the same thing is interesting... I think we want to rename the route parameter to include '_id' as well so its clearly not a whole loaded format?

  6. +++ b/core/modules/editor/lib/Drupal/editor/EditorXssFilterInterface.php
    @@ -0,0 +1,37 @@
    + * Defines an interface for text editor XSS filters.
    

    Qualify what XSS is maybe here, since this is the top level place where this is documented. I mean "XSS (Cross-site scripting)"

  7. +++ b/core/modules/editor/lib/Drupal/editor/Tests/editor/EditorXssFilter/StandardTest.php
    @@ -0,0 +1,87 @@
    + * Tests the standard text editor XSS filter
    

    Minor: dot :)

  8. +++ b/core/tests/Drupal/Tests/Component/Utility/XssTest.php
    @@ -435,6 +435,67 @@ public function providerTestFilterXssNotNormalized() {
    +   * Tests limiting disallowed tags and XSS prevention.
    ...
    +   * is possible to specify
    +   * by default, but on* and style attributes are disallowed.
    

    Limiting disallowed tags sounds like double negation to me?! Not clear what do you meant here? Tests specifying disallowed tags?

    Also something is missing from the docs here?!

nod_’s picture

Status: Needs review » Needs work

Ok not even the right issue. nevermind me. I need coffee.

nod_’s picture

Status: Needs work » Needs review
nod_’s picture

In filterXssWhenSwitching I'd send extra arguments to the callback

  // A text editor that already is XSS-safe needs no additional measures.
  if (format.editor.isXssSafe) {
    callback(field, format);
  }

So you can do filterXssWhenSwitching(field, format, originalFormatID, Drupal.editorAttach) might as well optimize for that, it's still possible to not make use of the parameters.

A possible issue when creating a new node, if you're using the restricted HTML format (no CKEditor) and write a

tag, when switching to CK it will be removed, but switching back to restricted HTML, the script tag is still removed. Works as expected on a saved node.
$(field).on('change.editor keyup.editor', function () {
          field.setAttribute('data-editor-value-is-changed', 'true');
          // …
        });
That's not true, it's like the current formUpdated, says it's changed when it might not be. Using keypress would be better, at least it won't pick up dead keys. Ideally we'd be comparing values but good enough for now I guess. Apart from that JS looks good to me :)
Wim Leers’s picture

FileSize
53.43 KB

Straight reroll of #27. Next: reroll that addresses all feedback since.

Wim Leers’s picture

FileSize
54.41 KB
13.26 KB

#29:

  1. Heh, good point. Maybe that's the case, but if so, then none of Drupal core is applying that rule to annotations. The Action, FieldFormatter, FieldType, FieldWidget etc. annotation classes also don't do this. So if we want to change that, then we should do that in a follow-up. Particularly because there already is a class property that uses an underscore in \Drupal\editor\Annotation\Editor, so only using camelCase for this new class property would be inconsistent.
  2. Fixed :)

#31:

  1. Fair point. Fixed!
  2. See #4, "Not using the plugin system" for the rationale. And as I say there, if there's a good reason to use the plugin system, I'll make the necessary changes.
  3. :)
  4. Interesting point! The funny part is that all of Drupal core's JS assumes there are no network or server errors! None of core implements the error callback (with the sole exception of Edit module). It is possible to also implement it here, but it'll complicate the rest of the code as well. Therefor I'd like to do that later, which won't put us in a worse situation than we're already in.
    This call is indeed blocking: the new Text Editor is only attached after the server-side XSS filtering has occurred successfully. Since it's precisely the "XSS due to a Text Editor being loaded" problem that we are solving, the answer is: "No, there is no window for the XSS while waiting for the response to this request."
  5. Good point — fixed!
  6. Done.
  7. Done :)
  8. Haha, copy/paste fail!

#34:

  1. Regarding callback(field, format), to get rid of one intermediate function call, always: good catch! Done.
  2. Regarding switching from plain <textarea>, then switching to CKE and back: there's no way around that. #2166399: Changing CKEditor configurations by changing text formats causes markup to be lost: warn the user when performing this advanced action exists to warn/inform the user appropriately when that happens. See the last paragraph of the "The impact" section of eaton's issue summary.
  3. First: changed to keypress. Second: note that this is only used in the "there is no text editor" case; if there is a text editor, then that text editor's onChange method will be used, which prevents the problem you describe!

Now, with all the nitpicks and small optimizations out of the way, let's get a +1 from the security team and then let's get this in!

Wim Leers’s picture

nod_’s picture

FileSize
53.62 KB

This is a reroll because the indentation standard changed for JavaScript files. No need for commit credit because of this reroll.

Status: Needs review » Needs work

The last submitted patch, 38: core-editor-xss-protection-2099741-38.patch, failed testing.

Wim Leers’s picture

Status: Needs work » Needs review
Wim Leers’s picture

FileSize
54.42 KB

The #38 reroll moved some files around in the patch for no reason, hence breaking tests. It also introduced unrelated whitespace changes. Please don't do that. If you do, at least provide an interdiff.

This is a straight reroll from #37.

nod_’s picture

Sorry for the sloppy reroll.

The last submitted patch, 38: core-editor-xss-protection-2099741-38.patch, failed testing.

dstol’s picture

Status: Needs review » Needs work

Wim Leers brought this to the attention of the security team and asked that we give it a review. This feature seems like a great solution to mitigate what seems like a potential vulnerability in D8.

The security team had a similiar discussion about modifying input filters allowed/disallowed attributes as it relates to https://drupal.org/node/2081887. I’m glad to see mature and well thought out solution presented solution for D8.

The only tag that’s being explicitly blocked is

in Drupal\editor\EditorXssFilter? Style should be added to that list, #20 hits this too. See https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet#STYLE_tags_with_broken_up_JavaScript_for_XSS and the several subsequent exploits. Link as well via a remote style sheet attack. See https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet#Remote_style_sheet. I’d like to see style and link blacklisted and tests added for them.
Wim Leers’s picture

Status: Needs work » Postponed
FileSize
88.66 KB
46.46 KB

Blacklist more tags

While working on this patch to address #44, I decided to implemented as comprehensive test coverage as possible: I expanded \Drupal\editor\Tests\editor\EditorXssFilter\StandardTest with the entire list of known XSS attack vectors at https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet — this test has now grown from 4 to 189 tests! (This is also why the patch has grown >40 KiB in size.)

In doing so, I discovered two more problematic cases: <embed> and <object> are safe most of the time… unless you have Flash installed. Flash may contain JavaScript code itself, which would have access to the main document… and hence opening the door for all sorts of maliciousness.

Discussed this at depth with dstol on chat and with the CKEditor CTO, they both agree with these additions.

This does imply that it will be effectively impossible to use the <embed> and <object> tags directly whenever you use the Basic HTML format, even when you've configured it to allow the <object> and <embed>… you'll only be able to use them in text formats without any FILTER_TYPE_HTML_RESTRICTOR filter, such as the Full HTML format.
In other words: end users may experience data loss!

Being smarter to prevent data loss

Of course, data loss is not acceptable. That's why I talked to both dstol and the CKEditor CTO about working around that problem. We've come up with a solution that works:

  1. Start with the list of 5 blacklisted "dangerous" tags: <script>, <style>, link, <embed>, and <object>.
  2. When not switching text formats: remove explicitly whitelisted tags from the list of blacklisted "dangerous" tags, and add explicitly forbidden tags.
  3. When switching text formats: remove the intersection of whitelisted tags in both text formats from the list of blacklisted "dangerous" tags, and add the union of tags explicitly forbidden in both text formats. (In other words: we expect only markup allowed in both the previous and the new text format to continue to exist.)

That's it in a nutshell. Please see \Drupal\editor\EditorXssFilter\Standard for details.

In this reroll

The test coverage was expanded to ensure that this works correctly: switching to/from a text format with HTML restrictions but with the <embed> tag allowed was added to the tests.

No further API additions/changes were necessary: EditorXssFilterInterface already foresaw the possibility of text editor XSS filters needing to access the metadata of the involved text formats to do its filtering, now the default text editor XSS filter just leverages that.

For the unit tests to work without mocking procedural code, I had to move some procedural functions in filter.module onto FilterFormatInterface though, but that needed to happen anyway. That's being done in #2184315: Move filter_get_filter_types_by_format() and filter_get_html_restrictions_by_format() onto FilterFormatInterface; this issue is now effectively blocked on that.

Wim Leers’s picture

Status: Postponed » Needs review

Status: Needs review » Needs work

The last submitted patch, 45: editor_xss_protection-2099741-45.patch, failed testing.

dstol’s picture

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
111.77 KB
27.23 KB
3.57 KB

#2184315: Move filter_get_filter_types_by_format() and filter_get_html_restrictions_by_format() onto FilterFormatInterface was opened to keep this patch focused on the scope of this issue: unrelated refactoring done in another issue.
But… sadly, #45 came back red, because PHPUnit failed on qa.d.o, yet passed with zero complaints (let alone failures) locally. It failed for legitimate reasons.

The fix is simple, but like #2184315, it requires a bit of refactoring, to avoid the same "mocking procedural stuff in PHPUnit tests" problem. In theory I should open a new issue for it. But I think it's easier/faster for everybody, including core committers, to just do it here. It's moving the FILTER_TYPE_<something> constants from filter.module to FilterInterface. It's an easy search/replace operation, and there's extensive test coverage.
See interdiff-move_filter_types_to_filterinterface.txt.

So, attached is a reroll that fixes the one bug in #45, along with some super simple refactoring to allow for clean, green unit tests.

dstol’s picture

Status: Needs review » Reviewed & tested by the community

This has been quite an epic battle to get this right. It's ready now.

Wim Leers’s picture

Almost forgot!

Please give commit credit to these folks: Wim Leers, wwalc, mr.baileys, eaton, dstol. Thanks!

webchick’s picture

Assigned: Wim Leers » sun

sun indicated he'd like to take a look at this, so assigning to him.

webchick’s picture

Assigned: sun » Unassigned

sun eyeballed this the other day and the only comment he had was that it'd be nice if there was a docblock topic or similar to explain this new API, which included some of the overview info that's available in the issue summary, but not in the code. He also said though that he didn't feel this was a commit-blocker, and could be done as a follow-up.

Dries’s picture

The patch looks great. It's surprising that this is not part of the WYSIWYG editor itself, and that many content management systems needs to implement this. Any more color on that?

eaton’s picture

The patch looks great. It's surprising that this is not part of the WYSIWYG editor itself, and that many content management systems needs to implement this. Any more color on that?

CKEditor does implement scrubbing functionality for content coming into the editor via user input. The challenge comes when it also must deal with text from the CMS that didn't come in via the assistive editor's sanitization code.

We're facing it due to the flexibility of Drupal's input format system: the level of variation we allow, and the potential for a piece of content to switch formats over its lifespan, means we expose this subtle edge case and must handle it.

effulgentsia’s picture

To clarify #54, the issue summary says:

HOWEVER, because some assistive editors (CKEditor included) actually parse and display the raw content, XSS exploits in the raw markup can be triggered by the editor itself.

Why does CKEditor consider that to be acceptable?

effulgentsia’s picture

#56 was xpost with #55, so trying to rephrase the q:

CKEditor does implement scrubbing functionality for content coming into the editor via user input. The challenge comes when it also must deal with text from the CMS that didn't come in via the assistive editor's sanitization code.

http://ckeditor.com/blog/CKEditor-4.1-RC-Released says that the ACF is not security related. ACF in general makes sense to apply on user input only, and not on CMS-supplied text, but shouldn't CKEditor also have a security feature that is applied across the board? Maybe it shouldn't, but if that's the case, I think Dries wants to understand why that is.

sun’s picture

re: @Dries / @effulgentsia:

The way I see it, one of the primary reasons for why we're doing it in Drupal is that the Editor module in core is not restricted to CKEditor. If a different client-side editor library is used, then Drupal should still be secure.

We can't expect every editor plugin implementation to figure out this fundamental security aspect on their own. And as @eaton already mentioned, the concept of multiple text formats as well as being able to switch between them is custom to Drupal.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Ok, seems like there is general agreement that this is a good idea.

I'll be perfectly honest and say that most of this is way over my head. But it's also a beta blocker that has been sitting at RTBC for more than a week at this point, which doesn't feel good. I also did read the entire patch, and it's well documented and easy to follow. So in case we do introduce a regression here, seems like it'd be pretty easy to clean-up. I'll also be seeing Wim in person next week so I can give him a noogie in person if so. ;P

The one main concern I had with this patch was about the blacklist for tags like embed and object, since there are definitely valid reasons to enable those tags for users, and ways to safeguard them from being wide-open security holes. (Embed filter module, for example) However, the logic in filterXss() allows for this, by cross-referencing the blacklist against the explicitly allowed tags.

Lots of very nice test coverage as well, which is great. So I think this is good to go.

Committed and pushed to 8.x. Thanks!

webchick’s picture

Btw, that wasn't me being aggro, that was #2191619: New on-page comment form is deleting all hidden files :( :(

sun’s picture

Thanks!

Created #2191873: Document the WYSIWYG XSS filtering concept and architecture for developers to add some bird-level documentation on the architectural security concept, which was my only concern, as @webchick paraphrased already.

dawehner’s picture

Wim Leers’s picture

Assigned: Unassigned » Wim Leers
Issue tags: -sprint

I've updated https://drupal.org/node/1817474 because the FILTER_TYPE_* constants were moved to FilterInterface in this patch.

I'll get #2191873: Document the WYSIWYG XSS filtering concept and architecture for developers done.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

chx’s picture