Problem/Motivation

CKEditor 4.1 includes an Advanced Content Filter. Drupal HEAD is currently using it in default mode, which means CKEditor filters all user entered content (including content pasted from other web pages, Microsoft Word, etc.) to conform to what is allowed by the enabled toolbar buttons and plugins for the text format being used. For example, if in configuring the "Basic HTML" format, the administrator hasn't added any toolbar buttons or plugins that provide <table> functionality, then upon pasting HTML content from Word, CKEditor will automatically clean up the HTML to not include <table> and related tags. This is mostly a good thing: to the extent that the configured toolbar matches what is allowed by Drupal's text filters, it helps maintain parity between what the content editor sees in the wysiwyg, and what they'll get when they save the node and view it (i.e., WYS = WYG).

However, there are some situations in which there isn't a perfect match between the configured toolbar and the configured filters. Some examples:

  • The Basic HTML format in HEAD allows for the code tag, but there's no enabled CKEditor button for it. CKEditor has a "source" mode for entering HTML by hand, so you can use that to markup a code snippet with code tags, save the node, and upon viewing the node, that markup is there, as expected, since the format allows it. However, when re-editing the node, CKEditor applies its default ACF rules and strips out the code tags, since there's no enabled toolbar button for it.
  • Drupal filters can be more fine-grained than CKEditor buttons. For example, you might want a Drupal filter that allows class or style attributes on some tags, but not on others. For WYS to equal WYG, you'd want these rules applied within the editor as well, and relying on CKEditor to figure it out from the enabled buttons is not sufficient.
  • You may want the CKEditor interface to be smart about which options it shows in various dialogs. For example, if you have a Drupal filter that strips target attributes from links, you probably also want CKEditor hiding that option from its link dialog. CKEditor's ACF allows for setting the allowedContent configuration to what you want, and it will automatically adjust all of its UI options accordingly.

Proposed resolution

Rather than letting CKEditor guess the allowedContent from its button/plugin configuration, have Drupal explicitly set allowedContent to match what the format's text filters allow.

To be able to generate the proper value for that setting, we must know which HTML tags and attributes are allowed by the text format, so this issue is introducing filter_get_allowed_html_by_format() and an "allowed html callback" for filters of the type FILTER_TYPE_HTML_RESTRICTOR. (This was originally proposed over at #1782838: WYSIWYG in core: round one — filter types, but back then it was not something you could touch and see; now you can.)

Remaining tasks

All done; needs reviews!

User interface changes

None.

API changes

  1. Filters of the FILTER_TYPE_HTML_RESTRICTOR type may now define an "allowed html callback".
  2. New utility function: filter_get_allowed_html_by_format(). Allows modules to gain insight about which HTML tags and attributes are allowed by a certain text format.
Files: 
CommentFileSizeAuthor
#32 ckeditor_acf-1936392-32.patch41.93 KBWim Leers
PASSED: [[SimpleTest]]: [MySQL] 56,362 pass(es).
[ View ]
#32 interdiff-fixes.txt4.22 KBWim Leers
#32 interdiff-renaming.txt16.52 KBWim Leers
#27 ckeditor_acf-1936392-27.patch40.53 KBWim Leers
PASSED: [[SimpleTest]]: [MySQL] 56,177 pass(es).
[ View ]
#27 interdiff-minor_fixes_26.txt7.05 KBWim Leers
#27 interdiff-better_test_coverage.txt16.95 KBWim Leers
#22 ckeditor_acf-1936392-22.patch34.81 KBWim Leers
PASSED: [[SimpleTest]]: [MySQL] 57,145 pass(es).
[ View ]
#22 interdiff.txt883 bytesWim Leers
#19 ckeditor_acf-1936392-19.patch34.82 KBWim Leers
PASSED: [[SimpleTest]]: [MySQL] 56,981 pass(es).
[ View ]
#19 interdiff.txt30.31 KBWim Leers
#18 ckeditor_acf-1936392-18.patch20.79 KBWim Leers
PASSED: [[SimpleTest]]: [MySQL] 55,592 pass(es).
[ View ]
#18 interdiff.txt1.51 KBWim Leers
#16 ckeditor_acf-1936392-16.patch19.33 KBWim Leers
FAILED: [[SimpleTest]]: [MySQL] 56,298 pass(es), 0 fail(s), and 1 exception(s).
[ View ]
#6 ckeditor_acf-1936392-6.patch21.36 KBWim Leers
PASSED: [[SimpleTest]]: [MySQL] 53,228 pass(es).
[ View ]
#4 ckeditor_acf-1936392-4.patch21.37 KBWim Leers
PASSED: [[SimpleTest]]: [MySQL] 52,503 pass(es).
[ View ]
#4 interdiff.txt14.24 KBWim Leers
#2 ckeditor_acf-1936392-2.patch21.4 KBWim Leers
PASSED: [[SimpleTest]]: [MySQL] 52,457 pass(es).
[ View ]
#2 interdiff.txt994 bytesWim Leers
#1 ckeditor_acf-1936392-1.patch21.27 KBWim Leers
PASSED: [[SimpleTest]]: [MySQL] 52,510 pass(es).
[ View ]

Comments

Status:Active» Needs review
StatusFileSize
new21.27 KB
PASSED: [[SimpleTest]]: [MySQL] 52,510 pass(es).
[ View ]

This patch:

  1. sets CKEditor's allowedContent setting automatically. When no FILTER_TYPE_HTML_RESTRICTOR filters are present in a text format, it is set to true (to allow anything); otherwise it is set based on the return value of filter_get_allowed_tags_by_format(). Tests updated accordingly.
  2. adds the ability for FILTER_TYPE_HTML_RESTRICTOR type filters to have an "allowed tags callback". API documentation, filter_html & tests updated accordingly.
  3. converts FilterAPITest from WebTestBase to DrupalUnitTestBase. Speed FTW :)

Issue summary:View changes

Clarified the D7 situation for the security problem.

StatusFileSize
new994 bytes
new21.4 KB
PASSED: [[SimpleTest]]: [MySQL] 52,457 pass(es).
[ View ]

Fix notice.

Given what is being specified in the callback, 'allowed_tags' is a misnomer. It's 'allowed_html'.

Aside from that, I need to think some more through this.

StatusFileSize
new14.24 KB
new21.37 KB
PASSED: [[SimpleTest]]: [MySQL] 52,503 pass(es).
[ View ]

I also found the name for the callback bad, but couldn't think of anything better. It's so obvious now :) Thanks! Fixed.

StatusFileSize
new21.36 KB
PASSED: [[SimpleTest]]: [MySQL] 53,228 pass(es).
[ View ]

Chasing HEAD.

According to http://ckeditor.com/blog/CKEditor-4.1-RC-Released and http://docs.ckeditor.com/#!/guide/dev_advanced_content_filter, if allowedContent isn't specified, which is currently the case in Drupal HEAD, then CKEditor automatically defaults it to whatever is allowed by the enabled buttons and plugins. Therefore, I'm not immediately able to see how this issue is needed on security grounds. I tried the iframe and script attacks listed in the issue summary for all formats installed by Standard profile, and was not able to get CKEditor to run that XSS code. If someone is able to come up with an attack that works, please post it. Otherwise, I suggest we remove "security" from the tags and title, and downgrade this to a major.

In terms of the usability part of this issue (helping wys really be wyg), why do the approach taken here (setting allowedContent based on what the Drupal filter settings are) rather than enforcing consistency between the buttons/plugins and the Drupal filter settings in the first place? Wouldn't the latter be better UI in terms of consistency between how the administrator configures the toolbar and the toolbar that actually gets displayed?

@effulgentsia - were you testing in chrome or firefox? Some browsers will block scripts that appear to come "from the user".

I was testing in Firefox, but what I found wasn't just that CKEditor didn't run the code, it also stripped it from the content, so CKEditor was filtering, not the browser.

In terms of the usability part of this issue (helping wys really be wyg), why do the approach taken here (setting allowedContent based on what the Drupal filter settings are) rather than enforcing consistency between the buttons/plugins and the Drupal filter settings in the first place?

Our syncing between the Editor settings and the Filter settings is currently CKEditor-specific and will likely remain that way in Drupal 8. There is a layer of abstraction in there to allow other modules to update filter settings in the same way CKEditor does, but right now there's no such thing as a generic "Editor plugin" that would work with every WYSIWYG out there. Instead right now we only have CKEditor plugins, and any other module that provides an editor (or multiple editors) would handle its own plugins in its own way. This means that if we wanted to enforce filtering consistency between buttons/plugins and filter settings, each filter would need to integrate with each individual editor because we don't have a generic API for handling plugins across all editors. By using the approach Wim proposes here, each filter doesn't need to directly provide integration with any editor at all. The editor is responsible for utilizing the information provided by the filters and making its own configuration match. So basically, this shifts responsibility of integrating with a particular editor from the filter to the editor, and saves filter authors the need from needing to write an editor-specific integration.

#7:

RE: Security

CKE 4.0 (pre-ACF)
Example: https://dl.dropbox.com/u/12592/Drupal/CKEditor%20allowedContent/ckeditor-4.0/samples/iframe-alert.html
Security problem still exists.
CKE 4.1 (post-ACF) for Filtered HTML/Basic HTML/any text format with HTML restricting filters
Example: https://dl.dropbox.com/u/12592/Drupal/CKEditor%20allowedContent/ckeditor-4.1/samples/iframe-alert.html
Note that the HTML code is exactly the same to the above; the only difference is CKE 4.1 vs 4.0. Indeed, no more alert! This supports your argument of not seeing why this is needed.
But be aware that the above example does not yet explicitly set allowedContent to the list of allowed tags/attributes/styles/classes. Hence it will allow anything that the buttons indicate that they might generate. This implies that it might allow for too much: it will allow for style attributes (which we almost never want as Drupal users, plus it involves security risks), as well as any class (also questionable) as well as any kind of attribute (also questionable, and also involves security risks, think onClick etc.). So even though the most obvious risks may be mitigated, when users install additional CKEditor plugins, which might want to generate e.g. onClick attributes, then attackers still have free reign.
Conclusion: despite CKE 4.1's "automatic ACF" mode, it is still essential for Drupal to impose its own, more accurate, and typically more stringent restrictions.
CKE 4.1 (post-ACF) for Full HTML/any text format without HTML restricting filters
Example: https://dl.dropbox.com/u/12592/Drupal/CKEditor%20allowedContent/ckeditor-4.1/samples/iframe-alert-full-html.html
This is again identical to the preceding example, but it sets allowedContent === true
On top of the arguments above ("But be aware…"), this is where things get really nasty.
This is where we actually want iframes to work. And onClick to work. Users with access to this text format can do anything they want. To convey this "anything goes" nature to CKEditor, we must set allowedContent = true.

I hope this clarifies things sufficiently.

Why allowedContent and not just enforce button/plugin vs. filter_html setting consistency?

Wouldn't the latter be better UI in terms of consistency between how the administrator configures the toolbar and the toolbar that actually gets displayed?

Simply because using the CKEditor toolbar/plugin configuration by itself is insufficiently granular. As I said above: it might (and typically will) allow for too much: style attributes for starters, but much more than that.

Essentially: the CKEditor buttons/plugins configuration is fairly coarse, Drupal text format filter configuration can be done much more granular. As of CKEditor 4.1, via its ACF feature, CKEditor now also has a very granular configuration, so we can achieve a 1:1 mapping. We just have to start using it :)


#10: I hadn't even considered it that way, because that would result in an insane many:many relationship graph. It would indeed be conceptually feasible to have a Drupal filter correspond to one or more WYSIWYG editor buttons. But practically, not so much: each WYSIWYG plugin/button could have alternative/overridden implementations, resulting in many:more-than-many relationships, resulting in a code jungle. And then we're not even talking about supporting multiple WYSIWYG editors…

As I said above, you can view CKEditor's ACF as a way of restricting HTML in CKEditor like we have for many years in Drupal's filter system.
This is a universal concept, that any WYSIWYG editor could implement.


I had a discussion with the CKEditor folks yesterday about a related problem space (improved paste from Word). That will also hugely benefit from setting the allowed HTML tags, to avoid pasting in cruft. However, that's when I realized that the patches so far only define which tags and attributes are allowed, it does not yet allow text formats to indicate which specific styles and classes are allowed. I'll have to fix that (though that could even happen in a follow-up issue, because this implements the bulk of the work). But I first want sign-off on the overall approach.

Issue summary:View changes

Fix typo.

Issue summary:View changes

Updated issue summary.

Title:Ensure CKEditor only allows what the text format allows (to guarantee accurate WYSIWYG & security)Set CKEditor's allowedContent configuration to match Drupal's text filters (to help WYS = WYG)
Priority:Critical» Major
Issue tags:-security

Since no one identified how this patch improves security relative to letting CKEditor use its default ACF, I'm adjusting the issue title, tags, and priority accordingly. I also updated the summary to reflect my current understanding of the problem space (in particular, that we're now discussing custom ACF vs. default ACF rather than the original summary that was discussing custom ACF vs. no ACF).

@quicksketch: thanks for #10; that was helpful. It helped inform the changes I made to the summary, but I left the consideration of supporting other editors out of the summary for now, for simplicity. If we end up needing those considerations to resolve this issue properly, we can update the summary again at that time.

+++ b/core/modules/filter/filter.module
@@ -1302,10 +1388,25 @@ function _filter_html_settings($form, &$form_state, $filter, $format, $defaults)
+function _filter_html_allowed_html($filter) {
+  $allowed_html = array();
+  $tags = preg_split('/\s+|<|>/', $filter->settings['allowed_html'], -1, PREG_SPLIT_NO_EMPTY);
+  foreach ($tags as $tag) {
+    $allowed_html[$tag] = TRUE;
+  }
+  return $allowed_html;

This is lessening the accuracy relative to CKEditor's default. _filter_xss_attributes() filters out all "on*" attributes, but that's not reflected here. As a result, this patch allows onclick in through the editor (go into source mode, type that, then switch back), whereas HEAD doesn't. It seems that CKEditor doesn't run onclick events on its content anyway, so this isn't necessarily a security problem, but it still seems to be going backwards, and I'm not clear how we fix that, since CKEditor's ACF doesn't support regex rules, from what I can tell.

+++ b/core/modules/filter/filter.module
@@ -495,6 +495,90 @@ function filter_get_filter_types_by_format($format_id) {
+    // From the set of remaining filters (they were filtered by array_filter()
+    // above), collect the list of tags and attributes that are allowed by all
+    // filters, i.e. the intersection of all allowed tags and attributes.
+    $allowed_html = array_reduce($filters, function($intersection, $filter) {

How would this work with something like Media module (and transformation filters in general)? A common Media configuration is to have the "Limit allowed HTML tags" filter not include <img>. But for the Media filter to run later, and convert the inline JSON codes into <img> tags. Meanwhile, on the CKEditor side, the plugin inserts <img> tags, which only on detach() does it turn back into the JSON codes. Wouldn't the above code disallow <img> tags from allowedContent (since it's computing the intersection) and prevent the plugin from working? Or does CKEditor now support a transformation API that could allow the content to always contain the JSON codes only, and therefore, the transformation to <img> would happen in a different part of its pipeline and be immune from allowedContent?

Since no one identified how this patch improves security relative to letting CKEditor use its default ACF

I disagree. That's what I tried to do in #11, but apparently not clearly enough :)

I said this:

Conclusion: despite CKE 4.1's "automatic ACF" mode, it is still essential for Drupal to impose its own, more accurate, and typically more stringent restrictions.

But, you're absolutely right when you say that

This is lessening the accuracy relative to CKEditor's default.

That's why I said this at the bottom of #11:

[…] I realized that the patches so far only define which tags and attributes are allowed, it does not yet allow text formats to indicate which specific styles and classes are allowed. […] But I first want sign-off on the overall approach.

Maybe I should have called out filter_html's specific restrictions (no style attributes and no on* attributes) that are thus not passed on to CKEditor: since we don't pass any allowed styles to ACF, that's already in accordance (but by omission rather than intentionally).
A problem there is that filter_html uses both a whitelist (for the allowed tags) and a blacklist (for the disallowed attributes). CKEditor's ACF currently only supports a whitelist. So it's impossible to express our disallowance of on* attributes. They're going to address that in one of the next releases (TBD, probably 4.2).
That could be a reason for wanting to postpone this issue until after ACF has blacklist support.

You're also right that ACF currently does not support regex rules (or even wildcard rules), but that's coming soon: http://dev.ckeditor.com/ticket/10202.

(I wanted to keep the discussion focused on the high level aspects, that's why I didn't elaborate on these details; but I probably should have. Apologies.)


How would this work with something like Media module (and transformation filters in general)? [… etc. etc. …]

Your assumption that CKEditor would insert <img> tags which upon detaching CKEditor would be turned back into JSON codes assumes that we're using CKEditor Widgets for Media AFAICT. Without CKEditor Widgets, the JSON code would simply be always visible: when editing without or with CKEditor, it would only be rendered as an image when the end user is looking at it.

If you did not refer to CKEditor Widgets, then please clarify.

If you did refer to CKEditor Widgets, then:

  1. we're not using CKEditor Widgets yet, so I cannot answer to the current status with confidence (but we will start working on that next week)
  2. excellent question! :) If CKEditor Widgets weren't immune from allowedContent, then I'd consider that a (critical) bug. We're starting to work on CKEditor Widgets next week, in order to provide the CKEditor developers with feedback to improve CKEditor Widgets, and if this immunity weren't the case, that'd probably be the first bug report we'd provide.

Status:Needs review» Needs work

That could be a reason for wanting to postpone this issue until after ACF has blacklist support.

I'd like to comment more on #13, but for now, setting to "needs work", because I think we need more from the ACF in general. Not postponing though, because we can use this issue to clarify what Drupal needs, and help get that info to the CKEditor devs.

#14: AFAICT it's as simple as also requiring ACF to support 1) wildcarded things (ticket 10202), 2) blacklist support.

I'll await your further comments on #13 before relaying this to the CKEditor developers.

Status:Needs work» Needs review
StatusFileSize
new19.33 KB
FAILED: [[SimpleTest]]: [MySQL] 56,298 pass(es), 0 fail(s), and 1 exception(s).
[ View ]

Something that wasn't mentioned in the issue summary yet: the growing importance also makes structured content more important, for which this issue is of the utmost importance. See http://wimleers.com/article/drupal-8-structured-content-authoring-experi... for more details.


Here's a reroll of the patch in #6, from almost 2.5 months ago. It doesn't address any of the feedback yet, it's just changing things to apply to HEAD again. NR to get testbot to run, don't actually review.

Status:Needs review» Needs work

The last submitted patch, ckeditor_acf-1936392-16.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new1.51 KB
new20.79 KB
PASSED: [[SimpleTest]]: [MySQL] 55,592 pass(es).
[ View ]

Forgot a file :/

Title:Set CKEditor's allowedContent configuration to match Drupal's text filters (to help WYS = WYG)Configure CKEditor's ACF to match Drupal's text filters settings to ensure perfect accuracy
StatusFileSize
new30.31 KB
new34.82 KB
PASSED: [[SimpleTest]]: [MySQL] 56,981 pass(es).
[ View ]

What's new

All previous patches were incomplete in their coverage of possible HTML restrictions that filters may impose. They only conveyed the following filter restrictions:

  1. whitelisting HTML tags
  2. whitelisting HTML attributes for a specific HTML tag

Consequently, that was also the only information we were able to convey to CKEditor's allowedContent setting.

This patch allows for the following filter restrictions:

  1. forbidding (blacklisting) HTML tags
  2. allowing (whitelisting) HTML tags
  3. for a specific allowed HTML tag: disallow any attribute
  4. for a specific allowed HTML tag: allow any attribute, any value
  5. for a specific allowed HTML tag: disallow a specific attribute (partial wildcards allowed, e.g. on-*))
  6. for a specific allowed HTML tag: allow a specific attribute (partial wildcards allowed, e.g. data-*)) and for it a range of allowed attribute values
  7. For points 3, 4, 5 and 6, instead of a specific allowed HTML tag, you can also use the asterisk tag * which means it contains attribute restrictions that apply to all allowed tags.

As you can see, this is very expressive. The only things missing are highly dynamic things like regular expression support to specify attribute values or even callback functions. Only in the most exotic edge cases, those things are necessary; this already allows for much more expressiveness than Drupal core itself needs!

(That means this is able to express more than CKEditor's ACF can, which doesn't allow you to specify specific attribute values for attributes other than style and class. Maybe some day ACF will support that too, despite the limited use for that.)

Changes in code

PHP

To retrieve this metadata from filters, FilterInterface now has FilterInterface::getHTMLRestrictions(). FilterBase::getHTMLRestrictions() returns FALSE, so that filters that are not in fact restricting HTML even provide a sane, correct answer.
There's a new function called filter_get_html_restrictions_by_format() that calculates the end result of applying all those filters in succession (it ignores filters that aren't of the FILTER_TYPE_HTML_RESTRICTOR type). This is what text editors like CKEditor can use to configure features like CKEditor's ACF.

JS/CKEditor

In previous patches, we only had a whitelist of tags and attributes to work with. Now, with FilterInterface::getHTMLRestrictions(), we also have styles and classes. Which means we can now configure the full range of CKEditor's ACF. As of this patch, we do that.

However, CKEditor does not yet have attribute wildcard support (http://dev.ckeditor.com/ticket/10202) nor blacklist support (http://dev.ckeditor.com/ticket/10276). Yet Drupal core's filter_html filter always strips the style attribute and any on* attribute, which is a restriction like described in point 5: for all allowed HTML tags, disallow a specific attribute, with a wildcard.

Since CKEditor's ACF does not have any blacklist support yet, not will it come very soon, and this patch is so very important, the best thing I was able to do is to provide a temporary hack that adds support for it until CKEditor supports it explicitly.
So that's what I did. See the changes to core/modules/ckeditor/js/ckeditor.js.

Before vs. after

Using the "Basic HTML" text format, 3 examples:

  1. Using the <code> tags before this patch would cause them to be stripped. With this patch, they're maintained.
  2. The style and on* attributes are stripped both before and after this patch. The difference is that before, those attributes simply didn't happen to be allowed by any of the CKEditor buttons. After this patch, they're stripped because the text format is configured like that.
  3. Any class or data-* attributes would be stripped before this patch. With this patch, they're maintained. Again: not by pure chance, but because the text format is configured that way.

Use this sample HTML to evaluate: select the Basic HTML text format, click the Source button, paste it, click Source button again (to preview, this applies ACF), click Source button again (to look at resulting source):

<p class="callout"><a onclick="alert('clicked');">sfsdf</a></p>
<p style="color: red"><code>this is red code</co de></p>
<p data-foo="bar">foobar?</p>

Note that the closing "code" tag intentionally contains a space ("co de") in all code samples because filter_html apparently isn't smart enough; causing problems here on d.o… :)

Result before patch:

<p>sfsdf</p>
<p>this is red code</p>
<p>foobar?</p>

Result after patch:

<p class="callout"><a>sfsdf</a></p>
<p><code>this is red code</co de></p>
<p data-foo="bar">foobar?</p>

Conclusion

  • Drupal must steer CKEditor's ACF. Must.
  • ACF is currently less expressive than Drupal's filter system allows. We can approximate now and improve later.
  • But ACF is almost expressive enough for Drupal core's filters.
  • No matter if we ship with or without the work-around, the CKEditor developers are working on expanding the expressiveness, in http://dev.ckeditor.com/ticket/10202 and http://dev.ckeditor.com/ticket/10276.
  • The patch includes a work-around that guarantees full accuracy for Drupal 8 core's default text formats. We can choose to drop that work-around in favor of lessened accuracy, which is another acceptable solution. In that case, the result would look like this:
    <p class="callout"><a onclick="alert('clicked');">sfsdf</a></p>
    <p><code>this is red code</co de></p>
    <p data-foo="bar">foobar?</p>

Title:Configure CKEditor's ACF to match Drupal's text filters settings to ensure perfect accuracyConfigure CKEditor's "Advanced Content Filter" (ACF) to match Drupal's text filters settings

I've only read through about half the patch so far. Here are two things that stuck out:

- I don't think we need to be so elaborate in describing our tag blacklist as a "hack". I think it's a perfectly valid solution to provide a blacklist of our own by using a "instanceLoaded" event until CKEditor provides one natively. However maybe we should simply make it explicit rather than assuming filtered_html is being added. Couldn't we add a simple check server-side if "filtered_html" is present in the filter list and pass that to the editor's JS settings? I think it's only the assumption that it must be added that's the hacky part.

- filter_get_html_restrictions_by_format() has a return value of TRUE here, but the return value for what happens with TRUE is not documented:

+    if ($filter->getType() === FILTER_TYPE_HTML_RESTRICTOR && $filter->getHTMLRestrictions() !== FALSE) {
+      return TRUE;
+    }

+++ b/core/modules/ckeditor/js/ckeditor.jsundefined
@@ -98,8 +100,81 @@ Drupal.editors.ckeditor = {
+   * Drupal's default HTML filtering (the filter_html filter) also blacklists
...
+   * So this function hacks in explicit support for Drupal's filter_html's need
+   * to blacklisting specifically those attributes, until ACF supports

Small comment fix: 'to blacklisting' to 'to blacklist'.

After reading a couple times through this issue, the approach taken here seems correct. We do not want the filter/format system in Drupal to bend to the needs of editors (like CKEditor). Instead, Drupal should produce a standard, predictable representation of the content output and that editors will need to consume. If there is information loss in the exchange, we need to correct that loss at the editor level, as Wim has been pushing for with the CKEditor folks.

I walked through the JavaScript portion of the patch. It looks good. I added restrictive filters to each of the basic text formats in Drupal and walked through the a content item edit page load for each format. I'm able to add attributes to HTML elements except the attributes style and on*.

I don't feel I can give a sufficient review of the PHP portion of the patch that would allow me to RTBC this issue, but from the perspective the JavaScript, it's ready to go.

StatusFileSize
new883 bytes
new34.81 KB
PASSED: [[SimpleTest]]: [MySQL] 57,145 pass(es).
[ View ]

#20:

- I don't think we need to be so elaborate in describing our tag blacklist as a "hack". I think it's a perfectly valid solution to provide a blacklist of our own by using a "instanceLoaded" event until CKEditor provides one natively. However maybe we should simply make it explicit rather than assuming filtered_html is being added. Couldn't we add a simple check server-side if "filtered_html" is present in the filter list and pass that to the editor's JS settings? I think it's only the assumption that it must be added that's the hacky part.

… but it is a hack because it goes into CKEditor's ACF feature's internals: _ACF_HACK_to_support_blacklisted_attributes() is not using a public API!

I'm working closely with the CKEditor guys to either remove this code altogether (blacklisting support in ACF) or to have an official API to do something like this. Once we get to that point, I'd be fine with removing the "warning! hack! SCARY!" language :)

You're right that we can make it so that it only has effect when the text format actually uses the filter_html filter. I preferred to not do that to keep it intentionally hackily broken, to very explicitly convey that this must be temporary. The editor settings are then also not polluted with metadata only used by this hack.
But if you feel strongly about this, I'll definitely add it.

filter_get_html_restrictions_by_format() has a return value of TRUE here, but the return value for what happens with TRUE is not documented: […]

You've misunderstood the code :). That's a return statement in a closure for the call to array_filter(); it's not a return statement for the filter_get_html_restrictions_by_format() function itself!


#21: Fixed the typo. :)

Status:Needs review» Needs work

The last submitted patch, ckeditor_acf-1936392-22.patch, failed testing.

Dozens of "The test did not complete due to a fatal error." fatals, despite #22 being nigh identical to #19. Testbot failure. Retesting.

Status:Needs work» Needs review

#22: ckeditor_acf-1936392-22.patch queued for re-testing.

Status:Needs review» Needs work

+++ b/core/modules/ckeditor/lib/Drupal/ckeditor/Plugin/CKEditorPlugin/Internal.phpundefined
@@ -265,4 +270,134 @@ protected function generateFormatTagsSetting(Editor $editor) {
+      $get_allowed_attribute_values = function() {
+        $values = array_keys(array_filter($attribute_values, $is_not_forbidden));
+        if (count($values)) {
+          return implode(',', $values);
+        }
+        else {
+          return NULL;
+        }
+      };
...
+                $allowed_styles = $get_allowed_attribute_values($attributes['style']);

Where does $attribtue_values come from here?! What am I missing?

Ie. how does $attributes['style'] used in the function?! It does not take an argument.

+++ b/core/modules/ckeditor/lib/Drupal/ckeditor/Plugin/CKEditorPlugin/Internal.phpundefined
@@ -265,4 +270,134 @@ protected function generateFormatTagsSetting(Editor $editor) {
+          // CKEditor only allows specific values for the "class" and "style"
+          // attributes; so ignore any other restrictions.

In contrast does Drupal allow for more restrictions? Its not clear here at all.

+++ b/core/modules/ckeditor/lib/Drupal/ckeditor/Tests/CKEditorTest.phpundefined
@@ -104,13 +108,22 @@ function testGetJSSettings() {
+    // Change the allowed HTML tags; the "allowedContent" and format_tags"
+    // settings for CKEditor should automatically be updated as well.

"format_tags"

+++ b/core/modules/filter/filter.moduleundefined
@@ -416,6 +416,148 @@ function filter_get_filter_types_by_format($format_id) {
+            // The tag is in the intersection, but now we most calculate the
+            // intersection of the allowed attributes.

most => must

+++ b/core/modules/filter/lib/Drupal/filter/Plugin/FilterInterface.phpundefined
@@ -176,6 +176,74 @@ public function prepare($text, $langcode, $cache, $cache_id);
+   *     array(
+   *       'allowedTags' => array(
+   *         // Allows any attribute with any value on the <div> tag.
+   *         'div' => TRUE,
...
+   *   A simpler example, for a very coarse filter:
+   *     array(
+   *       'forbiddenTags' => array('iframe', 'script')
+   *     )
+   *
+   *   The simplest example possible: a filter that doesn't allow any HTML:
+   *     array(
+   *       'allowedTags' => array()
+   *     )

These are supposed to be wrapped in @code tags no?

+++ b/core/modules/filter/lib/Drupal/filter/Tests/FilterAPITest.phpundefined
@@ -2,17 +2,19 @@
-use Drupal\simpletest\WebTestBase;
+use Drupal\simpletest\DrupalUnitTestBase;

Woot :)

+++ b/core/modules/filter/lib/Drupal/filter/Tests/FilterAPITest.phpundefined
@@ -100,10 +109,78 @@ function testFilterFormatAPI() {
+    // Test on stupid_filtered_html, where nothing is disallowed.
+    $stupid_filtered_html_format = entity_create('filter_format', array(
+      'format' => 'stupid_filtered_html',
+      'name' => 'Stupid Filtered HTML',
+      'filters' => array(
+        'filter_html' => array(
+          'status' => 1,
+          'settings' => array(
+            'allowed_html' => '', // Nothing is allowed.

Nothing is allowed vs. nothing is disallowed :)

+++ b/core/modules/filter/lib/Drupal/filter/Tests/FilterAPITest.phpundefined
@@ -100,10 +109,78 @@ function testFilterFormatAPI() {
+          'status' => 1,
+          'settings' => array(
+            'allowed_html' => 'p br a strong',

Wouldn't these be

,
, etc. Not just the names of the tags?

+++ b/core/modules/filter/tests/filter_test/lib/Drupal/filter_test/Plugin/Filter/FilterTestRestrictTagsAndAttributes.phpundefined
@@ -0,0 +1,41 @@
+ *   title = @Translation("Tag & attribute restricting filter"),

I don't think we use & *anywhere* like this(?) I'd just use good old "and" :)

+++ b/core/modules/filter/tests/filter_test/lib/Drupal/filter_test/Plugin/Filter/FilterTestRestrictTagsAndAttributes.phpundefined
@@ -0,0 +1,41 @@
+ *   description = @Translation("Replaces all content with filter and text format information."),

What? Replaces all content with information(?) it seems to return the text as-is passed in with no replacement, so does not seem like a correct description?

Status:Needs work» Needs review
StatusFileSize
new16.95 KB
new7.05 KB
new40.53 KB
PASSED: [[SimpleTest]]: [MySQL] 56,177 pass(es).
[ View ]

#26: Thanks! All fixed. :) Existing tests slightly improved. The reason that the p br a strong case also worked fine is simple: that's in fact also supported by the filter_html format, but almost nobody knows that :)

Now also with test coverage that would have prevented the first problem you pointed out in #26.

People are starting to notice the problems in D8 HEAD and are starting to file bug reports that this issue fixes!

#2020081: H4 element not distinguishable from rest of text (P element)
#2020093: Unable to add formatting in source unless the corresponding wysiwyg menu item is enabled

Status:Needs review» Reviewed & tested by the community

Thanks for the expanded test coverage and the separate interdiffs especially. Looks good to me!

Wim asked me to test this here and it solves the problem that I had. Thanks guys!

Status:Reviewed & tested by the community» Needs work

Relatively minor nits, but "needs work" at least on the incorrect checking of a never defined $allowed_values (see below).

+++ b/core/modules/ckeditor/lib/Drupal/ckeditor/Plugin/CKEditorPlugin/Internal.php
@@ -264,4 +269,145 @@ protected function generateFormatTagsSetting(Editor $editor) {
+        if (count($values)) {
+          return implode(',', $values);
+        }

Just a stylistic suggestion, but since the need to implode is an artifact of CKEditor's data format, I think it would be better to move it out of the closure and into where you actually set the variable further down in the function. Feel free to disagree though.

+++ b/core/modules/ckeditor/lib/Drupal/ckeditor/Plugin/CKEditorPlugin/Internal.php
@@ -264,4 +269,145 @@ protected function generateFormatTagsSetting(Editor $editor) {
+        else if ($attributes === TRUE) {

Though it makes no functional difference, Drupal's convention for PHP code is to use elseif as one word.

+++ b/core/modules/ckeditor/lib/Drupal/ckeditor/Plugin/CKEditorPlugin/Internal.php
@@ -264,4 +269,145 @@ protected function generateFormatTagsSetting(Editor $editor) {
+            'styles' => TRUE,
+            'classes' => TRUE,

I assume the reason this is needed is that CKEditor essentially ignores 'attributes' for 'style' and 'class'? In other words, if 'attributes' => TRUE is set, but 'styles' and 'classes' are left unspecified, then CKEditor removes all styles and classes? If so, then a comment to that effect would be helpful. If not, then a comment as to why this is actually needed would be helpful.

+++ b/core/modules/ckeditor/lib/Drupal/ckeditor/Plugin/CKEditorPlugin/Internal.php
@@ -264,4 +269,145 @@ protected function generateFormatTagsSetting(Editor $editor) {
+                $allowed_styles = $get_allowed_attribute_values($attributes['style']);
+                if (isset($allowed_values)) {

s/$allowed_values/$allowed_styles/. We need test coverage for this as well.

+++ b/core/modules/ckeditor/lib/Drupal/ckeditor/Plugin/CKEditorPlugin/Internal.php
@@ -264,4 +269,145 @@ protected function generateFormatTagsSetting(Editor $editor) {
+                $allowed_styles = $get_allowed_attribute_values($attributes['class']);
+                if (isset($allowed_values)) {

s/$allowed_styles/$allowed_classes/.
s/$allowed_values/$allowed_classes/. + test coverage.

+++ b/core/modules/ckeditor/lib/Drupal/ckeditor/Plugin/CKEditorPlugin/Internal.php
@@ -264,4 +269,145 @@ protected function generateFormatTagsSetting(Editor $editor) {
+            $allowed_styles = $get_allowed_attribute_values($attributes['style']);
+            if (isset($allowed_values)) {

And once more. (another hunk from the one 2 above).

+++ b/core/modules/filter/lib/Drupal/filter/Plugin/FilterInterface.php
@@ -176,6 +176,79 @@ public function prepare($text, $langcode, $cache, $cache_id);
+   * @return array|FALSE
+   *   A nested array with the allowed tags as keys, and for each of those tags

Docs needs fixing for the fact that the first level of the array is 'allowedTags' or 'forbiddenTags'. And the 2nd level of the array varies between these two.

+++ b/core/modules/filter/lib/Drupal/filter/Plugin/FilterInterface.php
@@ -176,6 +176,79 @@ public function prepare($text, $langcode, $cache, $cache_id);
+   *       'allowedTags' => array(

AFAIK, in Drupal PHP code, array keys are always lowercase (i.e., 'allowed_tags'). Perhaps except when we transfer them directly to JS, but that's not what we're doing here. Unless you can find a counter example, I suggest making that change here to match our conventions.

+++ b/core/modules/filter/lib/Drupal/filter/Plugin/FilterInterface.php
@@ -176,6 +176,79 @@ public function prepare($text, $langcode, $cache, $cache_id);
+   *       'forbiddenTags' => array('iframe', 'script')

Ditto.

Status:Needs work» Needs review
StatusFileSize
new16.52 KB
new4.22 KB
new41.93 KB
PASSED: [[SimpleTest]]: [MySQL] 56,362 pass(es).
[ View ]

Just a stylistic suggestion, but since the need to implode is an artifact of CKEditor's data format, I think it would be better to move it out of the closure and into where you actually set the variable further down in the function. Feel free to disagree though.

This already *is* CKEditor-specific code, so that's intentional :)


All points in #31 addressed. Two seperate interdiffs that combined form the whole set of changes: one with fixes for #31, one with the renaming of allowedTags to allowed and forbiddenTags to forbidden_tags, along with updated + extended docs on FilterInterface.

(I talked to effulgentsia in chat to discuss the name changes.)

Status:Needs review» Reviewed & tested by the community

Thanks.

Status:Reviewed & tested by the community» Fixed

Committed to 8.x. Thanks!

Issue tags:+Needs followup

+++ b/core/modules/ckeditor/js/ckeditor.jsundefined
@@ -98,8 +100,81 @@ Drupal.editors.ckeditor = {
+   * This is the only way we could get https://drupal.org/node/1936392 committed
+   * before Drupal 8 code freeze on July 1, 2013. CKEditor has committed to
+   * explicitly supporting this in some way.
+   *
+   * @todo D8 remove this once http://dev.ckeditor.com/ticket/10276 is done.

Do we have a d.o issue to do our part once the CK ticket is resolved? If no, can we file it, and if yes, can we link it here and in the summary? Thanks!

Issue tags:-sprint, -Needs followup

Awesome! This is the last absolutely essential foundational CKEditor integration issue :) Everything else is to provide a delightful experience, but this fixes a fundamental flaw. So glad to have this in :)

#35: Great point — follow-up created: #2023629: Remove our CKEditor ACF work-around once ACF supports blacklisting.

Automatically closed -- issue fixed for 2 weeks with no activity.

Issue summary:View changes

Updated issue summary.