Drupal input filters are used to control what HTML tags a user can use. For example, the default "filtered input" filter allows users to use <a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd>. This means that the node author can use tags such as <strong> to make some portion of the text bold as I've done here.

The problem is that accessible HTML should use CSS for formatting. Most WYSIWYG editors do this properly by writing CSS instead of the (deprecated) HTML tags. But our input filter system strips CSS classes and style attributes, leaving HTML only. And pages don't display as intended by the WYSIWYG editor. The only way to fix this is to replace the CSS style directives with HTML. However, many of the HTML formatting tags have been deprecated (for example <center> <font> <u> <i> are all deprecated). You can still use them, but it's not advisable. And there's just no way of accomplishing certain basic text formatting without CSS (for example, the CSS attribute "text-align: right" doesn't have an equivalent HTML tag).

So, we need to rethink the filter.module.

  • Class attributes should not filtered. For example, <p class="right"> seems like reasonable HTML to allow. Is there any security risk here? I can't think of one.
  • <span> and <div> should not be filtered. These two HTML tags are the building blocks for adding most CSS. By themselves they do very little. But they are containers for holding many CSS directives. Is there any security risk here? I can't think of any harm in allowing these.
  • We need a way of specifying what CSS attributes can or can not be used with the filter, rather than just what HTML tags can or can not be used. This is the tricky part, and will involve an entirely new way of defining filters. Do we restrict to classes? Do we define the permitted attributes? and/or the permitted attribute/value pairs? For Example, do we allow all "font-weight" attributes, "font-weight: bold", or just "class='bold'"?

IMO, these are necessary precursors to adding a WYSIWYG editor to core.

Comments

catch’s picture

I've run into this a few times. Can think of a couple of issues:

If you have some positional css in your theme - say stuff with floats or position: absolute; attached to classes, then potentially someone could re-use that to stick their html element somewhere crazy. Most of the time these are used with IDs, so might just come under 'don't do that', but still potential weirdness could ensue.

It'd be possible to close an existing div by putting

in. Does the htmlcorrector filter in core deal with unopened tags now? Might not be worth worrying about if so.

Personally I'd be against allowing inline style declarations - or at least make this configurable same as the html tags.

Oceria’s picture

I second this. I use css formatting in my nodes as well, but the html filter changes the layout in limited html filter and rss feeds. Adding tags like

does not work.
sun’s picture

Interesting. It seems, I somehow managed to circumvent the removal of 'class' attributes, see project page of Image Assist. While I think that this kind of abuse is okay, evil users could certainly apply classes that will completely break a site's layout.

However, +1 on this in general. How about a module-controlled tag and attribute register? Instead of manually typing a list of allowed tags and attributes, we should /force/ the required (for certain modules like Image Assist or Wysiwyg editors) and allow to /select/ the optional.

beginner’s picture

Yes, allowing style attributes seems logical.

dwees’s picture

Things like:

style="background-image: expression(alert('Is_XSS_HERE?))"

and

style="background-image: %26%23101%26%23120%26%23112%26%23114%26%23101%26%23115%26%23115%26%23105%26%23111%26%23110(alert('Is_XSS_HERE?))"

are going to make this a bit tricky.

markus_petrux’s picture

I wasn't aware of this discussion. Subscribing. :)

I'm writing a filter that is able to parse style properties. Not sure yet where it may lead me though.

Here I posted the function where I'm compiling information about almost all CSS2 attributes and regular expressions to parse them:

http://groups.drupal.org/node/15979#comment-54525

markus_petrux’s picture

Here's a module (comitted to my sandbox space) that tries to validate style properties:

http://cvs.drupal.org/viewvc.py/drupal/contributions/sandbox/markus_petr...

I would appreciate if someone could take a look, so I'm not completely sure if it's secure what I'm doing here.

sun’s picture

Issue tags: +wysiwyg
vito_a’s picture

I'm sorry, but why can't some contributed modules help in this case? IMHO, the http://drupal.org/project/htmlpurifier and http://drupal.org/project/wysiwyg_filter are looking like a good solutions for the problem. Could these be ported to 7.x as soon as it will be released ?

deshilachado’s picture

subscribing

kroimon’s picture

I'm using the http://drupal.org/project/wysiwyg_filter to reach exactly what you want.
It's much more flexible than the default html filter. Maybe it should be included in the core to replace the html filter module...

sun’s picture

Title: Filter options based on CSS attributes, not just HTML tags » Allow more granular HTML filter options for HTML attributes, not just tags
Version: 7.x-dev » 8.x-dev

Unless we find a usable UI in wysiwyg_filter or other contributed modules, such a configuration option would totally blow away novice users.

Aren Cambre’s picture

fastballweb’s picture

I get a bit more verbose on #694904: Split 'Limit HTML Tags' Filter, but to sum up, I think there are two main reasons for using this filter. And I don't think one filter can fully work for both of them. On the one hand, you have people using this for security reasons--to disallow tags/attributes that can allow for XSS and the like. This is why all of the filters available have hardcoded blacklists of some sort--the HTML filter disallows style attributes, WYSIWYG filter (and others) disallow object tags.

On the other hand, sites like mine have a controlled group of trusted users--trusted, that is, not to do anything malicious, but I can't say the same about their HTML coding ability :) I want to control their output only to preserve a consistent style and to prevent them from messing up my site layout. So the blacklists are just a hindrance for me.

This discussion seems to be holding to the idea that we need to maintain the security aspect of this filter at all costs. That's noble but I think it's making this unnecessarily complicated. I could be wrong--maybe I'm in the very small minority--but I have a feeling that a lot of the demand for a fix would be appeased by simply making the attribute-stripping into a separate filter. No need for a complicated UI. Just break this into two filters. The attribute-stripping isn't useful to everyone.

Like I said, I'm prepared to be dead wrong. I just can't imagine many scenarios where one would want to give tons of HTML formatting options to a user that isn't trusted security-wise. So figuring out some elaborate way to allow style attributes while stripping out possible XSS sounds great, but I don't think most users would need it. And splitting the filter wouldn't complicate the UI in any appreciable way, just one more checkbox.

keithm’s picture

subscribe

alauzon’s picture

http://drupal.org/project/wysiwyg_filter made it for me. With it I can pinpoint which CSS I permit and which ones I don't.

Wim Leers’s picture

Status: Active » Closed (fixed)
  • We now have a WYSIWYG editor in core.
  • We also have standardized alignment classes in system.module.css: .text-align-(left|center|right|justify) for text and .align-(left|center|right) for everything else (images, videos, blockquotes …).
  • The <span> tag is allowed by default in "Basic HTML".
  • Things like <p class="right"> are allowed and supported in core out of the box.

Therefore, closing this issue, since the majority of it has been implemented by now. If there still is a need for the remaining few things, let's open a new, more up-to-date issue for it.