If an input format is configured with the php filter or withut the html filter, and is enabled for anonymous/authenticated users, we should have hook_requirements spit out a warning. Cwgordon noted it should only do this if registration is open, which might avoid it being overly annoying for sites where it's not likely to be a security risk

Files: 
CommentFileSizeAuthor
#79 filter-security-275811-79.patch30.72 KBpwolanin
Unable to apply patch filter-security-275811-79.patch
[ View ]
#76 filter-security-275811-76.patch30.09 KBpwolanin
Passed: 14733 passes, 0 fails, 0 exceptions
[ View ]
#72 wip-275811-72.patch30.71 KBpwolanin
Failed: 14612 passes, 0 fails, 28 exceptions
[ View ]
#70 wip-275811-70.patch31.68 KBpwolanin
Failed: 14639 passes, 0 fails, 28 exceptions
[ View ]
#69 wip-275811-69.patch31.67 KBpwolanin
Failed: 14632 passes, 0 fails, 31 exceptions
[ View ]
#66 wip-275811-66.patch32.14 KBpwolanin
Failed: Failed to install HEAD.
[ View ]
#58 insecure-filter-configurations-275811-58.patch34.8 KBDavid_Rothstein
Passed: 14763 passes, 0 fails, 0 exceptions
[ View ]
#55 275811.patch30.86 KBJody Lynn
Failed: Failed to apply patch.
[ View ]
#47 insecure-filter-configurations-275811-47.patch30.6 KBDavid_Rothstein
Failed: Failed to apply patch.
[ View ]
#47 filter_permission_warnings.png38.1 KBDavid_Rothstein
#44 insecure-filter-configurations-275811-44.patch27.71 KBDavid_Rothstein
Failed: Failed to apply patch.
[ View ]
#41 insecure-filter-configurations-275811-41.patch28.16 KBDavid_Rothstein
Failed: Failed to apply patch.
[ View ]
#37 insecure-filter-configurations-275811-37.patch21.94 KBDavid_Rothstein
Failed: Failed to apply patch.
[ View ]
#35 insecure-filter-configurations-275811-35.patch21.45 KBDavid_Rothstein
Failed: Failed to apply patch.
[ View ]
#33 insecure_filter_configurations.patch6.88 KBJohn Morahan
Failed: Failed to apply patch.
[ View ]
#29 insecure_filter_configurations_2.patch7.26 KBDavid_Rothstein
Failed: Invalid PHP syntax in modules/php/php.module.
[ View ]
#26 insecure_filter_configurations.patch5.79 KBDavid_Rothstein
Failed: Failed to apply patch.
[ View ]
#12 filter-warnings-2.patch15.99 KBJody Lynn
Failed: Failed to install HEAD.
[ View ]
#11 filter-warnings-2.patch16.11 KBJody Lynn
Failed: Failed to install HEAD.
[ View ]
#7 filter-warnings.patch14.99 KBJody Lynn
Failed: Failed to apply patch.
[ View ]
#4 filter-security-error.patch2.22 KBJody Lynn
Failed: Failed to apply patch.
[ View ]

Comments

Priority:Normal» Critical

i would at least go one step further, to disallow any input format that contains the php filter or without the html filter for anonymous users...

+1 for warning when registration is public. (at least, mabye even disallow)

Big -1 for absolutely disallowing users to do what they want with the filters. If this is on an intranet, there may be no need for such security. Of course, that's probably a small minority, but making that option impossible without hacking core is REALLY not the right way to do this.

I am, however, also against any warning to the user that is persistent. Anything that sits around and is red is incredibly annoying and hurts usability because of that. I would think something like a confirm form that further explains the reasons why this may be bad would be good, but we should be wary of adding warnings that aren't so much educational and just serve to annoy (just look at UAC in Vista).

Status:Active» Needs review
StatusFileSize
new2.22 KB
Failed: Failed to apply patch.
[ View ]

Patch implements the functionality described by catch:
For sites which allow open registration, an error status message is displayed when anonymous users have access to an input filter which does not filter HTML or the PHP input filter.

Like Susurrus I disagreed with comment #2's suggestion to disallow such configurations completely.

I think that using the error level of status problem is appropriate however. A site which allowed open and registration and did not have proper filtration but which would receive this message in vain would be incredibly rare. An example would be a site which was secured by htaccess or some other means, or a site with its own custom filtration mechanism. I think it's worth bothering such a rare site with a potentially unnecessary error message for the benefit of the vast majority of sites.

patch looks visually solid, however I didn't test it yet.

I agree on error status, and on those edge cases, well... you shouldn't trust on a single line of defense ;)

maybe adding some tests?

Status:Needs review» Needs work

Yes, need a test.

Status:Needs work» Needs review
StatusFileSize
new14.99 KB
Failed: Failed to apply patch.
[ View ]

I wrote tests and improved the code a lot (writing the tests helped...)

I split the hook_requirements code into filter.install and php.install (as well as putting the tests separately.)

Filter functionality:
An error message is displayed on the status page when:
- The default format does not contain the HTML filter.
OR
- Anonymous users can access a format that does not contain the HTML filter.
OR
- Authenticated users can access a format that does not contain the HTML filter and user registration is open.

PHP Filter functionality:
An error message is displayed on the status page when:
- The default evaluates PHP.
OR
- Anonymous users can access a format that evaluates PHP.
OR
- Authenticated users can access a format that evaluates PHP and user registration is open.

The tests ensure error messages are displayed under these conditions.

can't you call the database outside the loop?

Status:Needs review» Needs work

I looks like the database query could indeed be outside the loop.

Also I think we should warn about this on sites that have administrator approval for new users too - just because you restrict who can join your site doesn't mean they should be able to pwn it once their in.

Assigned:Unassigned» Jody Lynn

I'll work on those two changes.

Status:Needs work» Needs review
StatusFileSize
new16.11 KB
Failed: Failed to install HEAD.
[ View ]

OK, new patch

Took the db query out of the loop and extended the error message to user registration set to open w/ approval. Updated tests.

StatusFileSize
new15.99 KB
Failed: Failed to install HEAD.
[ View ]

Patch applies fine for me so not sure why the testbed fails. Here's the same patch without the initial ? files.

The testbed has been failing all weeks due to a memory issue on the server. I'll try to test this asap.

Status:Needs review» Needs work

The last submitted patch failed testing.

This is stupid. What about input formats based on filters like Textile, Markdown or BBCode which escape all HTML? Forcing their input or output to be run through the HTML filter is undesirable. On the input side, you may have [code][/code] tags that are safe to use. On the output side, you may have advanced features (like a YouTube embed) that output tags whose general form is unsafe, but whose specific form isn't.

The proper solution is to let filters advertise themselves as "sanitizing", and have a standard testbed of XSS filters to verify this as much as possible. Or, to restructure the input format system so that every format must start with a special 'input filter' whose responsibility it is to output safe HTML (or advertise itself as unsafe), after which additional transformations can be made.

hmm... based upon modular support... i think it's the filter order that is important for your problem here...

what if i want to use [] and <>

in my opinion... HTML filter should be the only one that filters html

(please give an example of html filtering that can't be done with the html filter)

I totally agree with Steven. Additionally, HTML is just one language - and one of our goals is to support different output languages. See also http://groups.drupal.org/node/15996 and http://groups.drupal.org/node/15979

What about input formats based on filters like Textile, Markdown or BBCode which escape all HTML?

they would run before the html-filter, giving escaped html (not <a> but &lt;a&gt;) to the html filter which will apprear in the source code, displaying <a> on the screen.

@alexanderpas you totally missed the point. This example shows that it is already unnecessary to require the HTML filter in that case.

However, if you factor in advanced features (like a safe YouTube embed), this means that you are not just asking people to enable the HTML filter unnecessarily, but also to add unsafe tags like <object> and <embed> to the allowed tag list in order to make existing safe features work. And even more, in this situation, the HTML filter has to be placed later in the filter chain to not fuck things up, when in normal situations, the HTML filter should go first.

So you're requiring an unconventional and very specific configuration in situations that are already safe today. This is adding confusion to a part of Drupal that already confuses most people and in fact promotes unsafe behaviour.

Not to mention that this example demonstrates that the proposed patch fails to catch unsafe situations in the first place, like adding object and embed tags to the HTML filter.

you're totally right about that...

now i'm wondering.... as a usecase... how to allow youtube <object> tags, and disallow other <object> tags, allow bbcode including [code], disallow html except <code>

I can only think of a three-step process....
CLAIM - SANITIZE - OUTPUT

CLAIM:
htmlfilter - claims <code> tags
youtube - claims <object> tags that are for youtube
bbcode - claims [code] tags

SANITIZE:
youtube - ensures youtube objects follow specs.
htmlfilter - removes unwanted HTML (doesn't see youtube HTML and [code], has already identified <code> blocks.)

OUTPUT:
bbcode - modifies BBcode into HTML

in this model, htmlfilter won't see the youtube code, or the bbcode [code], it only sees a placeholder or something....

also, i think this fixes issues with filter-order.

The following should be a good addition to this patch. Right now one can add script, iframe or object to the list of allowed tags and sleep happily -- and the filter's help string: "Allows you to restrict the HTML tags the user can use. It will also remove harmful content such as JavaScript events, JavaScript URLs and CSS styles from those tags that are not removed." doesn't mention that some tags should never be added. At least there should be a warning or a confirmation.

The HTML filter could actually refuse to run if given some tags -- I believe it doesn't make sense to filter if someone allows script, style, iframe, frame, meta, link, object, embed, applet, layer etc.

Maybe there could be also a confirmation/warning for anything with src or similar attribute (like img, bgsound, input), which is usually not exploitable by itself, but makes CSRF easier (and CSRF vulnerabilities are not so rare even in core: DRUPAL-SA-2006-025, DRUPAL-SA-2006-025, DRUPAL-SA-2007-017, DRUPAL-SA-2008-005, DRUPAL-SA-2008-044, DRUPAL-SA-2008-047). It's too late here, does that make sense? :)

Well, too bad that Textile and Markdown do not generate safe HTML.

The Markdown filter doesn't even escape HTML. Used on it's own, <script>whatever()</script> would do fine. Textile would need

==<script>whatever()</script>
==

For Textile after the HTML filter use:

!http://example.com(x"onerror="&#106;&#97;&#118;&#97;&#115;&#99;&#114;&#105;&#112;&#116;&#58;&#97;&#108;&#101;&#114;&#116;&#40;&#39;&#120;&#115;&#115;&#39;&#41;")!

For Markdown after the HTML filter (src="javascript:" works fine in Opera 9.64) use:

![](&#106;&#97;&#118;&#97;&#115;&#99;&#114;&#105;&#112;&#116;&#58;&#97;&#108;&#101;&#114;&#116;&#40;&#39;&#120;&#115;&#115;&#39;&#41;)

Barry Jaspan proposed a taint flag (or a set of taint flags) system.

[...] We declare two new properties of input filters: "Allows XSS attacks" and "Prevents XSS attacks." Every input format is born with the "Allows" taint bit set (since by default, no filtering is performed). Sequentially, each input filter with "Allows" set turns on the "Allows" taint bit for the format, and each input filter with "Prevents" set turns off the "Allows" taint bit for the format. Any format that ends with the "Allows" taint bit set gets a warning, or is disabled unless the loudly-worded checkbox from above it checked, or whatever.

Example filter orders:

Default start: sets taint flag
HTML Filter  : unsets taint flag

-> Input format is safe


Default start: sets taint flag
HTML Filter  : unsets taint flag
Textile      : sets taint flag

-> Input format is unsafe


Default start: sets taint flag
Textile      : sets taint flag
HTML Filter  : unsets taint flag

-> Input format is safe


Default start: sets taint flag
Ideal Textile: unsets taint flag

-> Input format is safe


Default start : sets taint flag
HTML Filter   : unsets taint flag
YouTube       : -

-> Input format is safe

Obviously, we'd need more states to declare the PHP filter and other formats whose unsafeness doesn't hinge on HTML as Unsafe. It also relies on filter authors to properly mark their filters, but they are better suited to do that than end users.

Issue tags:+Release blocker

I'm marking this as a release blocker. There's not one week where a blog on Planet Drupal (!) doesn't have its filters misconfigured. If those people cannot figure it out, who can?

alexanderpas:

i would at least go one step further, to disallow any input format that contains the php filter or without the html filter for anonymous users...

There is currently at least one reason for making "Full HTML" the default; and that is blogapi module, which always uses the default format. Although perhaps that should be addressed elsewhere.

(please give an example of html filtering that can't be done with the html filter)

http://drupal.org/project/wysiwyg_filter does what the HTML filter cannot; it filters HTML attributes and style properties.

Assigned:Jody Lynn» Unassigned
Issue tags:+wysiwyg

Status:Needs work» Needs review
StatusFileSize
new5.79 KB
Failed: Failed to apply patch.
[ View ]

I hope this issue maintains its "release blocker" status -- as a minor API change but a major security improvement, it seems like a no-brainer to me.

The attached patch at least gets things started before code freeze, with a rough proof-of-concept implementation of the idea in #22. It successfully labels 'Full HTML' as unsafe and 'Filtered HTML' as safe, but switches 'Filtered HTML' to unsafe if you include a dangerous tag on the list of allowed tags for the HTML filter (currently just <object> or <embed>, but obviously that list can be expanded later), etc.

Obvious things that still need work:

  • The patch does not currently do anything with the PHP filter at all. It is really too bad that this issue was labeled "stupid" and derailed for so long, because in fact, 50% of the original patch dealt with the PHP filter, and it seems to me that something very similar to that approach is the right way to go for that. The PHP filter is somewhat unique, because no filter which runs after it can ever "sanitize" it.
  • We need to figure out the right way to deal with this information in the user interface:
    1. One interesting synergy is with the patch I've been working on at #11218: Allow default text formats per role, and integrate text format permissions -- there, we want to transform the text format "allowed roles" feature into standard Drupal permissions that appear on the permissions page like everything else. Combined with this patch, we would then be able to label each individual text format permission as a security risks or not, similar to the way other "insecure" permissions in Drupal are already labeled on the permissions page. This would allow site administrators to make clear, consistent choices about the security of their site, all from the same user interface.
    2. In terms of printing more generic warnings, much of the discussion above focuses on anonymous and authenticated users, which is obviously important, but just because I have a third role on my site doesn't necessarily mean I trust the users in that role. I'm tempted to say that some kind of warning should be printed if you ever assign an unsafe format to anyone who is not in the new D7 site administrator role, although that may be overkill.

Yeah, I guess it's an honor that Steven came out of retirement for just long enough to call the patch stupid. Will review.

Patch looks great.

I think we should expand it to handle the PHP filter, and do your UI 'option 1': Alert the admin through text on the regular permissions page. This combined with #11218 will bring a big increase in both security and convenience with input formats.

The other option with PHP filter (which I endorse 100% but not sure what's been said in past discussions) is to retire it from core. If people want to do crazy things with their site, they should at least have to go to contrib.

StatusFileSize
new7.26 KB
Failed: Invalid PHP syntax in modules/php/php.module.
[ View ]

Thanks for looking over the patch!

It seems to me that regardless of whether the PHP module is in core on contrib, we have to design something here that works with it. One possible option is in the attached patch. What I did there is make it so that after the filter module makes a determination about the security of the text format, it invokes a hook that gives other modules a chance to overturn that. The PHP module then labels any format insecure if it finds the PHP filter in it. That seems to work.

One thing I'm wondering a bit about this patch is where do we draw the line between secure and insecure? For example, is the <img> tag sufficient to label a text format insecure? The constants in the code refer to XSS only, but because it's currently implemented as a boolean, it's actually a broader security determination than that at the moment. I'm not sure we want to get into more complicated states than "secure" or "insecure", though - I do like the way the patch is pretty simple right now. Otherwise, it seems to me like this is getting close.

Nifty.

* lets give more info to the poor admin who gets insecure msg
* filter_format_is_insecure() - can we do filter_format_is_secure() and try name things positively? lets only do this if it doesn't make the code awkward.
* needs a test

Status:Needs review» Needs work

The last submitted patch failed testing.

Thank you for taking a first shot at this.

<script> is not a tag you would like to see in the allowed tags list (yes, I've seen cases where it was added).

1. We need to get more information to the admin.

- Perhaps allow modules to add error messages to the flag: eg (don't pin me down on the texts).

$flags['html'][] = t('The tag %tag was added to the HTML filter. This tag, blah blah blah');

or

$flags['html'][] = t('The markdown filter does not ensure the output is safe. Add the HTML filter after this filter');

Then a properly configured HTML filter would do $flags['html'] = FALSE or $flags['html'] = array().

2. We need to cater to other filters as well, such as PHP

eg $flags['php'][] = t('The PHP filter allows users to execute PHP code and do X, Y, and Z to your site.);

StatusFileSize
new6.88 KB
Failed: Failed to apply patch.
[ View ]

This should address #32 and most of #30. Also, added a check for $_POST to prevent the messages being set while the form is being built before submit (ugly, any better way to do this?).

Still needed: tests.

Cool! I think the addition of the $errors array and separating out into 'html' and 'php' is definitely an improvement (it does mean that someone could write a filter that claims to "sanitize" already-executed PHP code, but hopefully they wouldn't do that).

I'll try to look at this a bit more later (and hopefully have a chance to write a test or two), but one thing that comes to mind is that I'm not sure we should be mixing the $errors array with the determination of whether or not the format is secure. For example, this code in the latest patch:

+  // If the output is already safe HTML, then allowing dangerous tags here
+  // will do no harm.  Otherwise we want to generate a useful warning.
+  if ($tags && !empty($errors['html'])) {
+    $errors['html'][] = t('Allowing these tags is dangerous: %tags', array('%tags' => implode(', ', $tags)));
+  }

It seems a little awkward, and difficult for filter module authors to understand how to use correctly. I think we want to make this as simple as possible for them, to make sure they get it right.

An alternative might be to still pass in the $errors array and let them add messages to it when they want to, but also have them return a simple statement, more like in my original patch. For example, something like:

<?php
return array('html' => FILTER_EXPLOIT_PREVENTS);
?>

or

<?php
return array('php' => FILTER_EXPLOIT_ALLOWS);
?>

The filter_format_is_secure() function could then be in charge of clearing data from the $errors array when appropriate, and module authors wouldn't have to worry about it.

Also, about the user interface, the only reason I added any kind of user interface at all to the original patch was so that there would be something simple to test it out with. However, once we have tests, we won't really need that anymore :) I actually think we should make this initial patch just about getting the API right (which can hopefully be finished before the code freeze), and then figure out exactly how and where to present this in the user interface as part of a followup. As mentioned above, I hope that #11218: Allow default text formats per role, and integrate text format permissions will be a part of that.

Status:Needs work» Needs review
StatusFileSize
new21.45 KB
Failed: Failed to apply patch.
[ View ]

Here's a new patch:

  • Changes the API as I discussed above.
  • Adds tests.
  • Adds some API documentation.
  • Removes the UI, so we can try to focus only on the API for now.

I think this might be getting pretty close.

We should maybe think about whether it is better to use a whitelist than a blacklist, when determining which tags are unsafe in the HTML filter...?

Issue tags:+FilterSystemRevamp

This patch does not have to be ready today.

StatusFileSize
new21.94 KB
Failed: Failed to apply patch.
[ View ]

Slightly improved patch - rather than working around a static caching bug in the tests, it's better to actually fix the bug :)

Also some small improvements to the API documentation.

This patch does not have to be ready today.

Definitely agreed.

But I figured it would be useful to at least have it at a pretty far along state.... especially because of its relevance for #11218: Allow default text formats per role, and integrate text format permissions.

Sub.

This test confused me at first:

<?php
+    // Create a new format that is the same as the previous one, but escapes
+    // all HTML first; the text format should be secure.
+    unset($format->format);
+   
$format->name = $this->randomName();
+   
$format->roles = array();
+   
$format->filters = array(
+     
'filter_html_escape' => 1,
+     
'filter_html' => 1,
+    );
?>

Now I think I understand. filter_html_escape sets html => PREVENTS_EXPLOITS. filter_html, even though configured to allow an unsafe tag, can assume that if someone previous asserts html => PREVENTS_EXPLOITS that the dangerous tag must not be present anymore, and therefore it does not have to say it allows exploits due to the dangerous tag. If this logic is not documented somewhere clearly, please add it. :-)

I think the tests would be more clear (and more safe against re-order bugs) if they set $format->settings explicitly for this case instead of inheriting the settings from the previous case. This is particularly true because I notice that you reset $format->roles after some of the cases, making me wonder if something is changing it behind the scenes that requires it to be reset. If that is possible, why is it not possible for $format->settings?

In fact, taking the previous paragraph to its logical conclusion, perhaps you should break testFormatSecurity() into separate functions for each case. Then there is no possibility of re-order bugs. The only downside is the tests will take longer to run, but that is what the testbot is for.

StatusFileSize
new28.16 KB
Failed: Failed to apply patch.
[ View ]

Here is an updated patch:

  • I improved the tests as per @bjaspan's advice - I didn't actually break them up into separate functions, but I did change it so it creates a totally new text format for each group (via a helper function), so it should be safe against reordering bugs now.
  • I also added some new tests based on further discussions with @bjaspan offline: An important case to test is a filter that explicitly allows new HTML exploits (and what happens when it runs after a filter that prevents them). There aren't any core filters that do this, but there are important filters in contrib that do, so I wrote a couple tests for it here. And also it turned out that writing those tests revealed a bug in the previous ones, so it's a good thing I did :)
  • Also did some minor cleanup of the variable names within filter_format_is_secure().

Regarding the confusing test case, that is indeed inherently confusing, but it's really important we explain it well. The patch does already contain a fair amount of documentation, but any suggestions on improving it would be welcome.

The approach I took so far was to make it so that each filter is not at all responsible for thinking about what filters actually ran before them, but rather only imagining the worst case scenarios of content they might receive. The message we're really trying to communicate is something along the lines of the following: "Imagine completely sanitized text is passed in to your filter. If your filter is capable of creating a security flaw out of that, that is the only time you should return FILTER_EXPLOIT_ALLOWS."

Possible ways we can improve the message:

  1. Explicitly state in the documentation that many/most filters should not be using this API at all. The last thing we want is for the author of some harmless filter to think "well, my filter doesn't do any sanitizing, so I need to be returning FILTER_EXPLOIT_ALLOWS" -- that would break the whole system :)
  2. Give a better example of FILTER_EXPLOIT_ALLOWS in the documentation. (I used the PHP filter so far, but that's an unusual one so maybe it would be better to give an HTML-related example, from contrib.)
  3. Maybe consider renaming FILTER_EXPLOIT_ALLOWS to something more like FILTER_EXPLOIT_CREATES?

+++ modules/filter/filter.api.php 10 Sep 2009 01:39:29 -0000
@@ -122,6 +134,56 @@
+ *   way. The warning messages should be keyed by the type of output in which
+ *   the potential vulnerability exists. For example, use $warnings['html'][]
+ *   for an exploit that occurs when content that passes through the text
+ *   format is output as HTML (e.g., cross-site scripting exploits). This is
+ *   the most common type of output, but others are possible; for example, the
+ *   PHP module uses $warnings['php'][] to indicate that the PHP filter
+ *   contains an inherent vulnerability due to the fact that content which
+ *   passes through it is executed as PHP code.

We should be more precise about the possible keys here. Are html and php the only common? Would xss be valid, or would that be js? Do those keys refer to programming languages, maybe?

+++ modules/filter/filter.module 10 Sep 2009 01:39:29 -0000
@@ -15,6 +15,18 @@
/**
+ * Status which indicates that a filter prevents security exploits. This should
+ * be used by filters that sanitize previously-unsafe content.
+ */
+define('FILTER_EXPLOIT_PREVENTS', 1);
+
+/**
+ * Status which indicates that a filter allows a new security exploit, even if
+ * the content passed in to the filter was previously sanitized.
+ */
+define('FILTER_EXPLOIT_ALLOWS', 2);

I don't see the point for those statics. Subtracting 1 and we get TRUE and FALSE.

If we really, really need defines, then we need better names. But I currently don't see the point in polluting defines with those.

+++ modules/filter/filter.module 10 Sep 2009 01:39:29 -0000
@@ -188,8 +200,14 @@ function filter_format_save($format) {
   foreach ($filters as $name => $status) {
     $fields = array();
-    // Add new filters to the bottom.
-    $fields['weight'] = isset($current[$name]->weight) ? $current[$name]->weight : 10;
+    // If a specific weight was requested, use that.
+    if (isset($format->weights[$name])) {
+      $fields['weight'] = $format->weights[$name];
+    }
+    // Otherwise add new filters to the bottom.
+    else {
+      $fields['weight'] = isset($current[$name]->weight) ? $current[$name]->weight : 10;
+    }
     $fields['status'] = $status;

$format->weights[] doesn't work out. Weight is stored per filter, so it needs to be retrieved/stored in $filter['weight'].

Apparently, we only have $status now, and turning that into $filter is done in #570930: Allow to retrieve all filters (including disabled) in a text format. Hence, we should get that patch in first.

+++ modules/filter/filter.module 10 Sep 2009 01:39:29 -0000
@@ -516,6 +535,82 @@ function filter_list_format($format) {
/**
+ * Determines if a text format is configured securely.
+ *
+ * @param $format
+ *   The ID of the format to check.
+ * @param $warnings
+ *   An array which will be filled with warning messages if the format is
+ *   insecure, explaining the reasons why it is insecure. The keys of the
+ *   array represent types of output in which a security vulnerability exists,
+ *   and the values are arrays of messages describing the vulnerability for
+ *   that output type. Possible keys include 'html' (if content that passes
+ *   through the text format will be insecure when output as HTML) and 'php'
+ *   (if content that passes through the text format will be executed as PHP
+ *   code).
+ * @return
+ *   TRUE if the format is configured securely, FALSE otherwise.
+ */
+function filter_format_is_secure($format, &$warnings) {
+++ modules/filter/filter.test 10 Sep 2009 01:39:29 -0000
@@ -765,9 +796,83 @@ class FilterUnitTestCase extends DrupalW
+    $this->assertFalse(filter_format_is_secure($format->format, $warnings), t('A text format with no filters is labeled insecure.'));

The &$warnings argument doesn't make sense here. We only check one format at a time, and filters may only alter $warnings within the same format.

Because $warnings is not initialized in the test, the testbot will most likely fail with exceptions.

This review is powered by Dreditor.

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new27.71 KB
Failed: Failed to apply patch.
[ View ]

Here's a new patch updated for #570930: Allow to retrieve all filters (including disabled) in a text format (which also therefore addresses one of @sun's comments above).

Also:

We should be more precise about the possible keys here. Are html and php the only common? Would xss be valid, or would that be js? Do those keys refer to programming languages, maybe?

I agree this is still confusing, and I'm starting to wonder if replacing 'html' with 'xss' might be worth thinking about...

Part of the confusion is that we only have these two examples, and both are somewhat different:

  1. 'html' means that the format is insecure when its output is processed as HTML (including JS).
  2. 'php' means the format is insecure because the format itself has already evaluated/executed it as PHP code.

We might be able to improve this by coming up with other, specific examples that we can mention in the documentation... but I'm having trouble coming up with any good ones.

For example, I can certainly imagine other output languages besides HTML, but Drupal is used to build websites in HTML, so the others seem like they are not only rare, but also likely to lead to false positives if people start using them (i.e., I don't care if a format is insecure for a language besides HTML that I never intend to use it for).... so we'd have to start thinking about allowing a $types parameter to filter_format_is_secure() or something like that. Overall doesn't sound like a great road to go down.

For examples similar to PHP, it's possible to think of some (the format can run the text through whatever local program on the operating system it wants to, I suppose), but again, pretty rare.

Are there any good, useful examples that I'm overlooking?

I don't see the point for those statics. Subtracting 1 and we get TRUE and FALSE

I think we should leave them as defines, because return array('html' => TRUE) seems to me like it would be easy to get backwards. This is already a little tricky for people to use correctly, so let's make it as clear as possible, IMO.

The &$warnings argument doesn't make sense here. We only check one format at a time, and filters may only alter $warnings within the same format.

Are you saying the &$warnings argument to filter_format_is_secure() doesn't make sense, or the &$warnings argument in the individual filter security callback functions?

I think the $warnings argument to filter_format_is_secure() is OK (and not sure I understand the comment above about one format at a time), but passing in an array by reference to the filter security callbacks does seem like a relic of an earlier version of the patch. Currently, we never pass anything but an empty array into those (since the idea is that each filter is not supposed to know, care, or be able to alter anything about the exact filters that ran before it).

I suppose we could get rid of that parameter and have the callback functions return something like this:

<?php
return array(
 
'status' => array('html' => FILTER_EXPLOIT_ALLOWS),
 
'warnings' => t('Text of the warning.'),
);
?>

Not sure if we should.

Because $warnings is not initialized in the test, the testbot will most likely fail with exceptions.

It doesn't, though :) I guess PHP is OK with passing undefined variables in by reference (which kind of makes some sense)...

By the way, as mentioned above, we still need to deal with this line of the patch, which clearly needs some help:

<?php
  $dangerous_tags
= array('iframe', 'object', 'embed', 'script', 'style'); // Obviously we can expand this later!
?>

If we continue to go with a blacklist, I discovered that the WYSIWYG Filter module seems to maintain one:

<?php
function wysiwyg_filter_get_elements_blacklist() {
  return array(
   
'applet',
   
'area',
   
'base',
   
'basefont',
   
'body',
   
'button',
   
'embed',
   
'form',
   
'frame',
   
'frameset',
   
'head',
   
'html',
   
'iframe',
   
'input',
   
'isindex',
   
'label',
   
'link',
   
'map',
   
'meta',
   
'noframes',
   
'noscript',
   
'object',
   
'optgroup',
   
'option',
   
'param',
   
'script',
   
'select',
   
'style',
   
'textarea',
   
'title',
  );
}
?>

I'm not sure if all of those are really necessary for security, but it might be a good list to start with. (Also, it seems that <img> is not on that list, which is kind of on the border of being considered a security risk, so not sure if we should include it.)

Obviously, whitelists are usually a better approach then blacklists, but the problem if we go with that is that we are then going to start getting false positive security warnings if someone configures the HTML filter to allow a tag that doesn't exist, or some obscure browser-specific tag that we didn't think of, etc.

So, I'm still not sure what to do about this issue either...

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new38.1 KB
new30.6 KB
Failed: Failed to apply patch.
[ View ]

To keep this moving, here's an updated patch to reflect the fact that #11218: Allow default text formats per role, and integrate text format permissions was committed.

Although I still think we should defer most user interface changes to a followup issue, this patch does make the one (obvious) change to user interface that we can now do as a result of the other commit.

See the attached screenshot of the permission page after the attached patch is applied - that looks very, very, very pretty, I think :)

Issue tags:+Security improvements

Adding a useful tag I just discovered...

+++ modules/filter/filter.api.php 26 Sep 2009 21:25:34 -0000
@@ -122,6 +134,56 @@
+ *   warning messages in one of two cases: (1) If the filter opens up a new
+ *   exploit in a previously safe text format, or (2) If the filter could have

(1)... (2)... This should be a - list.

+++ modules/filter/filter.api.php 26 Sep 2009 21:25:34 -0000
@@ -122,6 +134,56 @@
+ * sanitized. For example, a filter that evaluates PHP code would return
+ * array('php' => FILTER_EXPLOIT_ALLOWS). If the security callback function

On a separate line with @code + return + @endcode tags, please.

+++ modules/filter/filter.module 26 Sep 2009 21:25:34 -0000
@@ -7,6 +7,18 @@
+ * Status which indicates that a filter prevents security exploits. This should
+ * be used by filters that sanitize previously-unsafe content.
+ */
+define('FILTER_EXPLOIT_PREVENTS', 1);

PHPDoc for those constants...

"Filter security callback return value indicating possible security exploits.

This should..."

applying to both. (summary on one line)

+++ modules/filter/filter.module 26 Sep 2009 21:25:34 -0000
@@ -648,6 +669,82 @@ function filter_list_format($format, $in
+function filter_format_is_secure($format, &$warnings = array()) {
+  // Start with a blank slate of warning messages.
+  $warnings = array();

1) $format should be a fully populated $format object, not a format id.

2) $warnings is still by reference, but taking $warnings by reference makes only sense for 'security callbacks', not for this function. If this is supposed to make any sense and I just don't get it by looking at this patch, then please update the PHPDoc description for this argument, explicitly stating that it is passed by reference and _why_. :)

And, if it should make any sense, then you also have to remove that re-initialization. ;)

I'm on crack. Are you, too?

.

<?php
 
foreach ($format_status as $status) {
    if (
$status != FILTER_EXPLOIT_PREVENTS) {
      return
FALSE;
    }
  }
  return
TRUE;
?>

Maybe $return = array_flip(form_status); unset($return[FILTER_EXPLOIT_PREVENTS]); return !$return; this will only return TRUE if $return becomes empty -- ie if it only contained FILTER_EXPOLIT_PREVENTS values before we removed that. Just an idea.

re #47: The security warning on the permissions page may be pretty, but I almost couldn't see it despite knowing what I was looking for.

Yes, it would be nice if all of these warnings were red or something (rather than themed as a normal placeholder), but we can't solve that issue here.

To enable that to easily happen, the patch to review is #248598-55: Label permissions which are warned about in the user interface.

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new30.86 KB
Failed: Failed to apply patch.
[ View ]

I made Sun's changes, except I didn't understand the part about PHPDoc for the constant definitions.

filter_format_is_secure seems correct in using filter ID (consistent with filter_list_format)

I'm also confused about $warnings in filter_format_is_secure. I can't see where these warnings are being used.

Regarding Chx's suggestion, that may be more efficient, but it's much less readable, and I don't think the tradeoff is worth it in this case.

Cool! I've been meaning to get back to this (and hope to soon) but have been swamped with other things. I really hope I can get back to this in the next day or so. It would be very good to push this towards an initial commit by October 15 so we don't have to try to argue for a "code slush exception" or something...

I think I finally realize why people are confused about the optional $warnings parameter to filter_format_is_secure()... the fact is, it isn't used anywhere in this patch, so perhaps that's why it looks unnecessary :) But, the idea is that we are really going to need it at some point (see Heine's comment in #32). The goal here is to get this patch in with only the minimal UI changes that it currently contains, but then have until November 15 to really figure out what the best UI is - whatever it's going to be, though, somewhere we're going to want to present the site admin with a list saying "here are the reasons why your text format isn't configured securely". So, since we still want to have a boolean returned by filter_format_is_secure(), but occasionally need to get more info out of it, I did that by having an optional array that you pass in by reference. Suggestions for a better approach are welcome :)

Also, regarding @chx's suggestion, perhaps another thing we could do there is:

<?php
 
return array_search(FILTER_EXPLOIT_ALLOWS, $format_status) === FALSE;
?>

I think that would be OK, and would certainly be readable code. Maybe it protects a bit less against buggy hook implementations than the other approaches did, but not sure that's really a big concern.

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new34.8 KB
Passed: 14763 passes, 0 fails, 0 exceptions
[ View ]

D7UX stuff ate up all my time prior to October 15, unfortunately, so I'm only getting back to this now. However, we weren't really on an October 15 deadline anyway :) This issue is a release blocker, and the number of modules that would be affected by this API change is small enough that they could (and in fact probably should) be contacted by hand if this patch goes in.

I think the attached patch addresses all remaining feedback, such as in #40, #42, #49, #56, etc. I also clarified and expanded a lot of the API documentation.

The remaining TODOs are:

  • Expand the list of unsafe HTML tags so that it is complete. I don't have the time or energy to research this fully at the moment. It sounds like a perfect task for a followup patch.
  • Further use the filter security information in the user interface, which as already mentioned should be a followup issue.

Anyone up for a review and possibly RTBC?

Hm, I hope I don't have to go around begging for patch reviews :) I think this is pretty close, and it would be good to get it in as soon as possible so as to leave time to work on a good user interface.

I've actually gone ahead and (preemptively) created a followup issue at #618902: Design and implement a user interface for warning about insecure text formats so anyone who is interested can start thinking about how best to communicate this security information in the user interface.

+  return array_search(FILTER_EXPLOIT_ALLOWS, $format_security_status) === FALSE;

should use in_array() I'd think. Also, you shoudl pass the 3rd param (strict comparisons) as TRUE.

Also, I'm not sure the logic is correct - for HTML a later filter can counter-act an earlier file. But a filter that executes PHP or perl code cannot really ever be counteracted, I'd imagine - at least not by a filter that runs after. I guess it's ok in that no one would be likely to define such a thing, but seems like exec() vs. markup are fundamenetally different?

For the case of HTML, are 'allows' and 'prevents' accurate as categories? I seems like 'creates' might be more accurate - e.g. your hypothetical filter that transforms [script] into <script> creates new insecure content.

I find the signature of

+function filter_format_is_secure($format_id, &$warnings = array()) {

a bit odd - why not rename it something like

function filter_format_warnings($format_id) {

and have it return the array of warning, rather than the pass-by-reference. If the array of warnings come back empty... then you don't need to display any.

should the warnings also be displayed in the overview table at ?q=admin/config/content/formats ?

I think the approach could be simplified into 2 categories- 'exec; and 'html'.

Only for html-type filters can a later filter affect the easlier filter, so there is really no point in walking though the exec ones.

Also, it would probably be simpler in terms of the loop logic to just get all the items in a filter and then pop them off as long as they are ones that create/allow unsafe html.

Thanks for the review.

  1. Re in_array(): Yes. Duh :)
  2. Also, I'm not sure the logic is correct - for HTML a later filter can counter-act an earlier file. But a filter that executes PHP or perl code cannot really ever be counteracted, I'd imagine - at least not by a filter that runs after.

    Earlier patches in the issue addressed this by having the PHP filter use a different process for labeling a text format insecure - but ultimately it seems cleanest if all filters use the same API. I think, as you said, that it doesn't matter too much - ultimately, this entire mechanism relies on filter module authors labeling their filters properly, so any filter that tried to identify itself as "sanitizing" PHP would not be materially different from a filter that claimed to sanitize HTML when in fact it did not actually do that. Both are big mistakes.

  3. For the case of HTML, are 'allows' and 'prevents' accurate as categories? I seems like 'creates' might be more accurate - e.g. your hypothetical filter that transforms [script] into creates new insecure content.

    I agree. At least from the point of view of the filter (rather than the format), "create" makes more sense - and that is the most important place to get the terminology right since it is module-facing. I suggested FILTER_EXPLOIT_CREATES at one point above but no one responded so I didn't change it. I'll count your comment as a vote of confidence :)

  4. I find the signature of
    +function filter_format_is_secure($format_id, &$warnings = array()) {

    a bit odd - why not rename it something like
    function filter_format_warnings($format_id) {

    and have it return the array of warning, rather than the pass-by-reference. If the array of warnings come back empty... then you don't need to display any.

    Hm, at first I thought this wouldn't work because from the point of view of the filter module author, the warnings should be optional - we need to make sure to always respect the return value of their security callback regardless of whether they add a warning or not.

    However, I guess we have enough information to catch that case and use a default warning along the lines "The %filter filter lets people do bad things to your site", in which case it seems like this plan would work and would be a better API. We could still keep filter_format_is_secure() around as a simple wrapper function to filter_format_warnings(), of course.

  5. should the warnings also be displayed in the overview table at ?q=admin/config/content/formats ?

    Maybe yes, maybe no. The place to decide is #618902: Design and implement a user interface for warning about insecure text formats :)

I'll try to reroll the patch sometime later today with the above changes.

Looks like #63 and #64 were cross-posted. I'd have to think about the 'exec' thing. Theoretically, a filter could have two different types of "execs" so ideally we'd want to continue collecting warning messages from all filters even in that case. It may make sense to put all such examples under the same 'exec' key though (rather than separate ones for 'php', 'perl', etc). On the other hand, that breaks the "rule" that the key is supposed to correspond to some kind of processing language.

Overall, we probably don't want to add too much special-case code for this, since it's a very rare use case. Almost everyone will be using 'html' here. (Are there any known examples besides the core PHP module that would even use something else?)

Status:Needs review» Needs work
StatusFileSize
new32.14 KB
Failed: Failed to install HEAD.
[ View ]

here's an incomplete patch, but look at my changes to filter.module - I think that code can be greatly simplified to achieve the desired result.

+  do {
+    $security = array_pop($format_security_status['html']);
+    if (!empty($security['warning'])) {
+      $warnings[] = $security['warning'];
+    }
+  } while (!empty($format_security_status['html']) && $security['type'] != FILTER_EXPLOIT_PREVENTS);

This part won't work - we need to make sure that no warnings get returned except for a format that actually winds up as FILTER_EXPLOIT_CREATES in the end. (For example, the HTML filter does not actually create an exploit, but it still returns a warning in the case where it is configured to not sanitize completely. However, this warning is not relevant if a previous filter sanitized the text already.) We actually already have a test case for that in the code - see testHTMLFilterUnsafeTagSanitizeBefore().

Also, the "No filter has been added to sanitize raw HTML" message isn't supposed to show up unless it's the only one...

But in terms of the 'exec' stuff, it makes sense somewhat - the question is whether we want to make it so that 'html' and 'exec' are the only possible keys that people can use (is there anything we might be missing there?) vs the previous code which worked more generically for any keys that anyone might use, but did not enforce the fact that the PHP filter is impossible to sanitize.

@David - well, we cold do some slightly more complex processing here - distinguishing between and allow and create.

StatusFileSize
new31.67 KB
Failed: 14632 passes, 0 fails, 31 exceptions
[ View ]

didn't fix up the tests yet, but I think the filter.module part works in this patch. Note revised logic in the array_pop() loop. Simplified the return to jsut be an array of warnings - really don't see why we need to separate out by type.

Status:Needs work» Needs review
StatusFileSize
new31.68 KB
Failed: 14639 passes, 0 fails, 28 exceptions
[ View ]

ok, got the escape-all filter set right also.

setting to CNR is see where tests break.

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new30.71 KB
Failed: 14612 passes, 0 fails, 28 exceptions
[ View ]

better - fixed some tests, remove debug code.

Status:Needs review» Needs work

The last submitted patch failed testing.

For testHTMLFilterUnsafeTag(), we definitely want only one warning to be returned - so the fact that the test in the latest patch was changed to check for two warnings and passes means there is a bug in both the code and the test :)

I'm not really seeing the advantage of 'exec' at this point? I think it's starting to make the code more complex, and the API is now less generically useful; e.g., you could previously use these functions to get information about any "output type" you could dream up and it all worked the same, but now the function is assuming internally that it knows about all of them. As mentioned above, the fact that PHP is impossible to sanitize is not only an edge case but also not an actual problem with the previous approach -- it simply means that no filter should ever claim to sanitize it.

Some of the other changes look good, though, and we should keep those.

Since we are not using the warnings right now - does it matter if it's 1 or 2? It's 2 now since we leave the default warning about unfiltered html.

I see more value in honing the code to the minimum feasible - why make it more complex than the 99% use case requires?

Status:Needs work» Needs review
StatusFileSize
new30.09 KB
Passed: 14733 passes, 0 fails, 0 exceptions
[ View ]

consolidated the code even more - did remove the fixed html/exec categories. Code should again find any categories, but consolidates all warnings into a single return array.

Yeah, this looks better - and I like the idea of having FILTER_EXPLOIT_ALLOWS and FILTER_EXPLOIT_CREATES as separate constants because it forces filter module authors to be explicit and think about the difference.

  1. I haven't looked too closely at the latest internals but I think there is a test case we can write that will fail here - a filter that returns a 'type' but not a 'warning'. (It's a useful one to write anyway, so I'll make sure to write it and see if something fails.)
  2. Instead of 'type', maybe something like 'effect'?
  3. I agree that filter_format_warnings() is more useful returning a single array, but I think it would be better to have it as a wrapper around another function, something like filter_format_warnings_by_category() - to make the by-category information accessible if it is needed. We also need to reintroduce filter_format_is_secure() as a wrapper function, because in cases where you just need a boolean it is awkward to have to explicitly load the warnings (such as in the hook_permission changes in this patch). So I think three functions in total, with the middle one possibly a private function.

Will look more at this later - I think it's overall an improvement.

I don't see why you'd ever need a boolean, since you're always just checking whether the returned array is empty or not as a boolean.

I'm really not sure of the use case for examining the warnings by category - other than unit testing. So, at most I think 2 functions.

I thought briefly about the issue of a filter that doesn't include a warning - though we could set some generic string as a warning in all cases.

StatusFileSize
new30.72 KB
Unable to apply patch filter-security-275811-79.patch
[ View ]

Here's a quick re-roll implmenting a single wrapper so you can get the warnings by category if desired.

Re-test of filter-security-275811-79.patch from comment #79 was requested by pwolanin.

Status:Needs review» Needs work
Issue tags:+Security improvements, +wysiwyg, +Release blocker, +FilterSystemRevamp

The last submitted patch, filter-security-275811-79.patch, failed testing.

Spoke to pwolanin about this, I had what I thought was a good idea about running various XSS vectors on configured formats and seeing if they got through, but that's no help if a filter converts **script** to <script> which means this is the only way to fix this.

It's an API change, but the security implications here are massive, and it's a change which only affects filter module maintainers who need to document security implications - which isn't a bad thing really.

Apart from the patch needing another re-roll, the list of HTML tags being incomplete (and possibly deserving of an _alter()) and a couple of typos for EXPLOT/EXPLOIT this looks good to me though, although I only really gave it a cursory look.

I've been ignoring this issue for a while based on the theory that the "release blocker" tag meant it could get in later, but yeah, I'm not sure we want to keep making that argument forever :)

I can dedicate some time to this over the weekend, but probably not before then. Since it's an API addition more than a change (and a limited one at that) it seems like a good candidate for getting in after the alpha if necessary.

In terms of work to do, I think the comments in #77 are mostly still relevant. And filter_format_is_secure(), even as a one line wrapper function, is absolutely important: this API should be as clear as possible, and a well-named function that returns a strict boolean is necessary for that.

It also occurs to me that it would be very useful to return the warning messages keyed by the filter that provided them. This would for example help possible uses of this in the UI (ability to display the warning message next to the relevant filter, which I believe were part of the proposed designs over at the other issue: #618902: Design and implement a user interface for warning about insecure text formats)

Version:7.x-dev» 8.x-dev

I really like and appreciate all the ideas and thinking that went into this patch, but I'm not happy with this patch yet. It adds a very difficult layer of complexity to the filter API, and while doing so, the gathered information is not verbose enough to make it useful for anything else than those permission strings. I do not want to add this burden to all filter-providing module maintainers for this sole purpose, and there's no time left to advance on this additional feature-set for D7.

In other words: At this stage, the proposed API change is too big, compared to its usefulness.

Priority:Critical» Major

Downgrading all D8 criticals to major per http://drupal.org/node/45111

Issue tags:-Release blocker

.

Category:task» feature

We've now got http://drupal.org/node/1201874 to worry about, I'm moving this to a feature request. This shouldn't prevent development of new features.

Marking as duplicate: #1208988: Deny use of the PHP evaluator filter to Anonymous or Authenticated roles.
Perhaps some of the code there can be used here.