I noticed that the security fixes that went into Drupal 6 in http://drupal.org/node/345441 were never applied to Drupal 7, and I couldn't find an existing issue for it either..... um, that's kind of unfortunate :(

Attached is a patch. The patch includes a test to prevent the input format security issue from recurring (no test for the update.php security issue, though, since I assume there's no good way to do that). This also involved reworking some of the existing input format tests.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Status: Needs review » Needs work

The last submitted patch failed testing.

David_Rothstein’s picture

Status: Needs work » Needs review

I'm putting this back to "needs review".

The test that supposedly failed: (a) does not fail when run on my machine with the patch applied, (b) did not fail before when the testbot tried it (at least I'm pretty sure), and (c) has absolutely nothing to do with the code here, so I can't really imagine how it's being caused by this patch...

Dave Reid’s picture

I'm wondering if DrupalWebTestCase is really the best place for drupalCreateInputFormat() and drupalDeleteInputFormat(). Right now I don't see that being re-used in other tests besides filter tests... Maybe we should create a FilterTestHelper class?

David_Rothstein’s picture

That seems like it might make sense -- although on the other hand, I can imagine certain contrib modules wanting to use those functions in their tests, and DrupalWebTestCase would probably be the most convenient place to put it for them.

Is there any downside to having it in DrupalWebTestCase?

Dave Reid’s picture

Downside is, having more code/memory during all our simpletests. Contrib module test classes could just extend FilterTestHelper to pick up the functions, if I remember correctly.

Status: Needs review » Needs work

The last submitted patch failed testing.

David_Rothstein’s picture

Title: Security fixes from SA-2008-073 not applied to Drupal 7.x » Filter system security fixes from SA-2008-073 not applied to Drupal 7.x
Component: base system » filter.module
Status: Needs work » Needs review

OK, this patch no longer applied because of the "input format" => "text format" name change, and also because the update.php part of it was committed in the meantime via #361699: SA-2008-073: update.php CSRF. (I also just found #361706: SA-2008-073: Content filtering bypass, when filters are being removed and marked it as a duplicate.)

The attached patch follow's Dave Reid's suggestion for the code reorganization. I noticed that the comment module does something similar with its tests, so I guess it makes sense to do it here too. I'm not sure there is actually any practical effect on the amount of code/memory loaded, since it looks like simpletest_get_all_tests() just loads all the code all the time anyway. But at least it is better theoretically :)

David_Rothstein’s picture

Attaching the patch: attempt #2...

Status: Needs review » Needs work

The last submitted patch failed testing.

David_Rothstein’s picture

Status: Needs work » Needs review

Dear Testbot,

Please stop crying wolf.

Sincerely,
David

:)

pwolanin’s picture

tagging

Status: Needs review » Needs work

The last submitted patch failed testing.

Berdir’s picture

The patch couldn't apply the .test changes for some reason, re-rolled..

Berdir’s picture

Status: Needs work » Needs review
sun’s picture

subscribing

David_Rothstein’s picture

Thanks for the reroll!

I've rerolled the patch again (attached) - this adds some necessary code to filter_form() that was originally in the issue at #11218: Allow default text formats per role, and integrate text format permissions but more properly belongs here.

This also gets a few other minor things - the subclasses now all properly extend FilterHelperTestCase again, and all getInfo() methods now use "public static function getInfo" (I think that change was what caused the patch to fail to apply before, but I'm not sure).

Status: Needs review » Needs work

The last submitted patch failed testing.

sun’s picture

Issue tags: +FilterSystemRevamp

Tagging to get in my list of issues.

sun’s picture

Status: Needs work » Closed (duplicate)

#562694: Text formats should throw an error and refuse to process if a plugin goes missing is the proper security hardening fix accounting for vanishing text formats as well as vanishing filters. Comes with tests.

Additionally, the filter system API was changed so that modules can properly react when a text format is updated or deleted.

sun’s picture

Status: Closed (duplicate) » Active

Re-opening based on recent discussion in #562694: Text formats should throw an error and refuse to process if a plugin goes missing

The changes related to text formats (not filters) should be extracted from the patches over there and moved over here. Therefore, status to active, as we're not re-rolling earlier patches in here.

int’s picture

Issue tags: +beta blocker

add beta blocker tag

philbar’s picture

This module does not appear on the D7 beta blocker section of the community initiative page. Should it be?

David_Rothstein’s picture

As far as how to proceed here, I think zeroing out the content (proposed in the other issue) is OK for a text format that truly is unknown, but we also need to consider this kind of case (also previously discussed):

1. Disable a module that stores some formatted text.
2. Delete a text format; hook_filter_format_delete() never gets called on the above module, since it is disabled.
3. Reenable the module. Through no fault of anyone, the module now has data stored in a text format that no longer exists.

In that scenario we do not want the content stored by that module to be hidden from display; that effectively destroys the site. There is some beginning code at #556022: When a text format is deleted, some modules (like text.module) don't update their data, even though we say they do that would store a mapping of deleted text formats so if we encounter a deleted one we always know how to properly display it.

What we really might need here for starters is some tests (including a scenario like the above); currently the ones for deleting text formats aren't too comprehensive.

mikey_p’s picture

I think the mapping idea has more merit then the other patches at #556022: When a text format is deleted, some modules (like text.module) don't update their data, even though we say they do, which seem to be basically the same issue. If they aren't could someone please explain to me the differences between that issue and this one?

David_Rothstein’s picture

The main issue here is that the filter module needs to have some secure behavior that applies when someone tries to filter content with a text format that doesn't exist (regardless of where or how that incorrect text format came from).

#556022: When a text format is deleted, some modules (like text.module) don't update their data, even though we say they do is a very real example of how that situation arises currently - and that issue is mainly about preventing that particular situation from arising in the first place. But depending on what is decided there, it could be that it will wind up totally encompassing this issue as well.

David_Rothstein’s picture

I just realized there is a bit of a complication here in terms of the upgrade path. Because of the way things were done in D6, sites could very well have lots of content (especially from CCK) whose assigned text format no longer exists, but which D6 automatically treats as though it has the D6 site's default format, filtering it in that format every time it is displayed, etc.

How do we deal with preserving behavior for that content on upgrade to D7?

Gábor Hojtsy’s picture

Migrate to the new default format and alert the user about that on upgrade?

pwolanin’s picture

How do we know that the content had the default format before? Do we just migrate all content with an invalid format?

sun’s picture

Assigned: Unassigned » sun
Status: Active » Needs review
FileSize
8.77 KB

Back to the plan in #20.

sun’s picture

re #26: Core developers want to ignore this problem. If we are not even able to simply update text format values when deleting a text format, then we're more than unable to perform such a update during upgrade. #27 and #28 already figured that we don't even know the proper replacement format for such contents. So as if one wouldn't be enough, at very least we have two poor reasons to not upgrade text format reference values.

sun’s picture

Any feedback? I think this is RTBC.

bjaspan’s picture

I think this patch is fundamentally right. No matter what, if we get a request for a text format we have absolutely no information about, the only safe thing to do is not to render the text at all. You've said it yourself: Falling back to check_plain() is not good enough. I think it would be better to render it as "This content is no longer available" instead of the empty string, but that may be bike-shedding.

I am perplexed why we agree on this issue but not on #556022: When a text format is deleted, some modules (like text.module) don't update their data, even though we say they do. This patch contains the code:

+  if (!$format = filter_format_load($format_id)) {
+    watchdog('filter', 'Missing text format: %format.', array('%format' => $format_id), WATCHDOG_ALERT);
+    return '';
+  }

The logic can be restated as: "If the requested format does not exist, replace the format with one that we *know* is safe." Since there is only one such format and it is "return a constant string," the patch simply avoids the complexity of a real format and returns a constant string directly.

The logic in #556022: When a text format is deleted, some modules (like text.module) don't update their data, even though we say they do can be restated as: "If the requested format does not exist, and we know what the site admin wants us to use instead, use it."

David's very reasonable example in #23 shows how these are practically the same use case. A module that is disabled when a format is deleted is never going to know about the deletion. It cannot possibly update its content even if it is just one row in a known SQL table. So either we make all that content inaccessible until the site admin takes some contrib-module-specific action not currently implemented by any code (translation: the site admin is screwed), or we accept the replacement-format approach from #556022: When a text format is deleted, some modules (like text.module) don't update their data, even though we say they do and do what the site admin actually told us she wanted.

So, I'd like to mark this RTBC, but what I really think should happen is that the "fallback text format is a constant string" approach should be integrated with the "replacement format is what the user asked for" approach into a single patch. Otherwise, we'll commit one first and then just have to modify it slightly for the second.

We each own one of these patches. Would you like to combine them, or shall I, or do you still disagree?

sun’s picture

The fundamental difference between these two issues is that this issue derives from #562694: Text formats should throw an error and refuse to process if a plugin goes missing, which tries to account for some edge-cases:

  • If modules are disabled, formats and/or filters may be updated, but disabled modules don't update their data.
  • Filter modules may vanish for various reasons, turning text formats into security holes.
  • check_markup() must fail and return an empty string whenever it is asked to filter text with a non-existing or otherwise bogus text format. This issue was re-opened, because #562694 applied that identical paradigm not only to formats, but also to filters in a format (see previous and last bullet).
  • It's purely about hardening security by accounting for egde-case scenarios.

The primary difference to #556022: When a text format is deleted, some modules (like text.module) don't update their data, even though we say they do is:

  • It's not at all limited to text formats. It applies to any field that stores a value referencing some other value elsewhere.
  • It's not about edge-cases; text formats always have to be re-configured at some point. Stored data simply needs to be updated.
  • It makes this issue, resp. the intention of #562694, much harder to accomplish, because formats suddenly refer to other formats, so figuring out whether data is valid or not becomes a needlessly complex job that requires custom code for every single reference value you have in your data. Runtime code, executed on every normal code path, needs to re-check the data it gets all over again to make sure it's not invalid. Invalidating any existing and known validation stage, whether in Form API or whatever API.
  • It's not about security hardening or preventing the Filter API from filtering content in egde-cases and making it output nothing, it's about ignoring and killing the D in CRUD.

David is right that both issues are also related to data handling of disabled modules. However, a proper solution for disabled modules needs to be discussed over in #734710: API functions that invoke hooks don't work as expected when a disabled module is being uninstalled.

it would be better to render it as "This content is no longer available" instead of the empty string

We can't, because we don't know where the content will be used. Ideally, we'd output an administrative warning, but the only safe way is the watchdog message already contained in this patch.

The logic can be restated as: "If the requested format does not exist, replace the format with one that we *know* is safe."

No, that's not what is happening here. Instead: If the requested format does not exist, return nothing.

It is as simple as: Create a user, then create a node, use SQL to delete the user, try to access the node: 403. Likewise, text formats and filters in text formats can only ever be applied, if the data is valid.

yched’s picture

It's not at all limited to text formats. It applies to any field that stores a value referencing some other value elsewhere.

And CCK never adressed that in any way. [node|user]reference fields keep references to nids|uids that don't exist anymore, and filter them out on output / edit. Nothing new here, we were already not able to mass update db records in a guaranteed non-breaking way in CCK D6. No one ever saw that as a critical 'OMG, we're not enforcing referential integrity'. We're just not able to enforce referential integrity, and have never been.

It's not about edge-cases; text formats always have to be re-configured at some point. Stored data simply needs to be updated.

Text format reconfiguration doesn't involve updating data, only clearing caches. Deleting formats is the issue, and it's an unfrequent case.

bjaspan’s picture

@sun, I am sympathetic to your position. Realize that the *only* part of Drupal core that handles content deletion in the way you are advocating for is Field API, and that happened mostly because I felt very strongly that it should. It was NOT easy, even though we designed for it from the beginning. It is way too late in the D7 cycle to attempt something like that now. Furthermore, it would have to be much more complicated, because it would not just be Field API---it would have to support every contrib module that stores references to formats, because any such module might be disabled. And you are in fact advocating that it should also handle all references to anything. Finally, unless whatever solution we came up with for Filter API worked instantly for both Field API and disabled contrib modules when they got enabled, we would still want the replacement format solution to keep the site secure and appearing correctly during the transition.

To put it another way, echoing yched: You are demanding a degree of referential integrity that is very difficult in large or distributed database scenarios, that no one has a really good solution to, and that no other part of Drupal (except Field API) even attempts to implement.

Dries’s picture

Status: Needs review » Reviewed & tested by the community

As far as I can tell, the patch in #29 is RTBC. Marking it as such for a while -- will commit this later this afternoon unless someone speaks up.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

No feedback. Committed to CVS HEAD. Thanks. :-)

David_Rothstein’s picture

Status: Fixed » Needs work

Major issues:

  1. We need an upgrade path for core here, at least due to this D6 bug: #863680: Deleting an input format does not update the user signature format in the database (i.e., we need to clean up the format of orphaned user signatures so that they don't stop displaying after the D7 upgrade). Not sure if there is anything else.
  2. For the same reason, every single contrib module which stores text format data will need to perform a D6->D7 update as a result of this, so at least we will need to document that in the module upgrade guide. Luckily, it's a simple add-on to the filter update code they already have to write :)
  3. When editing content with a non-existent text format, this patch doesn't seem to behave properly. It pre-selects whatever my personal default text format is (i.e., as if the content is newly created), but that is not safe. Actually, it's difficult to figure out what we should do here. Probably, most users should not be able to edit this content at all (as per the existing approach for that type of thing in http://api.drupal.org/api/function/filter_process_format/7) - we know nothing about what format it was originally intended to display in, and it may be sensitive data. We still need to preserve the ability for high-level admins to edit it though, but it can't start out in their personal default format; we need to use the safest possible format (i.e., the fallback format) here.

Minor issues:

  1. @@ -187,6 +187,10 @@ function filter_format_load($format_id) 
     function filter_format_save(&$format) {
       $format->name = trim($format->name);
       $format->cache = _filter_format_is_cacheable($format);
    +  // Programmatic saves may not contain any filters.
    +  if (!isset($format->filters)) {
    +    $format->filters = array();
    +  }
     
       // Add a new text format.
       if (empty($format->format)) {
    @@ -197,10 +201,6 @@ function filter_format_save(&$format) {
       }
     
       $filter_info = filter_get_filters();
    -  // Programmatic saves may not contain any filters.
    -  if (!isset($format->filters)) {
    -    $format->filters = array();
    -  }
    

    Not sure I understand this change? Actually, adding the first part probably makes sense, but why remove it from filter_format_save() also?

  2. -tr.error {
    +table tr.error {
    

    I have no idea what that has to do with the patch :) Also, this makes things inconsistent with the adjacent styling for tr.warning and tr.ok, and it doesn't make sense to me on a first glance; why do we ever have to worry about table rows that might exist outside a table?

  3. +  // Clear all caches. We need to invoke drupal_flush_all_caches() to ensure
    +  // that also dependent caches are flushed, e.g. the filter cache and field
    +  // cache, and also registered themes are rebuilt, since modules can also
    +  // register themes.
    +  drupal_flush_all_caches();
       entity_info_cache_clear();
    

    I'm a big fan of this change - it was silly that the previous code came so close to clearing all caches but didn't actually do it :) However, if flushing all caches really doesn't result in entity_info_cache_clear() automatically happening somewhere along the way, that sounds like there's a bug somewhere?

David_Rothstein’s picture

By the way, in terms of how to do the D6->D7 upgrade, I think #27 and #28 probably had it right? - we just have to upgrade any data with an unrecognized format to explicitly use the (old) D6 default format instead.

That may not be perfect, but it's the best we can do - and since that content would have already been displayed using that same format (whenever it was being viewed on the D6 site), we aren't introducing any security issues that the site didn't already have.

David_Rothstein’s picture

Oops, please ignore my minor issue #1. I was reading the patch file incorrectly :)

sun’s picture

re major issues #38:

  1. A lazy-update of text formats from D6 is pretty much impossible, because we simply do not know what happened to the originally referenced format. Was it replaced with another? Did it just vanish without replacement? Is the signature using the default format now? And whatever format is actually used currently, can the user even access that format?

    IIRC, #562694: Text formats should throw an error and refuse to process if a plugin goes missing contained some additional code to handle vanishing/invalid formats for the user signature, which revealed that there can be many cases in which a format may become invalid.

  2. That would mean that modules would update from format X to Y without taking into account what format X actually did, and whether format Y is a proper AND secure replacement format.
  3. I think that it's safe to use the user's default format, i.e., the most permissive possible format. I don't see how preselecting the fallback format would be an improvement. Instead, we may want to consider to try to warn the user about the format. In D6, I would've said: Let's uncollapse that text format fieldset. In D7, we may be able to conditionally render a warning? Either as page-level warning (like any other message), or within the text format widget.

re minor issues #38:

  1. IIRC, the added tests revealed the bug that $format->filters may be empty for programmed saves, which can make the storage saving/loading choke.
  2. That CSS change was taken over from #562694: Text formats should throw an error and refuse to process if a plugin goes missing, which had to add an .error class to bogus/invalid text formats in the admin form. Yes, we may want to revert that. But then again, we also want to allow for that, so perhaps ideally dealt with in a separate issue.
  3. I'm not sure whether entity_info_cache_clear() is invoked along the way -- stopped diving deeper after trying to find it in subsequently invoked functions.
David_Rothstein’s picture

A lazy-update of text formats from D6 is pretty much impossible, because we simply do not know what happened to the originally referenced format. Was it replaced with another? Did it just vanish without replacement? Is the signature using the default format now? And whatever format is actually used currently, can the user even access that format?

We do know at least the last two things - because D6 behavior chooses them for us. D6 takes such content and displays/treats it as if it were in the D6 default format ('filter_default_format' variable at the moment of the upgrade), so if we store that in the database, we maintain consistency.

Anyway, if we don't solve this, then some people upgrading from D6 to D7 are going to suddenly have a bunch of their CCK fields/etc/everything stop displaying any content. I don't think that's acceptable.

I think that it's safe to use the user's default format, i.e., the most permissive possible format.

The user's default format in D7 is not necessarily the most permissive possible format - it's just the first one on their list (based on weight). A lot of admins are probably going to choose to put Full HTML at the top of the list and therefore have that as their personal default.

sun’s picture

Status: Needs work » Needs review
FileSize
1.09 KB

In #520760-41: SA-CORE-2009-007 user signature format, we already tackled the upgrade path for the user signature format in detail. D6 updates all values to the default format already. If there's something left regarding user signature formats, shouldn't we rather re-open that issue?

AFAIK, the upgrade path for CCK fields is not contained in core, but will be a contributed project.

Attached patch contains the only two changes I can see here.

sun’s picture

Hrm. In light of #651086: Cache clearing is an ineffective mess in module_enable() and system_modules_submit(), I think we should revert all of the cache clearing changes. Seems like the topic is a lot more complex.

bjcool’s picture

Subscribe

Damien Tournoud’s picture

Priority: Critical » Major

So it seems that the security issue has been committed. Lowering priority.

David_Rothstein’s picture

Priority: Major » Critical
Issue tags: +D7 upgrade path

Well, it introduced an issue with the upgrade path, which may be critical...

Plus there is not a security issue remaining, but there is still a security weakness along the lines of what #91663: Permission of text format is not checked when editing an entity and instead reset to something a user can use. fixed for the case where the text format does exist (and that issue was critical). So I'm moving back to critical for now.

webchick’s picture

Oh, hai! What's left to get this RTBC? This is the last security issue that I can see which would block beta.

sun’s picture

First of all, we want to commit #44, which reverts changes there were unrelated to this patch (and partially wrong).

@David: Not sure I understand your remaining concerns in #47... I guess it would be possible to start with writing a test to assert that the remaining bug you think of exists?

David_Rothstein’s picture

Agreed - I will work on writing a test for this tomorrow.

By the way, in terms of this being a beta blocker, I think it probably isn't (at least not for the security reasons). I think what remains is more of a "security hardening" issue, which already exists as a problem in D6 but which we decided (and actually implemented elsewhere) shouldn't exist in D7, so we just have to fix this up to do that here also. I don't think it should block a beta release, though; I'll look more carefully into it tomorrow.

If we're aiming for a perfect upgrade path before beta, then that part of it may be a beta blocker.

sun’s picture

Status: Needs review » Reviewed & tested by the community

We still need to commit #44 (i.e., revert previously committed, bogus changes).

webchick’s picture

Status: Reviewed & tested by the community » Needs review

Well. #44 is your own patch. Could we get an independent reviewer to back this up?

David_Rothstein’s picture

Status: Needs review » Reviewed & tested by the community

Yup, that one is RTBC.

I started working yesterday on a test for the other issues I raised, but it was a little trickier than I expected and I didn't finish it yet. I'll have something to post soon. Sorry for the delay.

In the meantime, no reason not to commit #44.

webchick’s picture

Status: Reviewed & tested by the community » Needs work

Ok cool. Unfortunately I can't since it no longer applies. ;)

webchick’s picture

Also, am I correct that we can essentially remove "beta blocker" and "Security Advisory follow-up" tags from this?

sun’s picture

Status: Needs work » Needs review
FileSize
1.09 KB

1) Looks like instead of reverting that CSS change, someone advanced on it and applied it to the other table row status classes, too. Nice. :)

2) We're waiting for David's draft patch to figure out whether this is still critical and block a beta.

David_Rothstein’s picture

Status: Needs review » Needs work

Ok cool. Unfortunately I can't since it no longer applies. ;)

Oops, I was relying on the testbot for that part of it :)

Also, am I correct that we can essentially remove "beta blocker" and "Security Advisory follow-up" tags from this?

I believe there is one thing that is still a security regression from D6 - whether it should be a beta blocker or not I am not sure. Here are the detailed steps:

  1. On a fresh installation, create a node with a filtered HTML body.
  2. Somehow make Drupal forget that the filtered HTML format ever existed (right now that is easy to do just by deleting the format - due to #556022: When a text format is deleted, some modules (like text.module) don't update their data, even though we say they do). In any case, somehow get the site in the situation where the node body is stored with a text format that Drupal totally does not recognize.
  3. Now edit the node as an administrator (which they are likely to do, since the node body will have stopped displaying in this case...)
  4. Immediately save the form without changing any settings.
  5. Suddenly, the node body has been converted to Full HTML - i.e., any XSS attacks previously attempted in there will work.

I believe that is a security risk - simply loading and resaving a node without editing anything should not be able to convert the node to an insecure format.

The reason this was not an issue in D6 core was that the (site-wide) default text format would have been used in that case, not the admin's personal default. On a site that is otherwise configured securely, silently switching to that format does not open you up to XSS attacks. That is why I think in D7 we need to use the fallback format (i.e. plain text) which is the equivalent format that the site is supposed to always have configured securely in D7.

As mentioned, I'll continue to work on getting this into code :)

David_Rothstein’s picture

Status: Needs work » Reviewed & tested by the community

Sorry, crosspost. #56 is RTBC if the testbot agrees.

sun’s picture

@David: Thanks for the clarification. That may indeed be a problem (though we should verify that it exists through a test first). As you know, I'm hard to convince of any programmed/automated format re-selection, regardless of whether it's the fallback format we're falling back to (as that can also expose sensitive information if you consider a content using the PHP code format).

AFAICS, the real issue is that we are using a select list to choose the format. Since the assigned #default_value does not exist in the example case, no format is preselected. And since it is a select list, the user agent/browser simply defaults to the first available select option. -- Therefore, when manually updating a content that's using a format that no longer exists (for any reason), your browser resets the format to the first available option.

sun’s picture

@David: How's your test coming along? Meanwhile, I've created a patch to fix the bug.

This patch is not RTBC. RTBC is #56.

sun’s picture

+++ modules/filter/filter.module	14 Sep 2010 21:08:27 -0000
@@ -894,6 +904,17 @@ function filter_form_access_denied($elem
+    form_error($element, t('!name field is required.', array('!name' => $elements['#title'])));

ok, stupid me - this was blatantly copied from _form_validate(), which uses $elements instead of $element ;)

Powered by Dreditor.

David_Rothstein’s picture

Looks excellent! Assuming we are willing to change the UI for this. I think the other way would not be so bad (it's still better than D6, plus in the PHP code case, the sensitive information in this case would be visible to the administrator right away after they saved the node, so they would be able to correct it immediately also). Still, I agree that this "Please select" method is theoretically the best - and might also serve as a way to explain to the user what is going on, since this is a confusing situation (we could even hijack the filter tips for that?).

If all goes well, I'll have something to post tomorrow - sorry for the delay. Note that this issue we're discussing is just part of the followup (the most critical).

webchick’s picture

Status: Reviewed & tested by the community » Needs review

Ok, committed #56 to HEAD. Back to.. needs review?

David_Rothstein’s picture

OK, I think the attached might do it.

In the end, it wasn't that much code, especially when combining it with @sun's excellent approach above.

So, this contains:

  1. A version of @sun's fix for the remaining security issue, with a test.
  2. A fix for the remaining "information-disclosure" type issue (if we don't know the format, we can't display the content to just anyone when they are editing the node, because they might not have had access to it before), and a test for this.
  3. The fix to the upgrade path, plus extensive docs there - since this particular update function, user_update_7010(), will likely be a good one for contrib modules to copy almost exactly.
  4. Minor fixes elsewhere, to be consistent with the idea that NULL (not 0) is now the only way to indicate that newly-created content does not have an assigned text format yet.

Status: Needs review » Needs work
Issue tags: -Security Advisory follow-up, -FilterSystemRevamp, -D7 upgrade path, -beta blocker

The last submitted patch, filter-format-358437-64.patch, failed testing.

David_Rothstein’s picture

Status: Needs work » Needs review

#64: filter-format-358437-64.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +Security Advisory follow-up, +FilterSystemRevamp, +D7 upgrade path, +beta blocker

The last submitted patch, filter-format-358437-64.patch, failed testing.

sun’s picture

IMHO, this manually injected null option is ugly as hell and totally inconsistent. Select lists are optional by definition. We should fix #140783: A select list without #default_value always passes form validation, so as to be able to fix this remaining bug here cleanly.

sun’s picture

Status: Needs work » Postponed

Now that I properly understood all expectations, #140783-15: A select list without #default_value always passes form validation needs to be fixed very badly.

David_Rothstein’s picture

Status: Postponed » Needs work

I don't think this should be postponed on making the code less ugly. Unless you are saying there is some way in which the code in this patch doesn't actually work, we should continue fixing the critical bug here and it can always be cleaned up later.

Looking at the test failures, I'm guessing I'm going to need to revert this part of the patch:

-  if (empty($element['#format'])) {
+  if (!isset($element['#format'])) {

(Or probably move it to a separate issue.)

I think it makes sense that a format of "0" shouldn't be treated as the default, but the way the user signatures work it's not so simple (that's what caused the test failures), and that part of the patch might not actually be critical or strictly required. Will have to look at it some point later.

sun’s picture

+++ modules/filter/filter.module	16 Sep 2010 05:32:11 -0000
@@ -821,9 +821,10 @@ function filter_process_format($element)
   // Use the default format for this user if none was selected.
-  if (empty($element['#format'])) {
+  if (!isset($element['#format'])) {
     $element['#format'] = filter_default_format($user);
   }

No, this change is correct. If there is still code that sets 0 or something else instead of NULL (resp. no #format key at all), then we need to fix that code to set NULL instead.

+++ modules/filter/filter.test	16 Sep 2010 05:32:12 -0000
@@ -534,10 +534,35 @@ class FilterFormatAccessTestCase extends
+    // Now select a new text format and make sure the node can be saved.
+    $edit[$body_format_key] = filter_fallback_format();
+    $this->drupalPost('node/' . $node->nid . '/edit', $edit, t('Save'));
+    $this->assertText($new_title, t('New title found.'));
+    $this->assertNoText($old_title, t('Old title not found.'));

(and elsewhere) This also passes in case we're still looking at the form. Eventually needs to check page redirection URL.

Powered by Dreditor.

sun’s picture

Status: Needs work » Needs review
FileSize
9.15 KB

http://drupal.org/node/140783 is really required + actually in a semi-RTBC state. chx already reviewed privately.

Nevertheless, I've double-checked that no module is passing 0 instead of NULL. But the phpDoc still mentioned that 0 would be a valid value, so fixed that.

Additionally shortened some comments and projected some knowledge I gained from #140783 for this patch. However, we still want to properly fix the underlying bug via #140783.

Status: Needs review » Needs work

The last submitted patch, drupal.filter-format.72.patch, failed testing.

ksenzee’s picture

@sun, are these actual test failures, or is this patch dependent on #140783?

David_Rothstein’s picture

@ksenzee: They're due to the user signatures issue I mentioned above. The user signature currently stores "0" in the database for signatures that have not been created yet, thus the empty() => !isset() change breaks the ability of a correct default format to be set the first time you edit your user account.

I agree in theory we want to make everything settle consistently on having NULL (and nothing else) be the indicator for "nothing has been selected yet because the content doesn't exist so the default format should be preselected" but it may turn out to be out of scope for this issue. I only put it in the patch because I thought it would be trivial.

sun’s picture

Status: Needs work » Needs review
FileSize
9.7 KB

#140783: A select list without #default_value always passes form validation landed. So let's fix this properly.

Sorry, only after reviewing the uploaded patch I realized that this patch already changes a user update that changes signature_format values already. Likely should be merged instead.

sun’s picture

Also, Text module's schema gets it right:

    'format' => array(
      'type' => 'int',
      'unsigned' => TRUE,
      'not null' => FALSE,
    ),

User's signature_format uses some strange definition.

chx’s picture

Reshuffled the logic in filter_process_format to make it a bit simpler.

David_Rothstein’s picture

  1. +  // Administrators can edit any format, users can only edit the stored and
    +  // preselected text format if it exists and they have access to it.
    +  if ($user_is_admin || ($user_has_access && $format_exists)) {
    +    // However, if the stored text format no longer exists then none of the
    +    // text formats can be used as default value, so force the administrator
    +    // to choose a new format. The previous condition enforces that only
    +    // administrators can get here for non existing formats.
    +    if (!$format_exists) {
    +      $element['format']['format']['#default_value'] = NULL;
    +    }
    

    This isn't correct. If an administrator doesn't have access to a certain format (which can happen due to #679530: administer filters permission should not affect text format widget) they can't be allowed to edit the content either, or they will be forced to switch its format to something else.

    That means if the testbot doesn't fail this patch, we need to add a test for that scenario :)

  2. Sorry, only after reviewing the uploaded patch I realized that this patch already changes a user update that changes signature_format values already. Likely should be merged instead.

    Right, what needs to happen with user signatures that have a "0" format is this:

    (a) If 'signature' is empty and 'signature_format' is 0, convert it to NULL. (That means a user account was created but without the signature ever being set, e.g. without the user having saved their account page.)
    (b) If 'signature' is not empty and 'signature_format' is 0, convert it to variable_get('filter_default_format'). (That scenario probably couldn't have arisen via core, but could via contrib.)

    It may be that other core modules have similar complications - it depends on exactly how they were using FILTER_FORMAT_DEFAULT in Drupal 6. We'll need to check. I think I made a mistake when I wrote some of those original update functions earlier in the D7 cycle; it is not quite as simple as changing all instances of "0" to variable_get('filter_default_format').

Status: Needs review » Needs work

The last submitted patch, drupal.filter-format.78.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
9.9 KB

Attached patch simplifies the conditions in a different way and incorporates David's feedback. RTBC if bot passes.

Status: Needs review » Needs work

The last submitted patch, drupal.filter-format.81.patch, failed testing.

sun’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
10.98 KB

Alright -- Taxonomy terms also have to be fixed. This patch is RTBC.

webchick’s picture

Status: Reviewed & tested by the community » Needs review

Sorry, I'm as eager to get this in as the next person, but patches need peer review.

Status: Needs review » Needs work

The last submitted patch, drupal.filter-format.83.patch, failed testing.

David_Rothstein’s picture

It's looking close - and the simplified boolean logic in this one does seem correct.

But I think we should be sticklers here given how tricky this is. I am not sure we need to fix every bit before the beta (an interim patch may be fine in the meantime) but here's my remaining todo list:

  1. Write a test for the bug that was almost reintroduced in #78 (even administrators can't be allowed to edit content that is in a format they don't have access to). That's a very tricky one, and we also missed it; really should have a test.
  2. Verify all the various modules' update functions with respect to how they update the D6 behavior of "format = 0" (I am particularly curious about node.module); we've seen already that taxonomy and user signatures are not so simple.
  3. +    // Now select a new text format and make sure the node can be saved.
    +    $edit[$body_format_key] = filter_fallback_format();
    +    $this->drupalPost('node/' . $node->nid . '/edit', $edit, t('Save'));
    +    $this->assertText($new_title, t('New title found.'));
    +    $this->assertNoText($old_title, t('Old title not found.'));
    

    @sun pointed out above this might not be testing much. We could probably just load the node and examine the stored format to verify that it was saved.

  4. The code comments in user_update_7010() now seem out of date - we probably just need to rewrite that from scratch given what we know now.
  5. +  // Replace signature_format with NULL for all empty user signatures.
    +  db_update('users')
    +    ->fields(array('signature_format' => NULL))
    +    ->condition('signature', '')
    +    ->execute();
    

    I think this update should only run on user signatures that have '0' as the current format. It is possible that someone deliberately saved something else, even with an empty signature.

  6. I noticed that filter_list_format() gives a PHP notice - via a call from text_summary() - when viewing a node teaser that is stored with a format that does not exist.

    The PHP notice isn't that important, but the behavior is. The D6 version of this security fix specifically ensured that filter_list_format() did not return an empty array when called with a non-existent format. I am not sure the reasoning there (I guess it was to protect people who might be implementing their own filtering mechanism, among other things).

    I think it's OK to return an empty array in D7 as there isn't much of an alternative anymore and that is more correct anyway, but we should at least be careful to document that. Let's make sure we're totally happy with the behavior of that function.

David_Rothstein’s picture

Forgot to mention, I should have at least a little time tomorrow to start working on this, if no one beats me to it. If I start on it, I'll assign the issue to myself.

In addition to the above list, there is also:

0. Make the existing tests pass (obviously) :)
7. I am very curious what happens if the user only has access to one text format (so that the 'select' element doesn't even display) but finds themselves in one of these situations, trying to edit content with a format that does not exist. We should verify that that works correctly.

sun’s picture

  1. Tests for filter admins to only see the formats they have access to would be nice, but that's no beta blocker. Separate issue. In fact, we could re-open the original, since I obiously #failed to add tests in that patch. ;)
  2. Double-checked node_update_7006() and it updates 0 to the last known default format, thus OK.
  3. Yes, that test should be improved. Since it's added in this beta blocking patch, it should be done correctly now.
  4. I actually attempted to rewrite user_update_7010(), but realized that it's still valid. Also, let's not hold up this beta blocker for inline code comments deeply buried in an module update function.
  5. We can add another ->condition('signature_format', 0). I didn't add that in the last patch, because I am considering filters that add stuff to entirely empty text as bogus filters, not supported by Drupal core, as that is an utterly wrong usage of filters. But anyway, if we want to add that condition, we can do it.
  6. Thanks for mentioning text_summary(). Now, what happens there looks very odd to me. However, I'm not able to see any case where it would throw a PHP notice. But aside from that, I'm questioning that entire function body. Separate issue.
  7. The additional case you're raising is already covered. The primary condition is that the format does not exist, and if the user only has access to one format, then the user is not an administrator. Non-administrative users are not allowed to edit the field.

Attached patch fixes everything that needs to be fixed.

sun’s picture

Status: Needs work » Needs review
sun’s picture

I've re-read this patch today and can't see any remaining issues that would have to be corrected to get this in.

As mentioned above, some follow-up issues:

- Add tests to verify that users with 'administer filters' only see actually granted formats. Perhaps re-open #679530: administer filters permission should not affect text format widget to do that.

- Revisit/double-check what text_summary() is doing. It looks odd.

David_Rothstein’s picture

Assigned: sun » David_Rothstein
Status: Needs review » Needs work

Working on this a bit today:

  1. Adding the test here will give us confidence that the major reworking of filter_process_format() in this patch doesn't break anything, which it almost did. Plus, it should be simple to add.
  2. At least {block_custom} needs to be updated to have NULL as the default format value for its 'format' column (to match the other updates).

    Also, with the ambiguous meaning of FILTER_FORMAT_DEFAULT in Drupal 6, I just don't think we know what data contrib modules put in these columns for nodes, comments, etc. But actually, I realized we don't need to. For all core modules, the procedure we came up with above for user signatures is also perfectly safe to run, and we should do that:

    • If 'column' is empty and 'column_format' is 0, change to NULL.
    • If 'column' is not empty and 'column_format' is 0, change to the last known default format.
    • If 'column_format' is anything else unrecognized, change to the last known default format also.
  3. Looks good. (Although I don't understand why assertUrl() uses var_export() when the output of url() is already a string?)
  4. The code comments are wrong, and should be simple to fix. We want user_update_7010() to be clean so contrib modules can use it as a model for this somewhat-tricky update.
  5. I think we need that - if someone submitted a user account page with an empty signature, we still should respect whatever format they saved with it (it is a legitimate setting in the UI).
  6. Yes, text_summary() itself is pretty odd - although I only brought that up because it was the caller of filter_list_format() which is where I thought I saw the bug. However, I can no longer reproduce the bug, and looking at the code I agree that filter_list_format() already behaves correctly for a filter that was properly disabled. I am not sure what I did to trigger the PHP notice; perhaps I had hacked around with the database while testing something and then forgot I did that. Agreed there is nothing to fix here.
  7. I just tried it and if there is only one format remaining it does not work correctly. It works fine for non-admins, but when an administrator goes to edit it, they get it automatically set to "Plain text" (with no "- Select -" option) and then a PDOException when they try to save it :) I will write a test for that and fix it.
chx’s picture

The logic is impossible to follow. I needed to draw up a truth table to figure it out.

A format_exists
B user_is_admin
C user_has_access

A B C Do we disable?
0 0 0 1
0 0 1 1 
0 1 0 0
0 1 1 0
1 0 0 1  
1 0 1 0
1 1 0 1
1 1 1 0 

Given that if (X)...elseif (Y) is the same as if (!X && Y) so you have if (!(!$format_exists && $user_is_admin) && (!$user_has_access || !$format_exists)) which is the same as if(($format_exists || !$user_is_admin) && (!$user_has_access || !$format_exists)) which is completely incomprehensible as you do not want format exists and user permissions in an OR relation. Much rather let's say if ((!$format_exists && !$user_is_admin) || ($format_exists && !$user_has_access)) now that makes sense: disable if the format does not exist and your are not admin or if the format exists but you do not have permission. So let's use that in place of the elseif.

Note: I let David to use my code or not. I do not care that much. Anything works for me. Let's get this in and Drupal 7 out.

sun’s picture

@chx: I don't see how your proposal makes the conditions any easier to understand. I don't think they can get any easier than in the current patch.

chx’s picture

The only problem with the current logic is that if you want to see when we disable you need to first evaluate the if condition and then the elseif. My version IMO is way clearer as it's one if only and IMO it mirrors the comment / intrinsic logic a lot better. But again: whatever.

David_Rothstein’s picture

Assigned: David_Rothstein » Unassigned
Status: Needs work » Needs review
FileSize
28.25 KB

Let's try the attached patch. I believe this resolves all remaining issues, and ensures a solid upgrade path for unrecognized text formats in all core modules, regardless of what data gets thrown at us.

I did not yet address @chx's comment in #92. I had to laugh reading that, because I made the exact same kind of truth table when I was trying to convince myself that @sun's previous patch was OK :) Personally, I'm happy with either one. To clarify, it seems we're deciding between my original code from the earlier patches:

  if (($format_exists && !$user_has_access) || (!$format_exists && !$user_is_admin)) {
    // Deny access to the text format selector.
    // ...
  }
  elseif (!$format_exists && $user_is_admin) {
    // Force the administrator to choose a new format.
    // ...
  }

and the code now being used in the more recent patches:

  if (!$format_exists && $user_is_admin) {
    // Force the administrator to choose a new format.
    // ...
  }
  elseif (!$user_has_access || !$format_exists) {
    // Deny access to the text format selector.
    // ...
  }

Like I said, either is fine with me, and hopefully with all the tests we have now we can feel free to switch the logic around without fear of everything breaking :) This patch could probably use an independent reviewer (who has not participated in writing it) to take a look anyway, which would help with that kind of decision. Perhaps @ksenzee will get a chance to do that next week.

sun’s picture

Assigned: Unassigned » sun
Status: Needs review » Reviewed & tested by the community

Thanks! I reviewed all changes in detail and they are good to go.

ksenzee’s picture

Status: Reviewed & tested by the community » Needs review

Let me take a look first - David is right that this could use an outside reviewer. I'll do it right now though so I don't hold anything up.

ksenzee’s picture

Status: Needs review » Reviewed & tested by the community

I read through the tests first, at David's suggestion, then read the rest of the patch, and it looks great. I will even say elegant. I am a pretty decent audience here, in the sense that I know very little about the innards of the filter system, and I understand exactly what's going on in the update functions. So contrib shouldn't have any problem following their example.

Oh, and I personally find the existing elseif logic easiest to read. But chx is right that it practically requires a truth table no matter which way you do it, and David is right that the tests should catch any flaws there.

So this is genuinely RTBC. Really nice work, everyone.

webchick’s picture

Issue tags: +API change

Awesome work! I have this on my stack to review tonight when I have a bit more brain-space, unless Dries beats me to it first.

In the meantime, if someone could provide a nice synopsis of the API (technically, database) change here that contrib modules need to do, that would be lovely.

sun’s picture

Synopsis: Don't store invalid text format ids. 0 is invalid. Non-existing formats are invalid. Store NULL to store no format.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Ok cool. Gave this a perusal for 20 minutes or so. Looks to be exactly the same pattern copy/pasted 5-6 times, plus expanded test coverage for text format edge cases. +1!

Committed to HEAD. Great work, folks!

rfay’s picture

So it sounds to me like there's no BC break on this one. Or is 0 used often as a text format id?

webchick’s picture

It's not a BC break, per se, it's an advisory that modules that previously stored 0 (or an invalid text format ID) in any 'format' columns need to store NULL now instead. If they do have invalid data in their format columns, they need to do something similar to what user module does in its upgrade path. Probably need to anyway just to be safe.

I'm not really sure how to handle this. A note on the 6.x => 7.x upgrade page seems insufficient. OTOH you are right that it's not a BC break of the nature that's usually posted on the devel list.

sun’s picture

Would be beneficial to announce this nevertheless. Any invalid text format value will lead to an empty string. A content/text having an invalid format assigned is not filtered and not output.

That has been the primary API change of this issue. The second/recent patch that has been committed merely updated existing D6 data to conform with that.

David_Rothstein’s picture

The fact that FILTER_FORMAT_DEFAULT ("0") is deprecated in Drupal 7 and that modules would need to write update functions because of that has been documented in the module upgrade guide for a very long time.

What changed here, though (in the recent committed patch) was the recommended/required way those update functions should be written, as well as the consequences if they were not written correctly. So that is probably worth announcing.

I updated the existing module upgrade docs (and added some new docs) for all this via the following diff:
http://drupal.org/node/224333/revisions/view/1164538/1171884

Some of that rewrites a bit what is already there, but the new section is this one:
http://drupal.org/update/modules/6/7#text_format_updates

I also created a HEAD to HEAD issue here:
#925422: Update for changes to stored text formats introduced in [#358437]

David_Rothstein’s picture

Hm. Actually, looking at all of that, I wonder if we still left something out of this issue.

Consider this code, which is still in the filter module:

 * @param $text
 *   The text to be filtered.
 * @param $format_id
 *   The format id of the text to be filtered. If no format is assigned, the
 *   fallback format will be used.
.....
 */
function check_markup($text, $format_id = NULL, $langcode = '', $cache = FALSE) {
  if (empty($format_id)) {
    $format_id = filter_fallback_format();
  }

Is that really correct, given what we did elsewhere? In other words, does calling check_markup() on a piece of text without passing in a specific text format ever make sense anymore?

It seems to be very inconsistent with the security fixes we introduced here. What it means is that if you store "0" for the format in a database table somewhere (e.g. if you did not write the recommended database update functions in this issue) and then try to run check_markup() on it, it will come out as plain text, rather than as an empty string.

sun’s picture

Status: Fixed » Needs review
FileSize
710 bytes

Good catch!

Dries’s picture

Priority: Critical » Normal
Status: Needs review » Reviewed & tested by the community

Committed to CVS HEAD. Thanks.

David_Rothstein’s picture

Priority: Normal » Critical
Status: Reviewed & tested by the community » Fixed

Yup, I guess that's all we need (at least for Drupal 7). If someone passes in a NULL format to check_markup() it's still a little odd, but it is documented, and should be OK.

sun’s picture

Status: Fixed » Needs work

Since webchick is going to rollback http://drupal.org/node/140783, this issue needs further work. Just for the record. Not going to do that.

ksenzee’s picture

Status: Needs work » Fixed

There are tests for this issue which should catch any problems caused by further work on #140783: A select list without #default_value always passes form validation.

Eric_A’s picture

I'm hesitant to ask this, but are we sure that no (Postgres) databases can have 0 as id for Filtered HTML in filter_formats after a clean install of Drupal 6? If such databases do exist and such a site has switched the default format to something else, then the upgrade path will convert all Filtered HTML content to that default?
(I would not be surprised if switching from 0 to to another default would cause problems anyways with the 0 id format in such a D6 site.)

David_Rothstein’s picture

I took a quick look at some Postgres docs:

http://www.postgresql.org/docs/8.1/interactive/datatype.html#DATATYPE-SE...
http://www.postgresql.org/docs/8.2/interactive/functions-sequence.html

and the equivalents for earlier Postgres versions. From those, it sounds to me like 0 for Filtered HTML wouldn't have happened in Drupal 6, but I really don't know. Do you have a specific reason to think it would?

If so, perhaps it's better to post a separate issue and link back here.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

effulgentsia’s picture

If anyone here can explain #1174656: Lower watchdog severity of missing text formats, I'd appreciate it. Thanks.