I noticed that the security fixes that went into Drupal 6 in http://drupal.org/node/345441 were never applied to Drupal 7, and I couldn't find an existing issue for it either..... um, that's kind of unfortunate :(
Attached is a patch. The patch includes a test to prevent the input format security issue from recurring (no test for the update.php security issue, though, since I assume there's no good way to do that). This also involved reworking some of the existing input format tests.
Comment | File | Size | Author |
---|---|---|---|
#107 | drupal.check-markup-empty.107.patch | 710 bytes | sun |
#95 | drupal.filter-format.95.patch | 28.25 KB | David_Rothstein |
#88 | drupal.filter-format.88.patch | 13.21 KB | sun |
#83 | drupal.filter-format.83.patch | 10.98 KB | sun |
#81 | drupal.filter-format.81.patch | 9.9 KB | sun |
Comments
Comment #2
David_Rothstein CreditAttribution: David_Rothstein commentedI'm putting this back to "needs review".
The test that supposedly failed: (a) does not fail when run on my machine with the patch applied, (b) did not fail before when the testbot tried it (at least I'm pretty sure), and (c) has absolutely nothing to do with the code here, so I can't really imagine how it's being caused by this patch...
Comment #3
Dave ReidI'm wondering if DrupalWebTestCase is really the best place for drupalCreateInputFormat() and drupalDeleteInputFormat(). Right now I don't see that being re-used in other tests besides filter tests... Maybe we should create a FilterTestHelper class?
Comment #4
David_Rothstein CreditAttribution: David_Rothstein commentedThat seems like it might make sense -- although on the other hand, I can imagine certain contrib modules wanting to use those functions in their tests, and DrupalWebTestCase would probably be the most convenient place to put it for them.
Is there any downside to having it in DrupalWebTestCase?
Comment #5
Dave ReidDownside is, having more code/memory during all our simpletests. Contrib module test classes could just extend FilterTestHelper to pick up the functions, if I remember correctly.
Comment #7
David_Rothstein CreditAttribution: David_Rothstein commentedOK, this patch no longer applied because of the "input format" => "text format" name change, and also because the update.php part of it was committed in the meantime via #361699: SA-2008-073: update.php CSRF. (I also just found #361706: SA-2008-073: Content filtering bypass, when filters are being removed and marked it as a duplicate.)
The attached patch follow's Dave Reid's suggestion for the code reorganization. I noticed that the comment module does something similar with its tests, so I guess it makes sense to do it here too. I'm not sure there is actually any practical effect on the amount of code/memory loaded, since it looks like simpletest_get_all_tests() just loads all the code all the time anyway. But at least it is better theoretically :)
Comment #8
David_Rothstein CreditAttribution: David_Rothstein commentedAttaching the patch: attempt #2...
Comment #10
David_Rothstein CreditAttribution: David_Rothstein commentedDear Testbot,
Please stop crying wolf.
Sincerely,
David
:)
Comment #11
pwolanin CreditAttribution: pwolanin commentedtagging
Comment #13
BerdirThe patch couldn't apply the .test changes for some reason, re-rolled..
Comment #14
BerdirComment #15
sunsubscribing
Comment #16
David_Rothstein CreditAttribution: David_Rothstein commentedThanks for the reroll!
I've rerolled the patch again (attached) - this adds some necessary code to filter_form() that was originally in the issue at #11218: Allow default text formats per role, and integrate text format permissions but more properly belongs here.
This also gets a few other minor things - the subclasses now all properly extend FilterHelperTestCase again, and all getInfo() methods now use "public static function getInfo" (I think that change was what caused the patch to fail to apply before, but I'm not sure).
Comment #18
sunTagging to get in my list of issues.
Comment #19
sun#562694: Text formats should throw an error and refuse to process if a plugin goes missing is the proper security hardening fix accounting for vanishing text formats as well as vanishing filters. Comes with tests.
Additionally, the filter system API was changed so that modules can properly react when a text format is updated or deleted.
Comment #20
sunRe-opening based on recent discussion in #562694: Text formats should throw an error and refuse to process if a plugin goes missing
The changes related to text formats (not filters) should be extracted from the patches over there and moved over here. Therefore, status to active, as we're not re-rolling earlier patches in here.
Comment #21
int CreditAttribution: int commentedadd beta blocker tag
Comment #22
philbar CreditAttribution: philbar commentedThis module does not appear on the D7 beta blocker section of the community initiative page. Should it be?
Comment #23
David_Rothstein CreditAttribution: David_Rothstein commentedAs far as how to proceed here, I think zeroing out the content (proposed in the other issue) is OK for a text format that truly is unknown, but we also need to consider this kind of case (also previously discussed):
1. Disable a module that stores some formatted text.
2. Delete a text format; hook_filter_format_delete() never gets called on the above module, since it is disabled.
3. Reenable the module. Through no fault of anyone, the module now has data stored in a text format that no longer exists.
In that scenario we do not want the content stored by that module to be hidden from display; that effectively destroys the site. There is some beginning code at #556022: When a text format is deleted, some modules (like text.module) don't update their data, even though we say they do that would store a mapping of deleted text formats so if we encounter a deleted one we always know how to properly display it.
What we really might need here for starters is some tests (including a scenario like the above); currently the ones for deleting text formats aren't too comprehensive.
Comment #24
mikey_p CreditAttribution: mikey_p commentedI think the mapping idea has more merit then the other patches at #556022: When a text format is deleted, some modules (like text.module) don't update their data, even though we say they do, which seem to be basically the same issue. If they aren't could someone please explain to me the differences between that issue and this one?
Comment #25
David_Rothstein CreditAttribution: David_Rothstein commentedThe main issue here is that the filter module needs to have some secure behavior that applies when someone tries to filter content with a text format that doesn't exist (regardless of where or how that incorrect text format came from).
#556022: When a text format is deleted, some modules (like text.module) don't update their data, even though we say they do is a very real example of how that situation arises currently - and that issue is mainly about preventing that particular situation from arising in the first place. But depending on what is decided there, it could be that it will wind up totally encompassing this issue as well.
Comment #26
David_Rothstein CreditAttribution: David_Rothstein commentedI just realized there is a bit of a complication here in terms of the upgrade path. Because of the way things were done in D6, sites could very well have lots of content (especially from CCK) whose assigned text format no longer exists, but which D6 automatically treats as though it has the D6 site's default format, filtering it in that format every time it is displayed, etc.
How do we deal with preserving behavior for that content on upgrade to D7?
Comment #27
Gábor HojtsyMigrate to the new default format and alert the user about that on upgrade?
Comment #28
pwolanin CreditAttribution: pwolanin commentedHow do we know that the content had the default format before? Do we just migrate all content with an invalid format?
Comment #29
sunBack to the plan in #20.
Comment #30
sunre #26: Core developers want to ignore this problem. If we are not even able to simply update text format values when deleting a text format, then we're more than unable to perform such a update during upgrade. #27 and #28 already figured that we don't even know the proper replacement format for such contents. So as if one wouldn't be enough, at very least we have two poor reasons to not upgrade text format reference values.
Comment #31
sunAny feedback? I think this is RTBC.
Comment #32
bjaspan CreditAttribution: bjaspan commentedI think this patch is fundamentally right. No matter what, if we get a request for a text format we have absolutely no information about, the only safe thing to do is not to render the text at all. You've said it yourself: Falling back to check_plain() is not good enough. I think it would be better to render it as "This content is no longer available" instead of the empty string, but that may be bike-shedding.
I am perplexed why we agree on this issue but not on #556022: When a text format is deleted, some modules (like text.module) don't update their data, even though we say they do. This patch contains the code:
The logic can be restated as: "If the requested format does not exist, replace the format with one that we *know* is safe." Since there is only one such format and it is "return a constant string," the patch simply avoids the complexity of a real format and returns a constant string directly.
The logic in #556022: When a text format is deleted, some modules (like text.module) don't update their data, even though we say they do can be restated as: "If the requested format does not exist, and we know what the site admin wants us to use instead, use it."
David's very reasonable example in #23 shows how these are practically the same use case. A module that is disabled when a format is deleted is never going to know about the deletion. It cannot possibly update its content even if it is just one row in a known SQL table. So either we make all that content inaccessible until the site admin takes some contrib-module-specific action not currently implemented by any code (translation: the site admin is screwed), or we accept the replacement-format approach from #556022: When a text format is deleted, some modules (like text.module) don't update their data, even though we say they do and do what the site admin actually told us she wanted.
So, I'd like to mark this RTBC, but what I really think should happen is that the "fallback text format is a constant string" approach should be integrated with the "replacement format is what the user asked for" approach into a single patch. Otherwise, we'll commit one first and then just have to modify it slightly for the second.
We each own one of these patches. Would you like to combine them, or shall I, or do you still disagree?
Comment #33
sunThe fundamental difference between these two issues is that this issue derives from #562694: Text formats should throw an error and refuse to process if a plugin goes missing, which tries to account for some edge-cases:
The primary difference to #556022: When a text format is deleted, some modules (like text.module) don't update their data, even though we say they do is:
David is right that both issues are also related to data handling of disabled modules. However, a proper solution for disabled modules needs to be discussed over in #734710: API functions that invoke hooks don't work as expected when a disabled module is being uninstalled.
We can't, because we don't know where the content will be used. Ideally, we'd output an administrative warning, but the only safe way is the watchdog message already contained in this patch.
No, that's not what is happening here. Instead: If the requested format does not exist, return nothing.
It is as simple as: Create a user, then create a node, use SQL to delete the user, try to access the node: 403. Likewise, text formats and filters in text formats can only ever be applied, if the data is valid.
Comment #34
yched CreditAttribution: yched commentedAnd CCK never adressed that in any way. [node|user]reference fields keep references to nids|uids that don't exist anymore, and filter them out on output / edit. Nothing new here, we were already not able to mass update db records in a guaranteed non-breaking way in CCK D6. No one ever saw that as a critical 'OMG, we're not enforcing referential integrity'. We're just not able to enforce referential integrity, and have never been.
Text format reconfiguration doesn't involve updating data, only clearing caches. Deleting formats is the issue, and it's an unfrequent case.
Comment #35
bjaspan CreditAttribution: bjaspan commented@sun, I am sympathetic to your position. Realize that the *only* part of Drupal core that handles content deletion in the way you are advocating for is Field API, and that happened mostly because I felt very strongly that it should. It was NOT easy, even though we designed for it from the beginning. It is way too late in the D7 cycle to attempt something like that now. Furthermore, it would have to be much more complicated, because it would not just be Field API---it would have to support every contrib module that stores references to formats, because any such module might be disabled. And you are in fact advocating that it should also handle all references to anything. Finally, unless whatever solution we came up with for Filter API worked instantly for both Field API and disabled contrib modules when they got enabled, we would still want the replacement format solution to keep the site secure and appearing correctly during the transition.
To put it another way, echoing yched: You are demanding a degree of referential integrity that is very difficult in large or distributed database scenarios, that no one has a really good solution to, and that no other part of Drupal (except Field API) even attempts to implement.
Comment #36
Dries CreditAttribution: Dries commentedAs far as I can tell, the patch in #29 is RTBC. Marking it as such for a while -- will commit this later this afternoon unless someone speaks up.
Comment #37
Dries CreditAttribution: Dries commentedNo feedback. Committed to CVS HEAD. Thanks. :-)
Comment #38
David_Rothstein CreditAttribution: David_Rothstein commentedMajor issues:
Minor issues:
Not sure I understand this change? Actually, adding the first part probably makes sense, but why remove it from filter_format_save() also?
I have no idea what that has to do with the patch :) Also, this makes things inconsistent with the adjacent styling for tr.warning and tr.ok, and it doesn't make sense to me on a first glance; why do we ever have to worry about table rows that might exist outside a table?
I'm a big fan of this change - it was silly that the previous code came so close to clearing all caches but didn't actually do it :) However, if flushing all caches really doesn't result in entity_info_cache_clear() automatically happening somewhere along the way, that sounds like there's a bug somewhere?
Comment #39
David_Rothstein CreditAttribution: David_Rothstein commentedBy the way, in terms of how to do the D6->D7 upgrade, I think #27 and #28 probably had it right? - we just have to upgrade any data with an unrecognized format to explicitly use the (old) D6 default format instead.
That may not be perfect, but it's the best we can do - and since that content would have already been displayed using that same format (whenever it was being viewed on the D6 site), we aren't introducing any security issues that the site didn't already have.
Comment #40
David_Rothstein CreditAttribution: David_Rothstein commentedOops, please ignore my minor issue #1. I was reading the patch file incorrectly :)
Comment #41
sunre major issues #38:
IIRC, #562694: Text formats should throw an error and refuse to process if a plugin goes missing contained some additional code to handle vanishing/invalid formats for the user signature, which revealed that there can be many cases in which a format may become invalid.
re minor issues #38:
Comment #42
David_Rothstein CreditAttribution: David_Rothstein commentedWe do know at least the last two things - because D6 behavior chooses them for us. D6 takes such content and displays/treats it as if it were in the D6 default format ('filter_default_format' variable at the moment of the upgrade), so if we store that in the database, we maintain consistency.
Anyway, if we don't solve this, then some people upgrading from D6 to D7 are going to suddenly have a bunch of their CCK fields/etc/everything stop displaying any content. I don't think that's acceptable.
The user's default format in D7 is not necessarily the most permissive possible format - it's just the first one on their list (based on weight). A lot of admins are probably going to choose to put Full HTML at the top of the list and therefore have that as their personal default.
Comment #43
sunIn #520760-41: SA-CORE-2009-007 user signature format, we already tackled the upgrade path for the user signature format in detail. D6 updates all values to the default format already. If there's something left regarding user signature formats, shouldn't we rather re-open that issue?
AFAIK, the upgrade path for CCK fields is not contained in core, but will be a contributed project.
Attached patch contains the only two changes I can see here.
Comment #44
sunHrm. In light of #651086: Cache clearing is an ineffective mess in module_enable() and system_modules_submit(), I think we should revert all of the cache clearing changes. Seems like the topic is a lot more complex.
Comment #45
bjcool CreditAttribution: bjcool commentedSubscribe
Comment #46
Damien Tournoud CreditAttribution: Damien Tournoud commentedSo it seems that the security issue has been committed. Lowering priority.
Comment #47
David_Rothstein CreditAttribution: David_Rothstein commentedWell, it introduced an issue with the upgrade path, which may be critical...
Plus there is not a security issue remaining, but there is still a security weakness along the lines of what #91663: Permission of text format is not checked when editing an entity and instead reset to something a user can use. fixed for the case where the text format does exist (and that issue was critical). So I'm moving back to critical for now.
Comment #48
webchickOh, hai! What's left to get this RTBC? This is the last security issue that I can see which would block beta.
Comment #49
sunFirst of all, we want to commit #44, which reverts changes there were unrelated to this patch (and partially wrong).
@David: Not sure I understand your remaining concerns in #47... I guess it would be possible to start with writing a test to assert that the remaining bug you think of exists?
Comment #50
David_Rothstein CreditAttribution: David_Rothstein commentedAgreed - I will work on writing a test for this tomorrow.
By the way, in terms of this being a beta blocker, I think it probably isn't (at least not for the security reasons). I think what remains is more of a "security hardening" issue, which already exists as a problem in D6 but which we decided (and actually implemented elsewhere) shouldn't exist in D7, so we just have to fix this up to do that here also. I don't think it should block a beta release, though; I'll look more carefully into it tomorrow.
If we're aiming for a perfect upgrade path before beta, then that part of it may be a beta blocker.
Comment #51
sunWe still need to commit #44 (i.e., revert previously committed, bogus changes).
Comment #52
webchickWell. #44 is your own patch. Could we get an independent reviewer to back this up?
Comment #53
David_Rothstein CreditAttribution: David_Rothstein commentedYup, that one is RTBC.
I started working yesterday on a test for the other issues I raised, but it was a little trickier than I expected and I didn't finish it yet. I'll have something to post soon. Sorry for the delay.
In the meantime, no reason not to commit #44.
Comment #54
webchickOk cool. Unfortunately I can't since it no longer applies. ;)
Comment #55
webchickAlso, am I correct that we can essentially remove "beta blocker" and "Security Advisory follow-up" tags from this?
Comment #56
sun1) Looks like instead of reverting that CSS change, someone advanced on it and applied it to the other table row status classes, too. Nice. :)
2) We're waiting for David's draft patch to figure out whether this is still critical and block a beta.
Comment #57
David_Rothstein CreditAttribution: David_Rothstein commentedOops, I was relying on the testbot for that part of it :)
I believe there is one thing that is still a security regression from D6 - whether it should be a beta blocker or not I am not sure. Here are the detailed steps:
I believe that is a security risk - simply loading and resaving a node without editing anything should not be able to convert the node to an insecure format.
The reason this was not an issue in D6 core was that the (site-wide) default text format would have been used in that case, not the admin's personal default. On a site that is otherwise configured securely, silently switching to that format does not open you up to XSS attacks. That is why I think in D7 we need to use the fallback format (i.e. plain text) which is the equivalent format that the site is supposed to always have configured securely in D7.
As mentioned, I'll continue to work on getting this into code :)
Comment #58
David_Rothstein CreditAttribution: David_Rothstein commentedSorry, crosspost. #56 is RTBC if the testbot agrees.
Comment #59
sun@David: Thanks for the clarification. That may indeed be a problem (though we should verify that it exists through a test first). As you know, I'm hard to convince of any programmed/automated format re-selection, regardless of whether it's the fallback format we're falling back to (as that can also expose sensitive information if you consider a content using the PHP code format).
AFAICS, the real issue is that we are using a select list to choose the format. Since the assigned #default_value does not exist in the example case, no format is preselected. And since it is a select list, the user agent/browser simply defaults to the first available select option. -- Therefore, when manually updating a content that's using a format that no longer exists (for any reason), your browser resets the format to the first available option.
Comment #60
sun@David: How's your test coming along? Meanwhile, I've created a patch to fix the bug.
This patch is not RTBC. RTBC is #56.
Comment #61
sunok, stupid me - this was blatantly copied from _form_validate(), which uses $elements instead of $element ;)
Powered by Dreditor.
Comment #62
David_Rothstein CreditAttribution: David_Rothstein commentedLooks excellent! Assuming we are willing to change the UI for this. I think the other way would not be so bad (it's still better than D6, plus in the PHP code case, the sensitive information in this case would be visible to the administrator right away after they saved the node, so they would be able to correct it immediately also). Still, I agree that this "Please select" method is theoretically the best - and might also serve as a way to explain to the user what is going on, since this is a confusing situation (we could even hijack the filter tips for that?).
If all goes well, I'll have something to post tomorrow - sorry for the delay. Note that this issue we're discussing is just part of the followup (the most critical).
Comment #63
webchickOk, committed #56 to HEAD. Back to.. needs review?
Comment #64
David_Rothstein CreditAttribution: David_Rothstein commentedOK, I think the attached might do it.
In the end, it wasn't that much code, especially when combining it with @sun's excellent approach above.
So, this contains:
Comment #66
David_Rothstein CreditAttribution: David_Rothstein commented#64: filter-format-358437-64.patch queued for re-testing.
Comment #68
sunIMHO, this manually injected null option is ugly as hell and totally inconsistent. Select lists are optional by definition. We should fix #140783: A select list without #default_value always passes form validation, so as to be able to fix this remaining bug here cleanly.
Comment #69
sunNow that I properly understood all expectations, #140783-15: A select list without #default_value always passes form validation needs to be fixed very badly.
Comment #70
David_Rothstein CreditAttribution: David_Rothstein commentedI don't think this should be postponed on making the code less ugly. Unless you are saying there is some way in which the code in this patch doesn't actually work, we should continue fixing the critical bug here and it can always be cleaned up later.
Looking at the test failures, I'm guessing I'm going to need to revert this part of the patch:
(Or probably move it to a separate issue.)
I think it makes sense that a format of "0" shouldn't be treated as the default, but the way the user signatures work it's not so simple (that's what caused the test failures), and that part of the patch might not actually be critical or strictly required. Will have to look at it some point later.
Comment #71
sunNo, this change is correct. If there is still code that sets 0 or something else instead of NULL (resp. no #format key at all), then we need to fix that code to set NULL instead.
(and elsewhere) This also passes in case we're still looking at the form. Eventually needs to check page redirection URL.
Powered by Dreditor.
Comment #72
sunhttp://drupal.org/node/140783 is really required + actually in a semi-RTBC state. chx already reviewed privately.
Nevertheless, I've double-checked that no module is passing 0 instead of NULL. But the phpDoc still mentioned that 0 would be a valid value, so fixed that.
Additionally shortened some comments and projected some knowledge I gained from #140783 for this patch. However, we still want to properly fix the underlying bug via #140783.
Comment #74
ksenzee@sun, are these actual test failures, or is this patch dependent on #140783?
Comment #75
David_Rothstein CreditAttribution: David_Rothstein commented@ksenzee: They're due to the user signatures issue I mentioned above. The user signature currently stores "0" in the database for signatures that have not been created yet, thus the empty() => !isset() change breaks the ability of a correct default format to be set the first time you edit your user account.
I agree in theory we want to make everything settle consistently on having NULL (and nothing else) be the indicator for "nothing has been selected yet because the content doesn't exist so the default format should be preselected" but it may turn out to be out of scope for this issue. I only put it in the patch because I thought it would be trivial.
Comment #76
sun#140783: A select list without #default_value always passes form validation landed. So let's fix this properly.
Sorry, only after reviewing the uploaded patch I realized that this patch already changes a user update that changes signature_format values already. Likely should be merged instead.
Comment #77
sunAlso, Text module's schema gets it right:
User's signature_format uses some strange definition.
Comment #78
chx CreditAttribution: chx commentedReshuffled the logic in filter_process_format to make it a bit simpler.
Comment #79
David_Rothstein CreditAttribution: David_Rothstein commentedThis isn't correct. If an administrator doesn't have access to a certain format (which can happen due to #679530: administer filters permission should not affect text format widget) they can't be allowed to edit the content either, or they will be forced to switch its format to something else.
That means if the testbot doesn't fail this patch, we need to add a test for that scenario :)
Right, what needs to happen with user signatures that have a "0" format is this:
(a) If 'signature' is empty and 'signature_format' is 0, convert it to NULL. (That means a user account was created but without the signature ever being set, e.g. without the user having saved their account page.)
(b) If 'signature' is not empty and 'signature_format' is 0, convert it to
variable_get('filter_default_format')
. (That scenario probably couldn't have arisen via core, but could via contrib.)It may be that other core modules have similar complications - it depends on exactly how they were using FILTER_FORMAT_DEFAULT in Drupal 6. We'll need to check. I think I made a mistake when I wrote some of those original update functions earlier in the D7 cycle; it is not quite as simple as changing all instances of "0" to
variable_get('filter_default_format')
.Comment #81
sunAttached patch simplifies the conditions in a different way and incorporates David's feedback. RTBC if bot passes.
Comment #83
sunAlright -- Taxonomy terms also have to be fixed. This patch is RTBC.
Comment #84
webchickSorry, I'm as eager to get this in as the next person, but patches need peer review.
Comment #86
David_Rothstein CreditAttribution: David_Rothstein commentedIt's looking close - and the simplified boolean logic in this one does seem correct.
But I think we should be sticklers here given how tricky this is. I am not sure we need to fix every bit before the beta (an interim patch may be fine in the meantime) but here's my remaining todo list:
@sun pointed out above this might not be testing much. We could probably just load the node and examine the stored format to verify that it was saved.
I think this update should only run on user signatures that have '0' as the current format. It is possible that someone deliberately saved something else, even with an empty signature.
The PHP notice isn't that important, but the behavior is. The D6 version of this security fix specifically ensured that filter_list_format() did not return an empty array when called with a non-existent format. I am not sure the reasoning there (I guess it was to protect people who might be implementing their own filtering mechanism, among other things).
I think it's OK to return an empty array in D7 as there isn't much of an alternative anymore and that is more correct anyway, but we should at least be careful to document that. Let's make sure we're totally happy with the behavior of that function.
Comment #87
David_Rothstein CreditAttribution: David_Rothstein commentedForgot to mention, I should have at least a little time tomorrow to start working on this, if no one beats me to it. If I start on it, I'll assign the issue to myself.
In addition to the above list, there is also:
0. Make the existing tests pass (obviously) :)
7. I am very curious what happens if the user only has access to one text format (so that the 'select' element doesn't even display) but finds themselves in one of these situations, trying to edit content with a format that does not exist. We should verify that that works correctly.
Comment #88
sunAttached patch fixes everything that needs to be fixed.
Comment #89
sunComment #90
sunI've re-read this patch today and can't see any remaining issues that would have to be corrected to get this in.
As mentioned above, some follow-up issues:
- Add tests to verify that users with 'administer filters' only see actually granted formats. Perhaps re-open #679530: administer filters permission should not affect text format widget to do that.
- Revisit/double-check what text_summary() is doing. It looks odd.
Comment #91
David_Rothstein CreditAttribution: David_Rothstein commentedWorking on this a bit today:
Also, with the ambiguous meaning of FILTER_FORMAT_DEFAULT in Drupal 6, I just don't think we know what data contrib modules put in these columns for nodes, comments, etc. But actually, I realized we don't need to. For all core modules, the procedure we came up with above for user signatures is also perfectly safe to run, and we should do that:
Comment #92
chx CreditAttribution: chx commentedThe logic is impossible to follow. I needed to draw up a truth table to figure it out.
Given that if (X)...elseif (Y) is the same as if (!X && Y) so you have if (!(!$format_exists && $user_is_admin) && (!$user_has_access || !$format_exists)) which is the same as if(($format_exists || !$user_is_admin) && (!$user_has_access || !$format_exists)) which is completely incomprehensible as you do not want format exists and user permissions in an OR relation. Much rather let's say
if ((!$format_exists && !$user_is_admin) || ($format_exists && !$user_has_access))
now that makes sense: disable if the format does not exist and your are not admin or if the format exists but you do not have permission. So let's use that in place of the elseif.Note: I let David to use my code or not. I do not care that much. Anything works for me. Let's get this in and Drupal 7 out.
Comment #93
sun@chx: I don't see how your proposal makes the conditions any easier to understand. I don't think they can get any easier than in the current patch.
Comment #94
chx CreditAttribution: chx commentedThe only problem with the current logic is that if you want to see when we disable you need to first evaluate the if condition and then the elseif. My version IMO is way clearer as it's one if only and IMO it mirrors the comment / intrinsic logic a lot better. But again: whatever.
Comment #95
David_Rothstein CreditAttribution: David_Rothstein commentedLet's try the attached patch. I believe this resolves all remaining issues, and ensures a solid upgrade path for unrecognized text formats in all core modules, regardless of what data gets thrown at us.
I did not yet address @chx's comment in #92. I had to laugh reading that, because I made the exact same kind of truth table when I was trying to convince myself that @sun's previous patch was OK :) Personally, I'm happy with either one. To clarify, it seems we're deciding between my original code from the earlier patches:
and the code now being used in the more recent patches:
Like I said, either is fine with me, and hopefully with all the tests we have now we can feel free to switch the logic around without fear of everything breaking :) This patch could probably use an independent reviewer (who has not participated in writing it) to take a look anyway, which would help with that kind of decision. Perhaps @ksenzee will get a chance to do that next week.
Comment #96
sunThanks! I reviewed all changes in detail and they are good to go.
Comment #97
ksenzeeLet me take a look first - David is right that this could use an outside reviewer. I'll do it right now though so I don't hold anything up.
Comment #98
ksenzeeI read through the tests first, at David's suggestion, then read the rest of the patch, and it looks great. I will even say elegant. I am a pretty decent audience here, in the sense that I know very little about the innards of the filter system, and I understand exactly what's going on in the update functions. So contrib shouldn't have any problem following their example.
Oh, and I personally find the existing elseif logic easiest to read. But chx is right that it practically requires a truth table no matter which way you do it, and David is right that the tests should catch any flaws there.
So this is genuinely RTBC. Really nice work, everyone.
Comment #99
webchickAwesome work! I have this on my stack to review tonight when I have a bit more brain-space, unless Dries beats me to it first.
In the meantime, if someone could provide a nice synopsis of the API (technically, database) change here that contrib modules need to do, that would be lovely.
Comment #100
sunSynopsis: Don't store invalid text format ids. 0 is invalid. Non-existing formats are invalid. Store NULL to store no format.
Comment #101
webchickOk cool. Gave this a perusal for 20 minutes or so. Looks to be exactly the same pattern copy/pasted 5-6 times, plus expanded test coverage for text format edge cases. +1!
Committed to HEAD. Great work, folks!
Comment #102
rfaySo it sounds to me like there's no BC break on this one. Or is 0 used often as a text format id?
Comment #103
webchickIt's not a BC break, per se, it's an advisory that modules that previously stored 0 (or an invalid text format ID) in any 'format' columns need to store NULL now instead. If they do have invalid data in their format columns, they need to do something similar to what user module does in its upgrade path. Probably need to anyway just to be safe.
I'm not really sure how to handle this. A note on the 6.x => 7.x upgrade page seems insufficient. OTOH you are right that it's not a BC break of the nature that's usually posted on the devel list.
Comment #104
sunWould be beneficial to announce this nevertheless. Any invalid text format value will lead to an empty string. A content/text having an invalid format assigned is not filtered and not output.
That has been the primary API change of this issue. The second/recent patch that has been committed merely updated existing D6 data to conform with that.
Comment #105
David_Rothstein CreditAttribution: David_Rothstein commentedThe fact that FILTER_FORMAT_DEFAULT ("0") is deprecated in Drupal 7 and that modules would need to write update functions because of that has been documented in the module upgrade guide for a very long time.
What changed here, though (in the recent committed patch) was the recommended/required way those update functions should be written, as well as the consequences if they were not written correctly. So that is probably worth announcing.
I updated the existing module upgrade docs (and added some new docs) for all this via the following diff:
http://drupal.org/node/224333/revisions/view/1164538/1171884
Some of that rewrites a bit what is already there, but the new section is this one:
http://drupal.org/update/modules/6/7#text_format_updates
I also created a HEAD to HEAD issue here:
#925422: Update for changes to stored text formats introduced in [#358437]
Comment #106
David_Rothstein CreditAttribution: David_Rothstein commentedHm. Actually, looking at all of that, I wonder if we still left something out of this issue.
Consider this code, which is still in the filter module:
Is that really correct, given what we did elsewhere? In other words, does calling check_markup() on a piece of text without passing in a specific text format ever make sense anymore?
It seems to be very inconsistent with the security fixes we introduced here. What it means is that if you store "0" for the format in a database table somewhere (e.g. if you did not write the recommended database update functions in this issue) and then try to run check_markup() on it, it will come out as plain text, rather than as an empty string.
Comment #107
sunGood catch!
Comment #108
Dries CreditAttribution: Dries commentedCommitted to CVS HEAD. Thanks.
Comment #109
David_Rothstein CreditAttribution: David_Rothstein commentedYup, I guess that's all we need (at least for Drupal 7). If someone passes in a NULL format to check_markup() it's still a little odd, but it is documented, and should be OK.
Comment #110
sunSince webchick is going to rollback http://drupal.org/node/140783, this issue needs further work. Just for the record. Not going to do that.
Comment #111
ksenzeeThere are tests for this issue which should catch any problems caused by further work on #140783: A select list without #default_value always passes form validation.
Comment #112
Eric_A CreditAttribution: Eric_A commentedI'm hesitant to ask this, but are we sure that no (Postgres) databases can have 0 as id for Filtered HTML in filter_formats after a clean install of Drupal 6? If such databases do exist and such a site has switched the default format to something else, then the upgrade path will convert all Filtered HTML content to that default?
(I would not be surprised if switching from 0 to to another default would cause problems anyways with the 0 id format in such a D6 site.)
Comment #113
David_Rothstein CreditAttribution: David_Rothstein commentedI took a quick look at some Postgres docs:
http://www.postgresql.org/docs/8.1/interactive/datatype.html#DATATYPE-SE...
http://www.postgresql.org/docs/8.2/interactive/functions-sequence.html
and the equivalents for earlier Postgres versions. From those, it sounds to me like 0 for Filtered HTML wouldn't have happened in Drupal 6, but I really don't know. Do you have a specific reason to think it would?
If so, perhaps it's better to post a separate issue and link back here.
Comment #115
effulgentsia CreditAttribution: effulgentsia commentedIf anyone here can explain #1174656: Lower watchdog severity of missing text formats, I'd appreciate it. Thanks.