I note that although PHP considers a filepath like '/home/somebody/junk' or 'sites/default/files/junk' to be a correct argument/streamwrapper for all file operations, the code in file_unmanaged_copy() thinks this is wrong. But unless we want to be way different from PHP's intent, it's not wrong.
Also, this code emits an error message but then doesn't return false to indicate an error.
Comment | File | Size | Author |
---|---|---|---|
#66 | drupal.file_example_bad_watchdog_param.patch | 1.09 KB | rfay |
#62 | drupal.file_invalid_scheme_874326_62.patch | 16.66 KB | chx |
#53 | drupal.file_invalid_scheme_874326_53.patch | 17.93 KB | rfay |
#51 | drupal.file_invalid_scheme_874326_51.patch | 16.08 KB | rfay |
#48 | drupal.invalid_scheme_874326_48.patch | 14.44 KB | rfay |
Comments
Comment #1
webchickMakes sense to me, although this definitely needs sign-off from drewish, pwolanin, or quicksketch. In the meantime in
A $destnation without...
it should be "destination". :)Comment #2
aaron CreditAttribution: aaron commentedi believe that we do not expose the standard php file stream in drupal, instead implementing (by default) public:// and private:// to enforce a standard folder path that will be available to modules regardless of the site configuration. if someone wanted to write outside of those folders, the best bet would be to implement a new stream wrapper.
good catch on the return value for that section though; appears to be an oversight.
Comment #3
aaron CreditAttribution: aaron commentedunless one of the other maintainers believes otherwise, i think that this would be an api change, as it could open a can of worms to allow direct access to the php file server.
Comment #4
pwolanin CreditAttribution: pwolanin commentedRight - we blocked this on purpose, and I agree that this would be an API change and possible security issue. What's the use case?
Comment #5
drewish CreditAttribution: drewish commentedhum... i don't specifically remember this decision. i can see both sides of it it seems like you should be able to copy some file from /tmp/import/foo to public://mymodule/foo but i can see some merit to only allow streams because if you want non-stream wrappers you could always just call copy()...
Comment #6
drewish CreditAttribution: drewish commentedputting this back to but because of the missing return. i think that needs to be fixed in any case.
Comment #7
aaron CreditAttribution: aaron commentedi agree (that the missing return needs to be fixed), but that's a separate (and easier) issue.
Comment #8
aaron CreditAttribution: aaron commentedComment #9
rfayThanks for your weigh-in. I opened #876114: Normal stream schemes are not supported in Drupal. Why? as a spin-off, and would appreciate you all making a final decision there.
Changing this issue to focus only on the failure to handle the error correctly.
Comment #10
rfaypwolanin points out in IRC that general use of streams has been a security issue. Without this patch, one certainly can access file:///etc/passwd without any trouble.
Comment #11
pwolanin CreditAttribution: pwolanin commentedI think the real bug fix is just this.
Comment #12
rfayThis one has an slightly improved comment: Assert an *allowed scheme*.
Comment #13
pwolanin CreditAttribution: pwolanin commentedwell, not allowed is not valid...
In any case - should be RTBC if the tests are happy (though perhaps worrying that we have no test coverage for this).
Comment #15
pwolanin CreditAttribution: pwolanin commentedhmm, looks like something was expecting to be able to use a bare path?
I fear there is still substantial internal inconsistency in the file handling.
Comment #16
rfaySo as we fix this, one question: Shouldn't we have in each of these failure situations: watchdog() for the admin (to fix their site) as well as drupal_set_message() (to notify the user their action failed)
Comment #17
pwolanin CreditAttribution: pwolanin commentedYes, it would seem so, but we need to inspect the code to see where the problems are.
Comment #18
rfayShucks, I think this is probably a security issue, since the intent of the system is not to allow direct file access (or access via other schemes). Currently, access to /etc/passwd, file:///etc/passwd, ftp://ftp@example.com, and lots of other things is fully allowed.
Comment #19
rfayHere is a patch for *one* of the #fails I chased down on the plane. It will fail with many more.
Comment #21
rfayThis one might get them.
This also adds watchdog() calls (for the administrator) and outputs the actual path, instead of the streamwrapper.
I changed the drupal_set_message() calls to *not* output the destination path, as that could be a security issue, disclosing the filesystem structure.
Comment #22
Damien Tournoud CreditAttribution: Damien Tournoud commentedFor the love of Drupal, could we remove those drupal_set_message() and properly use trigger_error() here?
Comment #23
rfayWell, I like the idea of trigger_error(), but have quite a number of problems with it.
First, here's the code I used:
Second, here is the output to the screen:
The watchdog entry is identical, except that its type is 'php' instead of 'file':
So my questions are:
Comment #24
rfayChatted with DamZ about #23 in IRC. It seems that without adding a new drupal_trigger_error() we can't resolve the issues there.
IMO, given the nature of the way we're trying to finesse user and watchdog reporting, the existing patch in #21 might be the best approach for now.
Comment #25
rfayRelated #866756: file_directory_path('private') returns FALSE by default
Comment #26
rfayComment #27
rfayBringing over the issues in #866756: file_directory_path('private') returns FALSE by default which were:
- writing to private:// when private is misconfigured allowed writing arbitrary locations on the filesystem
- Writing a bare filepath was allowed (should be resolved already by the previous patches in this issue)
Talking with pwolanin at CPH sprint. Here's our proposal:
1. Remove file_directory_path(). Yes, it's an API change. But leaving it in can only make broken contrib.
2. Add file_get_default_uri() to replace the legitimate part of file_directory_path. This would return the URL of the default scheme, as in public://.
3. file_directory_temp() -> _file_directory_temp(). Change comments to make explicitly private. (API Change also)
4. Add tests to
- Make sure that a bare filepath can't be used to read or write or prepare_directory.
- Make sure that a misconfigured private doesn't result in ability to write arbitrary path.
Comment #28
pwolanin CreditAttribution: pwolanin commentedAfter discussion with quicksketch, we will not remove file_directory_path(), but we shoudl make sure that any public functions generally require URIs.
Comment #29
rfaySo we decided that removing file_directory_path() just disrupted too many things, would break too much contrib. Instead we'll refocus on the actual intent: Modules must use URIs, not filepaths, with the Drupal File API.. Therefore, verify that the key API functions (*_copy, *_move, etc.) throw an exception if they are used with a bare filepath. Or at least that they fail.
So instead the plan is:
1. Make sure the primary file API functions fail with a bare (non-URI) filepath.
2. Write tests that:
- Make sure that a bare filepath can't be used to read or write or prepare_directory.
- Make sure that a misconfigured private doesn't result in ability to write arbitrary path.
pwolanin will work on the changes; rfay will work on the new tests.
Comment #30
rfayThere are some pretty significant problems with the whole decision path we're on here.
1. There is lots of code that uses barepaths, both in core and contrib. One of the nastiest is in simpletest, which saves away the actual "original files" directory and then creates a new directory. How is it supposed to use "public://" for operations like that?
2. Files that are in the modules or the themes directory, for example, need to be accessible. If we disallow file api access to barepaths, then there is no clean way to do anything using file api with a file in a module directory, for example.
I'm not entirely sure that we're doing the right thing here, and perhaps we should even reconsider #876114: Normal stream schemes are not supported in Drupal. Why?.
Comment #31
chx CreditAttribution: chx commentedEdit:
Under no circumstances we are disabling the use of bare paths this late in the cycle. Edit: so #1 is correct and #11 is a far fetched API change that absolutely should not happen.I retract what I said.Comment #32
rfayComment #33
aaron CreditAttribution: aaron commentedAs I recall, didn't we create a simpletest:// stream wrapper for file tests? In conversations, we talked about creating it in the fly for that purpose, in a subfolder of the public file location.
A suggestion I'd had earlier in the process was to create a read only stream to access deeper Drupal files, such as drupal://, that would read from the Drupal root.
Comment #34
aaron CreditAttribution: aaron commentedthe conceptualization of simpletest:// (not sure on the fly if that's what we actually implemented) was that we could create simpletest:// in a subfolder of public://, then reroute the public:// variable to use that folder, so that tests would work.
Comment #35
rfay@aaron: There is no simpletest://, and there is no drupal:// or site:// or anything like that. We have public://, private://, and temporary://. So we have no general way of handling files except those that are in the transient-file areas. No way to address things that are in the module or theme or library directories.
So here's what's going on: Whatever the intent, we have allowed bare paths (which are a URI that uses the default scheme, file://) for the last year. They've been allowed in core. They've been allowed in contrib. Lots of things have been updated using the assumption that this would work. The comments on many File API functions say explicitly that barepaths work. We possibly made a mistake, but IMO we're going to have to allow all bare paths in D7 in the File API.
So here is what I think our practical choices are:
Option 1: Explicitly allow filepaths and explicitly allow URIs with schemes provided by Drupal (public, private, temporary, and any schemes implemented using hook_stream_wrappers()).
Option 2: Allow all URIs. This is the basic PHP approach to filehandling. Any URI implemented can be used with the same API. That would mean that all Drupal-implemented schemes would be supported, and it also would mean that barepaths could be supported, as the default scheme in a URI is file://. This is the generic approach to this, and it's the one that is intended by the entire PHP streamwrapper approach to file and device handling.
Comment #36
rfayOK, discussion with DamZ, pwolanin, and webchick at the sprint came up with this proposed approach:
1. Take the existing #21, because it cleans up some things that need cleaning.
2. Allow all URIs (any scheme) and also barepaths in file_unmanaged_* and file_prepare_directory()
3. Disallow barepaths and non-Drupal schemes in managed file APIs like file_copy(), etc.
4. Completely remove file_get_directory_path() and replace it (where required, like simpletest) with equivalent functionality. This will break some contrib and be annoying.
Comment #37
rfayThis one is now dependent on important spinoff: #895308: Remove file_directory_path()
Comment #38
rfayThis patch continues from #21 and is all I could get done before collapsing. It's not done, but it's a start. I probably won't be able to do anything more on this for the next week.
This will collide with #895308: Remove file_directory_path() depending on which goes first, so a reroll will be required.
Comment #40
philbar CreditAttribution: philbar commentedUpdating to reflect: http://drupal.org/community-initiatives/drupal-core
Comment #41
dopry CreditAttribution: dopry commented-1
The file_unmanaged_* api should only accept valid schemes and no schemes should be defaulted. Developers should be intentional in their use of the Drupal fileapi functions. If using bare paths is a requirement for them they should use the native php file functions. Please see my comments in the issue you opened regarding the why for this design. You're only going to make mapping these functions to object methods more difficult if you start coding outside the interfaces now.
Comment #42
aaron CreditAttribution: aaron commentedsimpletest creates dummy:// and points it to the original public://. from what i recall, public:// is then supposed to point to a newly created subfolder so that tests can be performed without overwriting things they shouldn't be touching. however, i can't find that in a quick look. the simpletest:// i recall is being used for mimetypes, but is not a real stream wrapper. maybe someone else knows better?
Comment #43
rfayRerolled #38 to account for the commit of #895308: Remove file_directory_path(). Need to see all those fun #fails again.
Comment #45
aaron CreditAttribution: aaron commented+ * @param $uti
...
+ function file_valid_uri($uti) {
should be $uri of course
Comment #46
aaron CreditAttribution: aaron commentedsince we're talking about an API change anyway, it would be really nice to be able to do this (assuming the three provided stream wrappers were read-only):
Comment #47
aaron CreditAttribution: aaron commentedseems like a less-intrusive, less-extensive change anyway, now that i think about it...
Comment #48
rfayI agree with your #46, aaron, wholeheartedly. But I think we needed to do that a long time ago. D8.
Here's #45 $uti. Wow, it was late that night. That will fix about 900 of the #fails.
If everybody and the committers want three new schemes, I'm for it. But I really think we have to live with what we did at this point.
Comment #50
rfayI think I've about got the #fails, but it's bedtime.
Comment #51
rfayLet's see how this one does. Note the controversial else{} in file_stream_wrapper_uri_normalize().
Comment #53
rfayArghh. Lost some work that had already been done. This fixes that and then there's a missing drupal_realpath() in this one.
Comment #54
rfayIssue Summary:
Although the switch from bare paths to everything-uses-URIs had been started a year ago when the streamwrappers patch (#517814: File API Stream Wrapper Conversion) was committed, it didn't quite get done due to the bug described in the initial post on this issue. And as a result of that there were a couple of discussions that didn't get completed.
Discussion 1 was about file_directory_path() and the fact that ordinary code should never need to know it, as they use streamwrapper format like public://, private://, etc. The outcome of that discussion was to completely remove file_directory_path() and that was done (already committed and controversial) in #895308: Remove file_directory_path() (#4 over there was committed in #12).
Discussion 2 was about whether bare paths were allowed. The decision was that bare paths (like "sites/all/modules/junk/junk.txt") are not allowed in the managed file API, but *are* allowed in the unmanaged file API, because of discussion 3 below.
Discussion 3 was about what kinds of streamwrappers should be allowed in what contexts. The decision taken was that only Drupal streamwrappers (public://, private://, temporary://, and module-provided wrappers through hook_stream_wrappers()) would be allowed in the managed API, but that all streamwrappers (including PHP's various file://, ftp://, http://, etc.) would be allowed in the unmanaged API. In the PHP world, a bare path use the default scheme of file://, so bare paths are allowed in the unmanaged API as well.
This patch tries to implement discussions 2 and 3 (#1 was already committed) and it adds some tests to make sure that the decisions have been implemented.
Comment #55
rfayNow assigned to "reviewers"
Comment #56
chx CreditAttribution: chx commentedThis sounds quite a good summary and decisions I can support.
Comment #57
chx CreditAttribution: chx commentedHere is a longer followup: baring implementing #46 which would require a real lot of time (do we need all three of drupal:// module:// theme:// or rather just misc:// module:// theme:// and then what about library:// and what if you put something else in sites/all and whether we need a site:// for sites/$currentsitename... there is no end to this) unmanaged file API should keep supporting bare paths which is the equivalent of file:// in PHP and if we support that then it's plain silly not to support any other PHP stream wrapper.
On the other hand, managed file API is a Drupal API and should keep Drupal stuff. It's mostly about user uploaded files and as such Drupal wrappers are adequate.
Comment #58
webchickGreat work on this, as well as the summary. I agree with the direction taken in this patch. It aligns more closely with developer expectations, and still keeps the separation between managed and unmanaged files sane.
However, this no longer applies. Can I get a re-roll?
Comment #59
webchickOh, ok. The .rej file is only whitespaces, as I have been informed by chx. However, it really seems like this needs sign-off from pwolanin, aaron, dopry, or drewish.
Comment #60
webchickSorry, that "Needs Documentation" was when I was going to commit it, before I read the entire issue. ;)
Comment #61
pwolanin CreditAttribution: pwolanin commentedVisually reviewing the patch, this is indeed in line with the direction we agreed at Drupalcon. Patch looks good, though we may want to double check the test cases that are using drupal_realpath() in a follow-up issue.
For the future we need an issue for the change described in #46 for D8 (by implication only Drupal-provided schemes are allowed), plus #908730: File API functions should throw exceptions when failures happen
Comment #62
chx CreditAttribution: chx commentedRemoved the failing whitespace hunks.
Comment #63
webchickOk, great.
Committed to HEAD.
Comment #64
rfayFollowups for D8:
#908730: File API functions should throw exceptions when failures happen
#908874: Provide module://, theme://, drupal:// schemes for files
Comment #66
rfayDiscovered a little typo/error that results in a (very) bad watchdog message. This watchdog call just had the wrong parameter.
Comment #67
tim.plunkettGood catch.
Comment #68
webchickCommitted to HEAD. Thanks!
Comment #70
amonteroFollowup: there is another instance of #66 at the issue #1815930: Update watchdog message in file_unmanaged_copy with correct string/variable replacement values