PHP allows users to implement their own stream wrappers. This makes it possible to access virtual or remote filesystems as if they were on the local disk using fopen(), unlink(), file_get_contents() etc. The files may be stored e.g. as real files on a central server, as virtual files in a database, on remote a storage service like Amazon S3.
Handling uploaded files with Drupal in a webserver cluster is not trivial if the parallel servers do not have a shared directory on disk, e.g. using NFS. Using a stream wrapper administrators can implement their own custom logic that implements a global filesystem in their server cluster. Other uses include the ability to push additional copies of files to e.g. a CDN.
Stream wrappers has a well-defined interface that isn't specific to Drupal, so implementations of such wrappers are useful for other purposes as well.
Stream wrappers work almost like regular files, but only almost. Certain functions do not support stream wrappers, including realpath(), glob(), chmod(), tempnam() etc. So you cannot just insert "mywrapper://mysite.example.com/files" instead of "files" in Admin → Site configuration → File system.
However, with only a few modifications to the Drupal core it works. We currently run several sites in production using stream wrappers for file_directory_path and file_directory_temp. We have done a few modifications to include/file.inc:
- we added a new function, file_realpath(), that implements realpath() for stream wrappers, and then replaced all occurrences of realpath() with file_realpath()
- we added @ in front of chmod()
- we replaced
tempnam(realpath(file_directory_temp()), 'tmp_')
withfile_directory_temp() .'/tmp_'. md5(getmypid() . microtime(true))
We also had to do a few minor tweaks to some contrib modules, but generally things have just worked.
I suggest that Drupal is made stream wrapper friendly. As mentioned, this requires a few changes to file.inc and that developers refrain from using certain filesystem functions.
If people like this idea, I will post patches. Our current solution requires some adjustments in order to be applied to Drupal CVS (e.g. microtime(true)
is PHP 5 only).
Comment | File | Size | Author |
---|---|---|---|
#183 | stream_wrappers_227232-183.patch | 33.93 KB | pwolanin |
#182 | stream_wrappers_227232-181.patch | 33.93 KB | pwolanin |
#181 | stream_wrappers_227232-180.patch | 33.93 KB | pwolanin |
#178 | stream_wrappers_227232-178.patch | 34.29 KB | pwolanin |
#177 | stream_wrappers_227232-177.patch | 34.26 KB | pwolanin |
Comments
Comment #1
AjK CreditAttribution: AjK commentedJust so you know, Drupal 7 will be the first PHP5 only release.
Comment #2
c960657 CreditAttribution: c960657 commentedThis patch shows our changes to the Drupal 6 core. I'll whip up a patch against Drupal 7 as well.
Comments are welcome.
Comment #3
c960657 CreditAttribution: c960657 commentedPatch against HEAD.
This has been tested using the Amazon S3 package in PEAR , and file_directory_path set to "s3://mybucket/files" (running this example requires an Amazon S3 account).
This runs extremely slowly due (a caching layer is needed), but it serves as a proof of concept.
Comment #4
Wim LeersAnother interesting patch, c960657. Thanks! And subscribing. Will review when/if I find the time.
P.S.: I wish you had an easier to remember and type nickname though ;)
Comment #5
Susurrus CreditAttribution: Susurrus commentedI would think this would be a perfect situation where a unittest would be useful. It seems like all you need to do is create a new class and then access it. Creating a class that just accesses bunk data that would be static between all instances could do this and allow testing for everyone, including users who don't have an S3 account.
I'll see if I can look into that this weekend.
Comment #6
Dries CreditAttribution: Dries commentedYou have my interest with this patch, c960657. Cool stuff.
file_realpath() could use an example and/or some more verbose explanation.
It would be great if you could extend the PHPdoc to capture some of your experience and knowledge. It's probably useful to include some information about stream wrappers to help people understand certain decisions.
I agree with Susurrus that some tests would be key to avoid that we re-introduce changes that might break stream wrappers.
Comment #7
moshe weitzman CreditAttribution: moshe weitzman commentedSubscribe - nice stuff.
Where would you see caching being added for an S3 stream?
Comment #8
c960657 CreditAttribution: c960657 commentedI expanded the comments for file_realpath().
I also added support for resolving "/../" etc. This isn't necessary in a setup like ours, because you cannot escape the S3 bucket using "/../". But in general it is required in order to make file_check_location() work reliably.
The resolver in _file_remove_dot_segments() is based on the pseudo-code in RFC 3986 (about URIs). It is probably possible to do something similar to this using fewer lines of code (e.g. using regular expressions), but it isn't as trivial as it may sound, so for now I decided go with something that is known to work in a secure and reliable way.
WRT testing it would be nice if the whole testsuite for all modules using files could be run both with and without file_directory_path set to a path using a custom stream wrapper. I don't know how to achieve that.
>Where would you see caching being added for an S3 stream?
We have implemented caching in separate stream wrappers. We have file_directory_path set to filecache://dbstatcache://s3://static.example.com/files, where static.example.com is the name of the S3 bucket. Filecache caches files on the local disk of the webserver, but passes all stat() calls on to the dbstatcache. Dbstatcache caches stat information in a database that is accessible by all webservers in the cluster (this is necessary in order to detect if an already cached file is deleted or updated), but passes all fopen() calls on to the s3-wrapper. This may sound overly complicated, and perhaps it is, but the initial idea was to create simple general-purpose components and perhaps release them as open source (so far we have only released the S3 package through PEAR, but we don't mind sharing the other).
So far we only use S3 as a SAN with local caches, but using custom_url_rewrite_outbound() we could serve files directly off S3.
Our solution works, but the speed is not impressive. The end user usually don't feel a difference because of the local cache, but uploading big files is rather sluggish. E.g. ImageCache is not optimized for high-latency filesystems. Improving that is the next step.
Comment #9
chx CreditAttribution: chx commentedThere are code style issues like
} else {
also _file_remove_dot_segments is extremely complex, are we sure we need all this?Comment #10
Susurrus CreditAttribution: Susurrus commentedDoes realpath() not do the job for removing dots?
Comment #11
c960657 CreditAttribution: c960657 commented>are we sure we need all this?
Perhaps not. As mentioned, we haven't implemented it in our setup, and so far without problems, so perhaps nothing in core relies on the /../ resolving feature of realpath(). The same can easily be archived using dirname().
file_realpath() could be altered to simply return false on all stream wrapper paths containing /../ rather than trying to resolve them using realpath/file_realpath(). This would still allow file_check_location() to detect potentially dangerous paths.
>Does realpath() not do the job for removing dots?
Yes, but only for regular files that exist on disk. It does not work for stream wrappers or arbitrary strings containing dots. This is the reason for introducing file_realpath().
I have attached a regexp-based approach for removing dot segments, just to show an alternative implementation. It is fewer lines of code, but as mentioned, resolving of dot segments may not be necessary at all.
Comment #12
Dries CreditAttribution: Dries commentedI suggest that we leave out the more complex dot-handling for now. It's probably better to focus on testing, and to introduce the complex dot-handling when we have a real need for it, and when we can actually write tests for it.
Comment #13
c960657 CreditAttribution: c960657 commentedI updated the patch to simply return FALSE on paths containing "/../". AFAICT this is enough to prevent security exploits (assuming that the wrapper is relatively sane).
I have made file_realpath() less complicated by making it illegal to specify file_directory_path and file_directory_temp with a file:// prefix. I don't see a need to allow two equivalent syntaxes for specifying filesystem paths - it will only introduce another attack vector.
Comment #14
Dries CreditAttribution: Dries commentedYou didn't upload your latest patch.
Comment #15
c960657 CreditAttribution: c960657 commentedSorry, here it is :-)
Comment #16
Dries CreditAttribution: Dries commentedThanks c960657. It looks good to me.
One more question -- could you write some SimpleTests for this (without having to pull in an Amazon S3 library)? It would help ensure this feature gets maintained properly going forward.
Comment #17
c960657 CreditAttribution: c960657 commentedI have added a dummy stream wrapper that may be used for testing.
I added two test classess: FileTestCase and FileStreamWrapperTestCase. The latter inherits from the former and is basically just a way to make sure that all tests are run with and without the dummy wrapper.
I have added tests for some of the functions in file.inc. I'd like to hear some comments on whether this is the right direction before I write further tests. I am new to writing unit tests.
Also, I have made a few more changes to file.inc:
Comment #18
c960657 CreditAttribution: c960657 commentedA question about file_directory_temp:
At Administer → Site configuration → File system, the description for the "Temporary directory" text field says:
However, AFAICT uploaded files end up in file_directory_path while being previewed. Is the description wrong?Would it be reasonable to say that file_directory_temp shouldn't be used for saving files between requests? In that case file_directory_temp could be restricted to always point to the local disk, i.e. not use stream wrappers. The purpose is to supply a place on the local disk where native PHP functions that does not work with stream wrappers can write. E.g. imagegif() does not work with stream wrappers, so it would be convenient to write to a temporary file before copying the file to file_directory_path.
Comment #19
moshe weitzman CreditAttribution: moshe weitzman commentedI believe that we keep file uploads in temp dir until a node is saved (for example). thus, the file neds to persist as the user is previewing and improving his node.
Comment #20
c960657 CreditAttribution: c960657 commentedI thought so too. But the upload module appears to save uploaded files directly in file_directory_path (using file_save_upload() in upload_node_form_submit()). The fact that the node hasn't been saved yet is represented using the FILE_STATUS_TEMPORARY flag in the "files" table. When the node is saved, the flag is changed to FILE_STATUS_PERMANENT using file_set_status().
The remaining places in core that use file_save_upload() either discard the temporary file after having read it (includes/locale.inc) or copy it to another location (modules/system/system.admin.inc and modules/user/user.module).
But perhaps I missed something when grep'ing around ... ?
Comment #21
dopry CreditAttribution: dopry commentedno file_directory_temp is no longer used for file uploads or storing previews. The description on the file system settings page is incorrect.
I conceptually like stream wrappers and the idea of using them. However, I don't see their utility with this implementation.
for FUSE: http://code.google.com/p/s3fs/wiki/FuseOverAmazon
I mean it would be totally cool to be able to write fileapi as a stream wrapper, but really it's just another layer of abstraction in an already difficult and poorly understood API. It doesn't allow me to do anything I can't already do with more mature tools.
Comment #22
dopry CreditAttribution: dopry commentedAs a systems administrator I could never recommend this as a solution to cluster file distribution when there are file systems like OCFS and NFS freely avaliable, as well as various SANS options that are orders of magnitude more mature.
Comment #23
c960657 CreditAttribution: c960657 commentedI am not sure what you miss WRT url generation, access control and error handling?
The URLs generated by file_create_url() work fine as long as you use private files. Errors generated by custom stream wrappers are handled the same way in PHP as errors from the local filesystem.
The hook_file stuff and s3fs look interesting, though.
Comment #24
dopry CreditAttribution: dopry commented@c960657:
re: file_create_url with private files only... so in the S3 use case it has to be read from the remote service and served by drupal... so the file has to move across 2 networks before getting to the client, and the process serving the request has to sit there waiting reducing the number of clients your webserver can support. It would be so much nicer if files could be served directly from the remote service... then access control integration becomes an issue.
re: error handling... right now a lot of file handling happens with the assumption that underlying errors are permanent... ie) if a file can't be stat'ed it doesn't exist... however with stream handlers to a remote service this could be a temporary error.... The api layer above the stream handlers will need to be tolerant of these temporary errors that do not happen with a real filesystem. I'm not sure if it's really an issue, but something that should be considered maybe a new class of soft error for file operations.
I really like the concept of stream wrappers, but I think the implementation has to go deeper to be worth while. If we're going to embrace streams we might as well fully embrace streams for the filesystem... not just one particular stream for the entire FS and we also need to address URL generation and access control, even more so than the error handling in my opinion.
It seems like a much cleaner and more native way of implementing the drivers for my mountpoints / fileapi concept. I already have the mount point management and path parsing layers written.. I haven't revisited it since late Drupal 4.7 or early D5 development so who knows what condition it is in...
I'll think more about and keep playing devil's advocate.... I know I already have my ideas and long term vision, alternatives are welcome... I'm usually pretty sleep deprived when I hop on here.
Comment #25
c960657 CreditAttribution: c960657 commented>so in the S3 use case it has to be read from the remote service and served by drupal...
>so the file has to move across 2 networks before getting to the client,
Yes, the first time, until the webserver has a copy in its local cache.
>It would be so much nicer if files could be served directly from the remote service...
One way of doing this is using custom_url_rewrite_outbound(). I don't have experience with that yet, though (the possibility of rewriting the outgoing hostname was added in Drupal 6, and our current setup was made before that).
re: error handling:
I don't know much about NFS, but I assume any networked storage may have temporary errors.
>It seems like a much cleaner and more native way of implementing the drivers for my mountpoints
>/ fileapi concept.
I tend to agree. However, using a custom-written stream wrapper, you have greater flexibility WRT to what happens when a file is written, moved etc., and you can write your own local rules in PHP rather than writing your own FUSE wrapper - a thing that most PHP programmers probably aren't familiar with.
>I'll think more about and keep playing devil's advocate....
I appreciate your comments.
We initially chose to use stream wrappers because it seemed like a non-intrusive and non-system-administration-intensive way of supporting multiple servers in a cluster. I am not sure it was the best solution, but at least it works, though not as fast as we would like during uploads (perhaps partly because we also map file_directory_temp to our S3 drive because of the current description as mentioned in #18). Also, the work connected with keeping contrib modules stream wrapper-aware turned out to be bigger than initially expected (though, if stream wrapper support is made official in Drupal core, we can push our module changes upstream).
Comment #26
dopry CreditAttribution: dopry commentedre: pulling from s3...
it's a cost and I understand caching locally, but it's a big waste of bandwidth, and as you noted the latency issues can be killer. thats why I halted prior S3 work the latency was proving to be a bigger challenge than I anticipated.
re: generating URL's.
if file_create_url were stream aware or had a way of hooking into URL handlers for different streams we could probably solve this in core. just a thought...
re: NFS... not nearly as frequently as S3 or as a result of latency, and it's all kind of fubar on a site when NFS goes awol in a cluster... like ugly bad :) like no document root bad...
re: fileapi/stream wrappers... so I ripped the mountpoint system from fileapi and put it in my sandbox
http://cvs.drupal.org/viewvc.py/drupal/contributions/sandbox/dopry/mount... (D5, should be trivial to port to D7) with a completely untested modification to use stream wrappers instead of the original driver system I was developing for file api... It would be completely dependent on the stream wrappers in core thing, but I see a good match in my head. The contents of mountpoint.module would probably move into file.inc and mountpoint_ui would probably be headed to system.module. With tokenized paths and a smarter file_create_url it could be a fun mix.
re: you'd probably be happier with OCFS or NFS, the administrative overhead is actually minimal if you're already running a cluster, but if it gets the job done... I do still have reservations about file manipulations
and the like... (imagemagick, GD, etc)... It seems like going this route we will be copy files to local tmp storage pretty often until the rest of php's coolness becomes stream aware... I should probably look into that since It will affect a lot of the development work I do.
/me mutters about getting all excited about things he didn't want to get involved in and puts his grumpy hat back in the closet for a bit.
Comment #27
mikl CreditAttribution: mikl commentedWell, since my esteemed colleague, c960657, is in Riga on vacation, I'd just like to say thanks for the input, dopry. It has certainly given us something to think about.
As c960657 mentioned, getting all this to work with contrib-modules is quite painful – I spend the greater part of wednesday this week trying to upgrade image(api|cache|field) to latest stable (since we have been running -dev releases for a while), and that wasn't all that fun.
In any case, we're not dead set on S3. In fact, http://drupal.org/project/cdn looks promising, but that also involves patching the core :(
Comment #28
c960657 CreditAttribution: c960657 commentedThough NFS (or equivalent services) may be one solution for handling files in a cluster, I think the stream wrapper approach is still worth considering. In particular, I think it is convenient to be able to write your own logic in PHP that suits your particular needs.
NFS also has its issues, so I think it would be useful to have a choice between several different approaches.
I have looked at hook_file, and I fail to see how it covers the same use cases as a stream wrapper. I can see that you are able to hook into file load and save operations, but AFAICT you cannot e.g. use file_scan_directory() to enumerate files not present on the local machine.
Updated patch:
Comment #29
dopry CreditAttribution: dopry commentedwith hook_file but you could guarantee changes across the cluster on hook_file events... no you couldn't use file scan directory in the same way via hook_file, but any FUSE mounted file system could be used the same way.
Comment #30
c960657 CreditAttribution: c960657 commentedChasing HEAD.
Comment #31
c960657 CreditAttribution: c960657 commentedChasing HEAD.
Comment #32
lilou CreditAttribution: lilou commentedNeed to be re-rolled :
file.test must be moved to modules/simpletest/test/
Comment #33
c960657 CreditAttribution: c960657 commentedReroll. Still work in progress.
In order to test the patch, I added the following to my settings.php in order to be able to run all existing tests with file_directory_path pointing to a stream wrapper. :
This is probably not the way to go, but at this stage it is probably a good way to ensure thorough testing. I should work on making a regular testcase that test various modules that use files.
There is a failed test here and there, but generally it doesn't look too bad.
Comment #34
drewish CreditAttribution: drewish commentedsubscribe.
i'm interested in reviewing this further but there's some obvious things that need cleanup. the phpdoc doesn't follow the coding standards and the naming of things seems odd, file_realpath should probably be drupal_realpath, etc.
Comment #35
c960657 CreditAttribution: c960657 commentedThe patch is more or less complete, but lacks tests. No ready for review yet.
Instead of the addition to settings.php mentioned in comment #33, I now hacked DrupalWebTestCase::setUp() to set
file_directory_path
to a path with a wrapper prefix, andfile_downloads
toFILE_DOWNLOADS_PRIVATE
. At the top of boostrap.inc I added code to always register the dummy stream wrapper with PHP. These two changes are included in the patch but are not intended to be committed.With these two non-standard settings, all existing tests passes except two: 5722 passes, 2 fails, 0 exceptions
Without the two non-standard settings, all existing tests passes.
All tests have been run with the patches from #299290: User tests exceptions and #310113: SimpleTest: assertFieldByXPath() doesn't work in some environments, and with the change from #298309: DatabaseStatement::execute with PDO::FETCH_CLASS calls __construct() after setting object attributes reverted.
I had to change some of the tests to make the compatiable with stream wrappers and/or private files. This may not be necessary, because the tests are expected to run from a default configuration, i.e. without stream wrappers and with public files. But on the other hand it may be convenient to be able to run all tests with these non-standard settings like I have been doing.
Comment #36
c960657 CreditAttribution: c960657 commentedChasing HEAD.
Comment #37
c960657 CreditAttribution: c960657 commentedChasing HEAD (following the big hook_file commit).
Comment #38
aaron CreditAttribution: aaron commentedspoke w/ dopry & drewish in irc. looking at this with an eye of how emfield could benefit from the patch. i'll try to weigh in over the weekend.
Comment #39
c960657 CreditAttribution: c960657 commentedChasing HEAD.
Comment #40
c960657 CreditAttribution: c960657 commentedChasing HEAD.
(please ignore the patch in comment #39 - it belongs to another issue)
Comment #41
c960657 CreditAttribution: c960657 commentedI have added tests for the new functions, drupal_realpath() and file_is_wrapper().
I also added a few litmus test to verify that some of the basic functions are working as expected when file_directory_path is a path with a stream wrapper prefix. Optimally the testbed robot should run the complete testsuite (all 6000 tests or so) with
file_directory_path
pointing to a directory with a stream wrapper prefix (the same applies for other configuration variables that affect many tests like clean_url's enabled/disabled, $db_prefix empty/non-empty etc. as discussed in comments #5 and #6 in #239542: Daily test runs). This is now possible, because setUp() preserves the existing values offile_directory_path
andfile_downloads
like it currently does forclean_url
.If you want to run the complete suite using the dummy stream wrapper, add the following to settings.php:
Then go to http://example.org/admin/settings/file-system and change File system path from e.g.
sites/example.org/files
todummy://sites/example.org/files
, and set the download method to Private files. Then run the tests.When the original file_directory_path is a regular file, all tests pass on Linux on Windows (except those that fail without the patch). When the original file_directory_path is a path with a stream wrapper prefix, all tests except the two mentioned in comment #35 pass on Linux (still waiting for the results from the Windows box).
Comment #42
c960657 CreditAttribution: c960657 commentedReroll.
Comment #43
c960657 CreditAttribution: c960657 commentedI think this works good enough to get some review.
Comment #45
c960657 CreditAttribution: c960657 commentedIt looks like the testbot was complaining about streamwrapper-D7-15.patch, not streamwrapper-D7-18.patch. Here is a fresh reroll that passes all tests on my box.
Comment #46
c960657 CreditAttribution: c960657 commentedComment #47
dopry CreditAttribution: dopry commentedThe following chmod will become useless with your edit. It is also scope creep for what you are implementing in this patch. Either leave it alone or update it the logic around directory creation and error reporting.
Why is real path removed above? It's also removed from several places in the Tests is this purposeful? Should it have been changed to drupal_realpath?
Should we be somehow checking permissions on the stream wrapped files? It seems naieve to just skip the permissions check because it's stream wrapped. Can we require stream wrappers used with drupal to implement stat?
Comment #48
c960657 CreditAttribution: c960657 commentedTrue. I have now removed the chmod(). The reason for setting the permissions with mkdir() is that chmod() does not work with stream wrappers.
Filesystem functions automatically resolves relative file references. Without stream wrappers, adding an extra realpath() is redundant. Adding an extra drupal_realpath() is redundant, with or without stream wrappering. Some functions give slightly different error messages if the specified file does not exist, but that is the only difference AFAIK.
The thing is that chmod() does not support stream wrappers. For some reason it is only possible to read stat information on stream-wrapped files, but not to modify it. Hopefully this will be fixed in future versions of PHP. The only exception (quite strange) is that you can specify permissions on new directories with mkdir().
Calling chmod() on a stream-wrapped file and then testing the outcome using fileperms() will always fail (unless the permissions already had the requested value). If Drupal's SimpleTest supported a "skip" result in addition to "pass" and "fail" like e.g. PHPUnit, this would have been the proper status to return. Alternatively we could just let the tests fail, but I think it is better to omit them so they don't remove people's attention from other failures.
As long as all files are to be saved with the same permissions, people could make their stream wrapper implementation apply default permissions on all new files. The default permissions could be hardcoded inside the stream wrapper or set in the wrapper's default stream context e.g. in settings.php (a similar approach would probably be required, if the files are saved on e.g. S3, NTFS or other filesystems whose permission models are much different from the one supported on Linux and used my chmod/chown).
Comment #49
c960657 CreditAttribution: c960657 commentedComment #50
dopry CreditAttribution: dopry commentedThe directory creation logic is still incorrect.. failed mkdir's will not report an error...
Comment #51
c960657 CreditAttribution: c960657 commentedD'oh! Sorry. I hope I got it right this time.
Comment #52
naxoc CreditAttribution: naxoc commentedsubscribe
Comment #54
c960657 CreditAttribution: c960657 commentedReroll.
Comment #56
c960657 CreditAttribution: c960657 commentedSorry, the patch was missing file_dummy_stream_wrapper.inc (I wonder why this only caused two exceptions?)
Comment #58
c960657 CreditAttribution: c960657 commentedReroll.
Comment #59
c960657 CreditAttribution: c960657 commentedComment #60
chx CreditAttribution: chx commentedI would use
As empty strings casted to bool are
FALSE
this allows you to use it as you used in the patch but also to reuse it whenever you need the name wrapper as well.Also, I think you got the importance of this feature wrong. Fixed :)
Comment #61
drewish CreditAttribution: drewish commentedFinally had some time to dig into this, I think this would be a very cool feature. Here are my comments:
Comment #62
chx CreditAttribution: chx commentedI am looking at the
and similar and wonder "is this secure"?
Comment #63
c960657 CreditAttribution: c960657 commentedThis addresses the points raised in #60 and #61.
I discussed how to write tests with drewish on IRC. Now tests for stream wrapper support in some function, foo(), is added to the class that tests that function. So in addition to testFoo(), another function testFooWithStreamWrapper() is added. See the patch for a few examples. This is still work in progress.
Since the stream wrapper tests are now no longer limited to the file tests, I added drupal_web_test_case::useStreamWrapper() instead of FileStreamWrapperTest::registerStreamWrapper() (from the previous patch). I moved file_dummy_stream_wrapper.inc into /includes in order to be able to autoload it (if it is part of the SimpleTest module it isn't available for autoload when invoked from the SimpleTest browser AFAICT).
Also, the dummy wrapper prefix is not registered when invoked from the SimpleTest browser. I fixed this by adding a temporary hack to file_directory_path(). This is only necessary until #329177: Get rid of ugly simpletest regexp. lands.
I also discussed #62 with chx. He was worried that removing realpath() would introduce security exploits. I don't think it will, but please consider this carefully when reviewing the patch.
Comment #64
c960657 CreditAttribution: c960657 commentedAdded more tests. I think all file-related tests now have a stream wrapper counterpart.
I modified some of the existing tests to use the files supplied by drupal_web_test_case::drupalGetTestFiles() rather than misc/druplicon.png etc. so they can test stream wrappers just by setting file_directory_path() to a path with stream wrapper prefix prior to running the test.
The hack in file_directory_path() mentioned in #63 has been moved to _drupal_bootstrap(), but this is still just a temporary measure until #329177: Get rid of ugly simpletest regexp. has been fixed.
Don't be alarmed about the size of the patch. The actual change to core is rather limited. The majority is test-related.
Comment #66
c960657 CreditAttribution: c960657 commentedI cannot reproduce the test failure reported by the testing bot. Any ideas?
Comment #67
chx CreditAttribution: chx commentedThere has been fixes. I request a re-test.
Comment #69
c960657 CreditAttribution: c960657 commentedFound the problem.
Comment #71
c960657 CreditAttribution: c960657 commentedReroll.
Comment #72
c960657 CreditAttribution: c960657 commentedComment #73
c960657 CreditAttribution: c960657 commentedComment #74
drewish CreditAttribution: drewish commentedc960657, at one point (on irc?) you'd said you've got an s3 stream wrapper. would you mind posting that? i'd love to actually test this with a "real" wrapper.
also i thought you were going to move the test stream wrapper init code out of _drupal_bootstrap()?
Comment #75
drewish CreditAttribution: drewish commentedi guess i cross posted you.
Comment #76
c960657 CreditAttribution: c960657 commentedWe use the stream wrapper in the Services_Amazon_S3 PEAR package (maintained by me).
In addition we use two cache layers: one local file cache on each web server, and a shared cache in the database for stat data (so that if a file is e.g. deleted on one server, the other servers know that their cached copy is stale). Our file_directory_path is filecache://dbstatcache://s3://bucketname/files. It may look insane, but the idea is to separate the individual layers and create some general-purpose wrappers that can be submitted to e.g. PEAR, though I never got around to doing that (but if you prefer, you can of course combine S3 access and caching in one wrapper). We have combined all this in a custom module (currently not compatible with D7).
I am waiting for #329177: Get rid of ugly simpletest regexp. to land. For now I have moved it to the bottom conf_init(), the same place as where #329177 hooks.
Comment #77
c960657 CreditAttribution: c960657 commentedComment #78
drewish CreditAttribution: drewish commentedthe patch doesn't apply. i started to re-roll it but got bogged down and ran out of time fixing file.test. the thing that still bothers me about this patch is that it adds in an additional case to every function... really wish we could do everything as a stream or none of it because it just seems like there's bound to be bugs lurking in there.
that said i went a head a refactored the tests so that the stream tests extend each test case and the setup function calls $this->useStreamWrapper(); this way every test is run for both version. it's less efficient but at least we can be sure that both paths work correctly.
i'd merged the tests and in the process of creating a patch managed to delete all the changes but had a copy of file.test kicking around so i'm attaching that.
Comment #79
drewish CreditAttribution: drewish commentedComment #80
c960657 CreditAttribution: c960657 commentedReroll based on attachment to #78.
I.e. make file_directory_path() always return a wrapper prefix, possibly
file://
? I guess that is feasible. Note that file_unmanaged_* would still have to support regular filesystem paths in order to work with e.g. file_directory_temp.Comment #81
drewish CreditAttribution: drewish commentedcool, i'll try to give this a close review tonite.
Comment #82
c960657 CreditAttribution: c960657 commentedReroll (due to #334303: Handling overwriting of managed files (with unittests!)).
Comment #83
dopry CreditAttribution: dopry commentedI'd like to note ongoing work as a part of the media sprint is stream wrapper support. The current work can be found in contributions/modules/stream. It includes a public:// and private:// stream wrappers by default and a basic set of tests.
http://www.drupal.org/project/media
http://cvs.drupal.org/viewvc.py/drupal/contributions/modules/media/resource
The work is much more invasive than this patch so I am implementing it in contrib with a replacement for the current file api that leverages the stream wrapper support more fully. It is a bottom up OO rewrite of how non-database local and remote resources are managed.
Comment #85
andreiashu CreditAttribution: andreiashu commentedgreat post !!! subscribing
Comment #86
c960657 CreditAttribution: c960657 commentedReroll.
Comment #88
c960657 CreditAttribution: c960657 commentedPatch missed a file.
Comment #89
kwinters CreditAttribution: kwinters commentedPaths like "../logs/error_log" could potentially be sensitive and don't contains "/../" (but a path that ended with "/.." would always be a directory, not sure if that's OK in this context, rm -rf would be bad).
Comment #91
c960657 CreditAttribution: c960657 commentedReroll (due to #380064: DX: Make file_scan_directory() use save property names as file_load()).
Comment #92
jhedstromSubscribing.
Comment #94
c960657 CreditAttribution: c960657 commentedReroll.
Comment #95
gordon CreditAttribution: gordon commentedsubscribe
Comment #97
ghoti CreditAttribution: ghoti commented<aol>me too!</aol>
Comment #98
bjaspan CreditAttribution: bjaspan commentedsubscribe
Comment #99
drewish CreditAttribution: drewish commentedjust a note that #203204: Uploaded files have the permissions set to 600 was committed which adds a drupal_chmod().
Comment #100
c960657 CreditAttribution: c960657 commentedReroll.
Comment #101
c960657 CreditAttribution: c960657 commentedReroll.
Comment #102
eojthebraveThis seems like it has a lot of potential. Subscribing, and after a browsing through the patch real quick here are a couple of things I saw.
contains a few typos.
Should be.
There is a weird character in this line.
+ 'name' => t('File paths and directories – with stream wrappers'),
Not sure what it is supposed to be.
Looks like the weird character is also here
+ 'name' => t('Unmanaged file delete – with stream wrappers'),
and here
+ 'name' => t('Unmanaged file copying –with stream wrappers'),
and here
+ 'name' => t('File saving – with stream wrappers'),
and here
+ 'name' => t('Realpath resolving – with stream wrappers'),
Comment #103
c960657 CreditAttribution: c960657 commentedReroll with typos fixed and weird characters removed (the weird character was actually an en dash, but a regular hyphen is probably better after all).
Comment #105
webchickHEAD broke.
Comment #106
jmstacey CreditAttribution: jmstacey commentedSubscribing.
Comment #107
aaron CreditAttribution: aaron commentedI plan to jump back into this issue, hopefully this week.
First off, I'd like to note that dopry's work at http://drupal.org/project/media (with the stream wrappers in its included Resource module) has been invaluable. Anyone interested in seeing PHP stream wrappers in core should take a look at that.
I plan to take a look at the current status again. I imagine it's killing a lot of puppies, but I see no way around that to at least get the initial jump start needed. However, once we have the basic engine in place, we'll be able to do some really exciting things.
Comment #108
aaron CreditAttribution: aaron commentedFYI, I'm mentoring Jon Stacey (jmstacey), a student for GSOC, who's developing the Media module. We're changing tracks of that project slightly, to develop in d7 rather than d6, because there are things in core that will make the project easier to bring to fruition.
We'd like to see more, including this patch. However, the terms of a Drupal project w/ GSOC are that it can't be dependent on making commits in core.
That said, we've decided to focus the next week or two on core patches that would help things along. Specifically this patch, and probably another patch after this to make related changes to hook_file that take stream wrappers into consideration.
Jon's getting up to speed on this patch, as well as myself (I haven't looked at it in a few months).
I'm not asking anyone to vet the work or anything, and am definitely not asking that we speed up the process of adoption -- obviously we want to ensure the best work possible goes forward, and don't want to make any mistakes by committing something that's not quite ready.
However, I would be eternally grateful if interested folks could give a few minutes later in helping to review things, and help us find blocking issues.
Thanks,
Aaron Winborn
Comment #109
drewish CreditAttribution: drewish commentedi think there's a few issues that this ties in with that all need to be resolved at once. the thing i don't like about this patch is that it lets us use normal paths or stream wrapped paths. as i've said before i think we should pick one or the other. if say that we're only going to allow stream wrapped paths we've got an approach that could help solve several issues: #366464: File table includes file_directory_path() #166759: Public/Private File Handling #329301: Rename {files} to {file} and add unique key to the filepath column. i think that dopry's work on the resources module had some very promising work started. he had separate stream wrappers for public, private and temp files. i think we need to have separate directories for each and which would let us keep the full path out of the database table and still allow us to finally get a unique key on the filepath column.
Comment #110
Anonymous (not verified) CreditAttribution: Anonymous commentedsubscribe, coming from #166759: Public/Private File Handling.
Comment #111
jmstacey CreditAttribution: jmstacey commentedAaron, drewish, and I had a chat this morning and the three of us have agreed to a tentative plan that I'll try to describe with our reasoning below.
The bottom line is that we need a unified resource handling system. We have an opportunity now to take care of several issues in one swoop. Ultimately, we envision core providing three basic stream wrappers: public://, private://, and temp://. This alone will bring additional functionality to core such as the ability to use both public and private filesystems at the same time. Additionally, this paves the way towards allowing contrib modules to provide extended functionality (youtube://, s3:// ...), and yet be built on a common base.
As drewish previously mentioned, this would take care of a couple issues we're facing:
#166759: Public/Private File Handling - This would be bundled up neatly once stream wrappers are implemented. The registered stream wrapper (public, private, temp,...) simply looks up the configured path based on the URI.
#366464: File table includes file_directory_path() - This issue can be cleaned up because the wrapper would determine the path. For example, the resource public://friends/tom/london.jpg would be handed off to the public stream wrapper which would consult the configuration settings and look in, for example, sites/default/files/friends/tom/london.jpg. A resource private://family/jennifer/wedding.jpg would be handed off to the private stream wrapper and look in say /home/bob/private_files/family/jennifer/wedding.jpg.
To accomplish this goal we need to concurrently tackle several areas:
We need to merge this patch with the work done by dopry on resource.module (currently part of Media).
#329301: Rename {files} to {file} and add unique key to the filepath column - We propose the following changes:
Upgrade path: We propose leaving the files table in place until all modules have migrated. We would provide helper functions to assist in migration. For example, given an old filepath return a new D7 file object that can be passed on to the appropriate storage mechanism. As a result of the proposed database schema changes we would need to provide patches for core modules that rely on the files table. At this time we are only aware of upload.module and filefield needing attention, but we will need a comprehensive evaluation.
How you can help
Getting this done will lay the groundwork for a lot of incredible new features, all built on a common foundation. However, there is a lot to accomplish and the more eyes the better. I'll be working on this full-time for the next couple of weeks. We need help from every area so drop by #drupal (irc) and ping one of us if you have the time.
Comment #112
Anonymous (not verified) CreditAttribution: Anonymous commentedjmstacey: awesome summary.
i'm beejeebus in IRC, i'll be online saturday and sunday, see you then.
Comment #114
jmstacey CreditAttribution: jmstacey commentedJust a small update. I've created a fork of mikl's GitHub drupal mirror for use in rapid collaborative work on the patch: http://github.com/jmstacey/drupal/tree/master
At this point the fork consists of streamwrapper-D7-38.patch, merged to drupal head, and the stream wrapper interface and registry implementation from dropyr's work on resource.module (+coding standards and doc improvements). Tomorrow I'll be fixing up tests to make sure that basic stream wrapper support actually works, and then I'll start working on file API.
Word on the street is that there will be a mini-sprint this weekend with drewish and justinrandell. If you're interested in helping out, ping me and I'll provide commit access to the development branch.
Comment #115
jmstacey CreditAttribution: jmstacey commentedHere's a checkpoint so that cursory reviews may begin. There may be regressions from streamwrapper-D7-38.patch because of a different design.
Changes so far:
Other notes:
Coming up:
Comment #116
Anonymous (not verified) CreditAttribution: Anonymous commentedupdated patch is just a rearrangement of the patch at #115 so it can be applied to a clean checkout of HEAD with -p0 - no code changes.
Comment #117
Anonymous (not verified) CreditAttribution: Anonymous commented@jmstacey: i've got a git clone of your github drupal mirror, ping me in #drupal when you get a chance. i'm a git n00b, so i'll need some pointers re. pushing changes back up.
currently just trying to get basic user picture and upload module functionality to use the stream wrappers and see what works/doesn't work.
Comment #118
drewish CreditAttribution: drewish commentedHere's a summary/export of what we got done today. If you'd like to see it in more gory detail check out jmstacey's drupal github repo.
drewish:
- Removing all the trailing white space.
- Reorganizing the files that the streamwrapper classes are in to slim it down to two files (includes/stream_wrapper_registry.inc and includes/stream_wrappers.inc)
- Renaming all the class names to be CamelCased rather than underscore_separated.
- Tried to un-rat-fuck the whole Drupal{Public,Private,Temp}StreamWrapper inheritance tree.
justin:
- convert DrupalStreamWrapperRegistry to a static class rather than a singleton, and updated the callers
- moved stream_scheme() and stream_wrapper() into DrupalStreamWrapperRegistry
- renamed all of the underscore-named functions to CamelCase
Comment #119
tic2000 CreditAttribution: tic2000 commentedComment #121
jmstacey CreditAttribution: jmstacey commentedQuite a bit of work has been done since that patch was generated, especially with the File API. The latest checkout has all file related tests passing (last I checked). The fork needs to be merged, retested locally, and then a working patch can be submitted for review. While all the tests are working, they are not comprehensive in the context of the additional and changed functionality and could use some more work. However, initial review may commence.
I'll try to slip this in tomorrow if I have a little time between sessions at DrupalCamp Colorado.
Comment #122
Anonymous (not verified) CreditAttribution: Anonymous commentednice work. i've just pushed up a change to move the stream wrapper initialisation in to a bootstrap phase.
Comment #123
aaron CreditAttribution: aaron commentedFolks should take a look at #499156: CDN integration: allow file URLs to be rewritten by hook_file_url_alter(), as a) it's going to break our patch if committed first, which b) adds bloat to this patch, since we'd have to rewrite that patch, and c) it's on the fast-track for committing, as it's tagged Favorite-of-Dries.
Comment #124
aaron CreditAttribution: aaron commentedAlso, FYI, we're on track for getting this RTBC tomorrow, as we'll have a bunch of folks helping to review at the Media Sprint. See you in the morning!
Comment #125
jmstacey CreditAttribution: jmstacey commentedI've attached the latest diff so that the testing servers can have a go at it. The file and user profile tests are passing locally. We'll have our work cut out for us to get it RTBC but I think we can pull it off. Here's a quick list of things off the top of my head, I'm sure there are more:
DB Changes
The aforementioned DB changes in #111 have been made with the exception of the removal of the 'status' column. The use case at hand is that you have a temporary image that you would like to make publicly available (e.g. advanced caching). temp:// in its current incarnation returns the system temp path (a replacement for file_directory_temp()). Temp can have many different meanings. I'd rather not require a new stream wrapper for a use-case like this. For the time being I've retained the status column for the continued indication of permanent/temporary files rather than move towards something messy like public://temp://myfile.jpg.
Comment #127
Wim LeersAfter a skim through this issue, I have two questions:
- I'm wondering what performance toward S3 and so on would be like. Has anybody performed benchmarks already? Shouldn't stream wrappers to potentially high-latency remote servers not be used with care, e.g. only from within cron, to prevent extremely long page rendering times? (If PHP had threads and no limited execution time, this would not be a problem, or at least much less so).
- Doesn't the lack of chmod() support have major implications? But it does seem to be in the latest patch.
Very quick code style review:
- It's not "nineth" but "ninth".
- It's not "compatability" but "compatibility".
- Tabs instead of spaces on several occasions.
- Always use braces, never omit them (in single-line if/else tests).
- As far as I know we don't mention the type of a variable in doxygen in Drupal?
Comment #128
aaron CreditAttribution: aaron commented"As far as I know we don't mention the type of a variable in doxygen in Drupal?"
http://drupal.org/node/1354 does not, but there are places in core that do.
Comment #129
jmstacey CreditAttribution: jmstacey commentedTo answer your questions.
1. Stream wrappers to potentially high-latency remote servers will need to be used with care. Any implementations will need to implement caching for best performance. This is no different than writing an Amazon S3 backed filesystem using FUSE. I think most implementations won't have a problem until you get to the point of uploading an extremely large file. At this point, any performance enhancers would rely on the wrapper implementation to do stuff in the background and such.
2. drupal_chmod() supersedes chmod() and is backwards compatible with normal paths. For streams, drupal_chmod() will pass the job off to the appropriate wrapper. In the case of public, private, temp, this just interpolates the URI and calls normal chmod(). In other cases the wrapper might just ignore the chmod and do nothing (e.g. for S3). For normal paths, drupal_chmod will just fire off a regular chmod.
- The spelling corrections are fixed and I replaced any tabs the slipped in.
- I didn't notice any if/else short statements in my skim but I must just be overlooking them because I know I've seen them. Line numbers would help.
- As far as documentation, the Doxygen page doesn't mention using it, however I find it very useful because the type is not always mentioned in the parameter description. You immediately know what's expected or to be returned without looking at the code. Some current core functions such as the existing drupal_chmod add the variable type. We should make a decisions and update the Doxygen page to specify the standard we'll use. The coding standards page also needs to be updated to reference the alternate camelCase use with SimpleTests.
Comment #130
jmstacey CreditAttribution: jmstacey commentedJust the minor modifications I mentioned before.
Comment #131
aaron CreditAttribution: aaron commentedpotential workflow for s3 using stream wrappers:
* user uploads a file through the s3 wrapper
* s3 wrapper saves the file locally and marks it to be moved on the next cron
* later, someone wants the URL for that file
* if the file has not yet been moved, the local url is returned
* else if the admin checkbox for using s3 is turned off, the local url is returned
* else if the server has otherwise determined that s3 is down, the local url is returned
* else the remote s3 url is returned
performance should be optimal in this setup
Comment #132
aaron CreditAttribution: aaron commentedsorry -- will this rerun the test?
Comment #134
aaron CreditAttribution: aaron commentedFatal error: Class 'DrupalStreamWrapperRegistry' not found in /var/www/d7-stream/includes/file.inc on line 1615 when you run install.php. i'll look at that right now.
Comment #135
tic2000 CreditAttribution: tic2000 commentedComment #136
Anonymous (not verified) CreditAttribution: Anonymous commentedat #134 - that's because we use the files in the installer, but don't initialise the Drupal's stream wrappers. working on patch...
Comment #137
aaron CreditAttribution: aaron commentedI just committed a refactoring of file_get_mimetype in git, so that it now works through the schema/uri (allowing us to set custom mimetypes for youtube videos, etc).
Comment #138
aaron CreditAttribution: aaron commentedadded constants for SCHEME_PUBLIC, SCHEME_PRIVATE, SCHEME_TEMP
Comment #139
aaron CreditAttribution: aaron commentedthis latest patch is up to #138, and also includes a new test for SimpleTest, as we now add a simpletest:// stream wrapper for testing.
Comment #140
RobLoachWow, this is awesome... Subscribing.
node:// WHAT?!
Comment #141
jmstacey CreditAttribution: jmstacey commentedCurrent known issues:
Comment #142
jmstacey CreditAttribution: jmstacey commentedHere's the latest patch. This resolves the previously known issues, and some other unexpected breakages.
Here's the most recent list of things that need work:
D.O. Documentation introducing stream wrappers (see notes in #125). Search and Replace @see lines with real URL.Abstract static method cleanup. Initially, there was some confusion about abstract static methods... they don't make sense [Source]. This resulted in some messy calls like demonstrated on http://drupalbin.com/10347. While static functions would be ideal, I don't know if this can be accomplished elegantly. I'll be happy with the current implementation if we only pass data once (when instantiating).Usage of normalizeUri(). Anything that has the possibility to be located in a subdirectory should use normalizeUri and assume a prepended slash. This takes care of cases such as profile images that can be configured by the user. Images might be stored in the root directory so normalizeUri will turn "public:///" into "public://" and the assumption of the forward slash will allow subdirectories such as "public://pictures" to work.Upcoming code sprint in Philadelphia, July 26-27.
Comment #144
jmstacey CreditAttribution: jmstacey commentedThis reroll fixes the file download test which was failing because the private file path doesn't exist on the testing slaves. This also includes cleanup of the abstract static issue and a couple other exceptions. Simpletest is reporting a failure locally, but not what test specifically, so I'm hoping the testing server will shed more light on that problem.
Comment #146
jmstacey CreditAttribution: jmstacey commentedThanks to cwgordon7 this test should finally pass. We're going to look into breaking this down into smaller and more specialized patch sequences, so this may be one of the last giant patches.
Edit: Actually this won't pass because #515280: file_check_directory() should create recursively was just committed.
Comment #148
jmstacey CreditAttribution: jmstacey commentedOn recommendation from cwgordon7 and drewish I have split this patch into two parts: stream wrappers (this issue), and #517814: File API Stream Wrapper Conversion. This patch will obviously need to be committed before the second.
Keep an eye on #395472: Plugin Manager in Core: Part 1 (backend) which may cause this patch to fail erroneously.
Comment #149
drewish CreditAttribution: drewish commentedsubscribing.whoops, wrong issue.Comment #150
jmstacey CreditAttribution: jmstacey commentedThis patch will now need to be committed before #519392: File API Stream Wrapper Support (Part 1) followed by #517814: File API Stream Wrapper Conversion. For those following the GitHub repository there are now six branches:
Here are the most important as they build off of each other:
upstream/master: This is the Git mirror of Drupal CVS. You must add this branch yourself for each new clone using "git remote add upstream git://github.com/mikl/drupal.git"
styling_corrections: This branch contains random fixes that we noticed along the way but weren't directly affected by changes we're making. This branch should be diffed directly to upstream/master. An issue will be created for it down the road whenever someone has time.
Comment #151
robertDouglass CreditAttribution: robertDouglass commentedSubscribing. In awe of the persistent effort going on here. Keep it up.
Comment #152
drewish CreditAttribution: drewish commentedJust did a quick scan to see what was left in this patch and noticed a few of things. I'm very happy with where you chose to split the patch up.
The comment in DrupalStreamWrapperRegistry::getStreamScheme() is cryptic. I'd suggest either expanding it to be proper sentences or dropping it.
Can we drop the blank first line from functions (i.e. DrupalStreamWrapperRegistry::getValidStreamScheme() and getInstanceByUri(), etc)? IMHO it just wastes space.
DrupalStreamWrapperInterface::getInternalUri()'s comment references realpath but ommits the ()s that help indicate it's a function.
"mime type" in comments should probably be MIME-type or MIME type. I'm not sure if core follows a standard.
"@return bool" I don't think our coding standards call for using the type hints.
"PHP's realpath does not..." again, realpath should have parens following it.
I hate the word interpolation in comments. Can we use something that people won't have to look up in the dictionary?
I don't think canonicalized is a real word. Lets just use "canonical, absolute path" instead.
Why does DrupalLocalStreamWrapper have private member variables? I think protected would be fine and give more flexibility to extending classes.
DrupalStreamWrapperInterface::getInternalUri() and getExternalUrl() have pretty sparse documentation. It'd be good to flush those out a bit. Perhaps with an example?
Why all the extra work in DrupalLocalStreamWrapper ::getMimeType()? Why can't we just call file_get_mimetype()? All the code looks copied and pasted directly from there.
I think the convention is to put @see tags below @return tags but in DrupalLocalStreamWrapper there's a lot of these in the reverse order. I'd go check the coding standards but I'm tired enough that odds are I wouldn't finish the review.
I'm wondering if we're going to have to conditionally define the Drupal{Private,Private}StreamWrapper classes depending on which file transfer mode is enabled...
Comment #153
jmstacey CreditAttribution: jmstacey commentedI compartmentalized the changes so you can just glance at the specific changeset rather than sifting through everything again.
[stream_wrappers 89fc86b]
[stream_wrappers 971bd35] Personal preference for readability, but I think I got them all. What about classes though?
[stream_wrappers cbefe60]
[stream_wrappers 099d29e]
[stream_wrappers cefdd692] I copied that sentence straight from PHP documentation on realpath(). I've changed as recommended.
[stream_wrappers cdc625f5] changed to private, followed by [stream_wrappers 3b5f020c] which makes them public [we were using blind magic getters/setters, so we might as well open them up).
[stream_wrappers eb5f737] and [stream_wrappers 888e6b8] getInternalUri() will also contain the "@see drupal_realpath()" when the function is made available in #519392: File API Stream Wrapper Support (Part 1).
This is indeed a copy/paste from file_get_mimetype(), but there is a very important reason. We can not simply call file_get_mimetype() with a stream because it doesn't know what to do with it. It's not something that PHP stream wrappers will handle transparently for us. We have to specifically tell it what wrapper implementation to use which is why we need a registry, so that we can follow the path backwards in a way that PHP does not natively handle. #517814: File API Stream Wrapper Conversion removes all of this code from file_get_mimetype and simply invokes the appropriate wrapper implementation. So there won't be any code duplication in this area when everything falls into place.
[stream_wrappers 2c271b9]
[stream_wrappers 4f0f530] This will need to be brought up on the other patches as well because I did this throughout everything. This should be specifically put in the coding standards.
Ideally all three wrappers will be registered and available simultaneously. At least that's the whole idea for stream wrappers. At this level of implementation it doesn't have any affect on existing Drupal functionality. Once we start get into #517814: File API Stream Wrapper Conversion we'll need to decide on a granularity. I'm in favor of just keeping a system-wide default that modules may use, but of course can override however they want. In the end, it won't matter what this default is set to because the appropriate wrapper will be determined by the URI. Se we can run public, private, (and more) setups side by side without stepping on each other.
I'm looking to bringing some of that work out of Part 2 so that users have a interface for changing the variables stream_public_path, stream_private_path, and stream_temp_path.
Here are some miscellaneous doc changes not specific to any issues raised: [stream_wrappers d504642]
Patch roll coming up in a few minutes...
Comment #154
jmstacey CreditAttribution: jmstacey commentedgetMimeType() is now a static method. This allows file_get_mimetype() to be backwards compatible and be moved from part 2 to part 1.
There is a reason that this is the only static method while the others are not, and I'll try to explain. First, read "Why does PHP 5.2 disallows abstract static classes?." The first problem is that getDirectoryPath() is not part of the wrapper interface, it is something specific to local Drupal wrappers, and so is part of the class not the interface. The second problem is that the implementation of getDirectoryPath varies depending on what wrapper we're dealing with. So, even if you can add an abstract static method to a class (I can on my test box), there is no way to call it through the parent because it belongs to the child.
Hopefully that makes sense and clarifies some of the design decisions that were made.
Comment #155
aaron CreditAttribution: aaron commentedwe need to also keep an eye on #391330: File Field for Core, as i suspect if it gets in first it will impact this patch (and vice versa). i'll review both these issues more this weekend with that in mind.
Comment #156
quicksketchFollowing to keep track of this issue. Just at first glance, it looks like the getMimeType() method includes the same bug as noted in #520664: file_get_mimetype() should not be case-sensitive.
Comment #157
SeanBannister CreditAttribution: SeanBannister commentedSub
Comment #158
jmstacey CreditAttribution: jmstacey commentedHere's a new one.
Changes
- file_get_mimetype() case sensitivity fix brought forward from #519392: File API Stream Wrapper Support (Part 1).
- __put() fix brought forward from #517814: File API Stream Wrapper Conversion.
- normalizeUri() now removes the trailing slash like the documentation indicates.
- New unit tests for getStreamTarget() and normalizeUri().
Comment #159
jmstacey CreditAttribution: jmstacey commentedI took a quick look at #391330: File Field for Core and from what I can tell it will only really affect #517814: File API Stream Wrapper Conversion. I've added a note to that issue.
This patch no longer contains any changes or removals. Everything is brand new. It's really quite boring in comparison to the merges on the other two branches :-)
Comment #160
aaron CreditAttribution: aaron commentedNote: #531476: Add favicon mimetype theme setting removes the included file.inc in the maintenance theme, which was complicating matters somewhat.
Comment #161
pwolanin CreditAttribution: pwolanin commentedConverted static class to a Drupal hook
Comment #162
dopry CreditAttribution: dopry commentedYou might want to remove file_put_contents('/tmp/log.txt', print_r($wrappers,1)); from line 67 of the patch.
The parse_uri_scheme function is incorrect. The URI format according to the rfc is : the // is specifically the common internet scheme indicator. I understand that PHP internally makes the :// assumption as well, but it breaks relative file paths file:path/to/file.ext and other scheme formats such as mailto:user@examples.com.
Just because jani@php.net is too good for standards doesn't mean drupal shouldn't do it's best to adhere to them properly, see http://bugs.php.net/bug.php?id=47070. At the very least some pressure should be applied to the PHP folk to get them to recognize and correct their improper support of the URL format. I guess it could live until php gets in line, but the reasoning for not properly parsing scheme URI's should be noted.
Comment #163
pwolanin CreditAttribution: pwolanin commented@dopry - sorry about the leftover debug code.
Also, http://us.php.net/manual/en/function.fopen.php says that PHP looks for a filename of the form "scheme://...", so it would seem that we are locked into this to some extant in terms of responding to what PHP expects.
Comment #164
eojthebraveTook a quick peak at this, didn't make it all the way through ran out of ice-cream. Here are the notes I kept.
Order is wrong here.
Lines should be reversed and the "Returns the ..." line should be indented. And we don't usually specify variable types like that.
Probably don't need to have the extra newline in file_uri_scheme
"usefule" should be "useful" in comments for file_stream_wrapper_valid_scheme, there is also an extra line break between the summary and first @param
+ * A stream, referenced as scheme://target
For file_stream_wrapper_get_instance_by_uri needs a "." at the end. I think the return documentation for this is wrong to. It looks to me like you are confusing private and public. With a URI of public:// I would expect to get an instance of the Public stream wrapper class.
"container" should be "contain" in documentation for file_stream_wrapper_get_instance_by_scheme
There is an extra space here.
+ * This allows you to set the URI. Generally is only called by the factory
The $mode param for DrupalStreamWrapperInterface::chmod() needs documentation.
DrupalLocalStreamWrapper::stream_open() documents @param &$opened_path, but uses a variable named $opened_url
I think you might be missing an & here.
+ if ((bool)$this->handle && $options & STREAM_USE_PATH) {
That or I just don't know what using a single & operator does.
The $operation param for DrupalLocalStreamWrapper::flock() is un-documented.
Comment #165
drewish CreditAttribution: drewish commentedi've fixed the issues identified by eojthebrave in the git repository. we'll re-roll later today.
Comment #166
drewish CreditAttribution: drewish commentedokay here's the above changes with some additional docs for the hooks we've introduced.
Comment #168
jmstacey CreditAttribution: jmstacey commentedMade a few tiny doc and whitespace fixes.
Bitwise operator.
Comment #169
jmstacey CreditAttribution: jmstacey commentedSetting to needs review.
Comment #170
dopry CreditAttribution: dopry commentedThe bug is back open in PHP land... regarding :// the point is that it doesn't match the URL standard and should be documented as there is a good chance PHP will support well formed URL's properly in the future.
Also how did this patch end up in the 'file' name space? Was there strong resistence to 'stream'? if node:// and mailto:, etc are possibilities... why bury it in the file namespace?
Comment #171
pwolanin CreditAttribution: pwolanin commented@dopry - the functions are in file.inc and our main use in core will be for files, so I think it's a reasonable name space to use.
Comment #172
aaron CreditAttribution: aaron commented@dopry - I agree that we should document that :// business so it can be addressed at a future point.
Comment #173
aaron CreditAttribution: aaron commentedoops, sorry, didn't mean to reset the status...
Comment #174
pwolanin CreditAttribution: pwolanin commentedNew patch with :// issue documents and removing uneeded constants. Also makes sure stream wrappers get registered during bootstrap.
Comment #176
pwolanin CreditAttribution: pwolanin commentedoops - the Class needs to be in file_test.module
Comment #177
pwolanin CreditAttribution: pwolanin commentedMinor doxygen fixes, removed t() from getInfo(), no real code changes.
Comment #178
pwolanin CreditAttribution: pwolanin commentedwrong patch. This one has the t() fix
Comment #179
aaron CreditAttribution: aaron commentedthis looks good to me. let's get this one in!
Comment #180
Dries CreditAttribution: Dries commentedI reviewed this in private with the sprint team, through pwolanin. I should probably have shared my review publicly. The next re-roll should be commit-able.
Comment #181
pwolanin CreditAttribution: pwolanin commentedfinal version with more comment cleanup and CHANGELOG
Comment #182
pwolanin CreditAttribution: pwolanin commentedCHANGELOG typo
Comment #183
pwolanin CreditAttribution: pwolanin commentedDoxygen typo also noticed by moshe.
Comment #184
Dries CreditAttribution: Dries commentedCommitted to CVS HEAD! Thanks all ... onto the next issue.
Comment #185
dopry CreditAttribution: dopry commented/me waves excitedly from the peanut gallery.
Comment #186
aaron CreditAttribution: aaron commentedIn case anyone's looking for the next patch: #517814: File API Stream Wrapper Conversion
Comment #188
hass CreditAttribution: hass commentedHow bad that the
_file_remove_dot_segments
(#9, #11) function have not found the way into core! I would be really happy if this can get into core and I can rely in modules on this code. I have an open issue in #832388: Invalid URI's not normalized / dot-segments not removed that need uri normalization functions, but all the functions I've found in core have missleading names and do not really do uri normalization except "schema" normalization.It would be good to clean up core or implement all uri normalization functions required by RFC 3986. Code in core should be reusable...
Comment #189
kukle CreditAttribution: kukle commentedSubscribing