As discussed in #1668892: Enable secure compiled information on disk . I separated out loader.inc because that will not be upgradable (and index.php), the rest of core will be -- eventually I will want to move the inclusion of loader.inc into index.php to make bootstrap.inc easy upgradeable. Most of the patch is, in fact, just this moving around. drupal_include
on line 147 down to line 209 in loader.inc is all that is new. Easy as pie... two new settings.php $conf vars, php_path specifying where we can write to and php_prefix is the fixed random. These eventually will be written by the installer.
Comment | File | Size | Author |
---|---|---|---|
#129 | 1675260-129.patch | 33.59 KB | chx |
#125 | 1675260-121-124-interdiff.txt | 12.02 KB | pwolanin |
#124 | 1675260-124.patch | 30.84 KB | pwolanin |
#121 | 1675260-121.patch | 30.55 KB | chx |
#121 | interdiff.txt | 10.81 KB | chx |
Comments
Comment #1
chx CreditAttribution: chx commentedObviously, testing, more doxygen, installer support is necessary still but first let's agree whether we want this and let's see whether it passes like this.
Comment #3
chx CreditAttribution: chx commentedNow this tries to load the override first then the original. Edit: the test failures were due to missing globals.
Comment #4
chx CreditAttribution: chx commentedI rewrote the flilter logic to use less memory and only 81 new LoC instead of 83. Yes, it takes less than 100 LoC to implement this. I promised a very small API didn't I :) ?
Comment #5
chx CreditAttribution: chx commentedThis is in line with #133 and has tests (woot!) which I have tested to pass. I have not tested the core installer , though. Also some doxygen is missing. But this is big progress.
Edit: I have not separated out the loader functionality into a separate file. That's premature optimization.
Comment #6
chx CreditAttribution: chx commentedSome simplification in the writer.
Comment #7
chx CreditAttribution: chx commentedThis one here unifies the reader and the writer into one filter. Took me a bit to figure this one out. But the code reuse in the filter is so high that it worths it. Previously we had an optional read_callback and an optional write_callback instead we now have an optional path_prefix_callback. For the filter, this is responsible for creating the directories.
Comment #8
chx CreditAttribution: chx commentedOh, and if read_callback and write_callback is gone then we can clean this up further returning drupal_php_read and drupal_php_write to their intended size: one line of code each. I have noticed that while stream_filter_register is not fussy about registering the same thing twice, stream_wrapper_register is and this might cause a problem with the database wrapper (or anything else people might want to add). A drupal_stream_wrapper_register() has been added to avoid this.
Comment #9
neclimdulSome rough docs I wrote while reading the code and a fix for the shortcut in _drupal_php_helper().
Comment #10
chx CreditAttribution: chx commentedneclimdul raised the question whether we need multiple loaders, the $type argument. However, the DIC can not be stored in the database and so if there is only one loader that means we can not ever use the database. We can say we only want the filesystem and the only configuration is allowed is whether you want native or through a user stream filter and throw the whole config and pluggable thing out. I do not feel good about that approach :)
Edit: we want to inline the variable get as
$GLOBALS['conf']['php_loader']
in the next patch.Comment #11
chx CreditAttribution: chx commentedThe original issue was a major feature request so I guess this is major too.
Comment #12
effulgentsia CreditAttribution: effulgentsia commentedI haven't reviewed the code in detail yet, but my main concern with using a stream filter is that, per #1668892-127: Enable secure compiled information on disk, it's not compatible with APC caching. The issue summary of #1673162: Analyze performance of uncompiled and compiled DI containers contains some benchmarks for apc and no-apc loading of compiled/uncompiled DIC. The no-apc numbers still show an improvement relative to uncompiled, so a compiled DIC with #9 is still an improvement compared to HEAD. But my hunch is that if we proceed with this for now, we'll still need a follow-up to optimize further for APC. OTOH, the idea in #1668892-144: Enable secure compiled information on disk sounds awesome if it can fly with the security team. Thoughts?
Comment #13
effulgentsia CreditAttribution: effulgentsia commentedComment #14
chx CreditAttribution: chx commentedA completely new direction which does not involve streams. It involves hash-named directories, and a file name in depending on the mtime of the directory. Comments explain why it's secure. No doxygen or tests yet.
Comment #16
chx CreditAttribution: chx commentedBah, that doesn't change the design. Also note that the next patch will need to deal with existing dirs -- which is easy, just delete the file from the directory that will update the directory mtime to be current (same time as mktime IMO).
Comment #17
chx CreditAttribution: chx commentedI have checked on Windows that file creation and deletion affects the holding directory mtime. So that's settled. Performance note: the operating system needs to read the inodes of every directory en-route anyways and so there is no additional I/O because of the filemtime calls, the only additional time is that information needs to up to PHP. In general, this is as close as we can get to just native include. Works with apc. Even apc.stat=1 which no other solution we came up with so far could claim.
There was a complaint this will be harder to handle this with git. Let me introduce you to
git add -A sites/default/files/codegen
which adds new ones and removes old ones. Another complaint will be that git checking out will change the directory mtime. While this is true, the attached version (still "bare", no docs, no tests but LOTS of comments) actually stores the filenames (lightly encoded) and it is trivial to write a small and superb fast drush script that iterates over the codegen dirs, decodes the filenames (by replacing the # with slash) and redo the hashing of the filenames (realizing the need for it, I have moved this into a separate function).Comment #18
effulgentsia CreditAttribution: effulgentsia commentedI think this is a good approach, but:
- We should still make it swappable, following the cache system pattern ($conf variable mapping a 'bin' (e.g., twig, dic) to a class, where #17 is a default class we ship with).
- I'm not sure it's a good idea to use the public files dir. Perhaps another $conf variable that defaults to 'sites/default/codegen'.
- Should _drupal_php_rename_to_salted() also chmod the file to 400?
Before any of the above though, let's get some of the active participants from #1668892: Enable secure compiled information on disk to +1 this approach or poke holes in it.
Comment #19
effulgentsia CreditAttribution: effulgentsia commentedA few more people weighed in on #1668892: Enable secure compiled information on disk that the approach in #17 is sufficiently secure. Marked that issue a duplicate. Bringing over its tags. Setting to "needs work" at least for #18.1.
Comment #20
effulgentsia CreditAttribution: effulgentsia commentedAdding Performance tag to clarify why this is a "task", not a "feature request". We have good reason to expect from #1673162: Analyze performance of uncompiled and compiled DI containers that this will be necessary to avoid/minimize performance regressions from D7 to D8.
Comment #21
chx CreditAttribution: chx commentedHere's a classed version. It's a component -- there's nothing Drupal-ish about this, after all. The only problem was that we had a drupal_hmac_64 call but I changed even that to an injected thing. Non-Drupal people can pass in a closure wrapped around hash_hmac or whatever. Right now it's not plugged into Drupal either. I am not deciding on public files or not in this patch, that'll be the next but I would like to note that making users to create and chmod another directory is a pita. Added chmod 0400.
Is @var Callable a thing?
Edit: it seems it is, just with a lowercase c.
Comment #22
pounard@var callable
Without caps. "callable" is a valid type hint in PHP 5.4, accepting all of what is callable (function name, closure, array with an object and a function name, etc...).
Nice code, thanks chx, the interface segregation here is exactly what we need at this point.
Comment #23
Crell CreditAttribution: Crell commentedThis is looking much much better than the earlier versions! No detailed review of the algorithm itself yet, but some general comments:
This should be @var
Since the constructor is not part of the interface, these should be broken out to separate variables rather than a config array. It makes the class more self-documenting. It also means we can provide a docblock for the constructor with very clear documentation about what to pass in and what it does.
Also, should ProtectedLoader be named something more descriptive, like TimeProtectedLoader, or some such? (To differentiate from, I presume, someone implementing some other protection mechanism.)
Comment #24
chx CreditAttribution: chx commentedWe have discussed the $config thing and agreed we keep it just document it very well.
Comment #25
pounardThis is super low level stuff, and it not be magically instanciated by some DIC, but using pure hardcoded code in some bootstrap phase I guess, so I'm pro keeping the options array here that will be the one we get from the settings.php file.
I don't want to force the contructor to be in the interface, an intermediate solution might be a setOptions() method or just a really good interface PHP documentation.
Comment #26
neclimdulre the constructor, I agree. Because its not part of the interface its up to the implementation to clearly document the argument and the use polymorphism/good DI practices to abstract the rest away.
Comment #27
chx CreditAttribution: chx commentedTested, documented, binned, whatnot. Added chmod() calls to really batten down the hatches.
Comment #28
pounardIs there a need to allow multiple loaders with the $bin parameter? I tend to think no, any opinion?
Comment #30
chx CreditAttribution: chx commented#28, effulgentsia points out that we have three use cases for the system: Twig, DIC and module upgrades. Twig and DIC can be regenerated but module upgrade can't and so they deserve separate handling. I am puzzled by the 12 exceptions, how can
if (is_dir($path)) { chmod($path, 0700);
saychmod(): No such file or directory
?Comment #31
chx CreditAttribution: chx commentedAh ha. Stream wrappers without stream_metadata. OK, moved that chmod into my test.
Comment #32
chx CreditAttribution: chx commentedRerolled against HEAD plus solved the test cleanup issue once and for all -- a crashing test might have left unreadable dirs behind. Not any more.
Comment #34
chx CreditAttribution: chx commentedMeh.
Comment #35
xjm#34: 1675260_34.patch queued for re-testing.
Comment #36
chx CreditAttribution: chx commentedI found two debug() statements and removed them.
Comment #37
chx CreditAttribution: chx commentedInstead of copying the whole of file_unmanaged_delete_recursive() I made it slightly more reusable. I do not want to add a possibility of chmod'ing into that function, it's dangerous enough.
Comment #38
pounardI don't see the point of functions
drupal_php_include()
anddrupal_php_write()
etc... We are slowly switching a lot of core code to OOP code, and I think that people are smart enough to usedrupal_php_get_loader()->write()
,drupal_php_get_loader()->phpInclude()
, etc... Just as cache system is also slowly evolvoving to.Plus it avoid to have too many functions in global namespace and leverage the OOP architecture below (also the IDE autocompletion if you have one), it encapsulate a bit more what is really behind the procedural wrapper and avoid this indirection layer.
Why is the test a WebTestCase? It would be faster and cleaner to be able to have a UnitTest instead if we can.
That's only an opinion, aside of that I like this patch.
EDIT: Typo.
Re-EDIT: Some other typo.
Comment #39
chx CreditAttribution: chx commentedImplemented #38, renamed phpInclude to includePhp (thanks pounard for the name). The test now is so, so very fast, thanks for the idea.
Comment #40
pounardReally nice, it makes the change even smaller.
Comment #41
pounardIf tests pass, it's RTBC for me.
Comment #42
aspilicious CreditAttribution: aspilicious commentedI would like to see some doc improvements before this gets in.
This is longer than 80 chars
Missing @file
Always add a newline after a "class xxx" line
This needs an @param
Needs some extra explanation
End with a period
Add a newline above and create a docblock for this
Missing @file
Newline after this
IncludeS (same for the other functions in this interface)
We always add both cases in the return documentation. So there can't be any confusion. Even if i's kinda duplicating.
Deletes a directory recursivly ...
Can use a docblock header
With new files I would like to document test classes better.
21 days to next Drupal core point release.
Comment #43
chx CreditAttribution: chx commentedThanks for guarding our code quality. Really appreciated.
Comment #44
chx CreditAttribution: chx commentedMore!
Comment #45
chx CreditAttribution: chx commentedEven more :)
Comment #46
pounardI think it's ready now.
Comment #47
chx CreditAttribution: chx commentedAdded a test for the native loader. This makes for a framework for any other loader test btw.
Comment #48
chx CreditAttribution: chx commentedHopefully last one: very small doxygen changes.
Comment #49
sunAny particular reason for why this isn't a factory; i.e.,
Drupal\Component\PhpLoader\PhpLoaderFactory
?If there is a good reason, then let's document it in the phpDoc.
Given that the whole purpose of this is to load static, low-level stuff that has been dumped to disk, and that this stuff holds the definitions for services and their parameters, the usage of variable_get() here doesn't make much sense to me. Let's replace this with a simple
global $conf
.'prefix' seems to be a directory in all cases, so why not name it 'directory'?
The only thing that _simpletest_delete_recursive() is additionally doing is to call chmod() on $entry_path before recursing. Wouldn't it make more sense to replace $function with
$chmod_dir = NULL
?Why do we pass the called $function name itself as parameter?
Can we rename this to
$options
?1) The call to file_exists() seems unnecessary. It would be better to have a separate ::exists() method that can be called when desired. I don't see why the ::includePhp() method shouldn't unconditionally call into include_once when asked to do so.
2) Why does includePhp() use include_once? If there is a need for include_once, then that should be a separate includeOncePhp() method.
3) Given how expensive it is to calculate the hashed file path, I wonder whether the includeOncePhp() method should use an internal
private $includedFiles
property to quickly look up whether a given $filename was included already.Why do we try to hide write errors? They are not suppressed anywhere else.
Can we add docs that explain why $dir is touched? (outside of the loop)
That's a nice trick, but it makes the logic unnecessarily hard(er) to follow. It would be cleaner to put one clearstatcache() before the while() and another one at the end of the inner loop.
Shouldn't those two class names at least be rooted to the top-level namespace?
Let's rename the actual test method into testWhatever() on the base class and make the setUp() method for the individual implementation classes provide the implementation to test in
$this->loader
.That said, the tests do not seem to be complete in terms of expected functionality/behavior. Let's add some more tests for our expectations there, especially regarding security.
Comment #50
chx CreditAttribution: chx commentedComment #51
chx CreditAttribution: chx commentedAny particular reason for why this isn't a factory; i.e., Drupal\Component\PhpLoader\PhpLoaderFactory?
Because that's 500x times as many characters to type. Unlike others I am not drunk on OOP for OOP's ske.
'prefix' seems to be a directory in all cases, so why not name it 'directory'?
It's a path prefix, after all. I admit it comes from the time when it was a filter though. I do not care much what it is called.
The only thing that _simpletest_delete_recursive() is additionally doing is to call chmod() on $entry_path before recursing. Wouldn't it make more sense to replace $function with $chmod_dir = NULL?
Nope. file_recursive_unmanaged_delete() has a stated purpose to not delete things and I want to honor it. It also works with stream wrapper and chmod() doesn't (pre-PHP 5.4 it's always a warning, 5.4 needs stream_metadata implemented which we obviously don't have)
Why do we pass the called $function name itself as parameter?
Why not?
Can we rename $config to $options?
Why would I when it's the configuration for the clsas?
The call to file_exists() seems unnecessary. It would be better to have a separate ::exists() method that can be called when desired. I don't see why the ::includePhp() method shouldn't unconditionally call into include_once when asked to do so.
Because otherwise you can't return TRUE / FALSE.
2) Why does includePhp() use include_once? If there is a need for include_once, then that should be a separate includeOncePhp() method.
It can be include. Core everywhere uses _once even in places which have guards against running twice. I do not see a reason to bikeshed it here.
3) Given how expensive it is to calculate the hashed file path, I wonder whether the includeOncePhp() method should use an internal private $includedFiles property to quickly look up whether a given $filename was included already.
That sounds a bad idea to me memory-wise given how this is going to be used the re-loading is an unlikely problem.
Why do we try to hide write errors? They are not suppressed anywhere else.
Once again, I do not care much. We have a FALSE return value I felt it's the role of the caller to handle the error case.
Can we add docs that explain why $dir is touched? (outside of the loop)
It might actually be obsoleted by now, actually.
That's a nice trick, but it makes the logic unnecessarily hard(er) to follow. It would be cleaner to put one clearstatcache() before the while() and another one at the end of the inner loop.
It's documented and I rather keep the code shorter but -- I do not care.
Shouldn't those two class names at least be rooted to the top-level namespace?
Likely they should.
Let's rename the actual test method into testWhatever() on the base class and make the setUp() method for the individual implementation classes provide the implementation to test in $this->loader.
That said, the tests do not seem to be complete in terms of expected functionality/behavior. Let's add some more tests for our expectations there, especially regarding security.
I have unassigned and unfollowed the issue. We have committed much larger (kernel) patches with practically zero testing. At least this has a test. I am not keen on writing tests for chmod().
Comment #52
effulgentsia CreditAttribution: effulgentsia commentedThanks for driving it this far, chx. I'll take a turn.
Comment #53
effulgentsia CreditAttribution: effulgentsia commentedI'm almost done implementing sun's feedback and my own, but not quite. Will finish tomorrow and post an updated patch.
Comment #54
chx CreditAttribution: chx commentedNote: I thought that touch is necessary to make sure the dir mtime / filename changes but then again if there was a file in there we delete it and that changes the dir mtime so it is, indeed, seems unnecessary. On the other hand, the cost of one touch to make extra sure the dir mtime changes even if it was empty is so minuscule that I would keep it.
Comment #55
pounard@effulgentsia
Please don't implement sun's suggestions blindly, I don't agree with all of those, and they need some talking first.Ok chx's answers are pretty good IMHO.EDIT: About the
includeOnce
name, I'm against. This is a compiled code class loader more than a php include file helper and I don't see where this could change, I don't think bikeshedding about*_once()
,require()
orinclude()
has any sense in this issue. Once a file is loaded, it shouldn't be loaded twice because all its compiled code (mostly classes) are up, and it is very likely to be context agnostic code (so include makes no sense in the end. I'd prefer to call the methodloadCode()
or something instead of keeping theincludePhp()
bikeshedding name (even if we agreed that load was a bad name with chx in the first place).@chx Tell me if I'm wrong about the context agnostic side, especially regarding Twig of which I don't much about.
Comment #56
pounardSorry, browser cache problem.
Comment #57
chx CreditAttribution: chx commentedWhat definitely needs implementing is a test for delete and fixing the delete using classes not use'd because that's a fatal error what sun found.
Comment #58
effulgentsia CreditAttribution: effulgentsia commentedI still want to add some more tests after lunch, but meanwhile, here's a patch that can be reviewed. I renamed a lot, so not attaching an interdiff.
Comment #60
pounardI think that the "generated php" is wrong. It can store PHP code, in general.
Comment #61
effulgentsia CreditAttribution: effulgentsia commented#58: 1675260_58.patch queued for re-testing.
Comment #62
effulgentsia CreditAttribution: effulgentsia commentedBut the purpose is to store PHP generated by the Drupal process. Even if it's a "download code from d.o. and save it locally", isn't it still "generated" in that sense?
Comment #64
effulgentsia CreditAttribution: effulgentsia commented#60 was x-post I think. Reassigning to myself.
Comment #65
effulgentsia CreditAttribution: effulgentsia commentedAdded more tests. I haven't been able to replicate the #58 failures locally yet, but I wonder if the changes here happen to fix them.
Comment #67
effulgentsia CreditAttribution: effulgentsia commentedOk, now I can replicate the "Resettable static variables test" failure locally. Working on it. Also, upon thinking about it, I agree with pounard (#60) to remove "generated": it's an extra word that's unnecessary. WIll do so in the next patch.
Comment #68
effulgentsia CreditAttribution: effulgentsia commentedComment #69
effulgentsia CreditAttribution: effulgentsia commentedAdding the DIC use case into here.
Comment #70
effulgentsia CreditAttribution: effulgentsia commentedI'm done for the day. Will check in on this again tomorrow morning.
Comment #72
katbailey CreditAttribution: katbailey commentedPlease let's keep the scope of this issue to the mechanism for writing and reading the files. The container use case had originally been included in the bundles patch and was removed pending a secure way to do it, which was this issue here. A follow-up issue was then created for the container compilation, which is here: #1706064: Compile the Injection Container to disk, postponed until this one gets in. As you noted on that issue today, @effulgentsia, I had already submitted a patch which puts back the code that was removed, adjusted to work off chx's solution in #48. Adjusting it to account for renames of the various functions involved will be trivial.
Adding in the container compilation use case will only serve to slow down this issue - the exact way that we do the container compilation needs a whole separate discussion.
Comment #73
effulgentsia CreditAttribution: effulgentsia commentedThe #69 failures are interesting. Wondering if they're random or will happen again in the same locations, so uploading again to get a separate bot run.
Comment #74
effulgentsia CreditAttribution: effulgentsia commentedRe #72, fair enough. Reuploading #68 which is just the system without a use case. Looking forward to some reviews.
Comment #75
chx CreditAttribution: chx commentedThanks for the patch.
In the security test I would mention that the security problem we work against here usually does not include access to chmod only an arbitrary setting of upload file names so even that 1 second window is, in fact, much, much shorter: the window between chmod 0300 and 0100.
Also, there is no delete test. This might be related to the single exception we are getting: opendir(/var/lib/drupaltestbot/sites/default/files/checkout/sites/default/files/simpletest/593341/php/dependency_injection/container.php): failed to open dir: Permission denied" If I will have time to look into today I will reassign to myself, not yet sure.
Comment #76
effulgentsia CreditAttribution: effulgentsia commentedPhpStorageTestBase::doTest() tests delete(). Good suggestion on docs improvement for the security test. Back to "needs review" though to encourage more feedback on the rest of the patch.
Comment #78
pounardThe tests are not easily extendable, I think sun raised that point.
The abstract test method should define a
getPhpLoader()
(no name bikeshedding here) function, and carry all the tests (except the implementation very specific related tests), and the implementors would only have to extend it and implement this method.Comment #79
chx CreditAttribution: chx commentedThis issue is an absolute textbook example of bikeshedding -- the kernel patches are the atomic plants noone understands so they fly in without (much needed) scrutiny this issue is the bikeshed which gets (totally unneeded) lots of nitpicking. I will fix the failing tests and that's it.
Edit: doesn't fail for me locally. Re-testing.
Edit2: holding up the patch for alleged limitation of tests is bikeshedding, plain and simple. They are extensible enough and what if they aren't? There won't be many contrib alternatives and if they find the core test limited they can write their own just fine. The API needs to be unlimited, not the tests.
Comment #80
chx CreditAttribution: chx commented#74: 1675260_68.patch queued for re-testing.
Comment #81
pounard@chx sorry for the occasional bikeshed.
Comment #82
Dries CreditAttribution: Dries commented@msonnabaum will beg you to remove the
file_exists()
call inload()
because it could be a performance killer on network file systems. We should ask ourselves if it is really necessary to do thatfile_exists()
call. I can't really tell because there are not any use cases yet in the code, it seems.It is odd that we need to do a
file_exists()
check. Why would this file get called byfile_unmanaged_delete_recursive()
on a non-existing file?Comment #83
moshe weitzman CreditAttribution: moshe weitzman commentedSince the generated PHP is basically a cache, I think it would be fine to default to writes to temporary:// instead of public://. The issue is that all requests need to access the generated php file and that read will be slow when you have a shared filesystem. This also avoids a WTF with private files within the public:// directory.
Comment #84
chx CreditAttribution: chx commentedIf we want to use this for the module upgrade scenario it needs to be permanent.
The only reason include it has a file_exists because otherwise we can't have a boolean return. Feel free to remove and replace with documentation. I won't. I promised to get this green again and that's it. It wasn't actually failing so that was easy.
I have no idea why effulgentsia added a file_exists before the chmod.
Comment #85
effulgentsia CreditAttribution: effulgentsia commentedI'll work on this some more this afternoon to incorporate recent feedback from pounard, Dries, and Moshe.
I think we can go with temporary as the default, and in the issue that we do module upgrade as a use case, have that one use a special bin. Thoughts?
Something failed for me at one point when I didn't. I'll look into it and if it fails again, see if there's a more proper fix.
Comment #86
chx CreditAttribution: chx commentedHrm, yeah, we can go with another bin, after all temporary is a variable too, isn't it?
If by pounard's feedback you mean the tests, I am very much against the idea of adding a test method to the abstract class and then have the real test classes with a getinfo and a setup. It'd work. That's what Doctrine does for the AnnotationReader test. It took me about an hour to comprehend that one. It borders deliberate obfuscating tests. The tests are fine as they are. They are trivial to extend if what you want to do is the same -- if not, write your own test, I already said so. (I find it amusing how the OOP people are suddenly against injecting just because we didn't pick the sacred "implement and override" bullshit they pollute everything with)
Comment #87
effulgentsia CreditAttribution: effulgentsia commentedRemoved the file_exists(). Renamed doTest() to assertCRUD(), but per #86, didn't change the structure.
@moshe: would it be okay to deal with a non public:// default in a follow up? I think we may need to introduce something different than temporary:// (perhaps local://). Also, I want to avoid calling file_directory_temp() from drupal_php_storage(), so that drupal_php_storage() can be invoked prior to database bootstrap.
@chx, re #75: I didn't change the docs, because testSecurity() is testing the mtime protection in addition to the permissions. So, you're right that permissions alone provide a lot of the security, but since we also have mtime protection, that only kicks in if someone manages to break permissions. Maybe there's a way to capture that better in the docs, but I might not work on this again for a bit, so putting this up as an rtbc candidate in the meantime.
Comment #89
sunI've seen that @chx posted a question on http://stackoverflow.com/questions/11518486/ctime-mtime-holding-director...
But did we actually verify the changing mtime behavior? It looks like we're adding a dependency on a rather "magic" behavior of certain filesystems here - which builds the basis for an essential security mechanism.
This is what Windows 7 shows me:
In essence, in line with the answer on SO, the mtime is only updated when a file within the directory is created or deleted, but not when a file's content is updated.
In turn, we seem to have the expectation and assume that no one will be able to execute a script along the lines of:
It is possible that you need to have lower-level access in order to perform such a script in the first place. OTOH, relying on this magic filesystem(-specific) behavior is a bit concerning. Just wanted to point out my findings.
Comment #90
effulgentsia CreditAttribution: effulgentsia commented#87: 1675260_87.patch queued for re-testing.
Comment #91
moshe weitzman CreditAttribution: moshe weitzman commentedI would be ok if temp:// were handled on a follow-up.
Comment #92
chx CreditAttribution: chx commented#89 this is not what we are protecting against. Also note that your find will abort with permission denied. The dirs are not listable.
Edit: if that script is PHP and it has sufficient rights to chmod a dir created by the web server, it already has the same privileges as you would gain by owning Drupal. If that was bash, ie you gained a shell running under web server rights then you can create a PHP, file in the public dir, write SetHandler application/x-httpd-php to .htaccess (perhaps in a subdir if you happen to not be able to override the one Drupal dumped) and run whatever PHP you wanted. Breaking this is again unnecessary. No, what we want to stop is truly crappy scripts which allow arbitrary file uploads because just including a bare PHP file from an Apache writeable dir is a new vulnerability to those scripts and so we protect against that.
Comment #93
chx CreditAttribution: chx commentedSo let's recap the mtime reliance.
Putting a file in a chmod 0100 directory already protects against one off usage of a leaky upload script.
Making the file a secret hash protects against an infinite repetition of a leaky upload script to write a known filename in the (very short) chmod 0300-0100 window because simply the attacker can't know what filename to write.
The attacker could learn of the filename if the error display setting would be switched on (which is wrong for production anyways) PLUS an error in the generated PHP script would reveal that secret to the browser. See automation above for exploiting this. Making the hash relying on directory mtime protects against this step but this step alone. Should directory mtime not change upon file write/delete then we are still quite secure: the attacker needs this superb weird filesystem, a leaky upload script, some bogus PHP code deployed, error display on to launch the automated attack and even then it needs luck to avoid being detected by hammering a script for catch the short chmod window.
Comment #94
effulgentsia CreditAttribution: effulgentsia commentedWell, we actually have multiple layers of protection. If chmod were truly enough, we wouldn't need the mtime business at all, because both the directory and file are made not writable.
We have another layer of protection that the directory is made not readable either, making disclosure of the hashed file name also not easy.
So we're assuming here that we want another layer of protection in case the chmod isn't enough (e.g., Windows?, or if a crappy upload script is so crappy that it makes the destination writable before copying an uploaded file to it?) and the file name gets disclosed anyway (e.g., if the code inside throws uncaught errors and the site has errors reporting to the screen?). In this case, the directory mtime provides protection against a file move (i.e., what PHP's move_uploaded_file() does). #89 is correct that on all systems (not just Windows), writing to the file directly (rather than moving another file over it) doesn't touch the directory's mtime, so would be a vulnerability. So we're relying that a crappy upload script isn't so crappy that it fails to validate extension, makes the destination file writable, *and* doesn't use move_uploaded_file() or similar.
How much of the above should be documented in the code? Do we want detailed comments in there about "ok, here's what you need to do to break this"?
[Edit: this was an xpost with #93]
Comment #95
chx CreditAttribution: chx commentedI think a link to this issue is adequate.
Comment #96
pounard@#82,#84,
Actually, include(_once|)() will return a strict FALSE if the include failed. So in that case we could
return FALSE === @include_once $filename
instead thus avoiding both fatal errors and doublestat()
on the filesystem.@#83
We actually already discussed this on IRC, and using temporary here is a really bad idea. Yes it some kind of cache, but it is a persistent cache. Temporary file system will be prone to the operating system recurrent
clean cyclespurge, while when the site will be a in aggressive production mode it won't attempt rebuild by itself (I hope). Plus using temporary means potentially shared data with other daemons or sites on the operating system level, which makes it even less secure.@#86
Even if I do not agree completely, I'm OK with keeping chx's tests as-is in order to avoid bikeshed, and because it works, so we don't need to refactor for the sake of refactoring here.
EDIT: Minor typo.
Comment #97
pounardWhy assigning a temporary variable here? I don't see the need of those changes, at all, and considering that PHP will not have such an efficient compile phase because of its highly dynamic nature, avoiding those seems like a good idea.
In the
unlink
case, we can afford not putting the @ in front or testing with afile_exists()
around first if we want to get rid of errors, because the delete will not be done on performance critical code pathes.Comment #98
chx CreditAttribution: chx commentedHrm, I missed
on the include manual page. Thanks. So the next patch needs to remove the variable in #97 and implement the include-return.
Edit: I fixed the PHP manual page so it won't be so easy to be missed now.
Comment #99
Crell CreditAttribution: Crell commentedPlease be sure to include a comment for why we're using an @ then, which is otherwise a big no-no. (I can see the valid use here, because of the E_WARNING, but we should document why we're making an exception.)
Comment #100
chx CreditAttribution: chx commentedNote: I am not rerolling this to protest against the bullying-review in #49. This sort of thing must stop and I have no other ways but to strike. I haven't rolled a patch since in this issue and will not roll again nor will I in any other similar issue.
Comment #101
chx CreditAttribution: chx commentedAlas, our process contains no protection against such. If you go on strike , the bully won and the issue died / stalled. If you don't then you waste a ton of time defending non existing problems and again the bully won in stalling the issue as he wanted.
For the sake of this issue I will continue but -- how is the conflict resolution process coming :) ?
Comment #102
pounardJust ignore conflicts and continue with the patch. Once all fatal errors and dangerous code path will be eliminated, and all needed features will be here, the names/whitespace/etc bikeshed can go on.
Comment #103
chx CreditAttribution: chx commentedThis implements #98 and #99 as agreed. Sensible requests I can work with.
Comment #105
chx CreditAttribution: chx commentedD'oh, include is a statement, so...
Comment #106
catchWe can drop the @$function() call stuff when we've done #1247666: Replace @function calls with try/catch ErrorException blocks, just spamming that here.
Comment #107
pounardA try/catch block doesn't turn off the PHP notices, @function does: in this case this is the only goal we want to achieve, avoid a PHP file not found notice (and not Exception) because our layer does its own error handling. Try/catch blocks in this code would solve nothing.
EDIT: I read the other topic after writing that, I'm not sure this is a good idea to switch all PHP notices to exceptions, we are in a low level function here and spawing an exception object might be a bad idea for performances.
Comment #108
chx CreditAttribution: chx commentedStop. That's the other issue. Let's stay focused on this one.
Comment #109
pounardLet's mark this RTBC and see the bikeshed happen. I don't care for the names to change again, they are explicit enough for me.
EDIT: I did a quick eye review, and followed all the interdiff since the begining, I didn't see anything chocking happen. I'm happy with the interface and the names and the tests passes, so I'm totally fine with this being RTBC. As I said before, we will find bugs as soon as we will really use it, so let's not bikeshed it without really using it first. I won't do a whitespace review, so if anyone else wants.
Comment #110
webchickThis is way above my "paygrade" :) so I would prefer to defer final sign-off on this patch to catch or Dries.
Comment #111
webchickMoving to catch for now to reflect this.
Comment #112
pwolanin CreditAttribution: pwolanin commentedThis patch should include a read-only implementation of PhpStorageInterface
I'll add a simple implementation, but I think the more serious work on that should be in the follow-ups.
Comment #113
chx CreditAttribution: chx commentedGiven what this issue already went through, I would rather not add that here, to avoid further debates. (Should write do nothing? Throw exceptions? Etc.) It is entirely possible that we need higher level changes to react etc.
Let's do that in a folllowup.
Comment #114
pwolanin CreditAttribution: pwolanin commentedPatch also needs work since $GLOBALS['drupal_hash_salt'] may not be set - should use drupal_get_hash_salt()
Comment #115
chx CreditAttribution: chx commentedAs I said before, while "atomic plant" patches get in without any review #1599108-147: Allow modules to register services and subscriber services (events) because noone understands them (I was reviewing and discussing that one with Kat at the very moment it went in) this has no chance of getting in ever because everyone has their opinion. Let's admit it and stop wasting everyone's time.
Comment #116
webchickIf you no longer wish to work on this due to your disagreements with the feedback you're receiving, that's your perogative, but don't abuse the issue status flag.
Hint: A handy way to keep people from ever reviewing your patches is to treat them like this when they do.
Comment #117
pwolanin CreditAttribution: pwolanin commentedfollow-up issue to fix use of global variable $drupal_hash_salt: #1739986: Fix fallback in drupal_get_hash_salt(), move it to bootstrap.inc, use instead of $GLOBALS['drupal_hash_salt']
Comment #118
pwolanin CreditAttribution: pwolanin commentedHere's a new patch + interdiff that adds some more code comments, adds a read-only class, and adds some more tests including a demonstration that the Mtime class cannot protect against e.g. broken code that does a fiel_put_contents(). We use this, for example, in javascript_libraries module for local caching of files, so a logic error in such a module is a possible attack vectot.
Comment #119
chx CreditAttribution: chx commentedTrying to write a calm review: such a logic would need to chmod the parent directory first and then find out the secret filename. Neither is likely.
Also, the sleep is completely unnecessary, we have dropped the filemtime directory mtime comparison when chmods were added. That can be debated whether it makes sense to drop it. In my opinion it's much better to have but then people have said that the additional mtime can be significant on a networked FS or something. I very seriously doubt there is a system where the mtimeprotected loader will be used and a single mtime is significant and also there are filesystem level caches but -- did I mention I am sick of people bikeshedding this death? If we add it back then of course the test won't work. Not that it does now:
(edited for brevity) color me very confused on what this intends to do. hacked starts as FALSE. Then it asserts it stayed FALSE...
Comment #120
chx CreditAttribution: chx commentedAlso, I am totally not sure why would we want to add such a test to core. The test it not about any functionality we do not want to regress but about a known limitation. We know what would need to change to immediately make it obsolete. What purpose it serves? Proof of concept -- there's no need to prove we know already.
Comment #121
chx CreditAttribution: chx commentedSummary of changes
The interdiff is big because of the massive test changes.
Comment #122
pwolanin CreditAttribution: pwolanin commentedI'm more confortable with the system defaulting to a mode that protects against the full range of mechanisms to update files.
possibly we should add 'bin' as a param that's used by class FileStorage?
However, that can be a trivial follow-up once we start to use this code and see what's needed to support compiling the DIC.
Comment #123
catchWhy do we need both $configuration['bin'] and $bin?
file_mtim().
Why can't we use file_save_htaccess() here? If there's a reason, why can't we make file_save_htaccess() capable of being used? It's not great that the text is copy/pasted in this function - makes it harder to update tc.
This comment contains a lot of the underlying discussion in this issue, but it's not put as well in the code itself as it is here - if possible it'd be nice to move it into the actual storage class.
Generally I think this looks fine, the security implications have been discussed at length in this issue and I'm happy with the balance that's in the patch.
Would be great to see performance numbers for how expensive this is compared to raw include_once() just to have an idea - presumably we'll recommend that sites that care about performance will check in any generated PHP code to version control and use a raw PHP reader but lots of sites won't. However I can't believe it'll be more overhead than using un-compiled containers or uncompiled Twig templates and whatever other horrible things we'd do without it so that doesn't block a commit at all.
Comment #124
pwolanin CreditAttribution: pwolanin commentedHere's a revised patch taking those comments into account, and making sure to set all standard configuration keys.
Comment #125
pwolanin CreditAttribution: pwolanin commentedhere's the interdiff too
Comment #126
catchOK the interdiff is really a diff of the two patches which is hard to read, but I think it's OK if I'm not going too cross-eyed reading it.
One more question - should the $configuration['fast'] stuff be a separate implementation that extends from the secure version then instead of configuring it as fast you just switch to that storage backend? We don't necessarily need to do that to get this in but it feels like it might be simpler.
Comment #127
pwolanin CreditAttribution: pwolanin commentedIt could be a separate class, sure. it would just change the load method. Given that there is no use of it now except in the test, we could also do it in a follow-up.
Comment #128
pounardI agree with catch here, but getting the patch commited is really important to unblock WSCCI/kernel stuff, so IMO pwolanin's suggestion to do a follow-up sounds good to me.
Comment #129
chx CreditAttribution: chx commentedClass split, test split, fast removed. Whether MTimeProtectedFileStorageTest extends MTimeProtectedFastFileStorageTest or vica versa or both extend the same abstract base class is most definitely a followup. Such ambiguity does not exist for the actual storage classes, there MTimeProtectedFileStorage must extend MTimeProtectedFastFileStorage because while MTimeProtectedFastFileStorage uses FileStorage::load() and FileStorage::exists(), MTimeProtectedFileStorage has its own load and exists methods (and not much else). I ran the PHP Storage tests, they pass, I checked git, the new files are here, all looks good.
Comment #130
pwolanin CreditAttribution: pwolanin commentedHa, I was writing almost exactly the same change yesterday, but was having a test fail so I hadn't posted it yet.
However, instead of "MTimeProtectedFastFileStorage" I used "MTimeLessProtectedFileStorage". Which aspect is more important to emphasize?
Comment #131
chx CreditAttribution: chx commentedMTimeLessProtectedFasterFileStorage :D ?
Comment #132
pwolanin CreditAttribution: pwolanin commentedHa, that would be fine with me.
Comment #133
catchOK I discussed this a bit in person with chx and pwolanin (separately), and have looked through a few times now. The two classes seem much happier to me, I like mmmmm timeless as a name but we can do that in a follow-up.
This will need a change notice.
Comment #134
chx CreditAttribution: chx commentedhttp://drupal.org/node/1747970
Comment #135
mikeytown2 CreditAttribution: mikeytown2 commentedI saw that the code that went in uses clearstatcache() without specifying the filename; thus clearing the whole stat cache. I created an issue that addresses this in core #1748880: Only clear the stat cache for the files we care about
Comment #136
pwolanin CreditAttribution: pwolanin commentedaltered the change notice to use
drupal_php_storage('mymodule')
Comment #137
chx CreditAttribution: chx commentedSure.
Comment #138
chx CreditAttribution: chx commentedComment #140
clemens.tolboomI'm not sure I understand the 'leaky script' solution at full.
But is it still possible to fix #1908440: Relax MTimeProtectedFileStorage permissions for DX, drush integration and world domination as that only umask stuff?