Problem/Motivation

For Drupal 8, with the Dependency Injection Container we really, really want to compile the container configuration to a serialized form. Re-configuring the container on every page request would be on the scale of re-running hook_menu and hook_theme on every page request; that's simply not acceptable.

Symfony supports dumping that information to a couple of different forms, and we could add additional dump mechanisms, such as to JSON, or the database, or something. However, the fastest mechanism (and, incidentally, the easiest to debug) is to dump to PHP code, which can then simply be loaded off disk (cached in APC).

For Drupal 8, there's a strong push to use Twig for the template layer. That has many advantages, discussed elsewhere, but most of Twig's performance benefits come from its ability to compile templates to PHP code that can simply be executed. That, of course, requires generating PHP code, and there is no other compiled form available (as that would entirely defeat the purpose).

As we all know, generating PHP from Apache that gets executed by Apache is a security no-no. We draw the line between acceptable and unacceptable risk where your directories are Apache-writeable and, at the same time, those written files are Apache-executable. This risk is too big for us because both the write and the execute can happen outside of the Drupal realm. (See more at 5.)

For most Symfony2-based applications, that is no problem. Configuration changes are done by editing a config file and running a CLI command (using Symfony's Console component) that rebuilds the DIC, regenerates template files, etc. (It may be multiple commands, I don't know off hand, but that's irrelevant.)

For Drupal, we have this really big administrative area where people can change all sorts of things without editing a config file. That's rather the point of Drupal. That applies to important things like enabling modules, even if most configuration will be handled by CMI, not by the DIC. We also consider as a target market people who are making changes without shell access. And, we have this annoying problem that we really really care about security.

Basically, we run into another case of the classic 3-point balance: Performance, Security, Usability -- Pick two.

We need to find some way to resolve this. In general, we need to understand that we will *not* be able to entirely get all three of our wants (Security, Performance, Usability), so we are going to have to sacrifice at least some of all of them.

Proposed resolution

We provide a very small, very tight API, pluggable via settings.php (yes, this will be a different plugging mechanism) that can:

  1. Write PHP files
  2. Load PHP files
  3. Tell the rest of the system whether 1. is available.
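A minimal sketch of what that API could look like, assuming entirely hypothetical interface and class names (nothing here is settled; the bare-files backend shown corresponds to implementation 1 in the list that follows):

```php
<?php

// Hypothetical sketch only: none of these names are decided.
interface PhpStorageInterface {
  public function save($name, $code);   // 1. Write PHP files.
  public function load($name);          // 2. Load PHP files.
  public function writeable();          // 3. Whether save() is available.
}

// The simplest possible backend: bare files on disk.
class FilePhpStorage implements PhpStorageInterface {
  protected $directory;

  public function __construct($directory) {
    $this->directory = $directory;
  }

  public function save($name, $code) {
    return file_put_contents($this->directory . '/' . $name, $code) !== FALSE;
  }

  public function load($name) {
    $path = $this->directory . '/' . $name;
    if (!file_exists($path)) {
      return FALSE;
    }
    include $path;
    return TRUE;
  }

  public function writeable() {
    return is_writable($this->directory);
  }
}
```

A read-only production deployment could then simply ship a backend whose writeable() returns FALSE, and the rest of the system can react accordingly.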

The implementations we ship with are:

  1. Bare files. Works for shared hosts / devel sites where the same user runs Apache and owns the files, or if you chmod every time you want to do an operation from the UI that needs a write.
  2. A stream wrapper that reads and writes PHP files with a header consisting of <?php exit; ?>randomstring. This would be used mostly by shared hosts or no-staging installations.
  3. Perhaps a Phar storage and reader: Phar is native, so the code will be minimal and fast, the same as native files except that the chmod is done on one file only.
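A rough sketch of how implementation 2 could work, assuming a hypothetical phpsec:// scheme name; in a real implementation the secret would come from settings.php rather than a class constant, and the writer would live behind the storage API above:

```php
<?php

// Illustrative only: scheme name, class name, and secret handling are
// all assumptions, not a settled design.
class SecuredPhpStream {
  // In reality this would be a per-site random value from settings.php.
  const SECRET = 'randomstring';

  private $data;
  private $pos = 0;
  public $context;

  public static function header() {
    return '<?php exit; ?>' . self::SECRET;
  }

  // Write the compiled PHP with the exit-plus-secret header prepended.
  public static function write($path, $php_code) {
    return file_put_contents($path, self::header() . $php_code) !== FALSE;
  }

  public function stream_open($uri, $mode, $options, &$opened_path) {
    $path = substr($uri, strlen('phpsec://'));
    $contents = @file_get_contents($path);
    $header = self::header();
    // Reject any file lacking the secret header: an attacker who can write
    // files but does not know the secret cannot get code loaded this way.
    if ($contents === FALSE || strncmp($contents, $header, strlen($header)) !== 0) {
      return FALSE;
    }
    $this->data = substr($contents, strlen($header));
    return TRUE;
  }

  public function stream_read($count) {
    $chunk = substr($this->data, $this->pos, $count);
    $this->pos += strlen($chunk);
    return $chunk;
  }

  public function stream_eof() {
    return $this->pos >= strlen($this->data);
  }

  public function stream_stat() {
    return array('size' => strlen($this->data));
  }

  public function stream_close() {
  }
}

stream_wrapper_register('phpsec', 'SecuredPhpStream');
```

Browsing the file directly just hits the exit; reading it through phpsec:// strips the header and serves the payload, but only when the secret matches.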

Before #78, we had the following, competing proposals:
1) Say that certain changes, those that would affect the DIC, may only be made if you have shell access. Yes, that would include enabling/disabling modules. Pro: Neatly avoids the problem. Con: Neatly kills a large chunk of Drupal's flexibility.

2) Dump DIC information to JSON, YAML, or some other non-executable format. Symfony has support for this, so we just need to choose or write a backend. Pro: We're not writing anything more dangerous to disk than CMI files will be. Con: Doesn't help Twig at all, and is slower than PHP code.

3) Dump DIC and Twig files to the database, and write a stream wrapper that lets us include() them straight out of there. Pros: Very elegant for the rest of the system, and should be reasonably easy to do. Con: I have no idea if the performance of that would be any good, or if it's as bad as eval(). Introduces an extremely early database dependency, and in turn makes putting the database connection in the DIC a circular problem.

4) Require themes to come with pre-compiled Twig files. If you're editing a theme, you need to rerun the compile script. Pros: Neatly side-steps the security issue. Cons: No idea if this screws with Twig's design, because then compiled files would not all be in the same directory. Does nothing at all to help the DIC problem. Could make theme development more difficult, like it was before the "rebuild theme registry on every page" checkbox was invented.

5) Use a stream wrapper to include files in the filesystem. This allows us to add a random value to the beginning of each file, something like

<?php
exit;
?>
randomvalue

where randomvalue is the same for every file and is stored in settings.php / CMI, so that an attacker can't upload a valid include (because the randomvalue is unknown). We can also, possibly, put them in a random directory CMI-style, although whether or not that's necessary is debatable. Pros: this is likely quite performant and usable too. Cons: the security here is debatable, but it seems anyone able to exploit this would be able to exploit a Drupal without this subsystem as well.

Remaining tasks

Figure this out. :-)

User interface changes

TBD

API changes

TBD

Comments

5. We reuse the FileTransfer mechanism for this. Configurable, perhaps, to disable certain backends? Anyway, if your site is locally writeable, which the whining crowd's will be, then you are "golden". If you dislike FileTransfer, the CLI is your friend. In a devsite-prod scenario, the devsite will be locally writeable and the prod site will get the compiled stuff deployed, so this is not a problem for the higher end either.

Agreed that this is largely a problem only for people in single-install scenarios. Unfortunately, that's a huge segment of the population, and would include, say, everyone on Drupal Gardens. What chx is suggesting would imply that any time you enabled a module or theme or did certain other things you would have to also enter your shell password. Something like a built-in sudo command, which only works if your server has FTP or SFTP set up on it... which, sadly, a fair number do not. (Or they're missing the PHP libraries needed to connect to those.) I don't know if that's a good usability trade-off point.

Other note that msonnabaum reminded me of: Any of the CLI script options (which I don't think we can avoid) should not rely on Drush. That means we probably want to pull in the Symfony Console component for those. Which is not actually a bad thing, especially as it means that #1599108: Allow modules to register services and subscriber services (events) can use the Symfony Bundle default implementation that has a dependency on it. :-)

I noted this in the other issue, but:

As we all know, generating PHP from Apache that gets executed by Apache is a security no-no.

People keep saying this without saying why. Perhaps it's a noob-ish thing to ask about, but I would really like to hear the reasoning behind this.

When the HTTPd can write code that can be executed, it's probably both the easiest vulnerability to exploit and the most dangerous one: as soon as an attacker can execute arbitrary code, he has full control over the site and data.

Well, we've had the PHP input filter in core for years. Evidently, that was not a security concern, else we'd have removed it years ago. IMO, if the function that's actually writing the PHP is very isolated from the user, it's not much different than the PHP input filter.

3) Dump DIC and Twig files to the database, and write a stream wrapper that lets us include() them straight out of there. Pros: Very elegant for the rest of the system, and should be reasonably easy to do. Con: I have no idea if the performance of that would be any good, or if it's as bad as eval(). Introduces an extremely early database dependency, and in turn makes putting the database connection in the DIC a circular problem.

I would be in support of that. Performance should be as good as local files since we fixed a bug in APC's handling of file URLs (the fix was released in 3.1.10).

One wonders whether we could avoid the reliance on the full database system by putting these bits and pieces in SQLite (perhaps in the config dir), providing the path to it in settings.php and using raw PDO in the stream wrapper. Perhaps.

@cweagans I'll try to address your questions, but we should really be discussing the options rather than comments on process in the Drupal community that have only a limited connection to the issue. The provided options have plenty to discuss, and your points would be more meaningful in effecting change if they were based around them. Even so, you should not feel "noob-ish" for asking something you don't know. Also, the opinions not documented on drupal.org are my own, not necessarily those of the security team.

Well, we've had the PHP input filter in core for years. Evidently, that was not a security concern, else we'd have removed it years ago.

It's a constant concern, actually, and the reason it was moved to its own module and why I personally discourage its use if at all possible. Even then, it is always behind a permission marked as giving full control of the site. @see Preventing execution of untrusted PHP and Paranoia

As we all know, generating PHP from Apache that gets executed by Apache is a security no-no.

People keep saying this without saying why. Perhaps it's a noob-ish thing to ask about, but I would really like to hear the reasoning behind this.

Apache writing executable code isn't exactly the problem. In a perfectly controlled situation it might be OK. The problem is that crossing that line means that security concerns become more complicated, as any sort of user input could potentially be written to disk in some sort of executable format. This gives attackers many more options for getting PHP written to disk, and it is a situation where the entire process and any implications need to be reviewed. But since hooks make any process we provide infinitely customizable, that becomes fairly impossible and a hairy mess, leading to the general rule of "don't have Apache write executable code to disk", because that's a much easier rule to enforce.

edit: typos. i can't type.

Yeah, and that makes me wonder how secure writing to the db is. It's somewhat harder to get that executed because it's not a file you can browse to, but what if you can get Drupal to execute it? It's really tricky.

@cweagans I'll try to address your questions but we should really be discussing the options rather than comments on process in the Drupal community that only have limited connection to the issue.

The only reason that I asked in this issue is that I feel it doesn't have only a limited connection to the issue. If we can just write out a PHP file to some location on disk, then this ticket becomes moot.

Apache writing executable code isn't exactly the problem. In a perfectly controlled situation it might be ok. The problem is that crossing that line means that security concerns become more complicated as any sort of user input could potentially be written to disk in some sort of executable format.

If we're only building a file based on a bunch of file paths in the system, we don't need any user input. Just build a list of paths from the data on disk, write it to a PHP file on disk, and call it good. I don't see security problems with this if it's completely isolated from user input, which seems to be the main concern. Is there something other than file paths that are going to be written to this file? Am I just missing something?

As soon as the file is writable by the webserver, someone who exploits a vulnerability that allows him to write arbitrary file content may directly write PHP code into that particular file. It doesn't really matter how he exploited the other flaw and wrote content into it (it could be a fake POST'ed public URI pointing to it, for example); the fact that the file is writable is enough by itself: in order to read the system list array in the PHP file, you have to execute it, and if the attacker succeeded in writing it, then he has full control over the site. No matter how hard we try to contain it, once the file is writable, every vulnerability about writing arbitrary file content becomes an arbitrary code execution vulnerability as a consequence.

That's a valid point, chx. I don't know that Apache writing a PHP string to the DB and then executing it is any better, security-wise, than Apache writing a PHP string to disk and then executing it. Ibid for writing to SQLite, MongoDB, or Memcache.

@#10, @#13 Yes this is true, as soon as you write and execute code, you open potential vulnerabilities, no matter where this code comes from.

There's an option #6, which is not great, but might potentially work as a fallback:

- put all actions (such as enabling modules) that would involve writing PHP code into a queue.

- run down the queue on cron.

This only sidesteps the permissions issue if the site isn't also using poormanscron, but it might be a fallback in the case there's neither cli access nor ftp/sftp libraries available.

Re @Crell's point in #13: I think it's slightly better if you're on a terrible shared host where apache could write to any of the home directories on the server, since writing PHP to a database (except possibly sqlite) would require database access too, not just messed-up file permissions.

As soon as the file is writable by the webserver, someone that exploits a vulnerability that allows him to write arbitrary file content may directly write PHP code in that particular file.

Us directly writing PHP isn't going to have any relation to that issue though, since we'd totally isolate it from user input. Plus, if an attacker exploits a vulnerability that allows him to write arbitrary file content, AFAIK there's nothing stopping him from writing a PHP file himself in the files directory and bootstrapping Drupal.

Sounds like Twig should need to be pre-compiled or compiled using a CLI tool. (is it settled we are using Twig?)

The routing info needs to be in the DB I think except perhaps for special cases like install or maintenance mode.

AFAIK there's nothing stopping him from writing a PHP file himself in the files directory and bootstrapping Drupal.

Not if the files are not writable on the filesystem. The danger is not writing the files ourselves, but leaving the permission open for the HTTPd or CGI user to write them.

Can you have a compiled version of the DIC and then override it from e.g. a JSON file based on changes made in the UI?

I think a cycle of "change in the UI and then optimize with a CLI tool for production" might be a reasonable way forward for Drupal.

An override like this might also be a useful way to e.g. implement maintenance mode, so you just remove the file when you go out of maintenance mode.

Issue tags: +wscci-hitlist

Tagging.

So let me summarize the security aspects of the database-code approach. Suppose there is a vulnerability in upload functionality and a malicious user uploads a script and executes it. We protect against this by adding a .htaccess to the files directory (which is Apache-writeable) and urging everyone to make the other directories non-Apache-writeable. The assumption is that upload functionality will not be so buggy as to allow you to override .htaccess.

If we write code to the database, what would it require for an attacker to write arbitrary code there and get it executed? Edit: I can think of the following:

  1. have a Drupal bug where you can execute PHP from an arbitrary path. If you have such a bug, then you can just upload a txt file into the files directory and execute it.
  2. Have a way to arbitrarily write the database. I am afraid in this case you probably would reset the uid 1 email address / password and you are done.

So. I do not see new avenues of attacks opening.

cweagans / pounard:

AFAIK there's nothing stopping him from writing a PHP file himself in the files directory and bootstrapping Drupal.

Not if the file are not writable on the filesystem. The danger is not writing the files ourselves, but leaving the permission open to the HTTPd or CGI user to write it.

I don't get it. httpd can already write files for the regular upload, css aggregation, etc. Or have I done it wrong all the time?
Linux file permissions cannot distinguish between .php files and non-php files, can they? Yes, there is the executable bit, but we want to disable that anyway, don't we? We only want to "include" those files, not execute them directly.

The codegen directory would have a status similar to the private files dir: Outside of web root, etc.

3) Dump DIC and Twig files to the database, and write a stream wrapper that lets us include() them straight out of there.

Would it be equivalent (and superior) if we just dumped a file to disk, with the literal PHP code, but with a bogus string at the beginning which makes it break when someone tries to execute it directly? I only had a short glimpse at the doc page for stream wrappers, but I guess if they can handle PHP in the database, they can also handle PHP from a file, and remove the "lock".

Also, once we decide we like stream wrappers, we do not necessarily need to use the database. We could rot13 the PHP or anything that allows quick reading and does not allow Apache-browser interaction to execute a .PHP file. Edit: hey, that's what #23 suggests too.
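chx's rot13 suggestion in sketch form. The function names are made up for illustration; PHP's built-in str_rot13() does the actual transformation:

```php
<?php

// Store compiled PHP rot13-encoded so that browsing the file directly
// yields garbage instead of executable PHP. A real backend would expose
// this through a stream wrapper so the file could still be include()d.
// Function names are hypothetical.
function codegen_write_encoded($path, $php_code) {
  return file_put_contents($path, str_rot13($php_code)) !== FALSE;
}

function codegen_read_encoded($path) {
  return str_rot13(file_get_contents($path));
}
```

Note this only stops direct Apache-browser execution; anyone who knows the encoding can forge a valid file, which is why a secret the attacker cannot know is still needed.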

Or ... why don't we write into a randomly named directory, much like CMI does, and just include from there? We could replicate the whole PSR-0 dir structure in there, register it as a prefix, and go to town?

Hrm, the random dir approach has the flaw that through error messages we might reveal the path, but if we use a stream wrapper to add random crap at the beginning PLUS a random dir, that's defense in depth, isn't it?

@#23 You can write a lot of stuff with current Drupal, but none of those files will be arbitrarily included as plain PHP. If we add generated code, we will arbitrarily include this code; no matter what, the real security issue is only this.

Sounds like the key issue is that we don't want a file upload vulnerability to turn into a remote code execution vulnerability. Which means preventing both direct Apache execution and also preventing a Drupal include() of anything that is vulnerable to a file upload vulnerability.

#26 now it does make sense :)
So the point is not that a rogue httpd might execute a Drupal-generated file as PHP, but that Drupal might execute a rogue-httpd-generated file as PHP.

The little brother of this would be if Drupal interprets a rogue-httpd-generated file as configuration (yml, json, xml, etc). This is not as easy to abuse as directly executed/included php, but it is a form of injection that could indirectly be used for all kinds of damage.

The random crap at the beginning of files does not help, unless this contains some kind of secret signature thingie. The rogue httpd could simply put the same crap at the top of the modified file.

The point with putting the code in the database is that httpd cannot easily write to the database. Or can it?
It can look into settings.php to extract db username and password, log in to mysql, and inject the php code. Damage dealt.

-----

In essence:
We cannot prevent a rogue-httpd (code) injection attack, but we don't want to make it too easy.
Is this correct?

------

EDIT:
Ok, probably pointless. If we have a rogue httpd, the damage is already dealt.
So we don't assume a rogue httpd, but rather a limited file upload vulnerability, where the file is saved into the codegen dir. As #27 already says.

The point with putting the code in the database is that httpd cannot easily write to the database. Or can it?
It can look into settings.php to extract db username and password, log in to mysql, and inject the php code. Damage dealt.

No, the point is (or rather was, because it seems we found better solutions via the same stream wrapper mechanism) that Apache can't read it. Not that Drupal can't write it. I said it above: if you can read settings.php and then log into mysql, you can already pwn the system -- just rewrite the system table to include anything you want.

So to put it practically, the problem is if an Apache process can write this string to disk in a web-accessible place:

<?php
include 'settings.php';
// Do stuff with $databases to setup the DB connection.
Database::getConnection()->query("UPDATE {users} SET pass='pw0ned' WHERE uid=1");
?>

And then someone hits that URL in a browser, well, pw0ned. (I know that's not the exact script, but you get the idea.)

chx is correct, though, and I recall discussing a similar concept with Matt Butcher a few months ago. He thought I was nuts. :-) I think it's promising, though. Would it be acceptable from a security perspective to make all PHP writes pass through an encoding filter (as simple as rot13 maybe, or possibly something more advanced with a salt and everything; not sure) so that what gets written to disk isn't executable except through Drupal? I'm sure it would not be as fast as straight-up PHP, but it's probably close enough as long as we don't use a complex algorithm, and that would let us get all of the optimization benefits of writing PHP code to disk. It would also allow a really easy rebuild trigger:

<?php
if (!file_exists('compiled-di.php')) {
  // Run a recompile.
}
include_once('encodedphp://compiled-di.php');
?>

(Or something along those lines.)

So if you ever run into a registry-rebuild-type situation, delete a file (or several files if we're talking Twig) and it will rebuild.

Would that satisfy the security folks?

Note: adding something as simple as <?php exit; ?> to the beginning of the files and then using a stream wrapper to skip the first 15 bytes is enough to disarm anyone browsing those files.

@chx - how does that prevent havoc if Drupal is writing them and there's a vulnerability?

pwolanin: I think the conceit is that if you can hack into Drupal far enough that you could run arbitrary code, you could easily bypass whatever protection we put on that file. But if you're able to execute arbitrary code, why would you bother going after that file? You already have arbitrary code access, game over.

We just need to secure *that* file as an attack vector, not prevent it from being attacked from within a pw0ned site.

Quoting myself from #27:

Which means preventing both direct Apache execution and also preventing a Drupal include() of anything that is vulnerable to a file upload vulnerability.

#31 is a very elegant solution to the first part. For the second part, is having the default location of the 'encodedphp://' stream wrapper be sites/default/files/encodedphp_SOMETHING_LONG_AND_RANDOM sufficient security? I'm unclear if #25 answers yes or no to that.

Note that I'm assuming we'll allow sites wanting extra security (or with other reasons) to remap the stream wrapper to something else via settings.php, but we also need our default solution to be sufficiently secure.

Some bugs are going to allow remote execution no matter what, the worst case is perhaps the one where you can change the system table and get anything at all included by Drupal.

But what we are saying is that we draw the line between acceptable and unacceptable risk where your directories are Apache-writeable and, at the same time, those written things are Apache-executable. This risk is too big for us because both the write and the execute can happen outside of the Drupal realm. We think that Drupal itself is quality software enough that such horrific bugs as the system table being writeable are not happening. And so having a stream wrapper read and write in a controlled manner is an acceptable risk. It's not "secure"; it's an acceptable risk.

Re #32: you are right, we need to make sure that if there is an upload vulnerability, then the file written to the filesystem by the attacker is rejected by the stream wrapper. So <?php exit;?>1234567890, the number being a random hash (the same for all files) and checked by the stream wrapper.

Edit: this answers #34 too, I think.

This seems similar to some of the CMI discussion?

Obviously that could be protection against a certain class of attacks, but not against someone who gets a copy of your codebase - and it makes it more risky to let people see your code repo? Or do you think these are files that cannot be under version control?

The random key for CMI is in settings.php, and I assume any other random keys created for this would be too. That's where database credentials are too; do you think these keys pose a greater risk than that?

#31 is a very elegant solution to the first part. For the second part, is having the default location of the 'encodedphp://' stream wrapper be sites/default/files/encodedphp_SOMETHING_LONG_AND_RANDOM sufficient security? I'm unclear if #25 answers yes or no to that.

Why put it within files? Why not a dedicated location such as
sites/default/codegen/$randomstring/ ?
If we ask the user (site builder) to set permissions for sites/default/files, then we can do the same for sites/default/codegen.

(obviously the better location would be something entirely outside of the web root)

Issue summary: View changes

Added 5)

I don't see why we'd need a randomized location. For CMI that was the alternative to making a hacked up not-PHP-not-JSON/YAML file, and I'm still not a huge fan of that there, either. ;-) We're talking about a file that would be executable PHP anyway. I don't get why it's needed here, and for a sysadmin I think it is a usability negative.

Outside webroot would indeed be preferable, if available, but we cannot guarantee it is.

Updated the issue summary with the currently discussed "use a stream wrapper to include verified files from perhaps a random dir" as 5)

Issue summary: View changes

fleshed out 5)

Crell raised the question of debuggability while pwolanin wanted more security. Both can be solved if we expose an API which is higher level than stream wrappers (drupal_load is already there) and which is overrideable in settings.php / CMI. On one end, an insecure implementation shipping with the devel module could just read files from disk for local development; on the other end, the paranoia module could ship one that simply blackholes writes.

Issue summary: View changes

added a security notice

I quite like #42. I'm still not sold on the randomized directory, as I don't think that's necessary if we have the in-file hash, but otherwise this approach would let us write the DIC and Twig files to disk as PHP, and potentially other generated code. That is really powerful and opens up a lot of options for us.

(Side note. Resolving this issue allows for the much desired superb easy module installs and upgrades as well.)

@Crell (#40):
effulgentsia in #27 and #34 is spot-on.

There are 2 steps to the problem:
1) Something writes a php file.
2) Something executes that file.

This gives an attacker basically four options (started with two, now it's four :) )
a) Write a malicious php file, in a place where it will be executed by Drupal.
b) Alter the Drupal code generation, so that Drupal generates malicious php.
c) Cause the execution of a file that is not meant to be executed, or not at this moment. Typically by typing a url, if the file is web-accessible, hoping that Apache will execute it.
d) Write a malicious php file in a web-accessible place, hoping that Apache will execute it, if you type the url.

I think what we have to care about the most here is attack option (a).
Attack (b) has very little to do with how and where the code is saved: It can be encrypted, in the database, etc, it will all not help.
Attack (c) would apply to generated code, but does equally apply to all static php code: We don't want that to be executed randomly, only included by index.php.
Attack (d) already applies to the public files dir in D7, so the code generation does not change this much.

Prevent file injection to codegen dir

So, how can we prevent the attacker from writing those codegen php files?

Filesystem permissions are not the solution: Both Drupal and the attacker probably act via www-data / apache, so the same permissions apply to both.
If the attacker has full control over apache / www-data, we have lost anyway.

The attack we want to and can defend against is abuse of a file injection vulnerability. Such vulnerabilities are typically related to file uploading, but could also be other things. It could be something in Drupal, but it could also be a separate web application on the same server.

We want to choose a location that is less likely to be vulnerable to file injection, and where a file injection attack requires additional effort. This additional effort can be to determine the name of the secret dir, or to put a correct signature at the top of the file.

The

<?php
exit();
?>

header by itself does not help, because the attacker can easily write that into the injected file and have it executed by Drupal. It would help vs attacks (c) and (d), but I don't think these are as relevant.

Whether web-accessible directories are more vulnerable to file injection is hard to say. Probably it is easier to spy on the directory structure if it is web-accessible. So, anyway, we don't want it to be web-accessible.

The question is, what kinds of file injection attacks and vulnerabilities do we expect?

Issue summary: View changes

Fix typo.

To reiterate: we are telling the users that if their setup is such that Apache can write a PHP file and you can point your browser to it, that's a security risk we do not want to tolerate. We deem that setup insecure and we do not want to rely on such. We definitely do not want Drupal to write any raw PHP file that can be Apache executed. That's where we start. This kills d) and a) both.

I think that the nature of b) is such that if you can do that, you have already owned Drupal badly. This assumption might or might not be true. Yes, there can be a bug where some user input gets written out as PHP. This would be unfortunate, but bugs are a fact of life. We said numerous times in Twig issues, when people wanted browser editing, that it will be possible through a stream wrapper. So there is definitely a need for browser input to be executed. This is dangerous. No doubt. What this issue is about is finding a good compromise. Apache-writeable is a compromise we deemed unacceptable; Wordpress accepts it. It's a matter of priorities. We are drawing the line somewhere else. The line needs to be drawn.

As for c), adding an exit simply protects from it. Adding a random dir does not help much, nor does it hinder, for that matter.

#46:
Web-accessible or not has nothing to do with attack (a).
I say "executed by Drupal", where Drupal assumes this file was legitimately created by Drupal code generation, but in fact it was placed there by the attacker, abusing a file injection vulnerability.
The folder that is affected by the file injection vulnerability could happen to be web-accessible or not - it does not matter, so long as Drupal treats those files as if they were legitimately generated.

So in fact it does not kill (a).

But the suggested random secret after the <?php exit; ?> does kill it. The attacker can't know that one. Meanwhile msonnabaum raised the question -- can't we use Phar? While Phar signing requires openssl, which we can't require, we could get around that; but the real problem is the phar.readonly setting, which is php.ini-only, so if it's set on a shared host we can't disable it, and it would create a support nightmare where we need to tell people to edit their php.ini.

Just a note: any solution based on encoding the file or on a stream wrapper is pretty much useless, because the code will be decoded and executed anyway. If the attacker knows how it is encoded, he will be able to encode his own version. The only failsafe here is signing with a private key. Anyway, this sounds like major overkill, performance-wise.

Well, we've had the PHP input filter in core for years. Evidently, that was not a security concern, else we'd have removed it years ago. IMO, if the function that's actually writing the PHP is very isolated from the user, it's not much different than the PHP input filter.

The PHP input filter is an accident waiting to happen. I'd have liked to remove it in Drupal 4.7...
So, taking a few extra steps of security would be helpful. I can't claim to fully grasp the concepts involved here, but chx' last idea sounds compelling.

#50: please re-read the proposal; it's a fast one and it involves a secret.

@#48 The random secret must be of random size (not just random content), or it must be a signed hash (using a private key) of the file content that the stream wrapper will know. That said, if someone attempts and succeeds in sneaking a stream wrapper scheme into, I don't know, any file upload, we are fucked again.

Anyway, this is still overkill. It will force us to implement a PHP-land stream wrapper that encodes and signs, and worse, that decodes and verifies at runtime; depending on how much code we write, this could have a negative impact on performance. The code will also be really complex, and we will end up with obfuscated generated files, obfuscated include scenarios, an obfuscated bootstrap, and an obfuscated compile process. Once again we risk losing developers who like development but do not like eating spaghetti.

All these convoluted solutions seem quite random and pointless to me. The more complex they become, the less I want to listen to them, and the more I think that we just like complexity for complexity's sake, with no real link to the real world whatsoever. I don't want Drupal code to exist just to be fun, nor to be a showcase that says "yes, we like complex and fun, and we know how to do it, get over it". That's already what has been done since Drupal 6, it generally brings more problems than solutions, and a lot of people outside just hate it.

My point was, and will always be, that we should make the compile process optional and off by default, and let power users/site admins choose whether or not they want it. If they do, they will write-protect their generated files themselves once generated on production sites, and keep them writable in their development environments. End of story, clear as water, simple as 1 + 1 = 2; anybody can understand and do that. Site owners will have full control over security themselves, and will probably delegate it to the operating system using a simple chmod command (which, sorry if that frustrates people, will always be a lot more efficient and secure than us), and the code will remain nice, clean, efficient, and effective, and everybody will be happy in the end.

Whatever happens with this, I'd like it to be possible to just do a compile from the CLI for both the container and Twig templates, for people who have CLI access; we shouldn't enforce anything that happens here on sites that do.

Beyond that, I think it comes down to the relative performance/complexity of the PHP stream wrapper (or any other trickery we might come up with) versus not using compilation at all. Since we're compiling primarily for performance, it'd be good to get some very rough numbers on what this might look like.

To answer #53, let me quote the OP first:

Basically, we run into another case of the classic 3-point balance: Performance, Security, Usability -- Pick two.

#53 argues to offer the following choices and these choices only:

  1. Make your files webserver writeable. (security 0%, performance 100%, usability 100%)
  2. Use CLI. (security 100%, performance 100%, usability 0%)

Now, the issue so far has argued that: a) there is no single choice that works for everyone, so this needs to be pluggable (#42 was about pluggability), and yes, these two are sane implementations; but b) we are looking to ship a third implementation by default, with 100% usability and more than 0% security, which obviously means less than 100% performance. Code complexity was not mentioned so far; it was implied that any less-than-100%-performant solution will indeed involve some coding, and we can't help that. I doubt that code complexity alone means we suddenly do not want to find this solution.

Yeah I'm fine with this being pluggable, so the question again is how much of a performance improvement do we get with the 3rd option.

#54 and #56: as I wrote in #55, yes, we want a CLI option, no doubt. Even if we run numbers now on how bad the non-compiled version is, it truly becomes ouch if we are to use Twig. Twig compilation is not fast. Once compiled, the compiled files are hopefully fast (and you only need to recompile when your files change).

#56 and also how much do we win by allowing, finally, finally, painless module updates?

That also needs performance numbers. If you want to allow upgrading core via the UI, you'd need to have an entirely encoded Drupal, except for just enough to bootstrap the stream wrapper, so the cost is going to apply to pretty much every file.

#55 is all wrong. What I propose is:

  • Disabled by default: 50% performant, 100% usable, 100% security
  • Enabled but not manually secured: 100% performant, 100% usable, 0% security
  • Enabled but manually chmod'ed: 100% performant, 50% usable, 100% security

And yes, I dare to write 50% usable for the manually chmod'ed scenario, because a production site will never have modifications in those files, unless the architecture is rotten to the ground. 50% usable because maybe 50% of people don't like to type chmod manually, but as long as it is documented, they will succeed in doing it.

And yes, I also dare to write that disabled-by-default is still 50% performant, because if we use a stream wrapper that encodes and decodes stuff in PHP-space, not compiling at all will probably turn out to be relatively performant in the end, compared to the encryption mechanism.

And I should have written 75% performant for the chmod'ed scenario, in case you still want to obfuscate everything with stream wrappers; and 0% usable in that case, because obfuscated code is what will lower everyone's ability to fully understand and debug their sites.

Phar archives are a good idea, but I don't think phar archives should be modified manually on a production site; they should be prepared upstream of any public opening, or else they don't have any advantages. Phar brings easy deployment, performance, and security (we can easily check file checksums to ensure they are not modified, etc.). It does not fit a day-to-day auto-modification scenario; that's one of the real nonsenses among all I've read in this thread.

@chx, your ideas are technically nice, pleasing, and fun, but they do not fit the real world: they are complex, hard to maintain, obfuscated, and they infringe every encapsulation principle, by which a production-ready PHP application should never modify itself; that goes against every security and stability concern. A PHP application should never care about compiling itself. That's why an external compilation mechanism that we can use during a packaging and deployment phase is good (exactly what Symfony does), but it is a totally heretical and stupid architectural choice when it starts being done at production runtime.

Just put a goddamn button in the performance admin settings that says "Compile me!", then chmod the file, and leave the security matters to the operating system, which is the only layer we can rely on in the whole vertical stack of layers we encounter in the PHP/web environment.

PS: I think this thread is too convoluted and getting totally off topic, pretty much like the original file storage thread nonsense on g.d.o that predates the CMI implementation. Please just don't build foolishly overcomplex stuff just to put a single PHP file on disk. This is not a runtime matter, it is a deployment matter, and it should be treated as such. If this leaks into the runtime code, where it should not even exist (things such as a stream wrapper), then it is the wrong solution.

I'm tending to agree with pounard (and catch) so far. I think the complexity introduced by some of the solutions here is going to be more dangerous for security and performance than the original problem.

I basically agree with the solution in #60, which is that compilation should not be required in a default install; you should be able to configure the site and file permissions to compile via the UI, but that should give you a big red error like when settings.php is writeable. Finally, you should be able to use CLI tools to compile securely.

I'm not sure how big the DIC or other compiled components will be - if we can keep them relatively small/independent and let Drupal have a way of flagging a particular compiled one as outdated, we ideally won't see a serious degradation of functionality if you need to make a UI change for a site that only allows compilation via CLI.

None of the solutions proposed here have any impact on performance. The stream wrapper is only loaded once; after that, PHP loads the file directly from the APC cache.

@#63 If you have APC; if the files are not modified on a regular basis; if you don't perform other runtime verifications; and if the box has enough APC shared RAM available. Conceptually, it remains an extra layer on top of low-level file access functions. This is neither black nor white. The PHP code complexity remains anyway.

I'm assuming the stream wrapper would be for hosts that have neither cli access nor APC in most instances.

I think we should just consider those files a cache (they are very much not "compiled" in any accepted meaning of the word), and just allow loading and including files from the cache via a stream wrapper.

On low-end hosts, that will mean caching in the database; on higher-end hosts you can implement whatever strategy you want.

So:

<?php
include 'cache://cache_bootstrap/test.php';
?>

Results in:

<?php
cache('cache_bootstrap')->get('test.php');
?>
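A rough sketch of how such a `cache://` wrapper could look, purely for illustration: the static array here stands in for a real cache bin, and none of these names are actual Drupal API.

```php
<?php
// Userland stream wrapper that serves "files" out of a cache backend.
// Registered local wrappers can be include'd without allow_url_include.
class CacheStreamWrapper {
  // Stand-in for the real cache storage: bin => key => PHP source.
  public static $backend = [];
  private $data;
  private $position = 0;

  public function stream_open($path, $mode, $options, &$opened_path) {
    // 'cache://cache_bootstrap/test.php' => bin 'cache_bootstrap', key 'test.php'.
    $parts = parse_url($path);
    $bin = $parts['host'];
    $key = ltrim($parts['path'], '/');
    if (!isset(self::$backend[$bin][$key])) {
      return FALSE;
    }
    $this->data = self::$backend[$bin][$key];
    return TRUE;
  }

  public function stream_read($count) {
    $chunk = substr($this->data, $this->position, $count);
    $this->position += strlen($chunk);
    return $chunk;
  }

  public function stream_eof() {
    return $this->position >= strlen($this->data);
  }

  public function stream_stat() {
    return [];
  }

  public function stream_close() {
  }

  public function url_stat($path, $flags) {
    return [];
  }
}

stream_wrapper_register('cache', 'CacheStreamWrapper');

// Simulate Drupal having written compiled code into the cache bin.
CacheStreamWrapper::$backend['cache_bootstrap']['test.php'] = "<?php return 'hello';";

// The include now transparently reads from the cache backend.
$value = include 'cache://cache_bootstrap/test.php';
```

Swapping the array for a database- or file-backed bin would then be purely a backend choice, invisible to the including code.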

@#66 I'm not against stream wrappers conceptually (Phar works with them, after all); I'm just against making everyday developers care about them. We cannot see this as a simple cache; this is PHP-compiled PHP code. Cache would mean data, not compiled code, and elevating PHP code to the same level as cache makes Drupal care about something it definitely shouldn't, and is a huge constraint for developers.

Putting PHP code into the database makes it even more obfuscated; it means we'd need to be aware of it when doing database dumps, in order to exclude it, or else the dump may not be usable in different environments. It is also even less secure, because when attacking, SQL injection is probably easier than writing to the file system.

We cannot see this as a simple cache; this is PHP-compiled PHP code. Cache would mean data, not compiled code, and elevating PHP code to the same level as cache makes Drupal care about something it definitely shouldn't, and is a huge constraint for developers.

The things we are talking about (compiled DIC, Twig) *are* a cache. There is really no other way of seeing it. That the cache is generated as a PHP file is just an implementation detail. It has all the characteristics of a cache ("a collection of data duplicating original values stored elsewhere on a computer").

Implementing this on top of our cache implementation will make it very flexible, so each site can best leverage the features of the infrastructure it is deployed on. And yes, you can store those cache files as files stored on the disk if you want.

@#68 A compiled DIC is a complete component by itself; it doesn't contain any data, and its place is with the code, not with the data. Storing this in cache is just yet another Drupal WTF where, on badly configured environments, we'd end up with useless extra SQL or other remote server requests where we shouldn't have any. Code can be cached, indeed; I don't need you to define this kind of concept for me. We need to define where it stands in our architecture, and I sincerely hope you don't think it's cache, because it is not: it is definitive code tied to a specific (environment, configuration) tuple that should never be rebuilt or wiped out once put into production, except in case of a code update.

PS: But I don't know where Twig stands in this vision.

Re-EDIT: I went and read the Twig documentation, and as I understand it, Twig templates need to be compiled only once. So in that sense I'd say they are also code, not data at all; if it generates an AST from a lexer and then produces plain PHP code, then code it is. This means we could warm up all Twig templates and compile them once and for all before putting a site into production.
In Drupal, cache is meant for mutable data; otherwise we would have hardcoded it instead of caching it. This is code which is not supposed to change: this is not cache for us.

You can see it the way you want, but if it is generated and duplicates information found elsewhere, it is a cache. The fact that it is generated PHP code doesn't change anything about that.

The concept is relative. In the absolute, yes, it is a cache; but relative to the usage we'll make of it, it is not: it is code. This is plain good old code that needs to be generated only once and then happily live forever after.

Anyway, I'm just leaving this issue; we are fighting a bit too much... Do as you please, just don't WTF this.

cache vs compile:
The DIC will probably depend (among other things) on module enabled status. Am I correct?
If we allow Joe Blogger to enable and disable modules via the UI "post-deployment", then we have to recompile the DIC. And this will need to happen on Joe Blogger's shared hosting environment.

If you don't like the solution we are seeking for performance reasons, then don't use it and stick to native files. Simple enough? Yes, it will be slower, but a) we need to dump PHP for Twig (I can benchmark how long it takes to compile a template, but I can assure you it's going to be horrible), and b) the possible usability improvement for module upgrades will be massive. So I would say the performance argument is moot...?

@chx Ok, if we try to make peace between us too, I think I envision things this way:

Stream wrappers aren't that bad, as long as they have zero leakage into any custom code.

I'm against doing magic trick in files, but we can indeed generate a Phar. The only security in the end is chmod'ing this file correctly (just like we do for settings.php).

That's the security matter I was talking about when suggesting compilation be optional by default, leaving a checkbox during install, or on some performance screen, saying "Yes, I will chmod my file myself". Thus we leave it secure for newbies on shared hosting.

So, considering all your proposals: making the compiled file destination pluggable, providing both a native physical file and a Phar destination in core, enabling one or the other by default depending on the environment at install, and leaving the choice in some form at the same time, all of this sounds good to me.

But this generation must remain predictable as to when it will happen (module install, etc., or a manual button click), because paranoid site admins will want to re-chmod just in case.

So I think we're leaning toward some sort of modified version of 5) from the summary. Things I've gotten from the discussion: we want PHP, and we want it to be able to happen from the UI in one form or another, not just from the CLI. We want some sort of interface so we can stream-wrap some security features or write to specific directories controlled by the user. I think there are still some implementation details to suss out around the invalidated-compiled-data problem pwolanin mentioned, but we're getting somewhere. Should we update the summary and start outlining some tasks?


Update summary.


Rewrote the proposal

This discussion may look like chaos, but imo we are on a track that is leading forward :)

I attempt to summarize relevant attacks and countermeasures.
I think something like this should go into the issue summary: Possible attacks, attack conditions, and countermeasures.

In #45-a I named the attack type that I think is most relevant for this issue:
Attacker puts a malicious php file in a place that will be executed by Drupal.

To pull it off (attack a), the attacker needs three things:
1) Know where to put the file / php.
2) A vulnerability (file injection) that allows to put the file / php in the respective location.
3) Engineer the file contents so that they will actually be executed by Drupal.

Countermeasures:

c1) random dirname
Make the write location hard to guess, e.g. by using the random character sequence as dirname.
It also helps to have this location not web-accessible, to avoid Apache accidentally revealing the dirname.
This can prevent attack condition (1), if the attacker cannot figure out the random dirname.
Also, if the vulnerability in attack condition (2) applies only to a specific location, then the random dirname can make it less likely that the vulnerability applies to the codegen location.

c2) code in database
Alternatively, store the code in the DB, and use a stream wrapper to include / execute.
Prevents attack conditions (1) and (2), because it is no longer in the filesystem.
However, now the attacker could figure out the db location, and abuse an SQL injection vulnerability, or a poorly written module that writes unfiltered content to the db. We just moved the problem around.

c3) "signature" on the php code
Only execute generated php, if it has the correct signature. This may involve checking a salted hash of the file / php, which probably has performance implications whenever we execute this php.
Again this can be implemented with stream wrappers.
The countermeasure can be applied to any generated code, no matter if it is in file or database.
Prevents attack condition (3), if implemented correctly.
Does not change attack conditions (1) and (2).
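Countermeasure (c3) could be sketched like this, assuming the signing key lives somewhere www-data cannot write to (e.g. alongside settings.php). The function names are illustrative; only `hash_hmac()` and `hash_equals()` are core PHP.

```php
<?php
// Sketch of (c3): sign generated code with an HMAC and refuse to run
// anything whose signature does not verify.
function codegen_sign($code, $key) {
  // First line carries the signature; the rest is the payload.
  return hash_hmac('sha256', $code, $key) . "\n" . $code;
}

function codegen_verify($signed, $key) {
  list($sig, $code) = explode("\n", $signed, 2);
  // hash_equals() is a constant-time comparison, avoiding timing leaks.
  return hash_equals(hash_hmac('sha256', $code, $key), $sig) ? $code : FALSE;
}
```

As noted above, recomputing a hash of the whole file on every request is exactly where the performance cost of this countermeasure comes from.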

c4) chmod deny write access for www-data (#75)
Not an option on many sites. Joe Blogger wants to enable and disable modules without having to FTP around in the filesystem. The code needs to be generated by www-data, and thus www-data needs write permission.

c5) Do not execute generated code
Only execute native php code, which is write-protected from www-data.
I think we don't want this.

EDIT:
c6) Put those files outside of web root.
Prevents those files from being executed by Apache. (which might not be any worse than native drupal php files being executed by Apache)
Also, if we have a secret/signature at the top of each file (c3), this countermeasure helps to avoid leaking of this secret.

EDIT:
c7) Protect those files with .htaccess.
Little brother of "put them outside of web root". Only works with servers that support .htaccess.

For all the suggested countermeasures there can be vulnerabilities (badly written modules etc.) that would allow circumventing them, but that might not circumvent the other countermeasures. Thus, it could make sense to implement more than one of the countermeasures.

I posted the proposed resolution into the summary. It seems we are forming an agreement around it.

I'm OK with this summary; it sounds like a good compromise/middle ground. Glad that we made it.


Forgot to close my list

If the security team is OK with this plan, I am too.

A stream wrapper reading-writing PHP files with a header consisting of <?php exit; ?>randomstring

Is the plan to also use .htaccess to completely block outside access to these files?

If not, I don't see how this approach adds security, because either the PHP file is directly executable (in which case an attacker can overwrite it and execute it entirely outside of Drupal), or it's not directly executable (in which case an attacker can visit the original file in their browser, view its contents including the random string, and then use the random string when they overwrite the file and get Drupal to execute the new version for them).

So either way, wouldn't we still have the "file upload vulnerability turns into a remote code execution vulnerability" identified earlier?

Using .htaccess to prevent all direct access to these files would solve the problem, but then we'd be relying an awful lot on .htaccess for protection. Maybe we already do...

(Another solution would be to use a hash of the file's contents rather than a random string, of course; that would solve it security-wise, but maybe not performance-wise.)

@#81 If the file extension is .php, any HTTPd will try to run it without .htaccess. As long as directory listing isn't on, the only flaw here is that the attacker can tell whether a file is in there or not, but cannot do much with that.

@#81 If the file extension is .php any HTTPd will try to run it without .htaccess.

Yes, but the .htaccess file which Drupal puts in the public files directory will prevent this, and will instead result in downloading the .php file to your local computer (where you can read its contents).

So, for certain types of file upload vulnerabilities we are already relying on .htaccess to prevent them from escalating into arbitrary code execution vulnerabilities... My main point above was that the plan being discussed here sounds like it will require adding further .htaccess protections, and relying on those also.

Oh yes I see, indeed, valid point.

c2) code in database
Alternatively, store the code in the DB, and use a stream wrapper to include / execute.
Prevents attack conditions (1) and (2), because it is no longer in the filesystem.
However, now the attacker could figure out the db location, and abuse an SQL injection vulnerability, or a poorly written module that writes unfiltered content to the db. We just moved the problem around.

No, you have not moved the problem around. If you can abuse an SQL injection vulnerability, you can do much more, and actually writing into this PHP "cache" (or whatever we call it) would be more complicated than simply changing the uid 1 password.
Writing PHP files into storage adds another attack vector. Writing them to the SQL cache does not add a new attack vector; the vector already exists.

#85 (meba)
You can do a lot with a uid 1 login, but can you really do everything? OK, if there is any module which can eval() admin-provided PHP, then yes. Otherwise, maybe not?
Sure, you can totally destroy everything that is in the database. But can you destroy any code? Or can you attack other sites on the same server? Depends on a number of things...

Also, it doesn't have to be an arbitrary SQL injection vulnerability, it could also be a stupid module that writes stuff to the wrong table if you mess with it. Ok, this is probably less likely than a simple SQL injection vulnerability.

In fact this is a lot easier to abuse with PHP input formats than with code generation in the db, because code generation is probably not designed to let any user input come anywhere close.

So.. you might be right :)

So, for certain types of file upload vulnerabilities we are already relying on .htaccess to prevent them from escalating into arbitrary code execution vulnerabilities... My main point above was that the plan being discussed here sounds like it will require adding further .htaccess protections, and relying on those also.

I would say, yes.
We already use .htaccess to protect regular php files:
http://drupal.org/modules/system/system.module
http://drupal.org/sites/default/settings.php

Unfortunately, this protection is not effective with lighty or any server that does not support .htaccess.

The generated files probably don't contain anything more dangerous than regular core and contrib php. Except for the secret "random string", which we do not want to reveal to a visitor - unless we do the hash thingie, which has performance issues.

But yes, we should definitely protect those files via .htaccess. And to do this, they should better live outside the public:// directory, e.g. in
sites/default/codegen

Of course, the best would be to have them outside of web root - similar to the private://. Can we ask Joe Blogger to create a directory outside of web root?

-----------

Btw, full-stack Symfony typically has all PHP files outside of web root.
The problem with that is that css, js, and images shipped with modules need to be copied or symlinked to a web-accessible location.

---------

All of this said:
Making those files not web-accessible solves only one part of the problem. The directory will still be writable by httpd.

Of course, the best would be to have them outside of web root - similar to the private://. Can we ask Joe Blogger to create a directory outside of web root?

One thing is sure, whatever is the default path, it will be configurable by a variable such as:

$conf['system_compile_directory'] = 'sites/default/compiled';
// Or
$conf['system_compile_directory'] = 'something://somepath';

Could we require to set this? Is this feasible on shared hosting?
If Drupal does not find this configured, it will simply show an "I am sorry" page.

And, would it be a good idea to combine this with the private files dir and cache?
E.g.

<?php
/*
 * This will create
 * /var/projects/xyz/storage/private for files
 * /var/projects/xyz/storage/cache for file-based cache
 * /var/projects/xyz/storage/codegen for code generation
 *
 * None of these should be web-accessible.
 */
$conf['storage_directory'] = '/var/projects/xyz/storage';
?>

In general I think we don't assume that Joe Blogger has the ability to create directories outside of the web root, unfortunately.

However, here's a crazy idea (and not fully thought through)... Can we have the default location of this directory be somewhere inside temporary://? Pros:

  1. This location should exist on all systems and will almost never be in the web root.
  2. In the unlikely event that it is in the web root, Drupal already has code to put a "Deny from all" .htaccess file in there anyway (see file_ensure_htaccess()).
  3. This directory is only used to store cached data, so having it in /tmp (where it can disappear randomly) is not really a problem.
  4. The /tmp directory is not too likely to be backed up (at least compared to the directory the Drupal site is installed in), so that reduces the chance of the random string ever leaking out to other machines via file system backups.
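For reference, the kind of hardening a `file_ensure_htaccess()`-style helper would drop into such a directory looks roughly like this. This is an illustrative minimal version, not the exact file core writes:

```apacheconf
# Deny all direct web access to generated code in this directory.
# (Apache 2.4 would use "Require all denied" instead.)
Deny from all

# Belt and braces: even if the deny is lost, never execute PHP here.
<IfModule mod_php5.c>
  php_flag engine off
</IfModule>
```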

One problem with the tmp folder is that some operating systems clean it up on a regular basis. Tmp is for temporary files, and it's semantically wrong to put our code, which isn't temporary, into it.

Or rather: We want to control ourselves when to clear this stuff.

In general I think we don't assume that Joe Blogger has the ability to create directories outside of the web root, unfortunately.

Are you sure? I thought on most shared hosting setups with FTP you have a /www/ or /htdocs/, and then some other directories to play with.

So there are two discussions raging here, when we've already decided they're separate implementations. Just for sanity, should we split up the implementation discussions and start talking about the interface here? We're climbing in comment count really quickly, and the scope of the issue is still fairly open.

#94:
Most of the comments in here have been about implementation: Where and how to put generated code, so that it is secure.
So if we split up, then the "interface" talk should go elsewhere.
Otherwise, all the implementation arguments would have to be repeated in the new thread. Not nice.

I think it makes sense to agree on a secure solution first. It is good to know for follow-ups that a secure solution does exist, so we can then decide how exactly to use it.

The comment count is justified, so we all understand each other's security concerns.

The implementation talks have been about reaching a consensus on goals. Now that we have some consensus, it would make sense to summarise the opinions expressed here and debate them individually. It is never reasonable to discuss two tasks in the same issue, and it seems clear from the open nature of the OP that this was always meant to be a meta issue.

In general I think we don't assume that Joe Blogger has the ability to create directories outside of the web root, unfortunately.

FWIW, most hosting companies that I've worked with have had either /home/username/htdocs or a way to set your own web root from cPanel. I don't think it's an unreasonable expectation to set a directory outside the web root. In fact, I'd really like to see Drupal have a directory called "public" specifically for the webroot, like full-stack Symfony or Rails, but I suppose that's for another issue.

Going outside the webroot, I really dunno; my memories of cPanel are foggy at this point, but if you add a new domain and an FTP user to it (the typical shared host scenario), what will the FTP home dir be by default? I think the docroot.

At site5.com, if I add a new domain to my hosting account, it creates /home/cweagans/public_html/domain.com and sets the docroot to that, but there's a text box that lets you change it.

We already recommend that private files be outside the webroot. Why don't we just recommend the same for generated PHP (but not enforce it)?

ok with #99.
enforce or not, I don't have a strong opinion.

#97:
See #1672986: Option to have all php files outside of web root.
(no idea if that's a duplicate, I did not find an existing one)

Having the files outside the webroot is a good idea, but I'm not sure what that gets us other than preventing them from being directly executed by the webserver?

As for temporary://, it is a good idea, but do we expect shared hosts to have the brains to set it up in a way that one user can't see another's temp files? I guess yes. I liked it enough to provide a tentative patch at #1673084: PHP files in the temporary directory

In a tmp directory, this only works if we lazy-generate those files when they are needed, instead of only when a rebuild is requested. The lazy compile could be plugged into the autoloader.

Otherwise, any clearing of the tmp folder will kill the site.
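The on-the-fly rebuild could be as simple as this sketch (illustrative names only; the real hook would live in the autoloader or wherever the compiled artifact is first requested):

```php
<?php
// Sketch: treat the compiled file as a disposable cache entry. If tmp
// was wiped, regenerate before including, so a cleared tmp directory
// degrades to one slow request instead of a broken site.
function codegen_include_or_rebuild($path, callable $rebuild) {
  if (!file_exists($path)) {
    // $rebuild() returns the PHP source, e.g. a freshly dumped container.
    file_put_contents($path, $rebuild(), LOCK_EX);
  }
  return include $path;
}
```

The rebuild callback only fires on a cache miss, so the common case stays a plain include.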

Yes, we need on-the-fly rebuilding. No doubts there...

Since this issue discusses performance vs. security trade off, I opened #1673162: Analyze performance of uncompiled and compiled DI containers to discuss the performance part in more detail.

One other (slight) advantage of tmp is that it is (or should be) always local, so it would avoid some of the potential conflicts of trying to write to the same files at the same time on multi-webhead setups that use a network mount for the site code. There is also a slight performance benefit (tmp is likely to be less write-contested and less latent than the network share).

It's hard to say whether shared hosts are likely to have a misconfigured tmp that leaks files between users; we could check a few, perhaps, but my guess is that those that have this problem likely have it for home directories too :)

Hmm... isn't tmp/ inviting file injection?
There are typically no strict conventions on which section of the tmp/ dir belongs to which application.
So if a poorly written web app (on the same server) allows easy file injection into arbitrary locations in /tmp, then we would execute those files unquestioned.
And/or, this poorly written web app could also be used as a door to read files from tmp, if we have any secret in there.

So, to say it again:
We need those files to be writable and readable by httpd.
BUT, we want to reduce the chance that these files are affected by a leak or vulnerability in or outside of Drupal, which could allow an attacker to either read or write or execute those files.
(and I think we should worry most about the "write" aspect)

One thing that crossed my mind comes from my experience working in web hosting support. Joomla! has a requirement for a temporary directory - I can only find that it is used for uploading files - but the majority of support issues with Joomla! were somehow related to the location, permissions, or server configuration of that tmp folder. On the other hand, I never had a support request for installing Drupal - and there you need to copy/paste settings.php - so it's probably a culture-specific issue, and a requirement to create a tmp directory in the Drupalsphere might not be a problem. :)

Another thing: AFAIK 99% of shared hosting providers do separate the FTP root from the document root folder - but the question is what the path of the open_basedir PHP directive is, if it is defined (mostly in virtual host files). If the path is the FTP root (as my former company had), then everything will be fine, but if the path is equal to the document root, then we have a problem.

My two cents..

@donquixote, can you please stop editing your comments? It's hard to see what changes between edits and it messes up comment history (especially if somebody replies to something that's no longer there). If it's something that belongs in the issue summary, then you could probably put it there. Otherwise, just post a new comment :)

@cweagans,
I usually mark my edits as such. And I usually only add to comments, and don't remove.
The comment in #77 has a summary aspect about it, this is why I wanted to update it. I would not mind seeing it in the issue summary, but so far my comments have not received as much approval around here (my perception), so I did not do it.

But yes, I should simply do it. Attacks and countermeasures definitely belong in the issue summary, this is what we have been talking about all the time.

Putting PHP files in /tmp smells pretty bad to me. I think Owen brings up a good point about file location and sharing among web heads, but I'm not sure that writing to local storage on each web head makes things better or whether it rather encourages a split-brain.

I'm leaning even more toward never compiling by default nor requiring it for install. Use secure CLI tools if you need a production website; otherwise you are in development mode.

Use secure CLI tools if you need a production website; otherwise you are in development mode.

I don't think we can require CLI to get out of "development mode". While I don't think it's a big deal to use the CLI, random bloggers would consider that a problem. We want to increase adoption of Drupal, not hinder it.

tmp files would kill the side idea of using this for upgrades, but that's fine. I would say that the likely permission on /tmp files is 600, so it's unlikely the files leak. And because of the random value we intend to write into the file, it's not exactly easy to create a legitimate-looking fake to be included. #105, again, Twig.

@#111 I agree with pwolanin here, tmp indeed smells bad. The "temporary://" scheme is used in a lot of places in Drupal, and I guess it's easier to end up with "temporary://some_injected_value" at some point than with malicious code written to a specific directory for which there is no UI-based access at all. This means tmp would be an easy attack vector for modifying our compiled files.

I'm with @pwolanin in #101 - the only thing that the temp idea resolves is the location outside the web root, so files cannot be executed through web server. That statement is also limited, because if someone manages to make the web server (over)write a specially crafted file to temp, Drupal will use it. Thus, I don't think using temp is any different from using the public files directory (or a new dir) - except for the potential world-readability (if one knows path and filenames).

@pounard in #91 makes a strong point, too - it's not guaranteed that temp has the semantics that we assume. If the system has a temp garbage collector running every 2 hours or whatever, then the compiled files will vanish and have to be rebuilt, and we won't be able to influence that.

Lastly, @Owen Barton's point in #106 might even be the most concerning. If there are multiple Drupal sites on one server, all writing to the OS's temp filesystem, and the web server writes the files, then an attacker just needs to gain access to temp as the web server user in order to pwn all sites. Apparently, that is the #1 case of backdoors out there, which even script kiddies can get to work. All you need is code or a form somewhere else on the server that is known to be insecure or has an exploitable security vulnerability (e.g., not belonging to Drupal at all). Once your script/code lives in /tmp, you have everything in place. You don't even need to care about impersonating another user or learning file locations; it's all there. You can hijack all Drupal sites that live on the server, without even knowing what they are or where they live - just screw them. That would sound very tempting to me if I were in that business.

Agreed with sun, tmp is not a good location for many reasons.

My opinion is that SQL so far is the only a) secure b) permanently available storage we have. If it's a stream wrapper then you can by default write to SQL, for advanced usage write to files or memcache.

#116
find-in-page "SQL injection". (esp #67, #85 and #86)
Is this an issue, or is it not?

Could we have a separate database connection that is used only for generated php code?
And can we trust Joe Blogger to set this up correctly, with a secure password?
It would definitely create an extra burden when writing settings.php, and another to create this database in the first place.

If we do this, we should definitely avoid making the credentials available everywhere via DIC or globals.
Ideally, settings.php will write $api->setCodegenDb(...), and this will not tell the key to anything else.

I'm pretty sure writing the code to the database is not a good idea, as it turns all SQL injection attacks into remote code execution vulnerabilities. That sounds pretty bad to me.

If you can SQL inject then we presume the world has ended already. See the system table changes above. Not good. So, let's refocus. We have considered and thrown out the temporary-directory idea. We really would like to write outside of the webroot, but we are not 100% sure that's feasible.

While it has been raised "do we even want to compile" that is IMO not a question because support for PHP write gives us not one but these three advantages:

  1. Twig. Edit: sure, we can benchmark how long it takes to compile Twig templates if it's not written out, but I can assure it'll be horrific.
  2. DIC speedup no matter how little.
  3. Easy on-the-fly updates of modules/themes and even most of core -- aside from the file loader API, we can override everything.

Given this, what do we have against the proposal in the summary?

what do we have against the proposal in the summary?

Here's my current thoughts:

We provide a very small, very tight API, pluggable via settings.php (yes, this will be a different plugging mechanism) that can: 1. Write PHP files, 2. Load PHP files, 3. Tell the rest of the system whether 1. is available.

I'm +1 for this. I don't quite understand why we need to invent a new API here instead of using stream wrappers directly, which I think satisfy all these needs, but that's a detail we can mull over later, it doesn't affect any important decisions needed at this time.

The implementations we ship with are: Bare files [and other variants that involve writing files].

From what I'm reading on this issue, I think this is a problem. I think a better default implementation is something database-based, whether the default stream wrapper implementation queries the database directly, or whether it does it via a cache bin which by default is mapped to a database implementation. Because we've already accepted that an arbitrary SQL injection attack in Drupal leads to arbitrary code execution (whether by setting password for uid 1 and then enabling php.module, or whether by specially crafted values in {menu_router}.load_functions), so this would not open any new attack vector. Whereas, it seems anything based on the PHP process writing code to the file system either opens new attack vectors, or requires the PHP process to also chmod (which is vulnerable to failure/race conditions), or requires the site administrator to manually chmod (which is terrible UX for the majority of Drupal sites). Again, I'm only recommending a database-based implementation to be the default one, and I'm assuming that for any site for which this isn't the best choice, settings.php can be adjusted to do something different.

Con: I have no idea if the performance of that would be any good, or if it's as bad as eval().

Answered by #7.

Introduces an extremely early database dependency

Only for sites using this default implementation, but these are the same sites that can't do anything (not even serve cached pages) without a database connection anyway. If you're able to map a cache bin to something other than a database, then you can also map the codegen stream wrapper to something that doesn't need the database.

and in turn makes putting the database connection in the DIC a circular problem.

For Twig, not a problem. For the DIC, this is annoying. I don't know how to address this elegantly yet, but I think there's some way to deal with it: for example, when a stream wrapper implementing a database backend is asked for the DIC itself, connect to the db manually (not via the DIC), while in all other cases the db connection can be a DIC service. Maybe there's a more elegant approach, but even if we resorted to this, I'd be fine with it.

If a core implementation has an early dependency on the database it causes huge headaches because then we can't architect things to work before the DB is there.

Stream wrappers: no IDE can debug across a userspace stream wrapper, and nothing beats native PHP for speed. Also, stream wrappers are by nature slower than what an API provides -- they do stat calls, read into a buffer, check for EOF, etc. All of this is unnecessary. Not to mention confusing and dangerous if someone uses a stream wrapper meant for files for this.

I want to address the split-brain problem briefly: if you have multiple web frontends, surely you are using some deployment tools.

However, by now, I am thinking perhaps on changing the proposal to:

  1. Either you are local writeable,
  2. Or you must provide a writeable dir outside of the webroot.

If you are local writeable then we just write .php files somewhere, doesn't matter where. This can be an install-time thing and for the local writeable crowd they would never see this option during install. How does this sound?

Re-read the whole issue, #101 asked "Having the files outside the webroot is a good idea, but I'm not sure what that gets us other than preventing them from being directly executed by the webserver?" Well, let's re-iterate the possible risks:

There is a file upload vulnerability and the attacker manages to upload a file which then gets loaded and executed. We planned to protect against that with the header trick, on the theory that the attacker can't know the header. The randomized directory is of limited help because PHP loves to disclose file paths on fatals and such.

It would seem that only stream wrappers can help because then we have clean separation between uploads and PHP: they are using different stream wrappers and the wrapper can make sure things are signed properly. #121 lists arguments against stream wrappers, however.

I have searched the PHP manual for a while trying to find some stream functionality, and look: http://www.php.net/manual/en/function.stream-filter-register.php . We do not need a full-blown stream wrapper; we need a simple filter to verify there is no evil intent. Quite possibly the simplest approach is to add a comment to the first line of the code containing a non-changing random value. Example here: http://www.codediesel.com/php/creating-custom-stream-filters/ .

So... #121 with a user filter?

Edit: I created a dummy filter (copying the codediesel example above) https://gist.github.com/3065563 see here how reflection sees the filename just fine and so I hope xdebug will cope just as well.

Edit2: I ran the gist 100 times with microtiming the include time, tossed the 5 slowest and 5 fastest, the resulting 90 averaged 0.136ms for the filtered include and 0.055ms for the raw include.

Edit3: we can make the filter disableable (is that a word?) from settings.php for added speed for those who want that sort of compromise between security and performance. Works well for people who chmod for the time when they change something needing a rebuild, run CLI etc.
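A minimal sketch of such a user filter, simplified from the gist above. The `drupal_code` filter name, the `VerifyHeaderFilter` class, and storing the secret in `$GLOBALS` are all illustrative assumptions, not the actual patch:

```php
<?php
// Sketch: a userspace stream filter that refuses to pass a compiled PHP file
// through unless its first line carries the expected secret comment.
// The secret would come from settings.php; a fixed value for this sketch.
$GLOBALS['codegen_secret'] = 'sekret123';

class VerifyHeaderFilter extends php_user_filter {
  private $buffer = '';
  private $checked = FALSE;

  public function filter($in, $out, &$consumed, $closing) {
    while ($bucket = stream_bucket_make_writeable($in)) {
      $consumed += $bucket->datalen;
      if (!$this->checked) {
        $this->buffer .= $bucket->data;
        // Wait until the whole first line has arrived.
        if (strpos($this->buffer, "\n") === FALSE && !$closing) {
          continue;
        }
        $expected = '<?php // ' . $GLOBALS['codegen_secret'];
        if (strncmp($this->buffer, $expected, strlen($expected)) !== 0) {
          // Header missing or wrong: refuse to stream the file at all.
          return PSFS_ERR_FATAL;
        }
        $this->checked = TRUE;
        $bucket->data = $this->buffer;
      }
      stream_bucket_append($out, $bucket);
    }
    return ($this->checked || $closing) ? PSFS_PASS_ON : PSFS_FEED_ME;
  }
}
stream_filter_register('drupal_code', 'VerifyHeaderFilter');
```

A generated file would then be loaded via `include 'php://filter/read=drupal_code/resource=' . $file;`, so only files carrying the header ever reach the PHP parser.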

Whereas, it seems anything based on the PHP process writing code to the file system either opens new attack vectors, or requires the PHP process to also chmod (which is vulnerable to failure/race conditions)

If the PHP file:

(a) has the random string at the top,
(b) is protected by .htaccess, and
(c) has file permissions equal to 600 (which doesn't seem like it should have a race condition, and if we can't chmod it maybe we just bail out and don't write the files?),

then I'm not sure I've seen any example discussed in this thread where there would be a new attack vector that doesn't already exist in Drupal today. If there is one, what specifically are the steps?

(Perhaps if someone manages to get their hands on the random string via a backup stored on another server and then reuses it in a file upload vulnerability attack later... but that is a really complicated combination, and we could also use a new random string each time the file cache is cleared to help mitigate against that.)

We do not need a full-blown stream wrapper; we need a simple filter to verify there is no evil intent. Quite possibly the simplest approach is to add a comment to the first line of the code containing a non-changing random value.

Yeah, I was actually wondering what the <?php exit; ?> at the beginning of the file even helped with. <?php // randomstring ?> by itself seems like it would be equally fine, and a lot less confusing to someone looking at these files.

***

Regarding /tmp, I'm happy to drop the idea if people are uncomfortable with storing code in there. The goal was simply to reduce the attack surface (by reducing reliance on .htaccess) but we can never get rid of the reliance on .htaccess anyway, so it's not a huge deal. That said, I don't really understand the argument that /tmp would be less secure. (If someone controls the webserver user, they can take over the site regardless of where these files are stored, and it's not that much more convenient for a script kiddie who has such access to search through /tmp for the vulnerable subdirectory than it is for them to search through e.g. /var/www/*/sites/default/files.)

(And by the way, I love the idea @chx proposed that if we're really lucky, we can potentially reuse this system for updating modules/themes via the update manager... which of course we couldn't do with /tmp.)

A custom filter sounds even better than a custom stream wrapper. According to chx's benchmarks, the custom filter doubles the read time... all the way to 0.1ms. I can live with an extra half millisecond when it's saving us far more than that per effulgentsia's benchmarks just on the DIC, to say nothing of Twig. And if someone doesn't want to spend the extra half millisecond, they can use CLI-only generation. Problem solved.

For file location, defaulting to in-webroot with an easy option to put it outside-webroot, just like private files, seems like a reasonable approach.

So when can we see an implementation? :-) Can haz patch?

Yes, you can haz #1675260: Implement PHP reading/writing secured against "leaky" script. Right now I am working on verifying that APC can cache this. /me puts in I can't cache.

APC 3.1.11-dev (only APC trunk contains Damien's fix) still can't cache php:// streams, because that stream wrapper does not support stat and APC runs a stat to verify the file exists. I think this is a bug in APC, and I filed a bug report at https://bugs.php.net/bug.php?id=62502 with a patch. Let's hope it gets accepted :) I have verified that apc_delete_file() also works after the patch, which we will need if we go down this path.

Meanwhile, I hope we can continue with the optional filter -- at worst we need to say you must disable the filter to use APC, which would be a serious bummer, but let's hope the patch gets accepted.

chx asked me to look at this issue. I've scanned it, but could not bring myself to read it all carefully. I think a lot of the pushing back I would have done has already been done by others. :-) My summary:

Drupal must work when 100% of its code comes from a single VCS repo. If we depend on writing any code at runtime, Drupal is no longer deployable.

It's fine for Drupal to take advantage of a PHP-writeable docroot for optimization. If it does, however, it has to assume that it is writing to a non-persistent cache that can disappear at any time (outside of the current page request). Even in dev mode, we do not want Drupal to assume it is running on a single web node or with a shared filesystem for code.

I'm not sure I see the advantage of the machinations to try to make PHP-written code "more secure." If an upload vulnerability exists that allows an attacker to write a file to the code deployment area, it's already game-over. If we want "drush pm-install" to work from the UI, are we going to require that all the module files get loaded via the stream wrapper for all time? I suggest we say, "A writeable docroot is more functional, but less secure. You choose." and be done with it.

Re #128: sure thing, if you have already written everything out and deploy it, then you are good. The possibility of a read-only configuration was raised multiple times in here.

If an upload vulnerability exists that allows an attacker to write a file to the code deployment area, it's already game-over. Well, see, we want to avoid game over.

If we want "drush pm-install" to work from the UI -- oh yes we want so so much! In D7 we made some first steps... more here :)

are we going to require that all the module files get loaded via the stream wrapper for all time -- yes. No. :) That depends on your configuration settings: if you set $conf['php_prefix'] then we use php:// to avoid that game-over; otherwise we don't.

#128

If an upload vulnerability exists that allows an attacker to write a file to the code deployment area, it's already game-over.

The difference is that nowadays in D7
- files in the upload dir are never executed (thanks to .htaccess)
- any code that we do execute is write-protected from httpd with chmod.

The new thing is that we would create a new place for code that is httpd-writable and that is meant for execution.

Time for another summary, perhaps? We want to write PHP files and are afraid of "leaky" upload scripts which might not be Drupal. Proposal is to optionally add a header containing a fixed random to the PHP files and verify it via a userspace stream filter. We offer the following choices, percentages are compared to each other and not accurate but gives you an idea.

  1. Do not let the web server handle PHP writing (either use CLI or chmod each time by hand). Performance 100%, security 100%, usability 0%.
  2. Let the web server handle PHP writing into a directory inside the webroot without a PHP header. Performance 100%, security 0%, usability 100%.
  3. Let the web server handle PHP writing into a directory inside the webroot with a PHP header. Performance 50%, security 50%, usability 100%.
  4. Let the web server handle PHP writing into a directory outside of the webroot without a PHP header. Performance 100%, security 50%, usability 100%.
  5. Let the web server handle PHP writing into a directory outside of the webroot with a PHP header. Performance 50%, security 100% (it's still not the same as CLI only, but for the attacks described in this comment, it is), usability 100%.

@chx, @David_Rothstein: I trust you to take security seriously, so if you think it's worth pursuing a file based approach, I won't get in the way of that. Just some replies to a couple of your comments:

If a core implementation has an early dependency on the database it causes huge headaches because then we can't architect things to work before the DB is there.

I don't get that. As long as the thing with the early database dependency can be swapped out via $conf to use an alternate back end, then we can handle install pages, db-offline maintenance pages, and write tests to ensure our architecture works without the DB in environments that have an alternative (well, we will be able to do that once #1576322: page_cache_without_database doesn't return cached pages without the database is fixed, but that's already a critical issue regardless).

Perhaps if someone manages to get their hands on the random string and then reuses it in a file upload vulnerability attack later

Yes. I was assuming that a fixed random string wasn't sufficiently secure (at least for any file in webroot, and that relying on non-webroot isn't an option for any *default* installation of Drupal). If we're willing to trust the security of the random string, then I agree we can make a file-based solution work.

we could also use a new random string each time the file cache is cleared to help mitigate against that

Nope. I'm assuming that we don't want to store the key in the database, if we're trying to avoid the early db dependency, in which case the key would be in settings.php which in production should never be written to by the PHP process. However, I suppose we could do something like have the random string in the file be a result of a settings.php salt and the file's timestamp, if we think that an upload vulnerability isn't also vulnerable to timestamp tampering.
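That salt-plus-timestamp idea could look roughly like this. A heavily hedged sketch: every name here is made up, and whether mtime is tamper-proof enough is exactly the open question raised above.

```php
<?php
// Sketch: derive the expected header token from a secret salt (which would
// live in settings.php) and the file's own mtime, so the web process never
// has to store a key anywhere it can write. All names are illustrative.
function expected_token($file, $salt) {
  clearstatcache(FALSE, $file);
  return hash_hmac('sha256', (string) filemtime($file), $salt);
}

// Writer: sign for a chosen timestamp, then pin the file's mtime to it.
function write_signed($file, $code, $salt, $time) {
  $token = hash_hmac('sha256', (string) $time, $salt);
  file_put_contents($file, "<?php // $token\n" . $code, LOCK_EX);
  touch($file, $time);
}

// Loader: only include the file if the header matches the mtime-derived token.
function verified_include($file, $salt) {
  $fp = fopen($file, 'r');
  $header = rtrim(fgets($fp), "\n");
  fclose($fp);
  if ($header !== '<?php // ' . expected_token($file, $salt)) {
    return FALSE; // Timestamp or header was tampered with: refuse.
  }
  include $file;
  return TRUE;
}
```

The nice property is that any write to the file (which bumps mtime) invalidates the existing header, at the cost of trusting that an attacker can't also forge the mtime.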

requires the PHP process to also chmod (which is vulnerable to failure/race conditions)

Just to clarify what I'm referring to here: suppose we think we can have security in a situation where the PHP process owns the file, by keeping the file read-only (400) most of the time, only briefly chmod'ing it to 600 to perform the write, and then resetting it to 400 afterwards. That is what I'm skeptical about, because there can be failures that leave the file writable (and vulnerable to an upload vulnerability), as well as race conditions that someone exploiting an upload vulnerability could exploit. If we think our random key is sufficiently secure, we shouldn't need this extra bit of chmod'ing; and if we don't think our key is sufficiently secure, then we shouldn't delude ourselves into thinking a little extra chmod'ing compensates.

Thinking about this some more:

1) By default, Drupal installations have a PHP-writeable folder (sites/default/files or whatever public:// is mapped to). Although file_save_upload() takes measures to avoid saving PHP files to there, a faulty custom module or non-Drupal script can result in a PHP file getting saved there, at which point, the only thing stopping this file upload vulnerability escalating into a remote code execution vulnerability is the .htaccess file created by file_ensure_htaccess(). Given that we already rely on file_ensure_htaccess() to prevent direct execution of rogue PHP files in sites/default/files, I see no reason why relying on file_ensure_htaccess() to prevent direct execution of rogue PHP files in some other folder (e.g., sites/default/codegen) poses any greater risk. Sure, we can and should recommend for people to place the codegen directory out of webroot entirely since that's good practice, but even if they don't, I don't see the mere existence of a writeable codegen directory in webroot as any more vulnerable to direct execution than is the existence of a writeable sites/default/files.

2) Next, we need to ensure that Drupal not include()/include_once() a rogue file. This is unaffected by whether the file is in webroot or not, or whether the file is .htaccess protected or not. So, what we need is for the file to contain some token in it that can't be forged, and for us to validate that token before calling include() (this validation can be included as part of a custom stream wrapper or stream filter or whatever other implementation we want to consider). The suggestions in this issue so far have focused on a fixed private string that would be added to each file (and be the same for each file). This requires us to implement security measures to prevent the string from getting disclosed. If we're already trusting file_ensure_htaccess() anyway, then we should be as confident in "Deny From All" preventing direct disclosure as we are in "Options None" preventing direct execution. Or, for site owners who follow our recommendation to put the codegen directory out of webroot, direct disclosure is similarly prevented.

3) However, if I'm reading this issue correctly, there are concerns that a common token appearing in all generated files could be too easily disclosed (despite .htaccess protection), and because that token is the only thing standing between an include() statement and a file in a writeable directory, that this represents a new attack vector we might not be willing to accept. We can address this concern, however, by making each file's token a hash of the file's contents combined with a secret key appearing only in settings.php. #81 and #87 also mention this but wonder about the performance implications. I benchmarked drupal_hash_base64() and for both small files and large files found the time to be only a fraction (10% - 30%) of the time it takes to do an include_once() of the same file without APC. And assuming we can resolve the APC issues from #127, then with APC, we'd only incur that incremental cost on cache misses.
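The per-file-token write/verify cycle described in point 3) could be sketched like this. The function names are illustrative, not Drupal API (the actual patch uses drupal_hash_base64(); plain hash_hmac() is used here to keep the sketch self-contained):

```php
<?php
// Sketch of the per-file token: the first line of each generated file carries
// an HMAC of the file body, keyed by a secret that lives only in settings.php.
// codegen_write()/codegen_load() are illustrative names, not Drupal API.
function codegen_write($file, $code, $secret) {
  $token = hash_hmac('sha256', $code, $secret);
  file_put_contents($file, "<?php // $token\n" . $code, LOCK_EX);
}

function codegen_load($file, $secret) {
  $parts = explode("\n", file_get_contents($file), 2);
  if (count($parts) < 2) {
    return FALSE;
  }
  list($header, $body) = $parts;
  if ($header !== '<?php // ' . hash_hmac('sha256', $body, $secret)) {
    return FALSE; // Forged or tampered file: never include it.
  }
  include $file;
  return TRUE;
}
```

Because the token covers the whole body, an attacker who overwrites or appends to the file invalidates it, even if they have read access to other generated files.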

Given all that, how about proceeding with #131.3, but where the header is a unique hash for the file contents? I would call that security: 100% in that it introduces no new attack vector that doesn't already exist, and performance: good enough (and much much better than not compiling DIC or Twig at all). Then, perhaps in a separate issue we can explore whether it's worth optimizing further by not doing full file hashing, and instead trusting a common token, but meanwhile, it lets us proceed with #1599108: Allow modules to register services and subscriber services (events) without introducing the major performance regressions that continuing with an uncompiled DIC would cause.

#133

I benchmarked drupal_hash_base64() and for both small files and large files found the time to be only a fraction (10% - 30%) of the time it takes to do an include_once() of the same file without APC.

+1
Thanks for stepping over the group paralysis and doing the benchmark :)
I think this solves our performance vs security vs usability problem.

wow, #133 is an excellent summary.

IMO this boils down to adding more complexity (not implementation complexity, but system complexity), with security issues if we get it wrong, for what is purely a *performance* issue. Feeling very nonplussed about the whole thing.

#133 has my blessing. I have no preference really whether we start with a simple private key or a content hash; it's easy enough to change our minds on that any time in the next 8 months or so.

I'm afraid that the APC fixes regarding stream wrappers (the ones quoted by Damien at https://bugs.php.net/bug.php?id=59829 and by chx at https://bugs.php.net/bug.php?id=62502 in the PHP tracker) are too recent to be relied upon. The majority of shared hosts or stable *NIX OS/Linux distros won't ship the most recent APC version, which is only 2 months old. Maybe they will by the time D8 stable is released, but I wouldn't bet on it. At the very least, they still trip up on a few use cases (stream wrappers without a realpath() method).

That's why I really want to keep raw PHP file writes, without any kind of stream wrapper, as an option provided by core at this point. It sounds really important to me that site admins/sysadmins can configure and tune this behavior, which is tied to the environment/architecture below Drupal.

I really dislike arbitrary headers in PHP files. I agree this will probably improve security, but it sounds really obfuscated and complex to me. But once again, I'm not opposed to having this if it is pluggable and shipped with a basic implementation.

Thank you for the excellent synopsis in #133.

I think it's worth considering that for a signature based on the hash of "file plus secret key" to add security, the secret key can't be stored in settings.php. The reasoning for adding this signature is that if an attacker has access to write a file, it is often the case that they will have access to read the file, so just adding a secret key (with no hash) is insufficient.

However, if the attacker has access to read files, then in every file permissions case (that I can think of) they should be able to read a private key from settings.php. Of course this would give them access to the database credentials, but we have to work on the assumption that the attacker is not able to run executable code yet (otherwise they would have little need to attack the file cache in the first place) and doesn't have internal network access (hard to evaluate this one), so this may be a limited issue.

Storing the private key in the database would seem to be more secure in this case - this is at least 3 levels of assumptions down though. If read access to settings.php is considered an escalation, then we might as well just stick with a secret key in the file (no hash) and live with the possibility that attacker file read+write access allows write (whereas write alone doesn't).

#138
I think if we are talking about "file injection vulnerability", it does not mean the ability to put any file anywhere, but to put a rogue file into a very specific place that is vulnerable.

This could be caused e.g. by a badly written module, where a file upload or a file write ends up in the wrong location.

settings.php is a very good place for a secret, considering that for PHP files, read and write access are not side by side. When you try to read a PHP file over the web, it is just executed, so the attacker could "read" settings.php and nothing bad would happen. Even if he can write to the compiled PHP files, doing the same to settings.php is not as obvious, since it is stored in another place and never dynamically written, in contrast to the compiled files. And that would mean the end anyway: if he can write settings.php, he has no need to write the compiled PHP files, since he can already do the same damage. Let's consider settings.php the safest place for this secret, since it already contains database credentials and other critical site information.

I think it's time to close this issue and continue talking in #1675260: Implement PHP reading/writing secured against "leaky" script which has #133 implemented, tested and a database stream wrapper written and tested but not used by default.

@pounard/donquixote - I don't understand how this addresses my point at all. If the attacker can read files (via a vulnerable module or external script) to reuse a simple secret placed at the top of files when they write their own, then settings.php is also readable in most cases, which means that the attacker can easily reproduce the hashed file+key signature and have Drupal execute their code - so we would need to put the private key elsewhere. If the attacker can't read files (and it seems we are going to work on that assumption, at least for now), then we have no need for hashed signatures, and a private key at the top of each file will suffice.

Either way, this is a non-issue with the current approach, so I would suggest we focus on getting that in. If we decide to add hash-based signatures, then we can discuss secret key storage later on.

@Owen Barton, I agree.

#142: whether we hash or just store a secret in the header has a rather minimal effect. However, I would offer that if we are writing into the public files directory as suggested in #133 (and implemented in the issue), then it's highly likely that the compiled PHP files might be downloaded. A per-file signature looks better to me. Alternatively, we can randomize the filename by running the hashing over the filename, which is a much shorter string -- woohoo, this could allow native including while staying secure??
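The unguessable-filename variant could be as small as this. A sketch only: `codegen_path()` is an illustrative name, and the secret would come from settings.php.

```php
<?php
// Sketch: hash the (short) logical name together with a secret, so the
// on-disk path can't be predicted by an attacker who can't read settings.php,
// while a plain native include still works. codegen_path() is illustrative.
function codegen_path($dir, $name, $secret) {
  return $dir . '/' . $name . '-' . hash_hmac('sha256', $name, $secret) . '.php';
}
```

Loading then stays a plain `include codegen_path($dir, 'container', $secret);` with no filter overhead, at the cost of the path-disclosure concern discussed in the following comments.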

Alternatively we can randomize the filename by running the hashing over the filename which is a much shorter string -- woohoo this could allow native including and yet staying secure?
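A rough sketch of the quoted idea, with an illustrative function name (not from any actual patch): derive the on-disk name from an HMAC over the logical name, so an attacker who cannot read the secret cannot predict where to write.

```php
<?php
// Hypothetical sketch of #144: the on-disk name of a compiled file is an
// HMAC of its logical name keyed with the site secret. Drupal can then use
// a plain, native include of the derived path (APC-friendly), while the
// path stays unpredictable without the secret.

function compiled_file_path($logical_name, $secret, $directory) {
  $hash = hash_hmac('sha256', $logical_name, $secret);
  return $directory . '/' . $hash . '.php';
}
```

Usage would be something like `include compiled_file_path('service_container', $secret, 'sites/default/files/php');` — the native include is what makes this appealing performance-wise.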

Considering that #1675260: Implement PHP reading/writing secured against "leaky" script isn't compatible with APC caching (yet), this idea is very appealing on performance grounds, and also the code for it will be much smaller and simpler, so that's appealing too. However, #122 says:

PHP loves to disclose file paths on fatals and such.

How concerned should we be about the hashed filename getting disclosed, and then a file upload vulnerability allowing someone to upload rogue contents to that file?

Can you randomize the name every time it's created?

e.g. you could have a generation number someplace, added into the value used to make the name hash?

Randomized name is going to make checking this thing into Git (which we assume anyone with an actual dev/stage/live process will do) more difficult.

In fact I'm really worried about the randomized name in general. It's bad enough for CMI. :-)

I agree with Crell here. The best security we can offer, IMHO, is making the behavior as predictable as possible, so that site admins and sysadmins can secure those files themselves. I guess that for shared hosting the answer will be different, but even there, I don't think randomization is the best option. We could store a timestamp at each file generation and never let newer files be included by checking the mtime, and allow only the site admin to manually trigger file compilation (when enabling modules or so). For those who want it, chx's filter mechanism or other comparable protections could also be enabled.

#146: something akin to that is in the implementation -- see it there. It's wicked tricky; I won't outline it here.

#142

If the attacker can read files (via a vulnerable module or external script) to reuse a simple secret placed at the top of files when they write their own, then settings.php is also readable in most cases

#145

How concerned should we be about the hashed filename getting disclosed, and then a file upload vulnerability allowing someone to upload rogue contents to that file?

I think we are having a lot of unknowns here. There could be vulnerabilities that allow read access to any file, or just to some files. Or the signature could leak through an error message, file backups, etc.

As a vague heuristic, we could say that having the secret copied to fewer places also means a smaller risk of it being disclosed. But by how much, is really hard to tell.

EDIT: Effulgentsia's post in #145 has nothing to do with this.

Everyone: please review the patch at #1675260-17: Implement PHP reading/writing secured against "leaky" script. It implements #144. I.e., it uses a native include, but every file is in a separate directory, with a filename hashed based on the mtime of the directory, and the file must have a lower mtime than the directory. This makes it impossible for the leaky upload script (which, btw, is now totally real -- see http://drupal.org/node/1679442 for a security hole which took the upload dir from $_GET) to change the file, because a) deleting the file and rewriting it with the same name will raise the directory mtime, invalidating the hash, and b) just overwriting the file will raise the file mtime to the current time, which will be higher than the directory's. Please see the patch for more, and for a LOT of comments on how this works.

In case you, like me, are about to say "but you can change mtime with the touch() command", remember that the goal is only to protect against a "leaky upload script", not against arbitrary PHP execution. A "leaky upload script" can be defined as one that makes it easy to upload any file, with any name, to any directory.
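The two checks described above can be sketched roughly like this. This is an illustration of the idea, not the actual #1675260 patch (the real implementation is, as noted, considerably trickier):

```php
<?php
// Sketch of the mtime scheme described above: a compiled file is only
// trusted if its name matches an HMAC that includes the parent directory's
// mtime, and its own mtime is not newer than the directory's. A leaky
// upload script either has to delete/recreate the file (which bumps the
// directory mtime and so invalidates the name hash), or overwrite it in
// place (which leaves the file mtime newer than the directory's).

function expected_filename($logical_name, $secret, $directory) {
  return hash_hmac('sha256', $logical_name . filemtime($directory), $secret) . '.php';
}

function is_trusted($logical_name, $secret, $directory) {
  $path = $directory . '/' . expected_filename($logical_name, $secret, $directory);
  if (!file_exists($path)) {
    return FALSE;
  }
  // The file must not have been (re)written after the directory mtime that
  // its name hash was derived from.
  return filemtime($path) <= filemtime($directory);
}
```

Note that overwriting an existing file's contents does not change the directory's mtime (only adding, removing, or renaming entries does), which is exactly why the second check catches the in-place overwrite case.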

greggles: That is, "if you have the ability to run touch() on the file, you've already got an arbitrary code exploit and there are much more useful ways to destroy the system already", right? (Just making sure I understand what you're saying.)

Correct.

Just saying, there can be other things that mess with the mtime.
E.g. if you play with git - switching branches, etc.

Probably this will result in the files having to be rebuilt? Fine with me, as long as we have a robust behavior.

It's not like the git question is addressed in the sibling implementation issue which still awaits reviews...

I agree regarding touch() being very unlikely for an attacker to have unless they already have general php execution, in which case they have no need to exploit this part of the system.

In principle it may be possible to harden this further by using the ctime of the files, which is similar to mtime except it cannot be changed by touch or touch() (on some filesystems you can change using filesystem debug tools, or if you have control over the system time, but these are both lower level than even touch). I am not sure this is worth the effort though, since it would need some cross system testing and probably need some extra code, since I am not sure all OSs/filesystems treat directory ctime as consistently as directory mtime.
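For reference, PHP exposes both timestamps, either through stat() or the dedicated wrappers. The key asymmetry is that touch() can set mtime to an arbitrary value, but it cannot set ctime arbitrarily (any inode change resets ctime to the current time):

```php
<?php
// Illustration of the mtime/ctime asymmetry discussed above.
$file = tempnam(sys_get_temp_dir(), 'ct');

// Backdate the modification time by an hour.
touch($file, time() - 3600);
clearstatcache();

$info = stat($file);
// mtime reflects the backdated value we chose...
printf("mtime: %d\n", $info['mtime']);     // same as filemtime($file)
// ...but ctime was reset to "now" by the touch itself.
printf("ctime: %d\n", $info['ctime']);     // same as filectime($file)

unlink($file);
```

Whether directory ctime behaves consistently enough across OSs/filesystems to build on, as the comment above notes, would need testing.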

One issue I think bears consideration: mtime/ctime only has one-second granularity on many systems (and I am not sure PHP's stat() exposes microsecond data even on filesystems that support it). This in theory opens up the possibility of timing-based attacks - if an attacker can predict when a cache is going to be regenerated (e.g. by figuring out deployment schedules, or by triggering one themselves via some bug), then there is a chance that they could write their file in the same second as (but after) Drupal writes a file and regenerates the hash. Actually, they could have longer than a second if Drupal takes some time writing out the files - in fact, if Drupal does not update the entire cache on every update, then the attacker could simply write out their file on top of file A, then wait until Drupal updates file B and regenerates the hash.

If they could pull this off then the hash would not change, and their file would be included by Drupal and executed. I think the question is whether this is a feasible attack or not, but I do know that various timing attacks that sound impossibly infeasible can turn out to be quite practical given sufficient effort (e.g. DNS cache poisoning).

Every time a file is written out it gets a file name you can't predict without knowing the Drupal secret key. If a file happens to be broken, and the error display settings are wrong, and so the file name gets revealed on screen in an error message, and you also have a leaky upload script, then yes - but even so you have less than a second. A lot of things need to go wrong -- and one of them is having error messages displayed on screen, which is something the site owner can and should control.

Edit 1: how long a file write takes does not matter. Check the implementation to see why.

Edit 2: the resulting filenames are based on the input filename, the Drupal secret key and the directory mtime.

The discussion on this issue has been immensely valuable, and we seem to be mostly in agreement that we need a swappable system with a default implementation something along the lines of #1675260: Implement PHP reading/writing secured against "leaky" script. Therefore, marking this a duplicate, because I think it's time to channel this towards refining the implementation in the other issue.

Status: Active » Closed (duplicate)

Issue summary: View changes

restored the old ones.