There are a couple stat calls involved every time you call include, more if you're using relative paths, which Drupal does. This patch adds absolute paths to all core includes and a new global $drupal_root.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Crell’s picture

Status: Needs review » Postponed (maintainer needs more info)

Is there really a notable difference performance-wise between './foo.inc' and realpath(__FILE__) .'/foo.inc'? I know that not having the leading . causes full PATH traversal, which is bad, but I'm not aware of a difference in the above. Can you provide benchmarks?

I really really really don't want to introduce another global variable unless it really really really makes a difference; and even then, it should be a constant, not a global variable.

dopry’s picture

The gain is minimal but there is on there. Using an absolute path saves a getcwd system call for every require_once. It also save having to open and stat the file for subsequent require_once calls. This applies to include_once as well. For the single require_once the difference is minimal, but there.

You will also see a greater impact for files that may get included multiple times, which I believe happens in the theme layer for D6. There will also be similar benefits for sites running APC with apc.stat=0.

The following tests are system level and really only detail what is happening under the hood as proof of my statements. The real impact test will come with benchmarking, and will probably be more relevant to disk i/o constrained systems that rely heavily on file templates, or servers with large numbers of open files.

The work is based on a presentation by the php 5.2 maintainer at http://ilia.ws/files/phptek2007_performance.pdf regarding performance tuning PHP.

<?php
    require_once('./includes/install.inc');
    require_once('./includes/install.inc');

then strace php test.php... it results in:

munmap(0xb7836000, 4096) = 0 // marks end of reading script and beginning of execution.
getcwd("/home/anarkh/public_html/drupal-head/includes", 4096) = 46
time(NULL) = 1211101452
lstat64("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/anarkh", {st_mode=S_IFDIR|0755, st_size=12288, ...}) = 0
lstat64("/home/anarkh/public_html", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/anarkh/public_html/drupal-head", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/anarkh/public_html/drupal-head/includes", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/anarkh/public_html/drupal-head/includes/install.inc", {st_mode=S_IFREG|0644, st_size=22806, ...}) = 0
open("/home/anarkh/public_html/drupal-head/includes/install.inc", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=22806, ...}) = 0
read(3, "<?php\n// $Id: install.inc,v 1.61"..., 8192) = 8192
read(3, "\'_profile_modules\';\n $module_li"..., 8192) = 8192
read(3, ".\n *\n * The general approach her"..., 8192) = 6422
read(3, "", 8192) = 0
read(3, "", 8192) = 0
close(3) = 0
getcwd("/home/anarkh/public_html/drupal-head/includes", 4096) = 46
time(NULL) = 1211101452
open("/home/anarkh/public_html/drupal-head/includes/install.inc", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=22806, ...}) = 0
close(3) = 0
close(2) = 0
close(1) = 0
munmap(0xb7809000, 4096) = 0
close(0) = 0
munmap(0xb780a000, 4096) = 0
open("/dev/urandom", O_RDONLY) = 0

Then try it with absolute paths...

<?php
   require_once('/path/to/install.inc');
   require_once('/path/to/install.inc');

the strace output is:

munmap(0xb7836000, 4096) = 0 // marks end of reading script and beginning of execution.
time(NULL) = 1211101314
lstat64("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/anarkh", {st_mode=S_IFDIR|0755, st_size=12288, ...}) = 0
lstat64("/home/anarkh/public_html", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/anarkh/public_html/drupal-head", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/anarkh/public_html/drupal-head/includes", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/home/anarkh/public_html/drupal-head/includes/install.inc", {st_mode=S_IFREG|0644, st_size=22806, ...}) = 0
time(NULL) = 1211101314
open("/home/anarkh/public_html/drupal-head/includes/install.inc", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=22806, ...}) = 0
read(3, "<?php\n// $Id: install.inc,v 1.61"..., 8192) = 8192
read(3, "\'_profile_modules\';\n $module_li"..., 8192) = 8192
read(3, ".\n *\n * The general approach her"..., 8192) = 6422
read(3, "", 8192) = 0
read(3, "", 8192) = 0
close(3) = 0
time(NULL) = 1211101314
close(2) = 0
close(1) = 0
munmap(0xb7834000, 4096) = 0
close(0) = 0
munmap(0xb7835000, 4096) = 0
open("/dev/urandom", O_RDONLY) = 0

The absolute path version has two less cwd calls and 1 less open call and 1 less fstat64 call...

Some system level benchmarks suggest require_once performance with absolute paths is faster...

<?php
$name = 'require_once(relative path)';
$iterations = 1000;
$start = microtime(true);
for ($x = 0; $x < $iterations; $x++) {
  require_once('./junk.php');
}
$time =  microtime(true) - $start;
echo "$name -> ($iterations iterations): $time\n";

$name = 'require_once(absolute path)';
$iterations = 1000;
$start = microtime(true);
for ($x = 0; $x < $iterations; $x++) {
  require_once('/home/anarkh/public_html/drupal-head/includes/junk.php');
}
$time =  microtime(true) - $start;
echo "$name -> ($iterations iterations): $time\n";

$name = 'require_once(relative path)';
$iterations = 10000;
$start = microtime(true);
for ($x = 0; $x < $iterations; $x++) {
  require_once('./junk.php');
}
$time =  microtime(true) - $start;
echo "$name -> ($iterations iterations): $time\n";

$name = 'require_once(absolute path)';
$iterations = 10000;
$start = microtime(true);
for ($x = 0; $x < $iterations; $x++) {
  require_once('/home/anarkh/public_html/drupal-head/includes/junk.php');
}
$time =  microtime(true) - $start;
echo "$name -> ($iterations iterations): $time\n";

require_once(relative path) -> (1000 iterations): 0.0175869464874
require_once(absolute path) -> (1000 iterations): 0.00283122062683

require_once(relative path) -> (10000 iterations): 0.21484708786
require_once(absolute path) -> (10000 iterations): 0.073392868042

dopry’s picture

Status: Postponed (maintainer needs more info) » Needs review
FileSize
30.29 KB

here is a revised version with constants, and some minor spacing issues cleaned up. I guess I'll setup a benchmarking environment if no one else has one setup.

.darrel.

dopry’s picture

so far benchmarking a base install with AB isn't very telling one way or the other. I did however get a clean room benchmarking environment set up this morning... I got two servers on their own switch with nothing else to do, yeah! I'll populate my installations with some data tonight and pull out siege and see how things go.

Crell’s picture

The microbenchmarks look like the absolute path is definitely faster, but I'm curious why ab wouldn't show the same. Perhaps it's a case of it not making enough of a difference in actual practice?

If siege sees a difference, then let's go for it.

nedjo’s picture

Seems worth the change.

It's a bit more learning to always prefix.

Should we maybe do this?


define('DRUPAL_INCLUDE', 0);
define('DRUPAL_REQUIRE', 1);

function drupal_include($path, $type = DRUPAL_INCLUDE) {
  switch ($type) {
    case DRUPAL_INCLUDE:
      include_once(DRUPAL_ROOT . '/' . $path);
      break;
    case DRUPAL_REQUIRE:
      require_once(DRUPAL_ROOT . '/' . $path);
      break;
  }
}

Crell’s picture

What actual purpose would switching on include/require serve? Drupal breaks if you try to do either one and it fails; it just breaks differently. :-)

nedjo’s picture

Well, good point. Maybe they should all be requires? My thought was just to introduce a new function to call to avoid having to add the new constant everywhere we do an include.

c960657’s picture

There is also another reason for using absolute paths:
As mentioned on http://php.net/session_set_save_handler, the current working directory may change during script termination (try adding var_dump(getcwd()); to sess_write() in includes/session.inc). When the current working directory is different from the Drupal root, include_onces with relative paths will fail.

If Drupal tries to load new files during shutdown, errors like the following appear in the Apache error log:

PHP Warning: include_once(modules/user/user.admin.inc): failed to open stream: No such file or directory in /var/www/drupal6/includes/theme.inc on line 283

So the suggested change is not only a performance optimization but also a bug fix.

cburschka’s picture

How many places in core actually still use require and include directly, now that the code registry is in place? Isn't all of the code-loading centralized already, not requiring a drupal_include() API function?

Edit: grep shows a lot of places; forget what I said.

nedjo’s picture

The patch consists mainly of lines beginning include_once() or require_once(). There are many of them.

c960657’s picture

As an alternative to defining DRUPAL_ROOT in each top-level .php file, you could use $_SERVER[DOCUMENT_ROOT] instead.

recidive’s picture

Constants should be defined only once, maybe in bootstrap.inc.

c960657’s picture

Status: Needs review » Needs work

Patch cannot be applied to HEAD.

Damien Tournoud’s picture

Priority: Normal » Critical

Raising this to critical, because it breaks PDO object loading in sess_write.

Damien Tournoud’s picture

Assigned: dopry » Damien Tournoud

I'm working on that.

Damien Tournoud’s picture

Assigned: Damien Tournoud » Unassigned
Status: Needs work » Needs review
FileSize
53.89 KB

Here is a proper patch. Changes:

* Standardize all code style to (include|require)_once 'file', not (include_require)_once('file')
* Standardize *_inc variables not to use any trailing './'
* Standardize registry pathes not to use any trailing './'
* Upgraded to HEAD

keith.smith’s picture

Very minor, but there are some small code style violations (lots of comments without full stops, some of which, admittedly, were in the code previously. Some are new.)

Damien Tournoud’s picture

@keith.smith: No comments were harmed during the making of this patch. There are no new comments and no previous comments were touched. That's out of the scope of that patch to change comments.

... but that makes me thing that the DRUPAL_ROOT should get documented.

keith.smith’s picture

Well.

These appear to be added, or changed, or something, for instance, and have no terminating period. But yes, its certainly no big thing.

+  extract($variables, EXTR_SKIP);      // Extract the variables to a local namespace
+  ob_start();                          // Start output buffering
+  include DRUPAL_ROOT . '/' . $file;   // Include the file
+  $contents = ob_get_contents();       // Get the contents of the buffer
+  ob_end_clean();                      // End buffering and discard
+  return $contents;                    // Return the contents
sun’s picture

Status: Needs review » Needs work

Patch looks good (visually).

However, there's a logical change in module_load_include(), which might be unsafe, better leave this for another issue.
Also, the coding-style fixes in theme_render_template() should go into another issue (as already mentioned by Keith).

Tiny, but confusing: Why is this done differently?

+ define('DRUPAL_ROOT', dirname(realpath(__FILE__)));
+ define('DRUPAL_ROOT', getcwd());
(SimpleTest)

Damien Tournoud’s picture

Status: Needs work » Needs review

@sun:

However, there's a logical change in module_load_include(), which might be unsafe, better leave this for another issue.

Please, read the patch. The change is tiny, and is only made not to call is_file() on an empty variable if drupal_get_path is not found.

Also, the coding-style fixes in theme_render_template() should go into another issue (as already mentioned by Keith).

There is no change *at all* here. I only shifted the comments left. I didn't change their style nor their wording. That's for another issue.

Tiny, but confusing: Why is this done differently?
+ define('DRUPAL_ROOT', dirname(realpath(__FILE__)));
+ define('DRUPAL_ROOT', getcwd());

Both affected scripts (scripts/drupal.sh and scripts/run-tests.sh) are in the "scripts" directory. Their dirname(realpath(__FILE__)) is that directory, not the Drupal root directory.

No change needed to the patch, back to CNR.

lilou’s picture

Status: Needs review » Needs work

Damien : your patch is clean and use absolute path offer better performances (#2)

Hower patch#17 fail on :

\www\drupal\7.x\install.php (Cannot apply hunk @@ 7 )
\www\drupal\7.x\update.php (Cannot apply hunk @@ 6 )
\www\drupal\7.x\includes\install.inc (Cannot apply hunk @@ 9 )

lilou’s picture

Example of constant doxygen documentation :

/**
 * Root directory of Drupal installation.
 */
Damien Tournoud’s picture

Status: Needs work » Needs review
FileSize
50.84 KB

Quick reroll, including documentation of the constant.

Damien Tournoud’s picture

Note that this patch has *nothing* to do with performance. It is just here to fix HEAD that breaks because the current directory can change when PHP enters shutdown functions (like sess_write), breaking our dynamic code loading.

maartenvg’s picture

Patch applies (with some fuzz) and produced no errors. Visually I can find nothing that ought to be different.

lilou’s picture

Patch failed on :
install.php (Cannot apply hunk @@ 7 )
update.php (Cannot apply hunk @@ 6 )

I think, after reroll, the patch will be RTBC

Damien Tournoud’s picture

Quick reroll.

lilou’s picture

Status: Needs review » Reviewed & tested by the community
Damien Tournoud’s picture

Title: Convert includes/requires to use absolute paths.... » Broken autoloader: convert includes/requires to use absolute paths
lilou’s picture

Category: task » bug

According to #9

catch’s picture

Dries’s picture

Status: Reviewed & tested by the community » Needs work

My session tests pass just fine without this patch (82 passes, 0 fails, 0 exceptions). This looks like an important patch, that is prone to breakage. I think we should write a test case for this patch.

catch’s picture

@Dries, do you get the BlogAPI, XML-RPC, Path and Actions failures when #297860: Regression: Convert session.inc to the new database API (and _sess_write should use a merge) is applied? The fails are pretty obscure in those tests though so I agree it'd be a good to have a test case.

gpk’s picture

@34: I'm probably entirely missing the point, but I don't see how "a" test case can cover all the changes in this patch. Surely code coverage of existing tests will ensure that nothing new is broken by it. The only other thing I can think of would be to always unset the include path by setting it to an empty string or somesuch when running tests. That would make sure that all include/require files are properly specified.

Also there is an "include" (not "include_once") in color.module that has escaped this patch, in http://api.drupal.org/api/function/color_get_info/7. Whether there are any other instances of plain "include" in D7 I'm not sure.

Finally, a very trivial point, the form

include_once DRUPAL_ROOT . "/$file";

seems marginally clearer to me than

include_once DRUPAL_ROOT . '/' . $file;

[@Damien: great patch by the way, must have taken ages to sort out all the different cases]

catch’s picture

Status: Needs work » Needs review

Marking back to review.

desaikillol’s picture

thanks for the share

======================================================
website design | website development | submit site

gpk’s picture

Status: Needs review » Needs work

@37: actually the color.module issue makes this CNW

Damien Tournoud’s picture

@gpk: good catch.

Here is an updated patch, including the missing include in color.module. I double-checked that no require or include were missing. I think we covered virtually all cases.

Please run the full test suite with this patch. I think it will be enough to verify that we don't brake anything.

Damien Tournoud’s picture

Status: Needs work » Needs review

Back to CNR.

maartenvg’s picture

Status: Needs review » Reviewed & tested by the community

5390 passes, 0 fails, 0 exceptions on a fresh install of the current HEAD (patched of course :)).

webchick’s picture

Subscribing. Will check this in a few minutes.

webchick’s picture

Ok, nevermind. The changes in here touch a lot of stuff, including command-line scripts that don't have tests, so I'll need to look at this when I have more time available. Will get to it later this week.

february28’s picture

What file to apply this patch? Is it something like: patch -p0 < drupal directory/259623-absolute-includes-4.patch?

maartenvg’s picture

You first have to be in the drupal directory. Then do patch -p0 < name_of_patch.patch.

kbahey’s picture

FileSize
54.77 KB

There are a couple of rejects in the last patch. Here is a re-roll that applies cleanly.

I installed HEAD from scratch using the patch and everything looks fine.

I will conduct some stress tests.

kbahey’s picture

Here are the benchmarks. 25 concurrent users hitting the home page on a default install for 2 minutes.

Marginal improvement with the patch, statistically insignificant.

             Trans,  Elap Time,  Resp Time,  Trans Rate,  Concurrent,    OKAY,  Failed
Without patch 6060,     120.02,       0.49,       50.49,       24.94,    6060,       0
With patch    6199,     120.16,       0.48,       51.59,       24.96,    6199,       0
Damien Tournoud’s picture

@kbahey: please note this is *NOT* a performance patch.

kbahey’s picture

I know, and it should not decrease it either. Which is the case, so we are good.

Damien Tournoud’s picture

We brainstormed with webchick on this, and she came up with this summary for the patch:

"The code registry makes it so that we shouldn't have to do weird including. It should 'just know' where to find files that have the code we need, and load them itself. Currently, this breaks because when we do a fancy DB Merge query, the Merge Query class hasn't been called in yet, so session stuff pukes all over itself. This patch fixes it by making sure that all includes are calculated from Drupal root."

Crell’s picture

Uh. OK, I am totally confused here.

1) How is this patch suddenly not about performance? That was the point when I last looked at it.

2) How exactly does using an absolute path rather than making PHP figure out the absolute path have any impact whatsoever on the registry loading MergeQuery? If the registry needs to write things it already manually includes the DB files it needs to break the circular dependency there. The only merge in session handling should be in sess_write(), by which point the database and registry are long since loaded.

So, uh, huh?

Damien Tournoud’s picture

@Crell:

There is a note on the register_shutdown_function help page:

Working directory of the script can change inside the shutdown function under some web servers, e.g. Apache.

When we enter sess_write and all classes required for MergeQuery has not been loaded yet, the autoloader tries to load mysql/query.inc but fails with:

PHP Fatal error:  require_once() [<a href='function.require'>function.require</a>]: Failed opening required './includes/database/mysql/query.inc'

... because the current directory changed in the meantime.

Crell’s picture

Ahhh... OK, that makes sense. (Well, not Apache's silliness but the fix does.) I'm assuming everything else in this patch *is* performance related, as per the start of the thread?

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Ok, I ran all the tests (which all passed) and looked this over one more time. From the web side, all looks good. The command-line scripts are missing PHPDoc over their defines, but these are also in the middle of executable code, so I'm not sure it'd do any good anyway. run-tests.sh looks like it needs some general love, since running it without an argument spits out a bunch of errors. But this happens without the patch as well.

Committed to HEAD. Thanks!

gpk’s picture

Status: Fixed » Reviewed & tested by the community

@54: not entirely - there is the question of consistency in the way includes/requires are handled, and it also fixes #296194: Some includes are not anchored.

gpk’s picture

Status: Reviewed & tested by the community » Fixed

Uggh, cross post...

webchick’s picture

Hm. I am probably going to have to revert this. :(

On installation, I now get the following directly after entering my DB info:

PDOException: SQLSTATE[42S02]: Base table or view not found: 1146 Table '7x.variable' doesn't exist in /Applications/MAMP/htdocs/7x/includes/database/database.inc on line 1017

Though even without this patch I still get:

PDOException: SQLSTATE[42000]: Syntax error or access violation: 1064 You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ') ORDER BY fit DESC LIMIT 0, 1' at line 1 in /Applications/MAMP/htdocs/7x/includes/database/database.inc on line 1018

It's generating a broken query:

PDOException: SELECT * FROM {menu_router} WHERE path IN () ORDER BY fit DESC LIMIT 0, 1 - Array ( ) SQLSTATE[42000]: Syntax error or access violation: 1064 You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ') ORDER BY fit DESC LIMIT 0, 1' at line 1 in /Applications/MAMP/htdocs/7x/includes/database/database.inc on line 365

This comes from /Applications/MAMP/htdocs/7x/includes/menu.inc line 367.

Anyone got a quick fix?

webchick’s picture

Status: Fixed » Active
Damien Tournoud’s picture

Status: Active » Fixed

@webchick: please don't blame the patch. I'm unable to reproduce the issue but it is a known one, even on Drupal 6:

http://drupal.org/node/280015
http://drupal.org/node/223062
http://drupal.org/node/287632
http://drupal.org/node/303378

While the root cause of this error remains to be identified, chx made a quick fix: http://drupal.org/node/261148

While you are at it, because you can reproduce the error, maybe you could grab some information on why and how this happens: at which step of the installer does this happen, what is the state of the database when this happen, etc?

webchick’s picture

Ok, I documented what's happening over at #311251: Fix broken installer in HEAD. Help appreciated.

moshe weitzman’s picture

This is a nice patch.

But it added a line to every script that has to execute drupal. Namely, define('DRUPAL_ROOT', dirname(realpath(__FILE__))) is repeated in update.php, install.php, index.php, xmlrpc.php, cron.php and also in run-tests.sh. Before I add it to drush in #312421: Undefined DRUPAL_ROOT in Drush bootstrap, I propose to move this line to top of bootstrap.inc. Good idea?

webchick’s picture

Status: Fixed » Needs work

Couple things:

1. This never made it to the upgrade docs: http://drupal.org/node/224333
2. I'd like to see if Moshe's suggestion @ #62 would work. That'd make for far less code duplication.

Owen Barton’s picture

I think this looks good - although I like Nedjo's idea of a simple wrapper function drupal_include($file). This would simplify the calls and mean we have a single place to change the behavior in the future if we need to.

Also, I am wondering if the define could possibly live in settings.php. This would allow people to change this if they have an odd configuration, such as all php source (except index.php of course) living outside of the webroot, which is occasionally requested.

kbahey’s picture

I support drupal_include($file) as well. I was actually working on this a few weeks ago for a different reason (moving all of drupal into a separate directory).

It could be done in a way that allows inclusion from an alternate base directory too, in case some modules want to use drupal_include from that:

function drupal_include($file, $base_dir = '') {
  if (!$base_dir) {
    $base_dir = DRUPAL_ROOT;
  }

  require_once($base_dir . '/' . $file);
}

Also, as moshe says, defining DRUPAL_ROOT in bootstrap.inc will avoid repeating it everywhere.

webchick’s picture

To clarify, this patch has already been committed. It needs docs.

gpk’s picture

Status: Needs work » Fixed

@66: See http://drupal.org/node/224333#absolute_includes.

Does Moshe's suggestion @62 need a separate issue?

webchick’s picture

Yes please!

gpk’s picture

Anonymous’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.

ezestseo’s picture

The same thing was I am trying to do with my website.