This issue describes a change that may or may not yield a performance improvement. I would expect it to, but somehow it doesn't, or the improvement is tiny.

Using Xdebug (on Windows) I noticed that file_exists() is the most expensive built-in PHP function measured as total time spent calling the function for all invocations in a single request. This is more time than is spent in e.g. PDOStatement->execute().

During the bootstrap process, Drupal does a lot of file_exists() checks to verify that files are still where they are supposed to be according to {system}.filename.

AFAICT the general strategy in Drupal is that changes to the file structure (i.e. when files are added, removed or renamed) may require that the module and theme lists are rebuilt via admin/modules and admin/appearance. If a file is moved/deleted without being properly uninstalled in advance, Drupal should not die completely but make it possible to navigate to admin/modules or admin/appearance, though error messages may be displayed and stylesheets may be missing. Do you agree on this approach?

Though the time spent in file_exists() drop accoring to Xdebug, I cannot measure any performance improvement on the total request time (though my results are fluctuating). From inspecting the patch I'd be surprised if it did not cause at least a tiny performance increase, though it may be that the following include_once just spends a bit extra time filling the stat cache. This would indicate that there would be a performance improvement when apc.stat or xcache.stat is turned off, but I could not measure any performance impact there either

I am posting this patch in order to let other people benchmarks it or comment on the reasoning above.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

c960657’s picture

FileSize
5.34 KB
c960657’s picture

Issue tags: +Performance
catch’s picture

Are there any places we use require_once() on the results of drupal_get_filename()?

c960657’s picture

Not AFAICT.

I grep'ed for require_once and found the following patterns:

  1. require_once DRUPAL_ROOT . '/includes/xxx.inc';
  2. require_once DRUPAL_ROOT . '/' . variable_get('path_inc', 'includes/path.inc');
  3. require_once DRUPAL_ROOT . '/modules/field/tests/field_test.storage.inc';
  4. require_once DRUPAL_ROOT . '/' . drupal_get_path('module', 'user') . '/user.pages.inc';
  5. require_once DRUPAL_ROOT . '/' . $filename;
  6. require_once DRUPAL_ROOT . '/' . $provider['file'];

1-3 are harmless.
4: drupal_get_path() calls drupal_get_filename(), but the user module hardly moves (unless people really hack core).
5-6: These patterns are mostly used when a filename is specified by a module or user. AFAICT none of these use drupal_get_filename() directly. Some occurrences have an explicit file_exists() in front of them.

This command looks for the patterns not covered by 1-3.
grep -r require_once . | egrep -v "DRUPAL_ROOT . '/(modules|includes)/|variable_get"

c960657’s picture

Status: Needs work » Needs review
FileSize
2.98 KB

Reroll. This removes the changes to system.module — I'm not sure why they were necessary.

Status: Needs review » Needs work

The last submitted patch, file_exists-3.patch, failed testing.

c960657’s picture

FileSize
3.07 KB

Fixed test errors due to renaming of "Default" install profile to "Standard".

The change to list_themes() was fixed in #925490: list_themes() should not call file_exists() on each theme's info file.

c960657’s picture

Status: Needs work » Needs review
moshe weitzman’s picture

Version: 7.x-dev » 8.x-dev

IMO we need benchmarks in order to justify putting this in D7. If someone provides that, please change the versio nback top d7.

For D8, we could consider doing it without benchmarks. AFAIK, the OS caches file stat information. Perhaps Windows doesn’t.

sun’s picture

Priority: Minor » Normal

Unfortunately, my development box is unsuitable for doing benchmarks, but for Windows boxes in general, I suspect that this patch could have quite some effect, as its file system access is slow in general.

catch’s picture

Issue tags: +i/o

There are (at least) two levels of caching for file stats, IMO both are too fragile for us to rely on them in core:

1. The OS should cache lstat calls, but at least NFS v.3 doesn't - see http://tag1consulting.com/blog/nfs-drupal-and-realpath-cache

2. As Narayan points out, PHP's own realpath cache can ameliorate this, but we currently have some really poorly designed code in core that calls file_exists() for things like css/js aggregates, images, favicon.ico - as far as I know these will pollute the realpath cache, and the default size of it is very small. Also even if it's tuned, it's going to be strained on servers that have multiple different installs on - whether that's some kind of shared hosting, or multi-site, or other PHP code on the same servers.

Also, if you look through strace output (which I will post later to this issue), we currently loop through all modules to see if they exist with a syscall for each one, then loop through them again to load them (and do the exact same syscall each time). This just looks really, really bad in strace, even if it's not a big performance issue it's extremely messy.

On top of this I think we should be throwing PHP warnings and etc. if a module is just removed from the file system or similar, rather than hide it as we do now.

moshe weitzman’s picture

Version: 8.x-dev » 7.x-dev

catch is pretty convincing

catch’s picture

Here's strace output, with page caching enabled. As you can see this removes 3 system calls to access for the three modules (I still don't understand why system.module is being included as a bootstrap module sometimes - this is a clean standard profile install via drush).

If you have 150 modules enabled, then for a full bootstrap it's going to save 100 of these.

HEAD:

6919  write(10, "\16\1\0\0\3SELECT cache_bootstrap.cid "..., 274) = 274
6919  read(10, "\1\0\0\1\5<\0\0\2\3def\2d7\17cache_bootstrap"..., 16384) = 740
6919  access("modules/dblog/dblog.module", F_OK) = 0
6919  access("modules/overlay/overlay.module", F_OK) = 0
6919  access("modules/system/system.module", F_OK) = 0
6919  lstat("/home/catch/www/7/modules/dblog/dblog.module", {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
6919  lstat("/home/catch/www/7/modules/dblog", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
6919  lstat("/home/catch/www/7/modules", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
6919  open("/home/catch/www/7/modules/dblog/dblog.module", O_RDONLY) = 11
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
6919  mmap(NULL, 6768, PROT_READ, MAP_SHARED, 11, 0) = 0x7fefb15be000
6919  stat("/home/catch/www/7/modules/dblog/dblog.module", {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
6919  munmap(0x7fefb15be000, 6768)      = 0
6919  close(11)                         = 0
6919  lstat("/home/catch/www/7/modules/overlay/overlay.module", {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
6919  lstat("/home/catch/www/7/modules/overlay", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
6919  open("/home/catch/www/7/modules/overlay/overlay.module", O_RDONLY) = 11
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
6919  mmap(NULL, 33497, PROT_READ, MAP_SHARED, 11, 0) = 0x7fefb15b7000
6919  stat("/home/catch/www/7/modules/overlay/overlay.module", {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
6919  munmap(0x7fefb15b7000, 33497)     = 0
6919  close(11)                         = 0
6919  lstat("/home/catch/www/7/modules/system/system.module", {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0
6919  lstat("/home/catch/www/7/modules/system", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
6919  open("/home/catch/www/7/modules/system/system.module", O_RDONLY) = 11
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0

And with the patch:

7185  write(10, "\16\1\0\0\3SELECT cache_bootstrap.cid "..., 274) = 274

7185  read(10, "\1\0\0\1\5<\0\0\2\3def\2d7\17cache_bootstrap"..., 16384) = 740
7185  lstat("/home/catch/www/7/modules/dblog/dblog.module", {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
7185  lstat("/home/catch/www/7/modules/dblog", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
7185  lstat("/home/catch/www/7/modules", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
7185  open("/home/catch/www/7/modules/dblog/dblog.module", O_RDONLY) = 11
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
7185  mmap(NULL, 6768, PROT_READ, MAP_SHARED, 11, 0) = 0x7f76927c6000
7185  stat("/home/catch/www/7/modules/dblog/dblog.module", {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
7185  munmap(0x7f76927c6000, 6768)      = 0
7185  close(11)                         = 0
7185  lstat("/home/catch/www/7/modules/overlay/overlay.module", {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
7185  lstat("/home/catch/www/7/modules/overlay", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
7185  open("/home/catch/www/7/modules/overlay/overlay.module", O_RDONLY) = 11
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
7185  mmap(NULL, 33497, PROT_READ, MAP_SHARED, 11, 0) = 0x7f76927bf000
7185  stat("/home/catch/www/7/modules/overlay/overlay.module", {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
7185  munmap(0x7f76927bf000, 33497)     = 0
7185  close(11)                         = 0
7185  lstat("/home/catch/www/7/modules/system/system.module", {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0
7185  lstat("/home/catch/www/7/modules/system", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
7185  open("/home/catch/www/7/modules/system/system.module", O_RDONLY) = 11
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0
7185  mmap(NULL, 138592, PROT_READ, MAP_SHARED, 11, 0) = 0x7f7692775000
7185  stat("/home/catch/www/7/modules/system/system.module", {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0
7185  munmap(0x7f7692775000, 138592) 
moshe weitzman’s picture

Status: Needs review » Needs work

Code looks good.

Is this going to start throwing errors for sites that removed modules without disabling? Should we do an update function that disables these missing modules/themes? They are effectively disabled anyway. It is too bad we can't run hook_disable but the code is gone so changing system.status seems like our best bet.

catch’s picture

Yes it'll start throwing errors for those. A one-off update for old databases sound sensible unless we already have something that disables missing modules already, I'll have a look.

DanPir’s picture

subscribe

Fabianx’s picture

Subscribe

catch’s picture

Version: 7.x-dev » 8.x-dev
Status: Needs work » Needs review
Issue tags: +Needs backport to D7

We need to check what exactly happens to modules that are just missing from the filesystem before doing any work on the patch, back to CNR and updating metadata.

This is really a correctness issue rather than just performance, but an additional thing here is the file_exists() + require_once pattern more or less undermines apc.stat = 0 since that will not at all prevent the file stat from file_exists().

unleash’s picture

well i have the same issue here:

how can i apply the patch here -

i am pretty new to drupal and also [pretty] new to this forum - so do not bear with me. Sorry for the newbie-questions... or just visit the site - http://campus-24.com/?q=stream and http://campus-24.com/ to get a first glance of what happened. with the installation of drupal-commons.
Here the full story! after strugglin several weeks with drupal-commons 6.2.3 (based o november 2011) with issues on htaccess (my admin added and choose some certain options) i finally was able to install the system on the webserver.

Note - yesterday i installed on the local lamp with ease - all went nice and no error at all appeared.
Well today - the drupal-commons was installable on the webserver as well - Note: i gote some errors -which were reported during the intallation - (at each station of installation i got some lines like shown below)
But - nonetheless the installation-process was continued and fulfilled - i was able to choose all options and configuration-steps.
Interesting detail: There were onle 149 tables been created - not 154 like yesterday on the local installation. !?! Interesting - isnt it?!

besides this - the result looks abit funny - what do you think - what can i do with that - can i fix it!?

have a closer look: http://campus-24.com/install.php?locale=&profile=drupal_commons&id=2&op=...
http://campus-24.com/?q=stream
http://campus-24.com/

but it should look like that of course like so:

http://commons.acquia.com/stream
or so http://commons.acquia.com

Question: is this a minor bug - or what can we do here -

btw - see the error reports that shown up during the installation - process. (well nonetheless all went without any break or interruption )

love to hear from you

greetings
unleash

error reports:

    warning: opendir(profiles/drupal_commons/modules/acquia/acquia_search/acquia_search/tests) [function.opendir]: failed to open dir: Too many open files in /home/vhost/WWW/campus-24.com/includes/file.inc on line 955.
    warning: opendir(profiles/drupal_commons/modules/acquia/acquia_search/apachesolr/Zend) [function.opendir]: failed to open dir: Too many open files in /home/vhost/WWW/campus-24.com/includes/file.inc on line 955.
    warning: opendir(profiles/drupal_commons/modules/acquia/acquia_search/apachesolr/tests) [function.opendir]: failed to open dir: Too many open files in /home/vhost/WWW/campus-24.com/includes/file.inc on line 955.
    warning: opendir(profiles/drupal_commons/modules/acquia/acquia_search/apachesolr/contrib) [function.opendir]: failed to open dir: Too many open files in /home/vhost/WWW/campus-24.com/includes/file.inc on line 955.
    warning: opendir(profiles/drupal_commons/modules/acquia/acquia_search/apachesolr/SolrPhpClient) [function.opendir]: failed to open dir: Too many open files in /home/vhost/WWW/campus-24.com/includes/file.inc on line 955.
    warning: opendir(profiles/drupal_commons/modules/acquia/acquia_search/apachesolr/drush) [function.opendir]: failed to open dir: Too many open files in /home/vhost/WWW/campus-24.com/includes/file.inc on line 955.
    warning: opendir(profiles/drupal_commons/modules/contrib/date/date/translations) [function.opendir]: failed to open dir: Too many open files in /home/vhost/WWW/campus-24.com/includes/file.inc on line 955.
    warning: opendir(profiles/drupal_commons/modules/contrib/date/date_timezone/translations) [function.opendir]: failed to open dir: Too many open files in /home/vhost/WWW/campus-24.com/includes/file.inc on line 955.
    warning: opendir(profiles/drupal_commons/modules/contrib/date/date_popup/themes) [function.opendir]: failed to open dir: Too many open files in /home/vhost/WWW/campus-24.com/includes/file.inc on line 955.
    warning: opendir(profiles/drupal_commons/modules/contrib/date/date_popup/lib) [function.opendir]: failed to open dir: Too many open files in /home/vhost/WWW/campus-24.com/includes/file.inc on line 955.
    warning: opendir(profiles/drupal_commons/modules/contrib/date/date_popup/translations) [function.opendir]: failed to open dir: Too many open files in /home/vhost/WWW/campus-24.com/includes/file.inc on line 955.
    warning: opendir(profiles/drupal_commons/modules/contrib/date/date_locale/translations) [function.opendir]: failed to open dir: Too many open files in /home/vhost/WWW/campus-24.com/includes/file.inc on line 955.
    warning: opendir(profiles/drupal_commons/modules/contrib/date/date_tools/translations) [function.opendir]: failed to open dir: Too many open files in /home/vhost/WWW/campus-24.com/includes/file.inc on line 955.
    warning: opendir(profiles/drupal_commons/modules/contrib/date/date_php4/translations) [function.opendir]: failed to open dir: Too many open files in /home/vhost/WWW/campus-24.com/includes/file.inc on line 955.
    warning: opendir(profiles/drupal_commons/modules/contrib/date/date_repeat/translations) [function.opendir]: failed to open dir: Too many open files in /home/vhost/WWW/campus-24.com/includes/file.inc on line 955.

plz help me - just advice me what to do!

sun’s picture

Title: Avoid too many file_exists() calls during bootstrap » Remove file_exists() during bootstrap
Issue tags: -Needs backport to D7
FileSize
1.39 KB

Re-rolled against latest HEAD. Removing backport tag, as this cannot be backported.

Speaking of, I'd simply go ahead with this patch and don't care for bogus info for non-existing modules at all. Make it a requirement for D8 that extensions need to be properly uninstalled before removing them. We're baby-sitting bogus user behavior with stuff like this on the cost of overall system performance.

Status: Needs review » Needs work

The last submitted patch, drupal8.file-exists.20.patch, failed testing.

sun’s picture

Status: Needs work » Needs review

#20 makes pretty clear that we're attempting to load the utterly wrong files in the D8 upgrade path.

swentel’s picture

FileSize
3.31 KB

Well, the reason the tests fail is simple. The file_exists() check is gone, so it will always return something into $files[$type][$name] (FALSE or a filename, which is now still in the wrong place on the file system). With the file_exists, it falls through so drupal_system_listing() starts and finds the new filename of the modules that failed loading in the test. Attached patch makes those failures go away - but I'm not sure if it's the right way to fix this.

- added a system_rebuild_module_data() so the filenames for the modules are rebuild
- the db_merge() change is because the language update otherwhise triggered an PDO error (maybe we could refactor this)
- change the include for the user module with a note in update.php

Note, in case you use devel module for D8 testing (not simpletest, but just plain debuggin), disable it because it's using drupal_get_filename('modules/user/user.module') and during hook_boot().

pounard’s picture

I wrongly opened #1715222: Remove almost all file_exists() and is_file() calls from core which is a partial duplicate of this issue. Attached a full patch which fails with SimpleTest, but not in real life http://drupal.org/files/1715222-1-d7-remove_useless_file_stat.patch

sun’s picture

Re-rolled against HEAD.

YesCT’s picture

Issue tags: -i/o +i-o

(the slash in the i/o tag breaks the autocomplete from adding new tags)

jbrown’s picture

Issue tags: -Performance, -i-o

#25: drupal8.file-exists.25.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +Performance, +i-o

The last submitted patch, drupal8.file-exists.25.patch, failed testing.

catch’s picture

catch’s picture

Status: Needs work » Needs review
FileSize
457 bytes

Re-rolled. Only including the hunk that actually affects bootstrap.

RobLoach’s picture

Status: Needs review » Reviewed & tested by the community
alexpott’s picture

Version: 8.x-dev » 7.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

Committed a0611bb and pushed to 8.x. Thanks!

Tested this patch with moving modules around - this patch does nothing to affect how brittle Drupal 8 is when this occurs.

mkhamash’s picture

Status: Patch (to be ported) » Needs review
FileSize
437 bytes

I have tried it in Drupal 7 and didn't encounter any problems, it is not that mush of an improvement, but needs testing in a lot of situation and the gain is liner with how many modules you have enabled.

the gain is around 1-2 ms, using APC and having 133 active modules but that is on a very good hard drive.

The last submitted patch, 33: remove_file_exists-D7-752730-33.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 33: remove_file_exists-D7-752730-33.patch, failed testing.

  • alexpott committed a0611bb on 8.3.x
    Issue #752730 by c960657, sun, swentel, catch: Remove file_exists()...

  • alexpott committed a0611bb on 8.3.x
    Issue #752730 by c960657, sun, swentel, catch: Remove file_exists()...
joseph.olstad’s picture

There is a related 7.x issue open, goes a step further with removing almost all is_file as well. The patch in the related issue needs a reroll

Fabianx’s picture

Drive by comment:

Wasn't the whole reason for the larger impact of the file system scan that we added this file_exists() check somewhere in 7.23 or do I mix something up now?

  • alexpott committed a0611bb on 8.4.x
    Issue #752730 by c960657, sun, swentel, catch: Remove file_exists()...

  • alexpott committed a0611bb on 8.4.x
    Issue #752730 by c960657, sun, swentel, catch: Remove file_exists()...
joseph.olstad’s picture

Status: Needs work » Needs review
FileSize
472 bytes

reroll of patch 33 , similar to 8.x commits just made and same as patch 30

joseph.olstad’s picture

Status: Needs review » Reviewed & tested by the community

Removing useless call improves performance. All green. 7.54 please and thanks!

stefan.r’s picture

Status: Reviewed & tested by the community » Needs review

So what happens if the file doesn't exist?

If this optimization is actually OK to do, we'd need to update the comment below -- we can't be fully sure any file listed here actually exists anymore when we remove that call, can we?

  // The $files static variable will hold the locations of all requested files.
  // We can be sure that any file listed in this static variable actually
  // exists as all additions have gone through a file_exists() check.
  // The location of files will not change during the request, so do not use
  // drupal_static().
joseph.olstad’s picture

#1715222: Remove almost all file_exists() and is_file() calls from core how many times is it necessary to call file_exists and is_file?

I would say no more than once per cache clear per file or else maybe in module_enable / module_uninstall

How are all tests passing if this call is really necessary ? I/O to external devices isn't cheap even in 2017. Software must do more and waste less. Ecologically and to increase throughput.

I already have enough grey hairs growing while waiting for disk lookups on a fast system with ssd drives. If I wanted my machines to spend a lot of time on I/O I'd install Microsoft Windows cut my system memory in half and install some bloat ware.

How did this change make it to 8.4.x without a philosophical discussion?

***EDIT***
impatient just a tad .

stefan.r is correct about the comments needing to get pulled out as they're no longer relevant.

as for why this call was ever there in the first place is a question we probably don't need to know the answer to. I like the idea of removing it. If a file is not found, the operating system will throw an error and that'll be handled by user space, in our case, php , sending a message to the user. If this call isn't absolutely necessary I think it should be removed as this sort of thing is provided at the OS level.