This issue describes a change that may or may not yield a performance improvement. I would expect it to, but somehow it doesn't, or the improvement is tiny.

Using Xdebug (on Windows) I noticed that file_exists() is the most expensive built-in PHP function measured as total time spent calling the function for all invocations in a single request. This is more time than is spent in e.g. PDOStatement->execute().

During the bootstrap process, Drupal does a lot of file_exists() checks to verify that files are still where they are supposed to be according to {system}.filename.

AFAICT the general strategy in Drupal is that changes to the file structure (i.e. when files are added, removed or renamed) may require that the module and theme lists are rebuilt via admin/modules and admin/appearance. If a file is moved/deleted without being properly uninstalled in advance, Drupal should not die completely but make it possible to navigate to admin/modules or admin/appearance, though error messages may be displayed and stylesheets may be missing. Do you agree on this approach?

Though the time spent in file_exists() drop accoring to Xdebug, I cannot measure any performance improvement on the total request time (though my results are fluctuating). From inspecting the patch I'd be surprised if it did not cause at least a tiny performance increase, though it may be that the following include_once just spends a bit extra time filling the stat cache. This would indicate that there would be a performance improvement when apc.stat or xcache.stat is turned off, but I could not measure any performance impact there either

I am posting this patch in order to let other people benchmarks it or comment on the reasoning above.

Files: 
CommentFileSizeAuthor
#30 752730-30.patch457 bytescatch
PASSED: [[SimpleTest]]: [MySQL] 58,401 pass(es).
[ View ]
#25 drupal8.file-exists.25.patch1.62 KBsun
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal8.file-exists.25.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#23 752730-23.patch3.31 KBswentel
PASSED: [[SimpleTest]]: [MySQL] 34,161 pass(es).
[ View ]
#20 drupal8.file-exists.20.patch1.39 KBsun
FAILED: [[SimpleTest]]: [MySQL] 34,062 pass(es), 4 fail(s), and 14 exception(es).
[ View ]
#7 file_exists-4.patch3.07 KBc960657
PASSED: [[SimpleTest]]: [MySQL] 26,650 pass(es).
[ View ]
#5 file_exists-3.patch2.98 KBc960657
FAILED: [[SimpleTest]]: [MySQL] 25,198 pass(es), 0 fail(s), and 78 exception(es).
[ View ]
#1 file_exists-2.patch5.34 KBc960657
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch file_exists-2.patch.
[ View ]

Comments

StatusFileSize
new5.34 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch file_exists-2.patch.
[ View ]

Issue tags:+Performance

Are there any places we use require_once() on the results of drupal_get_filename()?

Not AFAICT.

I grep'ed for require_once and found the following patterns:

  1. require_once DRUPAL_ROOT . '/includes/xxx.inc';
  2. require_once DRUPAL_ROOT . '/' . variable_get('path_inc', 'includes/path.inc');
  3. require_once DRUPAL_ROOT . '/modules/field/tests/field_test.storage.inc';
  4. require_once DRUPAL_ROOT . '/' . drupal_get_path('module', 'user') . '/user.pages.inc';
  5. require_once DRUPAL_ROOT . '/' . $filename;
  6. require_once DRUPAL_ROOT . '/' . $provider['file'];

1-3 are harmless.
4: drupal_get_path() calls drupal_get_filename(), but the user module hardly moves (unless people really hack core).
5-6: These patterns are mostly used when a filename is specified by a module or user. AFAICT none of these use drupal_get_filename() directly. Some occurrences have an explicit file_exists() in front of them.

This command looks for the patterns not covered by 1-3.
grep -r require_once . | egrep -v "DRUPAL_ROOT . '/(modules|includes)/|variable_get"

Status:Needs work» Needs review
StatusFileSize
new2.98 KB
FAILED: [[SimpleTest]]: [MySQL] 25,198 pass(es), 0 fail(s), and 78 exception(es).
[ View ]

Reroll. This removes the changes to system.module — I'm not sure why they were necessary.

Status:Needs review» Needs work

The last submitted patch, file_exists-3.patch, failed testing.

StatusFileSize
new3.07 KB
PASSED: [[SimpleTest]]: [MySQL] 26,650 pass(es).
[ View ]

Fixed test errors due to renaming of "Default" install profile to "Standard".

The change to list_themes() was fixed in #925490: list_themes() should not call file_exists() on each theme's info file.

Status:Needs work» Needs review

Version:7.x-dev» 8.x-dev

IMO we need benchmarks in order to justify putting this in D7. If someone provides that, please change the versio nback top d7.

For D8, we could consider doing it without benchmarks. AFAIK, the OS caches file stat information. Perhaps Windows doesn’t.

Priority:Minor» Normal

Unfortunately, my development box is unsuitable for doing benchmarks, but for Windows boxes in general, I suspect that this patch could have quite some effect, as its file system access is slow in general.

Issue tags:+i/o

There are (at least) two levels of caching for file stats, IMO both are too fragile for us to rely on them in core:

1. The OS should cache lstat calls, but at least NFS v.3 doesn't - see http://tag1consulting.com/blog/nfs-drupal-and-realpath-cache

2. As Narayan points out, PHP's own realpath cache can ameliorate this, but we currently have some really poorly designed code in core that calls file_exists() for things like css/js aggregates, images, favicon.ico - as far as I know these will pollute the realpath cache, and the default size of it is very small. Also even if it's tuned, it's going to be strained on servers that have multiple different installs on - whether that's some kind of shared hosting, or multi-site, or other PHP code on the same servers.

Also, if you look through strace output (which I will post later to this issue), we currently loop through all modules to see if they exist with a syscall for each one, then loop through them again to load them (and do the exact same syscall each time). This just looks really, really bad in strace, even if it's not a big performance issue it's extremely messy.

On top of this I think we should be throwing PHP warnings and etc. if a module is just removed from the file system or similar, rather than hide it as we do now.

Version:8.x-dev» 7.x-dev

catch is pretty convincing

Here's strace output, with page caching enabled. As you can see this removes 3 system calls to access for the three modules (I still don't understand why system.module is being included as a bootstrap module sometimes - this is a clean standard profile install via drush).

If you have 150 modules enabled, then for a full bootstrap it's going to save 100 of these.

HEAD:

6919  write(10, "\16\1\0\0\3SELECT cache_bootstrap.cid "..., 274) = 274
6919  read(10, "\1\0\0\1\5<\0\0\2\3def\2d7\17cache_bootstrap"..., 16384) = 740
6919  access("modules/dblog/dblog.module", F_OK) = 0
6919  access("modules/overlay/overlay.module", F_OK) = 0
6919  access("modules/system/system.module", F_OK) = 0
6919  lstat("/home/catch/www/7/modules/dblog/dblog.module", {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
6919  lstat("/home/catch/www/7/modules/dblog", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
6919  lstat("/home/catch/www/7/modules", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
6919  open("/home/catch/www/7/modules/dblog/dblog.module", O_RDONLY) = 11
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
6919  mmap(NULL, 6768, PROT_READ, MAP_SHARED, 11, 0) = 0x7fefb15be000
6919  stat("/home/catch/www/7/modules/dblog/dblog.module", {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
6919  munmap(0x7fefb15be000, 6768)      = 0
6919  close(11)                         = 0
6919  lstat("/home/catch/www/7/modules/overlay/overlay.module", {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
6919  lstat("/home/catch/www/7/modules/overlay", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
6919  open("/home/catch/www/7/modules/overlay/overlay.module", O_RDONLY) = 11
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
6919  mmap(NULL, 33497, PROT_READ, MAP_SHARED, 11, 0) = 0x7fefb15b7000
6919  stat("/home/catch/www/7/modules/overlay/overlay.module", {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
6919  munmap(0x7fefb15b7000, 33497)     = 0
6919  close(11)                         = 0
6919  lstat("/home/catch/www/7/modules/system/system.module", {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0
6919  lstat("/home/catch/www/7/modules/system", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
6919  open("/home/catch/www/7/modules/system/system.module", O_RDONLY) = 11
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0
6919  fstat(11, {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0

And with the patch:

7185  write(10, "\16\1\0\0\3SELECT cache_bootstrap.cid "..., 274) = 274
7185  read(10, "\1\0\0\1\5<\0\0\2\3def\2d7\17cache_bootstrap"..., 16384) = 740
7185  lstat("/home/catch/www/7/modules/dblog/dblog.module", {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
7185  lstat("/home/catch/www/7/modules/dblog", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
7185  lstat("/home/catch/www/7/modules", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
7185  open("/home/catch/www/7/modules/dblog/dblog.module", O_RDONLY) = 11
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
7185  mmap(NULL, 6768, PROT_READ, MAP_SHARED, 11, 0) = 0x7f76927c6000
7185  stat("/home/catch/www/7/modules/dblog/dblog.module", {st_mode=S_IFREG|0644, st_size=6768, ...}) = 0
7185  munmap(0x7f76927c6000, 6768)      = 0
7185  close(11)                         = 0
7185  lstat("/home/catch/www/7/modules/overlay/overlay.module", {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
7185  lstat("/home/catch/www/7/modules/overlay", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
7185  open("/home/catch/www/7/modules/overlay/overlay.module", O_RDONLY) = 11
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
7185  mmap(NULL, 33497, PROT_READ, MAP_SHARED, 11, 0) = 0x7f76927bf000
7185  stat("/home/catch/www/7/modules/overlay/overlay.module", {st_mode=S_IFREG|0644, st_size=33497, ...}) = 0
7185  munmap(0x7f76927bf000, 33497)     = 0
7185  close(11)                         = 0
7185  lstat("/home/catch/www/7/modules/system/system.module", {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0
7185  lstat("/home/catch/www/7/modules/system", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
7185  open("/home/catch/www/7/modules/system/system.module", O_RDONLY) = 11
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0
7185  fstat(11, {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0
7185  mmap(NULL, 138592, PROT_READ, MAP_SHARED, 11, 0) = 0x7f7692775000
7185  stat("/home/catch/www/7/modules/system/system.module", {st_mode=S_IFREG|0644, st_size=138592, ...}) = 0
7185  munmap(0x7f7692775000, 138592)

Status:Needs review» Needs work

Code looks good.

Is this going to start throwing errors for sites that removed modules without disabling? Should we do an update function that disables these missing modules/themes? They are effectively disabled anyway. It is too bad we can't run hook_disable but the code is gone so changing system.status seems like our best bet.

Yes it'll start throwing errors for those. A one-off update for old databases sound sensible unless we already have something that disables missing modules already, I'll have a look.

subscribe

Subscribe

Version:7.x-dev» 8.x-dev
Status:Needs work» Needs review
Issue tags:+needs backport to D7

We need to check what exactly happens to modules that are just missing from the filesystem before doing any work on the patch, back to CNR and updating metadata.

This is really a correctness issue rather than just performance, but an additional thing here is the file_exists() + require_once pattern more or less undermines apc.stat = 0 since that will not at all prevent the file stat from file_exists().

Title:Avoid too many file_exists() calls during bootstrapRemove file_exists() during bootstrap
Issue tags:-needs backport to D7
StatusFileSize
new1.39 KB
FAILED: [[SimpleTest]]: [MySQL] 34,062 pass(es), 4 fail(s), and 14 exception(es).
[ View ]

Re-rolled against latest HEAD. Removing backport tag, as this cannot be backported.

Speaking of, I'd simply go ahead with this patch and don't care for bogus info for non-existing modules at all. Make it a requirement for D8 that extensions need to be properly uninstalled before removing them. We're baby-sitting bogus user behavior with stuff like this on the cost of overall system performance.

Status:Needs review» Needs work

The last submitted patch, drupal8.file-exists.20.patch, failed testing.

Status:Needs work» Needs review

#20 makes pretty clear that we're attempting to load the utterly wrong files in the D8 upgrade path.

StatusFileSize
new3.31 KB
PASSED: [[SimpleTest]]: [MySQL] 34,161 pass(es).
[ View ]

Well, the reason the tests fail is simple. The file_exists() check is gone, so it will always return something into $files[$type][$name] (FALSE or a filename, which is now still in the wrong place on the file system). With the file_exists, it falls through so drupal_system_listing() starts and finds the new filename of the modules that failed loading in the test. Attached patch makes those failures go away - but I'm not sure if it's the right way to fix this.

- added a system_rebuild_module_data() so the filenames for the modules are rebuild
- the db_merge() change is because the language update otherwhise triggered an PDO error (maybe we could refactor this)
- change the include for the user module with a note in update.php

Note, in case you use devel module for D8 testing (not simpletest, but just plain debuggin), disable it because it's using drupal_get_filename('modules/user/user.module') and during hook_boot().

I wrongly opened #1715222: Remove almost all file_exists() and is_file() calls from core which is a partial duplicate of this issue. Attached a full patch which fails with SimpleTest, but not in real life http://drupal.org/files/1715222-1-d7-remove_useless_file_stat.patch

StatusFileSize
new1.62 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal8.file-exists.25.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Re-rolled against HEAD.

Issue tags:-i/o+i-o

(the slash in the i/o tag breaks the autocomplete from adding new tags)

Issue tags:-Performance, -i-o

#25: drupal8.file-exists.25.patch queued for re-testing.

Status:Needs review» Needs work
Issue tags:+Performance, +i-o

The last submitted patch, drupal8.file-exists.25.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new457 bytes
PASSED: [[SimpleTest]]: [MySQL] 58,401 pass(es).
[ View ]

Re-rolled. Only including the hunk that actually affects bootstrap.

Status:Needs review» Reviewed & tested by the community

Version:8.x-dev» 7.x-dev
Status:Reviewed & tested by the community» Patch (to be ported)

Committed a0611bb and pushed to 8.x. Thanks!

Tested this patch with moving modules around - this patch does nothing to affect how brittle Drupal 8 is when this occurs.