Checking to see if a file exists for every module can't be cached by APC and is thus slow. Caching a list of .inc files to require would be ideal.
http://drupalcode.org/project/filefield_paths.git/blob/82ab88d5cacbd1d9f...

foreach (module_list() as $module) {
  if (file_exists($file = dirname(__FILE__) . "/modules/{$module}.inc")) {
    require_once $file;
  }
}

Code used to be in filefield_paths_init()

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

mikeytown2’s picture

Status: Active » Needs review
FileSize
1.13 KB

Stock:
Total Self: 5.5ms
Total Cumulative: 13ms
Memcache->get calls: 0

This Patch:
Total Self: 0.4ms
Total Cumulative: 1.0ms
Memcache->get calls: 1

mikeytown2’s picture

This is the correct patch.

Deciphered’s picture

Status: Needs review » Fixed

Thanks mikeytown2,

I appreciate the rewrite.

I reworked it a little bit to be more generic as I use the code in most of my modules:

/**
 * Include additional files.
 */
$info = pathinfo(__FILE__);
// Get files to include from cache.
$files = cache_get("{$info['filename']}_inc_modules");
if (!empty($files->data)) {
  $files = $files->data;
}
else {
  $files = array();
  // If not in the cache, get files from module list.
  foreach (module_list() as $module) {
    if (file_exists($file = "{$info['dirname']}/modules/{$module}.inc")) {
      $files[] = $file;
    }
  }
  cache_set("{$info['filename']}_inc_modules", $files, 'cache', CACHE_TEMPORARY);
}
// Add in the files.
foreach ($files as $file) {
  if (!include_once $file) {
    // Clear the cache if a file is missing.
    cache_clear_all("{$info['filename']}_inc_modules");
  }
}
// Clean up because we are outside of a function call.
unset($info, $files, $file, $module);

Committed to 6.x-1.x and 7.x-1.x.

Cheers,
Deciphered.

Deciphered’s picture

Priority: Normal » Major
Status: Fixed » Active

Unfortunately I have identified an issue with this code, which is that when new modules are enabled the cache is not reset and therefore the relevant include files are not included and as far as a user would be able to tell the functionality is not available/is broken.

This is a fairly major issue, which if cannot be resolved will require the reversal of this fix.

Deciphered’s picture

Status: Active » Postponed

Reverted code for both 6.x-1.x and 7.x-1.x. Ideally there would be a way to clear the cache whenever a module is enabled or disabled, but hook_enable() and hook_disabled() only trigger on the module being enabled or disabled.

Will revisit as I do like the optimisation, but can't have the module becoming unusable.

mikeytown2’s picture

for D7 this shouldn't be an issue do to module_implements() caching what hooks are being used. For D6 this is a concern but it's a minor one when D7 is taken into account. Also see #557542-179: Cache module_implements() .

As for D6 this is the callback on the modules page system_modules_submit(). Near the bottom you can see a bunch of clear cache calls. Having this cache cleared inside of hook_menu should do the trick.

Deciphered’s picture

I had considered that, but my concern is then would that be respected by 'drush en' or alternate module enabling methods. I will definitely need to do a bit more digging before I can finalize a solution.

mikeytown2’s picture

hook_menu is fairly safe. If you enable a module and you can't get to its configuration page, that's a broken workflow.

j0rd’s picture

I just noticed this code and it caused me some alarm as well.

First of all, it doesn't appear to be in a hook, which was the first alarm. Is there a reason this isn't in something like a hook_init() function?

Personally, I think you should hardcode an array of modules you provide, then check these against module_list and enable them as needed. Cache is a better solution yet, but this has already been discussed. Ctools plugins is another way to go.

With that said, I optimized the code at hand and have tested it to be faster on a production server (fast disk) and development server (virtual machine, with shared disk aka. slow) and in both cases my code proved to be faster. I ran the tests at least 3 times on each server. These were consistent results between refreshes.

You'll get more performance improvements depending on how many modules you have enabled.

Development Server (pre-patch)
~70ms

Development Server (post-patch)
~12ms

Production Server (pre-patch)
~0.74ms

Production Server (post-patch)
~0.48ms

Here's the code I used to test. My code is commented out, but you can easily comment it back in.

/**
 * Include additional files.
 */ 
$time_start = microtime(true);
/*
$dirname = dirname(__FILE__); 
$includes = file_scan_directory($dirname . "/modules", '/.inc$/');
*/

foreach (module_list() as $module) {
  /* 
  if(isset($includes[$file = $dirname . "/modules/{$module}.inc"])) {
    require_once $file;
  }
  */ 
  if (file_exists($file = dirname(__FILE__) . "/modules/{$module}.inc")) {
    require_once $file;
  }
}
$time_end = microtime(true);
$time = $time_end - $time_start;
dpm('Script took '.$time.' seconds to execute');

Attached is the patch.

j0rd’s picture

Version: 6.x-1.x-dev » 7.x-1.0-beta3
Status: Postponed » Needs review

Changing status.

Deciphered’s picture

Thanks j0rd,

Both code and performance improvements look good, I'll give this a test to ensure that there are no other issues being introduced but I think it should be fairly safe.

j0rd’s picture

Ya, code just removes un-needed look ups of file_exists, as it does it in one swoop. Also caches dirname from multiple calls, which speeds up a little more.

Side note:

I've noticed things which check for files on the filesystem take up a lot of time. Anyone know of an idea to check in APC op-code cache before checking the filesystem? Seems like that could be a reasonable way to use an existing cache and avoiding the file system all together. Or is that something PHP already does. Not 100% sure myself....but I'm also looking to optimize the amount of calls like this that ctools plugins make. It adds up.

mikeytown2’s picture

I would look at apc_cache_info. The cache_list key should have what your looking for.

If you're using file_exists() the stat cache (realpath_cache_get()) will be used; but file_scan_directory() uses opendir() and readdir() which are not cached in php, thus it will always hit the file system.

I would avoid file_scan_directory as much as possible and cache it if you can. This is an issue where file_scan_directory usage got out of hand #1081266-58: Avoid re-scanning module directory when a filename or a module is missing.

j0rd’s picture

@mickeytown2: Thanks for the useful information.

I looked up file_exists with regards to this cache you mentioned. Turns out (i believe) that you incur the full hit sans cache when the file does not exist.

From: http://php.net/manual/en/function.clearstatcache.php

You should also note that PHP doesn't cache information about non-existent files. So, if you call file_exists() on a file that doesn't exist, it will return FALSE until you create the file. 

Additionally with regards to this cache, do we know how persistent it is? If it's only per request, which I assume is the case, then it's not going to help us out anyways. We'd continue to incur the cost of having our cached un-primed per request anyways.

I'm still curious what's possible with APC.

mikeytown2’s picture

Code that finds out if any of the filefield_paths/modules/*.inc files are loaded in APC's opcode cache. Not sure how helpful this would be but I though it was interesting.

$data = apc_cache_info();
$filenames = array();
$name = drupal_get_path('module', 'filefield_paths') . '/modules/';
foreach ($data['cache_list'] as $values) {
  if (strpos($values['filename'], $name) !== FALSE && strpos(strrev($values['filename']), strrev('.inc')) === 0) {
    $filenames[] = $values['filename'];
  }
}
print_r($filenames);
Deciphered’s picture

Version: 7.x-1.0-beta3 » 7.x-1.0-beta4
Status: Needs review » Fixed

Committed a slightly modified version of the patch at #9.

Thanks for the work.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous’s picture

Issue summary: View changes

add a note about filefield_paths_init