Regarding to #222109: module_list() should be rewritten as 2 functions: module_list() and module_list_bootstrap(), this problem will be fixed soon in 7.x
To make it clear, this thread is related to solve that problem in 6.x

MAIN PROBLEM
Drupal core function module_list() (module.inc) have second argument $bootstrap to choose if you want to receive reduced set of modules loaded in "bootstrap mode" or ALL modules.
The problem is that this function caching that list without checking the $bootstrap argument.

EXAMPLE
function module_list($refresh = FALSE, $bootstrap = TRUE, $sort = FALSE, $fixed_list = NULL) {

1. Calling module_list(FALSE, TRUE) returns modules loaded in "bootstrap mode"
2. Calling again module_list(FALSE, FALSE) returns modules loaded in "bootstrap mode" again, instead of all modules (second argument is ignored).

SUMMARY
Why this issue is so important if Drupal works fine?
In same cases it's breaking rendering pages causing WSOD without any error, in some other cases it returns invalid values causing unexpected behaviour (See module_exists issue below).

CODE WORKFLOW
First place when bootstrap module list is cached badly is in bootstrap_invoke_all('boot'); (called from bootstrap.inc)

Affected functions before DRUPAL_BOOTSTRAP_FULL (i.e. during hook_boot and in cache modules):
- module_exists() will return TRUE only for bootstrap module, will return FALSE for module which doesn't have boot or exit hook, even if is activated
- each module which will return module_list(FALSE, FALSE), will get bootstrap list, instead of list of activated modules
- theme() function during this time is rendering badly, because during _theme_build_registry() list of modules which implements theme will be invalid (only those which are registered in bootstrap mode)
Read more: #496170: module_implements() cache can be polluted by module_invoke_all() being called (in)directly prior to full bootstrap completion

Secondly list is refreshed again without bootstrap mode in module_load_all() (called from module.inc) then cached list is back to normal unless some module will not request to refresh the list of modules in bootstrap module.

Comments

damien tournoud’s picture

Why is that a bug?

Before full bootstrap, the module system is not fully loaded, and module_list() should not return the name of any module that is not fully loaded.

heine’s picture

- module_exists() will return TRUE only for bootstrap module, will return FALSE for module which doesn't have boot or exit hook, even if is activated

I don't get this. Seems like correct behaviour to me, as non bootstrap modules are not loaded prior to the full bootstrap. Idem for the theme.

kenorb’s picture

Other case when theme is broken causing WSOD
Problem: theme_registry which contain all theme hooks could be saved badly causing WSODs

Example:
1. We getting theme_registry from cache (step 4), if doesn't exist, build_registry (go to step 2)

function _theme_load_registry($theme, $base_theme = NULL, $theme_engine = NULL) {
  // Check the theme registry cache; if it exists, use it.
  $cache = cache_get("theme_registry:$theme->name", 'cache');
...
    $registry = _theme_build_registry($theme, $base_theme, $theme_engine);
    _theme_save_registry($theme, $registry);

2. Now we are building invalid theme_registry:

function _theme_build_registry($theme, $base_theme, $theme_engine) {
...
  foreach (module_implements('theme') as $module) {
    _theme_process_registry($cache, $module, 'module', $module, drupal_get_path('module', $module));
  }

Why?
module_implements('theme') returns only few modules (in my test environment returns only one 'devel' module)
Read more: #496170: module_implements() cache can be polluted by module_invoke_all() being called (in)directly prior to full bootstrap completion

3. This invalid theme_registry is saved into cache via _theme_save_registry() in step 1.

4. Next pages loading invalid theme_registry from cache (step 1) causing WSODs by ignoring all core themes (like system_theme etc.)
So each call to theme() in example theme('page') (which should be always rendered) returns nothing causing WSOD

function theme() {
...
  if (!isset($hooks[$hook])) {
    return;
  }
heine’s picture

In what case is _theme_load_registry called before the FULL bootstrap?

kenorb’s picture

#1 & #2:
module_implements() could be called after Drupal is fully loaded, so why it's trying to load list of bootstrap modules? #496170: module_implements() cache can be polluted by module_invoke_all() being called (in)directly prior to full bootstrap completion

Secondly:
If module_list() returning not loaded modules yet during boot time (and it's nothing wrong with it) and some code request theme_registry to be rebuild during this time, list of theme hooks will be broken and saved into database and used in next pages.
It's even could be requested to be rebuild by anything by simple theme() call. Even watchdog calling theme(), so if there is simple error notice handler by error_handler, it goes to watchdog and it's themed.
Is there any documentation that saying that theme() shouldn't be called during hook_boot?

kenorb’s picture

#4:
drupal_error_handler -> watchdog -> module_invoke -> call_user_func_array -> syslog_watchdog -> theme -> init_theme -> _init_theme -> _theme_load_registry -> _theme_build_registry -> _theme_process_registry ...
I'm not sure if this is called during boot, but for sure during shutdown process and then static variables are cleared and theme is rebuilded again.

heine’s picture

No, only implicitly: "Only use this hook if your code must run even for cached page views.This hook is called before modules or most include files are loaded into memory. It happens while Drupal is still in bootstrap mode."

damien tournoud’s picture

Title: module_list() is broken in 6.x » syslog_watchdog() should not call theme()
Version: 6.x-dev » 7.x-dev

Now, that syslog_watchdog() does call theme(), that's a real bug.

kenorb’s picture

Title: syslog_watchdog() should not call theme() » module_list() is broken in 6.x
Version: 7.x-dev » 6.x-dev

If I understand correctly, if this function is in API:
http://api.drupal.org/api/function/module_list

I can use it in my modules in following way:
1. Calling module_list(FALSE, TRUE) returns modules loaded in "bootstrap mode"
2. Calling again module_list(FALSE, FALSE) returns modules loaded in "bootstrap mode" again

even I've selected to be not in "bootstrap mode"
How this can be correct?

kenorb’s picture

#8:

function syslog_watchdog($entry) {
...
  syslog($entry['severity'], theme('syslog_format', $entry));

syslog_watchdog could be called from error handler anytime when theme was not initiated yet.

kenorb’s picture

#1:
Another case with drupal_bootstrap() called twice
hook_boot could be called many times in one page load and module_list() could return restricted bootstrap module listing even after DRUPAL_BOOTSTRAP_FULL.

Example with Drupal core module: Statistics
Backtrace:
1. statistics_exit
2. drupal_bootstrap(DRUPAL_BOOTSTRAP_PATH)
Note: it's not DRUPAL_BOOTSTRAP_FULL, so module_load_all() is not called and list from module_list() will not back to normal again as normal does.
3. bootstrap_invoke_all('boot');

function bootstrap_invoke_all($hook) {
  foreach (module_list(TRUE, TRUE) as $module) {

4. Now module_list() after rebuild contain modules loaded in "bootstrap mode"
Any call to module_exists() cause FALSE, even module was already loaded.
Any try to rebuild theme() after that could cause WSOD on the next pages (#3) unless you will clear the cache to remove invalid theme_registry. And probably during shutdown theme() will be rebuild again with the wrong theme hooks.

heine’s picture

Did you test that scenario? drupal_bootstrap won't rerun any earlier stages and just ensures you're at the requested stage OR further.

kenorb’s picture

#12
Yes, I've tested it.
drupal_page_footer -> module_invoke_all -> statistics_exit -> drupal_bootstrap -> _drupal_bootstrap -> bootstrap_invoke_all -> module_list
Finally module_list is called as follow:

function bootstrap_invoke_all($hook) {
  foreach (module_list(TRUE, TRUE) as $module) {

with $refresh and $bootstrap TRUE, so it execute db_query (bootstrap = 1) and keep cached bootstrap module list in static $list.
Any later access to module_list() after statistics_exit() will return bootstrap list instead of full list of modules.

damien tournoud’s picture

@kernob#13: that can only happen when serving a page from the cache. In that case, Drupal is never fully bootstrapped, so I don't really see where the problem is. Neither hook_boot() nor hook_exit() can assume they are being run in a fully bootstrapped Drupal.

kenorb’s picture

#14
About which case you are talking?
This backtrace happen everytime.
index.php:

drupal_page_footer();

drupal_page_footer called module_invoke_all('exit'); everytime
that means statistics_exit will called everytime
and the code in statistics is simple:

function statistics_exit() {
  global $user, $recent_activity;
  drupal_bootstrap(DRUPAL_BOOTSTRAP_PATH);
...

Which call _drupal_bootstrap():

function _drupal_bootstrap($phase) {
...
    case DRUPAL_BOOTSTRAP_LATE_PAGE_CACHE:
...
      if (!$cache || $cache_mode != CACHE_AGGRESSIVE) {
        // Load module handling.
        require_once './includes/module.inc';
        bootstrap_invoke_all('boot');
      }
...
    case DRUPAL_BOOTSTRAP_PATH:
      require_once './includes/path.inc';
      // Initialize $_GET['q'] prior to loading modules and invoking hook_init().
      drupal_init_path();
      break;

    case DRUPAL_BOOTSTRAP_FULL:
      require_once './includes/common.inc';
      _drupal_bootstrap_full();
      break;

So in which case drupal_bootstrap(DRUPAL_BOOTSTRAP_PATH); will not execute drupal_bootstrap twice?
bootstrap_invoke_all('boot'); will be executed as well second time, if page is not cached.

damien tournoud’s picture

@kenorb#15: when Drupal is fully bootstrapped, the call to drupal_bootstrap(DRUPAL_BOOTSTRAP_PATH) in statistics_exit() exits immediately.

kenorb’s picture

Ah, ok, I know what you mean.
drupal_bootstrap() storing $phases which were executed already which not.
In this case probably it will not be called twice.

So what about modules which calling drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL); or _drupal_bootstrap_full(); which cause second call to module_load_all(); for some different purposes?
Like backup_migrate, adserve, mrbs_inc, og_user_roles (#273068), securesite, persistent_login, provision, singlesignon, views_mail during boot, wsod and many more?
They can execute this case described in #13, but I understand that they shouldn't do that.

kenorb’s picture

I had case #13 because of my test environment and DRUPAL_BOOTSTRAP_FULL called twice, let say that was my fault.

Ok, so in summary
If I understand correctly, officially none of the modules should called module_list with following arguments:
module_list($refresh = TRUE, $bootstrap = TRUE)
Because it will broke other module functionality by getting later invalid data via module_exists(), module_implements() or module_list()
Is that correct?

So the fix should be in documentation instead?
http://api.drupal.org/api/function/module_list/6
with additional note: please don't call this function with arguments (TRUE, TRUE) after DRUPAL_BOOTSTRAP_FULL, because you may experience WSOD or unexpected behaviour?

kenorb’s picture

Problem with truncated list of modules could cause as well this issue: #238760: menu_router table truncated and site goes down
When menu_rebuild() calling menu_router_build() which using module_implements() which could return invalid data.
Then whole menu_router table is broken with only those, which are bootstrap modules.
It could happen in different cases, I've written only few of them in this thread.
For my logically module_list() is broken and it not make sense to analyse hundred of backtraces, when does it happen.
For me $list should be cached two times depends of $bootstrap argument.

damien tournoud’s picture

Status: Active » Closed (won't fix)

module_list() returns the current module list (the bootstrap one during bootstrap, and the full one once Drupal is fully bootstrapped).

No contributed module should ever call module_list(TRUE). Ever. They do not need to. We should document that.

Obviously the current code flow is not very elegant, but it doesn't seems buggy.

#238760: menu_router table truncated and site goes down seems unrelated.

kenorb’s picture

In case of 'Maximum execution time' #238760 is not related, but in case that we discussed when module_list() contain wrong list, during menu_rebuild() and module_implements() (similar to #3) and 'menu_router table truncated and site goes down' then it's related.
Currently on one environment when I've got menu_router with only those items that are from bootstrap modules (of course I haven't truncated this table by my-self). I know it shouldn't be like that, but it happened.

kenorb’s picture

Let me make a list, when developer can execute module_list and when cannot.

NOT allowed calls of module_list():
module_list(TRUE)
module_list(TRUE, TRUE)
...

NOT allowed calls of module_list() only during bootstrap mode:
module_list(TRUE, FALSE)
...

NOT allowed calls of module_list() only after bootstrap mode:
module_list(TRUE, TRUE)
...

Allowed general calls of module_list():
module_list()

Allowed calls of module_list() only during bootstrap mode
module_list(TRUE, TRUE)

Allowed calls of module_list() only after bootstrap mode
module_list(TRUE, FALSE)

Tell me if I'm wrong.

QUICK REVIEW
So analysing contributed modules:
Module: adminrole
function adminrole_update_perms()
Code:

foreach (module_list(TRUE, FALSE, TRUE) as $module) {

called during form_alter.
Probably allowed.

Module: luceneapi
function luceneapi_path_list
Code:

    foreach (luceneapi_module_list(TRUE) as $path) {

Author probably wrote his own function, because of that restriction - allowed.

Module: multisite
function multisite_manager_install_site
Code:

  module_list(TRUE, FALSE);

during node insert. Probably allowed, nodes are not created during bootstrap process, but who knows.

Module: provision
Function: provision_load_from_args()
Code:

    $list = module_list(TRUE, FALSE);

during hook_init
Allowed.

Module: provision_stats
Function: _provision_stats()
Code:

  module_list(TRUE);

during drush command.
Probably not allowed, if run after bootstrap

Module: merci
Function: merci_check_dependencies() in merci_import.php
Code:

  // Have to reset the module list here, as the maintenance theme wipes
  // out the full list.
  module_list(TRUE, FALSE);

After custom-executed full bootstrap - probably allowed. True, theme wipes (#3).

Logically developers think that module_list() will simply returns module list and it will not break any other functionality, because it looks like simple function which returns simple module list. But it's not simple as it is and after couple of hours or days debugging, crashing, table truncating, WSODing, reading documentation with no result, they will get to that point as described in this thread to see simply status: won't fix. Probably that's the last thing that they want to see;)

Module: platform
Function: install_main() in merci_import.php
Code:

  module_list(TRUE, FALSE, FALSE, $module_list);

before bootstrap - not allowed.

Module: simpletest
drupal_unit_tests.php

      module_list(TRUE, FALSE);

modules_system.test

    // Get a list of the currently enabled modules
    $enabled_modules = module_list(true, false);
...
    // Now, we check the tables for each module
    // We also refresh the module list and make sure the modules are enabled
    module_list(true, false);
...
    $current_modules = module_list(true, false);
...
    $enabled_modules = module_list(true, false);
...
    while (count(array_diff(module_list(true, false), $enabled_modules)) > 0) {
...
    module_list(true, false);

For test plans - with second argument FALSE probably allowed.

In summary - not so bad.

damien tournoud’s picture

Most of those should *not* call module_list() with $refresh = TRUE.

kenorb’s picture

At least there should be really some simple note for developers at:
http://api.drupal.org/api/function/module_list/6
related to $refresh or $bootstrap
Something like:
That $refresh could be set to TRUE only for advanced developers
or:
$bootstrap
Whether to return ... See bootstrap.inc.
(don't set it to TRUE after bootstrap mode)
or I don't know.
Any note which will point developers to search for some additional information or to this issue that using this function with wrong arguments could cause WSODs and broke your website by caching invalid data in your database and they shouldn't call this function with $refresh = TRUE.
If none action will be taken, you will see that within few weeks many developers will have the same issue and they land here with no proper solution.

kenorb’s picture

Title: module_list() is broken in 6.x » module_list() called with wrong arguments causing WSOD and breaking theme_registry