module_list() is broken

earnie - February 15, 2008 - 13:31
Project:Drupal
Version:7.x-dev
Component:base system
Category:bug report
Priority:critical
Assigned:chx
Status:patch (code needs work)
Description

This patch supports http://drupal.org/node/196862 at #36.

AttachmentSize
module_list-HEAD.patch.txt631 bytes

#1

David_Rothstein - April 15, 2008 - 02:13
Status:patch (code needs review)» closed

Please reopen this if I'm mistaken... but as I said at the other issue, this patch doesn't seem to help.

#2

earnie - April 15, 2008 - 19:17
Status:closed» patch (code needs review)

Reopened. This patch does indeed help and is more proper than #237336: Url Aliases no longer work comment #4 which does the same thing but every call and thus eliminating the cache.

#3

earnie - April 15, 2008 - 19:29

I've included a modification to the $bootstrap default value.

AttachmentSize
module_list-HEAD.patch.txt827 bytes

#4

catch - April 15, 2008 - 20:42

I'm not convinced that drupal_lookup_path needs to be in path.inc at all. Why not just move it to path.module? (along with drupal_get_path_alias and drupal_clear_path_cache). Would need to be renamed of course.

drupal_get_normal_path needs to stay, and would need a function_exists for drupal_lookup_path (there's already a function_exists for custom_url_rewrite_inbound in there so I guess it'd be balanced if nothing else).

#5

catch - April 15, 2008 - 20:56
Title:Using module_exists in path.inc causes warnings about list not being an array» URL aliases broken

retitling to something more eye-catching.

#6

recidive - April 15, 2008 - 22:29

This change make sense, looking at the code if ($refresh || $fixed_list) seems to assume the first call to module_list() should be module_list(TRUE) which is not consistent. But $list is not an array only when it is not set, so a check for !isset($list) should suffice, and should be less expensive than !is_array($list).

I'm not sure about defaulting $bootstrap to FALSE though. If it is correct we should search the calls to module_list(FALSE, FALSE) and module_list(TRUE, FALSE) and change them to module_list(FALSE) and module_list(TRUE).

#7

earnie - April 15, 2008 - 23:11

The change for the default value of $bootstrap is needed for the !isset($list) case. However we need to check all the uses of module_list to find those that might have assumed it to be TRUE. Those would something like module_list(TRUE) with no other parameter given and would need to change to module_list(TRUE, TRUE).

#8

David_Rothstein - April 16, 2008 - 00:39
Status:patch (code needs review)» patch (code needs work)

OK, I just tried applying the patch from #3, but unfortunately it doesn't fix the problem either. URL aliases are still broken.

It's definitely a caching-related problem, but I think a more serious one than described above... As I said at the other issue, I think the fundamental problem with module_list() is this:

It does its internal caching in a rather odd way, by storing everything in the same static $list variable (regardless of what type of module list it is generating; e.g. $bootstrap = TRUE or FALSE). This means that when you call it normally (without asking it to do a refresh), it returns whatever module list it happened to generate last time, which is not necessarily the list you wanted!

This shouldn't be too hard to fix... we need an isset() check as you suggest here, but we also need to statically cache the bootstrap and non-bootstrap lists separately. I think that should be about it, right? I'll see if I have time to whip up a patch later.

#9

David_Rothstein - April 16, 2008 - 00:44

@catch, your idea in #4 sounds worthwhile in its own right (maybe a separate feature request?). But for fixing this bug, I would vote for a direct fix rather than a workaround... and there's no fundamental reason why calling module_exists() inside path.inc should be forbidden, right?

#10

David_Rothstein - April 16, 2008 - 07:00

OK, here is a patch. It fixes module_list()'s internal cache to actually work correctly... and on an existing Drupal 7 installation, everything seems OK. URL aliases work again, and I didn't notice any other problems (although I haven't tested it extensively).

The only teensie-weensie problem is that this patch totally breaks install.php (and probably update.php too, I haven't checked). It causes a "call to undefined function db_query()" error inside module_list()..... I have to track this down further, but I think the code in install.php was basically relying on the old, weird caching behavior of module_list(), in particular with regard to the $fixed_list parameter. I think $fixed_list doesn't really belong in this function anyway (it's kind of a weird parameter), and what the function really needs is a more general way to get a list of modules to use when it's called before there is an active database connection.

But I think this is on the right track. Frankly, I'm amazed that the caching problem never caused any bugs before this... module_list() really seems like it's completely broken internally! (Either that or I'm totally going crazy here ;)

AttachmentSize
fix_module_list.patch9.6 KB

#11

catch - April 16, 2008 - 09:12
Status:patch (code needs work)» patch (code needs review)

OK I took a look at splitting the path.inc stuff into a different issue but that's kinda tricky.

This from bootstrap.inc - it runs before common.inc is included or any modules loaded:

<?php
case DRUPAL_BOOTSTRAP_PATH:
      require_once
'./includes/path.inc';
     
// Initialize $_GET['q'] prior to loading modules and invoking hook_init().
     
drupal_init_path();
      break;
?>

If we follow the trail from drupal_init_path
it does this:

drupal_init_path
drupal_get_normal_path
drupal_lookup_path
elseif (module_exists('path')) << before any modules are loaded

Note I did this via api.drupal.org rather than a debugger but it appears to explain what's going on.

My inclination would be to roll back #196862 (where this issue never came up, and the final patch was committed without testing) and look at this from scratch. Judging by this I'm not at all convinced that module_list is broken, or at least not in such a way that it's causing this specific issue.

Attaching a rollback patch for #196862 and marking to needs review.

AttachmentSize
path-rollback.patch1.23 KB

#12

earnie - April 16, 2008 - 11:53
Status:patch (code needs review)» patch (code needs work)

@David: Your patch to me appears to return a module named boolean TRUE since it is possible to call module_list(TRUE, TRUE) or even module_list(FALSE, TRUE) and then somewhere else call module_list() but you've set $list[$bootstrap] when $bootstrap == TRUE and you don't have a correct list of modules. Why not use another static array named $bootstrap_list to take care of the bootstrap modules? We could even go wild and fill both $list and $bootstrap_list if $refresh is TRUE since we have knowledge of the bootstrap modules with the full list read anyway.

@catch: There is no reason to not use module_list as the indicator that the path module is enabled. You're just ignoring the bigger issue with module_list. David's patch @#10 is the correct method to correct the issue.

#13

chx - April 16, 2008 - 12:23
Status:patch (code needs work)» patch (code needs review)

The patch to review is #11. earnie, you again?

#14

catch - April 16, 2008 - 13:15

path.module is not in module_list() during that stage of the bootstrap.

Easy way to check is to add this to the top of path.module

<?php
function path_boot() {
}
?>

and url aliases work again (after a module_rebuild_cache).

#15

catch - April 16, 2008 - 14:19
Status:patch (code needs review)» duplicate

Marking as duplicate of http://drupal.org/node/196862

#16

earnie - April 16, 2008 - 16:20
Status:duplicate» patch (code needs work)

module_list is broken. The sucker needs fixed.

#17

David_Rothstein - April 16, 2008 - 17:51
Title:URL aliases broken» module_list() is broken

Indeed, I'm retitling this, since module_list() is definitely broken (regardless of whether or not fixing it is the best way to deal with the URL alias problem).

@catch, your analysis makes a whole lot of sense, but I think there is one problem. module_exists() does not actually check whether a module has been LOADED... it checks whether the module is enabled (i.e., whether it WILL BE LOADED at some point during the bootstrap process). Indeed, this is very confusing, since module_exists() calls module_list(), and the code comments for module_list() claim that it returns a list of "loaded modules"... however, if you look at the code, that's not what it's doing. It simply checks whether status=1 in the database, which is not the same thing.

Therefore, @earnie's original use of module_exists('path') should theoretically work, as it's only designed to check whether or not the path module is ENABLED -- and indeed, it does work when my patch in #10 is applied.

Basically, it seems like we have two bugs here which are sort of canceling each other out... or something like that ;) Arguably, module_exists() should be rewritten to behave the way you thought it was behaving (i.e., check whether the module has actually been loaded)... as to a large extent, that would make more sense.

The end result is that there is definitely broken code here which should be fixed. @earnie, thanks for your review in #12... I will look over those comments more carefully later when I have time.

#18

catch - April 16, 2008 - 19:26

module_exists() calls module_list(), and the code comments for module_list() claim that it returns a list of "loaded modules"... however, if you look at the code, that's not what it's doing. It simply checks whether status=1 in the database, which is not the same thing.

<?php
else {
      if (
$bootstrap) {
       
$result = db_query("SELECT name, filename, throttle FROM {system} WHERE type = 'module' AND status = 1 AND bootstrap = 1 ORDER BY weight ASC, filename ASC");
      }
      else {
       
$result = db_query("SELECT name, filename, throttle FROM {system} WHERE type = 'module' AND status = 1 ORDER BY weight ASC, filename ASC");
      }
?>

note bootstrap = 1, which is set in module_rebuild_cache, which sets it on the basis of whether a module uses hook_exit or hook_boot, as evidenced by my empty hook_boot example for path.module

So I think it's actually a documentation/naming issue. module_exists() should be module_is_loaded()

#19

Freso - April 16, 2008 - 21:36

Adding to my issue queue (catch thought I should have a look at it, so... ;)).

#20

earnie - April 16, 2008 - 23:00

@catch: module_exists returns TRUE if the module is both installed and enabled. That is what it is being used for throughout Drupal core. I don't understand the reasoning to rename it.

Module_list returns the list of enabled modules. Some of that list has special meaning in the bootstrap phases. The documentation for it should state a list of enabled modules rather than loaded to give it proper definition. Even bootstrap uses it to get a list of modules to load. It doesn't use it to get a list of modules loaded.

#21

David_Rothstein - April 16, 2008 - 23:35

Ah, okay, I think I finally have an explanation that hopefully we can all agree on:

@catch, you are right, module_exists() actually does behave more like "module_is_loaded" in this instance... but it doesn't always. It's a very fragile house of cards that allows this to occur.

For example, I can easily make module_exists() go back to its bootstrap behavior even after all modules are fully loaded. Try this:

<?php
// Full bootstrap.
require_once './includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

// This works as expected.
if (module_exists('system')) {
  print
'The system module is loaded... ';
}

// Retrieve a list of modules.
$junk = module_list(TRUE, TRUE);

// Uh oh.
if (!module_exists('system')) {
  print
'But Drupal suddenly forgot that it is!';
}
?>

Or the opposite:

<?php
// Minimal bootstrap.
require_once './includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_LATE_PAGE_CACHE);

// This works as expected.
if (!module_exists('system')) {
  print
'The system module is not loaded yet... ';
}

// Retrieve a list of modules.
$junk = module_list(TRUE, FALSE);

// Uh oh.
if (module_exists('system')) {
  print
'But Drupal suddenly thinks it is!';
}
?>

The behavior of module_exists() depends on the past state of function calls to module_list() -- and as stated above, the behavior of module_list() itself does so as well. This is double-plus-ungood.

This whole thread is a poster child for why unit tests are a good idea, I think ;) I can guarantee you that module_list() would fail any unit tests written for it right now.

#22

earnie - April 17, 2008 - 03:11

@David: That is why I advocate for module_list to have two caches, one for the full list of modules and one for the bootstrap list. If $refresh is true it sets both lists from the full query. But if $bootstrap is TRUE only that one will be returned. The $fixed_list parameter is a special case for being able to call drupal_load('module', 'system') and drupal_load('module', 'filter') from places like install.php, update.php and includes/theme_maintenance.inc. However, I can't find from the drupal_load function that it calls module_list and I think we can do away with the special case of the $fixed_list parameter and simply call the drupal_load for the two modules.

#23

David_Rothstein - April 18, 2008 - 03:23

@earnie, with regard to your comments in #12 and #22:

That is why I advocate for module_list to have two caches, one for the full list of modules and one for the bootstrap list.

I'm pretty sure that's what my patch in #10 does (?). $list[TRUE] is an array of bootstrap modules, and $list[FALSE] is an array of all modules. When you call the function you will always have $list[$bootstrap] returned, which is what you want. Your idea of using a separate $bootstrap_list variable would certainly work too (and would probably be more readable code, although at the cost of some code duplication, I think).

We could even go wild and fill both $list and $bootstrap_list if $refresh is TRUE since we have knowledge of the bootstrap modules with the full list read anyway.

That certainly seems reasonable. (Although I'm not sure if it would have a practical effect, since I suspect there is never a situation where the first call to this function during the course of a page load is with $bootstrap = FALSE?)

The $fixed_list parameter is a special case for being able to call drupal_load('module', 'system') and drupal_load('module', 'filter') from places like install.php, update.php and includes/theme_maintenance.inc. However, I can't find from the drupal_load function that it calls module_list and I think we can do away with the special case of the $fixed_list parameter and simply call the drupal_load for the two modules.

You know, you may be right. I think module_list() does get indirectly called somewhere during the early stages of install.php -- there might be a module_hook call somewhere within one of the functions called by drupal_maintenance_theme(), I think? However, I strongly suspect that those hooks are not intended to be fired during the install process... especially since system and filter are the only modules loaded then, so hooks are probably pretty pointless. So it may indeed be possible to get rid of $fixed_list, which would be very nice.

I'll have more thoughts related to the above in a moment.

#24

David_Rothstein - April 18, 2008 - 03:50

OK, I have been doing some thinking and I am now pretty convinced that there is no way to fix the problems with module_list() without also dealing with the module_exists() vs. module_is_loaded() distinction. There are just too many places in Drupal that assume that module_exists() behaves one way or the other... and all these places can (and will) break when they are reached during install.php and friends.

Thus, I've been thinking of the following totally new plan:

  • Remove all database calls out of the module_list() function, and put them in module_load_all() and bootstrap_invoke_all(), which is where they properly belong. The only time you should ever need to query the database is right when you're deciding which modules to load.
  • drupal_load() already has a pretty good internal cache; with a little reworking and a helper function this could probably be used as a general-purpose place to store the current list of loaded modules at any given time.
  • With the above, module_list() and a new module_is_loaded() become extremely simple functions... all they have to do is check the helper function's statically cached list. No need for a $bootstrap or $fixed_list parameter or anything like that... module_list() is guaranteed to always return you a list of the currently loaded modules at the instant in time at which you call it. (How to handle the sorting correctly might require a little thought, but I don't think it would be hard.)
  • module_exists(), perhaps renamed to module_is_enabled(), would still be there. Probably there would be a helper function that stores the list of modules from the database and is used both by module_load_all() and by module_exists(), so again this function would be very simple. It would also be easy to add an $enabled parameter to module_list() which defaults to FALSE but which can be used to get a list of all enabled modules rather than the currently-loaded ones.

The above might be a fair amount of work, but I think the result would be a much cleaner, clearer and simpler system that "just works" correctly and is not prone to bugs. I probably do not have time to work on this for a little while, so I wanted to just throw it out there now for comment. I have not thought through every detail of this, so if there are holes to be poked in it, please poke them ;)

#25

David_Rothstein - April 18, 2008 - 03:57

I also just came across #116820: module_list review, which seems relevant? I haven't look through it in depth, but I wonder if there are ideas there that can be merged with those here? @earnie, being the author, I guess you would probably know... ;)

#26

chx - April 18, 2008 - 19:33
Status:patch (code needs work)» patch (code needs review)

David, very nice writeup. I rely on drupal_get_filename in this patch. I believe it's one nice cleanup. Look at how special cases collapse into 1-2 function calls.

AttachmentSize
module_revamp.patch10.09 KB

#27

mmilano - April 18, 2008 - 21:07

Installed fresh drupal and applied CHX's patch in post #26.

Installed successfully.

Listed modules successfully.

Enabled profile and statistics module successfully

Added article and page nodes successfully

Statistics module appears to be working except for top visitors. May be un-related to this patch, but this module was specifically mentioned when asking how to test. Missing table 'access'.

user warning: Table 'drupal7.access' doesn't exist query: SELECT COUNT(a.uid) AS hits, a.uid, u.name, a.hostname, SUM(a.timer) AS total, ac.aid FROM accesslog a LEFT JOIN access ac ON ac.type = 'host' AND LOWER(a.hostname) LIKE (ac.mask) LEFT JOIN users u ON a.uid = u.uid GROUP BY a.hostname, a.uid, u.name, ac.aid ORDER BY hits DESC LIMIT 0, 30 in /home/mike/projects/drupal/drupal7/drupal/modules/statistics/statistics.admin.inc on line 87.

#28

chx - April 18, 2008 - 21:18

I think you just discovered a bug in statistics -- access table was used to store the access rules which is now removed. As the accesslog table is there, it's not that I broke module install...

#29

chx - April 18, 2008 - 21:27

I removed one unnecessary hunk (remnant from an earlier attempt using drupal_load) and remove one line of code: the drupal_get_filename from module_load_minimal . Heine pointed out that drupal_load will call drupal_get_filename and if the DB is down then it will fall back to file search which is fine. But, I needed to fix that with a check whether the db system is even loaded already :)

AttachmentSize
module_revamp.patch9.93 KB

#30

catch - April 18, 2008 - 21:31

Issue for statistics breakage here: http://drupal.org/node/248436 (sorry, my fault).

#31

chx - April 20, 2008 - 08:44
Status:patch (code needs review)» patch (code needs work)

drupal_get_schema will include every schema there is, even disabled modules. Stay tuned.

#32

earnie - April 21, 2008 - 13:53

@David: I've marked the #116820: module_list review as duplicate referring to this node. Some of the ideas have already been redone.

@Chx: Thank-you for your revamp patch. I'm waiting for the stay tuned and will be giving it my reviews. I'm glad people with intimate knowledge of Drupal core has given this a closer look.

#33

David_Rothstein - April 22, 2008 - 06:27

Wow, thanks @chx! This looks really great.

I too am waiting for the "stay tuned" in #31, but in the meantime I found a couple things that need work with the patch in #29. I won't try revising the patch now, since I'm waiting to hear what the changes in #31 mean for all this. But:

  • It seems that this patch causes modules with bootstrap = 1 to always appear first in the list, regardless of their weight (since they're the first ones that get inserted in the drupal_get_filename() cache). So a standard call to module_list() will not always return the modules in the correct order.
  • We probably need to switch update.php to use module_load_minimal(), like the others.
  • In module_list(), the static $stored_list variable never gets set...
  • Finally, this patch is a huge improvement over the current system, but there's still some dependencies that are a little fragile. A lot of things rely on the fact that module_fetch_all() is called within DRUPAL_BOOTSTRAP_PATH, and many functions will not work correctly if used before then. I think with a small amount of rewriting, we can find a way around this; for example, ideally there would be a call to module_fetch_all() within module_load_all(), right? I also think using drupal_get_filename() for the cache definitely makes a lot of sense... but I'm wondering, is there a way we can store information in there about the state of the module (loaded or not)? For example, something like this as the internal storage mechanism inside drupal_get_filename():
    array(
      'module_1' => FALSE, // Stored this way by module_fetch_all() to indicate an enabled module.
      'module_2' => TRUE,  // Stored this way by drupal_load() to indicate a loaded module.
      etc...
    );

    I think that keeping track of this difference would allow things to be even simpler and more flexible, with even fewer special cases to worry about... any time you wanted to know if a module was (a) enabled or (b) both enabled and loaded, you could always get exactly what you want.

#34

Dries - April 22, 2008 - 09:22

Is this something we can write tests for now? :)

#35

chx - April 23, 2008 - 18:01
Status:patch (code needs work)» patch (code needs review)

I used drupal_load instead and defer loading when that's needed.

AttachmentSize
module_revamp.patch9.53 KB

#36

chx - April 24, 2008 - 15:19

Now, path tests pass with one failure regarding alias deletion which has hardly anything to do with this patch -- before the patch, they failed on every 'alias works' test. Alas, I still a cache reset possibility -- when you remove modules, that is not picked up by this approach. I have fixed the "bootstrap modules first" problem. Also, I forgot about update.php that is fixed, too. I have not implemented the distinction between the 'fetched' and the 'loaded' states as I see no use case for it. It'd be easy if it's needed, though, one just needs to keep a separate list in drupal_load.

AttachmentSize
module_revamp.patch12.44 KB

#37

JohnAlbin - April 24, 2008 - 23:09
Status:patch (code needs review)» patch (code needs work)

First, this patch needs more php docs. This would help me figuring out what some of the params are doing. But here’s what I can determine so far…

This patch modifies/simplifies the API for drupal_load() and module_list() and adds module_load_minimal() and module_fetch_all().

1. I think module_fetch_all() needs:

<?php
/**
* Fetches a list of modules from the database.
*
* @param $bootstrap
*   Whether to fetch the reduced set of modules loaded in "bootstrap mode"
*   for cached pages. See bootstrap.inc.
* @param $force
*   Force module list to be reloaded from database.
*/
function module_fetch_all($bootstrap = FALSE, $force = FALSE) {
?>

2. drupal_load() needs its $mode documented. Here’s my stab at it:

<?php
/**
* Includes a file with the provided type and name.
*
* @param $type
*   The type of item to load (i.e. theme, theme_engine, module).
* @param $name
*   The name of the item to load.
* @param $mode
*   Whether to load the item(s) immediately, defer loading the item until
*   later, or to forget that all the items of that type were loaded.
* @return
*   TRUE if the item is loaded or has already been loaded.
*/
function drupal_load($type, $name = NULL, $mode = DRUPAL_LOAD_IMMEDIATELY) {
?>

3. drupal_web_test_case.php: there should be one extra space before // We might have removed some modules.

4. If drupal_load('module', NULL, DRUPAL_LOAD_DEFERRED) is called then all the modules are included. This seems contrary to passing in DRUPAL_LOAD_DEFERRED.

<?php
 
if (!isset($name)) {
   if (
$mode == DRUPAL_LOAD_DEFERRED) {
      foreach (
$files[$type] as $filename) {
        include_once
"./$filename";
      }
    }
?>

5. module_load_minimal() is much better than copy and pasting code 3 times into theme.maintenance.inc, install.php, and update.php. +1

6. What’s with the update to simpletest_entrypoint()? This seems unrelated to this issue.

7. In module_fetch_all(), 'AND bootstrap = 0' should be 'AND bootstrap = 1', no?

8. When running Path and System simple tests before the patch:

  Path: 76 passes, 4 fails and 0 exceptions.
    Alias works. at [modules/path/path.test line 38]
    Changed alias works. at [modules/path/path.test line 49]
    Alias works. at [modules/path/path.test line 89]
    Changed alias works. at [modules/path/path.test line 98]
  System: 85 passes, 0 fails and 0 exceptions.

After the patch:

  Path: 79 passes, 1 fails and 0 exceptions.
    Alias was successfully deleted. at [modules/path/path.test line 120]
  System: 82 passes, 3 fails and 0 exceptions.
    Module "aggregator" is enabled. at [modules/system/system.test line 43]
    Module "translation" is enabled. at [modules/system/system.test line 88]
    Module "locale" is enabled. at [modules/system/system.test line 88]

#38

chx - April 25, 2008 - 22:33
Status:patch (code needs work)» patch (code needs review)

Thanks for the through review, the documentation and drawing my attention to the fact that system.test has a lurking module_list reset. And that's necessary here as well because some other Drupal have changed the module list and we need to resync. In normal operations, such resets will be much less necessary than they were before. I simplified how drupal_load works with deferred stuff and I have documented everything stressing that all this new stuff to drupal_load is not something people should do. Finally, I have enrolled http://drupal.org/node/211439 into this patch. system test pass and path has a single, unrelated failure.

AttachmentSize
module_revamp.patch15.81 KB

#39

justinrandell - April 27, 2008 - 10:39

subscribing. will review tomorrow.

#40

chx - April 27, 2008 - 18:03

note that the aforementioned single unrelated path test failure is fixed in http://drupal.org/node/251631

#41

chx - April 28, 2008 - 05:00

I found two unrelated hunks in the patch. Removed. No other change.

AttachmentSize
module_revamp.patch14.33 KB

#42

David_Rothstein - April 28, 2008 - 07:15

I reviewed and tested this patch. (Sorry, I had intended to do this earlier!). Overall looks very good, and everything seems to basically work.

I'm attaching a new version which includes a few changes to the documentation (no code changes, though, so it doesn't matter if anyone else started testing the earlier version):

  • I fixed some errors and added some comments in the module_list(), module_fetch_all(), and drupal_load() documentation... the most important of these probably being that the documentation for the $sort parameter in module_list() was no longer correct.
  • This patch seems to have removed a random code comment about the aggregator module from modules/system/system.test, so I put that back in.

Now some comments/suggestions about the code:

  • The comment in DRUPAL_BOOTSTRAP_PATH implies that the call to module_fetch_all() is needed only for path aliases to work, but actually, it's needed for, like, all of Drupal to work... try commenting out that line and you'll see what I mean ;) The reason is that other functions, in particular module_load_all(), rely on module_fetch_all() having been called earlier. I think we need to fix this so these functions become more robust (especially since that call to module_fetch_all in the bootstrap can maybe be removed at some point). Since module_fetch_all() is cached, there is no harm in doing this. So I recommend that we rewrite module_load_all() as:
    <?php
    function module_load_all($bootstrap = FALSE) {
     
    module_fetch_all($bootstrap);
     
    drupal_load();
    }
    ?>

    and bootstrap_invoke_all() as:
    <?php
    function bootstrap_invoke_all($hook) {
     
    module_load_all(TRUE);
     
    module_invoke_all($hook);
    }
    ?>

    Which looks a lot cleaner anyway.
  • I think the cache for the sorted list in module_list() is not working correctly. Maybe that part of the function should be rewritten like this?:
    <?php
    if ($sort) {
     
    // Only do the sorting if the list has changed since the last time we sorted.
     
    if ($stored_list != $list) {
       
    $stored_list = $list;
       
    ksort($list);
       
    $sorted_list = $list;
      }
      else {
       
    $list = $sorted_list;
      }
    }
    ?>

    and then we need static $stored_list = array(), $sorted_list = array(); at the top of the function too. On the other hand, I'm not sure this cache is even necessary?! I can only find one place in core where module_list() is actually called with $sort = TRUE (in user.admin.inc).
  • When module_enable() is called, I believe the newly enabled module always gets put at the end of the list in drupal_load(), so that for the rest of that page load module_list() doesn't return things correctly ordered by weight. I'm not sure how much this matters (?), but we might want to consider using a module_fetch_all() with force refresh in there, just to be safe.
  • I'm pretty sure we can get rid of DRUPAL_LOAD_NONE entirely. In module_list(), can't $list = drupal_load('module', NULL, DRUPAL_LOAD_NONE); just become $list = drupal_load('module');?
  • Should a call to drupal_load() with DRUPAL_LOAD_RESET reset the $deferred array too? (I think that would probably be a good idea, but I'm not sure.)
  • If I'm understanding things correctly, then the $files cache in drupal_load() -- and by extension the return values of things like module_list() and module_exists() -- actually involves a list of all fetched modules, rather than loaded ones? This seems to be because the line $files[$type][$name] = $filename; near the bottom of drupal_load() is called regardless of whether the file is being loaded immediately or deferred. If this is correct, we should probably document that better, since I found it pretty confusing when reading through the code! It also probably affects the documentation of module_load() and module_exists()... we probably should explain more carefully in those functions exactly what is being returned. I think this is somewhat important, since this seems to be the whole reason that this patch actually allows path aliases to work again.
  • While it would be nice to get rid of the call to module_fetch_all() in DRUPAL_BOOTSTRAP_PATH, since it's basically a hack to let module_exists() work within path.inc before all modules have been loaded, I can't think of a great way to do it. As @chx and I discussed in IRC a couple days ago, although we could do this either by creating separate "module_is_loaded" and "module_is_enabled" functions as I suggested above (or alternatively, by adding an extra parameter to module_exists), it's probably not worth changing the API of these public functions just for this special case. Doing so would have the slight benefit of allowing "module_is_enabled" to work within bootstrap hooks; however, @chx pointed out that if you really need that information during the bootstrap phase, the best way to get it is probably just to query the system table anyway, and I tend to agree with this point.
AttachmentSize
module_revamp_5.patch16.01 KB

#43

David_Rothstein - April 28, 2008 - 07:32

It also occurs to me that since drupal_load() is kind of a crazy function with all sorts of parameter combinations intended for internal use only, it might be good to take that code and put it in a private function, with a few public wrapper functions intended for calling it in its various ways. That would make the code more readable and make it harder for modules that might be using this function to make any mistakes.

I don't think that necessarily needs to be part of the current patch, though.

#44

Dries - April 28, 2008 - 09:00

Excellent review, David. I had a look at the patch and I have some additional suggestions.

1. In the code comment of DRUPAL_LOAD_DEFERRED, I would explain what it means to load a module later, and when that should (or shouldn't) be used. The current code comment doesn't answer these questions and I'm left wondering why I'd want to defer loading a module ... *confused*.

2. The explanation should probably be repeated in drupal_load()'s comments.

3. +  $list = drupal_load('module', NULL, DRUPAL_LOAD_NONE); looks somewhat "ugly" -- it wasn't self-explanatory and could use a code comment (if not a more intuitive API).

4. I wonder if we could write a couple of Simpletests that use (aggressive) page caching, and that use the new API to check the number of loaded modules. Let's set the example and start writing tests.

It looks like this patch is getting close to be RTBC.

#45

David_Rothstein - May 1, 2008 - 04:55

Here is some possible text for Dries' first two suggestions (it's not so easy to explain, I guess... I struggled figuring out how to write this):

<?php
/**
* Mark the file to be loaded later on in the bootstrap process.
*
* This is used when information about enabled modules needs to be
* retrieved from the database before the module code itself is actually
* required.  This method prevents unnecessary loading of files if the
* bootstrap process is cut short.
*/
define('DRUPAL_LOAD_DEFERRED', 2);
?>

And then in the drupal_load() comments:

* @param $mode
*   Whether to load the file immediately (DRUPAL_LOAD_IMMEDIATELY, the
*   default), mark the file to be loaded later on in the bootstrap process
*   (DRUPAL_LOAD_DEFERRED) or just reset the internal cache of files
*   (DRUPAL_LOAD_RESET). These are for internal use only; do not specify
*   this argument.  DRUPAL_LOAD_DEFERRED is used when information about
*   enabled modules needs to be retrieved from the database before the
*   module code itself is actually required; this prevents unnecessary
*   loading of files if the bootstrap process is cut short.

#46

chx - May 1, 2008 - 15:49

I think we should really postpone this past the registry. As we were wroking on that , all this deferring problem is solved: we will not load modules, no need.

#47

chx - May 6, 2008 - 14:36
Assigned to:Anonymous» chx
Status:patch (code needs review)» patch (code needs work)

So, here we are, the registry is in! Stay tuned, I will post more today.

#48

chx - May 16, 2008 - 20:23
Status:patch (code needs work)» duplicate

I have written http://drupal.org/node/259412 which as a side effect fixes this too :)

#49

catch - August 20, 2008 - 09:33
Status:duplicate» patch (code needs work)

#259412: Total module system revamp is stalled and this issue is blocking dbtng now.

#50

Crell - August 20, 2008 - 09:36

I was just about to post a new issue for this, but here's what I was going to say:

Currently, module_list() caches its results based on its parameters. However, those parameters can change, and just calling module_list() on its own will return NULL if it hasn't already been "primed", since the default is to not refresh. However, who knows how it's been primed before you wanted it, or even if? That means you need to either pray or always tell it to refresh, which is nasty for performance. It also has no "I'm empty so rebuild" validation, so if you call it early in the page request you stand a very good chance of getting back NULL.

module_list() needs to be totally rewritten, and probably broken into 2 separate functions: module_list() and module_list_bootstrap(). Both should self-initialize and cache the first time they're called, as well as accept a $reset parameter. That would make them safe to call. Trying to throw it all into one function is a horrid idea that causes all sorts of problems for, say, DBTNG.

I really think the fix is to just trash module_list() and make 2 functions. That will simplify the code here and in various other places, too.

 
 

Drupal is a registered trademark of Dries Buytaert.