As Drupal grows, more and more code that is not actually needed gets parsed on every request.
Installing more and more modules also presents serious scalability issues, since every enabled module is always included for each non-cached page.
This is a first attempt to keep track of *all* module hooks and to load modules only when they're really needed. I guess it may need some polishing, but the idea is simple enough.
Currently, since hook_menu is implemented by most modules and is invoked on almost every request, the potential performance improvement from this mechanism may be small.
But the thing is, once we have some system for on-demand module loading, hooks can be reworked in the future to get a real performance boost. For example, hook_menu could easily be split in two, 'hook_menu' and 'hook_menu_dynamic', thus really reducing the number of actually loaded modules.
As I said, this is a first step. If there's some interest in this kind of feature, I have some other things in the works aimed at performance and scalability, like:
- (Simple) Rework of menu system to take real advantage of on demand module loading
- Expand menu items to be able to specify the file that contains the callback
- Extend module loading for 'loading on path', and maybe 'loading split modules'
- Some API loader, kind of api_invoke....
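To make the idea concrete, here is a minimal standalone sketch of the mechanism (purely illustrative: the function names, registry contents and dummy modules are mine, not the actual patch). A registry maps each hook to the modules implementing it, so a module only gets loaded when one of its hooks actually fires on the current request:

```php
<?php
// Hypothetical sketch, not the actual patch: a hook registry that lets
// module_invoke_all()-style dispatch load modules on demand.

// Built once when modules are enabled/disabled, then cached.
$implementations = [
  'menu' => ['node', 'aggregator'],
  'cron' => ['aggregator'],
];

$loaded = [];

function lazy_invoke_all($hook) {
  global $implementations, $loaded;
  $results = [];
  foreach ($implementations[$hook] ?? [] as $module) {
    if (empty($loaded[$module])) {
      // In Drupal this would be: include_once "modules/$module.module";
      $loaded[$module] = TRUE;
    }
    $function = $module . '_' . $hook;  // Drupal's hook naming convention
    if (function_exists($function)) {
      $results[] = $function();
    }
  }
  return $results;
}

// Dummy functions standing in for real module code.
function node_menu() { return 'node menu items'; }
function aggregator_menu() { return 'aggregator menu items'; }
function aggregator_cron() { return 'aggregator cron ran'; }

print implode("\n", lazy_invoke_all('menu')) . "\n";
```

A module that implements no hook fired on this page is never included at all, which is where the parsing and memory savings would come from.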
| Comment | File | Size | Author |
|---|---|---|---|
| | on_demand_module_loading.patch | 7.8 KB | jose reyero |
Comments
Comment #1
chx commented

This will never be really efficient. If you have a block from aggregator, then aggregator needs to be loaded, parsed and stored even though most of its functionality is never used. Quite a lot of modules play a small part in most pages.
Alas, my split mode development is halted a bit, but I'll revive it. My problem is that drupal_eval needs on-the-fly tokenizing and wrapping...
Comment #2
eldarin commented

Interesting issue and approach; I'm also looking into reducing processing load and speeding things up.
I have completed a first test version of a template system where I can keep the generated content without the theme template "XHTML framework". I am now looking into how to cache partially parsed pages, with parts such as user-specific information always being parsed fresh.
It's kind of complicated to know when to invalidate cached content - perhaps some simple rules could do, but I've yet to discover which.
In my opinion the biggest gain can be found by reducing whatever processing is needed to deliver XHTML to users - but not all of it goes into serving up the pages; a lot is also logging, AAA, etc.
I think a combination of successful caching and reducing module loading can be the best overall solution.
For security I would also like to try to differentiate db users - I don't like the idea of a central "db-root" account being used for all access. An encrypted password like /etc/passwd, perhaps with client-side browser encryption, could serve - e.g. IDEA, 3DES or Rijndael. That would effectively get rid of most of the hacking/defacing of web sites - i.e. keep the db intact from hackers.
Comment #3
chx commented

OK, to make this clear: I have been working since March (with pauses) on something called split mode. This splits Drupal into a gazillion files -- one function, one include file -- and loads them on demand. The speedup and the memory savings are enormous (40-50%). What code I have has been in my sandbox since May.
I think it'd be great to have this in 4.7 as a possibility. I hoped that the install system would come along, but as I do not have it, I am a bit reluctant -- the problem is that if you update a module, it won't work until you resplit, which can take several seconds.
However, I think most sites do not update their modules too often, so this is still an avenue worth pursuing.
Also, as most functions (most == there are problems with references) are wrapped in a c() call, it'd be possible (later?) to introduce a mechanism which could override any function. Sometimes this comes up...
Comment #4
jose reyero commented

eldarin,
Yes, as I've said, this is only one of the many things we can do to speed things up, so the "final solution" could be a combination of on-demand loading, splitting modules, improving the cache... However, all of these can be approached as separate features and patches.
chx,
I've also tried your 'split' thing, but once I had all those small files, I couldn't apply the patch, so I don't really know what to do with it... Anyway, why don't we start another thread for that, as it is quite a different approach?
Comment #5
moshe weitzman commented

This approach is worth exploring... You have an array which defines 'all hooks used by modules'. Perhaps that's possible in core, but non-core modules define hooks too (e.g. syndication, mailhandler, ...). They need a way to register with the module system.
Comment #6
Crell commented

Correct me if I'm wrong, but as I understand it the disk hit involved in loading a file is a bigger performance drain than parsing said file, unless the file is very complicated. Unless you have a RAM disk or a PHP accelerator (which does RAM caching), wouldn't the trade-off of hitting the disk 50 times instead of 10 be a net negative?
Comment #7
jose reyero commented

moshe,
Yes, you're right, I really hadn't thought of that... But I had foreseen some other similar issues with modules implementing hooks in included files...
But I think this could be handled with some module_info function in which each module returns information about which hooks it implements. This 'module_info' hook has also been mentioned in another thread about returning version information, and it could be used for dependencies between modules too, etc...
A different approach would be for modules to provide information about which new hooks they introduce, so the module system could search for those new hooks in all the modules.
I think I'll go for the first one -which may be needed only for non-core hooks-, maybe also using that function for information about which paths the module has to be loaded on, thus solving the problem with dynamic menu items too.
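That 'module_info' idea could look something like the following sketch (entirely illustrative: neither the function name, the array keys, nor build_hook_registry() is a real Drupal API; they just show the shape of the proposal):

```php
<?php
// Hypothetical sketch of the proposed 'module_info' hook: each module
// declares which hooks it implements and on which paths it must load.
function aggregator_module_info() {
  return [
    'hooks' => ['menu', 'cron', 'block'],
    'paths' => ['aggregator', 'aggregator/*'],  // load only on these paths
  ];
}

// The module system could collect these once, after enabling modules,
// to build the hook => modules registry used for on-demand loading.
function build_hook_registry(array $modules) {
  $registry = [];
  foreach ($modules as $module) {
    $info_function = $module . '_module_info';
    if (function_exists($info_function)) {
      foreach ($info_function()['hooks'] as $hook) {
        $registry[$hook][] = $module;
      }
    }
  }
  return $registry;
}

print_r(build_hook_registry(['aggregator']));
```

The 'paths' entry is what would also cover the dynamic menu item case: a module could be skipped entirely on requests outside its declared paths.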
lgarfiel,
I'm not sure about those performance figures, but if you're right, that's one more reason to rework module loading; I think this approach actually means fewer disk hits than the current system.
Besides that, parsing more PHP has a significant impact on memory use, and that eventually means performance too.
Comment #8
moshe weitzman commented

I'm not too fond of module developers having to declare all the hooks they are using. The system should be smart enough to handle that. How about we implement a system_cron() function which loads all modules and tracks the hooks that they employ? That way, only cron has to load everything, and regular user requests can use lazy loading as you've suggested.
Comment #9
jose reyero commented

Well, we have these two options:
- modules declaring implemented hooks
- modules declaring introduced new hooks, this would affect only a few modules, like mailhandler, which create their own hooks
Maybe the second could be easier: having, e.g., system.module declare all core hooks, the list can then be built dynamically.
May sound funny, but how about a new 'hook_define_hooks' hook?
This list only needs to be rebuilt after enabling/disabling modules, similar to how the bootstrap hooks are managed, saving the need for that cron call. The bootstrap hook thing was a big step forward, so why not do it for all hooks?
This patch is only some proof of concept, showing this is not that complex.
Comment #10
Uwe Hermann commented

Patch doesn't apply anymore.
Comment #11
Jaza commented

Moving to 6.x-dev queue. This is definitely an issue that still needs to be looked into. Feel free to close if there are other issues looking at the same problem that are more active (I suspect there are, but I'm not familiar with them).
Comment #12
Crell commented

I believe this has been superseded by this issue. Please comment there if you still support reducing the bootstrap overhead in Drupal. :-) (The benchmarks in that issue show it is definitely worth pursuing.)