Closed (fixed)
Project:
Drupal core
Version:
7.x-dev
Component:
base system
Priority:
Critical
Category:
Task
Assigned:
Issue tags:
Reporter:
Created:
26 Sep 2009 at 15:48 UTC
Updated:
3 Jan 2014 at 00:29 UTC
Jump to comment: Most recent, Most recent file
Comments
Comment #1
Crell commentedAnd here's a version without the stupid tabs that Eclipse keeps trying to add, even though I keep telling it not to. ARG!
Comment #2
sunA reference to #557542: Cache module_implements() would be nice to understand the purpose of this patch. ;)
Comment #3
robloachVery much in approval of this. Before setting it to RTBC, is there any place where we could use module_include_all? I believe the Token API uses modulename.token.inc.....
Comment #4
donquixote commented+1
If we introduce a cache for module_implements(), we need to make sure that module_include_all() is called for all possible groups, before any call to function_exists() or get_defined_functions().
One solution is to let module_implements() take the $group as a parameter, call module_include_all, and then do the function_exists() stuff to build its own cache.
If we want to scan for hook implementations all at once, this would not be possible, because we don't have a list of hook and group names available.
What about a bit of caching to remember which groups have already been included?
Comment #5
Crell commenteddonquixote: Have a look at
#557542-26: Cache module_implements()
#557542-32: Cache module_implements()
We've already discussed the group stuff and adding it to module_implements(). What you describe is pretty much what I wrote back in Paris.
catch, merlin, and I were just discussing this in IRC, and our new plan here, Dries permitting, is:
1) Introduce hook_hook_info(): this hook lets you define that you expose a hook, and what it's group is. Right now it does nothing else but it could very easily be extended to do other stuff later. It is entirely optional.
2) hook_hook_info() gets cached. Easy operation.
3) When module_implements() is called, we automatically check for a group. If a group is defined, we include that set of $module.$group.inc files and then do the function_exists(). That information gets cached so that next time we know what to lazy load.
4) Because everything gets cached, the above becomes quite fast. And because we still function_exists() things and it's just a lookup array, it won't break your site and can be reset by any cache clear operation so stale data should be a non-issue.
5) Because the entire system is optional, there are 0 API changes, only one API addition (the new hook). That makes it a very easy addition.
We are still trying to convince Dries to let us write this. :-)
Comment #6
donquixote commentedI'm afraid we can get some inconsistencies here, if module developers don't play as they are supposed to.
hook_hook_info()
The hook_hook_info() concept suggests that each hook is owned by exactly one module - similar to a permission, or a menu entry. I'm not sure if this is a reasonable way of thinking, at least it does not reflect how hooks used to work in D6:
hook_hook_info will change the hook concept a lot, because it lets a hooks be 'owned' by one module.
Technical side effects of hook_hook_info():
This effect can be confusing.
In fact it can sometimes be useful, because it allows modules to conditionally include hook implementation files based on which other modules are defined.
But this mechanic is not very transparent, and can cause some confusing side effects if you use an older version of module_a with module_a_hook_hook_info() with a newer version of module_b with a "module_b.groupname.inc" file.
$group as an argument to module_implements()
The other solution is to pass the $group name as an argument in module_invoke_all() and module_implements(). Problems:
Example for the possible inconsistencies:
vs
Combination of hook_hook_info and $group argument in module_implements()
I don't know if you see the two things as alternatives or if you want to combine them.
Combining them does not really solve the problem, instead you have both problems at once.
The way out?
Possible solutions I see:
Comment #7
donquixote commentedThis was actually a separate point..
(this implementation does also allow to include the group for only one module.)
Comment #8
catchWhile I see your point, hook names are namespaced by module - so while any module can invoke any other module's hook (or core's), they should know who the hook is originally 'owned' by in the first place. The only situation where that namespacing breaks down is with hook_/foo/_bar_load() vs. hook_/foo_bar/_load() or if a module doesn't namespace it's own hooks but that's already a problem.
Marking CNW since the proposed approach is different to the current patch now.
Comment #9
mattyoung commented.
Comment #10
donquixote commented@catch:
I don't know of any perm.module for hook_perm. I know of a menu.module, but it has little to do with hook_menu - the hook_menu stuff would work quite well with menu.module disabled. I don't know of any view.module for hook_view (not to confuse with views). I don't know of any nodeapi.module for hook_nodeapi. Well, this is all core stuff.
There are some module-namespaced hooks such as hook_uc_message or hook_views_plugins, but I don't know if this can be made a rule. And moreover, what do you want to do if the "hook owner" module is switched off? Do you want to prevent other modules from using this hook? Or do you only want to prevent the lazy inclusion?
Comment #11
donquixote commentedFor the group names, another possibility would be to prefix hooks by group name.
It would make a lot of sense to have
- hook_build_menu() instead of hook_menu()
- hook_build_perm() instead of hook_perm()
- hook_build_menu_alter() instead of hook_menu_alter()
- hook_build_block($op) instead of hook_block($op) (you would still have hook_block for actually rendering the blocks).
All of these are only needed occasionally when some registry or cache needs to be rebuilt, so it would be fine to put them in a modulename.build.inc file.
To make the transition easier, you would still invoke hook_menu and hook_perm, and then you would add the information from hook_build_menu and hook_build_perm.
Well, even if you disagree with the prefix thing, I think modulename.build.inc is not such a bad idea!
Comment #12
catchIt's now hook_permissions() - and it's owned by /includes rather than a module so doesn't necessarily follow the same rules.
menu.inc, but same as above.
Old cruft, IMO.
What about node module for hook_node_$op - http://api.drupal.org/api/function/hook_node_load/7 hook_nodeapi doesn't exist in D7.
Things may not be perfect in HEAD, but it's a lot more standardised than D6. Either way, two modules trying to /define/ the same hook, rather than just invoking it, would be a bug, and a very rare edge-casey bug at that.
Comment #13
donquixote commentedOk, but again, what happens if you disable the definition of a hook, while there are still other modules calling and implementing it? Is this simply forbidden? Or do you want to introduce some rules to deal with it?
Comment #14
chx commentedyesterday it was raised on IRC not sure for this patch but I presume this (or was it the m_i cache?) whether APC caches all functions or those that are executed.
that settles it.
Comment #15
Crell commenteddonquixote: Excluding core hooks like hook_menu, hook_permissions, and so forth that would be owned by system.module which cannot be disabled, do you actually have a use case where you'd actually want to include a hook defined by a module that is disabled? I cannot think of one, and therefore see no reason to block this patch for it.
Even if we can find one, if you know what file pattern it's going to be in so the function in the first comment in this thread would quite easily allow for that edge case should it ever actually happen.
Comment #16
Crell commentedAnd here is an implementation of what is described in #5 above. The patch itself is large, but only because I am moving system_theme() and system_menu() to an include file for demonstration purposes, and to save about 450 lines of code.
As a nice side effect, we get rid of the need to do a cast to force a copy on the implementations array.
Comment #17
catchDiscussed this a lot with Crell in irc. This approach should have a minimal impact on sites with opcode caches, and doesn't add any new requirements for module authors at all. The only issue is whether we can agree on a $module.registry.inc pattern in core to hold menu and theme hooks at all.
Comment #19
Crell commentedI have a theory that the issue is, once again, system_menu being a special case due to the installer. Let's try this again, this time only moving node_theme(). It works for me on CLI and GUI.
Comment #20
sunI really love this patch and entire approach. Especially, it will be tremendously helpful for contrib modules, which very very very often need to implement support for optional third-party contrib modules.
But it is still entirely optional - you can simply leverage it if you want to. And for a very large amount of code it makes sense, due to the clarification on opcode caches provided in #14.
We have some minor issues though:
- Wrong indentation in the API docs for hook_hook_info() as well as system_hook_info().
- Missing CVS Id and @file directive in the new files.
- Since $module.tokens.inc already uses this scheme, we should add this group already, which makes up another perfect example (because there are already tests).
Comment #21
moshe weitzman commentedI am quite happy with this too. Nice work.
Comment #22
moshe weitzman commentedThis patch addresses sun's points.
It also exposes a flaw in module_hook_info(). It is caching its results even when called by simpletest during system_install(). The system_install() is called during drupal_install_system(). system_install() leads to a drupal_alter() which kicks in the module_hook_info(). Since no modules are installed, we cache am empty array and we get test failures. I'm hoping someone can sort this out.
To reproduce the problem, just try to get token replacement test (in System group) to pass with this patch.
CNR just to see what bot says.
Comment #23
moshe weitzman commented#markup processing is back to #pre_render. It was a valiant attempt. It has to in place early since #theme_wrappers needs to wrap the markup.
Comment #24
moshe weitzman commentedwoops. last patch was not meant to be posted here.
Comment #25
Crell commentedNice! Converting token, too, lets us remove needless cruft code. Contrib modules can very easily do the same if they have some one-off solution now. Check plus!
Regarding Moshe's bug in #22, apparently the bot can't find it. :-) I didn't run into it either in my testing. Moshe, are you sure the cache isn't getting cleared already somewhere? It seems good to go...
Comment #26
robloachI understand it would make a rather large patch, but if we move
node_menu()alongsidenode_theme()in our new node.registry.inc, I'd be so so so so so happy to set this to RTBC! :-) ...... Bonus points for any others that we manage to move in along with this patch. Either that or stick an ugly @todo in there ;-) .Comment #27
moshe weitzman commentedMoved node_menu() to node.registry.inc.
Lets chalk up that bug I saw to local halloween gremlins. Bots happy.
Comment #28
moshe weitzman commentedAdded capital R per tha_sun.
Comment #30
robloachHaha, mixing up the patches :-) . In order to get things back on track, here's the patch at #22 with the node_menu() move to node.registry.inc, and without the whitespace changes. We can move others post-issue. Let's see how the bot likes it.
Comment #32
bcn commentedSame procedure as Rob tried in #30... This is the patch from #22, with node_menu() moved also...
Comment #34
sunTests are failing, because the cached hook_info is never reset.
This patch should fix that.
Comment #35
sunThis looks RTBC to me.
Summary:
Comment #36
robloachBam!
Comment #37
Crell commentedYay!
Also, to clarify, in the degenerate case of nothing ever implementing this functionality the added cost is one single-value if statement. That's it.
The benefit to non-APC sites is directly proportional to the amount of rarely used code we can relocate out of .module files, plus a simplified API for contrib modules to make it easier to provide optional support for other modules.
For APC-using sites, performance based on previous benchmarks should be a wash either way. They still get the benefit of the improved API.
I don't think we could get a better cost/benefit ratio than this, and it breaks not one single existing API. :-)
Comment #39
alexanderpas commentedjust a quick question, does this enable us to seperate our (small) (hook_)boot code from our large modules?
Comment #40
moshe weitzman commentedalexanderpas - no, it doesn't do that. when hook boot fires, it loads the .module so there is no point to moving hook_boot code elsewhere. this helps by making the .module smaller.
Comment #41
Crell commentedArguably that would be the next logical step, since we shouldn't HAVE to load the .module file in those cases, but that's for a follow-up. Would be nice if we could pull it off, though, but I don't know if that's freeze-safe.
Comment #42
alexanderpas commented@Crell
I bet we could put that under the Performance exception.
Comment #43
dries commentedI'm not a big fan of the name 'node.registry.inc'. The word 'registry' has zero meaning to me when trying to look for a function. I can't think of a better name though.
As I said before, I'm not a fan of splitting the core files in smaller pieces as I think it makes for a DX regression. I guess people can use it in contrib if they want.
Comment #44
donquixote commentedSorry, but.. "DX" ?? (Ok, found it, "Developer Experience")
I am personally a fan of separate files for stuff that is not needed in a typical request, and that logically belongs together. Typically all "rebuild" stuff (menu_rebuild, theme registry etc) could very well live in separate files, for my taste.
Also makes it easier to apply concurrent patches that otherwise would mess up line numbers.
What about "modulename.build.inc" or "modulename.rebuild.inc" or "modulename.definitions.inc" ?
Comment #45
Crell commentedThe name "registry" is what we've been informally using for the past year for "registry-style hooks", like menu, theme, etc. Info hooks is another common name for them, but $module.info.inc is even more confusing. When discussing this with webchick in IRC a few months ago (back when we still had the function registry), we couldn't come up with a better name either so suck with "registry". I'm not tied to it, but bikeshedding on the file name is the last thing we want to do right now. :-) donquixote suggested some possibilities; I can live with .build.inc if you can, Dries. Make a call and we can reroll the patch.
As #44 points out, there are DX benefits to a logical splitting of files. 5000 line files can be quite hard to manage, performance questions aside. We can bikeshed exactly how to group hooks for best performance later, once the implementation is in place, as those are not API changes. (We did the same for the page split in D6.)
Comment #46
sunI don't have a better suggestion than $module.registry.inc or $module.info.inc either. It's always a pain to have to scroll over these large hunks of arrays when you actually search something else in a module, so I'd really love if I could consult a single $module.registry.inc instead when needed.
Can we re-roll with the API change only and discuss the eventual code re-organization/movement in core later on?
Comment #47
moshe weitzman commentedI'm even willing to get of registry group to get this in. We shoudn't, but Contrib will use it even if core can't decide on a name for info hooks ... We are proposing only 1 new file for core.
Comment #48
mattyoung commented>I'm not a big fan of the name 'node.registry.inc'. The word 'registry' has zero meaning to me
Agree. The word registry is overloaded with meanings and there are way too many different kinds registry in Drupal for this to make any sense just the word 'registry' by itself.
I did not like that 'registry' name before it was killed so I'm glad it's done way.
Please use a more specific word.
Comment #49
alexanderpas commentedI think registry is appropriate, as we're talking about something similar to a "Metadata registry" here (not to mention "The Other" registry, see also: http://en.wikipedia.org/wiki/Registry)
Comment #50
sunLet's get this in first and continue this discussion afterwards.
Comment #51
Crell commentedFor those who don't feel like reading the patch, #50 is the core change necessary to make this work plus porting token to this system but no other changes. We can then bikeshed registry vs. build vs. whatever afterward since it's not an API change. I am +1 on that. (I was about to roll such a patch if sun didn't beat me to it. :-) )
Comment #52
catchThe diffstat on #50 is 4 files changed, 96 insertions(+), 28 deletions(-) - including the new documentation. IMO that's a no brainer just to get token in.
The biggest use-case here is precisely things like token support, bot.module support etc. - code which is used very rarely or not at all on some sites. Even if we don't care about the memory savings on non-opcode sites (I care a little bit but not too much), then having a nice way to divide up inter-module includes like this makes a lot of sense, and already replaces some cruft in core with a nice API. So this one is very RTBC, and we can argue about core splits later.
Comment #53
dries commentedAlrighty, committed.
Comment #55
sunI tinkered really long about how to proceed from here.
I think I found the answer that should be compelling for everyone:
#624324: Do not load support code for optional core modules
This lazy-loading was mainly targeted for contributed modules. Optional core modules have to be treated identically to contributed modules.
Comment #56
chx commentedI am so glad in less a month we have properly discussed and got this in! Really glad. Especially that with commit messages messed up it takes a PI to find this issue.
This is one of these issues that will get a DrupalWTF sign two days after release. We spent years discussing whether we want to explicitly register hooks or not (hint: no!) and all of a sudden we decided yeah this one-off is a good idea? Unbelievable... I will try to come up with something useable and grokkable quick before alpha.
Comment #57
chx commentedOh I have the first real problem here. Modules implementing hooks on behalf of other modules won't work because
module_load_include('inc', $module, "$module.$group");loads the include strictly from the$moduledir. That's a proper bug report... to begin with.Comment #58
chx commentedSay someone debugs her site and you need to explain what's a hook and where she can find 'em. Before patch? "A hook is just a function named the module name, the underscore and the hook name and it lives in the .module file." After patch? "A hook is just a function named the module name, the underscore and the hook name and it lives in the .module file unless ... unless of course the hook is defined in another hook to belong to a group where the group just means the part of the filename before .inc."
Second, there are tons and tons of hooks not implemented by core but fired by core. Let's say I do a huge, huge form_id specific form_alter so I find it a good idea to move it into an include file. This works! Yay. If I am unlucky enough that another module wants to implement the same hook , now there is a problem... Edit: if it wants to use another include file, that is.
Edit: we say we do not babysit broken code and the latter is such broken code. But tt works perfectly and saves memory until (much) later said another module comes. That's a like a hidden bomb not broken code per se.
Comment #59
chx commentedAaaaaaaand a solution!
Scrap hook_hook_info. Use $module.$hook.inc. That's easy to explain. It's impossible to ambush another modules either because the include file is fixed. We still need to implement this carefully to avoid the problem of implementing on behalf of other modules. Likely first iterate over the modules and see if there are any $hook.inc files and after including them all, now run over the modules and check for function_exists.
Comment #60
jhodgdonAlso, see #675136: Document how/why hook_hook_info() is totally different in D6 and D7
Comment #61
chx commentedAnd that sucks before sun said "A rather simple module like admin_menu exposes 3-4 hooks from which an integrating module will have to implement 2-3." so grouping is essential. We could group by module name, ie hook_admin_menu_foo_bar lives in foo_bar.
Do we have the problem with views_attach and views modules when trying to figure out where does hook_views_attach_foo belong? Yes, but if you have views module which has a hook_views_attach_foo then you already should not have a views_attach module so I do not think this is a terrible issue.
How's the explanation then? "The hook can live in .module or if foo module provided the foo_bar hook then in bar.inc". Not sure whether we win too much then :(
Comment #62
jhodgdonSo...
Am I correct in reading the above comments that your idea is:
a) Get rid of hook_hook_info() entirely.
b) For any hook_xyz(), allow module mymodule to either put function mymodule_xyz() in file mymodule.module or mymodule.xyz.inc.
c) Have module_implements() and module_invoke() and node_invoke() and all the other such functions, including the ad-hoc invoke routines that are scattered around and through Drupal (the ones that made tracking down #675046: Make sure all hooks in D7 have documentation so difficult) follow this convention.
And what about hook_xyz_alter() implementations, called from drupal_alter()? I'm assuming they should be in mymodule_xyz() too?
And are you saying that there should be "groups" of hooks somehow that can go together into particular files? How would those be defined -- didn't udnerstand what you were saying on that? That was the purpose of the hook_hook_info() thing... either have it or don't, but don't do something crazy based on the first component of the hook name etc. Please. !!
Comment #63
jhodgdonOh, forgot to say: I think this idea (if I have understood the idea correctly) is not yet fully defined and will be difficult to implement...
Comment #64
donquixote commented[off-topic]
This is yet one more of those issues where people don't have time to read all associated discussions, and even if they do, won't remember everything. And then we get sub-optimal decisions.
The same if I want to make sense of chx points above and find related posts, I need to read up a lot of stuff.
Wouldn't it be possible to have some kind of Wiki, where we can summarize the different solutions and their pros and cons, with links to the arguments, cleaned up of the usual noise?
I have the same problem in other threads, where I am sick of repeating myself. And my own unfinished thoughts in previous posts make it even worse.
EDIT: I posted an issue for that.
#682178: Drupal Strategy Wiki to sort+summarize information from the issue queue.
or alternatively
#682254: Allow multi-dimensional rating of issue comments
[/off-topic]
Comment #65
chx commentedNot enough time I now see and with grouping we won't get anything better for now. /me weeps and leaves this issue alone.
Comment #66
sunThe very first thing we should do in D8 is to change ALL hooks to use double-underscores as delimiter between module name and hook name, i.e.
views_attach_links()
becomes
views__attach_links()
or
views_attach__links()
...
Now I realize that this won't even cut it, so we likely need to get one step further and perform the very same change to hook names either:
views_comment_load()
becomes
views__comment__load()
That way, we can auto-determine the module name the hook belongs to. Therefore, all
hook__comment__*()
hooks can live in
$module.comment.inc
(auto-loaded)
Comment #67
donquixote commentedPrevious arguments have pointed out that module name is often not the best criterion for the grouping.
Can we replace "module name" with "group name" in your idea?
And for the double underscore idea, maybe you noticed the following discussion?
#548470: Use something other than a single underscore for hook namespacing
If we really want to change this, I would much prefer _hook_ to just a double underscore: It's more obvious, and invites to do a google search for _hook_ and also allows a cross-file sourcecode search.
Comment #68
donquixote commented@sun (#66):
We could still decide to load all candidates. That is, every prefix that is a module name + '_'. It might kill a bit of performance, but it's safe. If we don't want the second '__'.
So, for mymodule_hook_a_b_c() we would include
mymodule.a.inc
mymodule.a_b.inc
if both "a" and "a_b" are module names.
And if you want, we also load
mymodule.a_b_c.inc
Now we are left with a rare nameclash problem, and a very small performance penalty.
All of that IF we want hook filenames based on hook prefix / module name.
------
@chx (#57):
Modules implementing hooks on behalf of other modules should do that in their .module file, or any other file that is always loaded. Problem solved?
Comment #69
Crell commented@donquixote: No, modules should absolutely be able to implement hooks on behalf of another using a separate include. If they can't, that's a bug because we wrote it to be possible.
chx, can you confirm that bug you report with an implementation? If so, then we have a bug and it should be fixed. (Not with a total rewrite, just a tweak fix.)