Registry
chx - February 15, 2008 - 03:48
| Project: | Drupal |
| Version: | 7.x-dev |
| Component: | base system |
| Category: | feature request |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | fixed |
Description
This patch introduces a registry of functions, classes and interfaces. What this is good for?
- No longer need to specify files in hook_menu. It Just Works (TM).
- Forms can be reused from inc files without manual includes which is necessary in D6.
- Preprocess functions can live in inc files.
- module_implements becomes faster and allows us to use a better weighting system later.
- Most hooks can live in inc files (minus hook_boot, hook_exit and the new hook_hooks)
- Every page load can have (almost) exactly only the inc files it needs preloaded. This can speed up bootstrap tremendously by not loading unnecessary code.
There is some nice magic in hook_hooks: this registers the dynamic hooks. We cache the implementation of these, for example {$form_id}_form_alter during registry build, so when module_implements is called with system_themes_form_form_alter and there is no implementation cached for it then it knows there is no need to look for implementations of this hook. This speeds up things tremendously for such dynamic hooks -- of which I hope we can introduce a few more during D7.
The registry itself and the parser is Crell's idea and code originally please credit him when committing.
| Attachment | Size |
|---|---|
| registry.patch | 27.96 KB |

#1
To give some background, the problem with my original registry design was that it hit the database too often. In fact, it ended up hitting the database a few hundred times per request, which is double-plus-ungood. chx solved that by the additional magic in module_implements() which lets Drupal learn about its own hook implementations and cache that knowledge, which not only gives us the lazy-include logic without making the database cry but speeds up module_invoke_all() to boot. The rest is just a logical extension along the same lines.
chx is the one who made the whole thing actually work. :-)
#2
On a cursory reading of the patch, I notice that you are calling module_list with FALSE, TRUE, FALSE, which are the default arguments anyway. From a developer's point of view, to find out what FALSE TRUE FALSE does (no refresh, bootstrap mode, default sort), I'd need to look up the function anyway - so including these arguments explicitly does not add any readability. And since this is core accessing core, we don't need to protect ourselves against sudden changes in the other function.
#3
(Also, I presume that this patch will require the other core modules to implement hook_hooks too, correct?)
#4
subscribing so I can test a bit later.
#5
You need hook_hooks only to register dynamic hooks. I am unaware of any other than {$form_id}_form_alter.
#6
Fixed those module_list calls.
#7
Supressed hooks for the includes directory so tablesort_init does not fire on hook_init. content_type.inc is now parsed as belonging to node because it is inside the directory 'node'. install files are parsed so drupal_write_record works.
#8
The module dir setting was wrong for content_type.inc and other similar named files. There is something good here now: if you want to implement a hook in includes, you can, though it'd be awkward a bit: hook_block can be implemented as includes_block.
#9
module_implements did not load the include files for the cached stuff. Bummer.
#10
Subscribing.
#11
Fixed the definition of hook_form_{$form_id}_alter. Fixed update.php WSOD'ing.
#12
I've been testing this a bit, and haven't run into any problems other than when I enabled all modules, I got an error:
"call_user_func_array() [function.call-user-func-array]: First argument is expected to be a valid callback, 'node_node_type' was given in /Users/webchick/Sites/head/includes/module.inc on line 505."
#13
Looks nice from the description and reading the code. I'll give it a try soon.
#14
Great idea thanks, i will give it a try
#15
Subscribing.
#16
Subscribing, may be parallel to Drupal pipes...
#17
Subscribing.
#18
subscribing.
#19
From what I can see so far this looks really cool. Is there any better documentation on everything that is included here?
#20
I tested the patch with Apache 2, PHP 5.2.5, MySQL 5.0 through a clean install and didn't experience any errors. I enabled upload.module, created some vocabularies and created some stories with terms and attachments without an error.
@chx: it would be nice if you could point out some specific things to test.
The concept is sound and saves from from the dynamic include insanity I deal with in D6, and provides automatic class loading. The code style is sound, and documentation is copious.
+1...
The attached patch is re-rolled to remove a debugging line and fix a system_update conflict. I'd say it's really close to RTBC with a few more positive tests.
#21
Make sure you have devel installed and are watching the query log. The various caching systems are designed to avoid ballooning the query count, so make sure it's not showing too many queries against the registry table.
To test the auto-load, try moving the class in aggregator into its own file and rebuilding the registry. (Happens on module enable or when the "clear all caches" button is clicked on the Performance page.) Then setup a feed or two and update them. It should load the file with the class and function normally rather then breaking. :-)
That should cover the two main tests at this point. If those work, the rest should be fine.
#22
Here is an updated patch which moves class update_xml_parser to a new file to make testing easier.
#23
Testing is rather simple, no need to move files around -- the file handling code is removed from menu.inc so if Drupal works, admin pages, node submission etc then all is well. I have also removed the file declarations from menu hook. Also included is a
$_SERVER['REQUEST_METHOD'] == 'GET'for the caching, and that's good: rarely used included files necessary for form submissions holding lots of rarely running code are no longer cached. This was the only big remaining concern of Crell (please do not forget crediting him for this subsystem) and mine.I fixed the node_node_type error on the modules screen -- the error was in module_implements and meant that hooks in include files were not included.
Note that this patch is a continuation of my patch, it uses nothing from dopry's -- there was more debug code to be removed and the system update conflict was solved in an unclean way and also the aforementioned big bug was not fixed.
#24
This seems like a great idea. Just to test that I understand this correctly (and to maybe help others new to the issue), are the following correct statements?
1) The files that hold the functions should be named $module.$something.inc, so content.admin.inc would work correctly as an additional registry location for the content module, but there might be a problem using a file named content_admin.inc. In other words, some files may need to be renamed.
2) To implement this in most modules, you would just need to :
a) find and replace if(function_exists('my_function')) {..} with if(registry_check_function('my_function')) {...}.
b) remove any 'file' => 'custom.inc' lines from hook_menu(), since the file name will be discovered automatically.
d) remove 'include_once(custom.inc)' from your code since it won't be needed any more :)
#25
#26
New patch attached, with very minor code comment changes (a couple of typos, periods at end of sentences, capitalization, etc.) Absolutely nothing major, and I diff'd it with the patch in #23 to make sure that I didn't inadvertently change any of the code.
(Please do not credit me on commit.)
#27
One addition on conversion, some standard conventions are a good idea. Eg, page handlers in module.pages.inc and module.admin.inc as now. Hooks that are called once in a blue moon like hook_menu and hook_theme are "registry hooks", and should probably all end up in module.registry.inc. That sort of refactoring will be done to core after the main patch lands and we can establish some de facto best practices that way.
Ideally, code should be grouped in such a way that loading one file will get all of the functions a particular action is going to need. This is optimizing by cut and paste. :-)
#28
@Crell: Hm.
At a certain point though, I fear that we're going to be causing more of a performance overhead including all of these wacky files everywhere. A commenter on the D6 announcement post pointed out that for *really* high-traffic sites that need to use things like opcode caches, more code in *fewer* files is what you want. Additionally, it makes the module really hard to grok when code is here, there, and everywhere. The current hints in the menu system are nice, because they tell me when I can't find function x_y what file it lives in. This is going away now, so my only resource is to grep and have 11 files open in front of me to figure out how a module works.
node.pages.inc and node.admin.inc is a nice, logical separation. But if we start having node.registry.inc and node.node-pages.inc and node.node-teaser-pages.inc and node.admin-all-nodes.inc and node.admin-edit-nodes.inc and so on that's a little bit... insane. Let's shoot for a happy medium that gives us the fewest number of files per module with the biggest performance benefit.
#29
Attached patch removes the 'file' lines from forum_menu(). Also fixes a comment, 'enabled', not 'installed' modules.
#30
@webchick: Absolutely. While this setup could support one-function-per-file, that would be an very stupid way to do it. :-) The advantage here is that we can experiment and find what a good balance would be (registry hooks are simply low-hanging fruit) without having to change the API or the engine itself. Optimize by cut and paste. :-)
That said, if you simply set your opcode cache to not recheck the file system then it shouldn't matter how many files you have. I also find 3-4 well-organized files much easier to deal with than one 1000 line behemoth. I frequently break modules up already into multiple files with a manual include just to make them more manageable. For a module that only has 4 small functions in it, it's probably best to just leave it all in one .module file. Again, it's fully dynamic so modules can organize themselves in whatever way makes the most sense for them.
#31
I agree with Crell's statement. Obviously there's a sane limit to how you should break a module's functions. May need to add a "best practice" guide once this lands? Either way, the flexibility is there!
#32
Subscribing.
#33
So what else is needed to get this in?
#34
Subscribing.
#35
I tested a clean install and really clicked around. No errors.
When is update_create_registry_tables() supposed to be called? I don't see it in the upgrade path. Once it is there, this is RTBC.
#36
Subscribing
#37
This patch seems to work fine and apart from the performance hit on modules page (registry creation) it certainly hasn't had a negative impact on page generation.
+1
#38
We haven't discussed much a big benefit of this patch - we can rework the bootstrap so that we hand over control to the menu handlers when only the required modules are loaded. No longer will we indiscrimitately do a full bootstrap for most requests. This system will carefully load modules when they are needed. This has big implications for performance sensitive callbacks like ajax handlers, ahah handlers, private files handlers, etc. This bootstrap refactoring will be my next patch after this gets in.
Also, chx mentioned at the top that this paves the way for allowing modules to order hook invocation on a hook by hook basis instead of the crude system.weight we have now. thats nice.
#39
moshe: darn you! You got me drooling! :D
#40
@moshe: Yep, that's exactly the end goal. A well-factored module will have nothing in .module aside from hook_boot, hook_exit, hook_hooks, and methods that have to be called directly. (And even some of those can be moved out.) Pretty much everything else can be loaded on-demand. That makes even a BOOTSTRAP_FULL tiny.
There's still the question of the includes directory, but I suspect we can clean out much of that, too. Phase 2. :-)
I'm not sure what you mean in #35. Can you clarify?
#41
#35 says, there is a nice update function which creates the registry tables early but the function is not called. Tomorrow I roll a new one.
#42
So there. When the Database: TNG patch gets in, we can do better with the update code as then the schema will not be in common.inc.
#43
* need to rename system_update_7000 to system_update_7001
* even after that, i am getting "Call to undefined function cache_get() in /Users/mw/htd/pr/includes/module.inc on line 427" on update.php
#44
The previous try was rolled from an older version -- sorry. This one seems working.
#45
ok, upgrade path works now. rtbc
#46
Moshe in private raised a concern that we are now INSERTing on every page with the cache_set. The basic assumption is that every GET to a given router path uses the same include files. When this assumption fails, then the safety net is that the cache is incremental, so if there are two sets of includes for a given router path then both will be included always.
#47
Looking at the caching code I found a space error. I have edited the patch to fix this so there is absolutely no change just a code style fix.
#48
Another reroll. This now really preloads the original_files as I wanted.
#49
a) I removed the possibility to implement hooks in includes/ it was silly anyways b) We no longer include information from disabled modules sharing a directory with an enabled one.
#50
Moshe said it's not silly and he likes it. Easy to move around the check.
#51
subscribing
#52
More goodness: module_hook ties into module_implements, reusing its cache. Thus node_invoke now can use this cache to save a few queries from the registry but then we have the problem that what node_hook calls $module is not necessarily a module. I circumvented this by adding recognition code for these pseudo modules into the registry build phase.
#53
And the patch.
#54
I tested and this is RTBC.
I found one minor glitch in that fapi often produces registry cache misses that result in a quick query. The calls are:
registry_check_function($form_id)registry_check_function($form_id .'_validate')
registry_check_function($form_id .'_submit')
Until we have a directory of all $form_id on the site, I can't think of a clean solution. It is such a minor issue that the patch should proceed as is.
#55
$ diffstat
38 files changed, 457 insertions(+), 264 deletions(-)I'm still trying to figure out what the killer advantage is of this patch ... I understand it could potentially speed-up Drupal but do we have any proof of that yet? The advantages listed in the body of the issue are nice, but don't blow me away. I'm a bit sceptic because it doesn't necessarily make the module code easier -- as you can see from diffstat's output, it quite a bit more complex. I'd like to see us unleash more of this patch potential to help us evaluate its usefulness.
#56
We have wanted to make Drupal boot faster. Once this is in we can explore that so much easier -- most of the current boot process can be removed. Also, currently if you want to reuse a form, you need to figure it out first which file it is in. Also, this patch will make it possible to weight the hooks one by one, even relative to each other, which is a much wanted feature.
As you said it's complex so I would rather not make it even more complex with changing the bootstrap right in this one.
#57
I just benchmarked the Drupal 6 page split effort: http://www.garfieldtech.com/blog/benchmark-page-split
Short version: 20% speed improvement for uncached pages. The registry here should let us split off even more code. That's the killer advantage of this patch.
#58
Another good thing is that hooks become really really cheap. Even dynamic hooks. We can add a lot more dynamic hooks here and there once this is in -- we can even add a hook_generic_function_name_alter so that you can do something with every function there is (altering internal data structures to your hearts content) and this costs us extremely little performance-wise... lots of other craziness is opened up with this.
#59
rerolled to fix 2 broken hunks in system.install
#60
oops. i had a local change in there. i *think* his one is same as chx' last patch.
#61
OK, I experimented tonight in an attempt to demonstrate the bootstrap advantages of the patch. I had some great success. To see for yourself, apply the patch and do this quick test:
1. Enable all the core modules you feel like testing.Then browse to /admin page.
2. Go to module_load_all() in module.inc] and replace module_list() with array('filter', 'block').
3. Refresh /admin page. Yell out "holy shit - my site still works!". The above step cut a gaping wound into the heart of the bootstrap. Only block and filter modules were automatically loaded. The rest are being loaded as needed. Consider the advantage this has once you have 50+ Contrib modules as most sites do now. Thats a lot of code that is not being parsed and loaded into memory.
4. Click around and your site mostly works. You can add/edit/view nodes, manage posts on admin/content/node, add feeds and retrieve feed items, and so on. You can't do funky stuff like install modules and flush all caches. That will take some refactoring. The error messages you see here are trivial compared to the bootstrap change we just made.
As I said earlier, this patch lets us hand over control to the menu callback much sooner. This is great for ajax handlers, autocomplete, web services, private file handling, etc.
I'd love for this patch to be committed before we fully rework the bootstrap. I hope the above demo satisfies the need to show speed.
#62
But, golly, Moshe, that means fewer WSOD's too! We can't have that - how would all those contribs get debugged? ;-)
#63
subscribe
#64
subscribing
#65
subscribe
#66
While I'd still like to see more of this patch potential evaluated, I decided to move forward and to have a first good look at the code.
My main concern is that there is a lot of stuff going on, and the code is poorly documented. This requires the reviewer to reverse engineer a lot of what is going on. Also, certain function's PHPdoc (i.e. _registry_save_resource()) is out of sync with the declaration of the function.
Please better document the code.
#67
And that's how Dries reviews patches while skiing. Subscribing.
#68
A question also regarding files like cache.inc - we allow the file to be included to be determined via a variable setting. Is that going to break here? In other words, this registry code will always include the file includes/cache.inc based on the function names?
CNW based on #66
#69
@pwolanin: The registry only pre-loads files that were lazy-loaded in the first place, I believe. As long as the cache system is initialized before the registry is, and the appropriate file is included, the lazy-loader will never kick in and there should be no problem.
#70
I can patch current CVS, but system.install patches with hunk #4 with fuzz 2 (and a few hunks with offsets), and this results in a
Fatal error: Cannot redeclare system_update_7001() (previously declared in drupal/modules/system/system.install:2702) in drupal/modules/system/system.install on line 2714Can someone re-roll the patch, please?
#71
There seems to be a problem with both moshe's and chx's latest versions in the installer. I tried to track it down, but couldn't figure out what the issue is. It seems to be trying to run the schema setup twice, the second time with NULL $schema, and that's breaking horribly.
That said, the attached patch refactors _registry_save_resource() out into a series of utility functions that make more sense, and increases the patch size by a third by adding a lot more documentation. Hopefully it should be much easier to follow now. It even adds a new registry @defgroup. :-)
There should be no functional changes in this patch, but I can't confirm that for sure because of the problem with the installer. I'm therefore leaving this as CNW and turning it back over to chx (or whoever else wants to figure out what happened to the installer).
#72
Another nice thing we can do after the registry is in, there is no longer a problem that you can't invoke hooks relatively early -- so you can actually have a hook for variable defaults. w00t w00t.
#73
Crell: thanks for improving the documentation. I'll schedule another review. Glad to see this moving forward.
chx and moshe: thanks for posting 'motivations'. I'm sufficiently convinced now. At this point, I'd like to make sure that (i) these motivations are captures in the patch, (ii) that the code is self-explanatory and (iii) that this patch does not introduce bugs (i.e. caching).
all: please review the code and help make sure that the code is easy to understand. This is a key building block of Drupal, and it is important that it is both elegant and transparent. Your critic eye is required to make this a beautiful piece of design and code. Thanks.
#74
I did a quick code review. Here are some observations:
Code documentation formatting is inconsistent, some
@param's declare data-types. Parameter descriptions are wrongly indented: some are single spaced, some have two spaces (the correct one) and others have tabs.Why the database update function is in update.php instead of system.install? I'm sure there are good reasons for this, I think they should be documented. Also its comment says "If batch table exists, update is not necessary".
I'll test the patch when I get a chance.
#75
The last patch seems to be running drupal_rebuild_code_registry() in index.php. I'm pretty sure this is debug code left in there by mistake.
#76
That was not too hard. As the fallback
module_implementsusesmodule_listand during install, the module to be installed is not yet in the list but its install is loaded bymodule_load_includeand thenmodule_invokeis used -- so I added a few lines of code tomodule_load_includewhich ensures this works. To balance, I ripped the fallback code frommodule_invokeit was a) useless b) broken.#77
Somewhere during the rerolls TAB characters got added to my beautiful little patch. Sacrilege! (definitely not me, my editor is set up properly.)
#78
Getting a WSOD on a clean install.
#79
WSOD is caused by a rebuild of the registry calling back into code that relies on the registry already being built:
_registry_parse_directory --> _registry_get_resource_module --> node_get_types --> _node_types_build --> module_invoke_all --> WSOD
specifically, according to this comment drupal_rebuild_code_registry:
<?php// Flush the old registry.
db_query("DELETE FROM {registry}");
// We can't use module_invoke_all here because it depends on the registry
// which is currently being rebuilt.
?>
too late for me to fix (3am in sydney), hope this helps someone else get a working patch going.
#80
Well that comment and the surrounding code is outdated, module_invoke_all calls module_implements which has a safeguard for the registry being rebuilt. I'll investigate.
#81
Calling
_registry_get_resource_hookmakes the conversion to utility functions and the patch actually work.#82
Ok no WSOD this time, but I'm getting table doesn't exist errors after enabling all modules.
#83
Instead of trying to figure out which version of module_implements to run we now make it possible to set a "dumb" mode which is basically what module_implements is now. This is set during registry rebuild (as in the previous patches) and during install of modules.
#84
w00t! works for me after installing new modules.
updated patch to get rid of fuzz against HEAD.
#85
Attached is a version of the patch in #83 with substantially reworked comments in most places, a few minor typos corrected, and double spaces changed to single spaces between sentences. There should be no functional changes, if I did this correctly.
(Also, as in #26, please do not credit me on commit.)
#86
Registry allows http://drupal.org/node/237959#comment-790661
#87
I could find only 2 problems
- right after i installed, i followed a link to node/add and got access denied. that cleared up though after i submitted the modules page
- i am getting a NOTICE and blank node form the notice is "notice: Undefined index: blog_node_form in /Users/mw/htd/registry/includes/form.inc on line 344.". perhaps the registry is missing those dynamic form ids again.
#88
As we check for disabled modules, adding node_content to the module list in _registry_parse_directory helped the problem. Also, I added access to the node hooks list. Pseudo modules are a pain. I have grepped for node_invoke in node directory and these are all the hooks. I have updated some of the code comments in node.module so that now you can actually search for every node_invoke and get a list of hooks.
#89
Re-rolled this patch (there was a conflict in common.inc), and fixed a misnamed variable in system_update_7002() that was keeping the update from being successfully executed.
#90
tried patches at #88 and #89, both need work.
- create content menus are missing after fresh install
- going to node/add/page --> access denied for uid 1
- installing a module --> WSOD for all pages on the site
#91
this is a nasty hack in module_hook makes the create content menus appear on a fresh install:
<?phpfunction module_hook($module, $hook) {
$list = module_implements($hook);
if ($hook == 'form' || $hook == 'access') {
$list[] = 'node_content';
}
return in_array($module, $list);
}
?>
the real problem is that there seem to be some namespace hacks around the module names 'node' and 'node_content'. not sure the best way to go here, but currently module_implements doesn't return 'node_content' as a module that implements the 'form' hook.
also, the current implementation seems to rely on the hook cache too much. to illustrate:
1. fresh install plus patch
2. clear the hook row in cache table
3. visit admin/build/modules --> WSOD
#92
I had a patch to clean up some of the "node_content" stuff at one point, though I'm not sure if it would help here.
#93
I went back to #85 enrolled all the fixes, moved the hooks to cache_registry -- note that if the hooks cache entry is not there, the registry gets rebuilt -- and changed
node_hooktoregistry_check_function($module .'_'. $hook);because that's what it is, node_content is not a module.#94
And the patch. The above comment / solution was wrong, the problem with content types was that we did not rebuild the registry after install, there is a comment in install.php which says: Rebuild menu to get content type links registered by the profile, and possibly any other menu items created through the tasks. I added registry there.
Edit: I realized it was wrong because module_hook deliberately does not check whether $module is really a module. That's enforced only in dumb mode because then we have no other way.
#95
the patch at #94 applies with some fuzz, but all the issues from #90 are fixed.
creating content works, enabling and disabling modules and themes works.
i also couldn't get things to fail by deleting the registry or hook cache, or emptying the registry table between page requests.
nice work!
#96
Rerolled against HEAD. Absolutely no changes.
#97
i've spent some more time trying out most of the admin section, enabling all modules, turning caching on and off.
i think this patch is ready to go.
#98
I'd like to test still - I'm concerned about what happens if (for example) several modules have a .inc file that define the same functions.
#99
@pwolanin - there is no new problem with function duplication. this new system just loads less code on every request than the current drupal. if you get a duplicate function error with the new code, you would have certainly gotten it with the existing code.
#100
Duplicate function names indicate a bug in one or more modules for failure to namespace their code properly. That's true now, and it's true after this patch. No change. :-)
The only place that's not true is the cache and database systems. Both of those initialize before the registry so the registry wouldn't break on those since it never loads them itself. And the database will not be that way for long if the new Database API patch ever gets in. :-)
#101
I was just working on something that makes me wonder if this new facility will address: With the movement to "compartmentalize" code (which I like), would this allow hook_block code to be split from a module? This could, conceivably, greatly speed up block usage, which I can see slowing down my page rendering.
#102
@Crell - well, in core already is the new password.inc file which may also be substituted based on a variable.
#103
When running update.php, the system update 7002 generates the following error on the "select updates" page:
user warning: Table 'drupal7.registry' doesn't exist query: SELECT file FROM registry WHERE name = 'update_script_selection_form_validate' AND type = 'function' in /home/florian/workspace/drupal7/includes/common.inc on line 3911.The errors don't seem to affect the upgrade, but we can't commit something that will require users to "just ignore" an error while updating their site.
#104
@nancyw: Yes, if we can compartmentalize block code more than it is now. One of my later goals for Drupal 7, time permitting, is to refactor the way blocks work to be more hook_menu()-ish, with a registry function that specifies callbacks to call for each block. That would allow the registry function, like hook_menu and hook_themes, to live in call-once files ($module.registry.inc) and the callback function to live "wherever", which could be conditionally loaded. However, any block that is displayed would still have to have its code loaded unless it's cached. The D6 block caching system is separate from either the Registry or any block refactoring. (Both the registry and the push for testing benefit from a move to smaller, more self-contained chunks of code, which is an overall API-betterment anyway.)
@pwolanin: Well, if password.inc is loaded before the registry then there's no problem. If not, that *may* be an issue. My preference, though, is like the database to move these "swappable subsystems" to a factory/object structure like the new database API uses. That provides a single front-end, lazy-code-loading, and no duplicate resource names.
#105
Thanks, Larry. That's what I wanted to hear. I'd rather load 500 bytes than 50,000. (Now, if I can convince people that they don't need comments...)
#106
*Crell stabs nancyw...
It's not the file size that's the issue, it's the code compile weight that we're trying to eliminate. Comments have virtually no compile weight. :-)
#107
@Crell - that file is now loaded on demand during user login or user account edit submit. i.e. it would probably be well after the registry step. If it needs to be refactored, you'll have to explain to me how such an object would work and such a switch should be part of this patch.
#108
The file include part of registry_check_file is now safegaurded against maintenance_mode. I have also simplified the function.
#109
You probably meant defined() (http://us2.php.net/manual/en/function.defined.php) rather than is_defined() which is not a php function. No other changes. update.php runs successfully, if I wasn't making a fix I'd set this as RTBC.
#110
subscribing so I can test a bit later...
Mongi Daghbouj
Malermester
http://www.holtemalerfirma.dk
#111
Ahem.
#112
the patch at #109 applies and cleanly, and i couldn't find any problems from a fresh install.
however, looking at the code for 'what if' type problems, i tried out a 'what if the process rebuilding the registry fails before completion'. eg, put in a die() at some point after the registry is flushed and before all the new registry rows (over 1385 separate inserts queries on a standard install) are written. or just delete all/half of the registry in between two requests.
hilarity ensues - pages are randomly inaccessable, and its not obvious how to get the registry cache rebuilt without hackery. so, setting this back to patch needs work. i think we need to rework the registry rebuild so that we don't rely on processes not die()ing during a 1000+ insert run.
two suggestions:
- read the new registry data into memory *before* the flush and inserts. still leaves us open, but for a shorter time and we could do batch inserts. might mean larger RAM use for that request though
- keep track of the files we are parsing, and don't delete registry items that haven't changed (which would be the vast bulk). i think this is the cleanest and fastest way, but would add some complexity.
thoughts?
#113
I should not set this to RTBC but two people already said it works. justinrandell's problem IMO is a follow up patch, making the registry rebuild more robust.
#114
Batch inserts will be trivial with the new Database API, as it supports multi-insert statements (using whatever database-specific syntax is appropriate). That is still in development, though.
File-by-file rebuilding is interesting, and would offer a smaller total-death window, but would still leave the system in an unstable state should the process die halfway through system.module for some reason. Considering how rare a registry rebuild is, I don't know how huge a problem that would be.
The new Database API is also designed to support transactions where possible. Wrapping the entire delete/rebuild cycle in a transaction would offer far better robustness on databases that support transactions, as the system could (in theory) never be left in an unstable state. That's almost a text-book use case for transactions, actually. That's also pending the Database work.
So in short, yes it's a problem, but not a huge one,