Regression: Unify and rewrite module_rebuild_cache() and system_theme_data()

dww - May 26, 2007 - 08:39
Project:Drupal
Version:7.x-dev
Component:base system
Category:task
Priority:normal
Assigned:Dave Reid
Status:closed
Issue tags:Needs Documentation, Patch Spotlight, Performance
Description

in working on http://drupal.org/node/146910, i noticed that module_rebuild_cache() and system_theme_data() do almost exactly the same thing, but have wildly different names. we should clean this up. a couple of options:

  1. $modules = system_data('modules');
    $themes = system_data('themes');
  2. $modules = system_module_data();
    $themes = system_theme_data();

#1

dww - June 19, 2007 - 09:44

This cleanup/unification would also be a good place to add the hook_alter_info() I'm proposing over at http://drupal.org/node/152926.

#2

drewish - June 25, 2007 - 05:16

subscribing. personally i like #1 better.

#3

dww - June 25, 2007 - 07:09

maybe we should call it drupal_system_data() to defend ourselves from system.module needing to implement hook_data() at some point in the future?

#4

dww - June 25, 2007 - 07:25

or maybe "system_get_data($type)"? too vague? maybe my original plan for "system_theme_data()" and "system_module_data()" (which would obviously share lots of the same helper functions) is the way to go? i'm too exhausted now to think clearly.... someone please tell me the right name(s) to use. ;)

#5

drewish - June 25, 2007 - 16:17

digging to the two functions i'm having a bit of trouble figuring out how to unify them into a single system_data() function. i think shared helper functions might be a better way to go.

#6

dww - June 27, 2007 - 08:17

yeah, i agree. sad to say it, but i think s/module_rebuild_cache/system_module_data/ is about the best we can do here. there are a few things that could be shared, but not many. :( it's possible we could reorganize where all the code lives, etc, but that doesn't seem particularly urgent. i think i'm just going to put this on hold and worry about http://drupal.org/node/152926 first.

#7

Dave Reid - September 6, 2008 - 23:03
Version:6.x-dev» 7.x-dev
Assigned to:Anonymous» Dave Reid

I'm going to take this up as I'm also working on rewriting system_theme_data in #289369: Deleted modules are not removed from system table.

#8

webchick - September 6, 2008 - 23:45

Small request: Let's get a verb in that function name ({scope}_{verb}_{thing_that's_being_verbed}()). I don't like either 1. or 2. for that reason, although I agree that unifying their names is a good move.

#9

Dave Reid - September 7, 2008 - 19:12
Status:active» needs review

Here's my first effort on this. Summary of changes:
- module_rebuild_cache() has been moved from ./includes/module.inc to ./system/system.module for consistency
- module_rebuild_cache() has been renamed to system_get_module_data()
- system_theme_data has been renamed to system_get_theme_data()
- Created helper function _system_get_module_data
- Renamed helper function _system_get_theme_data
- Refactored system_get_files_database(&$files, $type) to system_get_files_database(&$files, $type, $update = FALSE)
- When $update = TRUE, the function will update the system table based on the information provided in the $files database (insert new records for new discovered modules/themes in filesystem, updates changed records, deletes records for modules/themes no longer found in the filesystem)

I tested install, update, module enabling/disabling/installing/deleting, theme switching, basically anything involving these functions. Tests all pass for me. The only thing I'm unsure of is if I should split the update functionality out of system_get_files_database() and make a new function called system_update_files_database().

AttachmentSize
147000.patch 25.12 KB
Testbed results
147000.patchfailedFailed: Failed to apply patch. Detailed results

#10

Dave Reid - September 8, 2008 - 03:20
Title:unify module_rebuild_cache() and system_theme_data()» Unify and rewrite module_rebuild_cache() and system_theme_data()

Here's my second revision. I split my original system_get_files_database into two functions. The original system_get_files_database and system_update_files_database. I updated the logic so only changes found in module or theme information will be actually updated into the database instead of blindly updating every single record every time. After testing compared to the original methods, the amount of insert/update/delete queries to the database were reduced to barely nothing with no side effects or issues:

Page: admin/build/modules
Task: viewing page, no changes being made
Original D7 code: 136 write queries
Patched D7 code: 0 write queries

Page: admin/build/modules
Task: after uninstalling and removing devel module from file system
Original: 121 write queries
Patched: 5 write queries

Page: admin/build/modules
Task: after installing devel module to file system
Original: 136 write queries
Patched: 5 write queries

Page: admin/build/themes
Task: viewing page, no changes being made
Original: 57 write queries
Patched: 0 write queries

I'll try to gather some data on read queries.

AttachmentSize
147000-2.patch 25.18 KB
Testbed results
147000-2.patchfailedFailed: Failed to apply patch. Detailed results

#11

dww - September 8, 2008 - 07:20

Very cool, thanks. #10 sounds especially promising, though I haven't had a chance to look at the patch. Keep up the great work!

#12

arhak - September 9, 2008 - 21:52

subscribing

#13

Dave Reid - September 10, 2008 - 10:33

Making a note to myself to keep an eye out for when #259412: Total module system revamp lands in case I need to re-roll patch.

#14

pillarsdotnet - September 12, 2008 - 22:47

Backported to version 6.4 here: http://drupal.org/node/307756

#15

Dave Reid - September 15, 2008 - 03:03

Rerolled and bumped for review.

AttachmentSize
147000.module_system.patch 25.19 KB
Testbed results
147000.module_system.patchfailedFailed: Failed to apply patch. Detailed results

#16

Dave Reid - September 26, 2008 - 03:52
Status:needs review» needs work

Re-rolled for HEAD, but the patch is causing a errors on install, so I need to take a look at what's going wrong.

AttachmentSize
system_get_data-147000-D7.patch 22.19 KB
Testbed results
system_get_data-147000-D7.patchfailedFailed: Failed to apply patch. Detailed results

#17

pillarsdotnet - September 27, 2008 - 00:55

Can't test for D7, but my backport to D6 also caused errors, which I fixed. Maybe you can compare with my (working) D6 patch at http://drupal.org/node/307756

#18

Dave Reid - October 9, 2008 - 05:51

Fixed some comment coding standards, will need to test if this still applies. Too tired right now...

AttachmentSize
system-get-data-147000-D7.patch 22.05 KB
Testbed results
system-get-data-147000-D7.patchfailedFailed: Failed to apply patch. Detailed results

#19

Dave Reid - October 10, 2008 - 16:06
Status:needs work» needs review

Re-rolled for head. I also did some actual ab performance testing. I like the results, and I bet you will too!

admin/build/modules before patch:
Requests per second:    0.96 [#/sec] (mean)
Time per request:       1042.654 [ms] (mean)
Time per request:       1042.654 [ms] (mean, across all concurrent requests)
Transfer rate:          34.40 [Kbytes/sec] received

admin/build/modules after patch:
Requests per second:    1.01 [#/sec] (mean)
Time per request:       988.747 [ms] (mean)
Time per request:       988.747 [ms] (mean, across all concurrent requests)
Transfer rate:          36.27 [Kbytes/sec] received

admin/build/themes before patch:
Requests per second:    3.48 [#/sec] (mean)
Time per request:       287.024 [ms] (mean)
Time per request:       287.024 [ms] (mean, across all concurrent requests)
Transfer rate:          71.66 [Kbytes/sec] received

admin/build/themes after patch:
Requests per second:    3.82 [#/sec] (mean)
Time per request:       261.681 [ms] (mean)
Time per request:       261.681 [ms] (mean, across all concurrent requests)
Transfer rate:          78.60 [Kbytes/sec] received

With only a bit more processing, we're getting a huge decrease in the hits to the database (see # 10) and a very useful performance improvement!

AttachmentSize
system-get-data-147000-19-D7.patch 22.39 KB
Testbed results
system-get-data-147000-19-D7.patchfailedFailed: Failed to apply patch. Detailed results

#20

Anonymous (not verified) - November 12, 2008 - 11:30
Status:needs review» needs work

The last submitted patch failed testing.

#21

justinrandell - November 13, 2008 - 22:04

subscribe. i got to looking at module_rebuild_cache when wondering how to make tests run faster.

its a beast, and we call it too often.

#22

Damien Tournoud - December 24, 2008 - 19:49

The patch looks good by itself.

A couple of issues:

<?php
// Need to make a safe, modifiable copy of the $files array since PHP
// automatically makes references to objects instead of copies.
$files = array();
+  foreach(
$files_updated as $key => $file) {
+   
$files[$key] = clone $file;
+   
$files[$key]->info = serialize($files[$key]->info);
+  }
?>

I don't buy it. If the only difference is serialize() the info key or not... we can easily do better then doubling memory usage.

<?php
+      // Scan fields to find only the updated values.
+      $file->updated_fields = array();
+      foreach (
$file as $key => $value) {
+        if (isset(
$files[$file->name]->$key) && $files[$file->name]->$key != $value) {
+         
$file->updated_fields[$key] = $files[$file->name]->$key;
+        }
+      }
+
+     
// Update the record.
+      if (count($file->updated_fields)) {
+       
db_update('system')
+          ->
fields($file->updated_fields)
+          ->
condition('filename', $file->old_filename)
+          ->
execute();
+      }
?>

It would probably be faster to just look if *at least one* field has changed, pass the whole $file to db_update, and let the database deal with it.

<?php
+      // File is not found in filesystem, so delete record from the system table.
+      db_delete('system')
+        ->
condition('filename', $file->filename)
+        ->
execute();
+    }
?>

Two words: file system.

<?php
// All remaining files are not in the system table, so we need to add them.
+  foreach($files as $file) {
?>

A space is missing after foreach.

<?php
+    db_insert('system')
+      ->
fields(array(
+       
'filename' => $file->filename,
+       
'name' => $file->name,
+       
'type' => $type,
+       
'owner' => isset($file->owner) ? $file->owner : '',
+       
'bootstrap' => isset($file->bootstrap) ? $file->bootstrap : 0,
+       
'info' => $file->info)
+      )
+      ->
execute();
?>

This is badly formatted.

<?php
// Set defaults for module info
?>

Dot missing at the end of the sentence.

<?php
// Read info files for each theme
+  foreach ($modules as $key => $module) {
?>

I guess that's "Read info files for each module". Plus the final dot is missing.

<?php
+    // Log the critical hooks implemented by this module.
+    $modules[$key]->bootstrap = 0;
+    foreach (
bootstrap_hooks() as $hook) {
+      if (
module_hook($modules[$key]->name, $hook)) {
+       
$modules[$key]->bootstrap = 1;
+        break;
+      }
     }
?>

Do we still need this? I thought we removed the notion of bootstrap module.

<?php
+    // Set defaults for theme info
?>

Same remark as above.

#23

Dave Reid - December 24, 2008 - 21:59

Awesome review DamZ! I think this was one of my first core patches (seeing as the bootstrap stuff is still there), and it obviously showed. :) I'll have to revisit this to get it back in tip-top shape.

#24

Dave Reid - January 6, 2009 - 21:36

#25

Dave Reid - January 17, 2009 - 06:39

#26

mathieu - May 19, 2009 - 21:21

Subscribe.

#27

EvanDonovan - May 30, 2009 - 05:27

Anyone working on this? Looks like it would be a critical performance improvement...

#28

Berdir - June 4, 2009 - 20:20
Status:needs work» needs review

Re-roll... that was.. erm... fun... NOT ;)

About DamZ's review..

I don't buy it. If the only difference is serialize() the info key or not... we can easily do better then doubling memory usage.

Changed that function quite a bit, removed whole additional $files array, beneath others things. There is some special casing now for the info attribute, but it's quite simple and the whole function is shorter.

It would probably be faster to just look if *at least one* field has changed, pass the whole $file to db_update, and let the database deal with it.

Not sure about that. how would we do that? a break after the first mis-match?

Changed the other things...

One thing that I don't like is filepath vs. filename. We use filename in {system} but filepath everywhere else. Imho, that's quite confusing. and filepath is the proper name, because the path is included too. We should rename the column in {system} imho..

AttachmentSize
system-get-data-147000-D7_1.patch 21.9 KB
Testbed results
system-get-data-147000-D7_1.patchfailedFailed: Failed to apply patch. Detailed results

#29

Dries - June 5, 2009 - 10:02

From an API point of view, this looks like a nice clean-up.

- @param $files_updated should be $files.

- The following code looks a little funny:

+  system_get_files_database($themes, 'theme');
+  system_update_files_database($themes, 'theme');
The same pattern exist for modules. It's funny because you seem to be reading and writing the same data -- that feels like a null operation.

#30

Berdir - June 5, 2009 - 11:41

- Changed $files_updated to $files

The same pattern exist for modules. It's funny because you seem to be reading and writing the same data -- that feels like a null operation.

Yes it does, but it does make sense.. that is the workflow (same for themes):
1. Read the file system and looks what's there (_system_get_module_data())
2. Extend that information with things that are only in the database, for example, status and schema_version (drupal_get_files_database)
3. Update/Insert/Delete {system} based on the new information we have. (drupal_update_files_database())

Maybe we could combine step #2 and #3, but I tried and failed (resulted in the ~10th drupal re-install for that patch ;)) or rename step #2 to extend instead of get or something like that.

- Introduced a static cache for system_get_module_data(). It is currently called twice, and we did remove a call from the registry (#429132: Remove unecessary module_rebuild_cache() call from _registry_rebuild()). If we have a static cache, we can revert that change, because using the statically cached information is obviously faster. It does need a bit more memory, but we save that atleast on admin/build/modules because we don't have to load and parse all those .info files three times. The static cache can currently not be reset except of using drupal_static_reset() directly..

Some devel "benchmarks"

Without patch:

Executed 919 queries in 152.26 milliseconds. Queries taking longer than 5 ms and queries executed more than once, are highlighted. Page execution time was 990.35 ms.
Executed 919 queries in 170.06 milliseconds. Queries taking longer than 5 ms and queries executed more than once, are highlighted. Page execution time was 1116.68 ms.
Executed 920 queries in 164.28 milliseconds. Queries taking longer than 5 ms and queries executed more than once, are highlighted. Page execution time was 1034.99 ms.
Memory used at: devel_boot()=0.78 MB, devel_shutdown()=15.79 MB, PHP peak usage=27 MB.
Memory used at: devel_boot()=0.78 MB, devel_shutdown()=15.79 MB, PHP peak usage=27 MB.
Memory used at: devel_boot()=0.78 MB, devel_shutdown()=15.79 MB, PHP peak usage=27 MB.

Patch without static cache:

Executed 796 queries in 134.05 milliseconds. Queries taking longer than 5 ms and queries executed more than once, are highlighted. Page execution time was 946.37 ms.
Executed 798 queries in 135.79 milliseconds. Queries taking longer than 5 ms and queries executed more than once, are highlighted. Page execution time was 931.35 ms.
Executed 798 queries in 137.66 milliseconds. Queries taking longer than 5 ms and queries executed more than once, are highlighted. Page execution time was 919.3 ms.
Memory used at: devel_boot()=0.78 MB, devel_shutdown()=15.17 MB, PHP peak usage=25.5 MB.
Memory used at: devel_boot()=0.78 MB, devel_shutdown()=15.14 MB, PHP peak usage=25.5 MB.
Memory used at: devel_boot()=0.78 MB, devel_shutdown()=15.17 MB, PHP peak usage=25.75 MB.

Path with static cache and #429132: Remove unecessary module_rebuild_cache() call from _registry_rebuild() reverted:

Executed 796 queries in 137.8 milliseconds. Queries taking longer than 5 ms and queries executed more than once, are highlighted. Page execution time was 880.64 ms.
Executed 795 queries in 136.18 milliseconds. Queries taking longer than 5 ms and queries executed more than once, are highlighted. Page execution time was 855.15 ms.
Executed 795 queries in 138.14 milliseconds. Queries taking longer than 5 ms and queries executed more than once, are highlighted. Page execution time was 875.1 ms.
Memory used at: devel_boot()=0.78 MB, devel_shutdown()=15.56 MB, PHP peak usage=24.75 MB.
Memory used at: devel_boot()=0.78 MB, devel_shutdown()=15.56 MB, PHP peak usage=24.5 MB.
Memory used at: devel_boot()=0.78 MB, devel_shutdown()=15.56 MB, PHP peak usage=24.5 MB.

AttachmentSize
system-get-data-147000-D7_2.patch 23.15 KB
Testbed results
system-get-data-147000-D7_2.patchfailedFailed: Failed to apply patch. Detailed results

#31

System Message - June 5, 2009 - 11:50
Status:needs review» needs work

The last submitted patch failed testing.

#32

Berdir - June 5, 2009 - 11:59
Status:needs work» needs review

Forgot to remove a debug call...

AttachmentSize
system-get-data-147000-D7_3.patch 23.11 KB
Testbed results
system-get-data-147000-D7_3.patchfailedFailed: Failed to apply patch. Detailed results

#33

Berdir - June 5, 2009 - 12:04

Bad update.module fooled my :)

AttachmentSize
system-get-data-147000-D7_3.patch 23.14 KB
Testbed results
system-get-data-147000-D7_3.patchfailedFailed: Failed to run tests. Detailed results

#34

Berdir - June 5, 2009 - 15:18
Status:needs review» needs work

Passed: 149 passes..

That's not exactly enough ;) I assume the tests don't like the static cache..

#35

Berdir - June 6, 2009 - 08:04
Status:needs work» needs review

I don't think I can get the static cache working without harming kittens, removed those things again. Maybe we can thing about this again after #481498: Update simpletest module to use drupal_static()

AttachmentSize
system-get-data-147000-D7_5.patch 21.92 KB
Testbed results
system-get-data-147000-D7_5.patchfailedFailed: Failed to apply patch. Detailed results

#36

Dries - June 6, 2009 - 16:05
Status:needs review» fixed

Looks good. Committed to CVS HEAD. I'm marking this 'code needs work' because we'll want to update the upgrade documentation. Please mark as fixed once the handbook is updated.

#37

Dave Reid - June 6, 2009 - 16:07

Thanks Berdir for picking this up where I left off. It was much needed.

#38

webchick - June 6, 2009 - 16:28
Status:fixed» needs work
Issue tags:+Needs Documentation

This was not actually marked "needs work" :)

#39

Arancaytar - June 6, 2009 - 22:22

-  // Set defaults for module info
-  $defaults = array(
-    'dependencies' => array(),
-    'dependents' => array(),
-    'description' => '',
-    'package' => 'Other',
-    'version' => NULL,
-    'php' => DRUPAL_MINIMUM_PHP,
-    'files' => array(),
-  );

+  // Set defaults for module info.
+  $defaults = array(
+    'dependencies' => array(),
+    'dependents' => array(),
+    'version' => NULL,
     'php' => DRUPAL_MINIMUM_PHP,
   );

Notice: Undefined index: package in system_modules() (line 648 of modules/system/system.admin.inc).

I'd guessed that this is a design change, but when I mentioned it on IRC Heine said it might have been an accidental regression.

Is it intended that modules must now set a package in their .info files, and the 'Other' package no longer exists?

#40

Berdir - June 6, 2009 - 23:02

I'd say that's a bug. For themes, there were 2 defaults missing too, but these caused errors and I corrected them.

I will provide a fix tomorrow.

#41

yched - June 7, 2009 - 00:28

Also, the committed patch contains

-    $themes = _system_theme_data();
-    $modules = module_rebuild_cache();
+    $themes = _system_get_theme_data();
+    $modules = system_get_module_data();

Shouldn't _system_get_theme_data() be system_get_theme_data() ?
[edit: actually _system_get_theme_data() is called directly in a few other places - not sure that is intended ?]

#42

Dave Reid - June 7, 2009 - 02:17

@yched/41: No not at all, since we don't want to load the cached data. That's why the original call is _system_theme_data() instead of system_theme_data().

#43

Berdir - June 7, 2009 - 08:33

@yched
_system_get_theme_data() is called at a few places when the database shouldn't be updated. That function does just return what's on the file system while system_get_theme_data() does extend that with the additional information from the database and stores there what has changed.

IIRC, for themes, this lead to a bug when system_get_theme_data() disabled all themes when it is called in update.php

Edit: Maybe a different name would help to indicate this? Something like system_scan_theme_data() ?

#44

Berdir - June 7, 2009 - 15:58
Status:needs work» needs review
AttachmentSize
system-get-data-147000-D7_fix.patch 652 bytes
Testbed results
system-get-data-147000-D7_fix.patchfailedFailed: Failed to apply patch. Detailed results

#45

brianV - June 10, 2009 - 14:20
Status:needs review» reviewed & tested by the community

Tested Berdir's latest patch, and it seems to do the trick.

#46

Dries - June 11, 2009 - 04:36
Status:reviewed & tested by the community» fixed

Committed. Thanks Berdir, and thanks for testing brianV.

#47

Berdir - June 11, 2009 - 09:40

Added a short note about the renamed functions: http://drupal.org/node/224333#system_get_module_data

#48

JohnAlbin - June 12, 2009 - 08:13
Title:Unify and rewrite module_rebuild_cache() and system_theme_data()» Regression: Unify and rewrite module_rebuild_cache() and system_theme_data()
Status:fixed» active

The re-roll done in comment #28 broke #351487: Remove default values for stylesheet and scripts includes from system module. It looks like an 8-month-old, out-of-date patch was used to move code around instead of actually moving the up-to-date code to the new spots.

From a quick glance at the patch, I also see "comment_user_verification" is gone from the theme settings defaults.

What else was broken?

#49

Berdir - June 12, 2009 - 10:01
Status:active» needs review

Sorry for breaking all these things, you start to rely too much on tests if you have them ;) And it probably went in a bit too fast....

The attached patch re-adds comment_user_verifcation and removes mission and css/js stuff. This was all I could find.

AttachmentSize
system-get-data-147000-D7_another_fix.patch 959 bytes
Testbed results
system-get-data-147000-D7_another_fix.patchfailedFailed: Failed to apply patch. Detailed results

#50

System Message - June 20, 2009 - 08:30
Status:needs review» needs work

The last submitted patch failed testing.

#51

Dave Reid - June 20, 2009 - 20:52

The script regressions part of the regression has already landed in #351487: Remove default values for stylesheet and scripts includes from system module.

Also marking #497442: Module uninstall leaves system table row as a duplicate of this issue.

#52

Berdir - June 21, 2009 - 09:01
Status:needs work» needs review

Re-roll with the remaining changes.

AttachmentSize
system-get-data-147000-D7_another_fix2.patch 638 bytes
Testbed results
system-get-data-147000-D7_another_fix2.patchpassedPassed: 11544 passes, 0 fails, 0 exceptions Detailed results

#53

JohnAlbin - June 21, 2009 - 10:35
Status:needs review» reviewed & tested by the community

Berdir, sorry about the conflicting patches! :-)

This patch is RTBC!

#54

Dries - June 21, 2009 - 14:16
Status:reviewed & tested by the community» fixed

Good catch, John. Committed to CVS HEAD. Thanks.

#55

System Message - July 5, 2009 - 14:20
Status:fixed» closed

Automatically closed -- issue fixed for 2 weeks with no activity.

 
 

Drupal is a registered trademark of Dries Buytaert.