Upgrade drupal_alter()

sun - October 1, 2009 - 20:53
Project:Drupal
Version:7.x-dev
Component:base system
Category:task
Priority:critical
Assigned:Unassigned
Status:closed
Issue tags:API clean-up, Performance, PHPWTF
Description

We can update drupal_alter() for PHP5.

Test code (to put into index.php/debug.php):

<?php
$foo
= array('foo' => 'bar');
$bar = new stdClass;
$bar->foo = 'bar';

echo
"<pre>"; var_dump($foo, $bar); echo "</pre>\n";

function
system_foo_alter($foo, $bar) {
 
$foo['foo'] = 'baz';
 
$foo['baz'] = TRUE;
 
$bar->foo = 'baz';
 
$bar->baz = TRUE;
}

drupal_alter('foo', $foo, $bar);

echo
"<pre>"; var_dump($foo, $bar); echo "</pre>\n";
?>

Result:

array(1) {
  ["foo"]=>
  string(3) "bar"
}
object(stdClass)#16 (1) {
  ["foo"]=>
  string(3) "bar"
}

array(1) {
  ["foo"]=>
  string(3) "bar"
}
object(stdClass)#16 (2) {
  ["foo"]=>
  string(3) "baz"
  ["baz"]=>
  bool(true)
}

Hence:

- $bar is altered by reference.

- $foo is not altered by reference, even when using &$foo in the function signature.

chx or anyone else to the rescue!

AttachmentSizeStatusTest resultOperations
drupal.alter_.patch3.3 KBIdleFailed: 9236 passes, 117 fails, 51 exceptionsView details | Re-test

#1

Damien Tournoud - October 1, 2009 - 21:04
Status:needs review» needs work

- $bar is altered by reference.

$bar is an object. You never pass the object itself, but a reference to it.

- $foo is not altered by reference, even when using &$foo in the function signature.

There is no &$foo in the function signature.

#2

sun - October 1, 2009 - 21:40
Status:needs work» needs review

I already mentioned that changing the function signature doesn't change the behavior with regard to arrays.

Anyway.

Here's a much cleaner approach that every single noob on this planet will understand.

- Removes some ugly code.

- Applies a consistent pattern.

- Saves us some cycles. (probably not much though)

- Also fixes a wrong usage of drupal_alter() in NodeAccessRecordsUnitTest

AttachmentSizeStatusTest resultOperations
drupal.alter_.2.patch29.72 KBIdleFailed: 13620 passes, 5 fails, 0 exceptionsView details | Re-test

#3

System Message - October 1, 2009 - 21:50
Status:needs review» needs work

The last submitted patch failed testing.

#4

sun - October 2, 2009 - 12:26
Status:needs work» needs review

HEAD was broken

#5

System Message - October 2, 2009 - 12:40
Status:needs review» needs work

The last submitted patch failed testing.

#6

sun - October 2, 2009 - 18:41
Status:needs work» needs review

This one fixes drupal_goto() tests.

I have no idea why session tests are failing (they also do for me locally), because all instances of drupal_alter() throughout core are converted now. :-/

AttachmentSizeStatusTest resultOperations
drupal.alter_.6.patch30.04 KBIdleFailed: Failed to apply patch.View details | Re-test

#7

chx - October 2, 2009 - 22:19

No, no, no. There is no need to do this. What you want and what PHP5 allows while PHP4 does not is &$foo = NULL and then call out based on func_num_args.

#8

sun - October 3, 2009 - 03:40

@chx: Can you please explain that a bit more, post an example snippet, or just point to some docs on the net? I searched, but couldn't find any mentioning of the approach or trick you propose. (also no mention on http://www.phpwtf.org ;)

#9

chx - October 3, 2009 - 08:07

<?php
function drupal_alter($type, &$data, &$arg1 = NULL, &$arg2 = NULL)
  switch (
func_num_args()) {
    case
2: $function($data); break;
    case
3: $function($data, $arg1); break;
?>

#10

sun - October 4, 2009 - 02:03

Thank you!

Given that core has one implementation that uses 5 arguments, a max of 7 looks reasonable.

drupal_alter('field_attach_view', $output, $obj_type, $object, $build_mode, $langcode);

AttachmentSizeStatusTest resultOperations
drupal.alter_.10.patch4.99 KBIdleFailed: Failed to install HEAD.View details | Re-test

#11

System Message - October 4, 2009 - 02:20
Status:needs review» needs work

The last submitted patch failed testing.

#12

sun - October 4, 2009 - 05:04
Status:needs work» needs review

- Using syntax chx proposed.

- Added tests.

AttachmentSizeStatusTest resultOperations
drupal.alter_.12.patch9.14 KBIdleFailed: Failed to install HEAD.View details | Re-test

#13

System Message - October 4, 2009 - 05:20
Status:needs review» needs work

The last submitted patch failed testing.

#14

Damien Tournoud - October 4, 2009 - 09:12

I do prefer the syntax sun initially proposed, because it allows you to choose which parameter is alterable or not.

#15

sun - October 4, 2009 - 15:44

FWIW, I'd also prefer #6 for the same reason. However, it's possible that performance gains of not using cufa() may outweigh this. (My machine is unsuitable to run benchmarks, sorry)

In the end, it only applies to non-objects anyway... and usually, drupal_alter() is not used for more than 7 arguments.... so, not sure whether the arguments really stand.

#16

sun - October 6, 2009 - 22:28

Guys, we quickly need a decision here.

#17

chx - October 6, 2009 - 22:49

We need more people in here.

#18

bjaspan - October 6, 2009 - 22:52

subscribe

#19

Crell - October 6, 2009 - 23:18

In previous benchmarking in other issues, skipping cufa() doesn't buy us all that much. I was trying to remove it from module_invoke_all(), I think, and we went up to 10 case statements. Given how slow cufa() is I was surprised at that, but that issue died because the benefit wasn't large enough to be worth the hideous code that it required. An alternate approach that passed in a lot of NULL values didn't fare any better (and also did crazy things to default value handling, because it borked all calls to func_num_args()). drupal_alter() is far less frequently called than module_invoke_all(), so I think avoiding cufa() here is an unnecessary micro-optimization.

Moving to drupal_alter($key, $array) instead of drupal_alter($key, ...) is inline with other such shifts we've made in that direction lately, including the DB layer. It also means that we get the functionality we want with vastly less code, which I also like. So I'm +1 to the approach in #6.

That said, "$data" should never be used as the name of a variable. :-)

#20

chx - October 6, 2009 - 23:28

Hm, so it's not a DrupalWTF that even if you want to pass in a single argument which is the most common use case of drupal_alter you need to pass in an array(&$foo) instead of $foo? Seriously?

#21

sun - October 7, 2009 - 00:33

Both approaches compared: http://drupalbin.com/11713

#22

Dave Reid - October 7, 2009 - 00:36

#6 is my preferred approach. There are times I would not want things altered in the hook_alter implementations and it would be harmful to do so. Using #6 it's up to the caller to pass the arguments correctly once. The latest solution it's up to each and every single implementation to make sure they do it right. One is much easier than the other.

#23

sun - October 7, 2009 - 00:38

That idea doesn't apply to objects though.

#24

Damien Tournoud - October 7, 2009 - 00:40

That idea doesn't apply to objects though.

It does exactly the same: it's up to the caller to correctly clone the object it doesn't want to see altered.

#25

catch - October 7, 2009 - 01:16

I did microbenchmarks of HEAD vs. sun's version vs. chx's version on the most commonly called use-case (in core, but seems likely it would apply to a lot of sites too) - a single argument to drupal_alter() with no actual implementations. Currently that's done by file_create_url() about eighteen times per request (mainly drupal_add_css()) - attached webgrind screenshot which just shows the current distribution of calls in vanilla HEAD.

What's important to point out here, is that we're not testing call_user_func_array() at all here - module_implements() is returning an empty array, but we get a 50% saving from the function calls to other things that are skipped in both sun's and chx's approaches (array_merge(), is_array() etc.)

100,000 iterations.

nothing: 0.0400650501251 seconds

function no_op(): 0.455811977386 seconds

drupal_alter() (HEAD): 4.44066381454 seconds

drupal_alter_sun(): 2.04327583313 seconds

drupal_alter_chx(): 2.15555810928 seconds

Attached script in case someone wants to play with it, would make sense to benchmark something with an actual implementation as well, but I there's no chance either approach would be slower than HEAD in any situation as far as I can see, and probably not much between them either.

I slightly prefer drupal_alter('hook_name', $thing_to_be_altered); over drupal_alter('hook_name', array(&$thing_to_be_altered)); because that's the most common case, but would be fine with either.

#26

catch - October 7, 2009 - 01:18

Forgot to attach script..

AttachmentSizeStatusTest resultOperations
micro.php_.txt2.19 KBIgnoredNoneNone

#27

mattyoung - October 7, 2009 - 02:23

.

#28

bjaspan - October 7, 2009 - 02:43

This was a fast review so I may have missed the point.

I do not have the mastery of subtle PHP details that chx does, but I thought that & is for function parameter declarations, not function arguments.

I'm fine with the 7-arg approach in #12. I'm also happy with the cufa approach, because I don't really know the performance implications. If #12 is faster, then great.

Minor copy-and-paste error:

+/**
+ * Tests for URL generation functions.
+ */
+class DrupalAlterTestCase extends DrupalWebTestCase {

#29

Crell - October 7, 2009 - 05:07

So it looks like performance- wise cufa() vs. switch is a wash, and in practice the difference from HEAD is minor either way as well.

Whether or not #6 is a DrupalWTF depends on which you consider worse: variable parameter count but no extra array(), or an extra array() but a fixed number of documented parameters on functions.

Personally I favor the latter, and Drupal has been trending that way. The new DB API, for instance, goes as far as requiring an associative array even for a single value. Being able to pass in an array also allows for, potentially, some funky dynamic logic in user space. since we don't have funky dynamic logic inside the API. Frankly I just dislike variable parameter count functions in general.

#30

catch - October 7, 2009 - 05:27

So it looks like performance- wise cufa() vs. switch is a wash, and in practice the difference from HEAD is minor either way as well.

No, it doesn't look like that. I specifically said I ran microbenchmarks on a drupal_alter() with zero implementations - equivalent to hook_file_url_alter() in a clean HEAD currently. So those benchmarks cover everything except cufa() vs. $function.

If #572618: All theme functions should take a single argument to make preprocess sane and meaningful gets in, that'd swing things in favour of the single array argument here too.

#31

catch - October 7, 2009 - 05:35

Here's the webgrind output I failed to post before - front page of vanilla HEAD with no content.

According to webgrind, drupal_alter() takes 2.27% of a page request internally (obviously this will vary widely depending on which page and which site). If we reduce that time by half, then we're looking at ~1% improvement with either of the proposed patches. It's not exciting, but cutting 50% out of one of the top 10 most expensive functions in HEAD isn't to be sniffed at either - there's a few patches around like that at the moment, and they all add up.

AttachmentSizeStatusTest resultOperations
drupal_alter.png525.56 KBIgnoredNoneNone

#32

sun - October 7, 2009 - 09:28

One potential (major) issue I see is that (contrib) developers who are not so familiar with PHP references could forget the ampersands in their drupal_alter() invocations. Admittedly, those can be fixed easily, but still, could easily lead to many issues.

Brainstorming. How about a third approach?

Thus far, the most common use-case is drupal_alter($thing). The second most common use-case is drupal_alter($form, $form_state). We only have this weird switch statement, because we think that we need to support more arguments. In reality, we use a different pattern for "a variable amount of arguments" elsewhere in core already:

function drupal_alter($type, &$one, &$two, &$context) {
}

Whereby $two could already serve the purpose of $context when there is only one alterable argument. In fact, the function signature of drupal_alter() doesn't matter at all.

+++ modules/field/field.attach.inc 2 Oct 2009 18:22:54 -0000
@@ -1193,7 +1193,7 @@ function field_attach_view($obj_type, $o
   // Let other modules make changes after rendering the view.
-  drupal_alter('field_attach_view', $output, $obj_type, $object, $build_mode, $langcode);
+  drupal_alter('field_attach_view', array(&$output, $obj_type, $object, $build_mode, $langcode));

@@ -1228,7 +1228,7 @@ function field_attach_preprocess($obj_ty
   // Let other modules make changes to the $variables array.
-  drupal_alter('field_attach_preprocess', $variables, $obj_type, $object, $element);
+  drupal_alter('field_attach_preprocess', array(&$variables, $obj_type, $object, $element));

+++ modules/node/node.module 2 Oct 2009 18:22:54 -0000
@@ -2421,7 +2421,7 @@ function node_access_grants($op, $accoun
   // Allow modules to alter the assigned grants.
-  drupal_alter('node_grants', $grants, $account, $op);
+  drupal_alter('node_grants', array(&$grants, $account, $op));

+++ modules/system/system.module 2 Oct 2009 18:22:54 -0000
@@ -1893,7 +1893,7 @@ function _system_get_module_data() {
     // Invoke hook_system_info_alter() to give installed modules a chance to
     // modify the data in the .info files if necessary.
-    drupal_alter('system_info', $modules[$key]->info, $modules[$key], 'module');
+    drupal_alter('system_info', array(&$modules[$key]->info, $modules[$key], 'module'));

@@ -1986,7 +1986,7 @@ function _system_get_theme_data() {
       // Invoke hook_system_info_alter() to give installed modules a chance to
       // modify the data in the .info files if necessary.
-      drupal_alter('system_info', $themes[$key]->info, $themes[$key], 'theme');
+      drupal_alter('system_info', array(&$themes[$key]->info, $themes[$key], 'theme'));

These are the only places in core where we have more than 2 arguments.

So what if we "simply" rewrite those to pass a $context instead? I mean:

  $context = array(
    'obj_type' => $obj_type,
    'object' => $object,
    'build_mode' => $build_mode,
    'langcode' => $langcode,
  );
  drupal_alter('field_attach_view', $output, $context);

Note: $output not invented here. ;)

If you want to pass anything in $context by reference, you put it into $context by reference (as documented in the patch in #6). The maximum that drupal_alter() would support then would be:

  drupal_alter('form', $form, $form_state, $context);

#33

catch - October 7, 2009 - 11:36

That seems like a good option. In the cufa() benchmarks done on another issue, the main slowdown from the switch pattern was the number of arguments passed and/or passing NULL arguments - so it might even be a touch faster too.

#34

sun - October 7, 2009 - 11:51

It is even *very* debatable whether $form_state isn't a "context" already... If we say it is, then we'd have drupal_alter($type, &$data, &$context);

#35

catch - October 7, 2009 - 13:02

To summarize, we have the following options:

<?php
// status quo.

// Status quo with a max of 7 arguments.

// One big array.
drupal_alter('foo', Array array(&$stuff);

// Two alterable arguments, then an array for context.
drupal_alter('foo', &$stuff, &$more_stuff_or_context, Array $context);

// One alterable argument, then an array for context
drupal_alter('foo', &$stuff, Array &$context);
?>

Looking at those, I really like the last one.

#36

sun - October 7, 2009 - 13:18

+++ includes/form.inc 1 Oct 2009 20:38:44 -0000
@@ -669,20 +669,11 @@ function drupal_prepare_form($form_id, &
+  // Invoke hook_form_alter() implementations.
+  drupal_alter('form', $form, $form_state, $form_id);

Whoopsie. ;)

#37

sun - October 7, 2009 - 13:43
Status:needs work» needs review

ok, this is what it would look like. We need to agree on the approach before changing all code to use $context.

AttachmentSizeStatusTest resultOperations
drupal.alter_.37.patch9.25 KBIdleFailed: Failed to install HEAD.View details | Re-test

#38

System Message - October 7, 2009 - 14:00
Status:needs review» needs work

The last submitted patch failed testing.

#39

Crell - October 7, 2009 - 15:42

Do we have any cases where we want to alter more than one thing in one shot? I can't think of any. Maybe $form_state...

drupal_alter($type, $thing_to_alter, $metadata) seems like a good pattern. Whether we want to include a 3rd param that is an "alter this or use it as context", I'm undecided. I guess I can go either way. The overall concept of a fixed parameter count and a keyed array to handle "Extended" information, however, I am +1 on.

#40

Dave Reid - October 7, 2009 - 16:35

#320331: Turn custom_url_rewrite_inbound and custom_url_rewrite_outbound into hooks would want two things altered in hook_url_alter_outbound(): $path and $options. The idea of drupal_alter($type, &$data, &$context1 = NULL, &$context2 = NULL) is I think the best solution proposed so far. No cufa() or ugly switch's.

#41

sun - October 7, 2009 - 16:44

And... yes, we need at least 3 arguments, because of hook_form_alter($form, $form_state, $form_id) - changing that would be a nightmare.

All 3 arguments to drupal_alter() are passed by reference, but you can pass a keyed array as $context and put non-alterable variables in there (as described in the Doxygen in the patch).

#42

chx - October 7, 2009 - 16:46

I am in support of function drupal_alter($type, &$data, &$context1 = NULL, &$context2 = NULL) if you find a better name for context1 and context2. More like context1 needs a better name , context2 should be just context or args. Actually then function drupal_alter($type, &$data, &$context = NULL, &$args = NULL) ?

#43

Crell - October 7, 2009 - 18:40

I'm cool with chx's naming in #42. If we want to allow 3 primitives for simplicity, then we should not put a type hint on $args. Otherwise we should make it drupal_alter($type, &$data, &$context = NULL, Array &$args = NULL). I am comfortable either way.

#44

sun - October 7, 2009 - 18:47

Type-hinting won't fly, again, because of hook_form_alter($form, $form_state, $form_id).

Seriously, I have troubles to think of better variable names. $context can be the second OR third parameter, whatever makes sense for your code.

And I already mentioned in IRC that those variable names are invisible. Neither the caller nor the hook implementations use them. So, to me this sounds like bikeshedding.

#45

catch - October 7, 2009 - 18:59

<?php
function drupal_alter($type, &$data, &$context = NULL, &$args = NULL)
?>

Looks great.

#46

bjaspan - October 7, 2009 - 21:10

Just out of curiosity: What exception to code freeze does this fall under?

#47

sun - October 7, 2009 - 21:11

@bjaspan: "API clean-up". @see http://drupal.org/project/issues/search/drupal?status[]=Open&version[]=7.x&issue_tags=API+clean-up

#48

sun - October 9, 2009 - 12:55
Issue tags:+D7 API clean-up

Tagging absolutely critical clean-ups for D7. Do not touch this tag. Please either help with this or one of the other patches having this tag.

#49

sun - October 12, 2009 - 21:55
Status:needs work» needs review

Not sure why it failed to install HEAD.

This patch converts the only to drupal_alter() invocations in field.attach.inc to use a $context.

AttachmentSizeStatusTest resultOperations
drupal.alter_.49.patch12.87 KBIdleFailed: Failed to install HEAD.View details | Re-test

#50

System Message - October 12, 2009 - 22:05
Status:needs review» needs work

The last submitted patch failed testing.

#51

sun - October 13, 2009 - 01:53

#216098: drupal_goto() does not follow same parameters as url() is blocked by this issue. Please help - I don't know why testbot fails to install HEAD.

#52

sun - October 13, 2009 - 03:31
Status:needs work» needs review

catch was able to replicate the failure.

This one should pass.

AttachmentSizeStatusTest resultOperations
drupal.alter_.52.patch14.84 KBIdlePassed: 13818 passes, 0 fails, 0 exceptionsView details | Re-test

#53

chx - October 13, 2009 - 04:23
Status:needs review» reviewed & tested by the community

Nice job.

#54

catch - October 13, 2009 - 05:54
Status:reviewed & tested by the community» needs work

Lovely patch, gave it a final pass through and found nothing to complain about. Benchmarks from #25 still apply.

#55

catch - October 13, 2009 - 05:54
Status:needs work» reviewed & tested by the community

Sorry.

#56

catch - October 13, 2009 - 06:12

Benefits of this patch:

drupal_alter() goes from a bit of a monster to 5 lines of easy to understand code.

We get a 50% internal performance improvement in one of Drupal's most expensive functions (top ten most expensive on a vanilla HEAD install).

This combined with module_implements() caching means adding an _alter() hook somewhere is more or less free.

While this patch only gives us at most a 1% saving in page execution time, it's in the critical path, so that's on every, single, Drupal request ever served. Also we don't know what drupal_alter() calls contrib modules will do, which could increase it to 3-5% in some situations.

There are no ab benchmarks on this issue because 1% of page execution time equates to about 0.75% in ab, and ab isn't designed to measure with that level of precision - since it's issuing http requests, not just the application.

#57

kwinters - October 13, 2009 - 13:33

Subscribing, this is sort of holding up #216098: drupal_goto() does not follow same parameters as url().

#58

Dries - October 13, 2009 - 16:39
Status:reviewed & tested by the community» fixed

Looks great, removes some ugly hacks, and is faster. Committed. Good job, sun, chx et al.

#59

webchick - October 15, 2009 - 00:41
Status:fixed» needs work
Issue tags:+Needs Documentation

Let's document this in the upgrade guide.

#60

sun - October 15, 2009 - 08:25
Issue tags:-D7 API clean-up

.

#61

Crell - October 18, 2009 - 22:55
Status:needs work» fixed
Issue tags:-Needs Documentation

Update docs added.

#62

System Message - November 1, 2009 - 23:00
Status:fixed» closed

Automatically closed -- issue fixed for 2 weeks with no activity.

 
 

Drupal is a registered trademark of Dries Buytaert.