So this has come up in a number of issues now and webchick asked me to make an issue for it, so here goes. :-)
The basic problem is that extend() doesn't modify the $query object itself; it can't. It returns the new $query object, because that's all it can do. It's up to the caller to assign that back to a variable or to use it properly.
That works fine if you fully chain, or if you assign the variable back explicitly. That is, either of these work:
$query = db_select('node');
$query
->fields()
->extend('PagerDefault')
->limit(5)
->execute();
$query = db_select('node');
$query
->fields();
$query = $query->extend('PagerDefault');
$query->limit(5);
$query->execute();
However, this will not work but is what people keep trying to do:
$query = db_select('node');
$query
->fields()
->extend('PagerDefault')
->limit(5);
$query->execute();
That won't work because extend() returns the new query object, which then has limit() called on it, and then that new query object is lost because nothing is referring to it. extend(), despite what it seems you'd want it to do, doesn't *really* magically add functionality to an object. It creates a new object that has new magical functionality that you can keep using, but you have to use that object.
I discussed with webchick the coding convention for extends() a while back, and we decided for performance reasons to encourage late-extending; putting the extend() all the way at the end of the query chain. I think now that was a mistake. Instead, we should for DX encourage early extending:
$query = db_select('node')->extend('PagerDefault');
$query->join(...);
$query
->fields()
->fields()
->orderBy()
->limit(5)
->execute();
That way, it doesn't matter if you break the chain later on. If you ever assign the query object to a variable, it will always be the extended decorator object. That means it doesn't matter how you chain, it will always be the right object. Similarly if you are extending multiple times (pager and tablesort), then you do that right at the beginning:
$query = db_select('node')->extend('PagerDefault')->extend('TableSort');
$query->join(...);
$query
->fields()
->fields()
->orderBy()
->limit(5)
->execute();
That way you also state up front what extra methods will be available on the query. It's like top-declaring a variable in a function. That makes it easier to read later. The flipside is that all method calls then have to go through multiple layers, since they need to get delegated through the decorator object to the query. For the vast majority of them it is just one extra stack call, no expensive magic happening, but it's still there.
I think our DX would be far better that way, and I'm willing to eat the fairly minor performance cost.
Thoughts?
Comments
Comment #1
ksenzeeWow... that last bit of code is a lot clearer. I think this would not only prevent errors, it would help people better understand what's going on. I'm +1 early extending unless it's a significant performance hit, and I can't imagine it would be that significant with the number of pager and tablesort queries we use.
Comment #2
berdirYes, the early-extending looks a lot more clean, it's more obvious to what's going on.
It might be "hacky", but would it be possible to somehow "flag" the old object, that it shouldn't be used anymore? A *very* ugly example would be to remove something, for example the fields, so that the query doesn't work anymore. Or better, create a warning when you call a method on a object. I don't think that would work without extending the base query object, though. We could add a boolean property to SelectQuery, and every Extender or everything else that's going to return a new object would set that property. execute() would then check if that property is set and create a warning.
Atleast, that might help to reduce the "Omgosh, the pager is broken"- effect :)
Comment #3
Crell commentedWe can't cripple the base query object, because in the end it is the base query object that still runs. It's still doing most of the hard work. We're just enabling additional methods to be wrapped around it. Its execute() method is still the one that actually calls the database.
Comment #4
webchickI don't even fully remember why I advocated for late extenders in the first place, so I am fine with this change as long as it has broad developer support (which looks like so far, so good!).
The nice thing about that last bit of code is it maps a lot closer to what the original code would've been, which would be something like:
It would be good to have some benchmarks though to understand exactly what the performance implications are. D7 is already damn slow as it is.
Comment #5
berdirSome PHP benchmarks:
Used code:
This gives me the following results:
----
Late extending: 1628.89ms
Early extending: 1630.9ms
----
Late extending: 1566.5ms
Early extending: 1585.54ms
----
Late extending: 1541.91ms
Early extending: 1542.47ms
----
When I remove the ->execute() calls (meaning: only testing the method calls, no database involved), I'm getting the following results:
----
Late extending: 70.47ms
Early extending: 72.96ms
----
Late extending: 72.11ms
Early extending: 76.92ms
----
Late extending: 69.5ms
Early extending: 71.63ms
----
Not yet done: Use a more complex example, the early binding is probably slower if more method calls are used.
Comment #6
Crell commentedSo for these samples, the overall cost is about 2 ms per 100 queries. I am comfortable with that for the considerably improved DX, especially as there are performance improvements in Drupal that we can make elsewhere if we just take the time to do so. :-)
Comment #7
dries commentedI agree that early decoration makes it better, however ...
The basic problem is that extend() doesn't modify the $query object itself; it can't. (Written by Crell)
Could you elaborate on this? I think we might be able to modify the existing class so I'd like to understand why you think we can't. There is very little you can't do in PHP, especially combined with some conventions.
The fact that extend() doesn't modify the original object is unexpected from an APIs point of view, and I don't think "early extending" solves the DX problem entirely. Early extending is better, but still not perfect. I'd like to understand the reason we can't modify the object -- it doesn't sounds like we'd be crippling the object. Unless I'm mistaken, we should be able to modify the object with 100% precision.
Comment #8
berdirextend() simply uses the Decorator pattern to "wrap" the old query object with a new one. This means that you can't easily change the old query object because you still use it, you pass everything through that except of a few functions you overwrite/add.
Theoretically, it would also be possible to use static methods, that would make it clear that you have to use the new object. But that would expose a new syntax to "normal" developers and it is a pain to handle without late static binding :)
$query = PagerDefault::extend($query);What is currently missing is a documentation page which explains these extenders and how to use them. We need to make it obvious there that it's important that the returned object is used.
Comment #9
Crell commentedDocs page is on my todo list, now that I seem to finally be past whatever illness I've had for the past month. :-) I want to get this question settled first.
@Dries: I don't believe PHP lets you recast $this to a new class type, especially not one that is not in the inheritance tree. To be fair I've never tried it, but I would be extremely surprised if it's possible. That's essentially what we want to do, and the current method is the only way to emulate it that I can think of.
The more traditional composition syntax would simply be:
But I am actively trying to keep "new" away from module code as it is very brittle. It also cannot be chained, period, because PHP doesn't let you chain instantiations. (I have tried that.)
Comment #10
berdir$this cannot be re-assigned since PHP 5.0, it was possible in older versions.
Comment #11
Crell commentedBumping so that I don't forget about it. It is just a documentation task at this point.
Comment #12
Crell commentedThis apparently got documented a while ago. :-)