Blocker for
#1808080: Add an _internal route
#1597696: Consider whether HttpCache offers any significant benefit over the existing page cache
Why this is a blocker
One of the goals for WSCCI is to enable ESI-based subrequests. These could be used for all blocks if SCOTCH comes to fruition, or just in selected cases if not. Symfony already offers support for doing so in a clean fashion, iff we support it properly.
The HttpKernel we're using (borrowed from Symfony's Framework Bundle), contains a method, render(). That method is called with the name of a controller. render() then determines what the best strategy is based on the request information: Render an ESI tag, render an hInclude tag, or fire a subrequest immediately and just inline the result. That distinction is, by design, completely transparent to the caller. It must be.
In order to generate a URI for the ESI tag or hInclude tag, render() calls HttpKernel::generateInternalUrl(). It looks like so:
public function generateInternalUri($controller, array $attributes = array(), array $query = array(), $secure = true)
{
if (0 === strpos($controller, '/')) {
return $controller;
}
$path = http_build_query($attributes, '', '&');
$uri = $this->container->get('router')->generate($secure ? '_internal' : '_internal_public', array(
'controller' => $controller,
'path' => $path ?: 'none',
'_format' => $this->container->get('request')->getRequestFormat(),
));
if ($queryString = http_build_query($query, '', '&')) {
$uri .= '?'.$queryString;
}
return $uri;
}
That is, it builds a URI to a magic route, named _internal, that contains all the information it needs to then route the subsequent ESI or hInclude request to the right controller to generate the appropriate portion of the page. It does so using the generate() method of the "router" service.
Drupal has no router service yet. What is a router service? It is a router, ie, an object that conforms to the RouterInterface:
http://drupalcode.org/project/drupal.git/blob/refs/heads/8.x:/core/vendo...
That is, it is a combination of a matcher and a generator. We already have a matcher. We do not have a generator. This issue is adding a generator.
What is a generator? A generator is the opposite of a matcher. Whereas a Matcher takes a Request and locates a Route, a Generator takes a Route and parameters and returns a URL. A Router, in Symfony speak, is simply both of those coupled together. Adding a Matcher and Generator has been on the WSCC roadmap since DrupalCon Denver in March: http://groups.drupal.org/node/22026
So then what's the chain of events here?
1) Add a Generator. Without a Generator, we cannot use HttpKernel::render() to generate internal requests.
2) Add an _internal route, and its corresponding _internal_public. Those routes are what all ESI/hInclude requests go through: #1808080: Add an _internal route
3) Add a Router service, by simply combining our existing matcher and Generator.
That enables the use of render(). That, in turn, allows for ESI anywhere.
That "ESI" is the same for a real Varnish server or for using Symfony's HttpCache: #1597696: Consider whether HttpCache offers any significant benefit over the existing page cache. That is by design. Even if you're not running Varnish, you can benefit from the resulting "Block Caching TNG". And that could be implemented even on our current Drupal 7-era crappy blocks system, although it would certainly be better with the new-hotness SCOTCH blocks system.
In addition, I have been talking to the Symfony CMF team for some time and we are planning to merge our routing libraries. That would offer them our more robust and flexible matcher, and us the more robust chain handling and other tools in the CMF Routing component. (This was discussed back in the original router/matcher issue, too: #1606794: Implement new routing system). However, doing so and then switching to the CMF Routing component also requires using a Router.
Symfony uses the generator for all URL generation. While I think we should, there is a performance question there. Because we have enough routes that we cannot load them all into memory at once, we have a single simple DB hit per route to load the route object. (On the level of a url alias lookup hit.) The immediate intention of this patch is enabling the render() method, which means only a very few calls per page (one per subrequest/ESI). That is, negligible, given that this is on the critical path to enabling partial page caching. If we extend it and start using it for all URL generation, it will need to have some performance tuning done to it, likely with caching. However, if we are to do that then it would be logical to also combine in url(), path alias lookup, and hook_url_outbound_alter(), since those all mess with outgoing paths in disparate ways. Doing all of those at once, in the generator, and then caching the result would offer considerably more flexibility and performance.
However, that is out of scope for this issue. It is also blocked on #1269742: Make path lookup code into a pluggable class, which turns path aliasing into an injectable class that could be passed into the generator.
Even if none of that happens, this is a blocking step to enabling ESI requests, and the limited usage of it there is not an appreciable performance hit.
So why critical?
I am working on the assumption that the feature set described above qualifies as a "new feature", and thus has to be in before 1 December. If I am wrong, and the committers agree that the entire pipeline and issue set described above can happen post-feature-freeze, then we can demote this to major and postpone it on #1269742: Make path lookup code into a pluggable class, after which we can do the entire generator/url/alias/hook_url_outbound_alter conversion at once, with whatever caching we end up with for performance. That will take a while, though, which means we don't even get to having the possibility of doing ESI until sometime in February, probably.
Absent such assurances that a February timeline for adding the new feature of "ESI/hInclude/partial page caching out of the box" is acceptable, I consider this issue critical, RTBC, and request that it be committed before BADCamp so that we can work on the next steps at BADCamp.
Original
Goal: a class that implements UrlGeneratorInterface and handles Drupal routes as well as actual paths or external URLs
We should make the current url a wrapper on this so that we can get it into core without changing all the url calls.
The Drupal generator should ideally have the same constructor as the symfony one - hence we need to have a route collection that taks to the Drupal DB?
On the performance front, the current url() function doesn't actually touch the DB other than to find the path alias. this makes me worry.
A possible approach to moderate the impact is that the menu links could reference both the instantiated path as well as the route machine name? In other words, a menu link entry is effectively a cache for the output of the generator?
Or should the generator or routecollection cache routes that have already been seen in this request?
Consider drupal.org with e.g. 10k menu links used by the book module. They are all node paths, so ideally we don't need to hit the DB more than once when generating the output links.
| Comment | File | Size | Author |
|---|---|---|---|
| #53 | 1705488-generator.patch | 13.38 KB | Crell |
| #51 | 1705488-generator.patch | 229.99 KB | Crell |
| #50 | 1705488-generator.patch | 13.3 KB | g.oechsler |
| #44 | 1705488-generator.patch | 11.94 KB | g.oechsler |
| #42 | 1705488-generator.patch | 11.94 KB | g.oechsler |
Comments
Comment #1
pwolanin commentedStarted working on this yesterday - my initial conclusions:
we should implement a read-only route collection that pulls the desired route from the DB.
We should extend the existing symfony generator class rather than re-implementing or copying most of the code there which is needed for generating a URL from a route. The child class should just add a method to do what url() does now.
I pushed some code, but it's not working yet.
Comment #2
Crell commentedMoving to Core, since the router is about to go in which means this can/should be done directly on core.
Comment #3
sunComment #4
Crell commentedTagging
Comment #5
pwolanin commentedOk, I'll try to migrate partial code to here.
Comment #6
Crell commentedThis is a blocker for a number of other issues now, so critical:
#1808080: Add an _internal route
#1597696: Consider whether HttpCache offers any significant benefit over the existing page cache
And a few others.
Comment #7
g.oechsler commentedGood news: I was working on this issue and at least got anywhere.
Bad news: It's a horrible hack. I's not meant to be commited, it's meant to be discussed.
What I did:
I implemented the class DrupalUrlGenerator extending Symfony's UrlGenerator.
DrupalUrlGenerator has a method url() which is returning the url for the path requested.
This is basically achieved by calling the original url() function from common.inc, which I simply moved to original_url().
The function url() in common.inc is now glue code to call DrupalUrlGenerator::url();
Every Route which is discovered within a request is cached in a RouteCollection.
There are some problems with this:
I resolved this by extending it to DrupalRouteCollection which allows adding anything as name.
To make this work I had to hack Symfony's RouteCollection, making the attribute $routes protected to be accessible from the subclass.
So adding complex routes with parameters, variables, requirements a.s.o. is impossible. This especially hurts for routes like "node/42" or "user/foo", which could be handled by route definitions with variables defined in node.module or user.module respectively.
Next steps:
PS: The patch contains the whole hacked Symfony\Component\Routing, that's why it's so big.
Comment #8
jherencia commentedComment #10
g.oechsler commentedOutch. My patch is against Crell's ancient wscci sandbox. Back to the laboratory...
Comment #11
g.oechsler commentedOk. This one is against 8.x. Let's see...
Comment #13
Crell commentedg.oechsler and I talked on Skype. He's going to take a second crack at this with more background context. Please stand by. :-)
Comment #14
webchickCan someone clarify why this is a critical task, worth holding up other features? The OP doesn't make a whole lot of sense to me. Demoting to major until that is identified.
Comment #15
Crell commentedThis is a blocker for the _internal route support from Symfony, which in turn blocks all of the sub-requests-that-cache-independently-so-we-can-get-all-the-macro-level-performance-benefits-we've-been-talking-about work. SCOTCH will need to do work after that happens too, to leverage it.
Comment #16
webchickOk, let's do this then. Cos I seriously do not understand what problem this issue is trying to solve and how all of these things can possibly be blocked on it.
Comment #17
Crell commentedThis is g.oechsler's work, I'm just posting the patch here for review purposes.
Comment #18
Crell commentedCorrection, this one.
Comment #19
Crell commentedClass names should not have "Drupal" in them. The preferred approach in this case is to alias UrlGenerator in the use statement to BaseUrlGenerator or SymfonyUrlGenerator, and then name ours just "UrlGenerator".
I'm not sure we need the parent constructor at all, do we? If not, we can skip the RouteCollection and just accept a RequestContext as a constructor parameter to our generator.
I think an empty RequestContext is going to fail tests, because it's needed for things like the base path, scheme, etc.
This should take a DB connection as a constructor argument, which gets assigned to a protected $connection property. Then here, we can call $this->connection->query().
We also don't need to select multiple routes here. All route names are unique, so we can do a ->fetchObject() call rather than fetchAll() to just get back the first (only) result.
We may want to combine the compilers at some point. Not sure yet.
I don't understand this comment. The compiled route is not stored in the database, because it breaks if we do that. :-)
Why the extra Fixtures class? If you need to change something, let's just push the changes back on the Fixtures class.
This is where we can pass in a DB connection and RequestContext for testing.
By convention, the message string should, I believe, be something like "Correct path generated for route $name". That way it reads correctly if the test comes back green.
Thanks, George! Let's see what the bot things.
I also rebased your branch against upstream/8.x (that's why the original patch above was so huge). You'll need to delete your local branch and pull the new one with the same name.
Comment #20
g.oechsler commentedThank you for coaching, Larry! Here is another shot on this.
Comment #21
g.oechsler commentedI tried to implement all of what was suggested in #19.
Right now I still use an empty RequestContext object in the UrlGenerator as it apparently does not break the tests. However, this might be because of the tests: I'm expecting "URLs" like "/node/add/article" right now, which works perfectly fine, if the $baseUrl is an empty string coming from an empty context.
Is this alright? If the UrlGenerator should be able to produce something like https://www.example.com:8081/node/add/article or other fancy stuff some context could be really helpful, though.
On the other hand UrlGenerator is a lonely object doing no harm. I wonder how this should be integrated to actually do something useful - not even daring to ask how it relates to this _internal URL thingy.
More coaching is very welcome. Cheers!
PS: I dropped the "Route has a cache of it's own" comment. It was simply copy and waste from Symfony's UrlGenerator code.
Comment #22
Crell commentedWe need to fix this before committing, although I'm not sure if we fix it by adjusting the text fixtures or setting a compiler explicitly. Or merging compilers, although that may have unpleasant performance implications. Hm. Opinions welcome.
Why the change to the dumper tests? We're not using the default table deliberately to avoid any potential collision with the host site, and to demonstrate that we can. Let's remove this change.
Instead, the generator should take a table name as a parameter, too, and default to the same table name as the dumper. That way, all key components in the system can use any table name we want.
We definitely need to test this with multiple RequestContexts. As you note, the base path is empty so a lot of things are, I think, working just kinda by accident. Let's make a fake request, use that to make a request context, and then test with that.
Comment #23
Crell commentedComment #24
g.oechsler commentedI saw that MatcherDumper::dump() is writing the route objects and data from the compiled routes to the database.
I think I could join the route compilers and write the data from Symfony's CompiledRoute (that is variable and routes) in extra columns to the database as well.
This way I could skip the compilation step in UrlGenerator::generate(), which should rather speed up things, than slowing it down.
Could this be a way to go?
Comment #25
Crell commentedIf we can reasonably combine our compilers, fine. However, since we're running compile for routes in the matching process, we don't want to make that too expensive. Refactoring that is out of scope for now, so I'm inclined to punt on it. Part of the difficulty there is that Symfony's compiled route object is not at all designed to be extended. It's rather poorly architected, frankly. :-)
I think for now let's just get it working. We have another 4 months to optimize the compilation time code after we get the functionality in. Maybe for now we just explicitly set the compiler class before compiling, and ignore the fact that it is sometimes redundant? (That's what we're doing in the dumper, basically.) Alternatively, we can remove setting the compiler in the fixtures and move that into each of the tests. I'm fine either way.
Comment #26
g.oechsler commentedHere we go with another try.
Comment #27
g.oechsler commentedComment #28
Crell commented$this->table needs to be wrapped in $this->connection->escapeTable().
Since there's only a single field coming back, we can skip fetchObject() and call fetchField(). That will return either the value of the route column, or NULL. Then the unserialize() call changes to $route = unserialize($route);
Just slightly cleaner.
And holy crap those tests are thorough! :-) Let's see how bot likes 'em. If it passes, then I think we can wire this into the DIC and move forward.
Comment #29
g.oechsler commentedI'm racing a turtle here. Half way there! :)
Comment #30
aspilicious commentedI think the class name is not prefixed with Drupal anymore
Don't forget to add docs
trailing whitespace
Inline comments in the tests would be helpfull
Comment #31
Crell commentedAh yes, and never use private properties. Those should be protected.
Comment #32
g.oechsler commentedNow UrlGenerator is registered to the DIC, a test for this exists and all of it got more docs and less stray whitespace.
Comment #33
g.oechsler commentedComment #35
Crell commentedAll I have left is minor nitpicks. Fix these and I think we're done!
This needs to be a declared property on the object.
false should be FALSE.
You don't need to use Exception. Just reference it as \Exception inline. (Newly revised coding standards.)
I know this is just a test, but let's set a good example. :-) Make the TestControllers class extend from Symfony's ContainerAware class, and then you can use $this->container->get() instead of drupal_container(). That's cleaner.
Comment #36
g.oechsler commentedFixed last quirks and rebased against upstream/8.x.
Comment #37
Crell commentedRocking! Let's get this committed and move on to the next steps.
Comment #38
aspilicious commentedI'm so srry.... Let's fix this quickly...
This should start with a \ too
Full path please
Comment #39
mbrett5062 commentedQuickly addressing #38.
Comment #40
mbrett5062 commentedHere it is.
Comment #41
aspilicious commentedThe name standard is to prefix these paths with a \to please editors. Please do that in this patch, now I have your attention ;)
Same
Same
Comment #42
g.oechsler commentedWow. Two people working on this in parallel :) I was working on this just the second I saw you volunteering, mbrett5062. Thanks anyway! On the way I found a few other namespace issues in the docs... fixing them with this
Comment #43
g.oechsler commentedLots of traffic here! aspilicous: #42 is fixing these as well!
Comment #44
g.oechsler commentedI'm afraid the last patch had one backslash too much... I wonder how that one sneaked in.
Comment #45
mbrett5062 commentedoopps sorry. Just saw a quick fix needed, and assigned anonymous.
Hope this gets in soon. Good luck.
Comment #46
Crell commentedJust skimmed it over again, and all the \ look right, I think.
Comment #47
webchickSo I asked for an issue summary here because I have no idea what this patch is doing, and reading the docs and tests in this patch, I still don't. So most of my feedback is docs-related.
Let's expand the PHPDoc here to explain this concept. How do I know when I need to use UrlGenerator in my module? What's the use case for why it exists?
Also, (nitpick) "definitions"
Should be SELECT ... FROM ... WHERE for consistency with what we do elsewhere.
Rather than putting a link to a specific comment here (which we tend not to do), let's re-formulate what's being said there into an actual @todo in the source code.
Sorry, what? How does this test show that generation is working properly..?
(nitpick) URL
Can we get a one or two-line comment above this array that explains the rationale for these particular requests?
(nitpick) Please re-word so fits into 80 characters (or 80 characters and then additional sentences below).
This assertion method should use format_string() rather than direct interpolation of variables.
This function needs PHPdoc, hopefully which helps explain what the heck is going on here.
Comment #48
Crell commentedWhat is a Generator: http://symfony.com/doc/current/components/routing.html
See the section "Generate a URL".
This is for making URLs that point to a Drupal-defined route. (We've only been talking about it in public in WSCCI postings since DrupalCon Denver...) It's critical because Symfony's HttpKernel relies on a generator in order to make links for ESI and hInclude. No generator, no partial page caching, no performance wins.
Comment #49
webchick"Symfony's HttpKernel relies on a generator in order to make links for ESI and hInclude. No generator, no partial page caching, no performance wins."
Great, so include some wording to that effect in the UrlGenerator PHPDoc to help orient people a bit.
"(We've only been talking about it in public in WSCCI postings since DrupalCon Denver...)"
That sort of snark isn't appreciated, thanks. People reading D8 source code 3 years from now are not going to even know there was a DrupalCon Denver. It needs to be clear from the docs in the code what is happening here.
Comment #50
g.oechsler commentedHere's my shot on most in #47.
I got a bit chatty in the comments, I'm afraid. Especially with RouterTest:testUrlGenerator(), where the whole setup of the test is scattered over three files. While trying to point out how this all fits together, I ran in a quite unusual problem: The path where to find a certain file, like core/modules/system/tests/modules/router_test/lib/Drupal/router_test/TestControllers.php won't fit in 80 characters. I went with a backslash break those extra long lines, but I'm not sure if that's ok and I could not find a hint to handle this in the coding standards.
One thing remains:
"Let's expand the PHPDoc here to explain this concept. How do I know when I need to use UrlGenerator in my module? What's the use case for why it exists?"
I don't know when and why to use an UrlGenerator in a module. In the issue summary it says something about replacing url(). If this happend the use case would be clear immediately, I guess. Until then, I'll have to take Crell's word, that it's somehow important for some weird stuff I had to google before I knew what he was talking about. :) I'm pretty sure I do not want to elaborate on ESI et al. in the docs of UrlGenerator - this innocent little class simply does not care about webservers or caching subrequests. My problem is that I honestly have no clue what to write there apart of the obvious: It generates URLs.
Comment #51
Crell commentedRevised version of #50 that includes an improved class docblock to explain what it does.
If this is still bot-happy, let's get it committed so we can move on to the next steps. (And close this critical.)
Comment #52
berdirLooks like the branch you created the patch in #51 from needs a rebase/merge against 8.x.... That's a bit too big ;)
Comment #53
Crell commentedAhem... One of these days I'll learn how to use Git...
Comment #54
g.oechsler commentedAh, that's some smart wording for the class docs. The code was already agreed upon and documentation is quite insightful now. Looks like we're done!
Comment #55
catchThis is supposed to be used instead of url() right? If that's the case (or even if it isn't, supposing sufficient usage), then adding a database query for every call is completely unacceptable. I said this very clearly in the router issue (when it had a stub generator implementation) and that bit hasn't changed here.
Haven't reviewed the rest of the patch yet.
Comment #56
Crell commentedIt is not (yet) a complete replacement for url(). url() if nothing else will still be available for arbitrary URLs. But this is a requirement to link to routes by route name, which is mandatory if we want to use the render() method of HttpKernel.
We can add caching around this if we need to, but that can be done post-feature-freeze. There's other enhancements we can make post-freeze, too. Right now this is a feature blocker for the ESI-all-the-things pipeline that we've been trying to get to since forever.
Comment #57
webchickRe-assigning to catch.
Comment #58
catchNo I'm not committing patches with obvious performance regressions and no discussion on how to fix them because "we can fix that after feature freeze", there's already a long list of issues at #1744302: [meta] Resolve known performance regressions in Drupal 8 that also need to be fixed after feature freeze and it's not getting any shorter.
Blocks at their own URI is already a critical task, as is this issue, so it can go in after feature freeze, without introducing yet another performance regression just for the promise of ESI (which is not going to improve out of the box core performance at all without solving a tonne of far more complicated issues than this one).
Also I don't see how this possibly blocks #1597696: Consider whether HttpCache offers any significant benefit over the existing page cache since that's for the page cache, doesn't need sub-requests or anything like that. There's no discussion in that issue at all of how this one is a blocker, it's been blocked on quite unrelated stuff for a while now.
Comment #59
sunI think the closest solution here would be to implement "static" caching along the lines of menu_get_item(), which retrieves, translates, and checks each requested path/route for access, depending on the passed arguments, in HEAD already.
Of course, every l() and/or url() is a completely different dimension, but at least I think that comes closest.
Comment #59.0
sunUpdated issue summary. Add blockers
Comment #60
Crell commentedUpdated the summary with a detailed review of the work this issue is blocking and why it is not a performance issue in its current form. There is no performance regression in this patch. There is "work to be finished later that is blocked on other issues still".
Comment #61
Crell commentedTalked this over with catch at BADCamp. New plan is here: #1830854: [meta] The ESI pipeline battle plan
Comment #62
Crell commentedThis is now folded into #1874500: CMF-based Routing system.
Comment #62.0
Crell commentedElaborate on the full dependency chain that this patch is blocking.