Problem/Motivation
#2368987: Move internal page caching to a module to avoid relying on config get on runtime moved the page caching to its own module and removed the caching setting instead (so caching is dependent on the module being enabled or not). Before that, the browser detection only worked if page caching was turned off. That setting was removed, so browser language negotiation now runs even with page caching enabled, but is not cached. It uses the kill switch instead:
\Drupal::service('page_cache_kill_switch')->trigger();
To make it cacheable, we need to overcome a few problems.
A. If browser language negotiation is before URL language negotiation, you may get different language negotiated for the same path. We should ensure browser negotiation is after URL negotiation.
B. Once we ensure that browser negotiation is after URL language detection, we need to ensure that browser negotiation redirects to the correct language specific URL.
C. You may use browser negotiation without URL language detection. We need to somehow ensure that you only use browser language negotiation if we can redirect to a concrete URL that is explicit for the negotiated language (contrib negotiation methods may be involved, session negotiation may be fed with an URL parameter to set this, etc).
D. Even if you use browser negotiation and URL negotiation, if one of the languages has an empty path prefix, it will fall through URL detection, so may still have different languages or is not redirectable to. Needs to be ensured that if browser language negotiation and URL negotiation is used, all languages have a prefix.
Proposed resolution
Problem B and C (implemented in #65)
Implement language redirect:
- Implement a new method onKernelRequestLanguageRedirect()
on the LanguageRequestSubscriber
.
- This method is called after current language is detected and routes are initialized.
- The method first ensures that redirect is allowed: request method is GET, there is no "destination" parameter in the current request, etc.
- The method constructs URL to the current route. The URL is processed with URL outbound processors, so it may contain language path prefix or other language identifier.
- The method compares the result URL with the requested URL. If the differs, it assumes that the language identifier was added and redirects to the result URL.
Here is a behavior example for the case when a) URL negotiation is enabled and language prefixes are set up properly, b) browser negotiation is enabled and goes after the URL negotiation.
Before patch:
- "domain.com/" URL accessed
- "domain.com/" page cache is killed
- all links rendered on the page leads to language prefixed URLs
After patch:
- "domain.com/" URL accessed
- "domain.com/" page cache is killed
- user is redirected to language prefixed URL ("domain.com/en") which can be cached
So, the patch does not avoid cache kill, instead it introduces language redirect. (Which in D7 was usually performed by globalredirect module.)
Problem A and D
Implement hook_requirements() that warns user about the possible cache issues
- if there is more that 1 language and the path prefix is not set for some language
- if browser negotiation is the only enabled method
- if URL negotiation is set after the browser negotiation
Alternatively, the language_negotiation_url_prefixes_update() function can be updated to always add language path prefix to all languages. But this requires some tests update, see #66 for details.
Alternative resolution
(implemented in #116)
- Introduce new response policy `Vary`, that will be less disruptive than `KillSwitch`.
- Modify PageCache to respect Vary headers coming from response.
Latter is a good thing in general, going beyond the scope of language detection.
Currently PageCache distinguishes cached pages only by predefined "tags": host, URL, and requested format. The only supported Vary header is cookie, and it's handled in ad-hoc manner.
Since page cache occurs on the early stage of request processing, the fastest/easiest way to get the list of vary headers is to fetch them from "generally" cached response object, and then check if there are any headers in request object that match vary criteria.
Regarding redirect and related problems A-D, I believe they should be handled in #2641118: Route normalizer: Global Redirect in core, and this issue should take care of cache awareness of browser language detection.
Remaining tasks
+ Fix tests
+ Write specific tests
- Review
- Maybe address Problem A and D.
User interface changes
Possibly errors for unsupported configurations.
API changes
Not likely. Additions needed.
Comment | File | Size | Author |
---|
Issue fork drupal-2430335
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #1
Wim LeersComment #2
Gábor HojtsyAlso based on \Drupal\Core\StackMiddleware\PageCache::lookup() we should set a 'no-cache' directive when setting a redirect, so its not cached.
Comment #3
catchArchaeology: #339958: Cached pages returned in wrong language when browser language used.
Comment #4
Fabianx CreditAttribution: Fabianx commentedThe proposal from an IRC meeting with Wim, catch and berdir was:
- Whenever browser or url language or user language negotiation is in play, send a 302 and a Cache-Control: no-cache header. (possibly using the PageCache request policy or a cache_context.uncacheable)
- On language aware paths (/en,/fr,/xyz) the path always takes precedence and this can be cached then.
- If a user language is negotiated cache_context.user needs to be set.
Comment #5
Gábor HojtsyIt does not seem like anyone is working on this one :/
Comment #6
Fabianx CreditAttribution: Fabianx commentedComment #7
victor-shelepen CreditAttribution: victor-shelepen commentedThere is a proposition to modify cid: we need get the real page path, add a language key a separated key.
It returns the bare path.
$uri = $this->pathProcessor->processInbound($uri, $request);
It returns the current language according to Negotiators.
$language = $this->languageManager->getCurrentLanguage();
So the page cache becomes more flexible.
Comment #9
victor-shelepen CreditAttribution: victor-shelepen commentedIn my case this logic does not work. I do not know exactly why?
/var/www/d8/drupal/core/tests/Drupal/Tests/Core/PathProcessor/PathProcessorTest.php
Comment #10
Schnitzel CreditAttribution: Schnitzel at Amazee Labs commentedhere my first approach, it does:
- Introduce a new method reactSelectedLanguage() for language negotiation plugins to react to the selected language by another plugin
- LanguageNegotiationUrl plugin implements this method, generates the current request with language pathprefix or domain (via processOutbound()) and returns a RedirectResponse Object with a redirect tho this new url. Also forces Drupal to not cache this Redirect.
- Adapted LanguageNegotiator to be able to handle Response Objects <-- super super ugly, probably need another system to handle that, but currently it works.
missing:
- specific tests for this new method
- "If a user language is negotiated cache_context.user needs to be set" (by @fabianx)
not clear:
by @fabianx
Do you mean we should change the order of the language negotiation plugins?
Let's see what the testbot says.
Comment #12
Schnitzel CreditAttribution: Schnitzel at Amazee Labs commentedokay, new approach, the idea to pass the response through getCurrentLanguage() was a really bad idea, it is used so widely, it broke a lot of things.
The new approach has a new kernel.request subscriber that calls the negotiation plugins and if one of them returns a RedirectResponse object, this one is executed.
also found a bug that was checking the generated new URI against getBaseUrl() instead of getRequestUri, which then redirected way too much :)
still missing:
- specific tests for the whole thing.
Comment #14
Schnitzel CreditAttribution: Schnitzel at Amazee Labs commentednew try, I think the issue actually is a Core Bug, because if we create a new language with ConfigurableLanguage::createFromLangcode() and after that directly call Url::fromRoute(), this URL does not have a path prefix... which it should.
The patch fixes that, but I think this should actually be it's own issue with a test, but let's see what the testbot says.
Comment #16
znerol CreditAttribution: znerol commented@Schnitzel: Can you reproduce those test failures on your local machine? Could it be that they are due to differences between the UI and CLI test-runner?
Comment #17
victor-shelepen CreditAttribution: victor-shelepen commented@Schnitzel: Hello. Could you describe your idea? Sorry. But it is not clear for me. Thank you.
Comment #18
Schnitzel CreditAttribution: Schnitzel at Amazee Labs commentedlooks like most of the errors are based because the tests are enabling languages and then make requests to URLs, but as we have now redirects we should follow them.
Comment #20
Schnitzel CreditAttribution: Schnitzel at Amazee Labs commentedsome of the failing tests where failing because by default Drupal can have an empty path prefix for the default language (which I don't think is a good practice), but anyway, now not doing any redirects if one of the languages has an empty path prefix
Comment #22
Gábor HojtsyPlease update the issue summary with the proposed solution, it is not evident from skimming the patch or I had too little time, sorry. Would love to provide feedback on the architecture first before looking at how things are named for example :)
Comment #23
Gábor HojtsyComment #24
Schnitzel CreditAttribution: Schnitzel at Amazee Labs commentedComment #25
Schnitzel CreditAttribution: Schnitzel at Amazee Labs commentedokay, that is strange, my fix for the problems in #14 actually created more issues. removed that again and now let's see.
Comment #27
Schnitzel CreditAttribution: Schnitzel at Amazee Labs commentedlooks like there are some issues during POST and redirects.
so only redirecting on GET.
Comment #29
Schnitzel CreditAttribution: Schnitzel at Amazee Labs commentedwhoops, actually need to call the method correctly ;)
Comment #31
Schnitzel CreditAttribution: Schnitzel at Amazee Labs commentedComment #33
Schnitzel CreditAttribution: Schnitzel at Amazee Labs commentedI've sent the redirect header with an relative URL, which generated the issue that the webtest internal redirect follow handler couldn't load the redirect, even though this is RFC compatible (http://stackoverflow.com/a/25643550)
Now sending an absolute URL and should work much better.
Comment #35
BerdirFYI: There's an existing issue about this not working with page caching enabled in 7.x here: #2257843: Browser language detection is skipped for anonymous users when page caching is turned on, with no warning to the site administrator. I think we can backport the fix from #2368987: Move internal page caching to a module to avoid relying on config get on runtime to 7.x as well, please review :)
Comment #36
Gábor HojtsyComment #37
Leksat CreditAttribution: Leksat at Amazee Labs commentedI've fixed the test fails. At least the "language" group tests pass on my local.
Changes:
1. I avoided redirect if the "destination" parameter exists in the request (diff). Some tests failed because drupalLogout() sent GET request to /user/logout?destination=user/login, the LanguageNegotiationUrl tried to redirect this to /en/user/logout?destination=user/login, then RedirectResponseSubscriber::checkRedirectUrl() retargeted this to /user/login (picked up from the destination parameter). The final URL was the user page, and of course user was not logged out.
2. I fixed the LanguageListTest fail by adding a resetAll() call (diff). The URL check failed because the Drupal container (which was used inside the test) was not updated with new language configuration. It contained information about the only registered language, so the LanguageServiceProvider::isMultilingual() returned FALSE, and did not register itself for the path outbound processing. And so the URL constructed inside test was different from the actual URL.
Comment #38
Fabianx CreditAttribution: Fabianx as a volunteer commentedOverall looks great to me.
Comment #39
Wim LeersHurray, thanks so much for pushing this forward :) This is a hairy issue, and solving it will help many thousands of sites!
How does this patch ensure A, C and D?
It is clear that B is addressed.
However, I think the patch contains a lot of small problems: it seems many things only happen to work in a specific set-up. More test coverage would be welcome to prove that this is robust. Method names are also quite confusing.
Overall, looks like this needs some polish before it can be committed. It's getting there, but it's not close to RTBC yet.
Apparently the return value can also be
FALSE
, the docs don't say that.These can be chained.
Copy/pasted comment that is nonsensical here.
Needs newline in between.
Comment doesn't match what the method name says.
This will break if this is called *before*
LanguageNegotiator::initializeType()
is called. And::initializeType()
is called when you call::getCurrentLanguage()
. So, it would be safer to move the last quoted line to run first, to avoid that edge case.Does it really make sense to iterate over all the higher-priority language negotiators that weren't able to negotiate a language?
Doesn't this mean that we do the redirect in many more cases than only the browser-based negotiation? E.g. the user, session, etc. language negotiators, if they run after the URL-based negotiator, they will also redirects.
One extraneous newline.
Docs are completely wrong.
This feels like a very weird/wrong method name. Why not just
getRedirect()
?We should be getting rid of the call to the page cache kill switch!
Also: this new place for calling the page cache kill switch seems wrong: it is the *browser* language negotiation that causes this, not the URL language negotiation being asked to generate the URL for the browser language negotiation!
s/url/URL/
s/then/than/
GET or not: this belongs in a higher level.
Will this not have to be reimplemented by every single implementation of this interface method? i.e. shouldn't this live at a higher level?
\Drupal::urlGenerator
->\Drupal\Core\Routing\UrlGeneratorInterface::generateFromRoute()
s/url/URL/
Why is this necessary?
s/Drupal::url()/\Drupal\Core\Routing\UrlGeneratorInterface::generateFromRoute/
Also, which static cache is causing the need for this reset?
Comment #40
Wim LeersComment #41
Leksat CreditAttribution: Leksat at Amazee Labs commentedTo continue work on this I need to know something...
Here is the current call stack:
\Drupal\language\EventSubscriber\LanguageRequestSubscriber::onKernelRequestLanguageRedirect()
\Drupal\Core\Language\LanguageManagerInterface::getLanguageRedirect()
\Drupal\language\LanguageNegotiatorInterface::getNegotiationMethodsRedirect()
\Drupal\language\LanguageNegotiationMethodInterface::reactSelectedLanguage()
A redirect only can be performed only on some conditions like
If we put this check on the top level (as suggested in 14 and 15 of #39), there will be no need to execute further methods if $can_redirect is FALSE. However, this also means that in this case the language negotiation methods will not be available to react on the selected language. Is that okay?
If yes, can we also rename reactSelectedLanguage() to getRedirect()? I think, that well be more logical.
Comment #42
Gábor HojtsyI would be happy to somehow help reviewing this issue (or help with where redirect should happen on the call stack) but I totally fail to understand what is the proposed architecture. And some of the call stack you introduced...
Eg. this one suggests you have a language redirect subscriber that would run the negotiation before the negotiation itself happens?
I keep asking for an issue summary update to help me understand, if you know how the patch is architected, that would be a great task to start with so people can help :)
Comment #43
Leksat CreditAttribution: Leksat at Amazee Labs commentedI'm not sure if I can update the issue summary properly, but I can describe here in the comment what I have learned while working on this.
Initial problem:
Browser language negotiation always kills page cache.
Here is a behavior example for the case when a) URL negotiation is enabled and language prefixes are set up properly, b) browser negotiation is enabled and goes after the URL negotiation.
Before patch:
- "domain.com/" URL accessed
- "domain.com/" page cache is killed
- all links rendered on the page leads to language prefixed URLs
After patch:
- "domain.com/" URL accessed
- "domain.com/" page cache is killed
- user is redirected to language prefixed URL ("domain.com/en")
So, the patch does not avoid cache kill, instead it introduces language redirect. (Which in D7 was usually performed by globalredirect module.)
Current implementation:
(assuming ConfigurableLanguageManager is used)
- on request event
- onKernelRequestSetLanguage() is called
- language negotiator iterates over language negation methods
- if language was detected, iteration stops
- if not, the invoked method is stored in the unselectedNegotiationMethodes array
- onKernelRequestLanguageRedirect() is called
- language negotiator iterates over methods stored in the unselectedNegotiationMethodes array
- if reactSelectedLanguage() returned redirect response, iteration stops, and the response object is passed up by the call stack to be set as a response for the current request
The call stack from #41 seems not logical to me, because the three first method names in the call stack mean "getting redirect", while the last one means "reacting on event".
I think, to make the code clear, we need to answer a question: what we try to achieve here? Allow language negotiation methods only to redirect, or allow them to react in any way they want to?
1. Allow language negotiation methods only to redirect
This is what we currently implemented. I think, to finish this, we just need to rename the methods and update them docs.
2. Allow language negotiation methods to react in any way they want to.
I think this is not really possible because in this case we'll have to use some equivalent of D7 drupal_goto() inside the URL negotiation method. I searched through the D8 core and did not find a place where the request-response flow breaks somewhere in the middle, except for \Drupal\Core\Form\EnforcedResponseException.
Comment #44
victor-shelepen CreditAttribution: victor-shelepen commentedCould anybody look my approach that is described at the 7-th comment?
Thank you.
Comment #45
znerol CreditAttribution: znerol commented@likin: Regrettably this is not a good idea because initializing the language manager at this point in time will kill page cache performance. In addition, the redirect based approach is much better from a SEO perspective and fixes existing issues with proxy caches.
Also the session is not yet loaded at this point in time and thus session based content negotiation is not yet possible. There are ongoing efforts to push page caching even further up the stack at #2501989: [meta] Page Cache Performance.
Comment #46
Gábor Hojtsy@Leksat: re #43, the steps explained made WAY EASIER to understand what is going on. I agree there is not much of a use case for language negotiators to react with anything else but a redirect.
Comment #47
Gábor HojtsyComment #48
Gábor Hojtsy@Leksat: can you work on an updated patch?
Comment #49
victor-shelepen CreditAttribution: victor-shelepen commentedI've written the test. But the page is not redirected according to the logic above. For simple redirection it the big patch :)
Comment #53
Leksat CreditAttribution: Leksat at Amazee Labs commentedWorking on it...
Comment #54
Leksat CreditAttribution: Leksat at Amazee Labs commentedHere is the patch including fixes for #39. I did not change the newly added test.
Comment #55
Leksat CreditAttribution: Leksat at Amazee Labs commentedIf the approach from the current patch will be accepted, here are some thoughts to further improvements:
- Automatically add language prefix if second language was added. This will avoid page cache issues if URL negotiation is enabled.
- Report misconfigured language negation methods on the Status report page.
- The Session language negation method currently uses $_SESSION for storing the language. The GET query parameter is used only for switching language. Does the !empty($_SESSION) breaks page cache as it was in D7? If yes, we can also implement getLanguageRedirect() there and avoid $_SESSION usage.
Comment #56
Leksat CreditAttribution: Leksat at Amazee Labs commentedI've also tried another approach, but got no luck with it. Could someone review attached patch?
The idea:
- LanguageRequestSubscriber::onKernelRequestLanguageRedirect() constructs a URL for the current request
- it also adds the language option in it
- then it compares with the current request
- if it's different, a redirect is performed
Comment #58
victor-shelepen CreditAttribution: victor-shelepen commentedThe new patch has been done according to the comments 49, 56. New approach from #56 with the test from #49.
Comment #60
victor-shelepen CreditAttribution: victor-shelepen commentedThere is an idea to modify the UrlGenerator to reduce a number of redirections...
Comment #61
Leksat CreditAttribution: Leksat at Amazee Labs commentedA bit improved version of #58. Let's see what tests say.
Comment #63
Leksat CreditAttribution: Leksat at Amazee Labs commentedComment #65
Leksat CreditAttribution: Leksat at Amazee Labs commentedComment #66
Leksat CreditAttribution: Leksat at Amazee Labs commentedComments for #65:
I updated the new test to set "en" prefix for the English language. After that, the caching works as expected. But. If user do not know of this peculiarity, and the prefix for the default language is not set, the caching will not work.
I tried to fix this. I modified language_negotiation_url_prefixes_update() to always add prefixes, no matter if language is default or not. That worked, but too many tests rely on the fact that the default language has no path prefix. So, I think, we first need to get confirmation for this change (or resolve this issue in other way, for example, implement hook_requirements() that warns user about the cache issue if the path prefix is not set for the default language).
Also, I noticed that many tests use \Drupal\simpletest\WebTestBase::drupalGet() in a strange manner. They include the language path prefix to the $path like this:
This is not really correct because the $path is passed to \Drupal\Core\Routing\UrlGenerator::generateFromPath() and is processed by URL outbound processors. Instead, it should be done like this:
The first example code works because the default language has no path prefix (by default), so the prefix is not added to the $path, and the $path remains the same. If we configure the path prefix for the default language on any of such tests, it will fail.
Comment #67
Leksat CreditAttribution: Leksat at Amazee Labs commentedUpdated summary.
Comment #68
Leksat CreditAttribution: Leksat at Amazee Labs commentedUpdated tags.
Comment #69
Leksat CreditAttribution: Leksat at Amazee Labs commentedComment #70
victor-shelepen CreditAttribution: victor-shelepen commented@Leksat Please attach interdiffs. Yes many some tests do not aware of the language prefix. But there are not so many tests. If we choose the redirection policy. We need changes related tests.
Comment #71
victor-shelepen CreditAttribution: victor-shelepen commentedIt is not a fact that it adds a prefix.
I agree with @Leksat to add language prefixes always.
Comment #72
Gábor HojtsyIts a feature to not add prefixes at all times. People may set up a German site and not intend to make it multilingual, then having a /de prefix just in case they sometime become multilingual is a no-go. Adding the /de prefix once they become multilingual would be a SEO disaster, all links would change, there would be no redirects between them, etc. BTW you are experiencing problem C in the issue summary:
If an empty prefix both means browser negotiation and some site default language then the browser negotiation will obviously not be able to redirect to the site default language with URL prefixes.
Comment #73
Leksat CreditAttribution: Leksat at Amazee Labs commentedIf there is only one language, the \Drupal\language\LanguageServiceProvider does not register URL inbound/outbound processors. So, even if language prefix is set, it isn't used.
Comment #74
Leksat CreditAttribution: Leksat at Amazee Labs commentedBTW, #73 means that the container should be rebuilt after the amount of languages is changed to/from 1. Or, better, when \Drupal\language\LanguageServiceProvider::isMultilingual() return value changes.
Comment #75
Gábor Hojtsy@Leksat: right, so when you go multilingual all your URLs suddenly change if we enforce path prefixes then.
Comment #76
znerol CreditAttribution: znerol commented@Gábor Hojtsy: Not quite. The original URL will redirect to the page in the proper language, since language negotiation is performed on all non-prefixed URLs if I'm not completely mistaken.
Comment #77
Gábor Hojtsy@znerol: what, how? If you have only one language, so no path prefix, you always get that language. If you then have multiple languages and all have prefixes and apply browser language negotiation like this issue plans, then you may or may not get the language you got before when visiting pages without a language prefix. What am I missing then?
Comment #78
Gábor HojtsyApparently it does not seem like people care enough about this issue at this time to make it a priority to work on. Removing from sprint.
Comment #79
Gábor HojtsyOk, sat down with Leksat to review this in Barcelona:
Better doc would be IMHO: "Performs a redirect if the path changed in routing.
For example when language negotiation selected a different language for the page."
(We believe the only example is this example BTW, that is why the method is in this class. But I think for example sounds appropriate for this).
Sounds pretty generic indeed. I am not sure if we want to move this to somewhere else, because we indeed do it due to the language changes.
I've been thinking hard about what kind of other things may change the URL in the routing process, but I don't think there is anything else.
Are the query string values the only we need to carry over to options to ensure we generate the same URL?
.
Let's add comments on why we are doing this then, so people don't need to git blame for this issue :)
Test browser...
Codes of the languages added.
Would be nice to make this an array of languages instead, so we don't need to fake-create them again later. We already created them in the setup method, we can store them on the array keyed by langcode.
setUp...
Feedback messages not correct English, not in line with what is being asserted.
Lacks phpdoc.
setUpCacheMaxAge() IMHO otherwise setCache() sounds very generic. setUp... aligns with the setup method pattern.
setUpLan....
phpdoc
phpdoc
:/
Comment #82
karmazzin CreditAttribution: karmazzin as a volunteer commentedI think these changes should help to close this issue.
Comment #86
karmazzin CreditAttribution: karmazzin as a volunteer commentedPrevious patch revealed a problem in the test that came from https://www.drupal.org/node/1810394. I think it's good that we managed to catch it at an early stage.
Comment #87
Leksat CreditAttribution: Leksat at Amazee Labs commentedGreat work @karmazzin!
I did some minor changes. The patch looks good from my side.
I also created an interdiff to latest reviewed patch (#65). Can someone review the patch again?
Comment #88
Leksat CreditAttribution: Leksat at Amazee Labs commentedForm 2. of #79:
To answer these questions I'd like to prepare another patch that will do the same redirect, but in the system module, so all tests are affected. We'll see then how much of tests will fail and why.
Comment #89
Leksat CreditAttribution: Leksat at Amazee Labs commentedOkay, let's see what will happen if we perform redirect from the system module.
Comment #91
Leksat CreditAttribution: Leksat at Amazee Labs commentedSo, if we always perform this redirect, it fails some tests. My thoughts on this:
1. If tests fail in redirect-always mode, I would not recommend to commit the patch, because it may cause bugs on Drupal installations having language module enabled.
2. I would have this redirect in the system module, so it works always and all tests affected.
If I have time, I'll check test fails. Maybe it's just hidden bugs, like the one @karmazzin fixed in #86.
Comment #92
Gábor HojtsyRemoving from D8MI sprint due to lack of attention.
Comment #94
Leksat CreditAttribution: Leksat at Amazee Labs commentedHello all!
This issue never had lack of my attention. And recently I had some time during the nights to work on it, thanks to my newborn son Jan :)
So, I continued with #89 approach. I worked in a separate issue to not mess things up and to not bother people.
Feedback and reviews are welcome! #2641118: Route normalizer: Global Redirect in core
Comment #95
victor-shelepen CreditAttribution: victor-shelepen commented@Leksat, please, merge a test into your patch from #59. Thank you.
Comment #96
Leksat CreditAttribution: Leksat at Amazee Labs commented@likin, #2641118: Route normalizer: Global Redirect in core has its own tests including one for language redirect.
Comment #98
gg4 CreditAttribution: gg4 commentedComment #100
hostedpower CreditAttribution: hostedpower commentedHi,
Was this fixed or not? I'm using Drupal 8.1.8 and seem to have issues on our website: https://www.hosted-power.com/
When visiting with English preference it still shows dutch :(
Comment #101
hostedpower CreditAttribution: hostedpower commentedAny update on this? Upgraded to 8.2 but still the browser language isn't taken into account it seems :|
Comment #103
drikc CreditAttribution: drikc commentedI'm having an issue where in 8.2.x paths like /example/photos were resolving correctly without redirecting (with a language prefix).
Now, since 8.3.x, that /example/photos path render a 404. Sometime, when I clear the cache the path resolve as /en/example/photos and render correctly but after a while it doesn't and return a 404!!!
Does someone have a quick solution to redirect any non language prefixed paths to a prefixed one? Load of urls are actually rendering 404 because of this behavior change since 8.2.x to 8.3.x!!!
In this urgent situation I'm trying to redirect in checkForRedirection event as explain here https://www.drupal.org/node/2013014 but without satisfying solution...
Comment #105
smk-ka CreditAttribution: smk-ka commentedJust a heads-up, #89 breaks image style generation, since the generated route looks different from the input route, and therefore triggers a redirect. Example:
Input route:
/sites/default/files/styles/image_style/public/image.png?itok=abcdef
Generated route:
/sites/default/files/styles/image_style/public?itok=abcdef&file=image.png
.The first image derivate after a cache reset will still be generated, after that the directory exists and Apache immediately responds with a 404. I think, this will break all routes that accept arbitrary data, this one is just the most obvious.
Comment #106
BerdirThat patch there is very old and moved to #2641118: Route normalizer: Global Redirect in core as commented a bit later, and that in turn has been implemented and added to the redirect.module, which fixed that problem and many others by now. So, if you want this, just use redirect.module for now.
Comment #107
smk-ka CreditAttribution: smk-ka commentedGood to know, thanks!
Comment #110
sleitner CreditAttribution: sleitner commentedThe tests of the language module do not test what happens if "Accept-Language" is missing in the browser headers (missconfigured browsers or bots). The browser language detection skips to the default language and this is cached.
Subsequent requests with a correct "Accept-Language" header are answered with the cached default language and the browser language detection is not taken into account. This behavior is not mentioned anywhere and makes browser language detection unusable.
See #2986325: Browser language detection broken as soon as a request with no Accept-Language header happens for a test that demonstrates this.
Comment #111
valthebald(Scratching my own itch here, so addressing only C and no test for now):
What if instead of `KillSwitch` caching policy there would be a new `Vary` policy? In this case browser negotiator can call it to make response vary on accept language header.
PoC attached
Comment #112
valthebaldAdd a test, and modify PageCache behavior
Comment #113
valthebaldComment #115
sleitner CreditAttribution: sleitner commented@valthebald I tested your #112 patch and this one works a little bit better than master for my case with an empty "Accept-Language" header:
the previous language is delivered, but it should be the selected default language.
See test attached: LanguageBrowserDetectionEmptyTest.txt
Comment #116
valthebald@sleitner I'd argue that the current behavior is correct - language negotiators are chain-processed by `LanguageNegotiator` class, so if browser detection doesn't return any language, the next negotiator should be asked.
In the attached patch I've fixed failing tests - they fail because PageCache treats Vary: Cookie in a distinct way.
Comment #117
valthebaldComment #118
valthebaldComment #120
valthebaldFixed CS and added test-only patch
Comment #123
valthebald@sleitner please ignore my previous comment - I got your point about reacting to empty Accept-Language header, and added your test (with slightly different name, because it tests different combinations of Accept-Language, not only empty one).
I've addressed this case by unconditionally adding 'Vary: Accept-Language' in `LanguageNegotiationBrowser::getLangcode()`
Also, I've added cache header assertion to the second call with French browser
Comment #124
valthebaldComment #126
valthebaldBack to needs review because the last patch was expected to fail
Comment #127
sleitner CreditAttribution: sleitner commented@valthebald your patch in #123 works with phpunit and in normal browser as expected
Comment #128
sleitner CreditAttribution: sleitner commentedComment #129
Lukas von BlarerI guess we'll need the test to turn green before this can be RTBC.
Does anyone use this patch in production?
Comment #131
Lukas von BlarerThis patch need a re-roll.
Comment #132
andypostComment #133
jofitz CreditAttribution: jofitz at jofitz commentedRe-rolled and fixed a couple of coding standards errors.
Comment #135
valthebald@Lukas von Blarer: we use the patch from #123 on production (with core 8.6.2) - works nice with page cache module and varnish
Comment #136
catchI don't think we can use Vary: here, and have pushed back against it being used in other places in core.
The Accept Language: header is extremely flexible as to what it contains:
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Language
This means that what ends up getting resolved to 'French' in the response, could be dozens of distinct headers sent by the browser.
Here's a fastly blog on having to normalize the Accept-Encoding header because they found 44(!) different variations for something which can only result in two different responses: https://www.fastly.com/blog/best-practices-using-vary-header
Similarly Varnish recommends normalizing Accept-Encoding:
https://varnish-cache.org/docs/3.0/tutorial/vary.html
And Accept-Encoding only has two real consequential values as such, language is actually designed to be flexible.
For example, on my own machine:
Safari:
Accept-Language: en-gb
Chrome:
Accept-Language: en-US,en;q=0.9,pt;q=0.8
These are semantically completely different even though on 99.99% of sites they'll result in the same response. Drupal does support differentiating between en-US and en-gb, but without actually running language negotiation there's no way to know whether it does or not, by which point the page cache response is doing too much work.
On top of this a lot of example varnish vcls disable the vary header because different frameworks set things like User-Agent, so if we start to rely on it, people may get wrongly cached items.
Conversely, with the patch as it stands, it could result in hundreds or thousands of cache entries per page - worse than not being cached.
The older approaches on here which do a redirect (to a page which might be page cached) give us something predictable, although not sure that's better than the status quo.
Comment #137
valthebald@catch: I don't see how normalizing of Accept-Language contradicts with having Vary: in backend response.
1. If varnish (or any other reverse proxy) reduces number of possible header values, good - less variations in page cache. If not - page cache behaves at least not worse than with the current kill switch behavior.
2. Cache guide (https://www.drupal.org/docs/8/api/cache-api/cache-contexts) says very clearly that
3. Core already contains HeadersCacheContext context, and the only (known to me) way to tell reverse proxies that response is varied by the value of (some) header is to use Vary header.
In other words, we have fully HTTP compliant mechanism of page cache context. Why exactly not use it?
Comment #138
sleitner CreditAttribution: sleitner commented@catch The status quo browser language detection completely does not work without patches ( #2430335-110: Browser language detection is not cache aware )
As @valthebald wrote in the issue description, the redirect to the language URL will be handled by #2641118: Route normalizer: Global Redirect in core and this works with #2430335-133: Browser language detection is not cache aware . The cache size is smaller, because only the redirect is saved, not the complete page HTML.
On my test server the Vary approach is faster (35-40 ms; compared to kill_switch: 60-80ms).
In most cases only one page will be affected by the huge browser language detection cache: the front page / .
We could add an option to switch between
on /admin/config/regional/language/detection/browser
Comment #139
BerdirI still think that the correct approach is to enforce a single canonical URL which is what happens with browser negotation in combination with the redirect module, see also #106.
Comment #140
valthebald@Berdir: totally agree. But this is a different issue - RedirectResponse is cacheable, and without fixing this issue global redirect in core will return wrong result. I.e. in our live set up we have:
Url language negotiation comes before browser negotiation.
When user goes to `/node/`:
- LanguageNegotiationBrowser returns preferred user's language, say 'fr', and adds Vary:Accept-Language.
- Redirect module returns RedirectResponse to `/fr/alias-francaise` which is canonical URL for the french translation of the node. RedirectResponse is cacheable, and its metadata has 'header' context)
When user goes to `/fr/alias-francaise`:
- LanguageNegotiationUrl returns fr before browser detection.
- Response cached by page cache does not have 'header' context, and doesn't add 'Vary' header
Comment #141
catchIf they normalize it's fine. If they don't it's potentially a very low cache hit rate which can be worse than an uncacheable page, or conversely some online examples ignore the Vary header altogether. The page cache itself doesn't do any such normalisation.
Being analogous doesn't mean that actually using the Vary header is recommended, as above even Accept-Encoding is not sent consistently by browsers.
If you use HeadersCacheContext without specifying a particular header, it'll vary by the cookie header, meaning caching per session.
Assuming that the redirect is implemented (which I agree is what we should do overall), it comes down to the difference between a poorly cacheable RedirectResponse and an uncacheable one (or if we implemented some header normalisation in Drupal, a slightly slower but more cacheable RedirectResponse). Has anyone done a timing comparison for that?
Comment #142
valthebaldI had similar results as @sleitner in #138 - cached redirect response is 2 to 3 times faster than kill_switch. Results may vary (pardon the pun) of course, but the boost is caused in skipped DrupalKernel and language negotiation chain loading.
Regarding normalization of Accept-Language - it would be possible to perform inside Drupal (new middleware class in language module, that runs before PageCache in the stack?). Tricky part would be to make it as light as possible, and probably optional (via settings.php)
Comment #143
valthebaldBrowser language negotiation is first, but not the only one piece of functionality that could win from the new Vary cache policy. Another example from recent real-life project:
We have a block plugin that should show different content depending on visitor's country. On production, we have Cloudflare CDN that sends `CF-IPCountry` header, and in other environments custom geo-ip based service that emulates Cloudflare behavior.
Without vary cache policy, only implementation that we could come with, was using placeholders and AJAX loading of those blocks. With vary policy, solution could be more elegant and not require AJAX calls (or BigPipe)
I wonder would it be better to spin off creation of Vary cache policy to separate issue?
Comment #144
BerdirRelated problem that some of you might have been struggling with when it didn't seem to work at all: #3024535: LanguageNegotiationBrowser needs to trigger the page cache kill switch even when there is no Accept-Language header
Comment #145
sleitner CreditAttribution: sleitner commented@Berdir : I already mentioned this problem six months ago in comment #110 #2986325: Browser language detection broken as soon as a request with no Accept-Language header happens with the same solution. I also built a custom module for one of my projects, because I do not want to patch against core forever.
Comment #147
Lukas von BlarerDoes anyone have a working patch for 8.6.x? The last few patches don't apply anymore...
Comment #148
markdc@Lukas von Blarer: I'm still using #133 on the latest 8.6.13.
Comment #149
stBorchertAttached you'll find a re-roll for 8.7.x in case you need this after updating to the new release.
Comment #151
Lukas von BlarerThe patch in #149 doesn't fix the issue anymore. The previous one was from #123 and that worked.
Comment #152
aschiwi CreditAttribution: aschiwi at undpaul commentedBtw, using the patch in #149 broke the browser language functionality for us. Removing the patch seems to work fine with just default Drupal settings.
Comment #153
Lukas von Blarer@aschiwi no, without the patch this issue persists.
Comment #154
clairedesbois@gmail.comThe patch #149 works correctly for us.
Comment #155
clairedesbois@gmail.comIn fact, we had failed our tests and the patch #149 doesn't pass. And in fact, remove patch for this issue work perfectly. We are on Drupal 8.7.4.
Comment #156
clairedesbois@gmail.comComment #157
Lukas von Blarer@aschiwi @Calystod Do you have page cache enabled?
Comment #158
aschiwi CreditAttribution: aschiwi at undpaul commentedWeird! Not sure why but with the patch we had a huge problem: the browser language functionality did not work at all. When we removed the patch, it started working again.
We do have Page cache enabled and it seems to continue to work. I can't publicly share the URL but pm me and I will share.
It took us a while to find out why browser language detection did not work anymore so once we found out that removing the patch solves the problem, we stopped looking.
Comment #159
Lukas von Blarer@aschiwi Ok, i'll text on my project next week. Thank you!
Comment #160
Lukas von BlarerI can confirm that this issue is fixed on 8.7.5 without this patch. I am about to test in production with varnish, but I guess it should be fine.
Can we close this issue or are there some improvements that still should go in? Currently there is no
Vary
header for example.Edit: I didn't mean to upload that screenshot. You can ignore that.
Comment #161
Driskell CreditAttribution: Driskell commentedUsing 8.7.1 and varnish cache the browser language detection is working - but it kills the cache. All content pages trigger UNCACHEABLE.
Do you mean to say 8.7.5 potentially fixes it?
Comment #162
bakulahluwaliaI have one site which has both varnish and cloudflare enabled. Settings for detection is:
1) URL
2) Browser lang
3) Default
Website is showing unexpected behavior. varnish and cloudflare both are not respecting the Browser lang at all. Drupal Version: 8.7.7
Details about the unexpected behavior:
Case1:
1) First visit site in russian language (Varnish will cache the page). Make sure it's homepage followed by the prefix /ru
2) Now, In another browser open the homepage (default should be english), make sure it is via varnish. it should not be a miss.
Result: the homepage will appear in russian instead of english(en is default in this case)
Case2:
1) Change browser setting to russian default lang. Visit the site. (Varnish will cache the data)
2) Open the site in different browser, english default lang. It will be served by varnish with russian content.
I have not tried the patch yet but I believe this issue can be fixed on varnish and cloudflare level itself using configuration.
I am just curious to see it working in action as it's core feature on our D8 and it doesn't work with varnish and cloudflare properly.
Comment #164
andypostreroll for 9.1
Comment #165
andypostFix one test and a bit of clean-up
Comment #167
Falco010@andypost I tested your patch, however it does not seem to fix it for us:
- Visit homepage in English browser
- Page is cached in English
- Visit website in Dutch browser
- Now all users get this English home page, no matter what language their browser is
We are considering to remove browser detection language and switching to Varnish Geoip.
Comment #168
nikitagupta CreditAttribution: nikitagupta at Srijan | A Material+ Company for Drupal India Association commentedComment #169
nikitagupta CreditAttribution: nikitagupta at Srijan | A Material+ Company for Drupal India Association commentedComment #170
shobhit_juyal CreditAttribution: shobhit_juyal as a volunteer and at Srijan | A Material+ Company commentedHi, andypost
Due to some issue the patch #165 was not applying into 9.1.
I recreated it, but as per @Falco010 comment, it still the cache still persists.
The below piece of patch was creating issue to apply in #165:
Comment #171
shobhit_juyal CreditAttribution: shobhit_juyal as a volunteer and at Srijan | A Material+ Company commentedComment #172
Gábor HojtsyUse event-appropriate tag. (Sorry for the noise).
Comment #173
Lukas von BlarerHere is a reroll against 8.9.1
Comment #174
bserem CreditAttribution: bserem at zehnplus commentedPatch #173 applies nicely with composer, even though it fails to apply on the drupal.org test system.
It looks ok on our test environment, will do more test and provide even more feedback next week.
Comment #175
andypostIt needs work because comment 167 and tests should pass
Comment #176
jasonawantWhen evaluating the patch from 165, I can confirm results described in 167.
And, just want to share an alternate setup suggested by Berdir.
Instead of using this patch, I can use the Redirect's modules "Enforce clean and canonical URLs." option, which is described as
With the following configuration
products the following results
I did not evaluate the patches from #2641118: Route normalizer: Global Redirect in core.
Comment #177
jozzy_a CreditAttribution: jozzy_a commented#173 failed to apply against 8.9.2
Comment #179
dorficus CreditAttribution: dorficus at Genuine commentedRerolled the 8.9 patch in #173 to apply.DO NOT TRY THIS PATCH
Comment #180
dorficus CreditAttribution: dorficus at Genuine commentedI rerolled the patch and it applied to 8.9.6, however I'm now getting an error:
I've checked, and the class is there and it's implementing the interface, however I can't figure out what is generating this error. Can anyone else test this patch with 8.9.7 and tell me if they get the same result? Or, does anyone know a workaround?
Comment #181
anmolgoyal74 CreditAttribution: anmolgoyal74 at OpenSense Labs for DrupalFit commentedRe-rolled #165 for 9.2.x.
Comment #182
dorficus CreditAttribution: dorficus at Genuine commentedSorry, my previous patch was missing things that I should have noticed ahead of time. Apologies for the previous 2 comments.
Comment #185
dorficus CreditAttribution: dorficus at Genuine commentedI'm not sure about the failing tests, but I am getting an error where it appears that the service is not available outside of the classes which seems to stem from the
part of the patch. I'm going to remove the
public: false
line in the patch and resubmit it.Comment #186
dorficus CreditAttribution: dorficus at Genuine commentedI should ask is there any reason for the service to be private?
Comment #190
smustgrave CreditAttribution: smustgrave at Mobomo commentedRerolled for 9.5
Comment #191
smustgrave CreditAttribution: smustgrave at Mobomo commentedHaven't figured out the issue but during the test
foreach ($vary_headers as $vary_header) {
Never seems to get called as$vary_headers
appears empty.Comment #193
aludescher CreditAttribution: aludescher at drunomics for Porsche Informatik Gesellschaft m.b.H. commentedUp to patch #165 the page_cache_kill_switch trigger was removed.
Starting with #173 the kill switch trigger is not removed i.e. it is still called, even though page_cache_vary is used (with the exception of #181).
Here's a rewrite of #190 without calling the page_cache_kill_switch trigger.
Comment #194
smustgrave CreditAttribution: smustgrave at Mobomo commented@aludescher thanks for the patch but please also upload an interdiff so people can see the changes you made.
Fixed the CI issue.
Comment #197
handkerchiefAny news on that?
Comment #198
Vadym.Tseiko CreditAttribution: Vadym.Tseiko commentedHi everyone, I have a similar problem, my project is configured via detection and selection URL by domains method and I need to switch language to the correct domain accordingly to the user's browser preferences. But in my case, when using detection and selection via browser I'm constantly switched to my preferred language without the ability to switch language only once(what I'm achieved with custom event subscriber and redirect). I added a custom code that redirects user but now I'm having a problem with SEO as Google search bots is redirected too, and all machine services return 302 status code besides the basic English one as those bots I guess have English as a preference in their requests. Is there a way for redirect to not mess with bots or to detect them and not redirect if that not human-users, none of those patches fixed behavior when the user is constantly redirected and none of them set the right domain in the URL as a result I getting always URL from with browser switch to the preferred language? Has anyone solved this problem somehow?
Comment #201
MatthijsSince the cache ID is actually cached, the
$vary_headers
passed to\Drupal\page_cache\StackMiddleware\PageCache::getCacheId()
actually did nothing.I created a merge request from #2430335-194: Browser language detection is not cache aware and did some changes to cache the cache ID by
$vary_headers
.Comment #202
sleitner CreditAttribution: sleitner commentedI fixed the visibility of the modules variable of the test.
Drupal\Tests\language\Functional\LanguageBrowserDetectionAcceptLanguageTest
still fails.Comment #203
sleitner CreditAttribution: sleitner commentedX-Drupal-Cache
is now always present. It isMISS
for the first access andHIT
for further accesses.Comment #204
smustgrave CreditAttribution: smustgrave commentedLeft some comments on MR
I don't know if adding a CR from D9 is correct anymore but will leave as is.
Comment #205
sleitner CreditAttribution: sleitner commentedComment #206
andypostadded 2 questions to MR
Comment #207
sleitner CreditAttribution: sleitner commentedI don't know why this is not a local variable. But since it only reads the request, a clone is not necessary.
Comment #208
smustgrave CreditAttribution: smustgrave commentedBelieve all feedback has been addressed around MR but don't really follow the issue summary, specifically the proposed solution section. Will leave in review if anyone else wants to pick up.
Comment #209
znerol CreditAttribution: znerol commentedThis includes #3023104. As one of the authors of the cache policy mechanism, I'd be disappointed if this approach would make it into core. Citing from #3023104-4: Introduce "Vary" page cache response policy:
Comment #210
znerol CreditAttribution: znerol commentedI suggest to move the logic into a response event subscriber instead.