Problem/Motivation
I've got a live Drupal 8 site that has certain pages that 404 when accessed. The page is still accessible to edit, provided you can track down the Node ID to write the path. Clearing the caches on the site solves the problem temporarily. I've had several instances where after clearing the cache and making sure the page is accessible, it becomes inaccessible again the next day. This is unsettling as those path caches are not set to expire ever, so I do not know what is acting on them.
Since clearing caches works, I tracked down the cached path information in the cache_data table. Broken paths are missing a significant amount of data. For example, here is the cached data on a path that is not working (cid: route:/patients/prevention/plant-products/cancer-prevention-sulforaphane:) :
a:3:{s:4:"path";s:66:"/patients/prevention/plant-products/cancer-prevention-sulforaphane";s:5:"query";a:0:{}s:6:"routes";O:41:"Symfony\Component\Routing\RouteCollection":2:{s:49:" Symfony\Component\Routing\RouteCollection routes";a:0:{}s:52:" Symfony\Component\Routing\RouteCollection resources";a:0:{}}}
This is considerably emptier than the same path cache after a cache clear and re-access:
a:3:{s:4:"path";s:10:"/node/4054";s:5:"query";a:0:{}s:6:"routes";O:41:"Symfony\Component\Routing\RouteCollection":2:{s:49:" Symfony\Component\Routing\RouteCollection routes";a:1:{s:21:"entity.node.canonical";C:31:"Symfony\Component\Routing\Route":1327:{a:9:{s:4:"path";s:12:"/node/{node}";s:4:"host";s:0:"";s:8:"defaults";a:2:{s:11:"_controller";s:48:"\Drupal\node\Controller\NodeViewController::view";s:15:"_title_callback";s:49:"\Drupal\node\Controller\NodeViewController::title";}s:12:"requirements";a:3:{s:4:"node";s:3:"\d+";s:14:"_entity_access";s:9:"node.view";s:7:"_method";s:8:"GET|POST";}s:7:"options";a:5:{s:14:"compiler_class";s:34:"\Drupal\Core\Routing\RouteCompiler";s:10:"parameters";a:1:{s:4:"node";a:2:{s:4:"type";s:11:"entity:node";s:9:"converter";s:21:"paramconverter.entity";}}s:14:"_route_filters";a:1:{i:0;s:27:"content_type_header_matcher";}s:16:"_route_enhancers";a:1:{i:0;s:31:"route_enhancer.param_conversion";}s:14:"_access_checks";a:1:{i:0;s:19:"access_check.entity";}}s:7:"schemes";a:0:{}s:7:"methods";a:2:{i:0;s:3:"GET";i:1;s:4:"POST";}s:9:"condition";s:0:"";s:8:"compiled";C:33:"Drupal\Core\Routing\CompiledRoute":429:{a:11:{s:4:"vars";a:1:{i:0;s:4:"node";}s:11:"path_prefix";s:5:"/node";s:10:"path_regex";s:24:"#^/node/(?P<node>\d+)$#s";s:11:"path_tokens";a:2:{i:0;a:4:{i:0;s:8:"variable";i:1;s:1:"/";i:2;s:3:"\d+";i:3;s:4:"node";}i:1;a:2:{i:0;s:4:"text";i:1;s:5:"/node";}}s:9:"path_vars";a:1:{i:0;s:4:"node";}s:10:"host_regex";N;s:11:"host_tokens";a:0:{}s:9:"host_vars";a:0:{}s:3:"fit";i:2;s:14:"patternOutline";s:7:"/node/%";s:8:"numParts";i:2;}}}}}s:52:" Symfony\Component\Routing\RouteCollection resources";a:0:{}}}
Running Drupal 8.1.10 with several languages enabled
Note that I initially started to investigate this issue on #2728725: Special characters like tab or spaces in pattern can break alias generation but I do not think it is related as that issue is not partially resolved with cache clears and also does not involve Pathauto.
Steps to Reproduce
- Spin up an instance of multilingual_demo distribution on simplytest.me (http://ply.st/multilingual_demo). Proceed through the install like normal; you should not need to change any defaults.
- We'll use the 'About Us' page as the target. Edit that page (/node/9/edit) and give it a new alias under URL Path Settings. In this example I have set the new alias to '/about-us'. It just needs to be different from the aliases given to the other translations.
- Add a link field to the basic page content type (/admin/structure/types/manage/page/fields/add-field). Do NOT allow this field to be translated, and make sure it allows internal links. Also move the link field out of 'Disabled' on the full content display (/admin/structure/types/manage/page/display/full); the link field needs to be rendered for this to happen.
- Add a basic page (/node/add/page). Content does not matter, but on the newly created Link field add in the path in the URL box ('/about-us'). Do not use the autocomplete.
- Make a translation of that new page. I'll go ahead and make a Spanish one (/es/node/10/translations/add/en/es). No changes need to be made; you can just save it
- Clear the site caches on the Performance page (/admin/config/development/performance). The new route issue happens when the cache is created, so we can't have the existing cache standing in the way.
- Visit the translation of the new page (es/node/10). This renders the link field which breaks the route cache of the page it links.
- Attempt to access the English About Us page (/about-us). It will now 404 until the next cache clear.
Comments
Comment #2
BerdirI do think it is the same issue, people now started to claim that cache clear helps them there. And if not, the other issue is actually multiple issues.
But thanks, this is a lot of useful info.
Pathauto uses core API's though to save aliases, and we're not storing anything in cache ourself, afaik that only happens when aliased pages are actually accessed. Could be some kind of race condition triggered by our code, though.
Comment #3
dhansen CreditAttribution: dhansen as a volunteer commentedI tracked down that cache generation to RouteProvider::getRouteCollectionForRequest. It seems the processInbound on the PathProcessManager is returning no updates to the path when the cache is being set, which means AliasManager's getPathByAlias is returning nothing when all this happens (it's being called by PathProcessAlias).
I think that means this points to some kind of race condition as @Berdir described, where the cache is being created but the path is not yet set.
It's still very strange. It continues to happen specifically and consistently on specific nodes of the site; a race condition IMO is generally less consistent than this.
Anyone have any ideas on how to make sure the alias is in place before the cache for the route is built?
Comment #4
dhansen CreditAttribution: dhansen as a volunteer commentedDid some updates and I'm continuing to see this issue with:
Comment #5
dhansen CreditAttribution: dhansen as a volunteer commentedI hacked some core to output information on what is happening in the RouteProvider. I've learned the following:
This is looking less and less like a Pathauto issue and more like a core issue.
Comment #6
willwh CreditAttribution: willwh commentedHi dhansen!
We are also running in to this, thanks very much for reporting this!
Could you possibly provide a patch for your core changes? I'd love to help with this issue, as it is definitely Major, possibly arguably a critical bug.
Thanks!
Comment #7
dhansen CreditAttribution: dhansen as a volunteer commentedBig breakthrough: I can reproduce the issue! Lots of ticket updates including updating the issue name and adding reproduction steps.
The issue occurs without the help of Pathauto, so I'm moving this to the Core queue. This is definitely related to the Link field. Pathauto just seemed to be a component because I was having it generated different aliases based on different languages, which seems to be a component of this breaking down.
I should note also that this cache is NOT being updated spontaneously. We're just flushing cache so often in dealing with it that it seemed to happen at odd hours.
@willwh I'm hesitant to create a patch because I don't want anyone to think it is a resolution. Also, most of my code was just an array defining the corrected paths for pages I knew were breaking. If you can identify any pages that break on the regular on your site, you can do something similar to this after line 157 in core/lib/Drupal/Core/Routing/RouteProvider.php:
Comment #8
willwh CreditAttribution: willwh at North Studio commentedAbsolutely, thanks for posting more info. I will be spending some time on this today.
Comment #9
azinck CreditAttribution: azinck commentedWe're also seeing this; it's very disruptive. Definitely qualifies as "major".
Thank you for posting, dhansen, as this has been almost impossible for us to reproduce outside of our production environment.
Comment #10
dawehnerThis is really a "interesting" issue.
Comment #11
dawehner@chx and @dawehner looked at this issue and we have some assumption that the PathValidator calls out to the inbound path processor while the route provider does as well.
We have both no idea whether this could fix the issue, but it has the potential to do so.
@willwh @azinck @dhansen
It would be great if you could try out the patch with your setup.
Comment #12
dawehnerThis really sounds borderline critical
Comment #13
willwh CreditAttribution: willwh at North Studio commented@dawehner & @chx, thanks guys!
Honestly, I'm still try to reproduce this locally :)
Based on the awesome digging by @dhansen, I thought I'd be able to do so.
i.e. on some pages we have untranslatable link fields used in views... although, despite my many attempts yesterday, I still can't seem to reproduce this in my local environment.
My thinking is roughly; with all types of caching enabled;
drush cr
, view various pages in each language, then view the pages I would expect to display the above behaviour (they do in our prod environment) in multiple languages (We're using English as a default with 'jp' and 'zh-hans' also enabled).@dhansen, if you have any further tips as to how you're reproducing this, I'm all ears... I've had no luck so far.
Comment #14
azinck CreditAttribution: azinck commentedWe've deployed this patch to production and will report back if we see the problem recur. Thanks for your efforts @dawehner and @chx.
Comment #15
BerdirRe-uploading the patch, lets see if testbot likes me more than @dawehner
Comment #16
dhansen CreditAttribution: dhansen as a volunteer commented@dawehner and @chx: Just tried the patch on my local dev (skipping the test changes, so just tweaking those two lines in the PathValidator). It has no effect on the issue; pages still 404.
@dawehner The only reason I left it as Major instead of Critical is because it represents no data loss. I was on the fence as well, so if you see a good reason to bump it up then we certainly can.
@willwh I have NOT tried it with the link field rendered in a view, which may be a key difference but I am not sure. There are some key things that seem to contribute to the issue on my end:
I'll help out more if I can. I will say that the best work-around currently is just to not use internal paths on the link field and only use the entity reference. That automatically translates the path with no issues.
Comment #17
azinck CreditAttribution: azinck commentedI hadn't noticed the steps that dhansen posted to the summary about how to reproduce the problem. I'll test it in our environment later today.
Comment #18
BerdirComment #19
willwh CreditAttribution: willwh at North Studio commentedMuch like azinck, I did not see those reproduction steps. We're taking a hop through this, this morning, thanks very much for the additional info folks!
Comment #20
rjay CreditAttribution: rjay at North Studio commentedI was able to reproduce this issue consistenly in a local development environment by following the instructions that bhansen provided, so I started stepping through the code that renders the link field with a debugger.
During the rendering of the link field, getPathAttributes() calls the inbound path processor, which in turn calls PathProcessorAlias::processInbound(). The processInbound method attempts to load the path from the alias that is provided:
The getPathByAlias() method takes a language code as an optional second parameter, which is not being passed in this case. When a language code is not passed in to getPathByAlias(), it falls back to using the current language to try to load a path.
Here is where I believe the problem starts to occur. If you follow bhansen's steps to reproduce, the alias that is entered in the link field is in the default language (eg. English). When you do a cache rebuild followed by viewing the translation of the page that has the link field on it, getPathByAlias() attempts to fetch the path that the alias points to using the current language (eg. Spanish).
In my case, the page that is being linked to only has a translation in the default language, and does not have a translation in any other languages. Ultimately getPathByAlias() is unable to fetch the path that the alias points to in the current language, so $path remains set to the alias instead.
Then getPathAttributes() tries to fetch a route that matches $path, but that fails because $path is still an alias. However a record still gets added to cache_data for the alias with an empty RouteCollection.
So first of all, it seems like it would be sensible to not cache a path if we are unable to find a matching route for it.
Second, anyone have any thoughts on how should we deal with making sure getPathByAlias() attempts to fetch a path in the correct language? We could make PathProcessorAlias::processInbound() pass a language code to getPathByAlias(), which would deal with this specific situation. Or we could make getPathByAlias() attempt to get the language code from the alias, before it falls back to using the current language.
Comment #21
rjay CreditAttribution: rjay at North Studio commentedThis patch makes getPathByAlias attempt to determine the $langcode from the $alias first, rather than automatically defaulting to the current language.
Comment #23
rjay CreditAttribution: rjay at North Studio commentedSo those test failures indicate that modifying getPathByAlias() was not the right approach here.
This patch modifies the inbound path alias processor and passes a language code to getPathByAlias(). Let's see if Testbot likes this one better.
Comment #25
rjay CreditAttribution: rjay at North Studio commentedI updated the returnValueMap for getPathByAlias() in the path processor tests. This was necessary because this patch makes the path processor pass a language code to getPathByAlias() now, instead of NULL.
Comment #26
rjay CreditAttribution: rjay at North Studio commentedSo all of the existing unit tests are passing now, at least.
The URI for the untranslatable link field is now being rendered as
/<langcode>/node/<nid>
, instead of the alias that was entered in the link field, when viewing the page that the link field appears on in any language other than the site's default language.Comment #27
catchI think this comes under:
Also nearly:
If you get 404s on lots of pages the site is unusable. You can't flush caches on a site 24 hours a day as a workaround (well you could but it's not like an admin page being inaccessible).
So bumping to critical. Nice find!
Comment #28
rjay CreditAttribution: rjay at North Studio commentedThe issue that I mentioned in #26 that is causing the URI to be rendered as
/<langcode>/node/<nid>
seems to be due to the fact that$url->options
does not contain a language.When no language is explicitly set on the $url, the outbound path processors (specifically, the outbound path alias processor) default to the current language.
I've updated the patch in #26 so it now explicitly sets the language option on the $url when rendering the link field.
Comment #30
rjay CreditAttribution: rjay at North Studio commentedReroll of #28, but this time only explicitly setting the language on the $url for untranslatable links.
Comment #31
dawehnerCould this code also just call out to the language manager and ask for the current language?
Comment #32
rjay CreditAttribution: rjay at North Studio commented@dawehner: The getPathByAlias() method in the alias manager is already calling out to the language manager to get the current language when a language is not passed in to it, and when I was debugging this issue that was what triggered the 404 errors.
When an untranslatable link field is being rendered on a page where current language != default language, then $request->getLocale() will be different than the current language. That said if there is another, better way of getting the language of the link field than $request->getLocale(), let me know and I'll update my patch :)
Comment #33
rjay CreditAttribution: rjay at North Studio commentedUpon further testing I was still seeing links in the format
/<langcode>/node/<nid>
with the patch in #30.So this patch gets the language from the $item for untranslatable links when rendering a link field, instead of using the language that is passed to viewElements(). This is for the same reason that I mentioned in #32.
Comment #34
dawehnerGiven we have 404s I believe some actual higher level regression test would be really cool.
Comment #35
dawehnerLet's be honest here :)
Comment #36
Wim LeersThis screams
.Comment #37
_Archy_ CreditAttribution: _Archy_ as a volunteer and commentedHi everyone! I investigated the issue without continuing the findings of @rjay and tracing the problem I ended up in the same place.
The main problem here is that routes do actually get cached on language context, while route collections for a request do not necessarily (might be wrong about this). For example with the provided reproduction the /about-us alias is not getting converted to the path because we are searching for it in the wrong language context. So we get it back as is, then when we try to match a route for it we get 0 results and cache it. Then when we navigate to the path we get this cache with 0 routes.
I've also taught about caching the route collection matched for a request in context of language, but extracting the language from the request is messy before the PathProcessorLanguage has ran (the cache id needs to be generated before this, see: RouteProvider::getRouteCollectionForRequest). This solution might break other stuff.
I propose the following solution, which I think it's cleaner than @rjay s: in AliasManager::getPathByAlias in case we don't find a match in the current language we should also check for one in the default language, and if found put it in the current language cache as well.
Comment #38
_Archy_ CreditAttribution: _Archy_ as a volunteer and commentedComment #40
_Archy_ CreditAttribution: _Archy_ as a volunteer and at PitechPlus commentedForgot to check if the website is multilingual.
Comment #41
_Archy_ CreditAttribution: _Archy_ as a volunteer and commentedForgot to mention that the patch in #40 renders the link as
/<langcode>/<alias>
. I guess this would be the right/expected behaviour, correct me if I am wrong, but if not it also without effort eliminates the issue with the other patch:Comment #42
arunkumarkThis patch seems working fine. It handles 404 issue on Multilingual site.
Comment #43
heddnStill needs tests.
BTW, #33 definitely does not fix the issue. It breaks the site because there is no
->get()
method available from BaseFieldDefinition. I discovered this on a multi-lingual with that version of the patch on site when trying to view redirects at admin/config/search/redirect.Comment #44
xjmThe committers discussed this issue and agreed that it is a critical bug, given the unpredictability, disruption for regular site visitors, and the lack of a feasible workaround.
Comment #45
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedCreated a test for this issue and added it to my patch. This is my first ever test so it might not be the best. It seems to me that it is bound to my solution, so someone should take a closer look at the test code.
Uploading test patch to fail and the fix with test patch to pass.
NOTE: I have added the following to the testGetPathByAliasNoMatch() so it doesn't throw exception (getId() on NULL..) when ran with my solution:
Comment #47
drupalninja99 CreditAttribution: drupalninja99 at Mediacurrent commentedWe have 2 multilingual D8 sites with similar symptoms, that is random 404s for pages that have aliases.
Comment #48
_Archy_ CreditAttribution: _Archy_ at PitechPlus commented@drupalninja99 could you try and see if the #45 patch with the fix and the test eliminates the problem.
Bringing to top the test patch and fix+test patch.
Comment #49
rcodina CreditAttribution: rcodina commented@_Archy_ I've tested your patch and it applies cleanly. However, it doesn't solve the problem I experience with URL alias in Spanish. See related issue #2728725: Special characters like tab or spaces in pattern can break alias generation.
Comment #50
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedHello @rcodina. Did you clear the cache after applying the patch? If so and it still doesn't fix the problem: could you give me a step to reproduce the problem, or maybe just point me to one?
I have a hunch that it is caused because of the default language other than english and/or probably that it is the only one. Could you also check if you add back english it happens? If yes: then when you set it default does it?
Comment #51
rjay CreditAttribution: rjay at North Studio commented@_Archy_: I tested your patch in #45 and it does prevent 404 errors from occuring for me when I follow the Steps to Reproduce in the description of this issue.
However, as you pointed out in #41, your patch causes the URI to be rendered as
/<langcode>/<alias>
. That doesn't seem like the expected behaviour for an untranslatable Link field, in my opinion.If a user enters an
/<alias>
in an untranslatable Link field, I would expect that when the Link field is rendered the alias is output exactly as it was entered in the field, regardless of what the current language is. And that is the behaviour of an untranslatable Link field without the patch in #45 applied.So the patch in #45 alters how the URL of an untranslatable Link field is constructed when it shouldn't. Inside the getPathByAlias() method there doesn't seem to be anything we can use as context to determine if we are fetching an alias for an untranslatable Link field, which is what we would need in order to handle that specific case appropriately.
That's the conclusion I came to in #23, and it's why I abandoned the approach of modifying getPathByAlias() in favour of attempting another, admittedly less elegant, possible solution.
I've also confirmed the issue with the URL Redirect listing page that @heddn pointed out in #43, so my patch in #33 does indeed still need work if we were to return to that approach.
Comment #52
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedI just wanted to take a look at alternative ways to handle this, but can't reproduce the issue. Neither from the branch I was working from for this issue (pull--rebased), neither by checking out a new branch that tracks 8.2.x. Maybe something changed that alters the way links are rendered or I am doing something wrong, tough I've re-read the steps and re-tried it 3 times.
Can someone check if it still happens with the latest dev?
EDIT: It renders the link as /about-us on the translated content (to ro) and when I access it works (takes to en).
Comment #53
rcodina CreditAttribution: rcodina commentedYes, I cleared the cache. I can't give you steps to reproduce the problem. It's weird but It happens at some point. You can check out my comments on issue #2728725: Special characters like tab or spaces in pattern can break alias generation to find more details.
I added back English and it doesn't work too. Then I set English default again and then both nodes failed (manual URL alias and auto alias). Then I saved again both nodes and then the problem was the same as usual (the auto alias doesn't work but the manual it does).
How Can I update the core to latest core dev? Thanks!
Comment #54
oriol_e9g@rcodina You can grab the last core code dev from GIT:
Comment #55
rcodina CreditAttribution: rcodina commented@oriol_e9g But if I just want to just update the core I already have? Are you sure the clone will not break the site?
Comment #56
willwh CreditAttribution: willwh at North Studio commentedLet's see what the test bot thinks?
Comment #57
willwh CreditAttribution: willwh at North Studio commentedComment #58
willwh CreditAttribution: willwh at North Studio commentedWe need better tests for this.
Comment #59
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedI just confirmed that my assumption that it was magically fixed was wrong. The issue persists.
@rjay thanks for feedback. I've been thinking about what you said regarding my patch #45. I think that the outcome is actually better.
Let's say that the website is multilingual with my patch applied: en + ro for example. Scenarios:
I can only see the benefits of having it as
/{lang-code}/{alias}
. Remember it renders as/{alias}
if on default language. In addition I believe the alias manager actually should handle this and not other. So maybe someone can point me to a real scenario where this behavior it's a drawback.Comment #60
azinck CreditAttribution: azinck commentedJust want to weigh in that I strongly agree with Archy's interpretation here, as opposed to rjay's.
Comment #61
benelori CreditAttribution: benelori commentedHello everyone!
We've ran into this problem on our site as well, and here's a more detailed description of what happens.
1. We have two languages: one default and another enabled language.
2. Whenever there is a link that is rendered on the page and it runs through validation to be a valid route, Drupal checks if that information is cached and if not, then it saves that information.
3. Suppose you are on Romanian language page and you have a content on the page that is not translated yet:
e.g. something that renders a link that is an English alias.
4. Drupal will do what I described in step 2.
4.1. RouteProvider::getRouteCollectionForRequest() will check if there's a cache entry already, and if there's not then it will try to find out, if the English alias exists in a Romanian language context. This is not the case obviously, so line 160: $this->getRoutesByPath() will return 0.
4.2. In conclusion the data that will be saved in the cache will be 0.
5. When we try to actually access that particular link, after some navigation, the data will be loaded from cache and it will find no valid route saved for it, so NestedMatcher::matchRequest() will throw ResourceNotFoundException.
So #45 fixes the issue, because AFAIK, any untranslated content will fall back to the default language, and the patch does exactly this: when the data about the route is saved in the cache for the first time, the alias that is searched for handles the fallback (the default language) as well.
At first glance, my gut tells me that #45 treats the symptom and not the problem, though, but it's still a working patch in our case.
Comment #62
willwh CreditAttribution: willwh at North Studio commentedHmm, 45 does not work in our case, although my patch in #56 definitely does. I would love to see some input from maintainers if they have the time! Thank you!
Comment #63
_Archy_ CreditAttribution: _Archy_ as a volunteer and at PitechPlus commentedAs I see the issue patches that fix it by altering the way links are rendered won't solve it from it's root. Given the right context no matter from where if the route match is requested (in this case from the link formatter) the path aliased wont be found and will be cached with no match.
Another possible cleaner way to solve this would be to figure out a context to cache route matches on, ofc somehow language, but as I described in my first comment getting the language before the language pathprocessor ran is tricky to do. Also this would result in more cache and multiple duplicates.
Quite possibly my patch doesn't solve the problem in all the cases, so @willwh I would like to recreate your environment and try to adapt. What is the difference between your setup and the one described to reproduce? Or maybe you could check why the patch doesn't work.
Comment #64
mudassar774 CreditAttribution: mudassar774 commentedHi,
I have exactly same issue for the page which i created using views. Though I have NO link fields in source content types.
Comment #65
cilefen CreditAttribution: cilefen commentedComment #66
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedA special case here is when we create the node not in the default language, so the nodes original language is other than the default and we set the alias. In this case no matter how we try to access
/{alias}
it won't work. It will only work if the language code in which it was created is prepended/{original-language-code}/{alias}
.This case is not handled nor affected in any way by the #45 patch.
I think this should be handled as-well, because what happens: the user creates this node in a language other than the original, adds the alias and puts it in a link field somewhere else. Now if the page where this link shows is not in the linked nodes original language it will lead to 404. Even if we don't take in consideration links, the path
/{alias}
will lead to 404, even tough I as the user wouldn't expect that. Now I am not sure what to do about this. For the AliasManager to take care of this as-well it would have to check through all other languages (not the with the one which first requested and in case of #45 applied neither original) as the last resort to find a match.EDIT: After thinking a bit more about it, let's say that this node is originated from a language not the original and ALSO it is translated in another language (both having the same /{alias} set). So now even if implemented to search in other languages for alias match it could find multiple matches: the original one should be selected tough. But I think this is impossible to know at that point..
Should it be left as is or made something, what do you think?
Comment #67
rcodina CreditAttribution: rcodina commented@_Archy_ Something has to be done. In my case ALL my content is only in Spanish and it is frustrating to have this "Page not found" errors. I don't have any content in English.
Comment #68
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedAbout the special case I was talking about, that requires some serious changes and it's just related to this issue. It might be tough a feature request if it's needed.
@rcodina I can't really tell what is happening on your project. I've read through the related issue you posted. I've tried having drupal with only one language other than english, created nodes, tried it with links rendered in nodes, tried pathauto but they all seem to work with patch #45. Does your problem even involve links? After clearing the cache and directly accessing the /{alias} does it work? Does it work by accessing /es/{alias}?
Could you maybe help us a bit and (with patch #45 applied)
in PathProcessorAlias::processInbound method and after clearing the cache access the /{alias}. If you DON'T get {alias}={alias} and you are still getting 404 => than your problem is somewhere else.
Comment #69
rcodina CreditAttribution: rcodina commented@_Archy_ I have updated core to 8.2.4 and pathauto and ctools modules to latest version. Then I applied your patch at #45. But the problem persists.
The affected content type has no links but it has two Entity Reference fields. After clearing the cache neither /{alias} nor /es/{alias} work. To sum up:
Content with alias done automatically by pathauto:
es/talleres/taller-nuevo-siguiente => Not found
talleres/taller-nuevo-siguiente => Not found
Content with alias done manually to avoid problem:
es/taller2 => Found
taller2 => Not found
I know it's weird. I have done myself a new setup from scratch and I haven't been able to reproduce the problem.
I have been looking for PathProcessorAlias::processInbound but I haven't found it. However, I've found that module "Easy Breadcrumb" calls processInbound function. But that module has been uninstalled for long time (it is a dev version from 2016-05-31 and it didn't work as expected so we uninstalled it after some testing). Could it be the source of problems? If you tell me where processInbound is I can try to check out the var_dump you suggested. Many thanks!
Comment #70
_Archy_ CreditAttribution: _Archy_ at PitechPlus commented@rcodina: under the core folder search for the file PathProcessorAlias.php and in that you will find the method called processInbound.
Comment #71
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedCould be that maintainers aren't looking at this because it's in 'needs work'. Putting in review so they can re-analyze the situation.
Comment #72
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commentedI have some language codes that I use that are not in the drupal defaults. I get 404 errors on most pages when using my own language codes with this patch. Otherwise it seems to work well.
Comment #73
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commentedI made a slight change to willwh's patch to make it so that, if someone has custom lang codes in the language.negotiation.yml file, the website will still work.
Example in an example language.negotiation.yml file
prefixes:
en-gb:uk
This patch change will allow this example to go through where it wouldn't before.
Comment #74
Mac_Weber CreditAttribution: Mac_Weber as a volunteer commented@Tachion thank you for adding the patch for custom lang codes.
Patch looks good. I think we just need to add a test for this use case.
Comment #75
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commentedSounds Good. I found an issue with this patch with core 8.2.4 and up. I will track it down, modify the patch, and make a test.
Comment #76
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commentedchanging this to needs work.
Comment #77
tameeshb CreditAttribution: tameeshb at Google Summer of Code commented@Tachion What issue is it for >8.2.4 ?
Comment #78
_Archy_ CreditAttribution: _Archy_ at PitechPlus commented@Tachion Did you try the patch in #45 for the custom language codes and for the new problem that you found with the other patch?
Comment #79
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commentedIt appears that the side-effect I found was just something on my end. I will write up some tests for the patch.
Comment #80
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commented@tameeshb The issue was that some of the links, in particular it seemed to be views, would go back to the default language instead of the current language in some menu links. After looking at my site, I think it is highly unlikely that this issue has to do with core or with views. @_Archy_ Yes, I tried patch #45. It had the same effect.
Comment #81
_Archy_ CreditAttribution: _Archy_ at PitechPlus commented@rcodina and others: Could this issue be your problem?
Comment #82
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commentedSo this is my first time making tests for core. Can someone point me to some Drupal standards for making tests?
Comment #83
rcodina CreditAttribution: rcodina commented@_Archy_ I have updated the Pathauto and core and issue hasn't dissapeared.
Comment #84
cilefen CreditAttribution: cilefen commented@Tachion There is a good presentation on Drupal testing in general and the new methods.
The above link explains the test types: unit, integration, and functional. You can decide what is appropriate for the coverage needed. Generally, look for existing tests to expand if possible. You may know this already: We like to see a tests-only patch uploaded alongside the full patch, as a proof the tests work.
And, learn how to execute the tests on your own installation.
Comment #85
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedThis took some time and was really hard to make. But here is a patch containing a test that if PASSES the problem is still present. I will re-upload all the patches + this test to see if they fix the root problem .. tomorrow. FAILING ones will be solving the problem.
This TEST patch should now PASS as it proves the presence of the problem.
Comment #86
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedComment #87
catch@_Archy_ that's a 0 byte patch, depending on how you've created the patch, you might need to run git diff --staged instead of git diff or do something like git show $commithash
Comment #88
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedOh damn, sorry. Guess I was too tired. Can only upload on monday.
Comment #90
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedThis patch is intended to prove the problem. Since I didn't upload it right last time (0 bytes), I've also decided to change it to fail when the problem is present.
I didn't quite find a suitable place for it, so I've created a new BrowserTest. The main part of it is to prove that anything that will try to match a request for an alias will not find it and cache 0 results if the master request is set and the language is different from the one in which the alias exists. In this case the LinkFormater through Url it will use the PathValidator to request a match for it. But this could happen from anywhere, not just the link renderer.
This patch with the test should fail.
Comment #92
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedUpdated the test: removed unused method, added some commenting, and most importantly added a pop to the manually pushed request and reset to language manager before trying to access the aliased page. The interdiff is for the test only. Also renamed the file.
Added the test to the patch in #73.
Comment #95
willwh CreditAttribution: willwh at North Studio commentedHi _Archy_,
It looks like your patch is based on my previous one, if somewhat loosely? This is working really well in the site we were having issues with previously. We'd been using my patch in #56; but after testing yours, this is working nicely now! (I had some issues with your previous patch)
I think we can slap this with an RTBC, nice work!
Comment #96
willwh CreditAttribution: willwh at North Studio commentedComment #97
_Archy_ CreditAttribution: _Archy_ at PitechPlus commented@willwh That is totally your patch. I've only added the tests to it. Decided to drop my patch that adds fallback on language for aliases, as it didn't work for everyone.
Comment #99
oriol_e9gThere is a fail with PHP 7 with aggregator, I re-uploaded the patches in #92 to force run testing with Drupal 8.3.x branch. Do not credit me.
Comment #101
dawehnerAs an afterthought it feels like making the langcode optional was a bad idea, because that stopped thinking about the langcode.
This comment seems to not match what its actually doing. It is not fetching the langcode from the parent.
... We need to sure, but why? Expanding the documentation about that would be nice.
Comment #102
catch#101 is good points, also noticed the following:
Shouldn't this be 'Get the language code for the field items.'? And if that's all it is, does it need an inline comment at all?
As well as explaining why, this should also wrap at 80chars.
Comment #103
tameeshb CreditAttribution: tameeshb at Google Summer of Code commentedChanges from #102, #101
Comment #104
dawehnerWell, at least for me #103 still doesn't answer why this is done like that. Why is $items->getLangcode() prefered over $entity->getLangcode() etc.
Comment #105
BerdirWhat about different negotation methods?
What about the fact that we have multiple language types that can be configured differently?
getLocale() doesn't really seem to have any documentation what "a locale" exactly is, but it does forward it to \Locale::setDefault(), which mentions "UAX #35 extensions are accepted." More docs on http://php.net/manual/en/class.locale.php.
Language objects can have custom ID's, not sure what will happen then.
Will setting the locale interfere somehow with Drupal?
Is this really the correct and best option to pass this around? Maybe it would be better to use an arbitrary attribute key?
same here, what about content vs interface language, shouldn't this use the content language? Or actually, the language of the entity being displayed, which is nicely provided to use as $langcode. Which I think means you should use that?
Comment #106
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedAfter seeing all these feedback to the latter patch, and knowing that the problem is deeper than links.. decided to force alternatives. Digging even deeper I have come to the conclusion that there is a great problem with the path processors: they aren't contained enough.
The problem is that the AliasManager getPathByAlias() and getAliasByPath() both believe that the language manager will always provide them with accurate information. But here is what happens:
Here is a totally new approach to resolve the problem that pushes a new request on the request stack and resets the language manager on every process inbound and outbound. This patch will 100% fail as I haven't updated the tests for it, uploading to demonstrate.
This issue is much deeper than links and aliases. Other path processors might rely on data that doesn't actually represent what they are processing.
Comment #108
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedDid not expect so many failures tough. What if the language manager returns the current language based on the current request from the request stack? This seems to solve the problem as-well, let's see what it brakes.
Comment #109
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedComment #111
rcodina CreditAttribution: rcodina commented@_Archy_ In my case, updating to latest 8.3.0-beta1 and updating all contrib modules fixed the problem for me. So finally my problem may not be related to the one discussed in this issue.
EDIT: Amazing discovery: the real problem was that in my pathauto rule I had a tab character at the end of the pattern. Removing it makes the problem disappear also in prior versions of Drupal core. However, if you have at least 8.3.0-beta1 the alias aren't broken if you have that tab character (this is the way I discovered which was the problem).
Comment #112
pawel_r CreditAttribution: pawel_r commentedNone of proposed solutions helped in my case. #108 Worked perfectly for default language (English) but 10 other languages got permanent 404.
Waiting for solutions, meanwhile I had to fix it somehow. My tmp solution has not-so-terrible performance outcome:
Comment #113
Mac_Weber CreditAttribution: Mac_Weber as a volunteer commentedThis issue does not belong to Link module, this is probably a Path subsystem issue, or a Routing System issue. @catch will better know how to address this.
BTW @catch, is
PathProcessor
is considered part ofpath.module
?Comment #114
david.gil CreditAttribution: david.gil at Biko2 commentedHi pawel_r,
i use your workaround #112, and it solve the worst problem (aleatory 404 pages), but now sometimes i see in my menu links not pretty-path routes (/node/xxx),
does anybody has a better temp solution?
Best
David
Comment #115
david.gil CreditAttribution: david.gil at Biko2 commentedHi,
more info, and for me it is more really a critical BUG, so any help is appreciated. We move a big site (https://www.museothyssen.org/) to production today and it really happens in multiple pages. I try to explain my problem better:
As i said in #114 with line commented
we are not obtaining 404 error pages, BUT as we have varnish in front of our sites, we are caching lots of pages without pretty-urls.
If i are logged (so varnish is not caching), the pretty paths in rendered pages are Ok and in some pages we have the same menu cached with pretty and without pretty urls.
Any idea how can i temporally patch this.
Attached a couple of screenshots of the problem in https://www.museothyssen.org/ just relesed today in production.
Thx in advance
Comment #116
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedI've been investigating this issue for quite some time by now. And the problem is two distinct things:
So these need to be resolved.
#1. For the language manager we need to be able to override the request and store per request the the negotiated language. So the path processor manager would inject the request into the language manager for which path is processing, then reset to the current request when all processors ran. (this is sort of what #108 does)
But this alone doesn't resolve the problem.
#2. The alias manager has to provide fallbacks for an alias that has multiple translations.
First it should find whether in the current language the alias exists. If not find whether in the original language it exists (this is what my first patch does #45). But this still isn't enough: Then finally it should select the first one from any other available languages. And finally fail and return the alias as is.
So this way the user could put in a link field /about-us and get the best matched translation (because of #2) and if he prefers any translation of it he could specify {langcode}/about-us (because of #1).
#1. Is something that should be done in a separate issues I think, added the issue.
Comment #117
_Archy_ CreditAttribution: _Archy_ at PitechPlus commented@david.gil Sorry I deleted you files unwillingly as I was writing my comment before there were uploaded. Can you put them back? Also did you try the patch in #45 , that worked for our case.
Comment #118
david.gil CreditAttribution: david.gil at Biko2 commentedhere they are, i will give a try to #45
Comment #119
jiisuominen CreditAttribution: jiisuominen commentedWe had this exact issue and were able to reproduce it with 8.2.7. We have link field attached to a Paragraph field, and those paragraphs are attached to nodes. And the target node gave 404 if translated node where the paragraph was added was accessed before the link target node after cache clearance. This applied only when the link was added manually without auto complete. Also if the target node had pathauto option enabled this issue went away.
Patch in #45 seemed to do the trick for us too.
Comment #120
catchPathProcessor isn't part of path module, due to a very long-standing situation where alias storage and handling was in core and the UI was in path module.
There's a case to move this to 'request processing' since it's a default implementation of that, but let's leave in path module for now.
Comment #121
david.gil CreditAttribution: david.gil at Biko2 commentedHi,
more info in my install. I have been testing workaround in #112 and it is a bad solution, at least in my case it makes pretty-urls randomly not work.
Today i use patch in #45 and it seems to work!. My sites have 2 languages, main is spanish and secondary english. I can reproduce exactly the problem in "Steps to reproduce" and with #45 it is solved.
Best
Comment #122
catch#45 looks like it's probably a good solution to this and I'm not sure why it was dropped.
The case in #66 could be split to a new issue.
Comment #123
_Archy_ CreditAttribution: _Archy_ at PitechPlus commented@catch #45 is just a partial solution and only works in simpler instances. Check #116. For instance for @willwh case it didn't work as he mentioned in #62.
Maybe we could extend #45 with the fallback to the first available alias from any language, so that should take care of wider range of combinations. And for the language manager problem mentioned in #116 I've already created a separate issue.
Comment #124
david.gil CreditAttribution: david.gil at Biko2 commentedHi all again,
i was wrong with my comment #121 it seens that in my install there are 2 problems, the one of this issue that it is solved with #45 and another one that makes that in some cases the url alias are rendered without pretty paths, there must be another problem related with path subsystem that i cannot find and makes that sometimes urls are rendered using internal path (node/XXX), anyone can refer me a related isuue?
Comment #125
catchI read through nearly the whole issue again, on re-reviewing #45, don't think we should get the original language alias when the current and default language don't return anything, that's a big change of behaviour for the path alias system.
I got back to #25 which looks a lot closer to what I'd expect, also a very minimal change to avoid the 404 which is the critical bug here.
Here's an untested patch combining the fix from #30 with the browser test from #92, might not even be a perfect re-roll, got a lot of conflicts since those patches are from a while back but see what comes back.
Comment #126
catchShort review:
We don't use $request->setLocale() anywhere else in core. Should we be setting that in all of the language negotiation plugins if we use it here?
Fixing one stray array syntax code style issue from the new test and uploading a test-only patch.
Comment #127
catchThat wasn't a test only patch, sorry for the noise.
Comment #129
bmcclure CreditAttribution: bmcclure as a volunteer and at Top Floor commentedWhen I'm on my Redirect admin page (/admin/config/search/redirect), I'm getting the following error with the latest patch here:
Error: Call to undefined method Drupal\Core\Field\BaseFieldDefinition::get() in Drupal\link\Plugin\Field\FieldFormatter\LinkFormatter->viewElements() (line 181 of core/modules/link/src/Plugin/Field/FieldFormatter/LinkFormatter.php).
I'm not entirely sure why yet, but apparently `$item->getFieldDefinition()` is returning an object without a get() method in some cases or something. Still trying to investigate.
Comment #130
BerdirThat's not the problem, the problem is that get() is a config entity method, and only configurable fields are config entities, base fields are not. Redirect has a link base field. I'd expect that menu links and shortcuts could cause similar errors when viewed somehow, e.g. in a view.
Use isTranslatable() instead.
Comment #131
bmcclure CreditAttribution: bmcclure as a volunteer and at Top Floor commentedThe fix was to change the conditional to use isTranslatable() instead of get(). There was no get() method. Here's a slightly modified patch with this change.
Comment #132
bmcclure CreditAttribution: bmcclure as a volunteer and at Top Floor commented@Berdir Thanks! Found the issue at the same time I think :) Last patch changes only that method call from the previous patch.
Comment #133
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commentedSo the 8.3 patch does not work if you have a custom language prefix. I can look into it, but if anyone gets that taken care of first that would be great! For example: I have made the basic page /test-page and then switch to /uk/test-page and it immediately gives a 404. I don't even need to click the link in the outlined steps.
Comment #134
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commentedI made a small change to bmcclure's patch to allow for custom lang codes. The reason I didn't make a test for this is because 8.3 came out. I will make a test for this case unless someone gets to it first. At any rate this change should allow for custom lang codes.
Comment #135
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commentedComment #136
nkraftJust checking on the state of this, as I've started encountering this bug on a massive, multilingual site we will be putting into Production in just a few weeks. This issue is causing us some major stress.
I'm not adept at this whole patching review process. Which of these patches should I apply? Is #131 the patch I should be use? Has it been tested adequately? Is there any hope of getting this bug patched in a Core update soon?
Thanks for any insight you can provide!
Comment #137
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commentedyou should probably use patch 135. You don't have to use 135 if you don't use custom langcodes, but I imagine that once I write a test for 135 that the patch based on 135 will be used.
Comment #138
nkraftThanks!
One quick question -- is there a way to AVOID this issue without the patch? I saw some people alluding to the link field linking to a page in a multilingual site... seems like there's a lot of moving pieces. But could I recreate a problem node (make a new copy of it), and avoid doing something to it so as not to incur this issue?
Comment #139
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commentedI don't think so. I would apply the patch. We have it on our site and it seems to work pretty good.
Comment #140
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commentedI tried doing a fresh install and running the PHPUnit tests, but I didn't see any issue even when I put $this->assertSession()->statusCodeEquals(202); at the bottom of AliasManagerTest->testAliasRouteCache() and I didn't see any additional assertions. (I figure that the page can't have 202 and 200 status codes). I put in a test function that I think will test for the custom langcodes, but again, I didn't see anything pop up when I made another function inside of AliasManagerTest.
Why isn't this patch going through the test queue?
Comment #141
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commentedTrying to upload the patch again.
Comment #142
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commentedComment #144
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commentedHmm looks like I might need to redo the 140/141 patch. I will get to this tomorrow.
Comment #145
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commentedok so 141 seemed to have passed. I put in a test that I think would test the issue I fixed, but it seems that the tests in the previous patch are not even being executed. I am new to the Drupal PHPUnit testing, so I could be missing something.
Comment #146
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commentedI will not be able to work on this next week, but I will be able to look at this after that.
Comment #147
dhansen CreditAttribution: dhansen as a volunteer commented141 has a LOT of garbage in it like permissions changes and the addition of drush as a composer requirement, and that's before I even get to any reviews. Reissuing that patch.
Comment #148
david.gil CreditAttribution: david.gil at Biko2 commentedHi,
i think that the approach that start @catch in #127, and so all the patchs sended after it had some problems, at least in my installation. I give more info. I have a multilingual site with 2 languages (spanish & english), the default one is spanish. The language detection is based in url, and i have no prefix in default language, so my urls in spanish dont have a prefix.
With the approach in latest patches i found that:
- The locale of Request objects is initialized in processInbound of class LanguageNegotiationUrl, but when there is not a path prefix it is not initialized.
- When PathProcessorAlias ask aliasManager for a path using: getPathByAlias if the Request has the locale not initialized as is in my case, the defaultLocale of Request is used, and in Symfony:Request class it defaults to "en".
With latest patch #148 based in #127 if you have an installation where the default language is not English and default language has no path prefix ALL pages in default language resolve as a 404!.
I tried several ways to initialize locale in LanguageNegotiationUrl but all that i tries has side effects. Any ideas?
Best
Comment #149
david.gil CreditAttribution: david.gil at Biko2 commentedComment #150
holist CreditAttribution: holist at Siili Solutions commentedI can confirm that the patch results in just 404's with non-English default language.
Also I found out that the issue arises if the link field is translatable but target node is not translated to current language (I'm on a site where there are four languages out of which two are only partially supported). If the link is entered as the alias "/test-page" it will be first "/{current-lang}/node/{id}" after save but convert into "/test-page" after cache rebuild. So effectively linking to default language version of a node from another language breaks the target page. Using autocomplete for the link the issue does not appear in this case either.
Comment #151
Guybrush Threepwood CreditAttribution: Guybrush Threepwood commented@dhansen: Thanks!
Comment #152
catchSo I think we should add an else branch there to setLocale() when there's no path prefix to the default language, but not able to roll that patch atm. Will also need test coverage for that case.
Comment #153
david.gil CreditAttribution: david.gil at Biko2 commentedHi catch,
i think it is not so simply (at least in the manual tests that i did. i already try what you suggest), the problem is not with the main request but with the sub-request, in some comment _Archy_ talk about that (#116 and some others).
Comment #154
holist CreditAttribution: holist at Siili Solutions commentedShould someone be trying to solve a problem like this on their site but don't have Link fields (and to keep count on the severity of the issue): It looks like this issue may also affect a site that uses Pathologic module, which causes links in A tags in formatted text fields to be processed as internal: URIs.
Haven't tested yet, but just read through the code and it seems a plausible explanation for the problem on a site that was suffering from this same issue. There actually weren't any Link fields causing the problem, but there is Pathologic to support separate URLs for editors and anonymous users.
This would be related to the feature in the issue that arises when linking by path to default language version of a node that is not translated to current language that I explained in #150.
Might actually warrant even changing the title of the issue, as it is not about Link field but linking by path across languages.
Comment #155
Arlina CreditAttribution: Arlina at Chapter Three commentedPatch #147 solved this issue on a site using custom language codes.
Comment #156
lcontreras CreditAttribution: lcontreras commented@david.gil Did you solve this?. I have a site like yours. Spanish like default language and english.
Comment #157
esolitosStill happens on 8.4 :(
Comment #158
david.gil CreditAttribution: david.gil at Biko2 commentedNot exactly the same problem, but i report here because some of my messages in the thread talks about another problem with alias and internal paths, probably someone can benefit of this: https://www.drupal.org/node/2879512
Comment #160
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedI will try to do some progress on this.
Comment #161
DeFr CreditAttribution: DeFr at Axess Open Web Services commentedHad the same problem, but using translation by domain instead of path prefix, so added that to #147. Attaching the resulting patch.
Comment #162
catch@DeFr thanks for updating.
Holist or david.gil are you able to post step-by-step instructions to introduce the bug you found with 404s so we can convert that to automated test coverage?
Comment #163
david.gil CreditAttribution: david.gil commentedHi catch,
they are in the description of the issue
Comment #164
RumyanaRuseva CreditAttribution: RumyanaRuseva at FFW commentedUnfortunately as stated by @david.gil in #148 the latest patches result into 404 if the site has non-English default language with an empty language prefix.
I can also confirm that the patch in #45 works in this case. @lcontreras and @holist you may want to try that - even though it's not perfect for all cases, it might be ok for you until a better solution is found.
Comment #165
holist CreditAttribution: holist at Siili Solutions commented@catch: like @david.gil said, steps in issue description result in 404's.
But also in addition to that, I found that it does not matter if the link fields are translatable if these hold:
/some/path
(which is saved as).
I am using patch #45 successfully, and I have instructed content editors to use autocomplete, no paths.
Comment #166
maacl CreditAttribution: maacl commentedWe encoutered a similar problem and are not sure how to proceed, but maybe this provides a more global view of the problem.
The scenario:
Now a visitors recieves a url to this special entity via mail. This URL does not contain a language prefix.
The following happens, with cold caches:
A) German visitor with german browser settings visits first:
B) English visitor with english Browser settings visits first:
So a page call on cold cache, where the language negotation results in an empty route (=no translation available) breaks the language negotation of following page calls of the same route.
For now we are using a rewrite rule to make sure in this special case the user gets redirected to the "de" prefixed url, and test a patch that prevents the cache entry of empty routes. With the patch applied in our scenario a english user as second visitor would be redirected to /en/entity/id, trigger our custom NotFoundException and everything else still works.
Comment #167
timcosgrove CreditAttribution: timcosgrove commented(xpost from https://www.drupal.org/node/2834638#comment-12268007)
Just to add a voice here: we run a D8 site in 29 languages, and we were seeing this issue, and in other cases besides link fields. We run US English as our default, serve that from /en/foo aliases, and redirect from /foo aliases to /{prefix}/foo aliases according to language negotiation rules.
We were seeing similar behavior, where in some cases /foo aliases were cached as 404s when the intended destination entity was not translated into all languages, and an untranslated redirect request (i.e. /en-gb/foo) was requested and cached.
We implemented the patch from https://www.drupal.org/node/2802403#comment-12217386 and this appears to have successfully resolved our issues. Of note, for this thread: some of the aliases in question were not contained in link fields anywhere.
https://www.drupal.org/node/2802403 (this issue) appears to be focused on link fields as that was where the bug was first seen, but the issue does ultimately seem to be with bad route caching, as Catch has said
.
I would probably recommend https://www.drupal.org/node/2834638 be closed as a duplicate of https://www.drupal.org/node/2802403 and the title of https://www.drupal.org/node/2802403 (this issue) be modified to reflect that it affects more than link field data. I would defer to Catch here.
Comment #168
catchAgreed with #167. I think this is the same issue as #2834638: ERROR 404 after a while, clear cache fixes it. If for some reason there's a bug remaining after this issue is fixed, we'd still need to rule out the approach here anyway. Marking #2834638: ERROR 404 after a while, clear cache fixes it as duplicate and re-titling.
Comment #169
catchThis also belongs more in request processing, path aliases are just consuming the API, and the information it needs isn't available at the moment.
Comment #170
jbd44 CreditAttribution: jbd44 commentedIs this a related issue: https://www.drupal.org/node/2910820? It's hard to tell.
Comment #171
gagarine CreditAttribution: gagarine as a volunteer commentedI do have a website with this issue.
You can find the full code source (with config exported in /config/sync ) on https://github.com/oxima-sarl/nouveaumonde.ch (don't try to access our prod server, we are using ssh key, no password ;) )
I attached a dump of the database user 1 is "admin" with "drupal" as password: https://www.drupal.org/files/issues/ozcf_db_prod_2017-09-23.sql_.gz
The /fr/event/youtube-perle (first event in the listing) is a 404.
Clearing the cache do not solve the problem.(i do fix, but only if the website is reloaded with an anonymous user) Accessing the node on /fr/node/18 works.Let me know if I can provide more info.
Comment #172
gagarine CreditAttribution: gagarine as a volunteer commentedIn fact clearing the cache fix the problem ONLY if I reload the page with an anonymous user after truncating the cache_* and cachetag tables.
Perhaps this is because my admin user have the settings to have the admin always in english?
EDIT: I can confirm. Having "Administration pages language: English" create 404 on cache rebuild. If I have no preference for administration page language cache rebuild works.
#161 fix the problem.
Comment #173
maacl CreditAttribution: maacl commentedIn my opinion (as far as I can overlook this and I am fairly new to Drupal) there are a couple of issues related to each other, seeing different symptoms with different proposals but all linked to language negotiation and aliases.
Making sure user gets redirected to prefixed path:
This can become a problem if someone decided/gets requested to have the default language not prefixed.
Make sure path lookup always deliveres a result, if there is any in any language
Other:
Some other older issues
Maybe this helps.
Comment #174
catch#2852303: Adding a multilingual menu link with an URL alias will result in 404 error after the cache clear is an issue with domain negotiation. That's borderline duplicate with this, but adding to related issues for now.
Comment #175
jacktonkin CreditAttribution: jacktonkin at ISSUP commentedI have had a similar issue with 404s and the patches from #127 fix the issue on my site without side effects. I have English as default language and two translations. While reviewing the patches linked at the top of the issue I noticed that the most recent patch is missing the test from #127. I have attached a patch combining #161 and the test-only patch from #127. This includes the following additions to #127 (without additional test coverage):
Having read through the issue the main outstanding area is the reports (#148, #150) that this patch doesn't resolve the issue when using languages other than English as the default.
Comment #176
catchJust to say #175 is my understanding of the issue as well. Do we possibly need to initialize $request->locale to the default language before negotiation runs to help with #148?
Comment #177
jacktonkin CreditAttribution: jacktonkin at ISSUP commentedI just realised I uploaded the wrong patch in #175. That one was #45 + the test-only patch from #127.
The attached patch is as I described: #161 + the test-only patch from #127. Apologies for the noise.
Comment #178
jacktonkin CreditAttribution: jacktonkin at ISSUP commentedAnd here is a patch that modifies AliasManagerTest to install Drupal in a language other than English, in an attempt to give a failing test for #148.
Comment #180
hctomI just created another issue that might have the same root cause: #2915190: Menu links losing active trail when not edited in path alias language (sorry for the bad title, but I did not really know how to title it because of its complexicity, hehe). Currently I only added detailed steps to reproduce losing the menu active trail when using aliased paths as menu links. Perhaps someone might have a look and see if this is related or gives you some more information/ideas on what happens here.
Comment #181
efpapado CreditAttribution: efpapado at Ramsalt Lab commented#177 Doesn't fix on 8.4.3
Comment #182
huyby CreditAttribution: huyby commentedSeems related to: https://www.drupal.org/project/drupal/issues/2846379
Comment #183
Nor4a CreditAttribution: Nor4a as a volunteer commentedMy workaround for this problem right now is to create a different alias for the page which does not work (e.g. displays 404 page) and create redirect to the page from the old alias to the new one (if it is not created automatically).
Comment #185
BerdirI've been working on a fix in #1125428: Language-specific aliases only work with url-based language negotiation but I think my change might actually belong in this issue.
I think the real problem is that getRouteCollectionForRequest() is cached without language information and what this issue is trying to do is just workarounds for that? As far as I see, if the information is cached just for the path, then it's impossible to get it working reliably for session, domain, preferred user language or anything else that is not in the path. Then again only url/domain is currently considered for the url alias.
That said, I've been trying to make sense of the test here but it's quite confusing because the comments seem to be backwards and it seems to actually assert wrong things?
So we switch the language to ro, but then expect that the lookup for /test-page, which is in the default language actually works? IMHO that should actually *not* find anything? Seems like that would only work because somehow the cache clearing isn't fully working?
If I change this assert to 0, then this test is passing for me with my patch from #1125428: Language-specific aliases only work with url-based language negotiation, without any additional changes.
another confusing example, the comment says it's proving that it's not found, but clearly we want to prove that it is found? Comments need to describe the expected behavior now, not how it worked before fixing the bug.
Comment #186
_Archy_ CreditAttribution: _Archy_ at PitechPlus commented@Berdir This was the first test I have ever created and first I've created it to prove the issue and not to prove that it is working. So first I`ve commented everything in the sens of failing, but later I have realized that it is not the way it should be done, so I have updated the test to succeed when it`s working correctly.
This test mimicks the steps from the original issue reproduction steps. The part that you describe in point 1. is when the route for the path is requested by the path resolver during rendering of the link on a non default language page. That lookup should actually find the route for /test-page because it does exist.
In my opinion the root issue comes from what i described in this comment
Comment #187
Berdir> That lookup should actually find the route for /test-page because it does exist.
But your current active language is RO and in that language, there is no route/alias, so why should it find something? Doesn't make sense to me.
Comment #188
BerdirSee also my test in the other issue for the session negotiation where I'm testing the alias twice with both languages in different orders. I think also here you should have an explicit tes that ro/test-page does not exist and also /ro-page doesn't
Comment #189
_Archy_ CreditAttribution: _Archy_ at PitechPlus commented> But your current active language is RO and in that language, there is no route/alias, so why should it find something? Doesn't make sense to me.
Yes it should, because I am using the getRouteCollectionForRequest on the route provider service. This method is used also by the url matcher / validator. That is why I am explaining that the path processor logic is flawed as the language is not recalculated the second time this service is called / the language is not per request calculated.
Comment #190
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedHere is again my explanation on what is happening:
Comment #191
bircherA while ago (last October) we experienced that issue as well.
I created a behat test for our site but unfortunately never had the time to convert it to a core test.
But it can be pretty instructive to reproduce it.
https://gist.github.com/bircher/a1e0874bae91e086ef03117260e28c48
Comment #192
anavarreI just reproduced thanks to your Behat test. Thanks, @bircher.
Comment #193
Berdir@_Archy_: I agree that this is a problem, but not with your conclusion and that test. /test-page does *not* exist in the russian, why should it return a match? That only works for the same reason that it works with a real link field when saving and that is for the wrong reason, I'll get to that.
Been debugging this a bit more because I didn't understand why exactly this change fixes the problem, it just changed the language in the link formatter but that is irrelevant, we are not viewing the link, we are saving/validating it. And I'm starting to understand the whole thing a bit better.
We have two different problems that are overlapping a lot.
* This issue, having an aliased link to a different language that then results in a 404 when accessing it in the correct language.
* My problem, that using domain or other negotiation that has the same path for different languages allows to trigger a 404 by simply accessing it with the wrong language domain first.
For both cases, the real problem is the fact that \Drupal\Core\Routing\RouteProvider::getRouteCollectionForRequest() does not cache by language, I'm quite convinced about that. The alias resolving itself would consider the current language, which is different then in both cases And I'm also convinced that this is not the right fix (I'll get to why a bit later), but my issue is possibly also not a complete fix, not sure.
We have two approaches to fix it. Mine from the other issue that makes that cache vary by language. And this, which somehow also fixes it. I understood why mine fixes it (because there are then two caches, one for each language), but I didn't understand why this fixes it. Now I do.
So what happens is that we validate in \Drupal\link\Plugin\Validation\Constraint\LinkAccessConstraintValidator::validate() that the link is valid. This calls $item->getUrl(), which calls Url::fromUri() which then calls \Drupal\Core\Path\PathValidator::getUrl(), which creates a new request object with just the path and no other information. Then that goes into \Drupal\Core\Path\PathValidator::getPathAttributes() which does inbound path processing. in HEAD, this can't resolve the path because there is no alias for ro/the current language. But now with the patch, it can. But it can only do that because this is a new request object that *always* has locale en. Now we continue with the unaliased internal path, which is /node/ID. We now check access and end up in \Drupal\Core\Routing\RouteProvider::getRouteCollectionForRequest() with our $request object that now returns /node/4. So the cache ID is different because the path is different at this point. And since the internal path works in both languages, accessing it both with the alias and the internal path will then work on the correct language.
But what this means is that this *only* works because:
a) we do inbound path processing twice on url validation.
b) it will only ever work for links that have an english alias. No other language combination can be fixed by this approach.
The "problem" with my patch is that it actually can't resolve /test-page when editing a translation. That means we technically can't do the access check, in case my target would be unpublished and not accessible to me. But there could be an alias /test-page for two different languages, what on earth should happen then, anyway, I don't know what the correct behavior would be. The only thing I can think of right now would be to store the de-aliased internal path on the field, then we know that it will always be the same no matter which language. But afaik we explicitly decided against doing that once.
Comment #194
_Archy_ CreditAttribution: _Archy_ at PitechPlus commented@Berdir: Thanks for your detailed input. The issue is clearer now. I agree that we should cache route collections per language as-well, because the path doesn't always contain the language information (example: domain negotiation as you said). But this alone won't fix the issue.
I once proposed in this comment to make the path processing per request and the language manager to be request stack aware so that it returns the language of the current request from the stack. Now I am re-proposing that solution. This patch tough needs some work and I am not yet sure that it is doable as of the many test failures. @Berdir can you take a look and tell me what you think?
I am pro for that solution because in this manner the services that are used durning the path processing (ex: language manager) can be specific for that request.
Comment #195
_Archy_ CreditAttribution: _Archy_ at PitechPlus commenteda. Given that:
* We have url language negotiation.
* Have the following languages: EN, RO, RU; with en being default.
* And the following alias for all the languages /test-page.
If the user enters /test-page in the field it should match the EN one. OK
If the user enters /ro/test-page in the field it should match the RO one. OK
If -||- /ru/test-page -||- the RU one. OK
b. Given that:
* We have url language negotiation.
* Have the following languages: EN, RO, RU; with en being default.
* And the following alias for RO language /test-page.
If the user enters /ro/test-page it finds the alias.
If the user enters /test-page it finds nothing. Is this OK? It should be because if let's say we have another alias in RU then how do we know which one to select. => OK
c. Given that:
* We have domain language negotiation.
* Have the following languages: EN, RO, RU; with en being default.
* And the following alias for all languages /test-page.
If the user enters /test-page it finds ??. In this case what do we do?
c.1 Select the one in current language and if not found in current language skip. Not OK in my opinion.
c.2 Use the url negotiation path processor and allow the user to enter /en, /ru, /ro. OK I guess?
c.3 In this case the domain should also be entered in front of the path: and if it's not just search in the current language. OK I guess?
I am not sure currently how links are managed when using domain negotiation, maybe someone can clarify.
Comment #196
BerdirFor c) with domain negotiation it's the same as a prefix, if the alias is for a specific language that is not the curren't domain it will not find it.
However, the only thing that then doesn't work is access validation on save. Meaning, if linking to an unpublished content that I'm not allowed to access, i'd get a validation error. To be honest, I find that link/access validation very strange in general, if you actually *know* the alias* then why do we bother with not allowing you to link to it? That doesn't mean that you can actually access it. The link still validates successfully and is saved.
Yes, my change with the caching doesn't solve all problems. But is solves the biggest problem here and that's the cached 404's. Discussed also a bit with @dawehner and @catch in IRC and @catch suggested a separate issue and I think that makes sense.
Another separate issue that I want to open is to try and remove the double-inbound-processing on url validation. I suspect that's a left-over from an earlier implementation that didn't use actual route matching to look up the path. More consistency IMHO and a tiny performance improvement probably. We are doing quite a lot of link validation and generation.
Then, I will try to come up with a test that basically tests the full scenario as a web test with a link field without any shortcuts, to avoid incorrect assumptions and that we have test coverage for the actual bug and combine that with my test coverage from the other issue.
Comment #197
BerdirOk, here we go, two new tests, one based the LanguageSwitcherTest from the other issue and the other initially based on PathLanguageTest with parts from the previous test here but essentially completely rewritten. That test also uses a @dataProvider to run it in both directions. First with source language english and french translation, then french original and english translation.
To be able to reliable test in both directions, I also added a language prefix for english. IMHO, that should be the default behavior anyway. It does result in slightly different behavior because it means that the /target-page link on the translation does not work. We could fix that be explicitly using "$source_langcode/target-page" as alias, but not sure if we should do that.
Adding the routing cache fixes those tests. As mentioned, I think the exact behavior for link validation of translated aliases would IMHO be better handled in a separate issue, this is already long and complex enough (aka a "quagmire", as @catch so nicely described it).
Comment #198
BerdirAlso created #2939912: Inbound path processing is done twice when validating an URL path
Comment #199
BerdirAdded a comment as suggested by @dawehner in the other issue.
Comment #201
dawehnerOh I didn't knew one can have assertions in setup(). Interesting!
Do we have a followup for that?
I'm curious whether you could also add a test for the behaviour in here ...
Comment #202
Berdir1. Actually didn't do that on purpose, that part was copied from the PathLanguageTest.
2. Not yet, I think that is related to the alias/link/validation/display thing that we've been discussing in the last few comments but don't have an issue yet.
3. Thought about that. Wasn't sure if really required as we already check that there is a language and it's not the responsibility of the RouteProvider which language it is. But yes, why not, doesn't hurt.
Also fixed the broken namespace, always forget that when working with phpunit :(
Comment #203
dawehnerNice additional test! Thank you @berdir!
Comment #204
BerdirAlso opened #2940013: Define expected behavior for link validation and display with aliases pointing to different languages, the todo should possibly refer to that. I'm not sure if we can or need to do anything about that problem, but lets continue that discussion there :) I created it as a normal bug now, if others think it is more important, feel free to raise the priority.
This should probably be added to that @todo in the test as a reference, maybe that can be done before commit leaving at RTBC so that a committer can have a look.
Comment #205
anavarreJust performed manual testing and everything looks great.
Comment #207
_Archy_ CreditAttribution: _Archy_ at PitechPlus commentedI have also added #2940036: Path processors are fixated to current request and not the processed request with a test ready and open for discussion.
Comment #209
larowlanDon't tests need to have a
{Something}Test
naming pattern in order to be discovered? Or has that changed?Looks like our coding standards expect that too.
Other than that, this looks good to me
Comment #210
plachI don't think this fix will work with session negotiation, as in that case the URL language type will always get the default site language.
I'd rather add all the (configurable) language types to the
$cid
, as we do inLanguagesCacheContext::getContext()
or in D7 indrupal_render_cid_parts()
.We should be able to take care of this with the change above. I don't think the problem lies within the URL language type itself.
Comment #211
david.gil CreditAttribution: david.gil commentedIf someone uses domain module, the patch provided by Bedir dont work because that module override the method. I created a related issue and submit a patch.
https://www.drupal.org/project/domain/issues/2940296
Comment #212
BerdirThanks for the review @plach:
> I don't think this fix will work with session negotiation, as in that case the URL language type will always get the default site language.
Well, this depends on what we'll do with #1125428: Language-specific aliases only work with url-based language negotiation. Right now, session negotiation has no effect on the URL language type unless it is manually configured to do so, which I'm doing in one of the tests actually, to have a way to have a language that's not in the path. If it is in there, then my test shows that it works, if it's not in there, then it's not relevant.
Note that this cache is not a render cache, it doesn't contain an actual part of the response, just inbound path processing + matching routes.
And as long as \Drupal\Core\PathProcessor\PathProcessorAlias::processInbound() hardcodes (by not specifying it and using the default in AliasManager) TYPE_URL, using type URL here is enough. Unless someone of course does inbound path processing with a different language type (or the moon phase or hour of the day, then it won't work anyway). But I'd argue with the current meaning of TYPE_URL, that's not really a valid thing to do?
If \Drupal\Core\PathProcessor\PathProcessorAlias::processInbound() changes in #1125428: Language-specific aliases only work with url-based language negotiation, then this would need to be updated as well.
Also, render caching in D8 by default only varies by the interface language, not content, that needs to be explicitly added and currently, nobody cares about also adding the url language type afaik in core :)
That said, I can add all languages if you feel strongly about that, just wanted to explain why I think that's not really necessary. It would be more about being extra-sure that we're fine if someone does something crazy/incorrect.
> We should be able to take care of this with the change above. I don't think the problem lies within the URL language type itself.
That would probably be true, but I don't think it's the correct fix, just a side-effect of it. The result if the path negotiation will not change, we'll just cache it differently. I think that test-scenario is bit weird and probably not something that happens in reality (depending on what the agreement/fix in #1125428: Language-specific aliases only work with url-based language negotiation will be). But I need a non-path based negotiation and can't use domain in tests.
#209: Yes that is wrong, will update that when we have an agreement on the cid.
Comment #213
dawehnerOne day we might no longer use run-tests.sh/simpletest to find the list of tests. In this absurd future, we will be able to run these tests properly through phpunit :)
Comment #214
BerdirOk, discussed with @platch, we agreed to keep the URL language type here and add a @todo. I also refactored the cache id building into *2* methods, specifically to make it easier for domain.module and similar use cases to override just the part that it needs while allowing us to change the language logic in the future without having to update domain.module again after that.
Comment #215
plachLooks great, thanks!
Comment #218
catchCommitted/pushed to 8.6.x and cherry-picked to 8.5.x. Thanks!
Comment #220
jatinkumar1989 CreditAttribution: jatinkumar1989 commentedHi,
I updated core to 8.5.0 and i am facing the same issue. (#214 is already part of core, so code changes are already there)
for language “EN” all nodes/links are working fine.
but when i switch language, some links (menu links) and node links are giving `page not found`
any help, Pls.
Thanks
-Jatin
Comment #221
jatinkumar1989 CreditAttribution: jatinkumar1989 commentedComment #222
xjm@mail2jatingarg, I'd suggest opening a new issue. Thanks!
Comment #223
thierry.beeckmans CreditAttribution: thierry.beeckmans at Dropsolid for Teamleader commentedWe had a related issue on Drupal 8.3.9 where the alias was identical for a node in Dutch and another node in English and German. The pages are served in a separate domain.
* When a site visitor opened the Dutch node 1, it became cached.
* In a second visit, the English version of node 2 was requested. Instead of serving node 2, it served node 1, as both nodes shared an identical url alias and node 1 was already cached.
I have made a patch for Drupal 8.3 based on #214.
Comment #224
thierry.beeckmans CreditAttribution: thierry.beeckmans at Dropsolid for Teamleader commentedComment #225
thierry.beeckmans CreditAttribution: thierry.beeckmans at Dropsolid for Teamleader commentedThe patch generated in comment #223 for Drupal 8.3.9 was not generated correctly and included some unrelated code parts.
The patch file for 8.3.9 is updated.
Comment #226
xmacinfo@thierry.beeckmans Please open a NEW issue and reference this one.
And then, post your patch in the new issue.
It is generally not a good idea to reopen a closed issue.
Comment #227
skuark CreditAttribution: skuark commented@thierry.beeckmans Can you check your patch in #225 is valid for 8.3.9? I am not able to apply it on a Drupal 8.3.9 installation.
Comment #228
skuark CreditAttribution: skuark commented@thierry.beeckmans No worries, I've been able to apply the patch in #214 to my 8.3.9 installation.
Comment #229
sathishdevan CreditAttribution: sathishdevan as a volunteer and commented@thierry.beeckmans
The patch you have given does not work for domain based localisations.
Comment #230
bradjones1This issue is long closed; if there are new issues, please open a new issue.
Comment #231
kadiiski CreditAttribution: kadiiski commentedFor the rest of the people that will be looking for a solution for this case:
I found that these broken routes are being cached in drupal and thus not being passed to the Inbound when needed. So it's actually caching a rout with no route, query, params, controller etc - just the path. This is by definition a 404 page.
I decided that in my case it's not going to be a harm if we don't cache the 404 pages.
Please not that this might affect the performance.
Comment #232
Carlitus CreditAttribution: Carlitus at Omitsis Consulting SL commentedThanks @kadiiski! With your patch now it works this:
https://www.drupal.org/project/country_path/issues/3332893