Problem/Motivation

I've got a live Drupal 8 site that has certain pages that 404 when accessed. The page is still accessible to edit, provided you can track down the Node ID to write the path. Clearing the caches on the site solves the problem temporarily. I've had several instances where after clearing the cache and making sure the page is accessible, it becomes inaccessible again the next day. This is unsettling as those path caches are not set to expire ever, so I do not know what is acting on them.

Since clearing caches works, I tracked down the cached path information in the cache_data table. Broken paths are missing a significant amount of data. For example, here is the cached data on a path that is not working (cid: route:/patients/prevention/plant-products/cancer-prevention-sulforaphane:) :

a:3:{s:4:"path";s:66:"/patients/prevention/plant-products/cancer-prevention-sulforaphane";s:5:"query";a:0:{}s:6:"routes";O:41:"Symfony\Component\Routing\RouteCollection":2:{s:49:" Symfony\Component\Routing\RouteCollection routes";a:0:{}s:52:" Symfony\Component\Routing\RouteCollection resources";a:0:{}}}

This is considerably emptier than the same path cache after a cache clear and re-access:

a:3:{s:4:"path";s:10:"/node/4054";s:5:"query";a:0:{}s:6:"routes";O:41:"Symfony\Component\Routing\RouteCollection":2:{s:49:" Symfony\Component\Routing\RouteCollection routes";a:1:{s:21:"entity.node.canonical";C:31:"Symfony\Component\Routing\Route":1327:{a:9:{s:4:"path";s:12:"/node/{node}";s:4:"host";s:0:"";s:8:"defaults";a:2:{s:11:"_controller";s:48:"\Drupal\node\Controller\NodeViewController::view";s:15:"_title_callback";s:49:"\Drupal\node\Controller\NodeViewController::title";}s:12:"requirements";a:3:{s:4:"node";s:3:"\d+";s:14:"_entity_access";s:9:"node.view";s:7:"_method";s:8:"GET|POST";}s:7:"options";a:5:{s:14:"compiler_class";s:34:"\Drupal\Core\Routing\RouteCompiler";s:10:"parameters";a:1:{s:4:"node";a:2:{s:4:"type";s:11:"entity:node";s:9:"converter";s:21:"paramconverter.entity";}}s:14:"_route_filters";a:1:{i:0;s:27:"content_type_header_matcher";}s:16:"_route_enhancers";a:1:{i:0;s:31:"route_enhancer.param_conversion";}s:14:"_access_checks";a:1:{i:0;s:19:"access_check.entity";}}s:7:"schemes";a:0:{}s:7:"methods";a:2:{i:0;s:3:"GET";i:1;s:4:"POST";}s:9:"condition";s:0:"";s:8:"compiled";C:33:"Drupal\Core\Routing\CompiledRoute":429:{a:11:{s:4:"vars";a:1:{i:0;s:4:"node";}s:11:"path_prefix";s:5:"/node";s:10:"path_regex";s:24:"#^/node/(?P<node>\d+)$#s";s:11:"path_tokens";a:2:{i:0;a:4:{i:0;s:8:"variable";i:1;s:1:"/";i:2;s:3:"\d+";i:3;s:4:"node";}i:1;a:2:{i:0;s:4:"text";i:1;s:5:"/node";}}s:9:"path_vars";a:1:{i:0;s:4:"node";}s:10:"host_regex";N;s:11:"host_tokens";a:0:{}s:9:"host_vars";a:0:{}s:3:"fit";i:2;s:14:"patternOutline";s:7:"/node/%";s:8:"numParts";i:2;}}}}}s:52:" Symfony\Component\Routing\RouteCollection resources";a:0:{}}}

Running Drupal 8.1.10 with several languages enabled

Note that I initially started to investigate this issue on #2728725: Special characters like tab or spaces in pattern can break alias generation but I do not think it is related as that issue is not partially resolved with cache clears and also does not involve Pathauto.

Steps to Reproduce

  1. Spin up an instance of multilingual_demo distribution on simplytest.me (http://ply.st/multilingual_demo). Proceed through the install like normal; you should not need to change any defaults.
  2. We'll use the 'About Us' page as the target. Edit that page (/node/9/edit) and give it a new alias under URL Path Settings. In this example I have set the new alias to '/about-us'. It just needs to be different from the aliases given to the other translations.
  3. Add a link field to the basic page content type (/admin/structure/types/manage/page/fields/add-field). Do NOT allow this field to be translated, and make sure it allows internal links. Also move the link field out of 'Disabled' on the full content display (/admin/structure/types/manage/page/display/full); the link field needs to be rendered for this to happen.
  4. Add a basic page (/node/add/page). Content does not matter, but on the newly created Link field add in the path in the URL box ('/about-us'). Do not use the autocomplete.
  5. Make a translation of that new page. I'll go ahead and make a Spanish one (/es/node/10/translations/add/en/es). No changes need to be made; you can just save it
  6. Clear the site caches on the Performance page (/admin/config/development/performance). The new route issue happens when the cache is created, so we can't have the existing cache standing in the way.
  7. Visit the translation of the new page (es/node/10). This renders the link field which breaks the route cache of the page it links.
  8. Attempt to access the English About Us page (/about-us). It will now 404 until the next cache clear.
CommentFileSizeAuthor
#231 disable_caching_of_broken_routes-2802403-231.patch798 byteskadiiski
#225 routing-cache-language-2802403-225.patch19.5 KBthierry.beeckmans
#223 routing-cache-language-2802403-223.patch26.54 KBthierry.beeckmans
#214 routing-cache-language-2802403-214-interdiff.txt3.13 KBBerdir
#214 routing-cache-language-2802403-214.patch18.9 KBBerdir
#202 routing-cache-language-2802403-202-interdiff.txt3.81 KBBerdir
#202 routing-cache-language-2802403-202.patch17.85 KBBerdir
#199 routing-cache-language-2802403-199-interdiff.txt966 bytesBerdir
#199 routing-cache-language-2802403-199.patch15.7 KBBerdir
#197 routing-cache-language-2802403-197.patch15.5 KBBerdir
#197 routing-cache-language-2802403-197-test-only.patch9.72 KBBerdir
#178 failing-test-2802403-178.patch9.08 KBjacktonkin
#178 interdiff-2802403-177-178.txt1.45 KBjacktonkin
#177 2802403-177.patch8.64 KBjacktonkin
#175 2802403-175.patch7.74 KBjacktonkin
#171 ozcf_db_prod_2017-09-23.sql_.gz2.1 MBgagarine
#161 2802403-161-path-alias-404.patch4.33 KBDeFr
#147 path_module-allow-custom-langcodes-2802403-147.patch4.14 KBdhansen
#141 path_module-allow-custom-langcodes-2802403-141.patch1.85 MBGuybrush Threepwood
#140 2802403-140.patch11.57 KBGuybrush Threepwood
#134 2802403-134.patch8.44 KBGuybrush Threepwood
#131 2802403-129.patch8.43 KBbmcclure
#127 2802403-126.patch8.43 KBcatch
#127 2802403-126-test-only.patch4.3 KBcatch
#126 2802403-126.patch8.43 KBcatch
#126 2802403-126-test-only.patch8.43 KBcatch
#125 2802403-30-fix-92-tests.patch8.43 KBcatch
#118 RouteProvider_php_-_thyssen_-____Sites_thyssen_.png157.38 KBdavid.gil
#118 thyssen-problems.jpg317.29 KBdavid.gil
#108 core-lmanager_and_processor_current_request-2802403-108.patch4.49 KB_Archy_
#106 core-path_processors_contain-2802403-106.patch2.32 KB_Archy_
#103 interdiff-91-103.txt1.12 KBtameeshb
#103 2802403-103.patch8.84 KBtameeshb
#99 pass-langcode-to-getPathByAlias-2802403-91.patch8.81 KBoriol_e9g
#99 2802403-test_aliased_route_cache-92.patch4.3 KBoriol_e9g
#92 pass-langcode-to-getPathByAlias-2802403-91.patch8.81 KB_Archy_
#92 2802403-test_aliased_route_cache-92.patch4.3 KB_Archy_
#92 interdiff-for-test.txt2.26 KB_Archy_
#90 2802403-green_proves_wrong_functionality-test.patch3.9 KB_Archy_
#85 2802403-green_proves_wrong_functionality-test.patch0 bytes_Archy_
#73 pass-langcode-to-getPathByAlias-2802403-73.patch4.51 KBGuybrush Threepwood
#56 interdiff-2802403-30-56.txt1.14 KBwillwh
#56 pass-langcode-to-getPathByAlias-2802403-56.patch4.5 KBwillwh
#45 core_path-test-2802403-45-D8.patch2.07 KB_Archy_
#45 core_path-alias_manager-language_fallback_and_test-2802403-45-D8.patch3.37 KB_Archy_
#11 2802403-11.patch6.43 KBdawehner
#15 2802403-11.patch6.43 KBBerdir
#21 getPathByAlias-parse-default-langcode-2802403-21.patch1.8 KBrjay
#23 getPathByAlias-parse-default-langcode-2802403-22.patch1.21 KBrjay
#25 pass-langcode-to-getPathByAlias-2802403-25.patch3.17 KBrjay
#28 pass-langcode-to-getPathByAlias-2802403-28.patch3.97 KBrjay
#30 pass-langcode-to-getPathByAlias-2802403-30.patch4.05 KBrjay
#33 pass-langcode-to-getPathByAlias-2802403-33.patch3.99 KBrjay
#37 core_path-alias_manager-default_match_to_other_languages-2802403-37-D8.patch1.31 KB_Archy_
#40 core_path-alias_manager-default_match_to_other_languages-2802403-40-D8.patch1.35 KB_Archy_
#40 interdiff_2802403_37-40.txt958 bytes_Archy_
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

dhansen created an issue. See original summary.

Berdir’s picture

I do think it is the same issue, people now started to claim that cache clear helps them there. And if not, the other issue is actually multiple issues.

But thanks, this is a lot of useful info.

Pathauto uses core API's though to save aliases, and we're not storing anything in cache ourself, afaik that only happens when aliased pages are actually accessed. Could be some kind of race condition triggered by our code, though.

dhansen’s picture

I tracked down that cache generation to RouteProvider::getRouteCollectionForRequest. It seems the processInbound on the PathProcessManager is returning no updates to the path when the cache is being set, which means AliasManager's getPathByAlias is returning nothing when all this happens (it's being called by PathProcessAlias).

I think that means this points to some kind of race condition as @Berdir described, where the cache is being created but the path is not yet set.

It's still very strange. It continues to happen specifically and consistently on specific nodes of the site; a race condition IMO is generally less consistent than this.

Anyone have any ideas on how to make sure the alias is in place before the cache for the route is built?

dhansen’s picture

Version: 8.x-1.0-alpha3 » 8.x-1.0-beta1

Did some updates and I'm continuing to see this issue with:

  • Drupal Core 8.1.10
  • Pathauto 8.x-1.0-beta1
  • Token 8.x-1.0-beta2
dhansen’s picture

I hacked some core to output information on what is happening in the RouteProvider. I've learned the following:

  • The AliasManager not returning the correct alias is not just an issue in the PathProcessManager; calling the AliasManager separately from the PathProcessManager at the same time gives the same result. At least we know it's consistent.
  • I'm logging to watchdog (or whatever it's called in D8 now) so I can see the request that is prompting the call to re-cache the route. NONE of these requests have been to the page itself. All of the requests that prompt the re-caching come from pages that:
    • Have a link field that point to the page that ends up failing.
    • Are translated content (Spanish, which is the only currently active language) but the link field is NOT translatable.

This is looking less and less like a Pathauto issue and more like a core issue.

willwh’s picture

Hi dhansen!

We are also running in to this, thanks very much for reporting this!

Could you possibly provide a patch for your core changes? I'd love to help with this issue, as it is definitely Major, possibly arguably a critical bug.

Thanks!

dhansen’s picture

Title: Some Aliases Caching Incorrectly » Link Field Causes Pages to 404 with Bad Route Caching on Multilingual Sites
Project: Pathauto » Drupal core
Version: 8.x-1.0-beta1 » 8.1.10
Component: Code » link.module
Issue summary: View changes

Big breakthrough: I can reproduce the issue! Lots of ticket updates including updating the issue name and adding reproduction steps.

The issue occurs without the help of Pathauto, so I'm moving this to the Core queue. This is definitely related to the Link field. Pathauto just seemed to be a component because I was having it generated different aliases based on different languages, which seems to be a component of this breaking down.

I should note also that this cache is NOT being updated spontaneously. We're just flushing cache so often in dealing with it that it seemed to happen at odd hours.

@willwh I'm hesitant to create a patch because I don't want anyone to think it is a resolution. Also, most of my code was just an array defining the corrected paths for pages I knew were breaking. If you can identify any pages that break on the regular on your site, you can do something similar to this after line 157 in core/lib/Drupal/Core/Routing/RouteProvider.php:

$broken_paths = array(
  '/path/that/breaks' => '/node/1234', // Use the correct internal path here
);

if (isset($broken_paths[$request->getPathInfo()]) && $request->getPathInfo() == $path) { // If the alias is broken, the returned path will still be the same
  \Drupal::logger('routing_issue')
    ->notice('Path Rebuilt: ' . $path );
  $path = $broken_paths[$request->getPathInfo()];
}
willwh’s picture

Absolutely, thanks for posting more info. I will be spending some time on this today.

azinck’s picture

We're also seeing this; it's very disruptive. Definitely qualifies as "major".

Thank you for posting, dhansen, as this has been almost impossible for us to reproduce outside of our production environment.

dawehner’s picture

This is really a "interesting" issue.

dawehner’s picture

Status: Active » Needs review
FileSize
6.43 KB

@chx and @dawehner looked at this issue and we have some assumption that the PathValidator calls out to the inbound path processor while the route provider does as well.

We have both no idea whether this could fix the issue, but it has the potential to do so.
@willwh @azinck @dhansen
It would be great if you could try out the patch with your setup.

dawehner’s picture

This really sounds borderline critical

willwh’s picture

@dawehner & @chx, thanks guys!

Honestly, I'm still try to reproduce this locally :)

Based on the awesome digging by @dhansen, I thought I'd be able to do so.

i.e. on some pages we have untranslatable link fields used in views... although, despite my many attempts yesterday, I still can't seem to reproduce this in my local environment.

My thinking is roughly; with all types of caching enabled; drush cr, view various pages in each language, then view the pages I would expect to display the above behaviour (they do in our prod environment) in multiple languages (We're using English as a default with 'jp' and 'zh-hans' also enabled).

@dhansen, if you have any further tips as to how you're reproducing this, I'm all ears... I've had no luck so far.

azinck’s picture

We've deployed this patch to production and will report back if we see the problem recur. Thanks for your efforts @dawehner and @chx.

Berdir’s picture

FileSize
6.43 KB

Re-uploading the patch, lets see if testbot likes me more than @dawehner

dhansen’s picture

Status: Needs review » Needs work

@dawehner and @chx: Just tried the patch on my local dev (skipping the test changes, so just tweaking those two lines in the PathValidator). It has no effect on the issue; pages still 404.

@dawehner The only reason I left it as Major instead of Critical is because it represents no data loss. I was on the fence as well, so if you see a good reason to bump it up then we certainly can.

@willwh I have NOT tried it with the link field rendered in a view, which may be a key difference but I am not sure. There are some key things that seem to contribute to the issue on my end:

  • Language is managed through the langcode appearing in the path before the alias. Not sure if that makes a difference or not.
  • The page that breaks has different aliases for different languages beyond just the langcode.
  • The link field that breaks the page uses the path and NOT the autocomplete entity reference. The path it uses is the path that would be broken.
  • The link field must be rendered for it to break. Just loading a page with a hidden field won't do it. This is where I'm not sure how Views fits in.
  • Link field needs to be untranslatable on the content type
  • You need to load the translated version of the content with the link field on it for the alias to break. If you don't have link text, you'll see the output of the link switch to the internal Drupal path if you've done it right.

I'll help out more if I can. I will say that the best work-around currently is just to not use internal paths on the link field and only use the entity reference. That automatically translates the path with no issues.

azinck’s picture

I hadn't noticed the steps that dhansen posted to the summary about how to reproduce the problem. I'll test it in our environment later today.

Berdir’s picture

Version: 8.1.10 » 8.2.x-dev
willwh’s picture

Much like azinck, I did not see those reproduction steps. We're taking a hop through this, this morning, thanks very much for the additional info folks!

rjay’s picture

I was able to reproduce this issue consistenly in a local development environment by following the instructions that bhansen provided, so I started stepping through the code that renders the link field with a debugger.

During the rendering of the link field, getPathAttributes() calls the inbound path processor, which in turn calls PathProcessorAlias::processInbound(). The processInbound method attempts to load the path from the alias that is provided:

...
$path = $this->aliasManager->getPathByAlias($path);
...

The getPathByAlias() method takes a language code as an optional second parameter, which is not being passed in this case. When a language code is not passed in to getPathByAlias(), it falls back to using the current language to try to load a path.

Here is where I believe the problem starts to occur. If you follow bhansen's steps to reproduce, the alias that is entered in the link field is in the default language (eg. English). When you do a cache rebuild followed by viewing the translation of the page that has the link field on it, getPathByAlias() attempts to fetch the path that the alias points to using the current language (eg. Spanish).

In my case, the page that is being linked to only has a translation in the default language, and does not have a translation in any other languages. Ultimately getPathByAlias() is unable to fetch the path that the alias points to in the current language, so $path remains set to the alias instead.

Then getPathAttributes() tries to fetch a route that matches $path, but that fails because $path is still an alias. However a record still gets added to cache_data for the alias with an empty RouteCollection.

So first of all, it seems like it would be sensible to not cache a path if we are unable to find a matching route for it.

Second, anyone have any thoughts on how should we deal with making sure getPathByAlias() attempts to fetch a path in the correct language? We could make PathProcessorAlias::processInbound() pass a language code to getPathByAlias(), which would deal with this specific situation. Or we could make getPathByAlias() attempt to get the language code from the alias, before it falls back to using the current language.

rjay’s picture

Status: Needs work » Needs review
Issue tags: +Needs tests
FileSize
1.8 KB

This patch makes getPathByAlias attempt to determine the $langcode from the $alias first, rather than automatically defaulting to the current language.

Status: Needs review » Needs work

The last submitted patch, 21: getPathByAlias-parse-default-langcode-2802403-21.patch, failed testing.

rjay’s picture

So those test failures indicate that modifying getPathByAlias() was not the right approach here.

This patch modifies the inbound path alias processor and passes a language code to getPathByAlias(). Let's see if Testbot likes this one better.

Status: Needs review » Needs work

The last submitted patch, 23: getPathByAlias-parse-default-langcode-2802403-22.patch, failed testing.

rjay’s picture

Status: Needs work » Needs review
FileSize
3.17 KB

I updated the returnValueMap for getPathByAlias() in the path processor tests. This was necessary because this patch makes the path processor pass a language code to getPathByAlias() now, instead of NULL.

rjay’s picture

Status: Needs review » Needs work

So all of the existing unit tests are passing now, at least.

The URI for the untranslatable link field is now being rendered as /<langcode>/node/<nid>, instead of the alias that was entered in the link field, when viewing the page that the link field appears on in any language other than the site's default language.

catch’s picture

Priority: Major » Critical

I think this comes under:

Cause race conditions, database deadlocks etc. for which even code with 100% automated test coverage may be affected

Also nearly:

Render a site unusable and have no workaround.

If you get 404s on lots of pages the site is unusable. You can't flush caches on a site 24 hours a day as a workaround (well you could but it's not like an admin page being inaccessible).

So bumping to critical. Nice find!

rjay’s picture

Status: Needs work » Needs review
FileSize
3.97 KB

The issue that I mentioned in #26 that is causing the URI to be rendered as /<langcode>/node/<nid> seems to be due to the fact that $url->options does not contain a language.

When no language is explicitly set on the $url, the outbound path processors (specifically, the outbound path alias processor) default to the current language.

I've updated the patch in #26 so it now explicitly sets the language option on the $url when rendering the link field.

Status: Needs review » Needs work

The last submitted patch, 28: pass-langcode-to-getPathByAlias-2802403-28.patch, failed testing.

rjay’s picture

Status: Needs work » Needs review
FileSize
4.05 KB

Reroll of #28, but this time only explicitly setting the language on the $url for untranslatable links.

dawehner’s picture

+++ b/core/lib/Drupal/Core/PathProcessor/PathProcessorAlias.php
@@ -32,7 +32,7 @@
   public function processInbound($path, Request $request) {
-    $path = $this->aliasManager->getPathByAlias($path);
+    $path = $this->aliasManager->getPathByAlias($path, $request->getLocale());
     return $path;
   }

Could this code also just call out to the language manager and ask for the current language?

rjay’s picture

@dawehner: The getPathByAlias() method in the alias manager is already calling out to the language manager to get the current language when a language is not passed in to it, and when I was debugging this issue that was what triggered the 404 errors.

When an untranslatable link field is being rendered on a page where current language != default language, then $request->getLocale() will be different than the current language. That said if there is another, better way of getting the language of the link field than $request->getLocale(), let me know and I'll update my patch :)

rjay’s picture

Upon further testing I was still seeing links in the format /<langcode>/node/<nid> with the patch in #30.

So this patch gets the language from the $item for untranslatable links when rendering a link field, instead of using the language that is passed to viewElements(). This is for the same reason that I mentioned in #32.

dawehner’s picture

Given we have 404s I believe some actual higher level regression test would be really cool.

dawehner’s picture

Status: Needs review » Needs work

Let's be honest here :)

Wim Leers’s picture

Issue tags: +D8 cacheability

This screams cache contexts.

_Archy_’s picture

Hi everyone! I investigated the issue without continuing the findings of @rjay and tracing the problem I ended up in the same place.

The main problem here is that routes do actually get cached on language context, while route collections for a request do not necessarily (might be wrong about this). For example with the provided reproduction the /about-us alias is not getting converted to the path because we are searching for it in the wrong language context. So we get it back as is, then when we try to match a route for it we get 0 results and cache it. Then when we navigate to the path we get this cache with 0 routes.

I've also taught about caching the route collection matched for a request in context of language, but extracting the language from the request is messy before the PathProcessorLanguage has ran (the cache id needs to be generated before this, see: RouteProvider::getRouteCollectionForRequest). This solution might break other stuff.

I propose the following solution, which I think it's cleaner than @rjay s: in AliasManager::getPathByAlias in case we don't find a match in the current language we should also check for one in the default language, and if found put it in the current language cache as well.

_Archy_’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work
_Archy_’s picture

Forgot to check if the website is multilingual.

_Archy_’s picture

Forgot to mention that the patch in #40 renders the link as /<langcode>/<alias>. I guess this would be the right/expected behaviour, correct me if I am wrong, but if not it also without effort eliminates the issue with the other patch:

The URI for the untranslatable link field is now being rendered as /<langcode>/node/<nid>, instead of the alias that was entered in the link field, when viewing the page that the link field appears on in any language other than the site's default language.

arunkumark’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: +ChennaiDrupalGroup

This patch seems working fine. It handles 404 issue on Multilingual site.

heddn’s picture

Status: Reviewed & tested by the community » Needs work

Still needs tests.

BTW, #33 definitely does not fix the issue. It breaks the site because there is no ->get() method available from BaseFieldDefinition. I discovered this on a multi-lingual with that version of the patch on site when trying to view redirects at admin/config/search/redirect.

xjm’s picture

Issue tags: +Triaged D8 critical

The committers discussed this issue and agreed that it is a critical bug, given the unpredictability, disruption for regular site visitors, and the lack of a feasible workaround.

_Archy_’s picture

Created a test for this issue and added it to my patch. This is my first ever test so it might not be the best. It seems to me that it is bound to my solution, so someone should take a closer look at the test code.

Uploading test patch to fail and the fix with test patch to pass.

NOTE: I have added the following to the testGetPathByAliasNoMatch() so it doesn't throw exception (getId() on NULL..) when ran with my solution:

$this->languageManager->expects($this->any())
      ->method('getDefaultLanguage')
      ->will($this->returnValue($language));

The last submitted patch, 45: core_path-test-2802403-45-D8.patch, failed testing.

drupalninja99’s picture

We have 2 multilingual D8 sites with similar symptoms, that is random 404s for pages that have aliases.

_Archy_’s picture

@drupalninja99 could you try and see if the #45 patch with the fix and the test eliminates the problem.

Bringing to top the test patch and fix+test patch.

rcodina’s picture

@_Archy_ I've tested your patch and it applies cleanly. However, it doesn't solve the problem I experience with URL alias in Spanish. See related issue #2728725: Special characters like tab or spaces in pattern can break alias generation.

_Archy_’s picture

Hello @rcodina. Did you clear the cache after applying the patch? If so and it still doesn't fix the problem: could you give me a step to reproduce the problem, or maybe just point me to one?

I have a hunch that it is caused because of the default language other than english and/or probably that it is the only one. Could you also check if you add back english it happens? If yes: then when you set it default does it?

rjay’s picture

Status: Needs review » Needs work

@_Archy_: I tested your patch in #45 and it does prevent 404 errors from occuring for me when I follow the Steps to Reproduce in the description of this issue.

However, as you pointed out in #41, your patch causes the URI to be rendered as /<langcode>/<alias>. That doesn't seem like the expected behaviour for an untranslatable Link field, in my opinion.

If a user enters an /<alias> in an untranslatable Link field, I would expect that when the Link field is rendered the alias is output exactly as it was entered in the field, regardless of what the current language is. And that is the behaviour of an untranslatable Link field without the patch in #45 applied.

So the patch in #45 alters how the URL of an untranslatable Link field is constructed when it shouldn't. Inside the getPathByAlias() method there doesn't seem to be anything we can use as context to determine if we are fetching an alias for an untranslatable Link field, which is what we would need in order to handle that specific case appropriately.

That's the conclusion I came to in #23, and it's why I abandoned the approach of modifying getPathByAlias() in favour of attempting another, admittedly less elegant, possible solution.

I've also confirmed the issue with the URL Redirect listing page that @heddn pointed out in #43, so my patch in #33 does indeed still need work if we were to return to that approach.

_Archy_’s picture

I just wanted to take a look at alternative ways to handle this, but can't reproduce the issue. Neither from the branch I was working from for this issue (pull--rebased), neither by checking out a new branch that tracks 8.2.x. Maybe something changed that alters the way links are rendered or I am doing something wrong, tough I've re-read the steps and re-tried it 3 times.

Can someone check if it still happens with the latest dev?

EDIT: It renders the link as /about-us on the translated content (to ro) and when I access it works (takes to en).

rcodina’s picture

Hello @rcodina. Did you clear the cache after applying the patch? If so and it still doesn't fix the problem: could you give me a step to reproduce the problem, or maybe just point me to one?

I have a hunch that it is caused because of the default language other than english and/or probably that it is the only one. Could you also check if you add back english it happens? If yes: then when you set it default does it?

Yes, I cleared the cache. I can't give you steps to reproduce the problem. It's weird but It happens at some point. You can check out my comments on issue #2728725: Special characters like tab or spaces in pattern can break alias generation to find more details.

I added back English and it doesn't work too. Then I set English default again and then both nodes failed (manual URL alias and auto alias). Then I saved again both nodes and then the problem was the same as usual (the auto alias doesn't work but the manual it does).

Can someone check if it still happens with the latest dev?

How Can I update the core to latest core dev? Thanks!

oriol_e9g’s picture

@rcodina You can grab the last core code dev from GIT:

git clone --branch 8.3.x https://git.drupal.org/project/drupal.git
rcodina’s picture

@oriol_e9g But if I just want to just update the core I already have? Are you sure the clone will not break the site?

willwh’s picture

willwh’s picture

Status: Needs work » Needs review
willwh’s picture

Status: Needs review » Needs work

We need better tests for this.

_Archy_’s picture

I just confirmed that my assumption that it was magically fixed was wrong. The issue persists.

@rjay thanks for feedback. I've been thinking about what you said regarding my patch #45. I think that the outcome is actually better.

Let's say that the website is multilingual with my patch applied: en + ro for example. Scenarios:

  • An english user visits the site. He is on the page on which we have the aliased link. He clicks on it gets on the /about-us page which is still english. OK
  • A romanian user visits the site. He is on the translated page on which we have the aliased link. He clicks on it and gets to /ro/about-us. The about us content wasn't translated so the english one will be shown, rather than nothing, BUT all other stuff around it (links, blocks .. etc) will be translated so he can still easily navigate on the website. As he knows little english maybe he can still understand something from the about us. OK
  • The website owner wants to publish somewhere else the link to about us:
    • Wants to target english audience: it can be done: /about-us is the link. OK.
    • Wants to target romanian audience tough it's not yet translated: /ro/about-us is the link. OK.

I can only see the benefits of having it as /{lang-code}/{alias}. Remember it renders as /{alias} if on default language. In addition I believe the alias manager actually should handle this and not other. So maybe someone can point me to a real scenario where this behavior it's a drawback.

azinck’s picture

Just want to weigh in that I strongly agree with Archy's interpretation here, as opposed to rjay's.

benelori’s picture

Hello everyone!

We've ran into this problem on our site as well, and here's a more detailed description of what happens.

1. We have two languages: one default and another enabled language.

2. Whenever there is a link that is rendered on the page and it runs through validation to be a valid route, Drupal checks if that information is cached and if not, then it saves that information.

3. Suppose you are on Romanian language page and you have a content on the page that is not translated yet:
e.g. something that renders a link that is an English alias.

4. Drupal will do what I described in step 2.
4.1. RouteProvider::getRouteCollectionForRequest() will check if there's a cache entry already, and if there's not then it will try to find out, if the English alias exists in a Romanian language context. This is not the case obviously, so line 160: $this->getRoutesByPath() will return 0.
4.2. In conclusion the data that will be saved in the cache will be 0.

5. When we try to actually access that particular link, after some navigation, the data will be loaded from cache and it will find no valid route saved for it, so NestedMatcher::matchRequest() will throw ResourceNotFoundException.

So #45 fixes the issue, because AFAIK, any untranslated content will fall back to the default language, and the patch does exactly this: when the data about the route is saved in the cache for the first time, the alias that is searched for handles the fallback (the default language) as well.

At first glance, my gut tells me that #45 treats the symptom and not the problem, though, but it's still a working patch in our case.

willwh’s picture

Hmm, 45 does not work in our case, although my patch in #56 definitely does. I would love to see some input from maintainers if they have the time! Thank you!

_Archy_’s picture

As I see the issue patches that fix it by altering the way links are rendered won't solve it from it's root. Given the right context no matter from where if the route match is requested (in this case from the link formatter) the path aliased wont be found and will be cached with no match.

Another possible cleaner way to solve this would be to figure out a context to cache route matches on, ofc somehow language, but as I described in my first comment getting the language before the language pathprocessor ran is tricky to do. Also this would result in more cache and multiple duplicates.

Quite possibly my patch doesn't solve the problem in all the cases, so @willwh I would like to recreate your environment and try to adapt. What is the difference between your setup and the one described to reproduce? Or maybe you could check why the patch doesn't work.

mudassar774’s picture

Hi,

I have exactly same issue for the page which i created using views. Though I have NO link fields in source content types.

cilefen’s picture

_Archy_’s picture

A special case here is when we create the node not in the default language, so the nodes original language is other than the default and we set the alias. In this case no matter how we try to access /{alias} it won't work. It will only work if the language code in which it was created is prepended /{original-language-code}/{alias}.

This case is not handled nor affected in any way by the #45 patch.

I think this should be handled as-well, because what happens: the user creates this node in a language other than the original, adds the alias and puts it in a link field somewhere else. Now if the page where this link shows is not in the linked nodes original language it will lead to 404. Even if we don't take in consideration links, the path /{alias} will lead to 404, even tough I as the user wouldn't expect that. Now I am not sure what to do about this. For the AliasManager to take care of this as-well it would have to check through all other languages (not the with the one which first requested and in case of #45 applied neither original) as the last resort to find a match.

EDIT: After thinking a bit more about it, let's say that this node is originated from a language not the original and ALSO it is translated in another language (both having the same /{alias} set). So now even if implemented to search in other languages for alias match it could find multiple matches: the original one should be selected tough. But I think this is impossible to know at that point..

Should it be left as is or made something, what do you think?

rcodina’s picture

Should it be left as is or made something, what do you think?

@_Archy_ Something has to be done. In my case ALL my content is only in Spanish and it is frustrating to have this "Page not found" errors. I don't have any content in English.

_Archy_’s picture

About the special case I was talking about, that requires some serious changes and it's just related to this issue. It might be tough a feature request if it's needed.

@rcodina I can't really tell what is happening on your project. I've read through the related issue you posted. I've tried having drupal with only one language other than english, created nodes, tried it with links rendered in nodes, tried pathauto but they all seem to work with patch #45. Does your problem even involve links? After clearing the cache and directly accessing the /{alias} does it work? Does it work by accessing /es/{alias}?
Could you maybe help us a bit and (with patch #45 applied)

var_dump($alias.'='.$path);

in PathProcessorAlias::processInbound method and after clearing the cache access the /{alias}. If you DON'T get {alias}={alias} and you are still getting 404 => than your problem is somewhere else.

rcodina’s picture

@_Archy_ I have updated core to 8.2.4 and pathauto and ctools modules to latest version. Then I applied your patch at #45. But the problem persists.

patch -p1 < core_path-alias_manager-language_fallback_and_test-2802403-45-D8.patch 
patching file core/lib/Drupal/Core/Path/AliasManager.php
patching file core/tests/Drupal/Tests/Core/Path/AliasManagerTest.php

The affected content type has no links but it has two Entity Reference fields. After clearing the cache neither /{alias} nor /es/{alias} work. To sum up:

Content with alias done automatically by pathauto:
es/talleres/taller-nuevo-siguiente => Not found
talleres/taller-nuevo-siguiente => Not found

Content with alias done manually to avoid problem:
es/taller2 => Found
taller2 => Not found

I know it's weird. I have done myself a new setup from scratch and I haven't been able to reproduce the problem.

I have been looking for PathProcessorAlias::processInbound but I haven't found it. However, I've found that module "Easy Breadcrumb" calls processInbound function. But that module has been uninstalled for long time (it is a dev version from 2016-05-31 and it didn't work as expected so we uninstalled it after some testing). Could it be the source of problems? If you tell me where processInbound is I can try to check out the var_dump you suggested. Many thanks!

_Archy_’s picture

@rcodina: under the core folder search for the file PathProcessorAlias.php and in that you will find the method called processInbound.

_Archy_’s picture

Status: Needs work » Needs review

Could be that maintainers aren't looking at this because it's in 'needs work'. Putting in review so they can re-analyze the situation.

Guybrush Threepwood’s picture

I have some language codes that I use that are not in the drupal defaults. I get 404 errors on most pages when using my own language codes with this patch. Otherwise it seems to work well.

Guybrush Threepwood’s picture

I made a slight change to willwh's patch to make it so that, if someone has custom lang codes in the language.negotiation.yml file, the website will still work.

Example in an example language.negotiation.yml file

prefixes:
en-gb:uk

This patch change will allow this example to go through where it wouldn't before.

Mac_Weber’s picture

Status: Needs review » Needs work

@Tachion thank you for adding the patch for custom lang codes.
Patch looks good. I think we just need to add a test for this use case.

Guybrush Threepwood’s picture

Status: Needs work » Needs review

Sounds Good. I found an issue with this patch with core 8.2.4 and up. I will track it down, modify the patch, and make a test.

Guybrush Threepwood’s picture

Status: Needs review » Needs work

changing this to needs work.

tameeshb’s picture

@Tachion What issue is it for >8.2.4 ?

_Archy_’s picture

@Tachion Did you try the patch in #45 for the custom language codes and for the new problem that you found with the other patch?

Guybrush Threepwood’s picture

It appears that the side-effect I found was just something on my end. I will write up some tests for the patch.

Guybrush Threepwood’s picture

@tameeshb The issue was that some of the links, in particular it seemed to be views, would go back to the default language instead of the current language in some menu links. After looking at my site, I think it is highly unlikely that this issue has to do with core or with views. @_Archy_ Yes, I tried patch #45. It had the same effect.

_Archy_’s picture

@rcodina and others: Could this issue be your problem?

Guybrush Threepwood’s picture

So this is my first time making tests for core. Can someone point me to some Drupal standards for making tests?

rcodina’s picture

@_Archy_ I have updated the Pathauto and core and issue hasn't dissapeared.

cilefen’s picture

@Tachion There is a good presentation on Drupal testing in general and the new methods.

The above link explains the test types: unit, integration, and functional. You can decide what is appropriate for the coverage needed. Generally, look for existing tests to expand if possible. You may know this already: We like to see a tests-only patch uploaded alongside the full patch, as a proof the tests work.

And, learn how to execute the tests on your own installation.

_Archy_’s picture

This took some time and was really hard to make. But here is a patch containing a test that if PASSES the problem is still present. I will re-upload all the patches + this test to see if they fix the root problem .. tomorrow. FAILING ones will be solving the problem.

This TEST patch should now PASS as it proves the presence of the problem.

_Archy_’s picture

Status: Needs work » Needs review
catch’s picture

@_Archy_ that's a 0 byte patch, depending on how you've created the patch, you might need to run git diff --staged instead of git diff or do something like git show $commithash

_Archy_’s picture

Oh damn, sorry. Guess I was too tired. Can only upload on monday.

Status: Needs review » Needs work

The last submitted patch, 85: 2802403-green_proves_wrong_functionality-test.patch, failed testing.

_Archy_’s picture

Status: Needs work » Needs review
FileSize
3.9 KB

This patch is intended to prove the problem. Since I didn't upload it right last time (0 bytes), I've also decided to change it to fail when the problem is present.

I didn't quite find a suitable place for it, so I've created a new BrowserTest. The main part of it is to prove that anything that will try to match a request for an alias will not find it and cache 0 results if the master request is set and the language is different from the one in which the alias exists. In this case the LinkFormater through Url it will use the PathValidator to request a match for it. But this could happen from anywhere, not just the link renderer.

This patch with the test should fail.

Status: Needs review » Needs work

The last submitted patch, 90: 2802403-green_proves_wrong_functionality-test.patch, failed testing.

_Archy_’s picture

Updated the test: removed unused method, added some commenting, and most importantly added a pop to the manually pushed request and reset to language manager before trying to access the aliased page. The interdiff is for the test only. Also renamed the file.

Added the test to the patch in #73.

The last submitted patch, 92: 2802403-test_aliased_route_cache-92.patch, failed testing.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

willwh’s picture

Hi _Archy_,

It looks like your patch is based on my previous one, if somewhat loosely? This is working really well in the site we were having issues with previously. We'd been using my patch in #56; but after testing yours, this is working nicely now! (I had some issues with your previous patch)

I think we can slap this with an RTBC, nice work!

willwh’s picture

Status: Needs review » Reviewed & tested by the community
_Archy_’s picture

@willwh That is totally your patch. I've only added the tests to it. Decided to drop my patch that adds fallback on language for aliases, as it didn't work for everyone.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 92: pass-langcode-to-getPathByAlias-2802403-91.patch, failed testing.

oriol_e9g’s picture

There is a fail with PHP 7 with aggregator, I re-uploaded the patches in #92 to force run testing with Drupal 8.3.x branch. Do not credit me.

The last submitted patch, 99: 2802403-test_aliased_route_cache-92.patch, failed testing.

dawehner’s picture

  1. +++ b/core/lib/Drupal/Core/PathProcessor/PathProcessorAlias.php
    @@ -32,7 +32,7 @@ public function __construct(AliasManagerInterface $alias_manager) {
    -    $path = $this->aliasManager->getPathByAlias($path);
    +    $path = $this->aliasManager->getPathByAlias($path, $request->getLocale());
    

    As an afterthought it feels like making the langcode optional was a bad idea, because that stopped thinking about the langcode.

  2. +++ b/core/modules/link/src/Plugin/Field/FieldFormatter/LinkFormatter.php
    @@ -173,10 +173,16 @@ public function viewElements(FieldItemListInterface $items, $langcode) {
    +    // Get Parent Language Code
    +    $item_language = \Drupal::languageManager()->getLanguage($items->getLangcode());
    

    This comment seems to not match what its actually doing. It is not fetching the langcode from the parent.

  3. +++ b/core/modules/link/src/Plugin/Field/FieldFormatter/LinkFormatter.php
    @@ -173,10 +173,16 @@ public function viewElements(FieldItemListInterface $items, $langcode) {
    +      // If a field is not translatable but is being viewed in another language we need to provide the default language to display the field
    

    ... We need to sure, but why? Expanding the documentation about that would be nice.

catch’s picture

Status: Reviewed & tested by the community » Needs work

#101 is good points, also noticed the following:

  1. +++ b/core/modules/link/src/Plugin/Field/FieldFormatter/LinkFormatter.php
    @@ -173,10 +173,16 @@ public function viewElements(FieldItemListInterface $items, $langcode) {
    +    // Get Parent Language Code
    

    Shouldn't this be 'Get the language code for the field items.'? And if that's all it is, does it need an inline comment at all?

  2. +++ b/core/modules/link/src/Plugin/Field/FieldFormatter/LinkFormatter.php
    @@ -173,10 +173,16 @@ public function viewElements(FieldItemListInterface $items, $langcode) {
    +      // If a field is not translatable but is being viewed in another language we need to provide the default language to display the field
    

    As well as explaining why, this should also wrap at 80chars.

tameeshb’s picture

Status: Needs work » Needs review
FileSize
8.84 KB
1.12 KB

Changes from #102, #101

dawehner’s picture

Well, at least for me #103 still doesn't answer why this is done like that. Why is $items->getLangcode() prefered over $entity->getLangcode() etc.

Berdir’s picture

  1. +++ b/core/lib/Drupal/Core/PathProcessor/PathProcessorAlias.php
    @@ -32,7 +32,7 @@ public function __construct(AliasManagerInterface $alias_manager) {
       public function processInbound($path, Request $request) {
    -    $path = $this->aliasManager->getPathByAlias($path);
    +    $path = $this->aliasManager->getPathByAlias($path, $request->getLocale());
         return $path;
    
    +++ b/core/modules/language/src/Plugin/LanguageNegotiation/LanguageNegotiationUrl.php
    @@ -108,6 +108,7 @@ public function processInbound($path, Request $request) {
             // Rebuild $path with the language removed.
             $path = '/' . implode('/', $parts);
    +        $request->setLocale($language->getId());
             break;
    

    What about different negotation methods?

    What about the fact that we have multiple language types that can be configured differently?

    getLocale() doesn't really seem to have any documentation what "a locale" exactly is, but it does forward it to \Locale::setDefault(), which mentions "UAX #35 extensions are accepted." More docs on http://php.net/manual/en/class.locale.php.

    Language objects can have custom ID's, not sure what will happen then.

    Will setting the locale interfere somehow with Drupal?

    Is this really the correct and best option to pass this around? Maybe it would be better to use an arbitrary attribute key?

  2. +++ b/core/modules/link/src/Plugin/Field/FieldFormatter/LinkFormatter.php
    @@ -173,10 +173,17 @@ public function viewElements(FieldItemListInterface $items, $langcode) {
    +    // Get the language code for the field items.
    +    $item_language = \Drupal::languageManager()->getLanguage($items->getLangcode());
     
         foreach ($items as $delta => $item) {
    

    same here, what about content vs interface language, shouldn't this use the content language? Or actually, the language of the entity being displayed, which is nicely provided to use as $langcode. Which I think means you should use that?

_Archy_’s picture

After seeing all these feedback to the latter patch, and knowing that the problem is deeper than links.. decided to force alternatives. Digging even deeper I have come to the conclusion that there is a great problem with the path processors: they aren't contained enough.

The problem is that the AliasManager getPathByAlias() and getAliasByPath() both believe that the language manager will always provide them with accurate information. But here is what happens:

  • We get a request, this will be the master request.
  • Path processors run, we determine language and route etc.
  • During this request we do a match for a path (in this issues case to validate an URL).
  • Pathprocessors are run again, but they are fooled as the language is not what the path represents.

Here is a totally new approach to resolve the problem that pushes a new request on the request stack and resets the language manager on every process inbound and outbound. This patch will 100% fail as I haven't updated the tests for it, uploading to demonstrate.

This issue is much deeper than links and aliases. Other path processors might rely on data that doesn't actually represent what they are processing.

Status: Needs review » Needs work

The last submitted patch, 106: core-path_processors_contain-2802403-106.patch, failed testing.

_Archy_’s picture

Did not expect so many failures tough. What if the language manager returns the current language based on the current request from the request stack? This seems to solve the problem as-well, let's see what it brakes.

_Archy_’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work
rcodina’s picture

@_Archy_ In my case, updating to latest 8.3.0-beta1 and updating all contrib modules fixed the problem for me. So finally my problem may not be related to the one discussed in this issue.

EDIT: Amazing discovery: the real problem was that in my pathauto rule I had a tab character at the end of the pattern. Removing it makes the problem disappear also in prior versions of Drupal core. However, if you have at least 8.3.0-beta1 the alias aren't broken if you have that tab character (this is the way I discovered which was the problem).

pawel_r’s picture

None of proposed solutions helped in my case. #108 Worked perfectly for default language (English) but 10 other languages got permanent 404.

Waiting for solutions, meanwhile I had to fix it somehow. My tmp solution has not-so-terrible performance outcome:

//core/lib/Drupal/Core/Routing/RouteProvider.php 

// comment out setting the route cache
 $this->cache->set($cid, $cache_value, CacheBackendInterface::CACHE_PERMANENT, ['route_match']);
Mac_Weber’s picture

Component: link.module » path.module

This issue does not belong to Link module, this is probably a Path subsystem issue, or a Routing System issue. @catch will better know how to address this.

BTW @catch, is PathProcessor is considered part of path.module?

david.gil’s picture

Hi pawel_r,

i use your workaround #112, and it solve the worst problem (aleatory 404 pages), but now sometimes i see in my menu links not pretty-path routes (/node/xxx),
does anybody has a better temp solution?

Best
David

david.gil’s picture

Hi,

more info, and for me it is more really a critical BUG, so any help is appreciated. We move a big site (https://www.museothyssen.org/) to production today and it really happens in multiple pages. I try to explain my problem better:

As i said in #114 with line commented

//core/lib/Drupal/Core/Routing/RouteProvider.php 

// comment out setting the route cache
 //$this->cache->set($cid, $cache_value, CacheBackendInterface::CACHE_PERMANENT, ['route_match']);

we are not obtaining 404 error pages, BUT as we have varnish in front of our sites, we are caching lots of pages without pretty-urls.
If i are logged (so varnish is not caching), the pretty paths in rendered pages are Ok and in some pages we have the same menu cached with pretty and without pretty urls.

Any idea how can i temporally patch this.

Attached a couple of screenshots of the problem in https://www.museothyssen.org/ just relesed today in production.

Thx in advance

_Archy_’s picture

I've been investigating this issue for quite some time by now. And the problem is two distinct things:

  1. The language manager doesn't provide current language based on the currently matched request (but rather on the master request)
  2. The alias manager currently can't tell whether your /about-us refers to /about-us or {any-langcode}/about-us

So these need to be resolved.

#1. For the language manager we need to be able to override the request and store per request the the negotiated language. So the path processor manager would inject the request into the language manager for which path is processing, then reset to the current request when all processors ran. (this is sort of what #108 does)

But this alone doesn't resolve the problem.
#2. The alias manager has to provide fallbacks for an alias that has multiple translations.
First it should find whether in the current language the alias exists. If not find whether in the original language it exists (this is what my first patch does #45). But this still isn't enough: Then finally it should select the first one from any other available languages. And finally fail and return the alias as is.

So this way the user could put in a link field /about-us and get the best matched translation (because of #2) and if he prefers any translation of it he could specify {langcode}/about-us (because of #1).

#1. Is something that should be done in a separate issues I think, added the issue.

_Archy_’s picture

@david.gil Sorry I deleted you files unwillingly as I was writing my comment before there were uploaded. Can you put them back? Also did you try the patch in #45 , that worked for our case.

david.gil’s picture

here they are, i will give a try to #45

jiisuominen’s picture

We had this exact issue and were able to reproduce it with 8.2.7. We have link field attached to a Paragraph field, and those paragraphs are attached to nodes. And the target node gave 404 if translated node where the paragraph was added was accessed before the link target node after cache clearance. This applied only when the link was added manually without auto complete. Also if the target node had pathauto option enabled this issue went away.

Patch in #45 seemed to do the trick for us too.

catch’s picture

PathProcessor isn't part of path module, due to a very long-standing situation where alias storage and handling was in core and the UI was in path module.

There's a case to move this to 'request processing' since it's a default implementation of that, but let's leave in path module for now.

david.gil’s picture

Hi,

more info in my install. I have been testing workaround in #112 and it is a bad solution, at least in my case it makes pretty-urls randomly not work.

Today i use patch in #45 and it seems to work!. My sites have 2 languages, main is spanish and secondary english. I can reproduce exactly the problem in "Steps to reproduce" and with #45 it is solved.

Best

catch’s picture

#45 looks like it's probably a good solution to this and I'm not sure why it was dropped.

The case in #66 could be split to a new issue.

_Archy_’s picture

@catch #45 is just a partial solution and only works in simpler instances. Check #116. For instance for @willwh case it didn't work as he mentioned in #62.

Maybe we could extend #45 with the fallback to the first available alias from any language, so that should take care of wider range of combinations. And for the language manager problem mentioned in #116 I've already created a separate issue.

david.gil’s picture

Hi all again,

i was wrong with my comment #121 it seens that in my install there are 2 problems, the one of this issue that it is solved with #45 and another one that makes that in some cases the url alias are rendered without pretty paths, there must be another problem related with path subsystem that i cannot find and makes that sometimes urls are rendered using internal path (node/XXX), anyone can refer me a related isuue?

catch’s picture

Status: Needs work » Needs review
FileSize
8.43 KB

I read through nearly the whole issue again, on re-reviewing #45, don't think we should get the original language alias when the current and default language don't return anything, that's a big change of behaviour for the path alias system.

I got back to #25 which looks a lot closer to what I'd expect, also a very minimal change to avoid the 404 which is the critical bug here.

Here's an untested patch combining the fix from #30 with the browser test from #92, might not even be a perfect re-roll, got a lot of conflicts since those patches are from a while back but see what comes back.

catch’s picture

Short review:

  1. It's obvious that there's no other way to make the language available to InboundPathProcessorInterface at least without adding a new interface with a new method to allow it to be passed explicitly. Not sure what I think about that yet.
    +++ b/core/lib/Drupal/Core/PathProcessor/PathProcessorAlias.php
    @@ -32,7 +32,7 @@ public function __construct(AliasManagerInterface $alias_manager) {
    +    $path = $this->aliasManager->getPathByAlias($path, $request->getLocale());
    
  2. +++ b/core/modules/language/src/Plugin/LanguageNegotiation/LanguageNegotiationUrl.php
    @@ -110,6 +110,7 @@ public function processInbound($path, Request $request) {
    +          $request->setLocale($prefix);
    

    We don't use $request->setLocale() anywhere else in core. Should we be setting that in all of the language negotiation plugins if we use it here?

Fixing one stray array syntax code style issue from the new test and uploading a test-only patch.

catch’s picture

That wasn't a test only patch, sorry for the noise.

The last submitted patch, 127: 2802403-126-test-only.patch, failed testing.

bmcclure’s picture

When I'm on my Redirect admin page (/admin/config/search/redirect), I'm getting the following error with the latest patch here:

Error: Call to undefined method Drupal\Core\Field\BaseFieldDefinition::get() in Drupal\link\Plugin\Field\FieldFormatter\LinkFormatter->viewElements() (line 181 of core/modules/link/src/Plugin/Field/FieldFormatter/LinkFormatter.php).

I'm not entirely sure why yet, but apparently `$item->getFieldDefinition()` is returning an object without a get() method in some cases or something. Still trying to investigate.

Berdir’s picture

Status: Needs review » Needs work

That's not the problem, the problem is that get() is a config entity method, and only configurable fields are config entities, base fields are not. Redirect has a link base field. I'd expect that menu links and shortcuts could cause similar errors when viewed somehow, e.g. in a view.

Use isTranslatable() instead.

bmcclure’s picture

Status: Needs work » Needs review
FileSize
8.43 KB

The fix was to change the conditional to use isTranslatable() instead of get(). There was no get() method. Here's a slightly modified patch with this change.

bmcclure’s picture

@Berdir Thanks! Found the issue at the same time I think :) Last patch changes only that method call from the previous patch.

Guybrush Threepwood’s picture

So the 8.3 patch does not work if you have a custom language prefix. I can look into it, but if anyone gets that taken care of first that would be great! For example: I have made the basic page /test-page and then switch to /uk/test-page and it immediately gives a 404. I don't even need to click the link in the outlined steps.

Guybrush Threepwood’s picture

I made a small change to bmcclure's patch to allow for custom lang codes. The reason I didn't make a test for this is because 8.3 came out. I will make a test for this case unless someone gets to it first. At any rate this change should allow for custom lang codes.

Guybrush Threepwood’s picture

Status: Needs review » Needs work
nkraft’s picture

Just checking on the state of this, as I've started encountering this bug on a massive, multilingual site we will be putting into Production in just a few weeks. This issue is causing us some major stress.

I'm not adept at this whole patching review process. Which of these patches should I apply? Is #131 the patch I should be use? Has it been tested adequately? Is there any hope of getting this bug patched in a Core update soon?

Thanks for any insight you can provide!

Guybrush Threepwood’s picture

you should probably use patch 135. You don't have to use 135 if you don't use custom langcodes, but I imagine that once I write a test for 135 that the patch based on 135 will be used.

nkraft’s picture

Thanks!

One quick question -- is there a way to AVOID this issue without the patch? I saw some people alluding to the link field linking to a page in a multilingual site... seems like there's a lot of moving pieces. But could I recreate a problem node (make a new copy of it), and avoid doing something to it so as not to incur this issue?

Guybrush Threepwood’s picture

I don't think so. I would apply the patch. We have it on our site and it seems to work pretty good.

Guybrush Threepwood’s picture

I tried doing a fresh install and running the PHPUnit tests, but I didn't see any issue even when I put $this->assertSession()->statusCodeEquals(202); at the bottom of AliasManagerTest->testAliasRouteCache() and I didn't see any additional assertions. (I figure that the page can't have 202 and 200 status codes). I put in a test function that I think will test for the custom langcodes, but again, I didn't see anything pop up when I made another function inside of AliasManagerTest.

Why isn't this patch going through the test queue?

Guybrush Threepwood’s picture

Trying to upload the patch again.

Guybrush Threepwood’s picture

Status: Needs work » Needs review

The last submitted patch, 140: 2802403-140.patch, failed testing.

Guybrush Threepwood’s picture

Hmm looks like I might need to redo the 140/141 patch. I will get to this tomorrow.

Guybrush Threepwood’s picture

ok so 141 seemed to have passed. I put in a test that I think would test the issue I fixed, but it seems that the tests in the previous patch are not even being executed. I am new to the Drupal PHPUnit testing, so I could be missing something.

Guybrush Threepwood’s picture

I will not be able to work on this next week, but I will be able to look at this after that.

dhansen’s picture

141 has a LOT of garbage in it like permissions changes and the addition of drush as a composer requirement, and that's before I even get to any reviews. Reissuing that patch.

david.gil’s picture

Hi,

i think that the approach that start @catch in #127, and so all the patchs sended after it had some problems, at least in my installation. I give more info. I have a multilingual site with 2 languages (spanish & english), the default one is spanish. The language detection is based in url, and i have no prefix in default language, so my urls in spanish dont have a prefix.

With the approach in latest patches i found that:

- The locale of Request objects is initialized in processInbound of class LanguageNegotiationUrl, but when there is not a path prefix it is not initialized.

- When PathProcessorAlias ask aliasManager for a path using: getPathByAlias if the Request has the locale not initialized as is in my case, the defaultLocale of Request is used, and in Symfony:Request class it defaults to "en".

With latest patch #148 based in #127 if you have an installation where the default language is not English and default language has no path prefix ALL pages in default language resolve as a 404!.

I tried several ways to initialize locale in LanguageNegotiationUrl but all that i tries has side effects. Any ideas?

Best

david.gil’s picture

Status: Needs review » Needs work
holist’s picture

I can confirm that the patch results in just 404's with non-English default language.

Also I found out that the issue arises if the link field is translatable but target node is not translated to current language (I'm on a site where there are four languages out of which two are only partially supported). If the link is entered as the alias "/test-page" it will be first "/{current-lang}/node/{id}" after save but convert into "/test-page" after cache rebuild. So effectively linking to default language version of a node from another language breaks the target page. Using autocomplete for the link the issue does not appear in this case either.

Guybrush Threepwood’s picture

@dhansen: Thanks!

catch’s picture

So I think we should add an else branch there to setLocale() when there's no path prefix to the default language, but not able to roll that patch atm. Will also need test coverage for that case.

david.gil’s picture

Hi catch,

i think it is not so simply (at least in the manual tests that i did. i already try what you suggest), the problem is not with the main request but with the sub-request, in some comment _Archy_ talk about that (#116 and some others).

holist’s picture

Should someone be trying to solve a problem like this on their site but don't have Link fields (and to keep count on the severity of the issue): It looks like this issue may also affect a site that uses Pathologic module, which causes links in A tags in formatted text fields to be processed as internal: URIs.

Haven't tested yet, but just read through the code and it seems a plausible explanation for the problem on a site that was suffering from this same issue. There actually weren't any Link fields causing the problem, but there is Pathologic to support separate URLs for editors and anonymous users.

This would be related to the feature in the issue that arises when linking by path to default language version of a node that is not translated to current language that I explained in #150.

Might actually warrant even changing the title of the issue, as it is not about Link field but linking by path across languages.

Arlina’s picture

Patch #147 solved this issue on a site using custom language codes.

lcontreras’s picture

@david.gil Did you solve this?. I have a site like yours. Spanish like default language and english.

esolitos’s picture

Version: 8.3.x-dev » 8.4.x-dev

Still happens on 8.4 :(

david.gil’s picture

Not exactly the same problem, but i report here because some of my messages in the thread talks about another problem with alias and internal paths, probably someone can benefit of this: https://www.drupal.org/node/2879512

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

_Archy_’s picture

Issue tags: +Drupalaton 2017

I will try to do some progress on this.

DeFr’s picture

Had the same problem, but using translation by domain instead of path prefix, so added that to #147. Attaching the resulting patch.

catch’s picture

@DeFr thanks for updating.

Holist or david.gil are you able to post step-by-step instructions to introduce the bug you found with 404s so we can convert that to automated test coverage?

david.gil’s picture

Hi catch,

they are in the description of the issue

RumyanaRuseva’s picture

Unfortunately as stated by @david.gil in #148 the latest patches result into 404 if the site has non-English default language with an empty language prefix.
I can also confirm that the patch in #45 works in this case. @lcontreras and @holist you may want to try that - even though it's not perfect for all cases, it might be ok for you until a better solution is found.

holist’s picture

@catch: like @david.gil said, steps in issue description result in 404's.

But also in addition to that, I found that it does not matter if the link fields are translatable if these hold:

  • the link is inserted as internal path (step 4), like /some/path (which is saved as
    internal:/some/path</code,</li>
      <li>the content at the target path is in another translation (say FI) than the linking page (say SE), and</li>
      <li>there is no translation of the target path content in the same language as the linking page</li>
    </ul>
    
    And it works if the links are inserted as references to entities (saved as <code>entity:node/123

    ).

    I am using patch #45 successfully, and I have instructed content editors to use autocomplete, no paths.

maacl’s picture

We encoutered a similar problem and are not sure how to proceed, but maybe this provides a more global view of the problem.

The scenario:

  • 14 languages, and do never want to provide a fallback
  • to prevent the fallback, we check in preprocessor if a translation is available and oterhwise throw a NotFoundException
  • browser language as negotation, when the URL does not provide a language
  • set custom aliases on a special enity bundle. These aliases are flagged "german".

Now a visitors recieves a url to this special entity via mail. This URL does not contain a language prefix.
The following happens, with cold caches:

A) German visitor with german browser settings visits first:

  • The RouteProvider (core/lib/Drupal/Core/Routing/RouteProvider.php) looks for a path, uses the result of language negotation. (=de)
  • Both routes without language prefix, and with language prefix included are written to cache
  • The german visitor gets redirect to the prefixed URL and everything is fine

B) English visitor with english Browser settings visits first:

  • The Pathprocessor looks for a path, fetches the result of language negotation. (=en)
  • There is no alias for english, so a empty route gets cached.
  • The english visitors sees 404
  • Afterwards a german visitor calls the URL without prefix. But now there is a empty route cached, so the german visitor sees 404 also.

So a page call on cold cache, where the language negotation results in an empty route (=no translation available) breaks the language negotation of following page calls of the same route.

For now we are using a rewrite rule to make sure in this special case the user gets redirected to the "de" prefixed url, and test a patch that prevents the cache entry of empty routes. With the patch applied in our scenario a english user as second visitor would be redirected to /en/entity/id, trigger our custom NotFoundException and everything else still works.

timcosgrove’s picture

(xpost from https://www.drupal.org/node/2834638#comment-12268007)

Just to add a voice here: we run a D8 site in 29 languages, and we were seeing this issue, and in other cases besides link fields. We run US English as our default, serve that from /en/foo aliases, and redirect from /foo aliases to /{prefix}/foo aliases according to language negotiation rules.

We were seeing similar behavior, where in some cases /foo aliases were cached as 404s when the intended destination entity was not translated into all languages, and an untranslated redirect request (i.e. /en-gb/foo) was requested and cached.

We implemented the patch from https://www.drupal.org/node/2802403#comment-12217386 and this appears to have successfully resolved our issues. Of note, for this thread: some of the aliases in question were not contained in link fields anywhere.

https://www.drupal.org/node/2802403 (this issue) appears to be focused on link fields as that was where the bug was first seen, but the issue does ultimately seem to be with bad route caching, as Catch has said
.

I would probably recommend https://www.drupal.org/node/2834638 be closed as a duplicate of https://www.drupal.org/node/2802403 and the title of https://www.drupal.org/node/2802403 (this issue) be modified to reflect that it affects more than link field data. I would defer to Catch here.

catch’s picture

Title: Link Field Causes Pages to 404 with Bad Route Caching on Multilingual Sites » Combination of language negotiation and path aliasing can cause a corrupted route cache, 404s

Agreed with #167. I think this is the same issue as #2834638: ERROR 404 after a while, clear cache fixes it. If for some reason there's a bug remaining after this issue is fixed, we'd still need to rule out the approach here anyway. Marking #2834638: ERROR 404 after a while, clear cache fixes it as duplicate and re-titling.

catch’s picture

Component: path.module » request processing system

This also belongs more in request processing, path aliases are just consuming the API, and the information it needs isn't available at the moment.

jbd44’s picture

Is this a related issue: https://www.drupal.org/node/2910820? It's hard to tell.

gagarine’s picture

I do have a website with this issue.

You can find the full code source (with config exported in /config/sync ) on https://github.com/oxima-sarl/nouveaumonde.ch (don't try to access our prod server, we are using ssh key, no password ;) )

I attached a dump of the database user 1 is "admin" with "drupal" as password: https://www.drupal.org/files/issues/ozcf_db_prod_2017-09-23.sql_.gz

The /fr/event/youtube-perle (first event in the listing) is a 404.

Clearing the cache do not solve the problem. (i do fix, but only if the website is reloaded with an anonymous user) Accessing the node on /fr/node/18 works.

Let me know if I can provide more info.

gagarine’s picture

In fact clearing the cache fix the problem ONLY if I reload the page with an anonymous user after truncating the cache_* and cachetag tables.

Perhaps this is because my admin user have the settings to have the admin always in english?

EDIT: I can confirm. Having "Administration pages language: English" create 404 on cache rebuild. If I have no preference for administration page language cache rebuild works.

#161 fix the problem.

maacl’s picture

catch’s picture

jacktonkin’s picture

I have had a similar issue with 404s and the patches from #127 fix the issue on my site without side effects. I have English as default language and two translations. While reviewing the patches linked at the top of the issue I noticed that the most recent patch is missing the test from #127. I have attached a patch combining #161 and the test-only patch from #127. This includes the following additions to #127 (without additional test coverage):

Having read through the issue the main outstanding area is the reports (#148, #150) that this patch doesn't resolve the issue when using languages other than English as the default.

catch’s picture

Just to say #175 is my understanding of the issue as well. Do we possibly need to initialize $request->locale to the default language before negotiation runs to help with #148?

jacktonkin’s picture

I just realised I uploaded the wrong patch in #175. That one was #45 + the test-only patch from #127.

The attached patch is as I described: #161 + the test-only patch from #127. Apologies for the noise.

jacktonkin’s picture

And here is a patch that modifies AliasManagerTest to install Drupal in a language other than English, in an attempt to give a failing test for #148.

Status: Needs review » Needs work

The last submitted patch, 178: failing-test-2802403-178.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

hctom’s picture

I just created another issue that might have the same root cause: #2915190: Menu links losing active trail when not edited in path alias language (sorry for the bad title, but I did not really know how to title it because of its complexicity, hehe). Currently I only added detailed steps to reproduce losing the menu active trail when using aliased paths as menu links. Perhaps someone might have a look and see if this is related or gives you some more information/ideas on what happens here.

efpapado’s picture

#177 Doesn't fix on 8.4.3

huyby’s picture

Nor4a’s picture

My workaround for this problem right now is to create a different alias for the page which does not work (e.g. displays 404 page) and create redirect to the page from the old alias to the new one (if it is not created automatically).

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Berdir’s picture

I've been working on a fix in #1125428: Language-specific aliases only work with url-based language negotiation but I think my change might actually belong in this issue.

I think the real problem is that getRouteCollectionForRequest() is cached without language information and what this issue is trying to do is just workarounds for that? As far as I see, if the information is cached just for the path, then it's impossible to get it working reliably for session, domain, preferred user language or anything else that is not in the path. Then again only url/domain is currently considered for the url alias.

That said, I've been trying to make sense of the test here but it's quite confusing because the comments seem to be backwards and it seems to actually assert wrong things?

  1. +++ b/core/tests/Drupal/FunctionalTests/Core/Path/AliasManagerTest.php
    @@ -0,0 +1,133 @@
    +    // is not translated.
    +    $this->resetAll();
    +    $this->createAliasedNode('/ro-page', 'ro');
    +    \Drupal::requestStack()->push(Request::create('/ro/ro-page'));
    +    \Drupal::languageManager()->reset();
    +    $this->assertEquals('ro', \Drupal::languageManager()->getCurrentLanguage(LanguageInterface::TYPE_URL)->getId());
    +
    +    // Prove that the route won't be found in the current language. This happens
    +    // because the alias manager won't find the alias in the current language
    +    // context. In this step the route provider will cache that there are no
    +    // routes for this request. With link rendering this happens right after the
    +    // processInbound call in the PathValidator::getPathAttributes.
    +    $routes = $this->container->get('router.route_provider')
    +      ->getRouteCollectionForRequest(Request::create('/test-page'));
    +    $this->assertEquals(1, $routes->count(), 'Route for the alias in ro language was not found.');
    +
    

    So we switch the language to ro, but then expect that the lookup for /test-page, which is in the default language actually works? IMHO that should actually *not* find anything? Seems like that would only work because somehow the cache clearing isn't fully working?

    If I change this assert to 0, then this test is passing for me with my patch from #1125428: Language-specific aliases only work with url-based language negotiation, without any additional changes.

  2. +++ b/core/tests/Drupal/FunctionalTests/Core/Path/AliasManagerTest.php
    @@ -0,0 +1,133 @@
    +    // Prove that the aliased page is not found.
    +    $this->drupalGet('/test-page');
    +    $this->assertSession()->statusCodeEquals(200);
    

    another confusing example, the comment says it's proving that it's not found, but clearly we want to prove that it is found? Comments need to describe the expected behavior now, not how it worked before fixing the bug.

_Archy_’s picture

@Berdir This was the first test I have ever created and first I've created it to prove the issue and not to prove that it is working. So first I`ve commented everything in the sens of failing, but later I have realized that it is not the way it should be done, so I have updated the test to succeed when it`s working correctly.

This test mimicks the steps from the original issue reproduction steps. The part that you describe in point 1. is when the route for the path is requested by the path resolver during rendering of the link on a non default language page. That lookup should actually find the route for /test-page because it does exist.

In my opinion the root issue comes from what i described in this comment

Berdir’s picture

> That lookup should actually find the route for /test-page because it does exist.

But your current active language is RO and in that language, there is no route/alias, so why should it find something? Doesn't make sense to me.

Berdir’s picture

See also my test in the other issue for the session negotiation where I'm testing the alias twice with both languages in different orders. I think also here you should have an explicit tes that ro/test-page does not exist and also /ro-page doesn't

_Archy_’s picture

> But your current active language is RO and in that language, there is no route/alias, so why should it find something? Doesn't make sense to me.

Yes it should, because I am using the getRouteCollectionForRequest on the route provider service. This method is used also by the url matcher / validator. That is why I am explaining that the path processor logic is flawed as the language is not recalculated the second time this service is called / the language is not per request calculated.

_Archy_’s picture

Here is again my explanation on what is happening:

  1. We have a request for: /ro/a-page. This is the primary request.
  2. Path processors are run, route is found, language is calculated as RO.
  3. This route is rendering a page where we have a link: /test-page. The link is validated before rendering consecutively calling the getRouteCollectionForRequest with a new Request object set with the path.
  4. Path processors are run on this path. BUT the language is not recalculated / or calculated for this path. NOT GOOD
  5. The alias path processor fails to convert the path because it uses the language manager which says that the language is RO, tough for this request it`s EN. NOT GOOD
bircher’s picture

A while ago (last October) we experienced that issue as well.
I created a behat test for our site but unfortunately never had the time to convert it to a core test.
But it can be pretty instructive to reproduce it.
https://gist.github.com/bircher/a1e0874bae91e086ef03117260e28c48

anavarre’s picture

I just reproduced thanks to your Behat test. Thanks, @bircher.

Berdir’s picture

@_Archy_: I agree that this is a problem, but not with your conclusion and that test. /test-page does *not* exist in the russian, why should it return a match? That only works for the same reason that it works with a real link field when saving and that is for the wrong reason, I'll get to that.

Been debugging this a bit more because I didn't understand why exactly this change fixes the problem, it just changed the language in the link formatter but that is irrelevant, we are not viewing the link, we are saving/validating it. And I'm starting to understand the whole thing a bit better.

We have two different problems that are overlapping a lot.

* This issue, having an aliased link to a different language that then results in a 404 when accessing it in the correct language.
* My problem, that using domain or other negotiation that has the same path for different languages allows to trigger a 404 by simply accessing it with the wrong language domain first.

For both cases, the real problem is the fact that \Drupal\Core\Routing\RouteProvider::getRouteCollectionForRequest() does not cache by language, I'm quite convinced about that. The alias resolving itself would consider the current language, which is different then in both cases And I'm also convinced that this is not the right fix (I'll get to why a bit later), but my issue is possibly also not a complete fix, not sure.

We have two approaches to fix it. Mine from the other issue that makes that cache vary by language. And this, which somehow also fixes it. I understood why mine fixes it (because there are then two caches, one for each language), but I didn't understand why this fixes it. Now I do.

So what happens is that we validate in \Drupal\link\Plugin\Validation\Constraint\LinkAccessConstraintValidator::validate() that the link is valid. This calls $item->getUrl(), which calls Url::fromUri() which then calls \Drupal\Core\Path\PathValidator::getUrl(), which creates a new request object with just the path and no other information. Then that goes into \Drupal\Core\Path\PathValidator::getPathAttributes() which does inbound path processing. in HEAD, this can't resolve the path because there is no alias for ro/the current language. But now with the patch, it can. But it can only do that because this is a new request object that *always* has locale en. Now we continue with the unaliased internal path, which is /node/ID. We now check access and end up in \Drupal\Core\Routing\RouteProvider::getRouteCollectionForRequest() with our $request object that now returns /node/4. So the cache ID is different because the path is different at this point. And since the internal path works in both languages, accessing it both with the alias and the internal path will then work on the correct language.

But what this means is that this *only* works because:
a) we do inbound path processing twice on url validation.
b) it will only ever work for links that have an english alias. No other language combination can be fixed by this approach.

The "problem" with my patch is that it actually can't resolve /test-page when editing a translation. That means we technically can't do the access check, in case my target would be unpublished and not accessible to me. But there could be an alias /test-page for two different languages, what on earth should happen then, anyway, I don't know what the correct behavior would be. The only thing I can think of right now would be to store the de-aliased internal path on the field, then we know that it will always be the same no matter which language. But afaik we explicitly decided against doing that once.

_Archy_’s picture

@Berdir: Thanks for your detailed input. The issue is clearer now. I agree that we should cache route collections per language as-well, because the path doesn't always contain the language information (example: domain negotiation as you said). But this alone won't fix the issue.

I once proposed in this comment to make the path processing per request and the language manager to be request stack aware so that it returns the language of the current request from the stack. Now I am re-proposing that solution. This patch tough needs some work and I am not yet sure that it is doable as of the many test failures. @Berdir can you take a look and tell me what you think?

I am pro for that solution because in this manner the services that are used durning the path processing (ex: language manager) can be specific for that request.

_Archy_’s picture

a. Given that:
* We have url language negotiation.
* Have the following languages: EN, RO, RU; with en being default.
* And the following alias for all the languages /test-page.

If the user enters /test-page in the field it should match the EN one. OK
If the user enters /ro/test-page in the field it should match the RO one. OK
If -||- /ru/test-page -||- the RU one. OK

b. Given that:
* We have url language negotiation.
* Have the following languages: EN, RO, RU; with en being default.
* And the following alias for RO language /test-page.

If the user enters /ro/test-page it finds the alias.
If the user enters /test-page it finds nothing. Is this OK? It should be because if let's say we have another alias in RU then how do we know which one to select. => OK

c. Given that:
* We have domain language negotiation.
* Have the following languages: EN, RO, RU; with en being default.
* And the following alias for all languages /test-page.

If the user enters /test-page it finds ??. In this case what do we do?
c.1 Select the one in current language and if not found in current language skip. Not OK in my opinion.
c.2 Use the url negotiation path processor and allow the user to enter /en, /ru, /ro. OK I guess?
c.3 In this case the domain should also be entered in front of the path: and if it's not just search in the current language. OK I guess?
I am not sure currently how links are managed when using domain negotiation, maybe someone can clarify.

Berdir’s picture

For c) with domain negotiation it's the same as a prefix, if the alias is for a specific language that is not the curren't domain it will not find it.

However, the only thing that then doesn't work is access validation on save. Meaning, if linking to an unpublished content that I'm not allowed to access, i'd get a validation error. To be honest, I find that link/access validation very strange in general, if you actually *know* the alias* then why do we bother with not allowing you to link to it? That doesn't mean that you can actually access it. The link still validates successfully and is saved.

Yes, my change with the caching doesn't solve all problems. But is solves the biggest problem here and that's the cached 404's. Discussed also a bit with @dawehner and @catch in IRC and @catch suggested a separate issue and I think that makes sense.

Another separate issue that I want to open is to try and remove the double-inbound-processing on url validation. I suspect that's a left-over from an earlier implementation that didn't use actual route matching to look up the path. More consistency IMHO and a tiny performance improvement probably. We are doing quite a lot of link validation and generation.

Then, I will try to come up with a test that basically tests the full scenario as a web test with a link field without any shortcuts, to avoid incorrect assumptions and that we have test coverage for the actual bug and combine that with my test coverage from the other issue.

Berdir’s picture

Ok, here we go, two new tests, one based the LanguageSwitcherTest from the other issue and the other initially based on PathLanguageTest with parts from the previous test here but essentially completely rewritten. That test also uses a @dataProvider to run it in both directions. First with source language english and french translation, then french original and english translation.

To be able to reliable test in both directions, I also added a language prefix for english. IMHO, that should be the default behavior anyway. It does result in slightly different behavior because it means that the /target-page link on the translation does not work. We could fix that be explicitly using "$source_langcode/target-page" as alias, but not sure if we should do that.

Adding the routing cache fixes those tests. As mentioned, I think the exact behavior for link validation of translated aliases would IMHO be better handled in a separate issue, this is already long and complex enough (aka a "quagmire", as @catch so nicely described it).

Berdir’s picture

Status: Needs work » Needs review
Berdir’s picture

Added a comment as suggested by @dawehner in the other issue.

The last submitted patch, 197: routing-cache-language-2802403-197.patch, failed testing. View results

dawehner’s picture

  1. +++ b/core/tests/Drupal/FunctionalTests/Routing/RouteCachingLanguageTest.php
    @@ -0,0 +1,186 @@
    +    $this->assertTrue($definitions['path']->isTranslatable(), 'Node path is translatable.');
    +    $this->assertTrue($definitions['body']->isTranslatable(), 'Node body is translatable.');
    +  }
    

    Oh I didn't knew one can have assertions in setup(). Interesting!

  2. +++ b/core/tests/Drupal/FunctionalTests/Routing/RouteCachingLanguageTest.php
    @@ -0,0 +1,186 @@
    +    // @todo Clicking on the link does not include the language prefix.
    

    Do we have a followup for that?

  3. +++ b/core/tests/Drupal/FunctionalTests/Routing/RouteCachingNonPathLanguageNegotiation.php
    index 674214bf0c..b7381b8cff 100644
    --- a/core/tests/Drupal/KernelTests/Core/Routing/RouteProviderTest.php
    
    --- a/core/tests/Drupal/KernelTests/Core/Routing/RouteProviderTest.php
    +++ b/core/tests/Drupal/KernelTests/Core/Routing/RouteProviderTest.php
    

    I'm curious whether you could also add a test for the behaviour in here ...

Berdir’s picture

1. Actually didn't do that on purpose, that part was copied from the PathLanguageTest.

2. Not yet, I think that is related to the alias/link/validation/display thing that we've been discussing in the last few comments but don't have an issue yet.

3. Thought about that. Wasn't sure if really required as we already check that there is a language and it's not the responsibility of the RouteProvider which language it is. But yes, why not, doesn't hurt.

Also fixed the broken namespace, always forget that when working with phpunit :(

dawehner’s picture

Status: Needs review » Reviewed & tested by the community

Nice additional test! Thank you @berdir!

Berdir’s picture

Also opened #2940013: Define expected behavior for link validation and display with aliases pointing to different languages, the todo should possibly refer to that. I'm not sure if we can or need to do anything about that problem, but lets continue that discussion there :) I created it as a normal bug now, if others think it is more important, feel free to raise the priority.

This should probably be added to that @todo in the test as a reference, maybe that can be done before commit leaving at RTBC so that a committer can have a look.

anavarre’s picture

Just performed manual testing and everything looks great.

The last submitted patch, 199: routing-cache-language-2802403-199.patch, failed testing. View results

_Archy_’s picture

I have also added #2940036: Path processors are fixated to current request and not the processed request with a test ready and open for discussion.

larowlan’s picture

Status: Reviewed & tested by the community » Needs work
+++ b/core/tests/Drupal/FunctionalTests/Routing/RouteCachingNonPathLanguageNegotiation.php
@@ -0,0 +1,95 @@
+class RouteCachingNonPathLanguageNegotiation extends BrowserTestBase {

Don't tests need to have a {Something}Test naming pattern in order to be discovered? Or has that changed?

Looks like our coding standards expect that too.

Other than that, this looks good to me

plach’s picture

  1. +++ b/core/lib/Drupal/Core/Routing/RouteProvider.php
    @@ -146,8 +158,11 @@ public function __construct(Connection $connection, StateInterface $state, Curre
    +    $langcode = $this->languageManager->getCurrentLanguage(LanguageInterface::TYPE_URL)->getId();
    +    $cid = 'route:' . $langcode . ':' . $request->getPathInfo() . ':' . $request->getQueryString();
    

    I don't think this fix will work with session negotiation, as in that case the URL language type will always get the default site language.

    I'd rather add all the (configurable) language types to the $cid, as we do in LanguagesCacheContext::getContext() or in D7 in drupal_render_cid_parts().

  2. +++ b/core/tests/Drupal/FunctionalTests/Routing/RouteCachingNonPathLanguageNegotiation.php
    @@ -0,0 +1,95 @@
    +    // A more common scenario is domain-based negotiation but that can not be
    +    // tested. Session negotiation by default is not considered by the URL
    +    // language type that is used to resolve the alias. Explicitly enable
    +    // that to be able to test this scenario.
    +    // @todo Improve in https://www.drupal.org/project/drupal/issues/1125428.
    

    We should be able to take care of this with the change above. I don't think the problem lies within the URL language type itself.

david.gil’s picture

If someone uses domain module, the patch provided by Bedir dont work because that module override the method. I created a related issue and submit a patch.

https://www.drupal.org/project/domain/issues/2940296

Berdir’s picture

Thanks for the review @plach:

> I don't think this fix will work with session negotiation, as in that case the URL language type will always get the default site language.

Well, this depends on what we'll do with #1125428: Language-specific aliases only work with url-based language negotiation. Right now, session negotiation has no effect on the URL language type unless it is manually configured to do so, which I'm doing in one of the tests actually, to have a way to have a language that's not in the path. If it is in there, then my test shows that it works, if it's not in there, then it's not relevant.

Note that this cache is not a render cache, it doesn't contain an actual part of the response, just inbound path processing + matching routes.

And as long as \Drupal\Core\PathProcessor\PathProcessorAlias::processInbound() hardcodes (by not specifying it and using the default in AliasManager) TYPE_URL, using type URL here is enough. Unless someone of course does inbound path processing with a different language type (or the moon phase or hour of the day, then it won't work anyway). But I'd argue with the current meaning of TYPE_URL, that's not really a valid thing to do?

If \Drupal\Core\PathProcessor\PathProcessorAlias::processInbound() changes in #1125428: Language-specific aliases only work with url-based language negotiation, then this would need to be updated as well.

Also, render caching in D8 by default only varies by the interface language, not content, that needs to be explicitly added and currently, nobody cares about also adding the url language type afaik in core :)

That said, I can add all languages if you feel strongly about that, just wanted to explain why I think that's not really necessary. It would be more about being extra-sure that we're fine if someone does something crazy/incorrect.

> We should be able to take care of this with the change above. I don't think the problem lies within the URL language type itself.

That would probably be true, but I don't think it's the correct fix, just a side-effect of it. The result if the path negotiation will not change, we'll just cache it differently. I think that test-scenario is bit weird and probably not something that happens in reality (depending on what the agreement/fix in #1125428: Language-specific aliases only work with url-based language negotiation will be). But I need a non-path based negotiation and can't use domain in tests.

#209: Yes that is wrong, will update that when we have an agreement on the cid.

dawehner’s picture

One day we might no longer use run-tests.sh/simpletest to find the list of tests. In this absurd future, we will be able to run these tests properly through phpunit :)

Berdir’s picture

Ok, discussed with @platch, we agreed to keep the URL language type here and add a @todo. I also refactored the cache id building into *2* methods, specifically to make it easier for domain.module and similar use cases to override just the part that it needs while allowing us to change the language logic in the future without having to update domain.module again after that.

plach’s picture

Status: Needs review » Reviewed & tested by the community

Looks great, thanks!

  • catch committed c13d37f on 8.6.x
    Issue #2802403 by _Archy_, Berdir, rjay, catch, Tachion, jacktonkin,...

  • catch committed b9bd2dc on 8.5.x
    Issue #2802403 by _Archy_, Berdir, rjay, catch, Tachion, jacktonkin,...
catch’s picture

Version: 8.6.x-dev » 8.5.x-dev
Status: Reviewed & tested by the community » Fixed

Committed/pushed to 8.6.x and cherry-picked to 8.5.x. Thanks!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

jatinkumar1989’s picture

Hi,

I updated core to 8.5.0 and i am facing the same issue. (#214 is already part of core, so code changes are already there)

for language “EN” all nodes/links are working fine.
but when i switch language, some links (menu links) and node links are giving `page not found`

any help, Pls.

Thanks
-Jatin

jatinkumar1989’s picture

Version: 8.5.x-dev » 8.5.0
xjm’s picture

Version: 8.5.0 » 8.5.x-dev

@mail2jatingarg, I'd suggest opening a new issue. Thanks!

thierry.beeckmans’s picture

We had a related issue on Drupal 8.3.9 where the alias was identical for a node in Dutch and another node in English and German. The pages are served in a separate domain.

* When a site visitor opened the Dutch node 1, it became cached.
* In a second visit, the English version of node 2 was requested. Instead of serving node 2, it served node 1, as both nodes shared an identical url alias and node 1 was already cached.

I have made a patch for Drupal 8.3 based on #214.

thierry.beeckmans’s picture

thierry.beeckmans’s picture

The patch generated in comment #223 for Drupal 8.3.9 was not generated correctly and included some unrelated code parts.
The patch file for 8.3.9 is updated.

xmacinfo’s picture

@thierry.beeckmans Please open a NEW issue and reference this one.

And then, post your patch in the new issue.

It is generally not a good idea to reopen a closed issue.

skuark’s picture

@thierry.beeckmans Can you check your patch in #225 is valid for 8.3.9? I am not able to apply it on a Drupal 8.3.9 installation.

[ca315ad868] -> % curl https://www.drupal.org/files/issues/2018-04-24/routing-cache-language-2802403-225.patch | patch -p1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 19966  100 19966    0     0  19613      0  0:00:01  0:00:01 --:--:-- 19632
patching file core/core.services.yml
Hunk #1 FAILED at 799.
1 out of 1 hunk FAILED -- saving rejects to file core/core.services.yml.rej
patching file core/lib/Drupal/Core/Routing/RouteProvider.php
Hunk #1 FAILED at 7.
Hunk #2 succeeded at 87 with fuzz 2 (offset -1 lines).
Hunk #3 FAILED at 135.
Hunk #4 succeeded at 128 with fuzz 2 (offset -21 lines).
Hunk #5 FAILED at 160.
Hunk #6 succeeded at 184 (offset -22 lines).
3 out of 6 hunks FAILED -- saving rejects to file core/lib/Drupal/Core/Routing/RouteProvider.php.rej
patching file core/tests/Drupal/FunctionalTests/Routing/RouteCachingLanguageTest.php
patching file core/tests/Drupal/FunctionalTests/Routing/RouteCachingNonPathLanguageNegotiationTest.php
patching file core/tests/Drupal/KernelTests/Core/Routing/RouteProviderTest.php
skuark’s picture

@thierry.beeckmans No worries, I've been able to apply the patch in #214 to my 8.3.9 installation.

sathishdevan’s picture

@thierry.beeckmans

The patch you have given does not work for domain based localisations.

bradjones1’s picture

This issue is long closed; if there are new issues, please open a new issue.

kadiiski’s picture

For the rest of the people that will be looking for a solution for this case:

I found that these broken routes are being cached in drupal and thus not being passed to the Inbound when needed. So it's actually caching a rout with no route, query, params, controller etc - just the path. This is by definition a 404 page.

I decided that in my case it's not going to be a harm if we don't cache the 404 pages.

Please not that this might affect the performance.

Carlitus’s picture

Related issues: +#3332893: 404 Not found

Thanks @kadiiski! With your patch now it works this:
https://www.drupal.org/project/country_path/issues/3332893