I have i18n and Entity Translation installed with fallback language.
There is a node written in only one language (Brazilian Portuguese). Going to http://www.example.com/en/node_title displays the interface in English and the content in Portuguese because of the language fallback.

The canonical tag created is:
http://www.example.com/en/en/node_title
It should be:
http://www.example.com/pt-br/node_title

As it works right now, the website can be penalized for duplicate content.

Note: the issue of the duplicated language prefix in the canonical URL (/en/en/) is already reported at http://drupal.org/node/1205982 and may be related to this one.


Comments

Mac_Weber’s picture

I just realized this happens only when the "Language Path Checking" option is not enabled in Global Redirect's configuration.
However, in some cases it is useful not to redirect and instead have the canonical URL point to the original content. For example, the admin may want to keep the interface language even though the content is displayed in a different language.

mrfelton’s picture

Status: Active » Needs review
FileSize
1.86 KB

This is a problem in Drupal 6 too, and it stems from the fact that the url() function is being used to generate the canonical URL. I have spent some time working through this issue and have come up with this patch, which does the following:

1) If a node is language neutral, then the canonical URL will point to the unprefixed version of the node, regardless of how you access the page. For example:

node/1 has url alias my/page
visiting /my/page => canonical url = http://www.example.com/my/page
visiting /en/my/page => canonical url = http://www.example.com/my/page
visiting /es/my/page => canonical url = http://www.example.com/my/page

2) If a node is language specific, then the canonical URL will point to the correct version of the node. For example:

node/1 has url alias my/page and language en
visiting /en/my/page => canonical url = http://www.example.com/en/my/page
visiting /es/my/page => 404 error, nothing to do here

3) If a node is language specific and has translations in other languages, then the canonical URL will point to the correct version of the node. For example:

node/1 has url alias my/page and language en
node/2 has url alias my/page and language es

visiting my/page => canonical url = http://www.example.com/my/page
visiting es/my/page => canonical url = http://www.example.com/es/my/page

This has only been tested with LANGUAGE_NEGOTIATION_PATH_DEFAULT, and is for Drupal 6. This also corrects the problem described in #1205982: multi-language globalredirect with pathauto creates bad canonical.
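
The three cases above can be sketched in plain PHP, outside Drupal. This is not the patch itself, only an illustration of the rule "use the node's own language, not the language of the request". The function name example_canonical_url(), the $prefixes map and the use of an empty string for a language-neutral node are all assumptions made for this example.

```php
<?php
/**
 * Hypothetical helper (not part of Global Redirect): builds a canonical URL
 * from the node's own language rather than from the language of the request.
 *
 * $node_language is '' for a language-neutral node (an assumption here).
 * $prefixes maps language codes to path prefixes; a language served
 * without a prefix maps to ''.
 */
function example_canonical_url($base_url, $alias, $node_language, array $prefixes) {
  // Language-neutral nodes, and languages without a prefix, stay unprefixed.
  $prefix = '';
  if ($node_language !== '' && isset($prefixes[$node_language])) {
    $prefix = $prefixes[$node_language];
  }
  $path = ($prefix === '') ? $alias : $prefix . '/' . $alias;
  return rtrim($base_url, '/') . '/' . ltrim($path, '/');
}

// Assumed setup: English is served unprefixed, Spanish under /es.
$prefixes = array('en' => '', 'es' => 'es');

// Case 1: language-neutral node, however the page was reached.
echo example_canonical_url('http://www.example.com', 'my/page', '', $prefixes), "\n";
// Case 3: the Spanish translation always points at its /es URL.
echo example_canonical_url('http://www.example.com', 'my/page', 'es', $prefixes), "\n";
```

Because the request language is never an input, visiting the same node through /en/, /es/ or no prefix at all yields the same canonical URL.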

Mac_Weber’s picture

mrfelton, thanks for the patch. I will test it and post feedback.

Mac_Weber’s picture

I just added the comment number to the file name so it can go through QA.

jthorson’s picture

Version: 7.x-1.3-alpha1 » 6.x-1.x-dev

Just testing ... but I think the patch is against 6.x.

jthorson’s picture

Version: 6.x-1.x-dev » 6.x-1.2
Status: Needs review » Active
jthorson’s picture

Status: Active » Needs review
chicodasilva’s picture

Has anyone tested patch #6 on Drupal 7?

anou’s picture

Hello,
I've tested patch #6 on Drupal 7. You have to add $query_string as an argument of the new function: _globalredirect_get_canonical($path, $query_string) {...

But the function removes the language prefix...

So I just changed the canonical part:

  if ($settings['canonical']) { 
    $request_path = str_replace($prefix, '', $request_path); // added this line
    drupal_add_html_head_link(array(
      'rel' => 'canonical',
      'href' => url(drupal_is_front_page() ? '<front>' : $request_path, array('absolute' => TRUE, 'query' => $query_string)),
    ));
  }

This way I don't get duplicate prefixes. I have a patch, but I'm not sure this is the right place to attach it.
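
One caveat about the str_replace() line: str_replace() removes every occurrence of the prefix, not only the leading one, so a path that happens to contain the same letters elsewhere would be altered too. A minimal sketch of stripping only a leading prefix follows. It assumes $prefix holds a bare language prefix such as 'en' without slashes, which the snippet above does not show, and the function name example_strip_language_prefix() is hypothetical.

```php
<?php
/**
 * Hypothetical helper: removes a language prefix only when it is the
 * first path segment, and leaves the rest of the path untouched.
 */
function example_strip_language_prefix($request_path, $prefix) {
  if ($prefix === '') {
    return $request_path;
  }
  // The path is just the prefix itself, e.g. "en".
  if ($request_path === $prefix) {
    return '';
  }
  // Only match the prefix when it is followed by a slash at position 0.
  $needle = $prefix . '/';
  if (strpos($request_path, $needle) === 0) {
    return substr($request_path, strlen($needle));
  }
  return $request_path;
}

// Only the leading "en/" is removed: content/enterprise
echo example_strip_language_prefix('en/content/enterprise', 'en'), "\n";
// For comparison, str_replace() also removes "en" inside the words: /contt/terprise
echo str_replace('en', '', 'en/content/enterprise'), "\n";
```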