language_from_browser cannot get correct language from browser's code sent

Project:Drupal core
Version:8.x-dev
Component:language system
Category:bug report
Priority:normal
Assigned:Unassigned
Status:active

Issue Summary

In language.inc, line 95, in function language_from_browser():

change from

<?php
foreach ($browser_langs as $langcode => $q) {
  if (isset($languages['1'][$langcode])) {
    return $languages['1'][$langcode];
  }
}
?>

to

<?php
foreach ($browser_langs as $langcode => $q) {
  if (($langcode == 'zh-cn') && isset($languages['1']['zh-hans'])) {
    return $languages['1']['zh-hans'];
  }
  if (isset($languages['1'][$langcode])) {
    return $languages['1'][$langcode];
  }
}
?>

As you can see, for Simplified Chinese, both IE and Firefox send 'zh-cn' as the preferred language code, which differs from Drupal's 'zh-hans' code, so browser detection does not work for Simplified Chinese.

Other languages may have the same issue, but this is the only case I have found so far.

Comments

#1

I was also testing the browser detection for Simplified Chinese, and this seems to be an issue in 5.15 as well. However, yang_yi_cn's code changes seem to be specific for Drupal 6. Does anyone know how to resolve this for 5.x?

#2

Version:6.9» 7.x-dev
Status:needs review» needs work

Firefox 3 still uses zh-cn
http://www.w3.org/International/questions/qa-lang-priorities#answer

the correct xml:lang is zh-Hans
http://www.w3.org/International/questions/qa-css-lang#colon-lang
http://www.w3.org/International/articles/language-tags

Mandarin Chinese, Simplified Script: zh-Hans or zh-CN
http://tlt.its.psu.edu/suggestions/international/bylanguage/chinese.html...
http://www.w3.org/International/questions/qa-css-lang#bytheway

I am in favor of correctly mapping the language codes sent by Firefox 3 (or any other browser) to Drupal's base languages. Fixing only one language is a start, not a solution (even though it is one of the most widely used languages). We need to allow arbitrary mapping of browser languages to Drupal languages, but that should be left to a contrib module.

#3

Just change "zh-hans" => array("Chinese, Simplified", "简体中文"), into "zh-cn" => array("Chinese, Simplified", "简体中文"), in locale.inc.
It would be crazy to wait for the browsers to change.

#4

Version:7.x-dev» 6.13

I use Drupal 6.13 for a multilingual website. It works very well, but language detection does not work on Chinese systems.
I traced the source code in includes/language.inc and changed line 83 from
if (preg_match("!([a-z-]+)(;q=([0-9\\.]+))?!", trim($browser_accept[$i]), $found)) {
to
if (preg_match("!([a-z-]+)(;q=([0-9\\.]+))?!", trim(strtolower($browser_accept[$i])), $found)) {
if ($found[1] == 'zh-tw') { $found[1] = 'zh-hant';}
if ($found[1] == 'zh-cn') { $found[1] = 'zh-hans';}
In Firefox, $_SERVER['HTTP_ACCEPT_LANGUAGE'] contains zh-tw,zh-cn, but IE 8 and Chrome send zh-TW for Traditional Chinese and zh-CN for Simplified Chinese, which is why the strtolower() call is needed.
With this change it works fine.

#5

I would like to add that zh-hk and zh-sg are also used. Since Hong Kong uses traditional Chinese and Singapore uses simplified Chinese, I changed the patch of cshuangtw to:

if (preg_match("!([a-z-]+)(;q=([0-9\\.]+))?!", trim(strtolower($browser_accept[$i])), $found)) {
  if ($found[1] == 'zh-tw' || $found[1] == 'zh-hk') { $found[1] = 'zh-hant';}
  if ($found[1] == 'zh-cn' || $found[1] == 'zh-sg') { $found[1] = 'zh-hans';}

Now it seems to work with all Chinese variations quite nicely, at least I'm happy. Thanks for finding the correct piece of code.
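
For readers who want to try the idea outside of core, the normalization described in #4 and #5 can be sketched as a standalone function. This is only an illustration: the function name parse_accept_language() and the alias table are assumptions made for this example, not the actual code in language.inc, and the regular expression is slightly stricter than the one quoted above.

```php
<?php
/**
 * Hypothetical sketch: turn an Accept-Language header into a list of site
 * language codes, most preferred first. Not the real Drupal implementation.
 */
function parse_accept_language($header) {
  // Assumed alias table, built from the codes mentioned in this thread.
  $aliases = array(
    'zh-cn' => 'zh-hans',
    'zh-sg' => 'zh-hans',
    'zh-tw' => 'zh-hant',
    'zh-hk' => 'zh-hant',
  );
  $langs = array();
  // Lowercase first, because some browsers send "zh-TW" and others "zh-tw".
  foreach (explode(',', strtolower($header)) as $item) {
    if (preg_match('!^([a-z0-9-]+)(?:\s*;\s*q=([0-9.]+))?$!', trim($item), $found)) {
      $code = isset($aliases[$found[1]]) ? $aliases[$found[1]] : $found[1];
      // A tag without an explicit q value counts as 1.0.
      $q = isset($found[2]) ? (float) $found[2] : 1.0;
      // Keep the highest q value when two tags map to the same code.
      if (!isset($langs[$code]) || $q > $langs[$code]) {
        $langs[$code] = $q;
      }
    }
  }
  // Sort by q value, highest first, keeping the codes as keys.
  arsort($langs);
  return array_keys($langs);
}
```

With the header "zh-TW,en-US;q=0.8,en;q=0.5" this returns zh-hant, en-us, en in that order. Entries that do not match the pattern (for example the "*" wildcard) are simply skipped in this sketch.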

#6

Version:6.13» 7.x-dev

#7

Status:needs work» closed (duplicate)

Isn't this issue a duplicate of #221712: locale_language_from_browser() doesn't parse language tags correctly, has a broken logic?

#8

Status:closed (duplicate)» needs work

This is not a duplicate of #221712, but somewhat related.

Bug #221712 is about the language "en-us" not mapping to the generic language "en" used by Drupal. The proposed and probably correct solution there is to add the language "en" to the list of supported languages so that it will match.

This won't work for Chinese, because there is no generic Chinese language such as "zh". Chinese always has to be defined as either simplified ("zh-hans") or traditional ("zh-hant"). This bug is about browsers using outdated country-specific language identifiers (like "zh-cn" or "zh-tw") instead of the new standard and politically correct identifiers. The solution proposed in #221712 doesn't work here; instead there has to be some manual mapping of language identifiers as proposed above.

Chinese is the only language I know of where the variants ("zh-hans" and "zh-hant") can't be determined from the language code. That matters because some traditional Chinese users won't understand simplified Chinese and vice versa. By contrast, English speakers in New Zealand are quite likely able to understand US English, even though there might be small differences.
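
The difference between the two bugs can be illustrated with a small lookup sketch. The function resolve_langcode(), its parameters and the sample arrays are hypothetical names made up for this illustration and do not exist in Drupal. The point is the order of the steps: the generic fallback is enough for "en-nz", but Chinese only resolves when an explicit alias is supplied.

```php
<?php
/**
 * Hypothetical sketch of a lookup order: exact match, then alias, then
 * generic fallback. None of these names come from Drupal core.
 */
function resolve_langcode($browser_code, array $enabled, array $aliases) {
  $code = strtolower(trim($browser_code));
  // 1. Exact match against the enabled site languages.
  if (isset($enabled[$code])) {
    return $code;
  }
  // 2. Explicit alias, needed for codes such as "zh-tw".
  if (isset($aliases[$code]) && isset($enabled[$aliases[$code]])) {
    return $aliases[$code];
  }
  // 3. Generic fallback: drop everything after the first hyphen.
  $parts = explode('-', $code);
  if (isset($enabled[$parts[0]])) {
    return $parts[0];
  }
  // Nothing matched.
  return NULL;
}

// Example data, assumed for this illustration only.
$enabled = array('en' => TRUE, 'zh-hans' => TRUE, 'zh-hant' => TRUE);
$aliases = array('zh-cn' => 'zh-hans', 'zh-tw' => 'zh-hant');
```

Here resolve_langcode('en-NZ', $enabled, $aliases) falls through to step 3 and gives "en", while resolve_langcode('zh-TW', $enabled, array()) gives NULL, because no "zh" language is enabled to fall back to.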

#9

Version:7.x-dev» 8.x-dev
Status:needs work» active

Bugs are fixed in the development version first, then backported.

No patch here.