I think this is an important issue that has arisen as a result of this other discussion, #1216094: We call too many things 'language', clean that up
At the moment we are using language code = locale for both content and interface translation. While that makes sense for interface translation, it causes some problems for content translation, mostly the inability to have content in a single language (English) while using different locales (American English / British English).
** This is possible in a single-language scenario by simply skipping the content language, but not once you add any other language, e.g. American English / British English / Spanish / German.
If we used proper language codes for content (en, es, pt, etc.) while using locales that depend on both language and country, we could cover the most common use case: showing the content for that language while formatting everything else according to the locale (date and number formats, currency, etc.).
So far I've never seen a site that translates articles and content in general from British English to American English, or from Spanish/Spain to Spanish/Mexico. While you may want to show different content for each country, that is a different case, unrelated to the content's language.
Thus I think we should split the two concepts, with different lists/codes for each.
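To make the proposed split concrete, here is a minimal sketch (in Python rather than Drupal code, purely for illustration) in which the content is chosen by language code while dates and numbers are formatted by locale. The `LocaleFormat` class, the `LOCALES` and `CONTENT` tables, and the format patterns are all hypothetical; nothing here reflects an existing Drupal API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LocaleFormat:
    """Hypothetical per-locale settings; the patterns are illustrative only."""
    language: str        # content language code, e.g. "en"
    date_pattern: str    # strftime pattern used for dates
    decimal_mark: str
    thousands_mark: str


# Assumed locale list: several locales can point at the same content language.
LOCALES = {
    "en-us": LocaleFormat("en", "%m/%d/%Y", ".", ","),
    "en-gb": LocaleFormat("en", "%d/%m/%Y", ".", ","),
    "es-es": LocaleFormat("es", "%d/%m/%Y", ",", "."),
}

# Assumed content store, keyed by language code only (no country part).
CONTENT = {"en": "Hello", "es": "Hola"}


def format_number(value: float, fmt: LocaleFormat) -> str:
    """Format with two decimals, then swap in the locale's separators."""
    text = f"{value:,.2f}"  # e.g. "1,234.50"
    # Use a placeholder so the two replacements do not clobber each other.
    return (
        text.replace(",", "\0")
        .replace(".", fmt.decimal_mark)
        .replace("\0", fmt.thousands_mark)
    )


def render(locale: str, published: date, price: float) -> str:
    """Pick content by language, format everything else by locale."""
    fmt = LOCALES[locale]
    body = CONTENT[fmt.language]
    return f"{body} | {published.strftime(fmt.date_pattern)} | {format_number(price, fmt)}"
```

With these assumed tables, "en-us" and "en-gb" render the same English body and differ only in the date order.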
References:
Language codes, http://en.wikipedia.org/wiki/Language_code
Locale, http://en.wikipedia.org/wiki/Locale
IETF language tag, http://en.wikipedia.org/wiki/BCP_47
Comments
Comment #1
gábor hojtsy
How do you think this relates to the separation of content vs. interface language that Drupal 7 already supports? It sounds like your case could be handled by setting up different languages on your site, some of which would be used for the interface only and not for content. I'm trying to figure out how this relates to, or differs from, that.
Comment #2
sun
If I understand you correctly, you want to remove the territory/country part from language codes pertaining to content language?
In a concrete {node} data example:
The above would no longer be possible. Possible would be only:
Not sure I understand the problem, or what the benefit is of doing that. Sounds like a step backwards to me.
Also note that there are only a few language codes in iso.inc that actually contain an extension.
Clarifying the issue summary would be helpful. (it's editable now)
Comment #3
sun
Sorry, cross-post.
Comment #4
jose reyero commented
Yes, we have content/interface language separation. The issue is that we are using the same 'language list' for both and mixing up language codes with locales.
One example, a site with two languages (list 1):
- en, English
- pt, Portuguese
And four locales (list 2):
- en-us, English/US
- en-gb, English/UK
- pt-pt, Portuguese/Portugal
- pt-br, Portuguese/Brazil
The problem is that we are working with a 1-1 mapping from locales to languages, using the same list everywhere. It would be better to have several locales map to one language. We can always infer the language from the locale, but not the other way around.
While for selecting the user's language we may display the locale list (list 2), for selecting the content language we need the language list (list 1). Currently we are using the same list for both.
It doesn't make much sense to have someone who posts an article choose between 'British English' and 'American English'. But when displaying that very same content, we can use the locale for things like date or number formats.
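The many-to-one relationship between the two lists can be sketched as follows. This is an illustration only: `LOCALE_NAMES`, `language_of` and `group_by_language` are hypothetical names, and the United Kingdom locale is written as "en-gb" because GB is the ISO 3166 region code.

```python
from collections import defaultdict

# List 2 from the example above (assumed spelling of the codes).
LOCALE_NAMES = {
    "en-us": "English/US",
    "en-gb": "English/UK",
    "pt-pt": "Portuguese/Portugal",
    "pt-br": "Portuguese/Brazil",
}


def language_of(locale: str) -> str:
    """Infer the language code by dropping everything after the first hyphen."""
    return locale.split("-", 1)[0].lower()


def group_by_language(locales) -> dict:
    """Derive list 1 (languages) from list 2 (locales), keeping insertion order."""
    groups = defaultdict(list)
    for locale in locales:
        groups[language_of(locale)].append(locale)
    return dict(groups)
```

Going from locale to language is a plain function, while the reverse is ambiguous: "en" alone cannot tell whether "en-us" or "en-gb" was meant, which is why the locale list has to be the more specific of the two.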
Comment #5
gábor hojtsy
Well, in terms of locale, all the UI translations and possibly some of the configuration translation should support all locales, right? British English will have a different translation for the UI compared to US English, right? They might need different labels for form fields just as much, no? It does not sound like it is only about number formats or date formats. We do have separate translations for British English and for variants of Portuguese, as @sun pointed out, so those would map to locales too, right?
Comment #6
sun
Well, that's where we disagree. It does make sense to me. And this is only possible if you actually install both of these languages in the first place. Otherwise, you only have one or the other either way.
Comment #7
jose reyero commented
Yes, interface translations (including user-defined strings for labels and field descriptions) should map to locales, so our localization strategy wouldn't change that much.
About #6, I still haven't seen such a web site with American/British content. However as long as our 'language list' and our 'locale list' are configurable, that would be possible too.
We should be able to configure the 'locale to language' mapping (though the default may be just stripping the country code from the locale), and then we would have the tools to support any use case that comes to my mind, including separate American and British English content. And anyway, having different content per country is not the same as having different content per language.
So basically what we need is a 'content languages' configuration separate from 'Site languages' (locales actually, but we can call them languages for usability's sake), which is already a feature request anyway. This would be a feature on top of that, allowing independent lists and names for 'content languages' and 'interface languages'.
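A configurable 'locale to language' mapping with a strip-the-country default could look roughly like this. It is a sketch under assumptions: `make_resolver` and its `overrides` argument are hypothetical and do not correspond to any existing Drupal setting.

```python
def make_resolver(overrides=None):
    """Build a locale -> content-language function.

    `overrides` is an assumed site-level configuration: a dict from locale
    to content language. Locales without an entry fall back to the default
    rule of stripping the country part.
    """
    overrides = {key.lower(): value for key, value in (overrides or {}).items()}

    def resolve(locale: str) -> str:
        key = locale.lower()
        if key in overrides:
            return overrides[key]
        return key.split("-", 1)[0]

    return resolve


# Default behaviour: every English locale shares the "en" content.
default_resolver = make_resolver()

# A site that really wants separate British content maps en-gb to itself.
separate_british = make_resolver({"en-gb": "en-gb"})
```

Here `default_resolver("en-GB")` yields "en", while `separate_british("en-gb")` keeps "en-gb" as its own content language and still maps "en-us" to "en".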
Comment #8
sun
I'm sorry, but I still don't get the point of further splitting language configuration. We already have content language separated from interface language (and developers are killing us for that already...). Drupal core configures both language types at the same time when configuring through Locale module's UI. Entity Translation module in contrib changes that UI to allow configuring them separately. (This was deemed to be too advanced for Drupal 7 core.)
I really think we need a better issue summary here, because it's very hard to understand and follow what the actual point of the proposed change is. I.e., what is the actual problem? Any steps to reproduce? What's exactly proposed to be changed? And how does that improve the situation?
Comment #9
plach
@Jose:
If I'm not mistaken, you are suggesting that different locales might share the same content when they (more or less) share the same language. I'd argue that, since the two languages, albeit very similar, might actually differ in some forms (I'm not taking into consideration the scenario in which different content has to be provided for different locales), this is a degenerate case of fallback: if I understand Italian perfectly and I understand (British, American, whatever) English reasonably well, I might want to read content in English if it is not available in Italian. With the same scenario in mind, an editor not wishing to provide a "British English" version of an "American English" piece of content (for obvious reasons) might want both locales to access it: an American user would see it "natively", while an English user would see it because some smart fallback rule decided that was the right content to display.
So, what about addressing this use-case with some advanced fallback rule?
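One way to picture such a fallback rule is an ordered list of candidates per requested code, tried until a translation exists. This is only a sketch: the `FALLBACKS` table and `pick_translation` are hypothetical, and the chains shown are example configuration, not a recommendation.

```python
# Assumed fallback configuration: requested code -> ordered alternatives.
FALLBACKS = {
    "en-gb": ["en-us"],          # British readers accept American content
    "it": ["en-gb", "en-us"],    # Italian readers fall back to English
}


def pick_translation(requested, available, fallbacks=FALLBACKS):
    """Return the first code, in fallback order, that has a translation.

    `available` is any container of codes for which the content exists.
    Returns None when neither the requested code nor a fallback matches.
    """
    for code in [requested, *fallbacks.get(requested, [])]:
        if code in available:
            return code
    return None
```

With only an "en-us" version available, both "en-gb" and "it" requests resolve to "en-us"; once an "it" version exists, the Italian request gets it directly.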
Comment #10
wmostrey commented
I think this suggestion from sun in #1216094: We call too many things 'language', clean that up solves the problem I think Jose is raising:
The optional locales allow us to bind one language to multiple locales and thus "to show the interface and content in English (en), but everything that can be localized in the appropriate locale (en-us or en-gb)."
Comment #11
gábor hojtsy
Tagging for base language system.
Comment #11.0
gábor hojtsy
Added link to language tag article
Comment #12
colan
I'm actually working on a site that will have both British and United States English, but I'll admit it's more about having two different Marketing teams (who want different copy) than language translation. My plan is to use "en" for US and "en-GB" for UK.
If we support "en-GB" as a language which installs in Drupal core, then shouldn't the default "en" language actually be "en-US"? I've been looking at things as though each Drupal "language" is actually a language + locale combination. If I understand the RFC correctly, this is a reasonable way to organize.
So "en", if available, could be a fall-back if a requested combination isn't available.
I'll admit I don't see much value in adding another dimension to the data. Folks can organize "languages" as they see it with the current system (as in my example above). But we should still change "en" to "en-US". ;)
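The "en" fall-back described above resembles the lookup scheme of RFC 4647, where a requested tag is progressively truncated from the end until it matches an available tag. The following is a simplified sketch in that spirit, not a complete implementation of the RFC; the function name `lookup` and its parameters are hypothetical.

```python
def lookup(requested, available, default=None):
    """Find the best available tag by truncating `requested` from the end.

    Matching is case-insensitive; the returned tag uses the spelling found
    in `available`. Returns `default` when nothing matches.
    """
    by_lower = {tag.lower(): tag for tag in available}
    subtags = requested.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in by_lower:
            return by_lower[candidate]
        subtags.pop()
        # A trailing single-character subtag (such as "x") only introduces
        # the subtag just removed, so drop it in the same step.
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return default
```

For a site that installs "en" and "en-US", a request for "en-GB" would resolve to "en", while a request for "en-US" matches exactly.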
Comment #13
gábor hojtsy
@colan: I'm not sure that it would be correct to say that the Drupal interface text is using US English (or any other particular English for that matter).
Comment #25
smustgrave commented
Going to close as outdated since this has been in PNMI for 11 years.
If you feel this is still an issue, please reopen it after searching for any duplicates.