Problem/Motivation
There's inconsistency in the way core modules are rendering the language code LANGUAGE_NONE.
- The locale module renders LANGUAGE_NONE as "All" (in locale_language_name(), locale.module line 739) or "All languages" (in locale_form_path_admin_form_alter(), locale.module line 315). This shows up e.g. when viewing path aliases at admin/config/search/path, where aliases with language code "und" are listed as having language "All"; or when adding a path alias, where the selection "All languages" corresponds to option code "und".
- The node module renders LANGUAGE_NONE as "Language neutral", e.g. at admin/content. In fact it adds special cases (e.g. in node_filter_form(), node.admin.inc line 161) to use "Language neutral" in favor of locale_language_name()'s label "All".
- Even the locale module adds exceptions to render LANGUAGE_NONE as "Language neutral", when it's associated with the node module (locale_form_node_form_alter(), locale.module line 372).
- In at least one place (locale_translation_filters(), locale.admin.inc line 828), the locale module uses an ad hoc language code "all" to stand for "All languages".
The D7 API says that LANGUAGE_NONE is used to mean a language of "no language specifically assigned", or "undetermined". This seems to me to be distinct from "all". In the first case the language is unknown; it may be e.g. Vietnamese, we don't know. In the second case, the content is known to apply to all languages.
Proposed resolution
I think there should be separate codes for "not specified", "not applicable", and "multiple".
Parent issue
#1260488: META: Introduce real language APIs
User interface changes
There are no UI changes in this issue. The UI part will be handled in follow-up issues. Here's a related issue where the content type's settings UI is being reworked: #258785: Provide more flexible settings for initial language on content types
Also related: #1314250: Allow filtering/configuration of which languages apply to what (UI, nodes, files, etc)
API changes
None except the constant changes.
Remaining tasks
The patch in #36 just changes LANGUAGE_NONE to LANGUAGE_NOT_SPECIFIED, and introduces, but does not yet use LANGUAGE_NOT_APPLICABLE and LANGUAGE_MULTIPLE. After that patch lands, the next step will be to convert places that use LANGUAGE_NOT_SPECIFIED, but should use one of the other two, to do so.
Follow up issues once this one lands:
#1471432: Rework language_list(), let people use more special languages
#1471444: Review use of LANGUAGE_NOT_SPECIFIED vs. LANGUAGE_NOT_APPLICABLE vs. LANGUAGE_MULTIPLE
Comment | File | Size | Author |
---|---|---|---|
#36 | 965300-36.patch | 161.92 KB | xjm |
#36 | interdiff.txt | 3.28 KB | xjm |
#28 | more-special-langugae-options-28.patch | 161.54 KB | Gábor Hojtsy |
#26 | more-special-langugae-options-26.patch | 160.85 KB | Gábor Hojtsy |
#25 | more-special-langugae-options.patch | 160.85 KB | Gábor Hojtsy |
Comments
Comment #1
Freso CreditAttribution: Freso commentedPerhaps the ISO 639-2 code "mul" could be used for LANGUAGE_ALL? "all" shouldn't be used at least, as there is a language assigned to that code already.
Comment #2
droplet CreditAttribution: droplet commentedComment #3
plachBugs are fixed in the development version first, backported then.
Comment #4
Gábor HojtsyTagging this under the umbrella of the Drupal 8 Multilingual Initiative. I agree LANGUAGE_NONE is confusingly used for two purposes at places. It would be useful to separate the two meanings. #1266318: Make English a first class language also introduces LANGUAGE_SYSTEM (with the value of 'system' solely to be used for interface translation purposes). However, LANGUAGE_NONE has no meaning in terms of interface translation, does it?
Comment #5
plachNo, atm is used only for content.
Comment #6
Gábor HojtsyOk, well, I've been thinking about this quite a bit recently. Depending on how you see it, LANGUAGE_NONE (undetermined) is what we mean in both cases mentioned (nodes and paths). Or rather, neither case :) for nodes, we mean "no language is applicable to associate", that is "there is nothing in this node that is in some language". For paths, we mean "this path applies to all languages", which is one kind of saying "no language is applicable to associate", right? So maybe we can say "not applicable" or "undetermined" in both cases? For the paths, we do use them for all languages, for nodes, it totally depends on your views and other content displays, not really in control of core at all. So maybe some people would use LANGUAGE_NONE as "all" for nodes, just like paths, depending on how they set up their views and apply this meaning. It is really not in our control. So I think using more neutral wording (like we did with "language neutral") is best. However, people don't really get "language neutral" IMHO.
What do you think?
Comment #7
Gábor HojtsyCopying @plach's feedback from :
Introducing one or more "no language" constant will surely let us distinguish between the different cases of no language, such as applied to "all languages", "independent of language", "unknown language", that might or might not be explained with the same concept. I'd expect this will bring some problems for DX (and also UX when we expose multiple "no language" options on the UI such as for fields).
What is the set of "no language" values we need? The following values were suggested (with some constant names that floated around):
- LANGUAGE_UNDEFINED?, LANGUAGE_NONE? - und
- LANGUAGE_NOT_APPLICABLE?, LANGUAGE_UNSUPPORTED? - zxx
- LANGUAGE_MULTIPLE?, LANGUAGE_ALL? - mul
?
Comment #8
Gábor Hojtsy#1414314: Make node and path depend on language module only for language support, get rid of locale_language_name just standardized paths and nodes without language to simply say 'None' for 'und' (and '- None -' in the options). Should we repurpose this issue for introducing more no-language varieties now that the naming is standardized or do we already have an issue for that (we might, I could not find one).
@plach?
Comment #9
plachI'm not aware of any open issue about this. I think we should introduce the concept of unsupported language for the reasons explained in http://groups.drupal.org/node/197848#overview (bullet #2), however probably we won't need to expose multiple no-language choices to our users, at least for the use case outlined there.
Comment #10
Gábor HojtsyRepurposed this for that direction now. I think we might be better off opening a new issue though.
Comment #11
Gábor HojtsyOk here is an initial patch to stir some conversation. Changes:
- changed LANGUAGE_SYSTEM => LANGCODE_SYTEM (same value)
- changed LANGUAGE_NONE => LANGCODE_UNDETERMINED (same value)
- added LANGCODE_NONLINGUISTIC with value zxx (but did not use it yet anywhere)
- added LANGCODE_MULTIPLE with langcode mul (but did not use it anywhere)
Just did the name changes whoesale in the code as per the two renames. I think most of the time we had LANGUAGE_NONE before, we actually meant the zxx non-linguistic meaning (ie. we specifically say it does not have language meaning) vs. the und langcode which means we don't know about its language. So sounds like most places we'd want to actually use LANGCODE _NONLINGUISTIC instead of LANGCODE_UNDETERMINED (and update existing data accordingly)? (Also sounds like we'd want to expose some UI for that right? and describe it in language_name() for the UI).
The patch was mostly a mass search and replace, and with places using LANGUAGE_NONE identified, we can think about how to change them.
Comment #12
effulgentsia CreditAttribution: effulgentsia commentedMy first reaction to #11 is a big +1. Seems like this is a great first step towards following the ISO standard better, and I recommend we try to get this patch to land as quickly as we can, and then open follow-ups for changing from LANGCODE_UNDETERMINED to LANGCODE_NONLINGUISTIC where appropriate. Here are some minor things I found, but leaving as "needs review", because I think this patch still needs more substantive feedback from plach and/or others heavily involved in the D8MI work.
This comment is an improvement, but still wasn't clear to me. I had to ping Gabor, and he explained to me that this refers to "Drupal English", and that a site's "English" language might be a customization of this if the locale_translate_english variable is set. Anyway, I wonder if we can add this information to this comment concisely.
I think this comment can be further improved too.
Would it be useful for us to add an example to this comment (e.g., a PDF file of a manual written in multiple languages) or should we defer that until the issue where we first use this new constant in core?
Also, "Undetermined" is not correct here, is it?
Line now exceeds 80 chars, so please wrap.
Line now exceeds 80 chars, so wrap.
s/$langcode_none/$langcode_undetermined/.
s/$language/$langcode/
Is this where we want LANGCODE_NONLINGUISTIC instead? If that needs to be done in a separate issue (e.g., upgrade path complications), can we at least add a @todo here?
2 days to next Drupal core point release.
Comment #13
plach@effulgentsia:
I didn't review the patch yet, but from the summary I'd say it's what we want :)
Many places currently using LANGUAGE_NONE should not become LANGUAGE_NONLIGUISTIC, AAMOF many fields which in D7 are (initially) marked as non translatable, and thus getting the 'und' language code, are actually linguistic fields, for instance taxonomy terms.
I tried to explain how I would approach this matter in http://groups.drupal.org/node/197848#overview.
Comment #14
sunTotally no pun intended on @plach ;), but the constant name is not only hard to remember, it's also very hard to type correctly. I totally know it's the official label for the langcode, but could we find a simpler constant name? :)
Comment #15
Gábor Hojtsy@sun: I was thinking of a simpler name for a while but did not find one + did not want to repeat the mistake we did with LANGUAGE_NONE where it is understood both as "the user did not tell" and "cannot be related to language". If this constant would be NO_LANGUAGE let's say, it would have the same double meaning no? Maybe LANGCODE_NOT_APPLICABLE? I'm not sure we are best served by going that far from the standard.
Comment #16
effulgentsia CreditAttribution: effulgentsia commentedI'd be okay with LANGCODE_UND and LANGCODE_NA for distinguishing undetermined from not applicable, if people think that's better DX. I'm not yet fully convinced that the reduced typing is worth the slight increase in obscurity though.
Comment #17
Gábor HojtsyAny other opinions on constant names?
Comment #18
plachMaybe LANGUAGE_NOT_SPECIFIED and LANGUAGE_NOT_APPLICABLE?
I'd stick with the LANGUAGE prefix instead of the LANGCODE one as LANGCODE_NOT_SPECIFIED (and friends) sounds odd to me: it's the language that is not specified, not the language code.
Comment #19
Gábor HojtsyI agree LANGCODE_NOT_SPECIFIED looks odd, since you specify it as a langcode :) However these might also look pretty odd no?
Looks like we need to choose non-ideal naming anyway. One plus argument is that LANGUAGE_* would keep the constant names in line with other names in the language system.
Comment #20
plachI agree that on a first sight the line below
reminds our old inconsistent pattern, but in my mind this reads as "if the language code is/equals to/means 'language not specified'" which in turn makes perfect sense. Moreover the LANGUAGE_* prefix automatically defines the constant as part of the language module/namespace.
Comment #21
Gábor HojtsyOk, I'm fine with going LANGAUGE_*, we still need to fix the remaining parts of the constants though :) Let's see what others think about your proposals.
Comment #22
effulgentsia CreditAttribution: effulgentsia commentedHm. In the {users} table we now have a "preferred_langcode" column. That sounds less natural to me than "preferred_language", but I though being consistent with when to use langcode vs. language was important.
The constants are defined in bootstrap.inc (not language.inc), and used all over the place, whether language.module is enabled or not.
Comment #23
effulgentsia CreditAttribution: effulgentsia commentedThat said, we still have LANGUAGE_LTR and LANGUAGE_RTL constants that do not refer to a $language object. Maybe for constants, it's implied and therefore okay for LANGUAGE to not refer to an object.
Comment #24
plachYes, that's why I mentioned the language namespace: the only reason those constants are currently defined in bootstrap.inc is that they need to be available very early, but ideally they belong to language.inc and thus to the language subsystem.
I think this could be a sensible way to present the distinction between the two patterns.
Comment #25
Gábor HojtsyOk, let's keep LANGUAGE_* then. Here is a new patch with LANGUAGE_NOT_APPLICABLE and LANGUAGE_NOT_SPECIFIED as per the discussion. Also made the changes @effulgentsia suggested (lot more docs on the constants, more reference to W3C docs - also cross-checked the W3C docs about these language codes). Also wrapped lines where he noted and changed variable names.
I did not change any of the use of the 'und' value, I think that we could add @todo all around core, basically we should revisit every use of LANGUAGE_NONE as it was before to see if our extended support for special language codes now still make that the best choice or not. So I think the new constant name in itself is a TODO, no? :) Also, we'll need to make UIs available for people to assign things like 'mul', 'und' and 'zxx' separately.
Also found some newly added LANUAGE_NONEs since the last patch, converted those too. It is a huge patch so would be nice to get it in sooner than later, but agreement would be vital first :)
Comment #26
Gábor HojtsyFound two typos in the 'mul' phpdoc block. Heh :)
Comment #28
Gábor HojtsyDon't know how I missed node.test, likely brand new stuff :)
Comment #29
effulgentsia CreditAttribution: effulgentsia commentedThanks. RTBC and assigning to jhodgdon for final sanity check of the new constant names.
Comment #30
effulgentsia CreditAttribution: effulgentsia commentedAlso tagging with "coding standards", since I think changing a widely used constant name relates to that, sort of.
Comment #31
Dave ReidI feel like this could use an issue summary as well since it doesn't seem to be at all current to the latest patch.
Comment #32
jhodgdonI think the constant names look good.
I took a quick scan throught the patch, at least the top. One thing I noticed was that these types of lines look very suspicious to me:
These codes are all strings... At least if you are going to compare them with < you might add a code comment explaining why that makes sense?
Comment #32.0
jhodgdonAdd parent.
Comment #32.1
effulgentsia CreditAttribution: effulgentsia commentedUpdated issue summary.
Comment #33
effulgentsia CreditAttribution: effulgentsia commentedRe #31, I updated the last paragraph of the summary to clarify.
Re #32, those are pre-existing comparisons not being changed by this patch, only the constant name is being changed. The drupal_lookup_path() function they're in does have inline comments explaining the rationale:
Comment #33.0
effulgentsia CreditAttribution: effulgentsia commentedUpdated issue summary.
Comment #33.1
effulgentsia CreditAttribution: effulgentsia commentedUpdated issue summary.
Comment #34
xjmThis change makes way too much sense. :)
I found a couple of lines of documentation over 80 chars. I'll reroll to fix them:
There's also lots of assertion messages being translated, but most of those are just assertion lines being updated with the new constants, so leaving them alone.
Comment #35
xjmOh and tagging per @Dave Reid's request.
Comment #36
xjmRewrapping the documentation lines as above, plus one other small fix to the affected lines.
Comment #37
effulgentsia CreditAttribution: effulgentsia commentedThanks, xjm. Removing tag, because prior to #33, I already updated the last paragraph of the issue summary.
Comment #38
Gábor HojtsyI've added #1471432: Rework language_list(), let people use more special languages and #1471444: Review use of LANGUAGE_NOT_SPECIFIED vs. LANGUAGE_NOT_APPLICABLE vs. LANGUAGE_MULTIPLE as followups to do once this lands. This is a huge patch, so would be great to see in so we can proceed using/exposing these :)
Comment #38.0
Gábor HojtsyUpdated issue summary.
Comment #39
klonosThanx Gábor. I've updated the issue summary:
- "Conform" to the issue summary standards (no "Original report by [username]" section though because there were only minor changes and some moving around to place things in the proper sections AFAICT).
- Linking to related & follow-up issues.
- Changed patch reference to link to the latest patch in #36.
Please check to see if I got everything right and also someone update the "API changes" section (I couldn't possibly know what to write there). Thanx in advance.
Comment #39.0
klonos...look at #39 for the changes.
Comment #40
Gábor Hojtsy@klonos: updates to summary look good, noted in the API changes that except the constant changes there is nothing else (but the constants are part of the API).
Comment #41
klonosThanx for taking the time to go through it.
Just to get this one straight: the ultimate conclusion is that this is not to be backported to 7.x. Right?
Comment #42
Gábor Hojtsy@klonos: right, Drupal 7 does not allow API changes like this. Contrib modules are full of LANGUAGE_NONE and assumptions about what it means in D7.
Comment #43
catchThis patch would be so, so much smaller if we were using field_get_items() to check field values consistently, I'll open a follow-up. The actual change looks fine though.
Comment #44
Gábor Hojtsy@catch: we have an issue for introducing getters/setters on entities for that, so moving to an interim API does not seem like the best idea. #1277776: Add generic field/property getters/setters (with optional language support) for entities is the issue to follow/help for that.
I've seen you committed this at http://drupalcode.org/project/drupal.git/commitdiff/b9e924417127c7e0e3d0... so added changelog note at http://drupal.org/node/1473200 and closing this one.
Comment #45
catchGabor pointed out we have #1277776: Add generic field/property getters/setters (with optional language support) for entities. Committed/pushed to 8.x.
I think we could actually backport the LANGUAGE_NOT_APPLICABLE constant to 7.x so people can start using it if they want - just set it to the same value as LANGUAGE_NONE.
Comment #46
Gábor HojtsyYes, maybe we can backport the introduction of the LANGUAGE_NOT_APPLICABLE constant itself, but leave LANGAGE_NONE around and used as always. I think just leaving LANGUAGE_NONE and not introducing the new constant might be better though since then we'll not have differences in different modules which would lead to misunderstandings, "this module is only compatible with Drupal 7.Z+" type of issues, etc...
Comment #48
Gábor HojtsyAll right, well, LANGUAGE_MULTIPLE tested horribly and it served to confuse users in all kinds of different ways. Therefore: #1869292: Remove confusing "multiple" language from core. (Yay for experimentation, space for testing ideas and correction when failures are found).
Comment #49
geerlingguy CreditAttribution: geerlingguy commentedCould we get a change notice for this? I spent about 20 minutes trying to figure out why a particular test case was failing before finding that the constant was changed—there seems to be no mention of the change currently: http://drupal.org/list-changes/drupal?to_branch=8.x
Comment #50
Gábor Hojtsy@geerlingguy: a change notice was added almost a year ago, see above (http://drupal.org/node/1473200). Anything missing from there?
Comment #50.0
Gábor HojtsyFill in API changes part.