Problem description
The list of languages users can select in the Primary language and Other languages fields is incomplete and inconsistent; some items should also be removed.
For example:
- It includes Chinese, Simplified and Chinese, Traditional (Simplified and Traditional are scripts), where it should, if anything, include Cantonese and Mandarin Chinese.
- It includes Sardinian (spoken in Italy), but not Sicilian and Lombard (also spoken in Italy).
- It includes Old Slavonic, which was the first Slavic literary language.
- It includes Rhaeto-Romance, which is a family of languages that includes Friulian, Ladin (spoken in Italy), and Romansh (spoken in Switzerland).
- It includes Marshallese and Nauru, but not Kosraean and Pohnpeian (two other Micronesian languages).
- It includes Serbo-Croatian and Bosnian, where Bosnian is a standard form of Serbo-Croatian, together with Serbian, Croatian, and Montenegrin.
- Papiamento, the official language of the Dutch Caribbean islands Aruba, Curaçao, and Bonaire, is missing from the list too.
Steps to reproduce
- Go to the Drupal.org login page and log in using your credentials
- Edit your user profile on https://www.drupal.org/user/[YOUR_USER_ID]/edit
- Scroll down and click the tab Language and location
- Try selecting one of the examples listed above, such as: Papiamento, Sicilian, or Pohnpeian. They are currently not present in the list.
Proposed solution
The list of languages should be taken from a standard list of languages, for example ISO 639-1, ISO 639-2, or ISO 639-3.
| Comment | File | Size | Author |
|---|---|---|---|
| #6 | Screen shot 2012-03-30 at 3.15.32 AM.png | 17.53 KB | Nick Lewis |
Comments
Comment #1
avpaderno
The list also reports Chinese, simplified and Chinese, traditional, which are clearly two language scripts, not two spoken languages.
Comment #2
avpaderno
#1017922: Two language scripts are reported in the list of spoken languages
Comment #3
cleaver commented
Just to bring in some details from the other issue: If the intent is for spoken language, then "Chinese" would not be enough--there would be a lot of Cantonese speakers who would not understand Mandarin and vice versa.
I suggest at least two entries: "Chinese, Cantonese | Chinese, Mandarin" or alternately: "Cantonese | Mandarin".
There are other dialects, but in most cases a Chinese speaker would understand at least one of these.
If the intent is for written language, then it gets a little more complex. I suppose that "Chinese, Simplified" and "Chinese, Traditional" might do. Although the script is independent of the spoken dialect, grammar and expressions may differ by dialect.
Comment #4
avpaderno
I think that we should determine what the exact purpose of the field is. Probably, for most uses, it is enough to have a list of official languages.
Comment #5
zhouhana commented
Of course, the purpose of the field is a good question. It doesn't seem that necessary to me, but if nothing else, I'm sure a lot of people like to be able to show their language capabilities in case a possible future employer drops by their profile.
In the thread linked in #2 someone wrote "Or we could go the other way, and change from 'Languages spoken' to 'Languages read/written'. After all, this site is a written medium, so we probably care more about what people can read."
A couple of thoughts on that:
1) I don't think there's any real need to state whether your script is traditional or simplified. If someone really needs to know, they can take a very good guess by looking at the user's country. (If from China, the person most likely writes and reads simplified Chinese; if from Hong Kong, Macau, or Taiwan, they most likely use traditional characters). I'm not sure, but I bet the case is the same with most other scripts; that one can just guess it from the country.
2) Drupal remains largely unknown in China. It's just too "Western". To change that, I think separating the Chinese languages correctly is a small but relevant step to take. When I help Chinese people register on drupal.org, the fact that the language with the most native speakers in the world isn't listed in the list of users' spoken languages isn't making a very good first impression of Drupal's multilingual capabilities.
Comment #6
Nick Lewis commented
The original issue points to a fuzzy problem that has uncovered a rather embarrassing one (read comment #5).
I'm bumping this issue to critical, and changing the title. Attached screenshot shows that indeed, Mandarin is not in our list. If that's not critical, I don't know what is.
Comment #7
zhouhana commented
lol at attachment. :) Please add both Mandarin and Cantonese. Someone suggested adding them as "Chinese, Mandarin" and "Chinese, Cantonese". Might be good -- I think someone who looks for "Mandarin" or "Cantonese" and doesn't find it is more likely to do a second search for "Chinese" than the other way around.
Comment #8
gerhard killesreiter commented
I am happy to change that, however:
1) Should we just add these additional choices?
2) Should we add them and remove the existing choices?
3) Should we add them and migrate the existing selections in some way?
Comment #9
gerhard killesreiter commented
Changing back the title, the original issue is about Norwegian.
Regarding Norwegian: I guess the existing entries need to be merged, one of the choices deleted and the other renamed to "Norwegian".
Comment #10
gerhard killesreiter commented
Merging/migrating existing selections is probably best done by an update hook in the drupalorg module.
Comment #11
hansfn commented
Correct.
Comment #12
gerhard killesreiter commented
Additional possibility: change the field's label to "I read/write the following languages".
Comment #13
hansfn commented
But then you get into trouble again when written != spoken languages ;-) I speak Norwegian, but write Norwegian Bokmål.
Comment #14
gerhard killesreiter commented
If we change the label, we would not make any other changes.
Comment #15
zhouhana commented
Oh, this is more complicated than I realized.
In the case with Chinese users (my original concern), I think it's more relevant for them to be able to show their spoken rather than written language(s). If you're going to communicate in Chinese writing it's pretty easy to understand the other version of the script or even to use a tool to change it into your own version. No big deal. The main spoken Chinese languages, however, are not mutually intelligible, and also, I think there are many non-Chinese who can speak a Chinese language rather well (and would like to show that on their profile) but who wouldn't say they can write it.
My suggestion requires work, but is perhaps worth it:
1) Remove the options "Chinese, Simplified" and "Chinese, Traditional" from the select list (and profiles?),
2) create the new select list options "Chinese, Mandarin" and "Chinese, Cantonese", and
3) send out an e-mail to the users who've previously selected "Chinese, Simplified" and/or "Chinese, Traditional" as their spoken language(s), urging them to update their language options before a certain date on which the old ones are either removed (if they weren't already in step 1) or mapped according to this or a similar pattern:
"Chinese, Simplified" + "China" > "Chinese, Mandarin"
"Chinese, Traditional" + "China" > (could be any; remove or put "Chinese, Mandarin")
"Chinese, Simplified" + "Hong Kong" > (could be any; remove or put "Chinese, Mandarin")
"Chinese, Traditional" + "Hong Kong" > "Chinese, Cantonese"
"Chinese, Simplified" + "Macau" > (could be any; remove or put "Chinese, Mandarin")
"Chinese, Traditional" + "Macau" > "Chinese, Cantonese"
"Chinese, Simplified" + "Taiwan" > "Chinese, Mandarin"
"Chinese, Traditional" + "Taiwan" > "Chinese, Mandarin"
"Chinese, Simplified" + any other country > (could be any; remove or put "Chinese, Mandarin")
"Chinese, Traditional" + any other country > (could be any; remove or put "Chinese, Mandarin")
My other suggestion is to just remove the old options from the select list and profiles, but then that would suddenly make it seem like our community doesn't have any Chinese speakers at all, and that's too bad.
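The script-plus-country pattern proposed above could be sketched roughly as follows. This is only an illustration in Python, not code from the drupalorg module; the names `CLEAR_MAPPINGS`, `map_selection`, and `ambiguous_default` are hypothetical, and the sketch assumes the country is available as a plain string.

```python
from typing import Optional

SIMPLIFIED = "Chinese, Simplified"
TRADITIONAL = "Chinese, Traditional"
MANDARIN = "Chinese, Mandarin"
CANTONESE = "Chinese, Cantonese"

# Only the rows of the proposed pattern that have a single clear answer.
CLEAR_MAPPINGS = {
    (SIMPLIFIED, "China"): MANDARIN,
    (TRADITIONAL, "Hong Kong"): CANTONESE,
    (TRADITIONAL, "Macau"): CANTONESE,
    (SIMPLIFIED, "Taiwan"): MANDARIN,
    (TRADITIONAL, "Taiwan"): MANDARIN,
}


def map_selection(
    old_option: str,
    country: str,
    ambiguous_default: Optional[str] = MANDARIN,
) -> Optional[str]:
    """Return the new option for one stored selection.

    Options other than the two script entries are returned unchanged.
    For the "could be any" rows, the caller decides: pass
    ambiguous_default=None to remove the selection, or leave the
    default to fall back to "Chinese, Mandarin".
    """
    if old_option not in (SIMPLIFIED, TRADITIONAL):
        return old_option
    return CLEAR_MAPPINGS.get((old_option, country), ambiguous_default)
```

Keeping the ambiguous rows out of the table leaves the "remove or put Mandarin" decision as a single parameter instead of hard-coding it per country.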
Comment #16
zhouhana commented
Or, to work around it, we could simply keep the old options, add the new ones, and change the label to just "Languages" (unspecified, unless we want "Languages read and/or written by hand, typed, spoken, understood by listening, etc.").
In that case, I'd like to see "Chinese, Simplified" and "Chinese, Traditional" renamed to "Hanzi, Simplified" and "Hanzi, Traditional" (technically there are also other Chinese script systems), to steer users away from selecting those unless they decidedly want to show their writing/reading capabilities. Four options starting with "Chinese" would probably confuse some.
In this case, to be fair, the option "Kanji" (the Japanese writing system), to just name one, should also be added for Japanese readers/writers.
On second thought, probably not necessary. It may be a logographic script, but it is only used for writing Japanese, while Hanzi is used for many languages.
Comment #17
Nick Lewis commented
I agree with the suggestion in #16.
This simplifies a lot of technical problems. Most sensible approach I can think of.
And of course, we're not collecting research data, we just want to make it easy for drupaliens to find other drupaliens to mesh with (so they can be integrated into our global cloud cluster poised for world domination). Language is at least as important as nationality when it comes to connecting them (I'd argue language is much more important).
This is critical, so I want to suggest what needs to get resolved before we can call this issue closed: (please feel free to remove or add items from this suggestion)
1. The world's most spoken language must be included in the list. The list also accounts for Norwegian variations not previously accounted for. No opinion on whether they are merged.
2. We resolve the written vs spoken problem. I like the one proposed in comment #16. Removes complexity that is probably not necessary.
3. A common sense policy in deciding what is actually a language (for example, Texan is not a language. It's simply English. Bajan Creole is another story.)
Number 3's policy could simply be "let's use common sense." and I'd say we're good to go, personally.
Comment #18
gerhard killesreiter commented
Actually, the real use case for this field is the number of languages displayed above the world map on the front page, so we'd need to update the text there as well if we change it here.
Also, I propose we stay away from deciding what is a language and what not. It's a really thorny issue, maybe more so in Europe than e.g. the US. We'd already made a similar decision in the "country" listing.
I also like the proposal in #16.
Comment #19
zhouhana commented
Okay, so I think everyone in this thread will be more or less satisfied if we
1) change the label "Languages spoken" into "Languages",
2) add the options "Chinese, Mandarin" and "Chinese, Cantonese" to the select list,
3) rename the options "Chinese, simplified" and "Chinese, traditional" to "Hanzi, simplified" and "Hanzi, traditional" respectively,
4) change the option "Norwegian Nynorsk" into "Norwegian",
5) merge "Norwegian Bokmål" and "Norwegian" into "Norwegian",
6) delete the option "Norwegian Bokmål" in the select list,
7) update the number of spoken languages on the drupal.org front page,
and just leave it at that.
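As a rough illustration of the option changes listed above, the renames and the Norwegian merge could be expressed as one lookup table applied both to the select list and to each user's stored selections. This is a Python sketch under stated assumptions, not the drupalorg update hook: `RENAMES`, `ADDITIONS`, `migrate_values`, and `migrate_options` are hypothetical names, and the merge is simplified to mapping both Norwegian entries onto "Norwegian".

```python
from typing import Iterable, List

# Old option -> new option. Both Norwegian entries map to the same
# target, which is what produces the merge.
RENAMES = {
    "Chinese, simplified": "Hanzi, simplified",
    "Chinese, traditional": "Hanzi, traditional",
    "Norwegian Nynorsk": "Norwegian",
    "Norwegian Bokmål": "Norwegian",
}

# Options that are new to the list.
ADDITIONS = ["Chinese, Mandarin", "Chinese, Cantonese"]


def migrate_values(values: Iterable[str]) -> List[str]:
    """Apply the renames to one user's selections.

    Duplicates created by the merge are dropped; the order in which
    values first appear is preserved.
    """
    seen = set()
    result = []
    for value in values:
        new_value = RENAMES.get(value, value)
        if new_value not in seen:
            seen.add(new_value)
            result.append(new_value)
    return result


def migrate_options(options: Iterable[str]) -> List[str]:
    """Return the new select list: renamed options plus additions, sorted."""
    return sorted(set(migrate_values(options)) | set(ADDITIONS))
```

A user who had selected both Norwegian entries ends up with a single "Norwegian" value, so the language count shown on the front page would not count it twice.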
Comment #20
Mark_L6n commented
On the issue of Chinese languages, the scripts 'Hanzi, simplified' and 'Hanzi, traditional' clearly are not languages. Perhaps it would make it easier for us to update the site that way, but it would rather make it appear that we are incapable of properly maintaining our own site. I strongly recommend those 2 scripts not be included in the languages list.
Secondly, we may wish to add a 3rd Chinese language, 'Chinese, Min', as it includes dialects that are the first language of a majority of Taiwanese (as well as others across S.E. Asia.)
Alternatively, on the issue of 'what is a language and what is not' (post 18), one possibility is to simply use the list of languages at ethnologue.com, which is a pretty well-accepted authority, rather than supply a list ourselves. If a person wishes to claim they speak one of those languages, they should probably be free to do so (seems appropriate for an open-source community!) It might require a better interface than a list box, however, as it currently contains 7,413 'primary' language names. This would be the best solution, I believe, and I'll volunteer to work on an interface for this if whoever is managing this contacts me about it.
Comment #21
lizzjoy
Hi, I am closing this issue due to inactivity. Please open this issue again if you wish. Thanks.
Comment #22
avpaderno
I am re-opening this since it has not been fixed.
It's not correct to say "language you write" and then give me a list of scripts and languages. I speak and write Italian and English, but in both cases I am using the Latin alphabet. At the same time, it doesn't make sense for me to say "I write in Cyrillic," since that is not helpful information, particularly because several languages use the same script.
Comment #23
avpaderno
That field is imported using a feature. Changing it requires changing the feature used to import it.
Comment #25
drumm
I believe this field was meant to be a supplement to the “Primary language” field, so I think we want to keep the same options in both lists.
“spoken” ended up being clumsy wording; it’s being updated to “Other languages”.
If we can, we should defer to an ISO list or other source. Spot checking, it looks like there are corrections we can make.
Comment #26
avpaderno
ISO has different lists of language codes, which also include macrolanguage codes.
Which list to use depends on how detailed we want to be. For example, using the languages listed in ISO 639-3, in my profile I could report my primary language is Italian and my other languages are Lombard and English. Using the languages listed in ISO 639-2, I could not include Lombard as my secondary language, but users living in Sicily could select Sicilian as their secondary language.
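To make the difference between the three lists concrete, here is a small Python sketch that records, for a handful of the languages mentioned in this issue, which ISO 639 parts assign them a code. Only these five entries are included; `IsoCodes`, `ISO_639`, and `available_languages` are hypothetical names for illustration, and a real implementation would load the full code tables instead.

```python
from typing import List, NamedTuple, Optional


class IsoCodes(NamedTuple):
    part1: Optional[str]  # ISO 639-1 two-letter code, if any
    part2: Optional[str]  # ISO 639-2 three-letter code, if any
    part3: Optional[str]  # ISO 639-3 three-letter code, if any


# A tiny hand-picked excerpt, not the full standard.
ISO_639 = {
    "Italian": IsoCodes("it", "ita", "ita"),
    "English": IsoCodes("en", "eng", "eng"),
    "Sicilian": IsoCodes(None, "scn", "scn"),
    "Lombard": IsoCodes(None, None, "lmo"),
    "Papiamento": IsoCodes(None, "pap", "pap"),
}


def available_languages(part: int) -> List[str]:
    """List the excerpt's languages that have a code in ISO 639-<part>."""
    if part not in (1, 2, 3):
        raise ValueError("part must be 1, 2, or 3")
    field = f"part{part}"
    return [
        name
        for name, codes in ISO_639.items()
        if getattr(codes, field) is not None
    ]
```

With this excerpt, `available_languages(2)` offers Sicilian and Papiamento but not Lombard, while `available_languages(3)` offers all five.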
Comment #27
avpaderno
The language names given in those lists should probably be adapted. For example, they list Modern Greek (to differentiate it from Ancient Greek) but that could sound awkward to users (who would still select that, if their language is Greek). There are other languages for which a different name is probably used/preferred or is simply more common.
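One possible way to adapt names without editing the imported list itself is a small override table applied at display time. A minimal Python sketch, assuming the standard name arrives as the string "Modern Greek" (the exact reference name in the ISO tables may differ); `DISPLAY_NAME_OVERRIDES` and `display_name` are hypothetical names.

```python
# Standard name -> label shown to users. Anything not listed here is
# displayed exactly as the standard names it.
DISPLAY_NAME_OVERRIDES = {
    "Modern Greek": "Greek",
}


def display_name(standard_name: str) -> str:
    """Return the user-facing label for a language name from the standard list."""
    return DISPLAY_NAME_OVERRIDES.get(standard_name, standard_name)
```

The stored value stays tied to the standard entry, so only the label changes when a preferred name is added or removed.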
Comment #28
tinto
Updating the issue summary with steps to reproduce and adding some sections/headers for legibility.
Comment #29
tinto
Using Papiamento as an example (and personal use case), I'd opt to adapt either ISO 639-2 or ISO 639-3 for the selection. Despite being the official language of several countries, it is missing from ISO 639-1.
Comment #30
tinto