Updated: Comment #31
Problem/Motivation
As has been mentioned elsewhere, we don't yet have guidelines on when and how to use string contexts (= translation contexts). For lack of guidelines, every developer or module maintainer reinvents the wheel and decides on their own context strings. This results in a great variety of contexts, some good, some not. Further, the lack of guidelines makes it difficult for module and core maintainers to accept string context suggestions. As a result, discussions come to a standstill or bad contexts get accepted.
For examples of real-life string context issues, see the list of issues tagged with "string context".
Proposed resolution
Create a description of string context for both developers and translators, including criteria for string context and examples of good and bad contexts.
Remaining tasks
- Agree on criteria for string context
- Complete the string context documentation page with this information.
- Get this documentation promoted to be a coding standard page.
(#20)
String context criteria
Define a context that:
- Fixes an actual translation problem that has been identified due to the lack of context.
- Allows a translator to distinguish the meaning of the string and decide on the right translation.
- Is reusable.
- Provides information on the meaning of the string.
- Is short and concise (phrase rather than a full sentence).
- Is added only to the string in its deviating meaning. The dominant use of the string does not receive a context.
Avoid the following in context definition:
- Linguistic contexts like "Verb", "Noun" or "Dative". These usually do not provide enough distinction.
- "Module name". Module names are usually not translated. If the module name is the dominant use of the string, it will not receive a context.
- The name of a specific module. It silos the use within (groups of) modules and prevents collaboration between modules.
- The functional part of the module where the string is used (e.g. "Block"), a function name, or a PHP class name. Such a context is not reusable and may change over time.
Related issues
Arguments from the original report by @zirvap
- Context is used to give additional information about strings to translators, for strings which would otherwise be difficult to translate well.
- General rule of thumb: If a string has several possible meanings (e.g. "May", which can be both the complete month name and a three-letter abbreviation of the name), we use the bare string for one meaning, and add context to the others. For instance: "May" without any context is the abbreviation, "May" with context "Long month name" is the complete name.
- Contexts should be reused. (For instance, a contrib module should not introduce "May" with context "Complete month name"; it should reuse the core context.)
(#1034882: Make list of contexts used more evident for developers is necessary for this)
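The rule of thumb above can be pictured as a lookup keyed by the pair (context, source string). The following is a minimal Python sketch of that idea only, not Drupal's t() implementation; the catalog, the function name `translate` and the Hungarian values are assumptions made for illustration.

```python
# Hypothetical in-memory catalog; Drupal's real storage and API differ.
# Keys are (context, source string). An empty context is the dominant meaning.
CATALOG_HU = {
    ("", "May"): "máj.",                  # bare string: the abbreviation
    ("Long month name", "May"): "május",  # deviating meaning gets a context
}


def translate(source: str, context: str = "") -> str:
    """Return the translation for (context, source), or the source itself."""
    return CATALOG_HU.get((context, source), source)


print(translate("May"))                     # dominant meaning, no context
print(translate("May", "Long month name"))  # reuses the existing context
```

Because the context is part of the key, a module that invents "Complete month name" instead of reusing "Long month name" would create a third, untranslated entry.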
Comments
Comment #1
spuky: It might be good to come up with a short list of common context cases that would be a developer's first choice when looking for a context. I'll try to start off with some examples that could be extended.
A Rule for Developers:
Defining context is most important for short translatables like "view" or "views", where it is hard for a translator who only uses a tool like localize.drupal.org to get contextual information.
I'd start off by defining simple contexts like:
Those are highly reusable and would help a lot. I'd try to come up with more cases, but I haven't been translating much lately.
I hope other people will come up with more examples.
Comment #2
Blooniverse: I've been following the German translation process for months. Unfortunately I haven't had enough time to join in. Here is my tiny contribution in the form of a quick, intuitive proposal:
A firm global translation meaning tree
A semantic taxonomy/ontology in OWL/RDFS (a standardised linguistic approach).
I haven't found any useful or fitting existing namespaces (e.g. http://usefulinc.com/ns/doap, a project vocabulary) to enrich or extend pure RDF in order to make use of RDF's basic functionalities.
Comment #3
plach: subscribe
Comment #4
Gábor Hojtsy: @spuky: the difficulty of defining contexts is exactly that we should (IMHO) avoid using simplistic contexts like the ones you propose. "Verb" or "noun" do not provide real context for meaning. The word "view", if a noun, can still mean various things, like "view defined in views", "view from Rocky Mountains", "view in a hierarchical database". These are all nouns, but might be translated differently into a foreign language. I don't have an English example offhand, but one fun Hungarian example is that we use the same noun for "lightbulb" and "pear fruit", probably because they look the same. Simply saying "noun" or "verb" does not really provide sufficient context in many cases.
As for using the module name, that is yet another thing I'd avoid. Clustering strings per module name is pretty bad, since it can easily spiral into killing collaboration between modules and will lead to inconsistencies in translation.
Some strings that needed context in core include "strong" (for important and the strong HTML tag) and the above-mentioned May (for short and long month name). Contrib examples include "check" for "check as in form of payment" and "check as in checking out". The context should provide supplemental information about the meaning of the string.
Comment #5
Blooniverse: For [very] professional results one should probably take XLIFF into consideration as well:
Comment #6
Anonymous (not verified): Shouldn't the module be an option as a context, maybe even first choice?
Or would the module be there, anyway?
Comment #7
Gábor Hojtsy: @Sabine: no, the word "view" means the same in Views, Views slideshow, Views whatever modules, and its use should not be siloed into module-specific namespaces, or it kills translators' efforts to establish a unified terminology. Contexts should provide supplemental information about the meaning of the string. Many strings have a whole ecosystem of modules using them with the same meaning.
Comment #8
spuky: But if I tag the string that is supposed to be translated (or not) with a tag "modul_name" (not the actual module name; programmers tend to replace variables, but "modul_name" is not a variable!), then Views, Views slideshow, Views whatever, CCK or whichever module has the string "views" meant as a module name could tag it that way. So even within the Views module you could distinguish between the module name and other usages of "view".
So if module names all over the place were tagged with modul_name, a translator could see that this is a module name and should be translated in a special way (in German we try to keep the English names to make modules easier to identify).
I agree that "noun" and "verb" don't reveal much context, but more than none. So in your pear example, one module could have:
t("körte", array(), array("langcode" => "hu", "context" => "noun, fruit"));
and another:
t("körte", array(), array("langcode" => "hu", "context" => "noun, lighting"));
When translating that, I would see in both cases that it is a noun and get information on the context. Of course a context should be short, so as not to bloat the system.
One thing is to come up with standards for possible contexts (that don't force developers to learn a whole lot when trying to add context information to their t() functions).
The other thing is to build a feedback loop between struggling translators and the developers.
Comment #9
zirvap: I agree, there are some cases where "module name" should be a context. Example: The string "location" is used in core to mean location in file system, and in GMAP and Location (and probably other Geo/mapping modules) to mean geographical location. Those are two different words in Norwegian, so we need a context for "Geographical location". BUT: There's also a string for the module name "Location", and (at least until now) the policy in a lot of translation communities has been to not translate module names. If we want to keep that (and I think we do), then the module Location must use two different contexts for "Location", depending on whether it's referring to the module or to a geographical location.
There's a similar situation with Webform. The string "Webform" appears in the user interface when you need to edit a webform. Users may have permission to add or edit webforms without having knowledge about or interest in what the module is called. So it would be useful to be able to translate the string "webform" in some places, but not all.
Comment #10
Gábor Hojtsy: OK, got that; I agree it would be useful.
Comment #11
Gábor Hojtsy: Marked #802980: Defining contexts as a duplicate.
Comment #12
Anonymous (not verified): You're right, of course.
But aren't you taking the effort from the translators and putting it on the part of the developers, instead?
When looking for a practical approach, my idea was that the module might somehow be automatically inserted as a context, and the translators would still be free to decide they need a special translation for that module, or leave that context out and provide only one translation for general use, as before. So that the context would act like an aid, to be picked up by the translator, or not. Something like that.
As to semantic contexts: In my opinion they are necessary and useful if you aim at an automatic translation or a translation by people who don't know anything about the context the string is used in. But it puts quite some effort on the developers, and they are not linguists, so how are they to know? How will you control in which case a context has to be provided, and in which case not?
When we are asked to suggest contexts, we should take into account the whole variety of modules.
My old dictionary gives a good choice in the list of abbreviations. There are about 135 abbreviated distinctive terms used to differentiate meaning and semantic use of words.
Comment #13
Gábor Hojtsy: Unfortunately Drupal itself does not know about the source of the string either, so it cannot automatically add a context. There were some efforts to make this automatically available to Drupal, but they significantly degraded the site performance. It basically requires runtime stack inspection, which is pretty expensive in PHP. So as a matter of fact, it is only developers who can add contexts, and they need to work with translators.
Comment #14
Blooniverse: Do we need to look at something like W3C's "Internationalization Tag Set (ITS)" (http://www.w3.org/TR/its/)? The authors of this document also mention a few well-known translation tools!
Comment #15
DjebbZ: subscribing until I find some time to chip in.
Comment #16
wojtha: Subscribing
Comment #17
LarsKramer: Re #13: It is a pity Drupal provides no information about the origin and context of a string. Wouldn't it be possible, when a module is installed, to save this information into some database table with the fields string, module_name, line_number? Then this information could be retrieved when the translator or administrator enters the "translate interface" page. Just an idea...
Re #7: Actually the word "Views" is also used in the module advanced_forum, meaning the number of times a forum topic has been read. In many languages that would conflict with the translation of the name of the module Views (if the module name should be translated at all, which I agree it shouldn't).
Comment #18
zirvap: I’ve started a handbook page at http://drupal.org/node/1369936. The intention is that we can link to that page when we open issues about adding string contexts, so I’ve included various background and how-to info as well.
Please discuss and improve, as needed!
Comment #19
Anonymous (not verified): I realized when I had to use contextual translation for a module that maybe the biggest problem we have here is that there is no way (or is there?) to see how a string is translated by default and in any other existing context. What I mean is that we cannot enter a string into a search bar and get the list of all translations existing for it, by context.
This would actually help a lot because we would know for sure if we have to create a context for a string or not.
Comment #20
jhodgdon: Coding standards are normally discussed in the Drupal Core issue queue.
Comment #21
jhodgdon: forgot tag
Comment #22
Gábor Hojtsy: Well, yes, and no :) This would apply to core and definitely to contrib.
Comment #23
jhodgdon: Yes, but we still usually discuss coding standards for the Drupal project as a whole in the Drupal Core issue queue, rather than in Documentation where only docs writers will ever see them. Issues tend to get buried there. :)
Comment #24
dozymoe: Instead of "module name", shouldn't that be "noop" or "keyword"? That has a more general meaning: the word should not be translated.
Comment #25
Blooniverse: @Lars #17 (first paragraph): You probably mean a tool like 'Translation template extractor' (http://drupal.org/project/potx)! With this contrib module you can extract all the strings of, e.g., your custom module and put the resulting language-specific .po file (e.g. 'modulename_date.de.po'), after adding the missing translations, into your custom module's 'translations' folder. Alternatively you can even produce a language-neutral .pot template file with this contrib module.
The .po/.pot file contains the file name(s) and even line number(s) for every occurring t() string in the code! After the manual or automatic import of the .po file, this additional metadata shows up in the admin backend (translation GUI).
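To make the metadata mentioned above concrete: in the gettext .po format, source locations are stored in `#:` reference comments and a context in a `msgctxt` line. The Python sketch below reads both from a small sample. The sample file path and the function name `parse_po` are made up for illustration, and the parser deliberately handles only single-line entries; it is not how potx or Drupal read these files.

```python
import re

# Hypothetical .po/.pot fragment; the file path is invented for this example.
PO_SAMPLE = '''\
#: modules/example/example.module:12
msgid "May"
msgstr ""

#: modules/example/example.module:40
msgctxt "Long month name"
msgid "May"
msgstr ""
'''


def parse_po(text):
    """Collect references, context and source string per blank-line-separated entry."""
    entries = []
    for block in text.strip().split("\n\n"):
        entry = {"references": [], "msgctxt": None, "msgid": None}
        for line in block.splitlines():
            if line.startswith("#:"):
                # Reference comment: one or more "file:line" tokens.
                entry["references"].extend(line[2:].split())
            else:
                match = re.match(r'(msgctxt|msgid) "(.*)"$', line)
                if match:
                    entry[match.group(1)] = match.group(2)
        entries.append(entry)
    return entries


for entry in parse_po(PO_SAMPLE):
    print(entry["msgid"], entry["msgctxt"], entry["references"])
```

The two entries share the same `msgid` and differ only in `msgctxt`, which is what lets a translator give them different translations.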
Comment #26
hass: #1933082: Add context to short action names
#1933072: Add translatable string context to Views module name
Comment #27
hass: Uyghur (here in core it means the language) from _locale_get_predefined_list(). http://en.wikipedia.org/wiki/Uyghur has listed:
So, context could be (including my previous linked examples):
Comment #28
manarth: An English example, close to our developer heart: table is (usually) a noun, but do we mean database table, dining table, table of contents…
Comment #29
Globalbility: I am building a new website and have a probably rare use case for translating strings, so I'm adding it here both to get tips on how to use string context in the best way and to give input on the different use cases:
I have a button called "Download" that lets people download a poster. For some languages the word "Download" simply cannot be translated without a context. E.g. for Chickasaw the string would be translated to something like "Push here to get the poster", and different translations would be needed for different contexts (depending on what you want to download). So I will simply add "Poster" as the context:
t('Download', array(), array('context' => 'poster'));
For other languages only one translation is needed, so I am working on adding a context fallback to the language fallback module (#2002694: Add context fallback). This way, I can translate "Download" once for some languages and several times for other languages when needed. This function might be useful in other situations too. E.g. it could give different fallback priority to different types of string context (word classes, pronoun forms, objects, names, etc.).
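The fallback idea described here can be sketched as a two-step lookup: first the context-specific translation, then the context-less one. This Python model only illustrates the behaviour as described in this comment; the function name, the catalogs and their entries are assumptions and are not taken from the patch in #2002694.

```python
# Sketch only: a context fallback lookup with hypothetical names and data.
def translate_with_fallback(catalog: dict, source: str, context: str = "") -> str:
    """Prefer the context-specific translation, then the context-less one."""
    if (context, source) in catalog:
        return catalog[(context, source)]
    if ("", source) in catalog:
        return catalog[("", source)]
    return source  # untranslated: show the source string


# A language that needs only one translation stores the context-less entry.
german = {("", "Download"): "Herunterladen"}

# A language that needs a phrase per download target stores one entry per
# context. The value is a placeholder, not a real translation.
other = {("poster", "Download"): "<phrase for downloading the poster>"}

print(translate_with_fallback(german, "Download", "poster"))
print(translate_with_fallback(other, "Download", "poster"))
```

With this order of lookups, translators for most languages translate "Download" once, while languages that need the distinction can add entries per context.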
Comment #30
Sutharsan: I've been through the process of creating issues to add context several times now. As a translator I have also come across various contexts. I use these rules:
Instead of compiling a list of standard contexts, as some have tried here, we should stick to guidelines and good examples. You only have to look at the list of current contexts to understand that it is impossible to come up with contexts up front. Only when we encounter translation problems due to lack of context can we come up with good contexts.
Comment #31
Sutharsan: @pounard mentioned in #1429822: Wrong localization context usage the gettext comments on context, which I find valuable for this discussion:
Source: http://www.gnu.org/software/gettext/manual/gettext.html
Comment #32
Sutharsan
Comment #33
jhodgdon: I looked through the proposed guidelines for string contexts that are in the issue summary, and also at the existing docs page https://drupal.org/node/1369936. Some comments:
- It would be clearer if some examples of good vs. bad context were shown, illustrating the guidelines.
- There are two "bad" items about not using the module name. One is probably enough. :)
- There are some grammatical and typographical issues (I've made an edit).
- The last "good" item says that you should only have a context on the "deviating" string. But that seems wrong to me. To use the example in the docs page, if "Order" is unclear, it seems to me that all instances of "Order" should have context, because how would a translator know what the "dominant" version is supposed to mean without the context being there? I think if you have a string that needs context, then each version of it should have a context?
- Maybe we need a guideline saying to check localize.drupal.org to see if a string you are putting into your module already has several choices of context defined, and pick an existing one if so?
- The last item in "bad" I don't understand at all what it means... "Don't use functional part of the module where the string is used (e.g. "Block"), function name or php class name. This is not re-usable and this context may change over time." ?!? Really I have no idea what it means.
Comment #34
Sutharsan: Great idea, learn by example.
The first one is about the context "module name", the second about a context like "commerce". But apparently we should be clearer about this.
It may seem difficult for a translator to know what the dominant usage of a string is, but in practice we are managing quite well. Most translators know Drupal and simply know the dominant usage of a word. But most of all, adding a context for the deviating meaning only is the practical approach. Take as an example the string 'Block'. You just know that the dominant usage of 'Block' is in the context of a Drupal block. But now the Private Message module uses 'Block' in the meaning of preventing access (#2160591: Allow translation of Block with right context). It would be practically impossible to add a context to all 'Block' strings, including old versions of modules. All translations are shared, even with Drupal 5 modules. If we limit the context to the deviating meaning, we keep it simple.
True, but with the remark that the context list has several bad examples. We currently cannot filter it, and cleaning it up is impossible since strings of old releases are included too.
And I find it hard to explain too ;) I think it refers to what is described in the quote in #31. Some bad examples: context 'page title' [1], 'json_error' [2]. The string may not be used as a page title or JSON error in the future ("may change over time"), and when this string is used in a different place, not as a page title or JSON error, we may still need this different translation.
[1] https://localize.drupal.org/translate/languages/nl/translate?context=Pag...
[2] https://localize.drupal.org/translate/languages/nl/translate?context=jso...
Comment #35
droplet: Coming from https://drupal.org/comment/8525745#comment-8525745.
It has been 3 to 4 years since D7 introduced contexts. If we take a look at "REAL WORLD" usage, no module has added a context to "Weight".
That shows very clearly how maintainers think when they code modules: if they don't need X.Y.Z, they will never add it. Of course you will tell me to provide a patch. Right. I can patch ONE module and wait half a year or so, but I am not able to patch 10 or even 100.
Please consider adding contexts to all Drupal-specific common words, e.g. "Weight" (I think 99% of other systems use the word "Order" or "Sort" instead).
Thanks.
Comment #36
hass: I think we need to convince the developers, however hard it is, or write a doc page and just point them there :-). Weight is really a great example. I had others like "state"; in my case it meant "territory of a country", and after 6 months (OMG) we added this as context. If someone does not understand that "weight" is bad without context, they should step back and let others become co-maintainers to get this stuff fixed. Seriously.
Comment #37
Sutharsan: @hass I have had many (issue) discussions with developers regarding context. In my experience developers need a few things:
@droplet, developers will indeed only add solutions if they think there is a problem. I don't want them to solve problems that do not exist; that would lead to bulky and unmaintainable code. Adding a context to a string's default meaning is just the same thing. It leads to bulky code that does not solve real problems. That is why I propose to only add the context to those strings that represent the minor use case. In your example, don't apply it to weight in the meaning of sort order, but to weight in the meaning of mass. In Drupal 8 core alone, 'Weight' is used 53 times for sort order and never for mass. But let's continue the discussion on specific strings in their respective issues.
Comment #38
Sutharsan: @all Let's try to finalise this discussion and come to sensible and acceptable guidelines and examples. I propose to have a BoF discussion at Developer Days in Szeged two weeks from now.
Comment #39
droplet: @ALL
D8 is going to be released next week; are there any new guidelines (policy)?
There's no perfect world. I believe we need some action instead of doing nothing.
We build frameworks, modules and APIs to solve common problems. To me, `Strings Context` is a special API layer in CORE.
Thanks All.
Comment #46
Joachim Namyslo: I believe so too, and in my opinion there is even more we should do on our way to D9. Maybe you think that's not necessary anymore, or maybe there is no time and money for this, but we should at least talk about it a bit more until D9 alpha and beyond.
https://www.drupal.org/project/l10n_server/issues/3000298
Comment #47
M-Schmitt: I believe we need to make translation contexts useful in order to make translating easier. Very few of the existing contexts have been of any help with my translations.
So we should create a list of criteria that contexts have to meet and then create helpful contexts. Lastly, module maintainers and developers need to be made aware of those contexts.
What I have come up with over the past few days is the following.
Criteria for contexts from the OP and my own thoughts:
Contexts that would be very useful for me:
Contexts that might be useful:
Once we have agreed on some contexts, those should be made public in the WeeklyDrop newsletter, DrupalCon presentations and on the documentation pages.
Comment #48
jhodgdon: This seems like a very good idea. The current information on api.drupal.org about contexts is not very long, just a section on:
https://api.drupal.org/api/drupal/core%21lib%21Drupal%21Core%21Language%...
So if this information is added to the documentation, we should definitely link from that api.drupal.org topic to the more complete documentation.
Comment #49
Balu Ertl: @M-Schmitt thanks for taking the time to collect such a concise list of possible needs. I agree with all of your suggested categories, with some minor additions:
It would be better to call it "project_name": a theme, a distribution, and a module all count as a "project" in Drupal's ecosystem.
could be even more sophisticated by filtering the 3 main audience types that Drupal handles:
This would make it possible to focus translators' limited free time first on the UI strings that the widest user base (anonymous visitors) is facing.
And one more reason why properly aligned string context usage is important: from a technical viewpoint, the .po file format is well prepared to handle this kind of meta information. Therefore, if we plan to involve any CAT tools in the future, this string context info will be very valuable.
Working out a common consensus and promoting the agreed directions across all module maintainers may seem a huge challenge at first sight, but I totally share your opinion that this would be an almost revolutionary improvement. Sorry for my short reply on such an important topic; I would probably have a lot more ideas if I dug up my notes from the last 3-5 years. I'll try to find some time for localization this spring or summer.
Comment #50
M-Schmitt: @balu-ertl I really like your suggestions and they would definitely improve the translator experience. The names for the contexts are of course up for debate.
Funnily, I just came across this post after reading yours about different audiences: https://www.drupal.org/project/paragraphs/issues/3044103#comment-13048702 . So that's definitely a useful idea if project creators and maintainers use those contexts.
jhodgdon also commented on contexts in the #localize-german channel on drupalchat.me. I'll just paste her replies in here, so that everyone can see them:
So that's something to consider.
Comment #51
Balu Ertl: I agree, technically speaking it is indeed a coding-related issue. However, Drupal translators should first come together and agree on a common "wishlist" of what to require from developers. (Otherwise it would be a very unfortunate situation if some of us started to promote new directions to module maintainers and craft patches to replace string contexts while another portion of translators vetoed some of these agreements.) I know this kind of democratic/meritocratic decision is slow to reach a final state, but at least we should start somehow.
Maybe we can post a call in both the Translations and Internationalization groups, inviting people to share their thoughts on the topic here:
Also, advertise through the related DrupalChat/Slack channels as @Jennifer suggested above. Or maybe even get the DA to post from an official Twitter account? Personally, I wouldn't hesitate to write one by one to the Localization team admins through their personal contact forms to hear from them.
Step 0.
Revitalize translator community of Drupal :)
Comment #52
hudri: I'd like to share one observation I've made about translation contexts with Drupal and the German language:
Wrong translations due to missing context come (almost) exclusively from single-word strings (or "noun verb" strings with the verb being a "general purpose verb" like "set" or "edit").
Example:
In English the noun "order" has the meaning of sort/weight and the meaning of an order in an online store. In German those are two completely different words, and we have those errors in our own custom modules despite the fact that all of us speak English and German.
But we haven't had any problem with missing translation context as soon as we had a two-word string like "order alphabetically" or "send order".
So my proposal is to make context mandatory if the source string is a single word.
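A rule like this could be checked mechanically. The Python sketch below shows one possible form of such a check; the function name `needs_context` and the context label "Sort order" are hypothetical, and the check covers only the single-word case, not the "noun verb" strings mentioned above.

```python
def needs_context(source: str, context: str = "") -> bool:
    """Flag a source string that is a single word and has no context."""
    return context == "" and len(source.split()) == 1


# Hypothetical (source, context) pairs as they might be extracted from code.
strings = [
    ("Order", ""),                # single word, no context: flagged
    ("Order", "Sort order"),      # single word with a context: fine
    ("Order alphabetically", ""), # two words: fine under this proposal
]

for source, context in strings:
    print(source, repr(context), needs_context(source, context))
```

Such a check would flag every bare single-word string, including unambiguous ones, so it is stricter than adding a context only to the deviating meaning.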
Comment #53
jhodgdon: That is an interesting proposal! I think the vast majority of single words in English are unambiguous and would not really require context... but maybe the majority of single words in English that are used alone in the Drupal UI do?
Comment #59
Joachim Namyslo: We should address this issue very soon. With Drupal core now 100 % translated into German, I can tell you that it is very frustrating that it is still not possible to make sure Drupal is 100 % German after using the standard installation profile.
Another thing is that we have to make screenshots for the User Guide containing as few English strings as possible. We can do that by overriding strings locally prior to creating screenshots, but to be honest that feels like lying to the end user in terms of good user experience.
We need to talk about this issue more often to enhance the experience for non-English end users. Since some of us are working on a new localize.drupal.org, we should take another look at this to fix it.
I am absolutely aware that there are some other issues preventing Drupal from being 100 % translated when you install it in a language other than English, but good guidelines for developers would help to prevent such problems in the future.
Comment #61
smustgrave at Mobomo
Comment #62
tsotoodeh: In my opinion, documenting context strings and implementing a filter based on this type of string is a necessity. Without proper documentation, finding context strings is not an easy task.
Such a situation leads to redefining what is already at hand. An example is the date format string (l, F j, Y - H:i) required on a multilingual website: many people create and customize a new format without knowing that the format string could be translated.