Hello.
I represent the members of the Russian translation team. Today we discussed how complex it is to add a new context and the problems this causes. The major problem is that the process is so unclear that our team has so far suggested 0 contexts, even though we constantly find new strings that obviously need to be split into contexts.

Solution

So far we have found only one solution: to write a script that helps translators. It is supposed to automate the creation of issues for each project that has a string found to be context-dependent. The main idea is the following:
A translator gives the string ID and the list of contexts to suggest.
Then the script:

  1. loads the localize.drupal.org (ldo) page for that string and parses it
  2. gets the list of the projects related to this string
  3. determines the last stable 7.x+ version of each module
  4. generates several patches for it: one patch for each context (to let module authors decide which context suits them best)
  5. generates the text for the issue, including the link to this page and an explicit warning that the generated patch is only presumed to be correct: instead of applying it, the author should use it only as a reference and manually define the context for each occurrence of this string in their module
  6. and, finally, creates a bunch of issues with patches attached (1 module = 1 issue) under our team's "bot account".
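
Step 4 could be sketched roughly like this in Python. This is only an illustration under assumptions: the function names (add_context, make_patch, make_patches) and the sample file are made up, and only the simplest t('...') calls with no extra arguments are handled. The one thing taken from Drupal 7 itself is that t() accepts a 'context' key in its third (options) argument.

```python
import difflib
import re


def add_context(source, string, context):
    """Rewrite plain t('string') calls so that they pass a context.

    Hypothetical helper: only single-argument t() calls are matched. Calls that
    already pass arguments or options are left for the module author. The
    context name is assumed to contain no quote characters.
    """
    pattern = re.compile(
        r"""(?<![\w>$])t\(\s*(['"])""" + re.escape(string) + r"""\1\s*\)"""
    )

    def replace(match):
        quote = match.group(1)
        return "t({q}{s}{q}, array(), array('context' => '{c}'))".format(
            q=quote, s=string, c=context
        )

    return pattern.sub(replace, source)


def make_patch(path, source, string, context):
    """Return a unified diff that adds one suggested context to one file."""
    patched = add_context(source, string, context)
    diff = difflib.unified_diff(
        source.splitlines(keepends=True),
        patched.splitlines(keepends=True),
        fromfile="a/" + path,
        tofile="b/" + path,
    )
    return "".join(diff)


def make_patches(path, source, string, contexts):
    """One patch per suggested context, so the module author can choose."""
    return {c: make_patch(path, source, string, c) for c in contexts}


if __name__ == "__main__":
    # Made-up module source, for illustration only.
    example = "<?php\n$title = t('State');\n"
    for patch in make_patches("example.module", example, "State",
                              ["Geography", "Status"]).values():
        print(patch)
```

Every occurrence in a file gets the same context here, which is exactly why the generated patch can only be a reference and not something to apply blindly.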

Solution's disadvantages

This is the best solution we have found. But it's kind of "dirty" in some ways:

  • drupal.org doesn't like it when its pages are parsed by scripts.
  • some strings are used by a really large number of modules (50+ or even 100+), so the script would need to create a corresponding number of issues. drupal.org doesn't like that either, especially if it's done under a bot account.
  • a better solution would be to add a similar feature to l10n_server itself, but none of us is experienced enough to write such a submodule for Localization Server.

So here we are, asking for help. How can we overcome the problems described? Or maybe we're looking in the wrong direction, and there's a much more efficient and "clean" way to do our task?

Comments

andypost’s picture

A lot of work is happening now in D8 to provide contexts for strings. It would be awesome to find a set of contexts that could be reused (a menu-link context was committed to translate the 'Extend' menu item).

Related issues:
#2114069: Routing / tabs / actions lack way to attach context to UI strings
#2119551: Add string context support to menu system
#2120235: Regression: routing / tabs / actions / contextual links lack way to attach replacement arguments to UI strings

Lex-DRL’s picture

Yep, D8 (as expected) will be much better than D7. But we're talking about D7 now.

Gábor Hojtsy’s picture

I agree that a way to easily submit an issue for a string would be useful. Not only for contexts, but also if you find a typo in a string, or it's hard to translate for lack of context, or it contains all kinds of obscure HTML, etc.

Some of the reasons it was not implemented are the ones you listed. E.g. I'm not sure submitting issues for all the uses of the string is useful or scalable. Also, contexts need to be made up by humans, and oftentimes they are new ones which do not exist yet. E.g. Drupal 8 has the string "Extend" displayed prominently in the admin menu. In English this may mean "extend length of time", "extend length of distance" or "extend with components" and even "extend my best wishes". The Drupal 8 menu means the component/module one. That is not a pre-existing context. So the context needs to explain what the word means to help translators. Picking the right context involves human choices. That cannot really be automated.

The issue submission may be automated, but that has the problems listed, mostly scalability :/

podarok’s picture

Contexts are aware of the module (file) where t() is invoked.
The filename (menu.module or location.module) is enough to explain what the word means and to help translators. This can be automated, because 100% of context collisions involve strings of 1 or, very rarely, 2 words; that is ~1-3% of all the strings in all the code on drupalcode.org.
I wrote some code that changes t() to automatically create a context based on the filename from which t() is invoked.
Tonight I'll post a patch for review, after looking at how l10n_server gets strings from the code.

Gábor Hojtsy’s picture

@podarok, sounds like that would lead to you needing to translate the same string again based on how many files it appeared in. Eg. look for the string "Operations" or "Home" on localize.drupal.org and count how many times you would need to re-translate it based on the list of occurrences displayed.
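
To make the concern concrete, here is a small Python sketch that counts how many separate translations a string would need with and without filename-based contexts. The occurrence list and file names are invented for illustration, and translation_units is a hypothetical helper, not anything from l10n_server.

```python
from collections import defaultdict


def translation_units(occurrences, context_for):
    """Count distinct contexts per string, i.e. how often it must be translated.

    occurrences: iterable of (string, filename) pairs.
    context_for: function (string, filename) -> context name.
    """
    contexts = defaultdict(set)
    for string, filename in occurrences:
        contexts[string].add(context_for(string, filename))
    return {string: len(names) for string, names in contexts.items()}


# Invented occurrences, for illustration only.
occurrences = [
    ("Operations", "module_a.admin.inc"),
    ("Operations", "module_b.admin.inc"),
    ("Operations", "module_c.module"),
    ("Home", "module_a.module"),
    ("Home", "module_d.module"),
]

# No context at all: every string is translated once.
no_context = translation_units(occurrences, lambda string, filename: "")

# Filename as context: one translation per file the string appears in.
per_file = translation_units(occurrences, lambda string, filename: filename)

print(no_context)  # {'Operations': 1, 'Home': 1}
print(per_file)    # {'Operations': 3, 'Home': 2}
```

With real data the per-file count would be the length of the occurrence list shown for the string on localize.drupal.org.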

podarok’s picture

#9 Yup, but it is not a problem, it is a feature, and the translation can easily be automated.

Lex-DRL’s picture

Hm... looks like I didn't explain my suggestion clearly enough.
I didn't mean to automate the generation of the contexts themselves. No, new contexts are suggested by a translator when a string collision is found. This translator should then also discuss the string with others from the team to choose the best context suggestions for it.

I only suggested automating the process of issue creation for all the projects related to this string. Suggested context names need to be entered manually. By a human. After discussion.
There's no need to generate tons of extra contexts; they should be added only when really needed.

Also, generating contexts based on the module (or module file) has one more problem: it's possible that a single module (or even submodule) uses the same string several times, with a different context each time.

podarok’s picture

It's not tons... it's only 1% of all the strings that are not coded with a context.

Lex-DRL’s picture

Anyway, I don't think it's a good idea to generate the context names themselves. The reasons are above.

andypost’s picture

I suppose it's better to figure out a list of common contexts, as we have for cache and query tags.
And it's better to keep the list short so that the filter form stays useful.

Lex-DRL’s picture

#15
Yes, and that's why I'm sure each context name has to be discussed first, before launching this "mass issue creation" script.
But I don't think it's possible to cover all the multiple-meaning strings with only "common" contexts, no matter how detailed they are. Yes, contexts like "menu" and "button" should be reused as many times as possible.
But there are strings which simply have several meanings by themselves, like "state" (as a part of a country and as in "checkbox state") or "won" (a form of the verb "win" and a currency).
Common contexts will be found as a byproduct of creating contexts at all; that's obvious. But I wouldn't say that searching for those common contexts is itself a primary goal.