We had a phone call with the following folks to figure this out and discussed concerns raised in #1448330: [META] Discuss internationalization of configuration:

- Gábor Hojtsy, D8MI lead
- Greg Dunlap, CMI lead
- Wolfgang Ziegler, Entity API master
- José Reyero, i18n module master
- Francesco Placella, Multilingual field/entity master
- Károly Négyesi, Super core developer
- Alex Bronstein, Super core developer
- Daniel Kudwien, Super core developer
- Angie Byron, Super core developer

The overall agenda was to come to firm agreement on how to approach storing multilingual configuration data. Most but not all people on the call agreed to this setup.

Defining translatable configuration properties

We evaluated about 7 different variations of how multilingual configuration could be stored. The approach we decided on: for any property that is translatable, store the per-language values as sub-keys keyed by language code. For example:

core/modules/contact/config/contact.category.feedback.yml

# This is a machine name. It should not be translated, even though it's a string.
name: feedback

# Category, on the other hand, is something shown to end users, so should be shown
# in their language. Subkeys hold language specific values.
category:
    en: 'Website feedback'
    # Site-specific configuration file might also contain:
    # de: 'Internetseitenrückmeldung'
    # fr: ...

# These values are just standard, non-translatable configuration.
contact_threshold_limit: 5
contact_threshold_window: 3600

This allows the configuration system to load the entire configuration object, along with all of its related translations, all at the same time and keep them together.

Adding multilingual-ness to the config API

To set and retrieve multilingual values, an optional "sub-key" argument will be added to $config->get() and $config->set() so you can do:

// Retrieve feedback category configuration.
$config = config('contact.category.feedback');

// Set a non-translatable value.
$config->set('site_mail', 'bananas@example.com');

// Set a translatable value; specify the language code.
// Q: So any module setting configuration does need to know whether a property is translatable or not? -- Yes.
$config->set('category', 'Website feedbackorama', 'en');
$config->save();

// Retrieve the German version of category.
// @todo: Open question remains about what to fall back to in case 'de' is not defined.
// @todo: Need to have the original submission language defined in the config file too.
$config->get('category', 'de');
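
To make the proposed sub-key argument concrete, here is a minimal in-memory sketch. SketchConfig is a hypothetical stand-in, not the actual config class, and it assumes that get() simply returns NULL for a missing sub-key, since the fallback question above is still open.

```php
<?php

/**
 * Hypothetical in-memory stand-in for a config object (not Drupal's class).
 */
class SketchConfig {
  protected $data = array();

  public function __construct(array $data = array()) {
    $this->data = $data;
  }

  // Returns the stored value, or the sub-key value when $subkey is given.
  // Assumption: a missing key or sub-key yields NULL (no fallback logic).
  public function get($key, $subkey = NULL) {
    if (!isset($this->data[$key])) {
      return NULL;
    }
    $value = $this->data[$key];
    if ($subkey === NULL) {
      return $value;
    }
    return (is_array($value) && isset($value[$subkey])) ? $value[$subkey] : NULL;
  }

  // Sets a plain value, or one sub-key (language) value when $subkey is given.
  public function set($key, $value, $subkey = NULL) {
    if ($subkey === NULL) {
      $this->data[$key] = $value;
    }
    else {
      if (!isset($this->data[$key]) || !is_array($this->data[$key])) {
        $this->data[$key] = array();
      }
      $this->data[$key][$subkey] = $value;
    }
    return $this;
  }
}

$config = new SketchConfig(array(
  'name' => 'feedback',
  'category' => array('en' => 'Website feedback'),
));
$config->set('category', 'Internetseitenrückmeldung', 'de');
```

In this sketch, calling get('category') without a sub-key returns the whole list of language values, which is only one of the behaviours under discussion.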

General metadata

We want to keep the configuration system focused on actual configuration key/value storage and management, not requiring it to have deeper knowledge of other things that might layer on top of configuration, such as language. To facilitate this, we talked about a hook for layering metadata on top of configuration data, like so:

/**
 * Implements hook_config_info().
 */
function hook_config_info() {
  return array(
    'contact.category.feedback' => array(
      // Identify the 'category' property as both translatable and required for validation.
      'category' => array(
        'translatable' => TRUE,
        'required' => TRUE,
      ),
    ),
  );
}

This metadata is NOT required for the above methods to work, because as discussed, the module code needs to be aware of translatable properties explicitly.
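
As a rough illustration of how a higher-level module (a translation UI, say) might consume such metadata, the sketch below looks up the 'translatable' flag. The helper name sketch_config_is_translatable() and the way the metadata array is passed in are assumptions made for this example only; nothing like it was agreed on the call.

```php
<?php

// Hypothetical metadata, shaped like the hook_config_info() return value above.
$info = array(
  'contact.category.feedback' => array(
    'category' => array('translatable' => TRUE, 'required' => TRUE),
  ),
);

/**
 * Hypothetical helper: is $key of config object $name declared translatable?
 *
 * Properties without metadata are treated as non-translatable.
 */
function sketch_config_is_translatable(array $info, $name, $key) {
  return !empty($info[$name][$key]['translatable']);
}
```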

Pros/cons

This is a good thing from the standpoint of the following:

a) No need to duplicate configuration data in multiple files, or deal with messy merging of values from multiple files, as in the approach of something like core/modules/contact/config/contact.category.feedback.de.yml.
b) Switching to a different language during run-time is easier, as in the case of sending contact e-mails to a batch of 200 users, each of whom might have their own different preferred language.
c) It is similar in API/storage to how we store entity field values, except we do not force a 'und' key on all language independent values.
d) Ensuring the configuration object remains idempotent. In other words, the following should always get you the identical object back, regardless of current language context:

$config->load('foo.settings');
$config->get('bar', 'de'); // yields: "beer"
$config->save();
$config->load('foo.settings');
$config->get('bar'); // yields: "whiskey"
$config->get('bar', 'de'); // yields: "beer"

The main disadvantages with the approach are:

a) The source file and destination file will not match in terms of their keys (e.g. contact module's shipped YAML will only define an "en" value, but a given site might have 4-5 languages defined). A "Restore from defaults" operation would destroy data. However, we looked into the current behaviour in Drupal 7 and below with similar operations and found that the same thing occurs today. So for now, we plan to punt on this issue so that we can move forward, even though we might conceivably look into a more robust solution later.

b) Arguably, this pushes multilingualness of the system into the configuration writers' faces, when it is not used on the majority of Drupal sites. However, we were able to debunk this concern by pointing out that an author of a custom module for an English-only (or French-only, for that matter) site that is never intended to be translated wouldn't have to define sub-keys at all. A simple category: Website feedback would do.

c) This doesn't leave any room to implement other configuration 'realms/domains' later with the same system. We discussed that context- and domain-specific configuration overrides are higher level topics, and language is inherently tied to configuration.

Summary

This approach, in short:
- Keeps knowledge about language out of the config system (which remains a simple key/value store).
- Does not attempt to merge or provide language-specific values for config objects that are not language-specific in the first place (saves performance; i.e., no baked-in language concept is enforced on all config objects, whether they need or want it or not).
- Leaves the decision of what is language-specific or translatable to the actual implementation layer/logic (e.g., module) - which has to load/get/set/save affected keys specially either way.
- Does not attempt to solve the problem space of context/domain-specific configuration "overrides" at all (which is different to begin with).
- Retains and ensures idempotence for configuration objects; i.e., config()->load()->save()->load() does not destroy any values that might be language-specific.
- Leaves the problem space of providing meta-data to declare certain keys as translatable in a config object to a separate higher-level module/subsystem (which should declare required, multiple/cardinality, etc at the same time) and keeps that additional business logic out of the config system, which continues to simply perform CRUD operations on configuration objects (and handle synchronization between storages).
- Attempts to provide an interface for developers that is consistent with the entity API (as much as possible).


Comments

Gábor Hojtsy’s picture

Title: Multlingual/translatable configuration » Multlingual/translatable configuration [OPTION A]
Issue tags: +D8MI, +sprint

Putting this on the sprint and marking up for D8MI and titling to tie up with #1616594: META: Implement multilingual CMI which is OPTION B.

Gábor Hojtsy’s picture

Issue summary: View changes

Post call summary

Gábor Hojtsy’s picture

Issue summary: View changes

Minor layout tweaks

Gábor Hojtsy’s picture

Issue summary: View changes

Add note on original language

chx’s picture

Seems to me that in order for just get('foo') to work when there is a translation we need to provide something in the config file itself to indicate to the config system it needs to shop for the default language. Something like:

category:
  _translated: yes
  en: 'Website feedback'

Gábor Hojtsy’s picture

@chx: we'd rather need the original language identified, that should be enough in itself:

category:
  _language: en
  ar: '....'
  en: 'Website feedback'
  fr: '....'
  de: '....'

chx’s picture

We were discussing with Gabor how you get the value back after you run $config->set('category', 'translated value', 'fr');. Do we mandate $config->get('category', language_default()->langcode)? Do we add a wrapper to do that? But we do not want the config system to know about i18n. So perhaps change the _language key to _default and have $config->get('category'); load that and let the module author specify a language? But most module authors do not want to care about language. They wrap their strings in t() and don't care. So #1617334: Multilingual/translatable configuration reusing t() for i18n config() [OPTION C] was born.

chx’s picture

A short comparison between option A and C by Gabor, since they are so similar and at the same time different :)

Characteristic: OPTION A / OPTION C

- Storage: Translatable config keys are list of values with a key designating the original key. Language is not specifically the target for this feature.
- Set: $config->set() method gets generic subkey argument to make it easier to set a specific subkey (language) value.
- Get, OPTION A: $config->get() method gets generic subkey argument to access subkeys direct. Fallback not defined, and if subkey is not passed, the fallback logic would need to be inside the CMI system (or get() would need to return an array, similar to C and an outside wrapper would be needed).
- Get, OPTION C: $config->get() will return the list as-is (no additional argument). t() proposed as wrapper to pick from the list (or individual values can be accessed from the array without the wrapper).

chx’s picture

Battle plan: set($key, $value, $subkey = '', $type = 'language') and get($key, $subkey = NULL, $type = 'language'), plus a function called config_get_language or something like that. If you get a config array which has _default set, we call the config_get_$type function, which handles the whole fallback / defaults logic that option C has in t(). If you pass FALSE for $subkey, we skip that so the translation UI can grab the whole thing. Everything handled.

chx’s picture

Status: Active » Needs review
FileSize
3.22 KB

chx’s picture

FileSize
3.24 KB

Sorry I forgot the $subkey FALSE case. Usage examples:

$config = config('contact.categories');
$config->set('somecategory', 'french value', 'fr');
$config->set('somecategory', 'spanish value', 'es');
// This gets somecategory back in the current language
$config->get('somecategory');
// This gets somecategory back in French because a) German version doesn't exist b) fr was the first saved hence the default.
$config->get('somecategory', 'de');
// This gets an array of all translations.
$config->get('somecategory', FALSE);

Berdir’s picture

Status: Needs review » Needs work

I like the idea. Will of course need the usual fluff (tests, proper api docs, ...)

- Can you delete a translation? What happens if that was the default?
- So to make a variable translatable, you would need to do "$config->set('something', $value, drupal_container()->get(LANGUAGE_TYPE_INTERFACE)->langcode)", right? And each module will itself be responsible for deciding which variable is translatable and which isn't?

chx’s picture

Yes you can delete a translation. If it was the default? You tell me. I am no i18n coder. $config->set('something', $value, $the_language_the_value_is_in); I have some doubts whether it'll be the current langcode. It might or might not be. But yes, you need to specify a langcode with set.

cosmicdreams’s picture

+++ b/core/includes/config.inc
@@ -86,3 +86,21 @@ function config_get_storage_names_with_prefix($prefix = '') {
+  if (isset($data[$langcode])) {
+    return $input[$langcode];
+  }

use of $input, you've likely already fixed this via IRC comments

+++ b/core/lib/Drupal/Core/Config/DrupalConfig.php
@@ -86,7 +86,7 @@ class DrupalConfig {
+  public function get($key = '', $subkey = '', $type = 'language') {

@@ -131,14 +139,29 @@ class DrupalConfig {
+  public function set($key, $value, $subkey = '', $type = 'language') {

I thought we didn't want to do this. I thought the specialized nature of our config was only for the support of multi-language. This allows config to be specialized for whatever reason you want.

like config->get('variable', 'development')

chx’s picture

Status: Needs work » Needs review

I am merely setting CNR to solicit more opinions.

#11: I feel that not hardwiring language is a good idea. It's only a wrapper. Unlike many other ideas, this one is a gentle change. Yes, you can do config->get('variable', 'development'), but it will still be the same configuration storage and all.

chx’s picture

FileSize
5.53 KB

Here's some fluff as Berdir asked. 8 passes.

chx’s picture

FileSize
5.89 KB

I am told that the patch will fail because I forgot to git pull and config tests moved.

sun’s picture

I don't think there's a point in caring for other "types"/contexts.

The example that @cosmicdreams gave is irrelevant here, since that means a realm/domain-specific override, which is an entirely different can of worms.

Aside from that, this patch and entire approach should be considered "pre-alpha prototype material".

We need and want to know about any possible concern or question that you might have. It is perfectly possible that the concern you will bring up might invalidate this approach entirely. That's why we're here, that's why we're discussing and exploring this approach. So please do not hesitate to ask anything.

chx’s picture

Huh? This is not "pre-alpha prototype material". This patch perhaps could use more documentation, but aside from that, I consider it done. It does everything I am told a multilanguage extension to CMI should do: stores and retrieves translations, handles fallback logic and allows for getting all translations in one call. It satisfies my CMI concerns: it's extremely simple (23 LoC added to DrupalConfig.php and some of that is comments) and the Config directory in lib is still not aware of multilingual.

Edit: the only thing I am considering adding is a setTranslateable($key, $value) which would be a 1 LoC wrapper around set($key, $value, drupal_container()->get(LANGUAGE_TYPE_INTERFACE)->langcode).

Status: Needs review » Needs work

The last submitted patch, 1613350_14.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
5.92 KB

When I moved the config test I didn't move the use statement along with it.

Status: Needs review » Needs work
Issue tags: -Configuration system, -D8MI, -sprint

The last submitted patch, 1613350_18.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
Issue tags: +Configuration system, +D8MI, +sprint

#18: 1613350_18.patch queued for re-testing.

Gábor Hojtsy’s picture

Status: Needs review » Needs work
+++ b/core/includes/config.inc
@@ -86,3 +86,21 @@ function config_get_storage_names_with_prefix($prefix = '') {
+/**
+ * Wrapper for choosing the right translation from a config data array.
+ */
+function config_get_language($data, $langcode) {
+  if (!$langcode) {
+    $langcode = drupal_container()->get(LANGUAGE_TYPE_INTERFACE)->langcode;
+  }
+  if (isset($data[$langcode])) {
+    return $data[$langcode];
+  }
+  elseif (isset($data['_default']['language'])) {
+    return $data[$data['_default']['language']];
+  }
+  else {
+    return $data['en'];
+  }
+}

Would be good to explain a bit more what this does: it picks the value for the requested language, or for the interface language if none was requested, and falls back to the default and then to English if available.

Why is $langcode not optional?

Sounds like the 'en' value should be checked for existence too. If we are at that point and the 'en' value is not present, we should throw an exception. I mean if your config was hand created and does not have _default, and you request a language that was not present (and 'en' was not present either), this would result in a notice and NULL returned at the moment.

If the config specifies a default but there is no such value, would an exception be in order too? Not sure how strict we are getting now in D8 :)

Finally, we previously used _default as-is for the key value, not a subkey of _default. This might be overly expressive, no? If we want to keep the subkey anyway, Drupal 8 uses 'langcode' for whenever a language code is identified so we should name the subkey langcode. So ['_default']['langcode']. Looking at the code it looks like this would have implications for function naming even.

+++ b/core/lib/Drupal/Core/Config/DrupalConfig.php
@@ -86,7 +86,7 @@ class DrupalConfig {
-  public function get($key = '') {
+  public function get($key = '', $subkey = '', $type = 'language') {

phpdoc was not updated on get().

+++ b/core/lib/Drupal/Core/Config/DrupalConfig.php
@@ -103,14 +103,22 @@ class DrupalConfig {
+    if (is_array($return) && $subkey !== FALSE && isset($return['_default'])) {
+      $function = 'config_get_' . $type;
+      if (!function_exists($function)) {
+        throw new InvalidTypeException('Invalid type "'. $type . '" requested.');
+      }
+      return $function($return, $subkey);
+    }
+    return $return;

The function is supposed to work even if _default is not provided, so why require that for calling the function?

+++ b/core/lib/Drupal/Core/Config/InvalidTypeException.php
@@ -0,0 +1,9 @@
+/**
+ * @todo

Todo.

+++ b/core/modules/config/lib/Drupal/config/Tests/ConfigTranslationTest.php
@@ -0,0 +1,61 @@
+ * @file
+ * Definition of Drupal\config\Tests\ConfigTranslation.

ConfigTranslationTest no?

+++ b/core/modules/config/lib/Drupal/config/Tests/ConfigTranslationTest.php
@@ -0,0 +1,61 @@
+/**
+ * Tests the secure file writer.

This is certainly not the secure file writer :)

+++ b/core/modules/config/lib/Drupal/config/Tests/ConfigTranslationTest.php
@@ -0,0 +1,61 @@
+    // Set the default language to french.
+    $new_language_default = (object) array(
+      'langcode' => 'fr',
+      'name' => 'French',
+      'direction' => 0,
+      'weight' => 0,
+      'default' => TRUE,
+    );
+    variable_set('language_default', $new_language_default);

The functionality does not actually depend on the default language per se. It depends on the interface language. Possibly more elegant would be to language_save() the French as default (which swaps in the variable and properly adds French in the system otherwise too).

chx’s picture

Status: Needs work » Needs review
FileSize
7.8 KB

That @todo stays because the rest of the config exceptions are @todo'd too. The rest has been adjusted.

chx’s picture

FileSize
8.41 KB

And with more and better tests. (dedicated to webchick)

Gábor Hojtsy’s picture

Status: Needs review » Needs work
+++ b/core/includes/config.inc
@@ -86,3 +87,30 @@ function config_get_storage_names_with_prefix($prefix = '') {
+  elseif (isset($data['_default']['langcode'])) {
+    if (!isset($data[$data['_default']['langcode']])) {
+      throw new ConfigException('Invalid default set.');
+    }
+    return $data[$data['_default']['langcode']];
+  }
+  else {
+    if (!isset($data['en'])) {
+      throw new ConfigException('Invalid default set.');

Would be great if the two exceptions were different. The second one is actually about 'No default set and English not available.', right (while the first is indeed 'Invalid default set')?

+++ b/core/lib/Drupal/Core/Config/DrupalConfig.php
@@ -75,18 +82,17 @@ class DrupalConfig {
+   * @param $subkey
+   *   Typically a language code. The meaning of this is determined by the $type
+   *   parameter.

Sounds important to document that if FALSE is passed, the whole array is returned. That is the only way to get the whole array, right?

+++ b/core/lib/Drupal/Core/Config/DrupalConfig.php
@@ -75,18 +82,17 @@ class DrupalConfig {
+   * @param $type
+   *   Same as was used for set(). Determines the wrapper function if a $subkey
+   *   was used for set().

Could be useful to mention the callback is 'config_get_by_$type'. Also, say it's a callback not a wrapper IMHO; it does not wrap anything, it's a named callback, no? :)

+++ b/core/lib/Drupal/Core/Config/DrupalConfig.php
@@ -103,14 +109,22 @@ class DrupalConfig {
+    if (is_array($return) && ($subkey || ($subkey === '' && isset($return['_default'])))) {
+      $function = 'config_get_by_' . $type;
+      if (!function_exists($function)) {
+        throw new InvalidTypeException('Invalid type "'. $type . '" requested.');
+      }
+      return $function($return, $subkey);
+    }

Would be good to have a short explanation above this. Something like:

// If the value is a list and a subkey was provided or
// the list has a fallback default, attempt to invoke
// logic to pick a value.

+++ b/core/lib/Drupal/Core/Config/DrupalConfig.php
@@ -131,14 +145,29 @@ class DrupalConfig {
+   * @param $subkey
+   *   Typically a langcode but can be anything. If the config data is new,
+   *   this will be saved as _default.

Given we store langcode *under* _default, I don't think it's true to say "saved as _default". It is saved with the $type subkey under _default though.

+++ b/core/lib/Drupal/Core/Config/DrupalConfig.php
@@ -131,14 +145,29 @@ class DrupalConfig {
+   *   If a $subkey is used for the save, this will define the wrapper
+   *   function called by get() to handle fallback. Defaults to language and
+   *   rarely needs to be changed.

Defaults to langcode.

+++ b/core/modules/config/lib/Drupal/config/Tests/ConfigTranslationTest.php
@@ -0,0 +1,69 @@
+  function testConfigTranslation() {
+    $config = config('config.test');

All language names under here should start uppercase. french, english, german, etc.

Gábor Hojtsy’s picture

Been discussing this more with @catch. We talked about the entity API similarities. Entity getters do look up the default language value if available (and langcode was not specified) but will not do magic language fallback lookups beyond that. The language based fallback is in the display layer (and uses the whole language configuration chain).

Think about how this will be used. For a translatable site name, the interface language is needed indeed. For a translatable contact category name (used in email subjects), the user's preferred language will be used, which the calling code needs to provide. Then for displaying a field label in an entity display for example, the current *content* language will be needed (not the interface language). So assuming the interface language at all times might not be good. Question is how do we want to resolve that. Option C resolved this by utilizing a wrapper and always returning the whole array data. This patch provides that too, so if we think the interface language (or specific language) are the most common, then we can go with custom logic for other cases.

So the getter has 3 use cases in this patch as far as I see:

// 1. Will fall back through interface language => default value => English (English only provided for convenience if config does not provide default key).
$config->get('key');

// 2. Will fall back through default value => English (English only provided for convenience if config does not provide default key).
$config->get('key', SOME_VALUE_THAT_IS_NOT_A_VALID_LANGCODE_BUT_IS_NOT_EMPTY_EG_TRUE);

// 3. Will get the whole array with all values. This can later be used to pick any value at will.
$config->get('key', FALSE);

3 is akin to how fields are represented in arrays, 2 is kind of how the entity property get()er works if there was no argument (if we ignore the English convenience), 1 is really unique to this patch. However, the invocation pattern for 2 looks pretty awkward, no? 3 is the equivalent getter from option C.

Gábor Hojtsy’s picture

Also

// 4. Lookup for specific language. Falls back as $langcode => default value => English.
$config->get('key', $langcode);

4. is not very consistent with how the entity getter works either. The entity getter does not do fallback if a language was specifically asked for. It gets you the value. It will be NULL if the value was not available in that language.

So basically our logic there is that the getter/setter should get you what you asked for. If you did not provide language, you get the language the value was originally submitted in (_default langcode in this case). Otherwise you just get the value for the langcode provided. If you want to utilize certain fallbacks, you'd do that in render.

As this patch currently stands, if I get a value and it falls back to another language, I cannot tell whether the value was set in the language I wanted. If I then save that value, the config storage changes, since the value is now saved in the requested language.

// This changes what is stored if get() fell back for $langcode, because it was not available.
$value = $config->get('key', $langcode);
$config->set('key', $value, $langcode);

I don't think this is what we expect to happen.
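
The overwrite problem can be reproduced with plain arrays. This is only a sketch: sketch_get_fallback() is a simplified stand-in for the patch's fallback (it omits the English step), sketch_get_strict() is a hypothetical getter with the entity-style behaviour described above, and the stored values are made up.

```php
<?php

// Hypothetical stored value: French is the original language, no German yet.
$stored = array(
  '_default' => array('langcode' => 'fr'),
  'fr' => 'Commentaires',
);

// Getter with fallback (simplified): requested language, then the default.
function sketch_get_fallback(array $data, $langcode) {
  if (isset($data[$langcode])) {
    return $data[$langcode];
  }
  return $data[$data['_default']['langcode']];
}

// Strict getter: NULL when the requested language has no value.
function sketch_get_strict(array $data, $langcode) {
  return isset($data[$langcode]) ? $data[$langcode] : NULL;
}

// Get/set round trip with fallback: the French text is now stored as German.
$with_fallback = $stored;
$with_fallback['de'] = sketch_get_fallback($with_fallback, 'de');

// Round trip with the strict getter: the caller sees NULL and writes nothing.
$strict = $stored;
$value = sketch_get_strict($strict, 'de');
if ($value !== NULL) {
  $strict['de'] = $value;
}
```

With the fallback getter, the round trip silently adds a 'de' entry holding French text; with the strict getter, storage is left unchanged.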

Gábor Hojtsy’s picture

Just had a great discussion with berdir about the idempotence and the 3 ways in which people would need to use this API (for frontend language fallback, backend forms and shared code):

berdir: GaborHojtsy: Hm, isn't $config->get() quite similar to t() as it stands now? (Except of the different fallback mechanism) ?
berdir: GaborHojtsy: no argument defaults to current interface language
berdir: GaborHojtsy: providing a language explicitly falls back to passed string (t()) / default (config)
berdir: GaborHojtsy: not sure why the en fallback is necessary, though

GaborHojtsy: berdir: that was to assume _default langcode en for config files basically
GaborHojtsy: berdir: if you don't say your config default language, it would be en

berdir: GaborHojtsy: which will never happen with set(). So it's only for the case that you manually create a default config file initially, which however still requires that you explicitly specify the en subkey but "forget" the default. right?

GaborHojtsy: berdir: yes
GaborHojtsy: berdir: for this getter/setter, the main problem is that you never know you got what you asked for, and when writing code, I think its good to know what data you are actually working with 
GaborHojtsy: berdir: you can't really do a get/set sequence to change something because you might have gotten something else vs. what you expected
GaborHojtsy: berdir: the only way is to get the raw array and manipulate that (or use set() after looking at that) really with the current patch

berdir: yes
berdir: but how often do you really get a specific language key and then write it back, except when actually translating it?

GaborHojtsy: berdir: on all your admin UIs at least

berdir: GaborHojtsy: not with a specific language?

GaborHojtsy: berdir: ?

berdir: GaborHojtsy: by default, the admin ui will get and set a setting without an explicit language, no? Whatever the widget/UI will look like to translate it (e.g. something similar to i18n_variable's language switcher widget), then it should be in a way that I don't need to care about.

GaborHojtsy: berdir: well, the admin UI will manage values in some language
GaborHojtsy: berdir: even if we assume the built-in UI would always only manage the original config values
GaborHojtsy: berdir: they are in a language
GaborHojtsy: berdir: so with the current option A patch, you'd need to use this variant to get the value for every code that manages the admin form in your thinking:
GaborHojtsy: berdir: $config->get('key', SOME_VALUE_THAT_IS_NOT_A_VALID_LANGCODE_BUT_IS_NOT_EMPTY_EG_TRUE);

berdir: GaborHojtsy: maybe with a slightly shorter constant (;)), but yeah, got it
berdir: because you want to get and set the default, no matter what current language you're in...

GaborHojtsy: berdir: yeah, you can do $config->get('key', TRUE); just wanted to point out it is non-trivial what TRUE means there (where FALSE gets you an array, TRUE gets you the default value)
GaborHojtsy: berdir: then frontend/display/output oriented code would use the ->get('key') thing without a language so the magic can happen
GaborHojtsy: berdir: and any actual translation management code would need to use the whole array ->get('key', FALSE) because that is the only way to get the real data without the fallback uncertainties
GaborHojtsy: and code you want to share between the three, well....
GaborHojtsy: don't know 
GaborHojtsy: berdir: that is why entities carry all their field data, so you can decide later what you need
GaborHojtsy: berdir: option C was attractive because we got that and then we did the language lookup "later"
GaborHojtsy: berdir: because your backend and frontend code would need to use different language fallbacks, the only way to share code between them is to not do the language fallback in the shared code

berdir: GaborHojtsy: yep, with the downside of using the "get current translation" magic. (you still have time magic, but need an explicitly function call to trigger it)
berdir: using = losing

GaborHojtsy: berdir: yeah, so the point is that reusable code would basically need to do the ->get('key', FALSE) lookup and pass around the array

berdir: GaborHojtsy: the example that I used yesterday was that we currently have a site where the client wants a different UA- google analytics code per language. I'm 99% sure that the author of the module would *not* have thought of that, so if there is no ct() call around get(), I have no way to change it per language. when the magic is in get(), I do

GaborHojtsy: berdir: so the question is who would understand this (two separate fallback paths, etc) and would we get people to code proper with this API
GaborHojtsy: berdir: you don't care about your UA settings being overwritten and getting lost? 
GaborHojtsy: berdir: if the module does a get/set/save and the get falls back, your set will overwrite something else
GaborHojtsy: berdir: I mean *their set* in the module
GaborHojtsy: berdir: if the code is not aware that the value can vary, these accidents can easily happen

berdir: GaborHojtsy: yes, I know. I guessed that there would be a more or less easy way to alter the settings form and "fix" it there, assuming that's the only place where it is set. I'm talking about custom code and I don't see another way currently for something like this

GaborHojtsy: berdir: happens now with i18n, it injects values in $conf, and then if a variable_set() save happens, the injected translation can overwrite your original value
GaborHojtsy: berdir: yeah, if you do custom form overrides for all affected forms, you can at least catch UI based changes
GaborHojtsy: berdir: not really solvable on the API level
GaborHojtsy: berdir: and its painful.... 

berdir: GaborHojtsy: true. but with global $conf, you can hack your way such things. with the current $config->get(), it's just not possible

GaborHojtsy: berdir: definitely CMI will need to say something to users of domain module, right, so some kind of overrides will be there anyway, that can still be used in this hacky way
GaborHojtsy: berdir: last I've heard there is still a global override possibility

berdir: ok, possible 

GaborHojtsy: berdir: so my point is that we want to avoid these issues with language support, not implement language support with a grand hack like before
GaborHojtsy: berdir: and for that the code needs to be aware of possible variations
GaborHojtsy: berdir: otherwise you get this get/set/save overwrite hell

berdir: GaborHojtsy: yes

GaborHojtsy: berdir: so there is a full context based proposal with runtime overrides from Jose at http://drupal.org/node/1616594
Druplicon: http://drupal.org/node/1616594 => Multilingual/translatable configuration with Context and Plug-ins [OPTION  => Drupal core, configuration system, normal, active, 2 comments, 3 IRC mentions
GaborHojtsy: berdir: which does solve this by implementing runtime override layers on top of config that are used when picking values; it cannot work around the need for the specific context for the data variation to be passed on when the value is get or set
GaborHojtsy: it attempts to generalize that though

berdir: GaborHojtsy: wondering if, instead of these magic boolean arguments, getDefault($type = 'language') and getAll($type = 'language') would be better for DX for option A

GaborHojtsy: berdir: it's a lot more involved
GaborHojtsy: possibly
GaborHojtsy: berdir: I think the main question is if people will be able to learn that they have 2 language fallback ways to consider between frontend and backend code and they'd need to use the API in a very different way for shared code to satisfy both fallback possibilities (and do fallback / value selection later)
GaborHojtsy: berdir: the appeal of option C was that it forced you to always use the array with get, and you did not really have an option to do otherwise
GaborHojtsy: berdir: so one thing to learn 
GaborHojtsy: berdir: then you still need to invoke the right language fallback later but at least you cannot go wrong in getting the data
sun’s picture

Thanks! This is the exact level of discussion we need to have and what I meant earlier. I highly appreciate @berdir asking all these questions, so thanks, and keep them coming, please! :)

  1. you can't really do a get/set sequence to change something because you might have gotten something else vs. what you expected

    This is extremely concerning.

  2. The entire special/internal _default subkey is concerning.

    When I proposed this variant, I wasn't aware that something like this would be required. It introduces state / behavior within a config object, and that's very concerning. This state information can get outdated or inappropriate/bogus very easily. When that happens, your site will suddenly throw lots of errors. And I'm saying when, not if, because there's nothing that would prevent it from happening, which in turn means that it will happen.

    So this wasn't part of the originally envisioned idea at all. I'd therefore encourage everyone to think about ways out that do not require having such state info in config objects. One possible option would be to perform actual language negotiation/fallback handling based on actual site languages when attempting to retrieve a value (in current language but also for a specific language). I hope there are further possible options.

    If we are not able to resolve that, then I almost guess that I will withdraw my support for this entire idea and will go back to the drawing board instead.

  3. Let's also consider the installation of (module or install profile) default config case on a site that has no English language configured at all — the default config only contains an English value. As long as the config is not edited, the system will basically have to fall back to the English value all of the time, even though there is no English language on the site at all. And if you eventually edit it, then something weird happens:

    1) In the admin form, you either see no existing values, or you see the English values as default values. Both cases sound a bit weird.

    2) When filling in values for your actual site language, those values will (most likely) be written in addition, retaining the English values, even though there is no English language on this site.

    Both of these also apply to the cases A) When you install a new language on your site, and B) When you change the site's default language, removing the previous language (e.g., monolingual site).
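
A rough sketch of the alternative raised in point 2: resolving fallback at read time from the languages actually configured on the site, rather than from a stored `_default` subkey. The function name, the array shapes and the returned structure are assumptions made for illustration only; none of this is existing core API.

```php
<?php
/**
 * Hypothetical helper: resolve a translatable value at read time.
 *
 * $values is the language-keyed array stored in config, $requested is the
 * language asked for, and $site_langcodes lists the languages configured on
 * the site in fallback order. Returns NULL when nothing matches.
 */
function example_resolve_value(array $values, $requested, array $site_langcodes) {
  // Try the requested language first, then every configured site language.
  $candidates = array_unique(array_merge(array($requested), $site_langcodes));
  foreach ($candidates as $langcode) {
    if (isset($values[$langcode])) {
      // Report which language matched, so callers can tell a real
      // translation from a fallback before they save anything.
      return array('langcode' => $langcode, 'value' => $values[$langcode]);
    }
  }
  return NULL;
}

// Default config shipped in English only.
$category = array('en' => 'Website feedback');

// Site with de and hu only: the English value is never picked up.
$on_site = example_resolve_value($category, 'de', array('de', 'hu'));

// Site with de and en: the English value is returned and labelled as such.
$with_en = example_resolve_value($category, 'de', array('de', 'en'));
```

Under these assumptions the first call yields nothing at all, which is exactly the situation point 3 describes for a site without English.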

PS: Also note that I've edited the IRC log in #27 for readability.

Gábor Hojtsy’s picture

@Sun: 2: For _default, we need *some* way to tell which was the original language that the value was submitted in, which is what is being edited on the regular admin form for the config (governed by permissions for that config). It is also useful to identify the "source" language for translation. It is also useful if you want to provide a getter setter that has optional langcode, since you need to tell which langcode to pick. It also makes config consistent with entities, where a source language is also present. The _default is meta-information of sorts about the piece of config being managed to inform workflows, permissions, admin UIs and display.

3: For installation on a non-English site when the source is English, I've heard the CMI config would be copied from the module dir to the site's config. In that copying, t() can be applied to translatable pieces in the config (we can identify them in this option A by _default/langcode/en). For those things that did not have a result from t(), the source will be copied as-is. We'd mark that config as in that foreign language then. If the site does have English, the English config can be copied 1-1. If there is no English on the site, the config should not have values in English as default IMHO.
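
The install-time copy described in point 3 could look roughly like the sketch below. It assumes that a translatable property can be recognised by its `en` subkey, and it uses a callback as a stand-in for t() that returns NULL when no translation exists; both are simplifications, and the function name is hypothetical.

```php
<?php
/**
 * Hypothetical install-time step: copy default config into the site's config,
 * translating English-only translatable values to the site language.
 *
 * $translate stands in for t(); it returns NULL when no translation exists.
 * Treating "array with an 'en' subkey" as translatable is an assumption.
 */
function example_localize_default_config(array $config, $site_langcode, $translate) {
  if ($site_langcode === 'en') {
    // English sites get the shipped config copied 1-1.
    return $config;
  }
  foreach ($config as $key => $value) {
    if (is_array($value) && isset($value['en'])) {
      $translated = call_user_func($translate, $value['en'], $site_langcode);
      // Without a translation the source string is copied as-is, but it is
      // still filed under the site language, so no 'en' key remains.
      $config[$key] = array(
        $site_langcode => ($translated !== NULL) ? $translated : $value['en'],
      );
    }
  }
  return $config;
}

$shipped = array(
  'name' => 'feedback',
  'category' => array('en' => 'Website feedback'),
  'contact_threshold_limit' => 5,
);

// Toy translation source that only knows one German string.
$translations = array('Website feedback' => 'Internetseitenrückmeldung');
$lookup = function ($string, $langcode) use ($translations) {
  return ($langcode === 'de' && isset($translations[$string])) ? $translations[$string] : NULL;
};

$german = example_localize_default_config($shipped, 'de', $lookup);
$hungarian = example_localize_default_config($shipped, 'hu', $lookup);
```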

chx’s picture

I can reopen Option C if you like that better. I think a summary of concerns is this:

  1. A settings form asks for a translation in, say, de but gets en instead.
  2. On save the current en version is copied to de.
  3. en gets changed and now de is out of date.

Now:

  1. If you have an English setting and a German setting and update the English one, the German is outdated no matter whether it's actually in German or a copy of English. So you know you need to update the German one anyways.
  2. A translation UI would certainly offer the English and every translation so you would see the situation.

In short, I fail to see how this scenario poses new problems. Please note that the fallback scenario can only happen if the de didn't exist before so we will not override an existing translation.

chx’s picture

Or is the concern that you cannot know what is translated and what is not? That's a somewhat valid concern. However, I am not sure how we will handle the settings form scenario because when the user presses save how do you know whether it's intended to be translated later and so you should not save the string? You present a checkbox? But if you do that then this version knows equally. In short, I still do not see how this poses new problems.

chx’s picture

Final comment: if you guys do not like the fallback logic and want NULLs instead, ripping it out is trivial. I predict a lot of teeth gnashing, however. I feel we need fallback -- you add a language and poof your site is empty / broken even due to the lack of config in that language?

Gábor Hojtsy’s picture

@chx: maybe I did not express the choice quotes well enough. The main problem is that you have four ways to invoke the getter here:

A) If you provide no language, it falls back through 3 languages (and you don't have control over any of them),
B) if you provide a language, it falls back through 3 languages (the first one being what you provided),
C) if you provide TRUE, it falls back through 2 languages,
D) if you provide FALSE, it gives you an array.

For writing backend code, A), B) and C) are not useful, since you cannot actually provide an API / admin UI to change concrete values, because you never know if you got that value or not. So practically, any reusable code, hook, callback, etc. would need to use D, which is however the most obscure, because we have no wrapper to run on it that would pick a value for you (like in the option C patch).

So A and B at least very much assume you are writing frontend code, and you don't want to go and do a set()/save() since you could easily add something you did not want to. C would be useful for cases when you wanted the default/original value. And then any code which wants to work for both cases would need to use D.

What I'm saying is that due to the 4 ways to use this API and our assumed (low) level of language know-how that people have, they will have a hard time picking what to use, and for any reusable code that would need to use D) they don't actually have helpers to pick language values later.

So I praised C in this light, since it only ever did D) (on purpose, to avoid this confusion, as you explained there), and then provided a helper. It definitely was an API much easier to understand and resulting in code easier to reuse in frontend/backend/shared hooks and callbacks. It is similar to entities being loaded with all their language versions and rendered later.
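
To make the "always return the array, pick later" idea concrete, here is a minimal sketch. It is not the option C patch: the class, the helper and their signatures are hypothetical and only illustrate why a getter that never falls back is safer for backend code.

```php
<?php
/**
 * Hypothetical config object whose getter always returns every language
 * version of a translatable key, in the spirit of the "D only" behaviour.
 */
class ExampleConfig {
  protected $data;

  public function __construct(array $data) {
    $this->data = $data;
  }

  // Always returns the full langcode => value array, never a fallback value.
  public function get($key) {
    return isset($this->data[$key]) ? $this->data[$key] : array();
  }

  // Writes exactly one language version and leaves the others untouched.
  public function set($key, $langcode, $value) {
    $this->data[$key][$langcode] = $value;
    return $this;
  }
}

// Hypothetical helper for display code: pick the first available language.
function example_pick(array $values, array $fallback_langcodes) {
  foreach ($fallback_langcodes as $langcode) {
    if (isset($values[$langcode])) {
      return $values[$langcode];
    }
  }
  return NULL;
}

$config = new ExampleConfig(array('category' => array('en' => 'Website feedback')));

// Frontend: display with fallback; the stored data is not modified.
$shown = example_pick($config->get('category'), array('de', 'en'));

// Backend: the form can see that 'de' is missing and writes only 'de'.
$values = $config->get('category');
$has_german = isset($values['de']);
$config->set('category', 'de', 'Internetseitenrückmeldung');
```

Because the backend code inspects the array itself, it never mistakes a fallback value for an existing German translation, which is the get/set overwrite problem discussed above.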

chx’s picture

We are fairly close to a "won't fix" on this. It seems the problems are not surmountable. I have re-opened Option C and copied the tests over.

plach’s picture

I've been mulling on this since our call and since then I had a concern about the signature of the proposed accessors: on one hand we said that we did not want to address other variation axes than language, OTOH the initial proposal features a generic $subkey element which might very well be used to try and retrieve a subkey that has nothing to do with language. The latter incarnations of the proposal go even further this way by introducing the $type element which, as @berdir is pointing out above, would probably make 90% of developers skip language altogether. I have a similar concern wrt the current EntityInterface: by defaulting to the entity language we are guaranteed that almost no code will be written with language in mind, since everything will work perfectly without specifying it. And this is a big issue because the more samples I see, the more I realize that there is no real way to avoid explicitly specifying the proper language for things to work in a ML environment. t() (mostly) works because it's braindead simple to use (too much I'd say, as user input strings teach us); the DX of the current proposal is nowhere near that simplicity, as @Gabor is pointing out in #33.

My point is that what we are missing here are contextual defaults, since there is no one-size-fits-all value that will work in every scenario. OTOH both ConfigObject and EntityInterface instances must remain simple data storages and not include contextual information. How do we solve all this mess and keep things lean and clean as per the initial proposal?

I propose to introduce some helper code that would figure out which are the proper variant defaults for a particular context and make the variant parameter required, so no one can forget about it. When I say variant I mean language and any other variant we might think of, which for D8 might equal to "just language". This is the same concept as the Realms of Option B; I did not use that term to avoid confusion, but I'd happily pick any term that is deemed better if my proposal is accepted. Enough talking, let's see some code.

Accessors always require a variant specified. If the value is variable and no variant is provided all the values are returned:

// Config objects and entities would both implement this.
interface Variable {
  function get($key, array $variants = array());
  function set($key, $value, array $variants = array());
}

function my_variant_aware_function(ContainerBuilder $container) {
  $config = config('my.config.my_entity_type');
  $label = $config->get('label_prefix');

  foreach ($label as $variant => $value) {
    // Do something smart.
  }
}

Here is how things would appear to API consumers:

function my_page_callback() {
  drupal_container()->set('context', new Context('view'));

  //...

  $entity = entity_load('my_entity_type', 1);
  $config = config('my.config.my_entity_type');

  // v() stands for variant, replace it with anything else you like.
  print v($config)->get('label_prefix') . ': ' . v($entity)->get('label');
}

v() would be a simple wrapper around a VariantResolverInterface implementation:

interface VariantResolverInterface {
  function setContainer(ContainerBuilder $container);
  function setData(Variable $data);
  function getDefinedVariants();
  function get($key);
  function set($key, $value);
}

function v(Variable $data, $container = NULL) {
  $container = !empty($container) ? $container : drupal_container();
  $resolver = $container->get('variant_resolver');
  $resolver->setContainer($container);
  $resolver->setData($data);
  return $resolver;
}

And here is how a dumb variant resolver would look (no variants, say a monolingual site):

class BaseVariantResolver implements VariantResolverInterface {

  protected $container;
  protected $data;

  function setContainer(ContainerBuilder $container) {
    $this->container = $container;
  }

  function setData(Variable $data) {
    $this->data = $data;
  }

  function getDefinedVariants() {
    // Variants might be retrieved through some hook invocation or set in
    // stone depending on how bold we are :)
    return array('langcode');
  }

  function get($key) {
    return $this->data->get($key, $this->getVariants());
  }

  function set($key, $value) {
    $this->data->set($key, $value, $this->getVariants());
  }

  protected function getVariants() {
    // By default we support no variant other than the site default language :)
    return array('langcode' => language_default()->langcode);
  }
}

A "real" context-aware implementation would be slightly more complex (for brevity I omitted a Context implementation sample):

class AdvancedVariantResolver extends BaseVariantResolver {
  protected function getVariants() {
    $variants = array();
    // Each context instance has a bunch of subscribers associated that provide
    // a variant value for that specific context.
    $context = $this->container->get('context');
    foreach ($this->getDefinedVariants() as $name) {
      // We pass also the data object as we might have subscribers for a
      // specific data type.
      $variants[$name] = $context->get($this->container, $name, $this->data);
    }
    return $variants;
  }
}

// These might probably be any callable type.

class EntityViewContextLanguageSubscriber {
  function get(ContainerBuilder $container) {
    return $container->get(LANGUAGE_TYPE_CONTENT)->langcode;
  }
}

class ConfigViewContextLanguageSubscriber {
  function get(ContainerBuilder $container) {
    return $container->get(LANGUAGE_TYPE_INTERFACE)->langcode;
  }
}

class FormContextLanguageSubscriber {
  function get(ContainerBuilder $container) {
    // Return the active form language no matter which type of data we are
    // dealing with.
    return $container->get(LANGUAGE_TYPE_FORM)->langcode;
  }
}

class SendMailContextLanguageSubscriber {
  function get(ContainerBuilder $container) {
    return $container->get('user')->preferred_langcode;
  }
}

Obviously in this scenario config should support multiple variants through nesting. Variants would be identified through a fixed order:

category:
    en:
      example.com:
        'Website feedback'
    de:
      fancypants.example.com:
        'Internetseitenrückmeldung'

Unfortunately this is not as simple as @chx was aiming for, but at least the complexity is not exposed to the API consumers. My main concern is performance here; hopefully with some form of caching we would be able to cope with it.
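
As an illustration of the fixed-order nesting shown in the YAML above, a lookup could walk the nested arrays one variant at a time. The function and the `domain` variant name are hypothetical; only `langcode` appears in the proposal itself.

```php
<?php
/**
 * Hypothetical lookup through nested variants. $order fixes the nesting
 * order of the variant names; $variants maps each name to the wanted value.
 * Returns NULL as soon as one level has no matching entry.
 */
function example_get_variant(array $values, array $order, array $variants) {
  $current = $values;
  foreach ($order as $name) {
    if (!isset($variants[$name]) || !is_array($current) || !isset($current[$variants[$name]])) {
      return NULL;
    }
    // Descend one level: first by language, then by domain.
    $current = $current[$variants[$name]];
  }
  return $current;
}

// Same data as the YAML sample, expressed as a PHP array.
$category = array(
  'en' => array('example.com' => 'Website feedback'),
  'de' => array('fancypants.example.com' => 'Internetseitenrückmeldung'),
);
$order = array('langcode', 'domain');

$hit = example_get_variant($category, $order, array('langcode' => 'de', 'domain' => 'fancypants.example.com'));
$miss = example_get_variant($category, $order, array('langcode' => 'de', 'domain' => 'example.com'));
```

This sketch does no fallback between variants; a missing combination simply yields NULL, and any fallback policy would sit in the resolver on top of it.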

A couple of additional notes about other things said above:

  • Source language issue: fields do not have a source language specified, there is no such concept at storage level, it is built on top of it through a "global" value (at least in the entity scope), which is the entity language. For config we could have the same: I don't think we need to specify a different source for each config object, a global value as the i18n source string language in D7 could work here IMHO.
  • Admin forms and permissions: IMO we should distinguish here between string translation and per-language configuration, the former is only a subset of the latter. Translators might have access to translation pages where only string widgets appear as i18n does now, which is great. Multilingual config would take advantage of language-aware forms and be allowed only to roles having access to those forms in the first place.

Gábor Hojtsy’s picture

I'm not sure separating context from the config API like that, while the variant resolver under the config API would use context is actually making it simple. It still does not make it simple to write reusable code (which might be invoked from different contexts or need to access different contexts to get/set). Also, as discussed above, context is not as simple as "Context('view')", for example, depending on whether your config data is about entity configuration (labels, allowed values), you'd need to use the content language to display it, while if your data is about the UI (views title, site name, etc.) you'd use the interface language. Config is not always interface.

Source language issue: fields do not have a source language specified, there is no such concept at storage level, it is built on top of it through a "global" value (at least in the entity scope), which is the entity language. For config we could have the same: I don't think we need to specify a different source for each config object, a global value as the i18n source string language in D7 could work here IMHO.

It was discussed before that putting the language on the top of the config structure also achieves the same data relation conceptually. I think people wanted to see it under the individual keys because it makes it easier to deal with (the whole configuration structure is not necessarily loaded in most cases unlike entities).

Admin forms and permissions: IMO we should distinguish here between string translation and per-language configuration, the former is only a subset of the latter. Translators might have access to translation pages where only string widgets appear as i18n does now, which is great. Multilingual config would take advantage of language-aware forms and be allowed only to roles having access to those forms in the first place.

String-only widgets are a pretty ugly but possibly doable solution in D8. It would not work for things like making the site logo translatable or other things where a more involved widget is needed, and will be really ugly for views. I don't really know what could even happen for multilingual configuration in D8. 3 months passed since Drupalcon Denver. Contrast what we achieved in these three months with our less than 6 months left until code freeze.

plach’s picture

I'm not sure separating context from the config API like that, while the variant resolver under the config API would use context is actually making it simple. It still does not make it simple to write reusable code (which might be invoked from different contexts or need to access different contexts to get/set).

Would you please give an example? I ain't sure I get what you mean with not being able to write reusable code. Are you thinking of scenarios in which in 3 loc I switch 2 contexts? It does not seem likely to me. Again an example would help me here.

Also, as discussed above, context is not as simple as "Context('view')", for example, depending on whether your config data is about entity configuration (labels, allowed values), you'd need to use the content language to display it, while if your data is about the UI (views title, site name, etc.) you'd use the interface language. Config is not always interface.

Well, this was a simple example: context wouldn't be fixed and you could get any Context implementation you'd need to solve your complex use case.

It was discussed before that putting the language on the top of the config structure also achieves the same data relation conceptually. I think people wanted to see it under the individual keys because it makes it easier to deal with (the whole configuration structure is not necessarily loaded in most cases unlike entities).

I'm not talking about a language specific for a particular config object (file), I'm talking about a general, sitewide, string source language.

String-only widgets are a pretty ugly but possibly doable solution in D8. It would not work for things like making the site logo translatable

Anything that is not a plain string (we'd need metadata, yes, but that has already been agreed upon) would need access to the regular admin form. Making those ML aware should be pretty trivial. If building a string-only form is deemed too much work to be achieved before code freeze, for translators we could confine string translation into the regular translation UI as happens now in D7. Roles having both translation and administration rights could access the language-aware forms, which would use the regular widgets for every value, as is planned for entities.

Gábor Hojtsy’s picture

Would you please give an example? I ain't sure I get what you mean with not being able to write reusable code. Are you thinking of scenarios in which in 3 loc I switch 2 contexts? It does not seem likely to me. Again an example would help me here.

I was thinking of two types of code:

1. Generic (info-type) hook and callbacks, like hook_menu, etc. Let's put aside that hook_menu itself might not survive Drupal 8; there will possibly be some similar way to define menu items that are configurable. Current hook_menu() returns non-localized strings, so it can be later localized as needed, stored elsewhere, etc. You cannot really return a value from config storage as part of hooks like this since you don't know how they will be rendered later, in which language, etc. You basically need to carry around the required data (like entities do with fields, or hook_menu does now for built-in menu items, which has the source string which is a unique identifier to look up the translation).

2. Code which needs to use multiple contexts shortly in sequence, like an email module sending mails to hundreds of users (even contact module sends multiple emails with different language needs in a request). For this code, you'd just need to repeat setting the context and then invoking the config getter again. At least the context is known locally.
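
For the second case, a minimal sketch of a mail loop where the language is taken from each recipient and passed explicitly. The function, the recipient map and the fallback argument are assumptions made for illustration; they do not reflect how contact module is implemented.

```php
<?php
/**
 * Hypothetical mail loop: the language is known locally for each recipient,
 * so it is passed explicitly instead of being read from a global context.
 *
 * $recipients maps mail address => preferred langcode.
 */
function example_build_subjects(array $subject_by_langcode, array $recipients, $fallback_langcode) {
  $subjects = array();
  foreach ($recipients as $mail => $preferred_langcode) {
    if (isset($subject_by_langcode[$preferred_langcode])) {
      $subjects[$mail] = $subject_by_langcode[$preferred_langcode];
    }
    elseif (isset($subject_by_langcode[$fallback_langcode])) {
      // No version in the preferred language: use the explicit fallback.
      $subjects[$mail] = $subject_by_langcode[$fallback_langcode];
    }
  }
  return $subjects;
}

$subject = array('en' => 'New message', 'de' => 'Neue Nachricht');
$recipients = array(
  'anna@example.com' => 'de',
  'bob@example.com' => 'en',
  'chloe@example.com' => 'fr',
);
$subjects = example_build_subjects($subject, $recipients, 'en');
```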

I'm not talking about a language specific for a particular config object (file), I'm talking about a general, sitewide, string source language.

Using a site-wide assumed default language for multiple things is something that I want to avoid by design in D8MI if at all possible. If you start to build out a site and want to add languages later, you'll be forever tied to your original language choice (which many people don't like as evidenced by people crying about changing default language). Even if we keep a global config default language separate from the site default language, it sounds very painful. It is again the same problem of moving the data away from what it is related to. I have the same wrong feelings about it as about your above proposal to decouple context based lookups like that. It is supposed to make things simpler but it makes them more abstract, harder to understand and still does not give you a chance to ignore it anyway.

plach’s picture

1. Generic (info-type) hook and callbacks, like hook_menu, etc. Let's put aside that hook_menu itself might not survive Drupal 8; there will possibly be some similar way to define menu items that are configurable. Current hook_menu() returns non-localized strings, so it can be later localized as needed, stored elsewhere, etc. You cannot really return a value from config storage as part of hooks like this since you don't know how they will be rendered later, in which language, etc. You basically need to carry around the required data (like entities do with fields, or hook_menu does now for built-in menu items, which has the source string which is a unique identifier to look up the translation).

I don't see how this is different in my proposal. Let's keep the hook_menu() example: without my proposal you need to specify a subkey to avoid passing around the data array, hence you need a subkey value. If there is no valid default with my proposal you can do exactly the same: it's just a helper to get a default value, it does not prevent you from specifying an explicit variant/subkey. In fact it would be built on top of the OP proposal.

2. Code which needs to use multiple contexts shortly in sequence, like an email module sending mails to hundreds of users (even contact module sends multiple emails with different language needs in a request). For this code, you'd just need to repeat setting the context and then invoking the config getter again. At least the context is known locally.

The latest subscriber in the example above had exactly this use case in mind: it's not the context that is changing but the account object stored in the container, and this is perfectly legal I guess. I don't see what code could not be reused here.

Edit: however if you know where to pick the subkey value you have no need to use the resolver at all: just use the user language preference.

Even if we keep a global config default language separate from the site default language, it sounds very painful. It is again the same problem of moving the data away from what it is related to.

Right, I didn't think about the default language change problem. Then I guess putting the default at the same level of variants is the only choice here.

I have the same wrong feelings about it as about your above proposal to decouple context based lookups like that. It is supposed to make things simpler but it makes them more abstract, harder to understand and still does not give you a chance to ignore it anyway.

I am afraid there is no way to ignore language if we want things to always work even on ML sites. I'm pretty sure that going with fixed defaults will condemn us to fix tons of non-complying contrib code. Or we would need to make the subkey required.

chx’s picture

@plach I have some fears about the complexity of your suggestion. Have you seen #1617334: Multilingual/translatable configuration reusing t() for i18n config() [OPTION C] ?

plach’s picture

@chx:

Sure, I saw it and I took great inspiration from it. It's nice and simple, but it also handles only the view phase. For anything else you have to pick the proper language, even if you have no idea of what it could be in that particular context. My proposal is to push the responsibility of telling which is the proper default to the code defining the context, not on the shoulders of every single developer out there. Obviously this requires a richer infrastructure, which OTOH would be entirely optional. However you'd be encouraged to use it since it would make your life easier. That said, I don't expect dozens of contexts to be defined; I might be wrong but I think just a handful of them would be enough to make things work in 90% of cases.

Gábor Hojtsy’s picture

@plach:

I think this comes down to the following: code that handles configuration needs to be aware of language eventually. We either need to make it aware right when retrieving it or later when "rendering" it. To make reusable functions aware of different language scopes that they can be invoked with, you can do the following things:

  • Retrieve the right language version directly
    • Add language argument through the whole invocation chain, always pass it on.
    • Pass language as implicit argument, such as with dependency injection, the context you are introducing, etc.
  • Instead of getting the actual data, put in an indirection
    • Retrieve all the data at the start and provide helper functions to pick the right value later. This is what entity field rendering does currently, and it is also what option C is about.
    • Retrieve an accessor (a class for example, or a representation of the config key) and use the accessor later to retrieve the right language version.

There is really no escaping specifying which language you want. You either do it directly when retrieving and somehow pass the information there, or you do it later and somehow pass on something at a meta level instead of just the value.
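
The last option in the list, passing on an accessor instead of a value, might be sketched like this. The class, its method and the info function are hypothetical and exist only to show the indirection; the menu titles are made-up sample data.

```php
<?php
/**
 * Hypothetical accessor: wraps all language versions of one config key and
 * defers the choice of language until the value is actually rendered.
 */
class ExampleValueAccessor {
  protected $values;

  public function __construct(array $values) {
    $this->values = $values;
  }

  // Resolves the value for the first langcode in the chain that has one.
  public function inLanguage(array $langcodes) {
    foreach ($langcodes as $langcode) {
      if (isset($this->values[$langcode])) {
        return $this->values[$langcode];
      }
    }
    return NULL;
  }
}

// An info-style hook can return the accessor without knowing the language.
function example_menu_info() {
  return array(
    'contact' => array(
      'title' => new ExampleValueAccessor(array('en' => 'Contact', 'de' => 'Kontakt')),
    ),
  );
}

// The caller that renders the item decides on the language chain.
$items = example_menu_info();
$title_for_german_user = $items['contact']['title']->inLanguage(array('de', 'en'));
$title_for_french_user = $items['contact']['title']->inLanguage(array('fr', 'en'));
```

The trade-off is the one described above: the language still has to be specified, only later and by the code that knows the rendering context.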

effulgentsia’s picture

Status: Needs work » Needs review

I skimmed through some of the comments in this issue, but haven't fully internalized some of the meaty ones (e.g., #27, #35) yet. I think #23 is a good start except for this part:

+++ b/core/lib/Drupal/Core/Config/DrupalConfig.php
@@ -75,18 +82,17 @@ class DrupalConfig {
-  public function get($key = '') {
+  public function get($key = '', $subkey = '', $type = 'langcode') {
...
+      $function = 'config_get_by_' . $type;
+      return $function($return, $subkey);
...
+++ b/core/includes/config.inc
@@ -86,3 +87,30 @@ function config_get_storage_names_with_prefix($prefix = '') {
+function config_get_by_langcode($data, $langcode = '') {

Instead, what I'm thinking is more along the lines of renaming DrupalConfig to BaseConfig and keeping that totally unaware of language/variants, and creating a new DrupalConfig that extends it with language-aware get() and set() (and I suspect maybe clear()). The config() function already allows for the 'DrupalConfig' class to be swappable, and maybe by D8 release, we'll find a way to use the DI container so that that class can also be swapped out site-wide, not just for specific callers of config(). I think that should be enough to allow contrib to explore additional variants without it being something for core to take into account at all.

Here's an initial stab at this. What do you think? (The patch has a lot of minus signs, but that's just an artifact of most of DrupalConfig lines moving to BaseConfig, so it's probably easier to review by applying than in dreditor.)
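
For readers without the patch at hand, the layering idea can be sketched as follows. This is not the attached patch: the class names, method bodies and the no-fallback behaviour are assumptions chosen to keep the example small.

```php
<?php
// Language-unaware storage, loosely playing the role of BaseConfig.
class ExampleBaseConfig {
  protected $data = array();

  public function get($key) {
    return isset($this->data[$key]) ? $this->data[$key] : NULL;
  }

  public function set($key, $value) {
    $this->data[$key] = $value;
    return $this;
  }
}

// Language-aware layer, loosely playing the role of the new DrupalConfig.
class ExampleLanguageConfig extends ExampleBaseConfig {
  // With a langcode, reads one language version; without, acts like parent.
  public function get($key, $langcode = NULL) {
    $value = parent::get($key);
    if ($langcode === NULL) {
      return $value;
    }
    // No fallback here: a missing language version yields NULL.
    return (is_array($value) && isset($value[$langcode])) ? $value[$langcode] : NULL;
  }

  // With a langcode, writes one language version and keeps the others.
  public function set($key, $value, $langcode = NULL) {
    if ($langcode === NULL) {
      return parent::set($key, $value);
    }
    $values = parent::get($key);
    $values = is_array($values) ? $values : array();
    $values[$langcode] = $value;
    return parent::set($key, $values);
  }
}

$config = new ExampleLanguageConfig();
$config->set('contact_threshold_limit', 5);
$config->set('category', 'Website feedback', 'en');
$config->set('category', 'Internetseitenrückmeldung', 'de');

$german = $config->get('category', 'de');
$french = $config->get('category', 'fr');
$limit = $config->get('contact_threshold_limit');
```

Keeping the base class free of language logic means a contrib subclass could add other variants in the same way without touching the storage layer.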

chx’s picture

#43, I think you need to read #33 to see why anything that produces $this->assertIdentical($config->get('testkey', array('langcode' => 'de')), 'Spanish value'); is problematic.

Gábor Hojtsy’s picture

Status: Needs review » Closed (duplicate)

As per our discussion in Barcelona with heyrocker (CMI), webchick (core), merlinofchaos (views), reyero (i18n), webflo (i18nviews), Gábor Hojtsy (D8MI), xjm (views) and others, we are going with a base implementation of option B for now. In short we figured out we have a need for almost anything to be translatable/multilingual, we need some context information passed around and we need to make the system extensible due to the definite lack of time to solve all problems until code freeze, and all of those criteria lead us to work with option B. Let's focus our efforts on getting it done best!

Marking duplicate of #1616594: META: Implement multilingual CMI on the grounds of same problem space being solved.

sun’s picture

My name was missing in that list, but I was heavily involved in those discussions as well. Since I created and originally proposed this option/issue, I just want to clarify that I'm 100% on board with option B.

Gábor Hojtsy’s picture

Issue tags: -sprint +language-config

@sun: sorry for missing your name, my fault.

Moving off sprint to keep pointing focus where it's at.

Gábor Hojtsy’s picture

Issue summary: View changes

fix typo