Currently there is no general solution in core for making fields holding user-defined strings translatable. Some are outlined here: http://groups.drupal.org/node/18735.

Probably the most promising is to add built-in support for translation to the Fields API.

Key advantages would include:

* No need for custom handling of each translatable field (as some other approaches would require).
* Greater incentive for use of the new Fields API.
* Translations would be loaded on initial object load, rather than waiting until later and loading each translation as a separate override.

We're preparing for an internationalization focused code sprint in the next two weeks. In preparation, it would be great to get some direction from Field API experts on what adding translation support to the API would mean. What are our options for implementation?

Adding a language field to each field table (and adding language to the primary key array)? Separate storage for multiple languages?

Use the existing locale API for translation? Add translation to the UI for object editing?

Files: 
CommentFileSizeAuthor
#239 translatable_fields-367595-239.patch829 bytesplach
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch translatable_fields-367595-239.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#235 changelog.txt_translatable_field.patch1.17 KBxmacinfo
Failed: Failed to apply patch.
[ View ]
#223 translatable_fields-367595-220.patch167.91 KBcatch
Failed: Failed to apply patch.
[ View ]
#220 translatable_fields-367595-220.patch167.91 KBplach
Failed: Failed to apply patch.
[ View ]
#217 drupal-tf.217.patch168.21 KBsun
Failed: Failed to apply patch.
[ View ]
#215 drupal-tf.215.patch167.29 KBsun
Failed: Failed to apply patch.
[ View ]
#215 drupal-tf.212.essentials-DREDITOR-d6.patch78.7 KBsun
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-tf.212.essentials-DREDITOR-d6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#212 drupal-tf.212.patch167.4 KBsun
Failed: Failed to apply patch.
[ View ]
#211 drupal-tf.211.patch167.37 KBsun
Failed: Failed to install HEAD.
[ View ]
#211 drupal-tf.essentials-d6.patch122.28 KBsun
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-tf.essentials-d6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#211 drupal-tf.essentials-minus-tests-d6.patch73.89 KBsun
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-tf.essentials-minus-tests-d6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#210 drupal-tf.210.patch167.62 KBsun
Failed: Failed to install HEAD.
[ View ]
#205 translatable_fields-367595-205.patch167.7 KBplach
Failed: 12101 passes, 3 fails, 0 exceptions
[ View ]
#203 translatable_fields-367595-203.patch166.81 KBplach
Failed: Failed to apply patch.
[ View ]
#194 translatable_fields-367595-194.patch166.36 KBplach
Failed: 12261 passes, 9 fails, 2 exceptions
[ View ]
#193 translatable_fields-367595-193.patch164.93 KBplach
Failed: 12259 passes, 9 fails, 2 exceptions
[ View ]
#189 translatable_fields-367595-189.patch147.27 KBplach
Failed: 12291 passes, 9 fails, 2 exceptions
[ View ]
#189 translatable_fields-367595-189-notest.patch.txt45.33 KBplach
#187 translatable_fields-367595-187.patch143.67 KBplach
Failed: Failed to apply patch.
[ View ]
#185 translatable_fields-367595-185.patch140.24 KBplach
Failed: 11891 passes, 5 fails, 58 exceptions
[ View ]
#185 translatable_fields-367595-185-notest.patch.txt42.21 KBplach
#183 translatable_fields-367595-183.patch.txt143.57 KBplach
#183 proof-of-concept-183-all.patch.txt185.34 KBplach
#180 translatable_fields-367595-180.patch146.3 KBplach
Failed: Failed to apply patch.
[ View ]
#177 translatable_fields-367595-176.patch143.47 KBplach
Failed: Failed to apply patch.
[ View ]
#176 translatable_fields-367595-176.patch143.47 KBplach
Failed: Failed to apply patch.
[ View ]
#174 translatable_fields-367595-174.patch144.21 KBplach
Failed: Failed to apply patch.
[ View ]
#172 translatable_fields-367595-172.patch146.33 KBplach
Failed: Failed to apply patch.
[ View ]
#170 translatable_fields-367595-170.patch141.7 KBplach
Failed: Failed to apply patch.
[ View ]
#168 translatable_fields-367595-167.patch138.66 KBplach
Failed: Failed to apply patch.
[ View ]
#165 translatable_fields-367595-165.patch137.11 KBplach
Failed: Failed to apply patch.
[ View ]
#161 drupal.translatable-fields.patch134.57 KBsun
Failed: 11647 passes, 1 fail, 1 exception
[ View ]
#159 translatable_fields-367595-159.patch137.64 KBplach
Failed: 11602 passes, 1 fail, 1 exception
[ View ]
#158 translatable_fields-367595-158.patch131.34 KBplach
Failed: Failed to apply patch.
[ View ]
#141 translatable_fields-367595-141.patch131.66 KBplach
Failed: Failed to apply patch.
[ View ]
#139 translatable_fields-367595-139.patch129.57 KBplach
Failed: Invalid PHP syntax in modules/taxonomy/taxonomy.test.
[ View ]
#133 translatable_fields-367595-133.patch124.21 KBplach
Failed: Failed to apply patch.
[ View ]
#131 translatable_fields-367595-131.patch124.36 KBplach
Failed: Failed to apply patch.
[ View ]
#128 translatable_fields-367595-128.patch124.63 KBplach
Failed: Failed to apply patch.
[ View ]
#126 translatable_fields-367595-126.patch123.97 KBplach
Failed: 11151 passes, 26 fails, 228 exceptions
[ View ]
#124 translatable_fields-367595-124.patch125.13 KBplach
Failed: 11157 passes, 1 fail, 0 exceptions
[ View ]
#114 translatable_fields-367595-114.patch98.7 KBplach
Failed: Failed to apply patch.
[ View ]
#112 translatable_fields-367595-112.patch104.83 KBplach
Failed: Failed to apply patch.
[ View ]
#110 translatable_fields-367595-110.patch99.54 KBplach
Failed: Failed to apply patch.
[ View ]
#103 translatable_fields-367595-103.patch57.98 KBplach
Failed: 11156 passes, 117 fails, 192 exceptions
[ View ]
#102 translatable_fields-367595-102.patch43.59 KBplach
Failed: Failed to apply patch.
[ View ]
#100 translatable_fields_97_submit.png26.91 KBplach
#100 translatable_fields_97_view.png34.71 KBplach
#97 translatable_fields-367595-97.patch17.39 KBplach
Failed: Failed to apply patch.
[ View ]
#87 translatable_fields-367595-87.patch26.21 KBplach
Failed: Failed to apply patch.
[ View ]
#80 translatable_fields-367595-80.patch26.55 KBplach
Failed: Failed to apply patch.
[ View ]
#77 translatable_fields-367595-77.patch20.51 KBplach
Failed: 10268 passes, 0 fails, 46 exceptions
[ View ]
#75 translatable_fields-367595-75.patch15.84 KBplach
Failed: Failed to apply patch.
[ View ]
#71 translatable_fields-367595-71.patch14.07 KBplach
Failed: 10609 passes, 1 fail, 0 exceptions
[ View ]
#69 translatable_fields-367595-69.patch12.61 KBcatch
Failed: Failed to apply patch.
[ View ]
#65 translatable_fields-367595-65.patch12.66 KBcatch
Failed: Invalid PHP syntax in modules/translation/translation.module.
[ View ]
#63 translatable_fields-367595-63.patch12.06 KBplach
Failed: Failed to apply patch.
[ View ]
#61 translatable_fields-367595-60.patch12.38 KBplach
Failed: Invalid PHP syntax in modules/translation/translation.module .
[ View ]
#58 drupal7-sun.translatable-fields__.patch11.92 KBsun
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal7-sun.translatable-fields__.patch.
[ View ]
#57 drupal7-sun.translatable-fields_.patch11.9 KBsun
Failed: Failed to apply patch.
[ View ]
#55 drupal7-sun.translatable-fields.patch10.91 KBsun
Failed: 9744 passes, 0 fails, 1 exception
[ View ]
#54 translatable_fields_en.png32.39 KBplach
#54 translatable_fields_fr.png32.54 KBplach
#54 translatable_fields_it.png32.38 KBplach
#53 drupal7-sun.translatable-fields.patch10.1 KBsun
Failed: Failed to apply patch.
[ View ]
#51 translatable_fields.patch9.88 KBplach
Failed: Failed to install HEAD.
[ View ]
#47 translatable_fields.patch8.44 KBplach
Failed: Failed to install HEAD.
[ View ]
#45 translatable_fields_en.png31.54 KBplach
#45 translatable_fields_fr.png31.66 KBplach
#45 translatable_fields_it.png31.57 KBplach
#45 translatable_fields.patch8.66 KBplach
Failed: Invalid PHP syntax in modules/field/field.install .
[ View ]
#43 translatable_fields.patch8.77 KBcatch
Failed: 9338 passes, 0 fails, 1 exception
[ View ]
#36 drupal7-sun.translatable-fields.patch8.51 KBsun
Failed: Failed to apply patch.
[ View ]
#35 translatable_fields.patch9.13 KBcatch
Failed: 10267 passes, 1 fail, 0 exceptions
[ View ]
#26 translatable_fields.patch11.4 KBcatch
Failed: Failed to apply patch.
[ View ]
#23 translatable_fields.patch11.35 KBplach
Failed: Failed to apply patch.
[ View ]
#20 translatable_fields.patch7.19 KBcatch
Failed: Failed to apply patch.
[ View ]
#19 translatable_fields.patch7.18 KBcatch
Failed: Failed to apply patch.
[ View ]
#17 translatable_fields.patch7.06 KBcatch
Failed: 9617 passes, 18 fails, 221 exceptions
[ View ]
#16 translatable_fields.patch4.39 KBcatch
Failed: 9594 passes, 18 fails, 221 exceptions
[ View ]
#14 translatable_fields.patch4.4 KBcatch
FAILED: [[SimpleTest]]: [MySQL] Invalid patch format in translatable_fields_1.patch.
[ View ]
#7 translatable_fields.patch4.19 KBcatch
Failed: 9613 passes, 18 fails, 221 exceptions
[ View ]
#5 translatable_fields.patch2.02 KBcatch
Failed: Failed to apply patch.
[ View ]

Comments

Issue tags:+i18n sprint

Title:Make fields translatableObject translation option #4: make fields translatable

I've been mulling over some of the various options this week, and this seems like the ideal one to me for a number of reasons. Thanks for beating me to it posting the issue.

# If fields are content, then the translations of a field are equally content - and should be loaded and stored as part of the object rather than tacked on later. If I want to migrate/syndicate content around Drupal installs, being able to load objects and save them as one single entity is much nicer.

# By doing loading everything in the same palce - the locale system only has to handle rendering the correct language, not pulling in language information from custom storage as an afterthought.

# Having proper multilingual fields means that we can cache objects and/or fields safely without having to wonder about language-dependent cache keys or anything.

# We don't need any additional queries or joins. If there's lots of translations we'll end up with some very large objects being loaded sometimes - but that particular issue is only going to affect sites with lots of translations.

So the main question seems to be how to get this to fit in with the proposed field storage model, and how the UI for it might look.

My initial inclination would be to have a 'multilingual/translatable' property which would be part of any field definition- which would map roughly to the difference between single or multiple. So rather than have multiple fields with numeric indexes/weights, you'd have multiple fields keyed by language. And then when displaying this to the user, we'd have something along the lines of $node->field_name[$langcode]->view instead of $node->field_name[0]->view

An implementation might support:

* Admin adds some fields to an object.
* Some of these fields are translatable.
* To make a translation, click 'Translate into French' and get an edit
form for the object with just the translatable fields.
* When loading the object for display in a multilingual environment,
fields are loaded from the start for a given language (the current
language).

(Possibly, if we took this route, nodes too could optionally use it. A
node type could use the translation module *or* just feature
translatable fields.)

Some of the questions:

* How should a field be designated translatable? Is it by field type
(e.g., text = translatable) or is it a property applied to field instances?
* What are the options for storage, and of them which are best?

Status:Active» Needs work
StatusFileSize
new2.02 KB
Failed: Failed to apply patch.
[ View ]

We're working on this as part of the i18n sprint. Here's a summary of the current discussion, at least from my point of view ;)

'translatable' is a new property, which can be applied to any field.

If a field is translable, a few things happen:

1. The field_sql_storage module adds a 'language' column to the table (in addition to object id, delta and field-provided columns)
2. The field is treated as multiple with FIELD_CARDINALITY_UNLIMITED. In reality there will be one row per language in the table, but we can't predict how many languages are installed on a site.
3. All data for the field is loaded with the object it's attached to - same as any other field, this avoids query rewriting and language dependent cache keys.
4. On rendering, only the value in the current language will actually be displayed.

Here's a very initial patch which adds the 'translatable' property to the field definition documentation, and adds the column in field_sql_storage.module

Unanswered questions:
1. Should we always add the language column for any field, but just have it default to an empty string and treat that as language neutral, or should the column be added only if a field is marked as translatable?

2. sun mentioned a possible alternative data structure like having columns for data_en, data_fr, data_de - so all languages are held in one row - but so far in discussion we don't see a way to do this without dynamic/admin triggered schema manipulation.

Very basic patch to show where the changes in field.crud.inc and field_sql_storage.module would start. Initial opinions on this idea from the FiC team would be great of course.

If 2) turns out to be not possible, I'd say yes to 1).

Title:Object translation option #4: make fields translatableTranslatable fields
Category:feature» task
Status:Needs work» Needs review
StatusFileSize
new4.19 KB
Failed: 9613 passes, 18 fails, 221 exceptions
[ View ]

Here's work so far.

language column is added to all per-field tables by default, if a field isn't translatable, then it just gets populated with an empty string (same as nodes do now). This means that say body becomes a field, and then a site wants to make 'body' translatable, then while it wouldn't be a straight switch, it also wouldn't require actual schema changes either. It might be considered crufty though, so up for discussion.

All translatable fields get FIELD_CARDINALITY_UNLIMITED - because we never know how many languages might be on a site. This also means that adding the language column won't make any difference if fields API moves to a default per-bundle model, because FIELD_CARDINALITY_UNLIMITED always means a single dedicated table for that field.

On loading, the multiple field is loaded like any other, except, it's keyed by language.

so:

<?php
$node
->field_name['en'] = $field_data_in_english;
$node->field_name['fr'] = $field_data_in_french;
?>

Instead of
<?php
$node
->field_name[0] = $field_data_first_record;
$node->field_name[1] = $field_data_second_record;
?>

Which saves a lot of messing about later hopefully (i.e. no array_search() no extra looping) .

Not in the patch yet by any means, but a question which needs answering is this:

Where do we select the language-specific content to display to the user?

There's no query rewriting going on, and no t() or anything here - all fields are available all the time as an inherent property of the object, which shifts the responsibility to somewhere in rendering.

My inclination would be to have somewhere, maybe even in drupal_render() itself, where if an element has a '#translate' property set, the rendering function assumes that the element children are keyed by language, and only renders $elements[$language] for display.

That'd make the whole rendering issue separate from this patch though (i.e. not a formatter or anything like that). And would give us a generic method which could be used for stuff which currently isn't fields for whatever reason.

If that sounds like a possible approach, then the initial groundwork laid here might be approaching enough for a first patch - with a bunch of related/followup issues to open alongside it. So marking to needs review for some more eyes.

+        if ($field['translatable']) {
+          // Add the item to the field values for the entity, keyed by language.
+          $additions[$row->entity_id][$field_name][$item->language] = $item;
+        }
+        else {
+          // Add the item to the field values for the entity.
+          $additions[$row->entity_id][$field_name][] = $item;
+        }

means multiple fields cannot be translatable / a translatable field can only have one value (per language) ?

yched: yes, at the moment that's exactly what it means. I'm having trouble thinking of use cases for translatable multiple fields, and also just generally wrapping my head around how it'd happen.

We could do something like
$additions[$row->entity_id][$field_name][$item->language][] = $item;
of course, but that's probably going to add some overhead elsewhere. Another possibility would be only doing that for "translatable multiple" fields - so we have 'translatable' and 'translatable multiple' as two possible options. Ideas welcome of course.

We definitely need fields that are multiple and translatable. But indeed, I too have problems to wrap my head around how the schema and the resulting object would have to be structured for this.

Maybe having 1-extra-column-per-language-value would be easier to understand (and track) for this? (as mentioned in #5)

subscribe

Here's a stab at supporting multiple translatable fields.

Only change is we do

<?php
$additions
[$row->entity_id][$field_name][$item->language][]
?>

Instead of
<?php
$additions
[$row->entity_id][$field_name][$item->language]
?>

I also added the #translatable property per field where it's marked as translatable.

This should now match up fine with the rendering approach at #368823: Add #translatable support to content rendering. I don't think the schema would need to change - we'd still load all fields for an object, and if there's more than one entry for a language, these would be displayed. Where it'd get more complicated is supporting this in the UI.

One other thing to note, if you needed to list 10s of items - Views being the obvious example, then it ought to be very possible to do so by adding a language condition to the field table if a full object load absolutely needs to be avoided. Also, while I think we should avoid dynamic schema at all costs (at least until there's something cooked up for fields in general), either the per-bundle module or materialized views could probably cook up a similar table to the column_language_1 column_language_2 idea suggested by sun - these would need to rebuilt when adding or removing a language, as I understand it mv is designed to handles those changes on some level. So for specific cases, I don't think we're precluding loading only the data you need - even if I think that's not a good fit for 'first class object' loads themselves.

StatusFileSize
new4.4 KB
FAILED: [[SimpleTest]]: [MySQL] Invalid patch format in translatable_fields_1.patch.
[ View ]

Yes, I see no reason why the code in #12 wouldn't technically work.
Two issues though :
- Unpredictability of what you find in $object->field_name. Currently it's always an array (possibly empty) of multiple values (possibly only one). This is really important for DX. With the approach taken here, it depends on whether the field is translatable. We need to find a way that doesn't force you to test translatability and have two code branches in the may places where we refer to a field's values.
- As http://drupal.org/node/361683#comment-1236964 shows, there's a limit on how many columns you can add to a db. So this approach of translatable fields make the issues there make denormalized tables more hazardous.

Also, yes, it is critical that the primary storage tables (PF in the current state of HEAD, PB if it becomes the default at some point) do not have schema changes. This means that enabling a new language cannot add a new column, so they cannot have field_name_value_[language] columns.

StatusFileSize
new4.39 KB
Failed: 9594 passes, 18 fails, 221 exceptions
[ View ]

yched: yes that'd be a big issue with having translation of multiple value fields done this way (part of why I'd initially hoped I could leave it out since only allowing one value per language per field would mean the existing structure could be re-used with only named keys added).

One way around it would be to populate the field values as usual, but add $object->field['translations'] additionally - which would contain the data at a low level. I think it's critical for performance and ease of access that we don't have to keep iterating over fields once they're initially loaded to see if they've got languages associated or not, so it needs to be done early, once - but doing it that way means two copies of the data instead of one.

Then in rendering, we'd do:

<?php
if (!empty($element['translations'])) {
 
$output = drupal_render($element['translations'][$language->$language]);
}
?>

The 'translations' array would then replace the #translatable property (or something similar). Here's a patch just for comparison.

StatusFileSize
new7.06 KB
Failed: 9617 passes, 18 fails, 221 exceptions
[ View ]
new7.06 KB
Failed: 9617 passes, 18 fails, 221 exceptions
[ View ]

Merged #368823: Add #translatable support to content rendering in with this one, since we're now using #pre_render, and while I'm leaving the #pre_render callback in common.inc for now since it's potentially a generic approach we might want to take later, it doesn't really make sense out of context. Also tests added for this.

By using #pre_render we keep drupal_render() agnostic about translation, and it also gives contrib the chance to change the callback before we get there, say to handle proper language fallbacks.

subscribing

StatusFileSize
new7.18 KB
Failed: Failed to apply patch.
[ View ]

Fixed up some test failures (other than the ones already in HEAD). When all core field tests pass, I'll add some more here.

StatusFileSize
new7.19 KB
Failed: Failed to apply patch.
[ View ]

plach did some thorough review of this in irc, $language wasn't ever going to get properly populated, changed to $item->language

I've only quickly scanned the discussion here. I do not love the idea of intermingling translation and cardinality. If we cannot keep these unrelated concepts orthogonal, the design is wrong.

What about a separate field data table for each field for each language? I haven't thought through it, but it would allow per-language storage without taking over the cardinality capability.

Each field indicates if it is translation-enabled. Each object specifies the languages it should be loaded/saved in via an object property identified by hook_fieldable_info(). The field storage engine is responsible for maintaining separate tables for each field per-language, and for reading/writing them as appropriate. Changing table schemas in flight is hard, but creating or dropping tables is easy.

Just a thought.

I have no idea about how to represent multi-lingual field data on the object itself at the moment.

@bjaspan, if I'm loading an object, I should get the actual object including all content, not a slice of it depending on the language of the current user. Adding a table per language per field, means potentially multiplying the number field tables by 5, 10, 20 depending on the number of languages on a site. If we need to show all values in all languages in all fields for an object (very possible to do with Views), then we'll quickly hit MySQL's 61 table join limit. So I'm very much against adding an extra table per language per field + query rewriting, it's going to mean a massive slowdown. For example a lot of sites need language fallbacks, sometimes multiple levels of fallback - having to hit a number of different tables in turn to get the fallback value for a field is going to be very, very nasty.

As to cardinality, I don't see a partciular reason why we couldn't separate the two with the current model - the only issue is that we have to allow for adding a language to a site without schema changes, so the maximum number of records per object is going to be variable over time.

Status:Needs review» Needs work
StatusFileSize
new11.35 KB
Failed: Failed to apply patch.
[ View ]
new11.35 KB
Failed: Failed to apply patch.
[ View ]

Tested the patch with a demofield from FIC bzr repository.

There were some things not working, I tried to fix them without screwing the whole picture we discussed yesterday.

There are some aspects yet to be fixed but I were able to:
* create an article with a demofield (text) field
* after inserting manually in the DB a couple of field translation, make them show up according to selected UI language
* show nothing if no translation is avalaible for the selected UI language
* show the field value with any UI language if it is language neutral

I'll be glad to discuss the details tomorrow.

@catch: I am not specifically supporting my field-table-per-language idea; I have not thought about the problem in enough detail to be confident the idea is any good. However, I want to disagree with a couple assertions you made.

1. If you are storing 5, 10, or 20 times the unique data, you are going to need 5, 10, or 20 times the amount of space to put it in and should expect to do a lot more work. I do not see why loading multiple languages ought to come "for free" from the database. If there is a way to make it happen that does not break the architecture, great.

2. Loaded fields can be cached per object, so storing field data in different tables does not slow down load performance. As for write performance, this is related to but not quite the same situation as the great hybrid-storage debate. I do not think there are very many real-world scenarios in which there is an extremely high volume of objects being created in 20 languages simultaneously.

3. The 61-join limit for Views is a red herring. Unless you are actually filtering/sorting by >61 fields, there is no reason Views needs to actually join to select-only fields every time; that is not the most efficient way to retrieve the data, especially given the field cache. Views could just retrieve the object ids and use load-multiple (which uses the cache) to get the full objects, or if you really need a live query it could retrieve the object ids and the first 60 fields in one query, then the next 60 fields in the next query, etc.

I strongly agree with catch. One of our sites translates content into 14 languages. It has 25 content-types, with a total of 115 fields. So, that alone would result in 1610 tables. Adding the remaining, non-CCK-related tables (~180) of this site would result in a total of 1790 tables.

StatusFileSize
new11.4 KB
Failed: Failed to apply patch.
[ View ]

@bjaspan, @1 I think we can make it happen without breaking the architecture, IMO sun's example strongly points to a table per field per language being not a good option at all.

@2 great ;)

@3 this is true, but it'd require the administrator choosing an row style which loads the full object (and also that all languages get loaded into the object all the time - not sliced by current language within the load functions) - my point is that it'd be even easier than usual for an admin to do this configuring a field row style per D6 Views, and we shouldn't make situations like that any easier than they already are to walk into.

Here's a minor update with the FIELD_CARDINALITY_UNLIMITED change removed, however there's a couple of issues with cardinality we need to solve, and since this patch is my first really serious look through the field API I'm not sure how best to do it:

The main one, is that If a field has cardinality of 4 for example, we need a way to make the delta/cardinality ratio per-language. This needs to both be handled in the UI, and enforced in validation. Lines like this for example aren't going to work if the cardinality is 4 and there's 10 languages (i.e. $delta could hit 40):

<?php
      
if ($field['cardinality'] != FIELD_CARDINALITY_UNLIMITED && ++$delta_count == $field['cardinality']) {
            break;
          }
?>

We need to find a way to make deltas work on a per language basis - at this point it seems likely to require ignoring the actual delta, and doing row counts for entries-per-language-per-object in validation somewhere.

I just gave the patch a cursory look, no time for an in-depth review right now.

- In field_sql_storage_field_storage_load()

+          $additions[$row->entity_id][$field_name]['#pre_render'] = array('drupal_filter_translation');
+          $additions[$row->entity_id][$field_name]['#translations'][$item['language']][] = $item;

I'm not sure about the 2nd line - turning each $object->field_name[delta] in a #-array ? maybe, but probably needs some pondering... Basically means you need element_children() each tilme you want to iterate over field values (as you do in text_field_sanitize(), but there are *many* more places). Not sure of the other consequences right now.

- A definite -1 on the first line, though. Filling a render property at load time really sounds like a bad idea :-)

- Right, if we go this way, the code to limit the number of values written or loaded will need to be updated to limit per language.

- More generally, what's the impact of this 'translatable fields' approach on D6's 'translated nodes are different nodes' approach to node translations ?
I'm pretty sure you explained that somewhere in one of the Translation Sprint threads, can you link to a post I could read ?

what's the impact of this 'translatable fields' approach on D6's 'translated nodes are different nodes' approach to node translations ?

It's a good question. Here's my understanding.

The focus on fields came out of a process where we looked at various possible approaches to making non-nodes translatable, see http://groups.drupal.org/node/18735.

Option #2 on that page was to follow the translated nodes approach with other objects, like users. But multiple records per user, one per language, each at a different URL?

For any non-node object that fields can attach to, translatable fields would make it possible for the object to exist in multiple languages, without the various problems associated with having separate objects per language.

For nodes, multilingual fields would provide an *alternate* way of making nodes translatable. A particular site could keep the translation module disabled and use purely fields for multilingual nodes. It could have only unilingual fields and use translation module. Or it could use a combination of the two--some node types handled by translation module, others by multi-language fields.

Probably there will be some quirks to work out to get this working though. For example, if this is the scenario, is it a requirement that whether a field is multilingual or not is by field instance, so that a particular field could be used with translation module handled node types and those not handled by translation module?

There are also UI considerations. Do we have the same UI for translation module (create a new node based on this node but in a different language) and field-based node translation (bring up a copy of this node with translatable fields editable in a different language)?

In the long term, it's *possible* that multilingual fields will render our current nid/tnid approach unneeded, but they could equally coexist.

Thanks yched, I'm also not sure about requiring element_children() for this. plach, could you explain why that change was necessary?

The #pre_render in load, yes it doesn't really belong there, put that down to me reading through Field API as I go along...

Nedjo pretty much covered translatable fields vs. translatable nodes. The main reason for trying this approach is that trying to share anything between nodes in a translation set (nodequeues, votes, CCK fields) currently requires doing everything twice - we've been working on adding that support to contrib modules and it's non-trivial to implement and requires a lot of complexity. I also have an early draft of a patch for attaching fields to translation sets (i.e. proper shared fields as opposed to copying values over) here: #132075: Attach fields to translation sets. although that takes a bit of fakery to get going. So this would be a much simpler approach to take for things like taxonomy terms and users, and an optional approach for nodes (obviously you wouldn't want both node translation and translatable fields on the same node type).

All very good points. To summarize:

- If fields are translatable and stored by language, then we are not going to store different nodes (objects) for translations. I.e. node/1 contains all translations on load.

- Separate nodes would mean that the additional language column for fields would not be required at all. However, as catch already mentioned, those separated translation nodes were/are a major flaw in the current translation system in many cases.

- OTOH, separate nodes have valid use-cases, f.e. to have only French comments on the French translation. I.e. the scheme of having a node/1 for EN and separate node/2 for FR has some pro points, but also a major con point: We can't have user/1 for EN and (the same) user/2 for FR.

- Upgrading existing nodes (with separate translation nodes) to translatable fields is rather impossible without rebuilding your Drupal 7 site from scratch. Users must, however, be able to use both node translations and field translations in parallel.

In short: We need translatable fields (e.g. for users), but we also need to keep the existing translation method for nodes.

@catch: I was forced to use element_children() because the $object->fieldname fields are everywhere treated as integer-keyed arrays of field values, as yched was pointing out: the introduction of the #pre_render and translation keys broke this assumption. If we decide to keep this approach we'll have to use element_children() almost everywhere in the Fields code (I wish we could use getters and setters to access object properties...)

As for my understanding there are three main aspects treated in this discussion:

  1. Storage pattern: if we decide that making fields translatable is the right choice (and IMHO it is) there might be a different solution for every storage pattern; I think that the current approach is the most natural for the per-field storage pattern, obviously if the per-bundle one becomes the default we'll have to choose whether a translatable field behaves like an unlimited multiple one or not.
  2. Node (entity) data structure: we are not yet sure of which shape the translated node (entity) object should assume; right now a node (entity) object supports a single language: we need nothing more, unless we want to show more than one translation at a single time. If we had a node (entity) factory returning a node (entity) object translation for a given language, we should be able to keep most of the object-handling code almost unchanged, as the the data structure would be the same as before.
  3. Node translations vs Field translations: as previously stated by sun there are valid use-cases for both the approaches, that's why I think that we should try to support both of them; with this I don't mean that we should implement both of them: having translatable fields and maintaining unchanged the node data structure would allow us to emulate node translations, as I was saying here; this way we could get the most out of each approach. Obviously this would force us to entirely convert Node to Fields, and this may be a very expensive process, I guess.

I am not sure that any of my proposal is viable, but I think we should find a shared solution for all the aspects above to get somewhere.

We spent all the 3rd day in discussing, this is the plan for the 4th day:

  1. in field_sql_storage_field_storage_load we load the translations in #languages, i.e. field[0]['#languages']['en']['value'] -- we provide NO default value (for translatable fields, obviously)
  2. we have a hook_field_attach_load() default implementation which performs language negotiation, i.e. shifts the proper values into field[0]['value']
  3. There could be other (contrib) implementations which apply complex language fallback logic (override the default)
  4. drupal_render always gets a plain old node with all the field values in place (#languages is not dropped)
  5. We ensure 2. is not cached (#111127: Cache node_load issue) by implementing a mini-test.

read coding reviewing testing

@bjaspan: the storage pattern issue is still to be considered, we decided that, due to the field storage pluggability, we can ignore it for the moment: we'll be glad to discuss that on #drupal-i18n with you and yched.

So the idea here, is that for multilingual fields, we want the initial loading of fields/objects (and caching) to be agnostic about what actually gets rendered - so you always get a consistent field[$delta]['#languages'] array from database loading or the field cache regardless of what the current language is.

We also don't want drupal_render() to have to worry about translation. So, for multilingual fields only, we store the values in #languages, then populate the actual field value to be used in field_field_attach_load() - which means other modules can hook in later to do proper language fallbacks as needed.

Then any rendering/theming is going to just do field[0]['view'] same as now - doesn't need to worry about translation at all, and nor does the caching layer - all the hard work is done in hook implementations.

This seems like the best compromise between keeping the existing field array consistent and having enough flexibility. i.e. we won't need to use element_children() as in the existing patch or anything like that.

Status:Needs work» Needs review
StatusFileSize
new9.13 KB
Failed: 10267 passes, 1 fail, 0 exceptions
[ View ]

Here's a new version in line with yesterday's discussion. Main changes:

1. For multi-lingual fields, we load and cache all multilingual values for a field in a new $field[$delta]['#languages'] property in field_sql_storage_field_storage_load().

This has the following advantages:
* Regardless of the language for the current request, this information is always the same, so can be loaded from cache or the database safely.
* As a #property, it'll be 'invisible' to all field modules (see later) so shouldn't mess up existing code
* We'll be able to track delta values per language (language has been added to the primary key) - so field cardinality should still work (although saving delta per language is still a TODO).
* Language selection and/or fallbacks can be implemented in PHP instead of requiring complex JOINs (see 2)

2. In field_field_attach_load() we populate $field[$delta] with the best fit value fr the language language (per-delta, to allow for partially translated multiple value fields).

Once this is run, other modules implementing the hook could implement their own language fallbacks (use en-US, if not use en-CA, if not use en-GB) relying on the #languages array to see what's available. If we require locale and/or translation modules for translatable fields, then this hook implementation could also be moved there, but it gives an idea of how this would work.

After this point, all field modules and field API will see a single value per delta, as with non-translated fields, in the desired language for that request.

The main things still to resolve:
# Setting delta values on a per-language basis on save, and using the delta to enforce exactly the same cardinality rules on validation and load as happen for mono-lingual fields.
# Some idea of how this would map to a UI. My initial idea would be node/n/edit/fr node/n/edit/en with only fields in the available languages shown on those tabs, maybe with a side-side view so the existing values can be shown alongside the form.
# Tests.

However this is at the point where it needs more architectural review, so setting to CNR.

StatusFileSize
new8.51 KB
Failed: Failed to apply patch.
[ View ]

Added another fallback to an empty ('') language value, which basically equals the "language neutral" setting we have for nodes now.

I haven't looked at this yet.

I'm curious how this affects per-bundle storage (see http://drupal.org/node/368674, the patch keeps getting rejected but in fact is fine, and also pbs.module in contrib). Is there a defined list of all possible languages that can exist at any given time? If so, should per-bundle storage create a separate column in its wide tables for every possible language for every field that it otherwise puts in those tables? If so, there also needs to be hooks that pbs.module can implement for "create language" and "delete language" (and "rename language" if appropriate") so that it can know when it has to update the per-bundle table columns.

If there is a not a defined list of possible languages at any given time with hooks for when it changes, then I think this approach kills per-bundle storage, and that is going to be a deal-breaker for a very vocal contingent.

"Is there a defined list of all possible languages that can exist at any given time?"

Yes, it's available from language_list(), I don't know if locale module provides hooks for when languages are added or removed, but if it doesn't then it ought to ;)

More generally, were it somehow impossible to add fields to per-bundle storage, I don't see it 'killing per-bundle storage' as such, but it would kill per-bundle storage for translatable fields - meaning they'd have similar restrictions to fields with unlimited cardinality on that front, at least as I understand it. Even then, they'd only be subject to the limitations of per-field-only storage as it is now, as opposed to query rewriting, additional joins with conditions, or hits to the database on every load which most other options are going to impose.

Okay.

Quick read-only review:

<?php
// Only operate on translatable fields.
$fields = array();
$instances = field_info_instances($obj_type);
?>

This is wrong. field_info_instances() takes a bundle name, not an object type. This code will work for users but not for nodes. See other calls to this function in (e.g.) field.attach.inc.

I don't love having field_field_attach_load() in field.module do the language thing. If this is going to be a core behavior of Field API, it should probably be in field_attach_load() directly. If not, it should probably be in a different module (locale? something else?).

The primary key for field_config and field_config_instance is going to change to serial ids. The current primary key will become a unique key. Anyway, it isn't clear to me where 'language' belongs in the key, if it does, or if we should have a separate index for it.

I haven't really thought about the changes to field_sql_storage.module.

[edit : how can I crosspost with a comment posted half an hour ago ? - bah]

Architecturally this looks fine.

field_field_attach_load() :

+  $instances = field_info_instances($obj_type);

is not right. field_info_instances() takes the bundle of the object, which can be extracted using field_attach_extract_ids().
Actually, the common way to 'run an operation on a object's field values for every instance in the bundle' is through _field_invoke_default() / field_default_*(), so field_field_attach_load() might be worth implementing as a field_default_* operation, for consistency.

About impact on PB storage, the main issue is limitations on the number of columns. I don't know if pbs.module intends to account for this (don't know if that's even possible), but adding one set of columns per language, per delta (pbs stores all limited-cardinality fields) makes you reach the limit much faster.

pbs.module could choose to store only single-value fields if they are also translatable. It could even make that decision dynamically based on the database's per-table column limit and the number of fields it wants to combine. Or it could provide a UI to let the admin decide.

catch: FYI, hook_field_attach_load() is going to change as part of the hybrid storage patch to be completely different than it is now. The current hook behavior will probably be provided by hook_field_attach_load_alter() or somesuch. This will be an easy update to make, obviously. :-)

StatusFileSize
new8.77 KB
Failed: 9338 passes, 0 fails, 1 exception
[ View ]

field_field_attach_load() - I'd be fine with moving it to locale.module - since locale is required to have languages, then it can realistically be required for translatable fields to work.

field_info_instances() - thanks for that, now fixed. (both these changes in this patch)

primary keys - do you mean translatable rather than language? Language is added to per-field tables and in those tables, to allow for one delta to have a record for each language, we added language to the primary key there. But yeah there's nothing for translatable yet, probably needs more discussion.

Thanks for the feedback, looking forward to more.

Sounds good with hook_field_attach_load() changing, it looks misplaced at the moment since it doesn't get added to the field cache but runs in the 'if not cached bit' (and also not multiple, tsk tsk). Is it always going to be reliably not added to the field cache? Also, if the cache node load patch gets in at some point, that'll be caching the full output of field_attach_load() - so we might need to move hook_field_attach_load_alter() out of that, somehow. A bit tricky but a fairly minor issue compared to the other ones that need solving. I'm glad this version is going down better than what we came up with last week ;)

Hoping to add some very basic low level tests for this, if possible later tonight.

StatusFileSize
new8.66 KB
Failed: Invalid PHP syntax in modules/field/field.install .
[ View ]
new31.57 KB
new31.66 KB
new31.54 KB

Made some small fix to make it work, we are still stuck with the field cache issue, but apart from that the patch seems to work.
I tested it again with demofield: I manually inserted three demofield entries (en, it, 'language neutral'), then I configured three languages for the site (en, it, fr) and made fr the default one: all seemed to work as expected.

You can find some screenshot attached.

Status:Needs review» Needs work

The last submitted patch failed testing.

StatusFileSize
new8.44 KB
Failed: Failed to install HEAD.
[ View ]

Windows newlines strike again...

Status:Needs work» Needs review

Crossposting UI proposal: http://groups.drupal.org/node/19057

A few more remarks :

-

+        // For translatable fields, put the multilingual values for each delta
+        // into a '#languages' property. This information can be safely cached
+        // in the field and object caches.
+        if ($field['translatable']) {
+          $item['language'] = empty($row->language) ? '' : $row->language;
+          $additions[$row->entity_id][$field_name][$row->delta]['#languages'][$item['language']] = $item;
+        }
+        else {
+          // Add the item to the field values for the entity.
+          $additions[$row->entity_id][$field_name][] = $item;
+        }

means that if a field is 'translatable' but locale.module is disabled, we get no useful data for the field at all (stays hidden in '#languages', only locale knows how to fetch it from there).

I think the overall approach is fine but there are some details to iron out about code separation and who does what - like : 'translatable' is an "official" $field property, stored in the field_config_* tables (not sure we need a dedicated column, btw), and storage modules need to support it, but the actual feature is hooked in by another, non-required module.

- The current code doesn't let you specify "I want to load the fields for a given object in a given language" - the language is automatically decided for you based on page context.

- So far the discussion mainly focused about loading the data, not too much about saving incoming data. We'll need to be able to read the language of the incoming values somewhere, not sure how this will look like. The patch currently sets $obj->field_name[$delta]['language'] on load (field_sql_storage_field_storage_write() needs to be fixed, uses $item->language instead of $item['language']).
Means you can write an object with a french field_a and an english field_b ? or even field_a value 1 in french and value 2 in english ?
Means you have to manually specify a language for all field values when doing a programmatic node_save() ?
+ detail : $item['language'] might clash with a field-type 'column' named 'language'.

StatusFileSize
new9.88 KB
Failed: Failed to install HEAD.
[ View ]

nedjo, catch and I managed to address the field cache issue, now all seems to work fine: the screenshots are the same as before but no need to truncate cache_field before switching language; I think now we are ready for a serious review.

I think the overall approach is fine but there are some details to iron out about code separation and who does what - like : 'translatable' is an "official" $field property, stored in the field_config_* tables (not sure we need a dedicated column, btw), and storage modules need to support it, but the actual feature is hooked in by another, non-required module.

Yes, we need to make this decision fairly soon. We could move locale_field_attach_load() back into field.module somewhere, so at least we're always guaranteed a full field, and then other modules can still override the handling. Depends how much we want core field module to know about it I guess.

One possible option would be to have field module show all values for multilingual fields by default - so ignoring language apart from populating the #languages array early, and otherwise it just builds the array as usual (so field_sql_storage_field_storage_load() would just populate the array as usual in addition to adding the #languages for the deltas. So in load, every field would get it's own array key, no empty values or filtering at all, then locale module comes along, says

<?php
unset $object->field_name;
$object->field_name = array();
$object->field_name[$delta] = $best_fit_language_value + everything still in ['#languages'] for later
?>

That means, if you have a multilingual site, then one day just disable locale module, suddenly your multilingual content is showing all values for every language - might make your nodes look ugly, but we don't have panicky admins with 'missing' content that way.

- The current code doesn't let you specify "I want to load the fields for a given object in a given language" - the language is automatically decided for you based on page context.

The field value for rendering is, but you still have all the translations in [$delta]['#languages'] if you need to get to them. Do you have an idea how we could offer both here?

So far the discussion mainly focused about loading the data, not too much about saving incoming data. We'll need to be able to read the language of the incoming values somewhere, not sure how this will look like. [..]
Means you have to manually specify a language for all field values when doing a programmatic node_save() ?

Yes this is where it gets tricky, and where it might be partially informed by our UI. For a programmatic node_save() - if your field is translatable, then IMO yes, it'd be like any other field value when saving - just always there regardless of the field providing module (or if not set, it's saved as language neutral).

StatusFileSize
new10.1 KB
Failed: Failed to apply patch.
[ View ]

a) Translated field values need to use the same deltas like the original/source field values. This means there's not much we can stack into field values when a site or object suddenly is no longer translatable. I am not sure whether Translation module in core supported this before. As a user, I would expect that only language neutral values will remain when I disable translatability. That's the default case for both the 'language' column as well as programmatically created objects that do not specify a language.

b) Locale module is for user interface translations/localization. Translatable fields belong to content translation, so if we happen to move the functionality, Translation module has to get them. This particular issue boils down to the more generic question whether Drupal core will provide support for multilingual content by default in the future. Doing this would vastly increase the sensibility and awareness for languages throughout core and contrib. I think this should be discussed in IRC or in a separate issue. Please NOT in this issue.

In this patch:

- Moved implementation of hook_field_attach_load() into Translation module.

- We should be able to safely use $row->language, as the column is defined as varchar not null default ''. Solving the potential $item['language'] name clash.

- Minor clean-ups.

StatusFileSize
new32.38 KB
new32.54 KB
new32.39 KB

Tested sun's patch and works as expected, attaching fresh new screenshots (with the user!).

@sun:

Translated field values need to use the same deltas like the original/source field values. This means there's not much we can stack into field values when a site or object suddenly is no longer translatable. I am not sure whether Translation module in core supported this before. As a user, I would expect that only language neutral values will remain when I disable translatability. That's the default case for both the 'language' column as well as programmatically created objects that do not specify a language.

with catch we were worried about the case that NO language neutral value survived the translation process, that's why we came up with that proposal; personally, I am more oriented on displaying just the language neutral content and applying some fallback logic on translation module disabling/uninstallation; also maybe: the first value inserted for a field is duplicated both in a language neutral row and in its language specifc row, this way we always have the language neutral value populated (the famous source).

StatusFileSize
new10.91 KB
Failed: 9744 passes, 0 fails, 1 exception
[ View ]

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new11.9 KB
Failed: Failed to apply patch.
[ View ]

Last one introduced {field_object_language}, a generic table that stores the source/default language for each object.

This one focuses on fetching the information in _field_attach_load(), before objects are passed to the storage method. Needs work though. But leaving PNR, since all of this needs to be reviewed.

StatusFileSize
new11.92 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal7-sun.translatable-fields__.patch.
[ View ]

Optimized method for checking for (overall) object (field) translatability.

in _field_attach_load() :

There's a misleading // Make sure empty fields are present as empty arrays. comment.

+        if ($object->{$instance['translatable']}) {

- shouldn't that be if ($field['translatable']) ?

@sun : suggestion - could you name your patches after the issue and comment # ? E.g drupal7.translatable-fields-367595-58-sun.patch. Would make checking progress between patches easier :-)

StatusFileSize
new12.38 KB
Failed: Invalid PHP syntax in modules/translation/translation.module .
[ View ]

Just tested the patch and fixed some minor problems.
I added the language_default option as the second best fit choice in translation_field_attach_load.

I was thinking about one thing: if we used the newborn {field_object_language} table as a registry for all the translations of an object we could store the default/source language, but also some translation metadata (e.g. creation date, author, last modified date, ...).
This way we could actually emulate the current node translation functionality through translatable fields without having to convert each single node field to Field.

The {field_object_language} table might become something like a {field_object_translations}:

+------+----+----------+---------+--------+---------+----------+
| etid | id | language | default | author | created | modified |
+------+----+----------+---------+--------+---------+----------+
|   1  | 1  | fr       | 1       | 2      | 1234567 | 1234567  |
+------+----+----------+---------+--------+---------+----------+
|   1  | 1  | en       | 0       | 1      | 1234567 | 1234567  |
+------+----+----------+---------+--------+---------+----------+

Status:Needs review» Needs work

The last submitted patch failed testing.

StatusFileSize
new12.06 KB
Failed: Failed to apply patch.
[ View ]

fixed windows newlines...

Status:Needs work» Needs review

StatusFileSize
new12.66 KB
Failed: Invalid PHP syntax in modules/translation/translation.module.
[ View ]

Updated to move the #translations key into $object->field_translations as discussed last week.

Quick summary of where this issue is:

All translatable fields are stored in a per-field table with a language column. On load (i.e. in field_sql_storage module), we load all the values in each language into an $entity->field_translations property. So for example:

$entity->field_translations['my_text_field']['en'][0]
$entity->field_translations['my_text_field']['fr'][0]

Then in field_attach_load() itself, we pull the values for the object's default language. This is stored in a new table, which has the object type, ID, and the language it was originally entered in (plach's ideas in #61 look like good possible extensions to this). We decided it doesn't make sense to associate a language with a user (and we definitely can't have tnids for users), but this metadata allows us to populate the field value for each object with a sensible default value before any field hooks need to operate.

This is in the update patch, and it's one simple query to get this information per set of objects being loaded, and only if they're loading translated fields. At this point we can safely cache both the $additions object in the field cache, and the $object itself for something like node_load() caching - because this data is always going to be persistent across requests. No need for extra cache keys or anything like that.

The translations are still available in $entity->field_translations at all times. This means that hook_field_attach_load() and later hooks can pull out translations in the user's language, implement complicated fallbacks like Czech -> Slovak -> Croatian etc. - and they only have to deal with an array of field values keyed by language code - no going back to load different values from the db, no query rewriting. Rendering and everything else always deals with the values in $entity->field_name - so this process is transparent to field modules, themers etc. (although once again the data is available for advanced use cases).

What's still missing is saving of the data (the language for each field value) and a UI of course (and tests, but that really relies on us having the data saving working to an extent so we can load it again). At this point I think it'd be good to know how much we need to do in this initial patch in order to have something committable. I don't think we should try to tackle the UI issue here (and there's a groups discussion open if anyone wants to have a crack at that), probably we need to get saving working.

Status:Needs review» Needs work

The last submitted patch failed testing.

Great ! No time to look at the patch right now, just a minor point :

Could we abstract the current code in translation_field_attach_load() into a function that does 'switch $obj->field_name front values with $langcode (with fallbacks)' - it seems we'll need that function anyway - and then simply call that function in translation_field_attach_load() ?

@yched: nice idea, I'll work on it and on reviewing the last patch tonight

StatusFileSize
new12.61 KB
Failed: Failed to apply patch.
[ View ]

Fixed typo in translation.module. I agree with abstracting out the code in translation_field_attach_load(), also translation.module isn't the place for it to live - locale() is required to have languages enabled, so while that's technically for interface translation, it seems like a less heavy dependency.

1) I'm still bugged by the fact that if translation.module (or locale, if this ends up in there) gets disabled, you get no $entity->field_name value for translatable fields, which are then 'lost'. field_sql_storage_field_storage_load() should at least populate $entity->field_name with the values for the object's language_default. You'd still get a value, even though you disabled the 'translation' feature.
That would mean moving the code that retrieves the default_value *before* the call to field_storage_load() in field_attach_load(), instead of just after.

2) There's a now a DX difference between :
a) entity types that handle their own caching outside of field_attach_load :
they get to cache an object where the current page's $best_fit language has already been put upfront by translation_field_attach_load() at the end of field_attach_load(), so they need to remember to do the language $best_fit reshuffle themselves after reading from their cache.
b) those that don't handle their own caching :
field cache then stores 'current language'-agnostic values + all translations, and $best_fit negociation happens on the values that get out of the field cache, before we return from field_attach_load().

Makes me think that maybe what currently lives in translation_field_attach_load() does in fact not belong to hook_field_attach_load(), but rather should be the job of a separate field_attach_*() func, field_attach_get_translation() or whatever.
- Fieldable entities call field_attach_load() (or fetch from their cache); this gives an object populated with fields values for the object's default language, plus $entity->translations.
- They then call field_attach_get_translation('fr'), repopulates $entity->field_name entries with translated values. This function does *no* db hit, just invokes hook_field_attach_get_translation(). The current translation_field_attach_load() then just becomes translation_field_attach_get_translation(). And field_attach_get_translation() is then the function I mentioned in #67.

Opinions ?

3) Bikeshed point :

$entity->field_translations['my_text_field']['en'][0]
or
$entity->field_translations['en']->my_text_field[0]
?

Since we're talking about translations, seems natural to find an array keyed by languages; then, in $entity->field_translations['en'], you find the same data format that you're used to all over the field API.

Status:Needs work» Needs review
StatusFileSize
new14.07 KB
Failed: 10609 passes, 1 fail, 0 exceptions
[ View ]

I reviewed and tested #69, didn't notice yched's #70 until now, but incidentally some of the issues raised there are covered by this one:

1) Now field_sql_storage_field_storage_load() loads the language default field transations directly in place, if they are available.

2) Too late for me to think about this, just note that this patch introduces a field_attach_switch_translation() function which accepts a series of language preferences and which is used in translation_field_attach_load() for the "heavy work".

3) +1 for $entity->field_translations['en']->my_text_field[0]; moreover I have the bad feeling that $entity->field_translations might clash with a field named translations, what about another name?

Just got back from vacation and have not had time to review the new patch or design, so I'll just comment on the bikeshed issues :-).

I agree with $entity->field_translations['en']->fieldname instead of $entity->field_translation['field_name']['en'].

I'm not sure I like the property name field_translations. Currently, the "field_" prefix for fields is imposed only by the CCK UI; if you create a field named (e.g.) "body" then you get $object->body, not $object->field_body. Using field_ as a prefix with CCK and as a special case prefix for field_translations seems poor. However, I do not presently have a better suggestion. $object->translations would work, but that would be the first case of the Field API "polluting" an object's namespace with a constant property name. Perhaps fieldable info should indicate the property name to use, and then the field API would refuse to create a field whose name matches that property.

Actually, that brings up the important point that field api should ALREADY be refusing to create field names that override other property names identified by fieldable info (e.g. for node: nid, vid, type) but is not. Time for a new issue.

Nedjo, Sun and I had an IRC meeting in which we discussed most of the open issues, these are our conclusions:

  1. Accounting for entity-level caching in this issue is scope creep: we leave the introduction of a field_attach_post_load() for the future (if the cache node_load() issue does not beat us to do it).
  2. We rename field_attach_switch_translation() to field_attach_translate()
  3. We keep the hook_field_attach_load() implementation in Translation module, because it's content, because Translation module is swappable (easier than Locale), and because the field translation UI must be provided by some module. But we hope to get rid of Translation module, or better, the old-fashioned entity-translation.
  4. We change the structure to $entity->field_translations['en']->my_text_field[0]
  5. A definable property for $entity->field_translations sounds like the way to go: it would be good but not essential to be able to define a default property name, since there's no strong reason to expect that this needs to differ between object types.
  6. We defer the language fallback logic to a separate issue. In field_attach_translate() we simply iterate over all fields separately and check whether there are values for the given language (default is the current language) in our field translations. If there are values, we use them - (even) if they are empty. If there are no values, the source/default language values remain.

StatusFileSize
new15.84 KB
Failed: Failed to apply patch.
[ View ]

This should cover the points discussed yesterday, the main issue still open is saving.

subscribe

StatusFileSize
new20.51 KB
Failed: 10268 passes, 0 fails, 46 exceptions
[ View ]

In this one I addressed the saving issue: at the moment all fields being saved are assumed to share the same language or have none, I'm not sure this is the right approach but needing feedback on this.

This is the first patch fully-testable with no need to insert data in the DB with a SQL client: just install the translatable version of Demofield and enable the multilanguage support for the article content type, then start creating some articles and enjoy your translatable (demo) fields :)

(Obviously you'll have to enable the language switch block to see the translations)

Status:Needs review» Needs work

The last submitted patch failed testing.

Barry got this issue to my attention. With all the Drupalcon DC craze, I'll only have time to come back to this later.

Status:Needs work» Needs review
StatusFileSize
new26.55 KB
Failed: Failed to apply patch.
[ View ]

I was unhappy with the way the previous patch handled the field language saving, primarily because it involved the field item property _language: IMHO that one would have made supporting the node_save(node_load($nid)) requirement pretty hard to achieve.
Moreover I realized that the $object->language_default property was another case of entity namespace pollution, following barry's definition :)

The current patch:

  1. uses fieldable info to define a dynamic object property to store the language default, which is set in the field submit handler on entity creation using $form_state['values']['language']
  2. stores the submitted field data into $object->{info['translations key']}[$form_state['values']['language']]
  3. deletes the entity's language default from db on entity deletion
  4. refactors field_sql_storage_field_storage_write in order to be able to write all the field translations available in $object->{info['translations key']}, if any (for translatable fields the actual value is ignored and only translations are taken into account)
  5. makes a small correction on the tests in order to make the patch valid

This way on field save all the translations found are just written to the db, allowing us to perform a node_save(node_load($nid)) (with all the available translations in place) or a node_submit() (with just the current translation in place) having a consistent behavior.

I think the next step is writing down some simpletests but obviously I'd like to have your feedback on this one before starting.

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review

Unable to reproduce any error, re-testing...

Status:Needs work» Needs review

I don't think that's the right approach for saving.
In this approach, it's not the content of the canonical $object->$field_name that gets written, but the values in the nested $object->translations[$lang]->$field_name.

The patch accounts for that in the case of a regular node form submission by populating form values into $object->translations[form_lang]->$field_name instead of $object->$field_name in field_default_submit().
- it means the values are not where we expect them (in $object->$field_name) for the rest of the submit workflow : hook_field_validate(), hook_field_attach_validate(), hook_field_presave(), hook_field_attach_presave() are screwed. Our current approach for 'load' is also broken on that aspect : if they only look in $object->$field_name, then hook_field_load() and hook_field_attach_load() only see one language.
- it doesn't account for programmatic saves : developers need to remember 'I should put the values in $object->field_foo for non-translatable fields, but in $object->translations['fr']->field_bar for translatable fields'. That's terrible DX IMO.

This is leading to a really convoluted API.

I don't think I see a way to have the three of :
1) $objects that load and save all their translated values in one operation and carry them for all their lifecycle.
2) a reasonably simple DX where field values live in a predictable spot in the $object, translatable or not, for all apis and hooks...
3) an $object structure similar to CCK D6 to ease DX transition to D7

I'm starting to wonder if the solution is something like how multiple values are handled : when it comes to the format of the data in $object, all fields are 'multiple', and single fields are just a degenerate case that only come with delta 0. Likewise, all fields are 'translatable', and have a structure like : $object->$field_name[$lang] = array of values keyed by numeric delta. Non translatable fields only have an $object->$field_name[''] entry.
We sacrifice point 3) above and all the APIs (field-type modules, hooks, templates, etc) have to account for that new structure, and do their stuff accordingly. Maybe supporting natively translatable fields comes at that cost...
Er, is this where we started from, and we just moved in circles ?

Status:Needs review» Needs work

The last submitted patch failed testing.

I agree with yched that we need a global method similar to what we do with multiple values so we have a predictable structure to the array whether or not there are translations. Then we just have to adapt all the functions that touch the array to expect the new structure.

I haven't tested it yet, but I'm worried about option_widgets which has to transpose the array structure during FAPI processing -- I'm hoping that won't break completely. I'm guessing no one has tested that yet because I'm pretty sure that code will need some adaptation.

Oops, it's not option_widgets any more, its 'options'. And where I said 'array' I mean 'object', but you get my drift.

StatusFileSize
new26.21 KB
Failed: Failed to apply patch.
[ View ]

Here is a rerolled version of the last one just in case anyone wishes to test it.

@yched: all your points are good, I think the main reason we decided to go this way was to preserve the field/cck code from massive refactoring (#3); I think I can see a way to concile the three points, but after what you and karen stated I am not so sure that #3 is a priority anymore. Anyway:

The patch accounts for that in the case of a regular node form submission by populating form values into $object->translations[form_lang]->$field_name instead of $object->$field_name in field_default_submit().
- it means the values are not where we expect them (in $object->$field_name) for the rest of the submit workflow : hook_field_validate(), hook_field_attach_validate(), hook_field_presave(), hook_field_attach_presave() are screwed.

Right, there was a lack of simmetry between load and save: currently after field load the canonical fields are populated, but on field submit we did not match this behavior. In the current patch both $object->translations[form_lang]->$field_name and $object->$field_name are populated, this way we should be ok.

Our current approach for 'load' is also broken on that aspect : if they only look in $object->$field_name, then hook_field_load() and hook_field_attach_load() only see one language.

Yes, it seems that currently also the load hooks might be screwed up, but if we perform the translation switch-in trick by calling a hook_translate before invoking hook_field_attach_load we should be fine: all the modules implementing the load hook will see the right translation according to the current UI language and fallback rules, if they need a different language they should know where to find it.

- it doesn't account for programmatic saves : developers need to remember 'I should put the values in $object->field_foo for non-translatable fields, but in $object->translations['fr']->field_bar for translatable fields'. That's terrible DX IMO.

Absoultely right. I think this is one of the most evident examples of the need of using getters/setters to access object properties now that we have PHP5 objects, but also this solution would involve a massive code refactoring.
So I am thinking about an api function that we might use to populate the object fields, something like field_store_field_values($fields, $object, $language = NULL).
The $object->$field_name[$lang][$delta] is also fine for me, but I am a little bit scared about the amount of work it would take.

@KarenS: as you expected I did not test the options widget, to be honest I never really got into that transpose stuff, if you could provide me some clue or even better test it that would be great :)

I still see the following code in the patch:

<?php
-        // Add the item to the field values for the entity.
-        $additions[$row->entity_id][$field_name][] = $item;
+       
// For translatable fields, put the multilingual values for each delta
+        // into a dedicated property. This information can be safely cached
+        // in the field and object caches.
+        if ($field['translatable']) {
+         
$additions[$row->entity_id][$info['translations key']][$row->language]->{$field_name}[$row->delta] = $item;
+        }
+       
// Load translation defaults according to the entity default language.
+        if (!$field['translatable'] ||
+            (isset(
$objects[$row->entity_id]->{$info['language default key']}) &&
+           
$objects[$row->entity_id]->{$info['language default key']} == $row->language)) {
+         
// Add the item to the field values for the entity.
+          $additions[$row->entity_id][$field_name][] = $item;
+        }
?>

This means the node object will be structured differently for translatable and non-translatable fields, which is what we are concerned about. We need a consistent structure in both situations so we don't continually have to test whether there is a language or not when we manipulate the node. There should be a empty value using the same structure for no language rather than creating a different structure in that case.

However I'm also wondering about losing the top-level position of fields in the bundle. In the current code, you can always be sure that the value of 'field_example' is located in $node->field_example and with this new structure you will need a way to figure out what languages are available to be able to find your field values in the bundle. I am now wondering if it might be better to reverse that so we have (with an empty placeholder for the language when it is not translatable) so that we can still have an easy way to find the values for 'field_example' in any situation:

$node=>
  field_example=>
    en =>
      0 => ...
      1 => ...
    fr=>
      0 => ...
      1 => ...

And a field with no language would be:

$node=>
  field_example=>
    '' =>
      0 => ...
      1 => ...

@KarenS: I understand your and yched's concerns but I would like to be sure that the current approach, which is the result of many endless discussions in IRC with catch, sun and nedjo (at least for the loading part), is totally clear before marking it as not viable, so excuse me if I repeat myself and/or repeat things you're fed up with :)

The main guidelines we followed while designing this were:
A) Avoid per language entity caching
B) Avoid to break anything is currently working
C) Support node_save(node_load($nid))
D) Support any kind of language fallback rules that one might need

To support A we decided to load and cache all the available field translations at field load.

In order to support B we decided to leave untouched the current $object structure and store the field translations in a dedicated property: the current $object data structure is designed to hold a single language and we need nothing more, unless we have to show more than one translation at a single time. So we decided we can load the best suited values at each content view in the $object canonical fields and store all the field translations elsewhere; currently this is what happens on field load:

  1. Field storage load: if a field is translatable its values are stored in $object->$translations[$lang]->field_value; if a particular field translation matches the source language (the one set while creating the entity), it is also stored in the canonical place: $object->field_value. These values get cached. When the process ends we are ready to invoke hook_field_attach_load with an $object in which all the canonical fields should be populated. This should guarantee that nothing breaks.
  2. Field attach load: currently the Translation module implements hook_field_attach_load and switch into the canonical fields the tranlations for the current language if available in $object->$translations[$current_lang]. All the following modules see a translated $object, again with all the canonical fields populated.
  3. Entity rendering: we have a plain old fully populated (although translated) $object, we have just to show it.

To support C I tried to match as much as possible the load behavior: on entity save anything found in $object->$translations is written to the DB plus all the untranslatable field values found in the canonical object properties.
On entity submit all the field values are stored in their canonical fields, if translatable they are also stored into $object->$translations[$form_lang].

D might be achieved through contrib modules implementing hook_field_attach_load as Translation does, otherwise we might introduce a new hook, e.g. hook_field_attach_translate, to be called before attach load.

If we have done the things right, the only places you should need to look into are the canonical $object fields.

Assigned:plach» Unassigned

plach ; no worries. My suggestion in #83 indeed has a high API change cost, so there's no harm in trying to see if the current approach floats. Actually, I have myself pushed in this direction, so I would not lightheartedly discard it. Plus #89 provides a really welcome summary of where we currently stand.

The problem is that populating one language at a time in the canonical $object->$field_name is not enough for hook_field_[attach_]?_load, hook_field_[attach_]?_validate, hook_field_[attach_]?_presave, etc. If something has to be done about field values in there, it has to be done for all languages, not just the current one.

Take hook_field_load(). This is where file field reads from the {file} table and merges the info into the field value - which natively only stores an fid. This has to be done for all languages, not just the object's default language, or any definition of 'current language'.
Same goes for validation : if all languages are written, then all need to be validated, not just the language that sits in front at the time validation is performed - and we do support validation for programmatic saves too (even if no 'core' fieldable entity uses it right now), so this not just a question of form workflow.

Which means that, in the curent approach, the hook implementations need to take care of both the front location ($object->$field_name) and the translations bucket ($object->$translations[$lang]->$field_name). I really think this is no good, and, again, take my fair share of the blame for pushing in what I now think is the wrong direction :-(

Assigned:Unassigned» plach

Subscribing to Field API Progress and mentoring initiative.

Assigned:Unassigned» plach

I am not so sure we have to apply all the hooks to all the field translations at each entity view/save and I would like to explore this idea.

I'd like to insist on the point that during a single entity view we need just a slice/facet of the $object data: if we have the proper translations (according to all the defined fallback rules) in place before calling the field_[attach_]load hooks I think we can safely ignore the other translation data.

Let's resume the file field example: say the field is translatable, we have an english PDF and a french PDF, each one with its fid; when we are viewing the english version the load hooks will get into the $object the english PDF data, viceversa when viewing the french translation we'll get the french PDF data. Why bother about the other translations in this case? Entity-level cache, maybe?

On the save side if we let just one translation be created/modified at a single time, we'll have that all the validation/presave/uknow hooks are correctly invoked and again can safely ignore any other existing translation data: if it exists it must have passed through all the above hooks previously to get there, so it must be valid and complete. This way, when saving an $object holding many translations we should be pretty sure that all of them passed through the hooks each time they got modified.

To ensure this probably we should slightly modify the current save process to be completely symmetric with the load one: we should be guaranteed that the we can perform a switch-out trick (pulling out the data from the canonical place and store it in $object->$translations[$save_lang]) after all the other hooks are over.

Let's say a basic switch-out is performed by the Translation module and let's say the data found in the canonical object fields is considered to be in the source language (and written onto the DB accordingly), if we disable Translation we gracefully degrade to the simple no-translation case in which the data is pushed into the DB and pulled out of it without having to worry about its language.

Having all these assumptions satisfied would allow us to have a completely language agnostic field_save: all the translation logic would be delegated to hook implementations, e.g. the Translation one.

Talking about good DX? :)

I think plach is right about hook_field_load() - the only place we'd need all hooks run on all translations of a field would be for entity level caching - in that case we can just not cache hook_field_load() in the object - which currently isn't cached anyway. This is already accounted for in the cache node_load() patch which has a hook_nodeapi_post_load() for those situations - an equivalent field_attach_load_uncached() could be introduced as well - we'll hit the same problem for user specific information in fields, not just languages - i.e. poll module's $allowvotes. If there are other use-cases apart from that for having everything run on all translations that'd be good to know though.

$object->fieldname[$lang] - means every piece of code which has to grab a field value manually has to account for languages - i.e. $object->fieldname[$lang][0]['view'] in node templates. Do I need to know if a field is translatable to use [''] or [$lang]? I also disagree that this is a predictable array structure - if a field is translatable, but untranslated - then I'll have empty values for $lang all over the place - leading to fallback logic needing to be implemented at a much higher level. We tried really hard to keep the heavy lifting for translations as internal to the API as possible to avoid situations like this. It seems like plach has a plan to avoid too much complexity on programmatic saves - but even if it's a bit more complex, better that than putting the fallback logic on everyone who needs to access bits of the node object directly.

Summary of IRC meetup with plach and sun yesterday - new approach :

1) field_attach_X() has to operate equally on all languages. All hooks and function calls (field_default_X(), hook_field_X(), hook_field_attach_X()) have to be run on each language.

2) No more $object->translations. We go with :

// translatable field:
$object->field_foo = array(
  'en' => (array of values keyed by delta),
  'fr' => (array of values keyed by delta),
);
// non translatable field - or translation.module disabled:
$object->field_foo = array(
  '' => (array of values keyed by delta),
);

3) We keep this new 'keyed by language then by delta' structure transparent to field_default_X() and hook_field_X(), by changing _field_invoke() so that it calls them for each language separately. What they receive in $items are the values for *one* language (they probably need to know which one, with an additional $language param), and they act as they do know.
3rd party modules that act directly on the object through hook_field_attach_X() have to deal with the new double-nested structure.

4) Still to be ironed out : field_attach_view() / field_attach_form().
Usual use case is : 'display $object in spanish' / 'display the form for $object in french'.
field_attach_view() : we need to extract *one* set of values of each field (witch language fallback logic), and display it.
field_attach_form() : we need to extract values of non transtatable fields and values in translatable fields in a given language
But it would also be useful to be able to have several languages in one form (side by side translation). Even if we don't implement it, we probably need to make sure its possible, possibly with some form massaging.

I like the sound of this approach, it sounds in concept like something that will provide the flexibility needed but still have a predictable structure that other parts of the code can deal with sanely. Thanks for all the work hammering this out!

Status:Needs review» Needs work

The last submitted patch failed testing.

StatusFileSize
new17.39 KB
Failed: Failed to apply patch.
[ View ]

This is the first draft patch implementing the new approach described in #94. It needs lots of work (I did not even try to make it pass the tests) but the basic form render/field save/field load/field display code seems to work correctly.

However there are already a few basic points that I'd like to discuss before going on: the current approach has the great advantage of not needing the concept of current language. Basically fields are natively multilingual, so the concept of current translation is something that might be built on top of the basic field layer by other modules (I am looking at Translations).

As fields become multilingual I think Field API should not enforce any kind of restriction on the values that get entered/displayed, just as much as it does not say "only odd values are entered/displayed for multiple fields". IMHO the language selection on field enter/display should be performed by the Translation module or any other one that might need to hook in to implement complex visualization/fallback/form manipulation logics.

The current patch goes this way: on form rendering all the field values for all the available languages are displayed, on entity save all the field translations are written to the DB, on entity load every field translation is loaded and cached, on entity view all the available field translations are displayed.

Anyone wishing to test the patch while reviewing it (I'd appreciate it a lot :) can use the translatable version of Demofield.

Yay ! Great work, plach ! I do feel we're getting there.

My code-related remarks so far:

1) Minor : can you set up your editor to remove trailing whitespaces (including on empty lines) ? Removing them in followup patches is a pain.

2) field_create_field() : please use 'translatable' => FALSE instead of 0

3) _field_invoke() :
The incoming object might have values for a language and not for others (think: submission of a form with only one language)
So the // Initialize field translations according to the available languages. part should act language per language :

foreach (languages as $lang) {
  $field_translations[$language] = isset($object->$field_name[$lang]) ? $object->$field_name[$lang] : array();
}

4) _field_available_languages():
what's $instance['languages'] ? I don't see it anywhere else in the patch. Probably related : hook_field_languages ? Could you elaborate on the need for this ?
invalid by-reference param at call-time : $function($field, $instance, &$languages);

5) db storage : now would be a good time to actually write the code that limits the number of values *per language* on read / write
(current patch doesn't count values as soon as a field is translatable)

6) field_sql_storage_field_storage_write() / _field_sql_storage_field_storage_write_items()
I don't think we need a separate _field_sql_storage_field_storage_write_items() function.
It currently does one insert query per language, which is a performance loss compared to:

delete values for all languages
foreach (languages as $lang) { // degenerate case for non-translatable nodes : languages = array(FIELD_LANGUAGE_NEUTRAL);
  add the values for the language to the insert query
}
run the insert query.

7) Also: it's probably time we start to write some tests ;-) ?

Less technical : The points about form / view will need "more i18n / less Field API' oriented people like nedjo or Jose to come back and provide feedback / insights for UI strategies. Maybe a couple screenshots would help to convey a better idea of where the patch currently stands ?

Also : I don't think we should tie this patch to *the* perfect UI for field translations, nor that both discussions can happen in the same thread. Ideally, support for translatable fields should be committed first with a 'reasonable / least intrusive' UI, and let discussion about the ideal UI ake place in a separate thread. So we need to agree on this minimal acceptable UI for 1st commit.

In short: i18n gurus needed, now :-)

What we can be pretty sure about IMO is :
- we'll need to view 'de + neutral' values only, when current page request (according to lang negociation rules decided somewhere else) is 'de'.
- we'll need to handle the case of a form submission for one language only, and make sure the values for this language are validated and saved properly, while the other languages are untouched. (plach : right now, it can be roughly tested by *not* filling the $form definition with elements for a given language)

So we can start thinking about the internal mechanism for Field API to support this.

I am working on yched's suggestions from #98, meanwhile here are some screenshots: the test has been made adding the French language and attaching two fields to the article content type: a translatable demofield and an untranslatable one, as you can see all values from all languages are displayed.

I adopted the approach of displaying all the available values during view/submit to have a consistent behavior: this way all the field_attach_X functions act on all the field translations. I think that with some minor adjustments (e.g. reorganize fieldsets/display the related language on field labels) this could be the basic official Field API UI. As I was saying in #97 other modules might easily hook in field_attach_form and field_attach_view and decide which values/form elements get actually displayed and how.

This way we could support with little effort the common use case of having one single language on object viewing/submitting and the less common, but probably desirable, case of having many languages during a single object view/submission. IMHO supporting both of them from the opposite scenario (Field API displays one single translation at a time by default) would be far harder to achieve.

Anyway I fully agree on discussing the translation UI elsewhere.

On #98/4: sorry for not commenting that part: I thought about a couple of ways for other modules to determine which languages are available for a given field instance: one might be somehow populating the $instance['languages'] setting, another one might be having a hook invocation right there. Not sure about this point.

About tests: before starting, should we fix the existing ones?

Having several languages on one form raises a technical issue : right now it generates incorrect HTML (duplicate ids) and thus breaks JS behaviors ('add more', possibly d-n-d reorder). This would require adding the language code into id strings, but it means field_default_form() needs to receive the language being treated. That's // TODO: Should we introduce a $language parameter?. Note that this language param will be needed by text_field_sanitize() as well, for the check_markup() call.

Supporting multiple language in a form might not be that easy (fieldgroups, ordering between translatable and non translatable fields...). Needs some UI thinking.
I think meanwhile we can start thinking about how we can limit form and view to a single language. The $options param for field_invoke() introduced in #392494-18: Fields API needs to allow searching of scalars might be a solution : $options['language'] = 'de' only iterates on 'de' values.

About $instance['languages'] : I'm not sure we need to support a list of languages per field. Going with the set of active languages should be fine in a first pass. If the need arises, we'll see...

About tests : Well, I guess for a start we could add a set of separate FieldTranslation tests, and leave the existing ones broken.

StatusFileSize
new43.59 KB
Failed: Failed to apply patch.
[ View ]

Implemented yched's suggestions from #98, added the $language parameter, removed $instance['languages'], and fixed field attach tests (found a couple of bugs in the previous patch thanks to this).

TODO:
* iron out the field view/submission multilingual vs monolingual issue
* write field translations tests

Status:Needs work» Needs review
StatusFileSize
new57.98 KB
Failed: 11156 passes, 117 fails, 192 exceptions
[ View ]

In the current patch I worked on tests: I wrote a couple of simpletest for the multilingual field logic and had all the non form-related tests fixed.
I left alone the form-related tests as we still have to iron out the field view/submission issue.

Still to be considered yched's suggestion below, as I need to discuss it in some detail before coding:

I think meanwhile we can start thinking about how we can limit form and view to a single language. The $options param for field_invoke() introduced in #392494-18: Fields API needs to allow searching of scalars might be a solution : $options['language'] = 'de' only iterates on 'de' values.

Reviews, especially from the i18n gurus, at this point are welcome.

The last submitted patch failed testing.

Status:Needs review» Needs work

A brief summary for people needing to get up-to-date: the issue is currently assigned to me (plach) as I subscribed to the to Field API Progress and mentoring initiative (http://drupal.org/community-initiatives/drupal-core/fields), yched agreed to be my mentor and is providing the main Field API-related design suggestions. The current patch implements the approach that came out from the latest IRC meeting (http://drupal.org/node/367595#comment-1412738), which has been attended by yched, sun, catch and me. The new approach radically differs from the previous one but did not subvert the design guidelines that inspired the latter (see points A, B, C, D from http://drupal.org/node/367595#comment-1364062).

We currently stand at this point: we have pretty functional load/save/delete/validate/sanitize behaviors and simpletests are in place (http://drupal.org/node/367595#comment-1574020), but we are completely stuck on view/submission. The new approach has the great advantage of not needing the concept of current language, as a matter of fact all the mentioned operations are currently performed on all the available field translations, view and submission are no exception (see the screenshots on http://drupal.org/node/367595#comment-1485130). What we have to decide right now is: do we want to keep this multilingual approach for field view/submission, at least when there is no translation module enabled having the responsibility to introduce the concept of current (UI? content?) language, or should we go for a monolingual approach right from the Field API core? Details are available on http://drupal.org/node/367595#comment-1471596.

An IRC meeting has been held today about the open issues reported in #105, the attendees were bjaspan, catch, nedjo, sun and me.

The main conclusions were:

  1. On unilingual sites (Locale module not enabled, or Locale module enabled but only one language enabled) the field view/submission behavior should be exactly the same as the current D7 HEAD: language is implied. For multilingual sites we should try to emulate the current Translation module behavior.
  2. Edit and display should be in a single language based on a parameter passed to field_attach_form/field_attach_view. We are confortable with the parameter solution as $language is not a property of the object: it indicates the part of the object to be rendered as a form or viewed.
  3. Contrib modules should be able to alter the form: i.e. same object, get the field form for another language to put it somewhere. This should be possible as field_attach_form fills in a $form array it receives as parameter and field_attach_view generates a structured content array and returns, so both of them should be re-callable provided that no hook alters the $object.
  4. We are not entirely clear about where the $language parameter should come from, in the standard workflow, but our initial inclination is a language code as part of the node/%node/edit path.
  5. There is consensus about the need of distinguishing between content language and UI language, but this is far out of the scope of this patch.

The next step after implementing 1 and 2 is defining minimum RTBC. Probably we'll have a patch that is only the framework in Field module, with actual translation workflow to follow in later patches, likely to Translation module.

Nice summary plach. This was a valuable meeting, particularly as it brought together Field API and multilingual expertise.

The idea is pretty similar to what we have now with node translation. Node module has the infrastructure needed for multilingual nodes (e.g., language, tnid, and translate fields) but doesn't on its own implement multilingual functionality. Similarly, field module would provide the minimal infrastructure needed for multilingual fields but not itself provide any UI or implementation.

So this initial patch, probably, should have no UI reflection. It should provide the minimal backend infrastructure needed for other modules to implement multilingual fields.

Then followup patches can put that multilingual support in place, likely in the translation module, and likely closely paralleling what we already have in the current "one node per translation" model.

I've started a wiki page, http://groups.drupal.org/node/22478, to begin to sketch out what that future work might include.

Sorry for missing the meetup, and thanks plach and nedjo for these summaries.

While I agree that this patch should not be held by UI issues, the following questions come to my mind :
- how can we define a notion of a 'committable state' for this backend patch ?
- how do we make sure that there is enough community momentum on the followup supporting patches so that we this initial commit doesn't end as a bunch of dead code ? I, for one, have my hands more than full with 'strictly Field API' stuff :-)

For what is worth I give my availability to work on the following patches, obviously I would be more than glad if other big ones joined us. Maybe we could try to get webchick involved, perhaps she is more clear about who might be interested/available to help us with the remaining work.

Edit: I think that when field_attach_view and field_attach_form and all the simpletests will be fixed we'll be ready for some serious review after which go RBTC.

Status:Needs work» Needs review
StatusFileSize
new99.54 KB
Failed: Failed to apply patch.
[ View ]

This one implements the design guidelines reported in #106 and yched's suggestion cited in #103.

I added the $language parameter to field_attach_view, field_attach_form and field_attach_preprocess.

The $language parameter defaults to the current UI language, for now I left untouched the Node module as it was unclear how to set it and its possibile fallback values, I think this should be addressed in the following patches. ATM the view/submission/preprocess language is implied (actually it's the current UI language).

All the field tests are fixed. I think we are ready for review.

Thanks plach !
No time for a proper review right now, couple remarks from a really quick look at the patch :

- could you expand on the need for $option['merge_results'] in _field_invoke() ?

- $return[$id][$language] = isset($return[$id][$language]) ? array_merge($return[$id][$language], $result) : $result; looks like unneeded precaution, since $return[$id][$language] is initialized a few lines earlier ?

StatusFileSize
new104.83 KB
Failed: Failed to apply patch.
[ View ]

- could you expand on the need for $option['merge_results'] in _field_invoke() ?

No real need ATM, in the first patch it seemed to me sensible having per-language separate results from _field_invoke, in this one the only functions using the result relied on a "merged" result, so I decided to keep the possibility to have separate results for the future but currently we don't need that option.

- $return[$id][$language] = isset($return[$id][$language]) ? array_merge($return[$id][$language], $result) : $result; looks like unneeded precaution, since $return[$id][$language] is initialized a few lines earlier ?

You're right, I fixed it in _field_invoke e _field_invoke_multiple.

If we have no actual use for $option['merge_results'] for now, I'd suggest we remove it and keep only the behavior we need.

StatusFileSize
new98.7 KB
Failed: Failed to apply patch.
[ View ]

Fine, removed the option and the per-language results of _field_invoke e _field_invoke_multiple.

Issue tags:+Fields in Core

Status:Needs review» Needs work

The last submitted patch failed testing.

Rerolled #114 but #372743: Body and teaser as fields broke everything :(

Implementing node body and teaser as translatable fields is a task too complex for me to accomplish alone. If at least Dries, Webchick, Gabor and Nedjo don't give their feedback on this one I am afraid I'll have to give up: I can't go on working without knowing if this is going to be supported by the community.

plach - it doesn't seem like body and teaser need to be translatable as part of this patch - can't that be left as a non-translatable field?

catch, AFAICT for this one to be committed all the node/blogapi/book tests need to be fixed and tests will work only when all the $node->body[0]['values'] become $node->body[$language][0]['values'] and who knows what else; and you know better than me what a pain that $language may be.

If I understand the current status well, at this point translatable fields involve relevant changes in the Drupal core thus I'd like to know if this issue is considered interesting and deserving further work before going on.

plach: It's true that the patch won't get the attention it needs if tests are broken and it can't stay in CNR state.

But the new node body field doesn't have to be translatable for now, so fixing tests mainly involves changing $node->body[0]['value'] to $node->body[FIELD_LANGUAGE_NEUTRAL][0]['value']. Tedious but possibly not as complex as it sounds. The latest patch in #372743: Body and teaser as fields can be used to identify places in .test files that were altered and thus need to be fixed again - not that much AAMOF.

Ahh, yes. That's a big reason I didn't want to change the object structure in the first place... let's try to get some 'early' committer feedback then.

yched: I hope you're right because I already tried to search/replace $node->body[0]['value'] to $node->body[FIELD_LANGUAGE_NEUTRAL][0]['value'] while rerolling the patch in #117 and nothing seemed to change in the test results...

To be as explicit as possible: fixing all those tests may involve lots of work and I won't be the one who will do that if I am not reasonably sure that the needed time is well spent.

Edit: please forget the sentence above, I had a couple of troubled days

plach : if you have done some of the replacements already, why don't you upload what you have so that we know where we stand and people can help ?

Status:Needs work» Needs review
StatusFileSize
new125.13 KB
Failed: 11157 passes, 1 fail, 0 exceptions
[ View ]

I forgot not to panic (and my towel too, probably...)

Here's a rereolled patch which uses FIELD_LANGUAGE_NEUTRAL where needed and should fix all the broken tests.
Let's see what the testbot has to say.

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new123.97 KB
Failed: 11151 passes, 26 fails, 228 exceptions
[ View ]

Oops, wrong patch. Let's try again...

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new124.63 KB
Failed: Failed to apply patch.
[ View ]

this one should work

plach: ;-)

DBTNG review ;)

+ db_delete($table_name)->condition('etid', $etid)->condition('entity_id', $id)->condition('language', $languages, 'IN')->execute();

I know it's already wrong, but you should place each method call on it's own line when there are 2+. See http://groups.drupal.org/node/17906#comment-77003

EDIT: Oh, and IN is the default now when passing arrays to condition(), so that isn't necessary anymore.

- $entity->{$this->field_name} = NULL;
+ $entity->{$this->field_name}[$language] = NULL;
field_attach_insert($entity_type, $entity);
$count = db_result(db_query("SELECT COUNT(*) FROM {{$this->table}}"));

db_result() should be replaced by fetchField(), but you don't change it, and I did it already in #491556: DBTNG the rest so I'd say you can leave it for now, that patch is already huge ;)

StatusFileSize
new124.36 KB
Failed: Failed to apply patch.
[ View ]

Here is a new patch implementing Berdir's suggestions.

@Berdir: any review is welcome ;)

@yched: :)

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new124.21 KB
Failed: Failed to apply patch.
[ View ]

rerolled, testbot?

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review

I think I forgot "please"

Er, I just got a reversed epiphany I wish I had several weeks ago :-(.
What are actual use cases for 'translatable fields' if objects "titles" (node title, term name) are not fields and are thus not translatable ?
Translatable user profiles, probably, since user names are language neutral. But how many sites have users willing or able to fill their profile in more than one language ?

Of all 'X as field' potential tasks, 'node title as field', 'term name as field' are probably the most problematic performance wise...

@yched: I had a discussion with catch on IRC: he believes there are possible workarounds to the non-field entity "titles" but obviously this needs a separate discussion.

He suggests we should talk with webchick and decide if this issue is core-worthy, if the patch gets in we'll address your concerns in a separate issue.

[My point] A review before the meeting with webchick would be precious :)

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new129.57 KB
Failed: Invalid PHP syntax in modules/taxonomy/taxonomy.test.
[ View ]

rerolled

Status:Needs review» Needs work

The last submitted patch failed testing.

StatusFileSize
new131.66 KB
Failed: Failed to apply patch.
[ View ]

Here is a brief description of the current (rerolled) patch: the aim of the patch is providing multilingual capabilities to the Field API. During an IRC meeting attended by bjaspan, catch, nedjo, sun and me it was decided that in this patch we would lay the basic groundwork, simply making Field API support multiple translations of the same field value, letting other modules, most likely Translation, implement the actual translation logic.

We decided to refactor the current entity field structure from:

<?php
  $object
->field_name[$delta][$field_type_key]
?>

to:

<?php
  $object
->field_name[$language][$delta][$field_type_key]
?>

This approach allow us to consistently handle both translatable fields and untranslatable ones, which get a dummy language code defined into FIELD_LANGUAGE_NEUTRAL, just as much as single fields share the $delta key with multiple ones.

This refactoring involved lots of changes in many places, thus increasing the patch size, but most of them are trivial: simply introducing the new language level into the various data structures.
The main relevant changes were to _field_invoke and _field_invoke_multiple (field.attach.inc): as per yched's suggestion, to ensure that each Field API hook/function is called on all the available field translations they now act on each one found in the passed entity data structure which has an allowed language code; the concept of allowed/available language is provided by _field_available_languages (field.attach.inc), which currently returns a list of the enabled UI languages.
This way we don't have to rely on any function call sequence for the whole thing to work, thus preserving Field's API nature. Moreover no language-specfic cache is needed.

The only exceptions to this otherwise full multilanguage handling are field_attach_view and field_attach_form, which use an additional $language parameter to narrow the result of the field hook invocations to the single language specified. With this choice the only ones needing to care about the change in the data structure are the field hook implementations, while the theme layer and the Form API data structures are language agnostic.

On the performance side this changes should be pretty unrelevant for unilingual sites, for which they should be fairly transparent, for multilingual sites we have an additional level of iteration whose entity is dependent on the number of languages enabled, not that much, probably.

Status:Needs work» Needs review

very nice patch summary. sounds reasonable to me.

Thanks for the summary, plach.

I did another pass of review a couple days ago, and didn't find the time to polish and post so far.
Code-related remarks below, I'll keep higher-level remarks in a separate comment.

- we tried to avoid this kind of stuff in the BAF patch:

-      $edit['body[0][value]'] = $this->randomName();
+      $edit['body[' . FIELD_LANGUAGE_NEUTRAL . '][0][value]'] = $this->randomName();

Should we use the actual value of FIELD_LANGUAGE_NEUTRAL ? (not sure about '_0', BTW. Can NULL, or emty string be used as an array key ?)

- detail: field_sql_storage_field_storage_load() - I'm not sure that part is actually needed:

+      if (!isset($delta_count[$row->entity_id][$field_name][$row->language])) {
+        $delta_count[$row->entity_id][$field_name][$row->language] = 0;
+      }

I think $delta_count[$row->entity_id][$field_name][$row->language]++ works even if $delta_count[$row->entity_id][$field_name][$row->language] is not defined (didn't check, though). Same thing in field_sql_storage_field_storage_query()

- field_sql_storage_field_storage_write() - I'm unsure about that part:

-    // Leave the field untouched if $object comes with no $field_name property.
-    // Empty the field if $object->$field_name is NULL or an empty array.
-
-    // Function property_exists() is slower, so we catch the more frequent cases
-    // where it's an empty array with the faster isset().
-    if (isset($object->$field_name) || property_exists($object, $field_name)) {
+    // Leave the field untouched if $object comes with an empty $field_name property.
+    // Empty the field translations if $object->$field_name[$language] is NULL or an empty array.
+    if (isset($object->$field_name) && is_array($object->$field_name) && count($object->$field_name) > 0) {

IMO the behavior we want is:
no $object->field_name : leave data untouched
$object->field_name = array() or = NULL: empty the field for all (enabled) languages
no $object->field_name[$lang]: leave data untouched for $lang
$object->field_name[$lang] = array() or = NULL: empty the field for $lang
Maybe the code does this, but I can't say for sure.

- _field_invoke() / _field_invoke_multiple(): I don't think we need $options['language*s*'] as an array right now. A single language should be enough.
Also, I'd suggest taking $suggested_languages out of _field_available_languages(), and let field_invoke_* handle $options['language'] itself.

- field_add_more_submit():

-      $form_state['field_item_count'][$field_name] = count($form_state['values'][$field_name]);
+      $form_state['field_item_count'][$field_name] = count(current($form_state['values'][$field_name]));

seems a bit hackish. I guess we'd need the 'add_js' button to have a #language entry so that field_add_more_submit() can retrieve $form_state['clicked_button']['#field_name'] ?

Higher level remarks:

- Can we summerize (er, again ?) how much the feature is tied to locale and/or translation modules ?
what happens with translatable fields before locale/translations is enabled ?
what happens when locale gets disabled / a node is edited / locale gets re-enabled ? any chance of data loss ?
what happens when a language gets disabled / a node is edited / the language gets re-enabled ? any chance of data loss ?
are there other scenarios we need to watch out for data loss ?

- Currently sql_storage_load returns values for all languages it finds in the db. If a language is actually disabled, the data still end up in $object, but escape field_invoke() treatments, which is unsafe. sql_storage_load should only return values for active languages. Similarly, sql_storage_write shouldn't touch values for disabled languages.
I think this should address a couple data loss cases above.

- We have no notion of "translation of the object exists for $lang", only "object exists" and "field_foo has (or has no) values for $lang"
For instance, we don't know which 'see in $lang' links we can provide.

- Probably related: a node is usually created in a single language at first. How do translatable fields appear in other languages: empty ? with the values of the default language ?

- Also possibly related: Gotchas with default values.
Default values are added on object *creation* (not update) - currently, for all enabled languages. Happens in field_default_insert()
Imagine: 'en'+'de' are enabled; field_foo has default value 'default'; node 1 is created in 'en' with field_foo = 'bar'; *then* langage 'fr' gets enabled:
a) field_foo is 'bar' in en, 'default' in de, empty in fr
b) the default value won't show up on the form to 'create' the fr translation.

- Also: do we need default values to be per language ? (ouch)
Some other field settings ('list of allowed values' for "list" fields ?) could be problematic as well, I guess...

Subscribe

This weekend an IRC meeting will be likely held with webchick, I'll work on yched's suggestions from #143 after that one.

About #144:

  • Locale/Translation dependency: none; _field_available_languages currently uses language_list (bootstrap.inc) to determine the allowed languages: with Locale disabled translatable fields get $language = 'en' and untranslatable ones get $language = FIELD_LANGUAGE_NEUTRAL. Enabling Locale would allow _field_available_languages to return all the enabled languages. Translation should implement the basic content translation interface.
  • Data loss: I think the desired behavior is already implemented, I'll check it explicitly and update field tests to take $language into account if necessary.
  • SQL storage and allowed languages: makes sense, I think this can be easily fixed by invoking _field_available_languages during field load.
  • Object translations and language fallback: both remarks make sense but I think they should and hopefully will be addressed in the following patches, along with the other issues presented in nedjo's TF implementation details wiki page.
  • Default values: these are problematic but should be addressed in this patch IMHO. I need the core committers feedback before going on. FWIW I think default values should be provided per language.

Agree with Moshe, great summary. I think this approach is sound and could support all kinds of translation approaches with core and contrib modules. It does not in itself provide any translation features whatsoever, just lays down the groundwork, but without the groundwork, we step back from Drupal 6's achievement of translatable nodes.

One thing not clear to me: how does this work with a multilingual site when you add taxonomy terms or users for example, for which I guess there will be no core UI to translate? Will all data be stored in _0 fields, or would we have some problems with data stored under different languages?

@Gábor Hojtsy: thanks for reviewing :)

I think we could refactor User and Taxonomy modules to default to FIELD_LANGUAGE_NEUTRAL or the default site language and provide the minimum hook infrastructure needed to allow contrib modules to properly support taxonomy term/user profile field translations.
Another possibility would be: fields are content, Translation provide the basic content translation UI not just for nodes but for any fieldable entity (still emulating the plain old translatable nodes system).

I think maybe one possibility to limitate UI issues and 'how do we handle the huge amount of features/issues we just added by making fields translatable' could be to state that only some *object types* are translatable - starting with nodes.
Thus, a field on an object can carry multilinguial values if (field is translatable && obj_type is translatable).

Note that all fields on all object types, translatable or not, would still come with the same structure :
$object->field_name[$lang || LANGUAGE_NEUTRAL] = (array of values).
That's important for DX and predictability of what you find in $object->field_name.

We could do that in hook_fieldable_info() (or hook_fieldable_info_alter()) quite happily I think, sounds like a good plan for now unless someone comes up with a magic UI which works for any entity.

I am not sure I understand the need of explicitly marking the entity as (un)translatable: if taxonomy, profile or whatever declare their fields as not translatable don't we get the same result?
If we added some hooks to let other modules change this setting and provide a translation logic we should be fine: with no contrib module taxonomy and profile behave just as field weren't translatable but, say, i18n might mark user fields as translatable and implement the missing UI.

re plach #151:

"if taxonomy, profile or whatever declare their fields as not translatable don't we get the same result ?"

Taxonomy just declares 'taxo terms' as 'fieldable', it doesn't control what fields get actually added to them. Once an entity is fieldable, anyone (Field UI, any contrib module...) can add any field type with any setting to any entity.
So currently taxonomy has no way to say 'Hey, you cannot add translatable fields to terms'.

"If we added some hooks to let other modules change this setting"

Already there: hook_fieldable_info() comes with hook_fieldable_info_alter().

How dumb of me! For whoknows which reason I thought that Taxonomy would declare its fields, Profile the same. End of the story.

Works for me. This is THE solution. I'll implement it as soon as we talk to webchick.

re #136 and the concern that lack of translatable titles means this whole patch is doomed to uselessness: We do not need to turn "titles into fields" to make titles into fields. :-) See http://drupal.org/node/82751#comment-876871. Brief summary: We have a contrib module that adds a new field named "title_field" or something that is translatable (and formattable) and hides the actual title widget on the edit form.

Agreed with #154. Additionally - my user might be called 'Barry Jaspan' in nine languages, but might need to have their biography translated. Some workarounds being necessary for translating title for nodes are nothing compared to the workaround that would be required trying to implement tnids for users.

Status:Needs review» Needs work

The last submitted patch failed testing.

[deleted content: wrong issue]

Status:Needs work» Needs review
StatusFileSize
new131.34 KB
Failed: Failed to apply patch.
[ View ]

I've worked on yched's suggestions from #143/#144:

  • About:

        -      $edit['body[0][value]'] = $this->randomName();
        +      $edit['body[' . FIELD_LANGUAGE_NEUTRAL . '][0][value]'] = $this->randomName();
       

    Using the actual value makes defining the constant pretty useless IMHO, I'd prefer to keep using the constant if there is no strong need to use its value.

    About the actual value using NULL or '' causes element_children to throw warnings, moreover I think that using a 2 characters string is more consistent. Obviously _0 might be replaced by any other 2 characer string not already taken by language codes. Suggestions?

  • Removing the delta counter initialization makes field_sql_storage_field_storage_load throw notices on line 248 e 259.
  • I modified field_sql_storage_field_storage_write() and its tests to implement the described behavior. I also modified field_sql_storage_field_storage_load() to load only available languages. This should guarantee that no data loss is possible if a language or even Locale gets disabled.
  • $options['languages'] has become $options['language'], but taking out suggested languages from _field_available_languages is problematic because we have to test if the field is translatable and decide if it should use the suggested languages or FIELD_LANGUAGE_NEUTRAL. IMO this logic should be kept confined into _field_available_languages.
  • 'add_js' button's #language fixed.

The only open issues should be:

  • per-language field/instance settings (default values, allowed values, ...), on this one I need some discussion.
  • translatable entities switch

Marking CNR to see what the testbot has to say.

StatusFileSize
new137.64 KB
Failed: 11602 passes, 1 fail, 1 exception
[ View ]

In this one I implemented the translatable fieldable info property which defaults to FALSE. Now every entity wishing to allow its translatable fields to be treated as multilingual has to explicitly set its translatable property to TRUE or rely to a hook_fieldable_info_alter implementation, as nodes are likely to do.

I also added some missing PHP docs. If we fix the default values issues I think we are done :)

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new134.57 KB
Failed: 11647 passes, 1 fail, 1 exception
[ View ]

Straight re-roll.

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review

Here were some stuff that stood out to me on an initial pass w/ yched and sun on IRC. Some notes:

- I haven't read the issue yet: I'm only about pass 2-3 on http://webchick.net/6-pass-patch-reviews. If I'm asking/pointing out stuff that's already been discussed, please point me to the
- I spent about an hour and only have a couple of things to show for it, but I think I have the general gist of what's going on. I'll need another pass to read the issue summaries more thoroughly and then will probably have another list of things.
- Thanks a lot for your perseverance on this patch!!

/**
+ * The language code assigned to untranslatable fields.
+ */
+define('FIELD_LANGUAGE_NEUTRAL', '_0');

A few questions...

1. The PHPDoc says that this is "untranslatable" fields, but yet the variable is called "language neutral." Which is it?
2. Is "language neutral" a well-understood term in the i18n world? Would "FIELD_LANGUAGE_NONE" be better? (sun's suggestion) Then it would make sense to module developers who don't know anything about i18n.
3. '_0'? What the heck? :) Can we not just use something like NULL? Or '' as the $languages dropdown does in node.admin.inc?

Either sun or yched (I forget which) pointed out that drupal_render() raises notices with with [''] as an array index. BUt I think fixing that in another issue would be preferable to doing this sort of hack here.

-      $edit['body[0][value]'] = '!SimpleTest test body! ' . $this->randomName(32) . ' ' . $this->randomName(32);
+      $edit['body[' . FIELD_LANGUAGE_NEUTRAL . '][0][value]'] = '!SimpleTest test body! ' . $this->randomName(32) . ' ' . $this->randomName(32);

I'm concerned about the impact for themers with this change. Luckily we got render($content['foo']) in, because without it, it would be impossible to print out a node field like you could in D6.

In D6, you would print_r($node) and get something like:

$foo_value = $node->field_foo[0]['value'];

I don't see any possible way to get that in the current system, because the location of the language code is *before* the delta and index. For translatable fields, yched points out that the $langcode variable could be used:

$foo_value = $node->field_foo[$langcode][0]['value'];

But this will not work for non-translatable fields. :\

yched is going to ponder some about how we could structure the renderable array so that themers would not be out of luck.

Also, knowing that language-neutral means non-translatable.. why is the body field not translatable?

+    $count = db_result(db_query("SELECT COUNT(*) FROM {{$this->table}}"));

We try not no longer do COUNT(*) as this is tremendously slow on InnoDB, which is the default storage engine. See #196862: COUNT(*) is an expensive query in InnoDB.. Will counting an explicit column work by chance? Oh. Actually, I see that this appears elsewhere in the tests. Let's file a follow-up to fix this then.

+ * Implementation of hook_field_languages
+ * Implementation of hook_fieldable_info_alter.

Should be like: "Implement hook_field_languages()." (Implement ... end with () and .) Weird, but true.

Doing count(*) inside a test isn't an issue IMO, it's only slow in the sense of "don't want to do it on every page load" slow.

On the node structure issue - early on I'd suggested something like $node->field_translations[$lang][$field_name][$delta] to hold all the field values in all languages, and leaving the $node->$field_name structure the same - just containing the current language or the language neutral version.

But there's two issues with that - yched rightly pointed out that that sort of structure doesn't allow field modules to run properly on translations - for example we want hook_field_load() to run on each translation so everything that's needed is there before we actually display it. We also want translation module or similar to do the work of displaying the most appropriate value of the current language (fallbacks etc.), not have any of that work done in Field API itself.

Between field theme functions, field templates and render() it seems like it shouldn't be necessary for themers to use values directly like that any more, but I don't know how realistic that is.

StatusFileSize
new137.11 KB
Failed: Failed to apply patch.
[ View ]

@webchick:

Thanks a lot for your perseverance on this patch!!

Despite the amazing work going on, without this one D7 simply won't fit my needs :)

  1. About FIELD_LANGUAGE_NEUTRAL: I adopted language neutral as it is the current definition of "no language" in the Translation module; an untranslatable field is content with no language so FIELD_LANGUAGE_NONE works for me.
    About its value, as I was saying in #158, the main reason I used a non empty one is consistency: built-in language codes are 2 character strings. Moreover a NULL/empty value might be harder to use with equality/empty()/isset() operators than an actual value.
    Apart from this I totally agree that '_0' is ugly and we can find something better. I used the underscore as I wanted to be sure that no clash with built-in language codes was possible, for the 0, well, you get it :) Suggestions are welcome.
  2. About $foo_value = $node->field_foo[0]['value']: a possible approach to this problem is providing the current field language code as a template variable; this way no matter the field being translatable or not we get:
    $foo_value = $object->field_foo[$field_language][0]['safe']
  3. About the node body being untranslatable (see #117 and following for details) we decided to address the "Node body and teaser as translatable fields" issue in a separate patch as it needs a better understanding of the translation workflow (we have to find out how to populate the $language key).
  4. About COUNT(*)s: thanks for linking the other issue but, I agree with catch, in this case we are talking about tests on two-rows tables, if we decide counts are evil then we'll need to polish all the tests...

The attached patch introduces FIELD_LANGUAGE_NONE, adds the $field_language template variable, fixes PHP docs and adds a simpletest for a yet untested behavior: multilingual fields storage/retrieval.

It also ships a little change in _field_valid_language: now it's possible to choose if empty/invalid language codes are replaced with the default language or the current one, this ensures a more predicable behavior for field_attach_form while allowing field_attach_view calls not to explictly specify the $language parameter.

Thanks for the fast update, plach:

- FIELD_LANGUAGE_NONE / '_0': on IRC we agreed that '' (empty string) would make the most sense, if possible. If it *works* (but just produces notices in drupal_render), we might want to go for it and fix drupal_render in a separate patch. I'm not sure I get your concerns about equality/empty()/isset() operators.

- About templates: AFAICT, you added a variable in the field-templates, right ? We probably need it as well, but the main issue is in node templates. There are some cases (less often in D7 than in D6 though) where themers need to access a field's raw values. Right now they need to know the language for each field, to access $node->field_foo[$lang][0]['value'].
My first guess is they need to pick between FIELD_LANGUAGE_NONE and 'the language asked for the display'. But (stop me if I'm on crack) this might become a little more complicated if complex language fallback rules are applied:
when node 1 is displayed in 'fr', field_a values displayed will be FIELD_LANGUAGE_NONE, field_b values displayed will be 'fr', field_c values displayed will be 'en' because there are no 'fr' values...

Even if it's just 'either FIELD_LANGUAGE_NONE or the requested language', it seems that we need have some variables for 'the raw values for field_a in the language being used in the template'

Curently in node templates, the $field_a variable is strictly equal to $node->field_a (raw values for all languages). Maybe we could re-purpose $field_a to be the 'values for currently displayed language only' ?

- "It also ships a little change in _field_valid_language: now it's possible to choose if empty/invalid language codes are replaced with the default language or the current one, this ensures a more predicable behavior for field_attach_form while allowing field_attach_view calls not to explictly specify the $language parameter."
Sounds like it makes sense, but it's a little complex :-) Could you expand the PHPdoc for _field_valid_language() a bit, and add inline comments explaining why field_attach_form() uses one form and field_attach_view uses the other ?

ISO639-2 has reserved the zxx code for No linguistic content / Not applicable.

StatusFileSize
new138.66 KB
Failed: Failed to apply patch.
[ View ]

@yched:

FIELD_LANGUAGE_NONE: Damien's suggestion suits perfectly my design, using it would allow us to avoid the tricky situations I mentioned above, e.g.:

<?php
function foo($language = NULL) {
 
// what if FIELD_LANGUAGE_NONE == '' ?
 
if (empty($language)) {
   
$language = 'en';
  }
 
// ...
}
foo(FIELD_LANGUAGE_NONE);
?>

I reintroduced field_attach_preprocess which makes available in $variables the field values as currently displayed.
I also fixed the PHPdocs.

I wasn't aware of the ISO639-2 convention, thanks Damien. I fear that 'xzz' will raise massive "WTF is this ?" when people start seeing this in their nodes (all the more than most fields won't be translatable...), but I'll wait to hear what others think.

field_attach_preprocess() looks fine, minor points:
- The PHPdoc should just make it clear that the function adds to the $variables argument by reference. See how we documented that in the PHPdoc for field_attach_load(). It should also explicitely say what variables we add and what they contain.
- We should document those variables in node.tpl.php and user.tpl.php (also recommending the use of $field_name vs $node->field_name)

StatusFileSize
new141.7 KB
Failed: Failed to apply patch.
[ View ]

About 'zxx', IMHO it's the best solution, simply people will have to learn one simple thing among the others. But I won't insist on this point, if you think we should go with '' I'll do it.

Now PHPdocs should be ok, except for possible syntax errors :)
I also added a simpletest for field_attach_preprocess and the variable $field_translatable to field.tpl.php.

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new146.33 KB
Failed: Failed to apply patch.
[ View ]

rerolled

We agreed with yched to leave 'localized settings' (allowed values, default values) for a later patch. So the last pending issue (apart from those raised by nitpick reviews) should be the "default values on translation creation" one.

Currently if field_attach_insert is invoked with $object->$field_name unset and the field instance defines a default value function callback a default value it's provided. This cannot happen with translations unless we add a similar functionality in field_attach_update. My (modest) proposal is to leave the current behavior untouched: I think it's acceptable that a language added later has empty values, unless the field values are required, but this introduces the need of some field-level validation which AFAICT it's not properly implemented yet.

Default values are strictly connected to language fallback rules for what concerns translation, so I'd suggest that it's Translation responsibility to provide the appropriate values in case of missing ones: it might call $instance['default_value_function'] as field_default_insert does and even take $language into account. Perhaps I'd just modify field_default_insert to handle missing translations and not only the entire property.

Comments?

StatusFileSize
new144.21 KB
Failed: Failed to apply patch.
[ View ]

Wednesday an IRC meeting as been held, the partecipants were sun, yched and me.
Initially we tried to find a way to provide default values when inserting a translation rather than on entity creation, but we didn't reach an agreement on the approach to follow; however we agreed on the following requirements:

  1. we need to insert default values on a language when this language is being open;
  2. when we handle field translations we need a table to track entity translations as a whole;
  3. we might need not to burden Field API with unneeded translation complexity, as most D7 sites will probably be unilingual, hence the entity translation code should be somehow optional.

At first we agreed on my statement it's acceptable that a language added later has empty values, where they are not explictly required.
We also agreed that field_default_insert should consider if a particular field translation is defined before assigning it a default value, thus when creating a new entity only actually "empty-populated" field translations get a default value. This should ensure a consistent behavior between standard workflow and free API calls.

Then yched pointed out that having default values on translation creation rather than on entity creation involves having a way to track entity translations and their status, i.e. more or less what I described in #61. AFAICT sun agreed with him.
This table would allow Field API to know if any value for a given entity has already been entered for a particular language, thus letting it call the appropriate default providers in field_attach_presave. If Field API handles that table, it will handle it for every entity that wants to be translatable, core or contrib.

My approach is slightly different: I agree that tracking the information yched refers to involves the use of a table like the one cited above, what I don't agree on is the assumption that entity translation is the only kind of translation we might need.

I don't think that Field API should natively implement the concept of entity translation, i.e. a collection of field values sharing a common language, because there are other use cases in which we might need fields with different languages showed during the same page view, and I am not thinking of fallback rules. Rather think of a couple of "Citation" fields which hold two citations in two different fixed languages, say "Latin" and "Greek", among other content fields being translatable in the common sense; if we were able to explicitly associate a language to the citation fields we could also mark them up semantically with an xml:lang attribute, otherwise we would have to demote them to untranslatable status, i.e. fields with no language associated, and we would have no way to provide the semantic information language represents.
Another possible alternative translation mechanism might be one implemented on a multicurrency multilingual site: think of a site showing different prices for different locales where prices are stored as different values, not converted from a source currency; the content language might be totally unrelated to the currency locales.

I propose a solution which emulates the current Locale/Translation modules relationship:

  • if we have Locale and Translation disabled the default core behavior is to treat all fields as language neutral (pretty much as now);
  • if we enable Locale, we gain the ability to associate a language to a field value but we are not allowed to associate more than one language value to the field, because this would involve a relationship between the two values that currently Locale, with respect to nodes, does not implement; so no matter fields being translatable or not, every entity gets only one single value for each field, the difference is only in the language associated to it; at this level we would have no relationship enforced between the various field languages, call them esperanto nodes :) however the core UI would allow to simply associate a common language to all the node fields; default values are inserted on entity creation, just as now, for one single language per field;
  • if we enable also Translation, we introduce the concept of translation as intended by yched: the relationship between two different language values belonging to the same field is clearly defined and guaranteed by Translation, which populates and maintains the entity translation table we described above; the core UI allow to create different node (entity?) translations each one having all of its fields sharing a common language; default values are inserted on entity creation, just as now, for one single language per field, while on translation creation empty values are left alone and during entity view language fallback rules are applied, one of them might be to shift in per language default values.

If we go this way we would make Field API lighter and we would allow different, pluggable, translation systems to hook in, each one having the ability/responsibilty to choose wich entity and/or field handle; this could be easily done by invoking a hook which would collect entities interested in being handled by a certain translation system.
And, if we decide to go this way, to close this patch we just need to slightly modify field_default_insert to take into account field translations, as we already agreed to do (the attached patch does exactly this), everything else would be delegated to the following patches.

Any feedback is greatly appreciated, however I am working on a proof-of-concept patch.

Status:Needs review» Needs work

The last submitted patch failed testing.

StatusFileSize
new143.47 KB
Failed: Failed to apply patch.
[ View ]

rerolled

Status:Needs work» Needs review
StatusFileSize
new143.47 KB
Failed: Failed to apply patch.
[ View ]

As part of the proof-of-concept patch work I created this side-issue: #533924: Allow different language types to be enabled separately

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
Issue tags:+translatable fields
StatusFileSize
new146.3 KB
Failed: Failed to apply patch.
[ View ]

rerolled again

Status:Needs review» Needs work

The last submitted patch failed testing.

re #174: I don't really know what to think about the 'alternate uses of value languages'.
The 'display price in different currencies' example is moot IMO, because there's no direct and simple relation between languages and currencies.
The other example ('Citation field; this citation is latin, this citation is greek') is a little mind bending, but I guess there are use cases.
Well, this is entering territories that make my head spin. Not to say that the idea that object 'translations' are just one way to handle value languages is bad, just that I'm probably not the right guy to have an opinion about this.

Just to make sure I understand your proposal:
- no Locale.module, no Translation.module: all field values are language-neutral
- Locale.module without Translation.module: translatable entities (nodes...) get *one* language ($node->language or something), pickable by some UI on the node create form (not sure whether this language can be changed afterwards). This is the language assigned to all values for translatable fields in the object. Nodes that existed before Locale was enabled remain 'language neutral', and so do their field values.
- Locale.module + Translation.module: you can edit a node in several languages. Default values are inserted for a given language when the node is first submitted with values in this language. Translation.module maintains a table of 'existing languages' for each entity. It has to work for *any* entity type that wants to be 'translatable' (including contrib entity types), not just nodes.

About the patch itself: please add a comment about the new logic in field_default_insert(). I'm probably tired, but right now I really can't wrap my head around it, even though I very possibly asked for that change ;-) [edit: I mean a comment even more detailed that the one you already added :-p. Something like 'what's the underlying logic, why is it the right way']

Status:Needs work» Needs review
StatusFileSize
new185.34 KB
new143.57 KB

@yched:

Not to say that the idea that object 'translations' are just one way to handle value languages is bad, just that I'm probably not the right guy to have an opinion about this.

Yes, we definitely need the i18n folks' feedback on this.

Just to make sure I understand your proposal: [...]

More or less that's it, you can find attached my proof of concept work implementing #174. The code is organized in three patches depending on one another (a single big patch is also attached for reviewers' convenience):

  1. #533924: Allow different language types to be enabled separately: implements the distinction between UI language and content language.
  2. The current TF patch slightly reworked to better integrate with the others plus some refactoring work, mainly moving most of the translatable fields code in field.multilingual.inc and renaming the function names accordingly.
  3. #539110: TF #4: Translatable fields UI: Locale gets a brand new locale.field.inc file which extends the current multilingual node approach to fields, basically when creating/editing a node all the translatable fields get the node language; any node language change is correctly handled and reflected on the single fields.
    I didn't touch Translation, instead I created a new Field Translation module which is meant to substitute it; however they could also live together, as nedjo was pointing out in his g.d.o writeup. Field Translation implements almost the same functionality of Translation but without creating a node for each translation, translatable fields are used instead; it relies on a separate module called Entity Translation which exposes an API which allows to track translations for any entity.
    From the UI point of view they are slightly different as Field Translations does not use a node form for translations, but it simply exploits field_attach_form to let users enter just the field values; however there are a couple of translation-specific options "inherited" from the node form.
    To test everything (please, do!) just enable Field Translation, multilingual support for the article content type, install the Demofield module and start translating content.

The main issues left out, as they involve deep i18n expertise, are:

  • Content language negotiation and fallback (this reflects also on default values for entity translations), currently a very raw form is introduced through query string parameters, but we should get rid of it ASAP.
  • Body and teaser as translatable fields.
  • Title as a translatable field, I think we need a workaround in core for this.
  • Per language field settings.

Again your feedback is greatly appreciated, especially if you are i18n experts :)

OK I read through the latest patch, and generally it looks great to me. I spoke with plach and had a couple of issues, mainly that #533924: Allow different language types to be enabled separately shouldn't be a dependency - since it's a change with a lot of implications by itself, and it's more about language negotiation than anything else, which already causes problems with our current translation model and doesn't really stop this from being committed.

While the patch is 143k, I'd estimate only around 30-40k is field API changes, the rest is fixing up tests and adding the language key for field values

StatusFileSize
new42.21 KB
new140.24 KB
Failed: 11891 passes, 5 fails, 58 exceptions
[ View ]

Thanks catch! I rerolled the patch to remove the dependencies from #533924: Allow different language types to be enabled separately. I also attached a "light" version of the patch with all the test fixes plus trivial language key addings stripped out to ease reviewing.

I prepared a demo site with the proof-of-concept patch installed, just contact me on IRC or by email and I'll give you the access credentials. However in #539110-7: TF #4: Translatable fields UI I posted some screenshots.

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new143.67 KB
Failed: Failed to apply patch.
[ View ]

I missed the "taxonomy term field" commit. The no-test version of the patch is the same of #185.

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new45.33 KB
new147.27 KB
Failed: 12291 passes, 9 fails, 2 exceptions
[ View ]

Reroll + some phpdoc improvement.

@yched: It seems we are getting good reviews in #539110: TF #4: Translatable fields UI :)

I've read up on #174 above on (repeated) requests and here is my feedback:

I'd say that implementing field language support and translation on separate levels is a good idea. Just like locale does, if you enable language support, you might not want to also have translations (think a simple multilanguage blog, where you write different posts in different languages, but not translate them; or a more complex site where you have subsections for different languages, but they are independently edited, not related).

On designated fields with a language while the whole node or some of the fields do not have a language (the Greek quote example), I think the language support in itself would ideally cover that. You should be able to set fields language neutral while others might have a language in your content type.This applies to simple product types, where a price might be language neutral (not your product example). In your product example, I think using "price translation" to display different prices for different languages might easily not work, since prices are usually different per region, country or something else, and not language. I don't think we should support that.

I'm not sure I'm totally up to speed on the discussion, so let's see if I was useful at all :)

@Gábor Hojtsy

I've read up on #174 above on (repeated) requests and here is my feedback:

Thanks for finding some time, Gábor! I am really really sorry for having to be so annoying, I tried to avoid bugging you until it was avoidable, but now I feel we are almost done and for me it would be really frustrating seeing months of work trashed just because I left anything untried.
That said, I'd like to thank again you and all the people giving their feedback here, especially yched, catch, sun and nedjo for having subverted their priorities to allow me to work on this one :)

I'd say that implementing field language support and translation on separate levels is a good idea.
[...]
I don't think we should support that.

It seems we are agreeing on the approach but not on the use cases, better that than the opposite :)

I'm not sure I'm totally up to speed on the discussion, so let's see if I was useful at all :)

yched will tell us ;)

I've read the patch through for the first time in several months.

Most of this patch comes down to a very simple change: adding a language code key to the field array structure. I didn't completely follow the various iterations and back and forths on array structure, but the result here is straightforward and field module authors should find the changes easy to grok and work with.

Together with the accompanying patches, I'm convinced this is the right direction and a major step forward for multilingual support in Drupal. This would be the single biggest multilingual improvement in D7.

I read the full patch and it seems to be in very good shape. Here are the issues I found. Most are trivial, and the rest should be easy to fix.

<?php
- *         '#delta' => 0,
+ *      
'#language' => $language,
+ *      
$language => array(
  *        
'#field_name' => the name of the field,
?>

In this documentation of an example field form, we should document what the $language variable means. Maybe:

// The language code of the object being edited. We key the form array by language code.

<?php
 
*   The object with fields to render.
  * @
param $build_mode
 
*   Build mode, e.g. 'full', 'teaser'...
+ * @
param $language
+ *   The language the field values are to be shown in, if no language is
+ *   provided the current language is used.
?>

Should be two sentences:

<?php
+ *   The language the field values are to be shown in. If no language is
+ *   provided, the current language is used.
?>

<?php
+function field_attach_view($obj_type, $object, $build_mode = 'full', $language = NULL) {
// If no language is provided use the current UI language.
$options = array('language' => field_multilingual_valid_language($language, FALSE));
+
  
// Let field modules sanitize their data for output.
_field_invoke('sanitize', $obj_type, $object);
_field_invoke('sanitize', $obj_type, $object, $null, $null, $options);
?>

Is sending an uninitialized $null variable as an argument what we do elsewhere when we need to pass a null variable by reference?

<?php
+ * - translatable (integer)
+ *    
Whether the field is translatable
?>

Should have a period:

+ * Whether the field is translatable.

<?php
+    $edit = array($this->field_name . '[' . $language . '][0][value]' => $value);
?>

'[' . $language . '][0][value]' in many places in the tests is a bit hard to read and might be better with double quotes:

"[$language][0][value]"

<?php
+ * Unit test class for the multilanguage fields logic.
+ *
+ *
The following tests will check the multilanguage logic of
+ * _field_invoke and that only the correct values are returned
+ * by field_multilingual_available_languages.
?>

field_multilingual_available_languages should be field_multilingual_available_languages().

<?php
+    return array(
+     
'name'  => t('Field translations tests'),
+     
'description'  => t("Test multilanguage fields logic."),
?>

"Test multilanguage fields logic." might as well have single quotes.

<?php
// We also check that the current field translation is actually defined before
// assigning it a default value, this way we ensure that only the intended languages
// get a default value, otherwise we could have default values for not yet open
// languages.
?>

This should be full sentences:

<?php
 
// We also check that the current field translation is actually defined before
  // assigning it a default value. This way we ensure that only the intended languages
  // get a default value. Otherwise we could have default values for not yet open
  // languages.
?>

<?php
-function field_default_prepare_translation($obj_type, $object, $field, $instance, &$items) {
+function
field_default_prepare_translation($obj_type, $object, $field, $instance, $language, &$items) {
  
$addition = array();
   if (isset(
$object->translation_source->$field['field_name'])) {
    
$addition[$field['field_name']] = $object->translation_source->$field['field_name'];
?>

Do we need to add $language to the function signature if it is not used in the function?

<?php
// Let other modules make changes to the form.
+  foreach (module_implements('field_attach_preprocess') as $module) {
+   
$function = $module . '_field_attach_preprocess';
+   
$function($obj_type, $object, $element, $variables);
+  }
?>

This new hook needs to be documented in field.api.php.

Since what we appear to be doing is allowing modules to alter the element, I wonder if this could/should this instead be a drupal_alter() style hook.

<?php
--- modules/blogapi/blogapi.module    5 Jul 2009 18:00:07 -0000    1.158
+++ modules/blogapi/blogapi.module    12 Aug 2009 00:51:38 -0000
@@ -213,7 +213,7 @@
   }
   else {
    
$edit['title'] = blogapi_blogger_title($content);
-   
$edit['body'][0]['value'] = $content;
+   
$edit['body'][FIELD_LANGUAGE_NONE][0]['value'] = $content;
?>

In various places we have hard-coded the language code for the body field. Could we instead load this from the field definition? Doing so would make it easier in future to change this to a translatable field.

On second thought, though, I guess this wouldn't help, since, if the field is translatable, what we want is not the default value but the language code for the specific node. I guess any change will need to wait until body is translatable.

<?php
+ * Field variables: for each field instance attached to the node a corresponding
+ * variable is defined, e.g. $node->body becomes $body. When needing to access
+ * the field raw values developers/themers are strongly encouraged to use these
+ * variables, otherwise they will have to explicitly specify the desired field
+ * language, e.g. $node->body['en'], thus overriding any language negotiation
+ * rule previously applied.
?>

Good explanation. One of the sentences is a run on. Here's a fixed version:

<?php
* Field variables: for each field instance attached to the node, a corresponding
* variable is defined, e.g. $node->body becomes $body. When needing to access
* the field raw values, developers/themers are strongly encouraged to use these
* variables. Otherwise they will have to explicitly specify the desired field
* language, e.g. $node->body['en'], thus overriding any language negotiation
* rule previously applied.
?>

<?php
+    $this->assertFalse(array_key_exists($unavailable_language, $entity->{$this->field_name}), 'Field translation in an unavailable language ignored');
?>

Many existing core messages lack punctuation, but I think this (and various others introduced in the patch) should have a period: 'Field translation in an unavailable language ignored.'

<?php
+    $addition[$field['field_name']] = array(
+     
'#tree' => TRUE,
+     
'#weight' => $form_element['#weight'],
+     
'#language' => $language,
+     
$language => $form_element,
+    );
?>

It isn't self-explanatory what we're doing here. Suggested comment:

// Add the field form element as a child keyed by language code.

<?php
// Currently node language neutral code is an empty string.
// TODO: we might want to unify the language codes.
+  if ($language === '') {
+    return
FIELD_LANGUAGE_NONE;
+  }
?>

I don't understand this comment. Where might an empty string come from?

A final comment on variable naming. This patch generally uses $language as the variable name for the language code of a field's language. Elsewhere in core $language is often used for a language object while $langcode (or, occasionally, $lang) is used for a language code. The use of $langcode helps prevent unintended conflicts with the global $language variable (an object). Should we consider using $langcode here?

Again, on my reading only minor tweaks are needed for this patch to be ready. Thanks Francesco for your dedicated and innovative work on. I think we're in the home stretch.

StatusFileSize
new164.93 KB
Failed: 12259 passes, 9 fails, 2 exceptions
[ View ]

Apart from the following all the suggestions from nedjo's #192 are implemented.

Is sending an uninitialized $null variable as an argument what we do elsewhere when we need to pass a null variable by reference?

I've asked on #drupal-dev and it seems it's an accepted way, I used that form because I had seen it somewhere in Drupal, however I added an explicit initialization.

Do we need to add $language to the function signature if it is not used in the function?

Yes, it's a hook implementation.

I guess any change will need to wait until body is translatable.

Yes, this is what we decided in #117 and following.

We have an ok from the i18n point of view, we need a final ok from the Field API side.
yched?

(and a green light from the test bot, sure :)

StatusFileSize
new166.36 KB
Failed: 12261 passes, 9 fails, 2 exceptions
[ View ]

I forgot to document the new $langcode parameter in field.api.php. Code is the same as #193.

Status:Needs review» Reviewed & tested by the community

The patch is ready for me on the Field API side.

The change is fairly intrusive, though (field values move from $object->field_name to $object->field_name[$langcode] for *every* field).
So the decision to actually commit it IMO depends on whether we are confident that #539110: TF #4: Translatable fields UI has enough momentum to ensure a stable and sound 'content translation' feature for D7. It seems to be getting positive feedback from the right folks so far, and plach shows great dedication and turnaround times. Disclaimer: I myself lack time and i18n wits to take a decisive part in that other thread or assess its viability.

Once again, kudos to the *amazing* work and energy plach is delivering on these threads.

subscribe

Although I'd hate not seeing this committed I perfectly agree with yched. As one might expect #539110: TF #4: Translatable fields UI is showing all the strength but also the weak points of the current approach. I don't know if we should wait until that's rtbc to commit this one but surely we should have a good answer for all the open issues.

@yched: thanks again for finding the time for this, despite your amazing work on the "pure" field api (and ui :)!

@platch, "surely we should have a good answer for all the open issues" in #539110: TF #4: Translatable fields UI

Well, only if those issues suggest problems here. #539110 is fleshing out the UI and data storage for field-based translation, including what could be viewed as limitations, e.g., it doesn't handle non-field data. But so far as I can see none of the issues there suggest limitations or errors in this patch.

My view is that translatable fields can go in even if for we don't get #539110 for D7. Yes, a full solution in core would be ideal. But if necessary a field translation module could live in contrib for D7. What can't be done in contrib is the architectural work of this patch. And for that, #539110 has IMO only confirmed the approach taken here.

One question I'm not clear on is the upgrade path. Does this patch need an update for existing fields?

The recent changes look great and I'm happy with the explanations in #193.

There are a couple of places where a tab has crept in instead of a space, e.g. in testPagePreview():

<?php
+      $language = FIELD_LANGUAGE_NONE;
?>

subscribe

Status:Reviewed & tested by the community» Needs work

The last submitted patch failed testing.

subscribe

Sorry, I got sidetracked into some other stuff. My plan is to help roll this in today.

@plach: Tests seem to fail for the last patch in the meantime. If you could re-roll during the day, that would be awesome! :) I'll most probably start work this evening, ~19:00 UTC.

Status:Needs work» Needs review
StatusFileSize
new166.81 KB
Failed: Failed to apply patch.
[ View ]

Rerolled patch, I fixed the tabs and the tests. Feel free to set RTBC again once the testbot gives its ok.

@nedjo:

One question I'm not clear on is the upgrade path. Does this patch need an update for existing fields?

IMO, this should be part of #366364: [meta] Data migration from D6 contrib CCK fields to D7 Field API. Probably all the old cck fields should be converted to untranslatable fields.

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new167.7 KB
Failed: 12101 passes, 3 fails, 0 exceptions
[ View ]

#516138: Move CCK HEAD in core went in. I did an almost straight reroll, I had to manually merge field_default_insert and add a $langcode parameter to field_get_default_value. No time to test locally. Let's see if the testbot is happy. In that case we can go RTBC again.

Status:Needs review» Needs work

The last submitted patch failed testing.

3 fails in "Enable/disable modules"

Status:Needs work» Needs review

I've sometimes seen "Enable/disable modules" tests act a little edgy. Requiring retest.

Indeed, something's effed with testing bot atm. I've shut it off.

StatusFileSize
new167.62 KB
Failed: Failed to install HEAD.
[ View ]

Very nice.

Cleaned up some PHPDoc and variable names.

Status:Needs review» Reviewed & tested by the community
StatusFileSize
new73.89 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-tf.essentials-minus-tests-d6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
new122.28 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-tf.essentials-d6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
new167.37 KB
Failed: Failed to install HEAD.
[ View ]

Some further mini tweaks.

I'm also attaching two convenience patches, since more than 50% of those 168 KB are changes in other modules and tests to account for the new field array structure:

- drupal-tf.essentials-d6.patch: All essentials, including relevant test files.

- drupal-tf.essentials-minus-tests-d6.patch: All essentials, only including the really relevant changes in field tests.

StatusFileSize
new167.4 KB
Failed: Failed to apply patch.
[ View ]

Those convenience patches were really helpful. Found some more coding-style issues and reworded some PHPDoc.

Status:Reviewed & tested by the community» Needs work

The last submitted patch failed testing.

Status:Needs work» Reviewed & tested by the community

Oh, testing bot...

StatusFileSize
new78.7 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal-tf.212.essentials-DREDITOR-d6.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
new167.29 KB
Failed: Failed to apply patch.
[ View ]

Tests pass!

Re-rolled for removed $check argument to check_markup().

Adding another convenience patch for final review.

Status:Reviewed & tested by the community» Needs work

Sorry, but I wasn't able to test this because the field configuration page is broken when the patch is applied. (It works without the patch.)

Steps to reproduce: Update CVS. Apply patch. Run install.php. Go to Structure -> Content types -> Page -> Manage fields. Click edit on body field (or a newly created field)

Error message: PHP Fatal error:  Cannot pass parameter 8 by reference in /path/drupal-7/modules/field_ui/field_ui.admin.inc on line 1216

The call to field_default_form() on line 1216 seems to miss the new $langcode parameter:

<?php
$widget_form
= field_default_form(NULL, NULL, $field, $instance, $items, $form, $form_state, 0);
function
field_default_form($obj_type, $object, $field, $instance, $langcode, $items, &$form, &$form_state, $get_delta = NULL) {
?>

Status:Needs work» Reviewed & tested by the community
StatusFileSize
new168.21 KB
Failed: Failed to apply patch.
[ View ]

Thanks for testing and catching that! Of course, the new Field UI wasn't completely covered yet.

Back to RTBC.

Cross-posting a summary of this patch I sent to Dries: (please correct me where I may be wrong)

- Currently, translations of content are stored in different entities, which means that the same content exists multiple times for each language. To achieve content translations, Drupal tries to put all translations into a translation set, which in effect, are just different views on the same content (translatable strings change).

- The current translation approach of multiple entities leads to major issues for additional information that is attached to nodes by contributed modules: For example, file attachments or votes/rating may need to be properly (and manually) synchronized between the nodes in the translation set. Or when considering node references, it becomes obvious that one needs heavy workarounds to try to find a suitable translation for a node that was referenced in the original node (in the source language). In case no translation of the referenced node existed when the translation of the referencing node was created, but the referenced node was translated later on, then the referencing node may still point to the untranslated node that existed when the translation of the referencing node was created.

Long story short: Our current content translation approach is not canonical. Translations of content are displaying the same entity, just in a different language.

- We now have field API in core, which allows us to solve this issue by only storing translated values for translatable fields.

- This means that a user is able to specify which fields should be translatable and which should not. A Link, Node Reference, or User Reference field may not need to be translated, while most probably all text fields will.

- The "Translatable Fields" (TF) patch lays the ground-work for some further patches in the queue (most are at 90%). It is a low-level patch with no UI changes or whatsoever.

- With TF, we allow all field values to be translatable. To achieve that, they are stored and treated like multiple value fields, i.e. they are normalized into own field tables and get an additional 'language' column for each value (all of this is handled by the field storage engine).

- To account for language negotiation and yet untranslated but translatable fields, we store, load, and cache all field values for all languages. This also allows for advanced use-cases, where one might want to display information in another language, or even multiple languages at the same time. A good example for last advanced use-case would probably be a translation UI that allows to translate fields into multiple languages, which is kinda nice :)

- Since we load/render/store all languages, the conclusion is to make Drupal's field system fully language-aware. That means, field values are no longer stored in $object->field_foo[0], but in $object->field_foo['en'][0], and we pass on the "default" language to use for rendering - which is probably the reason why I am writing this mail (because that leads to a major API change we badly need to do).

- Since the i18n sprint, Francesco Placella (plach) focused on the TF patch and did an tremendous, awesome job in puzzling all parts together. The field API maintainers/developers AND the major i18n developers held countless of follow-up sprints in IRC to discuss the concept, related issues, and the resulting solution that plach implemented in TF. The patch has strong support from both field API and i18n developers and will be a major step towards a sane handling of translations in Drupal.

- Again, the patch only deals with low-level changes. It provides no UI to use the feature. However, I am confident that at least some of the other patches will make it in (I'll personally focus on those, because it is the top priority for almost all of our clients) - and even if those other patches won't make it: The functionality added by them can live in contrib. But TF cannot - because it is the building ground.

- Aside from mind-blowing discussions about translations, I just ensured that the patch is in a committable state. It's insane to review this monster; therefore a "convenience" patch has been created that contains the bare essentials (API changes) only.

Long story even shorter: In the name of many, international Drupal users: this is what we want. I kindly ask you to push the button. :)

Ok, spent another couple hours with this patch tonight, with tha_sun in IRC. I read through the whole thing, but the vast majority of it is brain-dead stuff: adding the new $langcode parameter to various field API functions and reformatting arrays to use it, adjusting the tests so that they pass, etc. The underlying patch itself seems like it is actually not much; just nesting things a level deeper, underneath langcode so that properties can be accessed independently. Yet, this simple approach has very nice benefits as outlined by sun in #218.

I want to extend a sincere thank-you to all the people who worked on this patch, particularly plach for being an amazing and constant force throughout the effort. This patch has support from a major chunk of our i18n team, both field API experts, as well as our Drupal 6 maintainer who knows this stuff better than just about anyone. This issue is a marvelous example of team work!! :)

Here are the fairly minor things I managed to find in going through, but other than those I really can find no objections. I am a n00b when it comes to the translation system, though, and because of that (as well as the fact that it's 168 K :P) I would really like to get Dries's sign-off on this before committing.

+++ modules/field/field.attach.inc 19 Aug 2009 21:44:31 -0000
@@ -417,45 +436,57 @@ function _field_invoke_multiple_default(
+ *     // The sub-array is nested into a $langcode key where $langcode has the
+ *     // same value of the $langcode parameter above. This allow us to match
+ *     // the field data structure ($field_name[$langcode][$delta][$column]).
+ *     // The '#language' key holds the same value of $langcode and it is used
+ *     // to access the field sub-array when $langcode is unknown.

Ah! Great! This resolves my one major issue with the patch the last time I reviewed it, which is that there's no way for themers who know they want to print a field in a template but don't know what the langcode is, to print it.

+++ modules/field/field.form.inc 19 Aug 2009 21:49:02 -0000
@@ -353,7 +369,7 @@ function field_add_more_js($bundle_name,
-    // Ivnalid
+    // Invalid
     $invalid = TRUE;

(minor) How about just remove the comment rather than fixing it, since it's totally useless? :P And in fact, how about doing this as a separate "novice" patch since it has nothing to do with this one, and this one is quite big enough, tyvm. :)

+++ modules/field/field.module 20 Aug 2009 15:06:26 -0000
@@ -67,6 +68,13 @@ require(DRUPAL_ROOT . '/modules/field/fi
/**
+ * The language code assigned to untranslatable fields.
+ *
+ * Defined by ISO639-2 for "No linguistic content / Not applicable".
+ */
+define('FIELD_LANGUAGE_NONE', 'zxx');

I'm happy for us to conform to existing standards. However, what this means in practice is that var_dump($node) is going to show stuff like $node->body['zxx'][0]['value'] = 'bananas'; We'll need to make absolutely sure that we document this well in the theme/module upgrade guide as well as the general module/themer information because that will be a complete WTF.

+++ modules/field/modules/field_sql_storage/field_sql_storage.module 19 Aug 2009 20:37:07 -0000
@@ -144,8 +144,16 @@ function _field_sql_storage_schema($fiel
+      // @todo Consider to store language as integer.

(minor) "Consider storing language as integer."

+++ modules/field/modules/field_sql_storage/field_sql_storage.test 19 Aug 2009 20:44:04 -0000
@@ -98,13 +99,23 @@ class FieldSqlStorageTestCase extends Dr
+    // Add a translation in an unavailable language and verify it is not loaded.
@@ -238,5 +251,47 @@ class FieldSqlStorageTestCase extends Dr
+    // Add a translation in an unavailable language.

Why are you doing this in two places?

I'm on crack. Are you, too?

StatusFileSize
new167.91 KB
Failed: Failed to apply patch.
[ View ]

The current patch is a straight reroll of #217, it just implements the // Ivnalid and (minor) "Consider storing language as integer." webchick's suggestions. The code is the same as in #217.

Ah! Great! This resolves my one major issue with the patch the last time I reviewed it, which is that there's no way for themers who know they want to print a field in a template but don't know what the langcode is, to print it.

Well, actually your concerns where addressed in field_attach_preprocess by populating the $variables array with the field translations to be displayed, using a similar technique, though :).

Why are you doing this in two places?

The first place is testing that no unavailable field translation is loaded. The second one is not a real test by itself, rather it's preliminary insertion to test the missing data behaviors.

I actually looked at this patch the other day, but didn't write a review (yet). Fundamentally, I agree with the approach taken and support this patch being committed. My big question was around performance -- I'd like to make sure it doesn't significantly degrade performance for both non-translated and translated sites.

Status:Reviewed & tested by the community» Needs work

This patch needs a quick re-roll after the regional path change.

And is someone available to run basic benchmarks before/after the patch with/without a language installed?

StatusFileSize
new167.91 KB
Failed: Failed to apply patch.
[ View ]

Re-rolled.

On performance, I'm a bit pushed for time at the moment, but I think due to the field upgrade path, we'd need the same Drupal 6 database with some devel generated content updated to Drupal 7 (one with and one without the patch) - then benchmark 30 nodes (= 30 fields) on the front page. Ideally the cost for an untranslated field should be close to zero. This is going to be a bit tricky, since it's currently impossible to upgrade from Drupal 6 to Drupal 7 in HEAD, so not sure how best we can deal with that until the upgrade path is back to working.

With translated fields the way this is designed, the idea is that you absolutely avoid either 1. additional queries when loading data 2. additional joins/conditions to determine which language/tnid to display 3. adding language keys to caches to avoid lots of misses. So the main penalty would be from loading more data from the database and a bit of processing - although if we consider all translations part of the same entity, this is much the same as loading a very long article compared to a very short one, or any content type with a additional fields attached to it which might not always display on every page. I'm not sure how easy it'd be to benchmark a translatable field with additional languages installed since there's not an easy way to populate the data (without filling it out manually), I'd be more comfortable if we could compared HEAD-HEAD with standard mono-lingual data before a commit though, will find people in irc I guess.

Status:Needs work» Needs review

Botty.

catch informed me that the upgrade path between D6 and D7 might be hosed atm: #517742: Upgrade failing if blocked_ips doesn't exist If that happens:

a) Try applying the patch in the other issue and report back your results.
b) If that fails to fix it, do a HEAD / HEAD comparison.

Ask around in #drupal if you need help!

HEAD / HEAD comparison you can do with real data by just manually adding ten articles to the front page in each install, a bit clunky but good enough for a sanity check.

Ran the benchmark: HEAD (unpatched) vs. Patched HEAD.
Note: I have not read thru the patch and just got directions from @Catch on IRC, I hope this is correct and somewhat helpful.

First test:

* Fresh install, "Full" configuration, MySQL
* Created 30 article nodes, using lorem ipsum generator.
* Configured to show 30 nodes in front page, anonymous user cache is disabled.
* Ran ab -n 100 -c 10. Results::

E:\UV-L00329342\Laboral\xampp\apache\bin>ab -n 100 -c 10 http://localhost/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient).....done
Server Software:        Apache/2.2.11
Server Hostname:        localhost
Server Port:            80
Document Path:          /
Document Length:        49849 bytes
Concurrency Level:      10
Time taken for tests:   23.924 seconds
Complete requests:      100
Failed requests:        0
Write errors:           0
Total transferred:      5030800 bytes
HTML transferred:       4984900 bytes
Requests per second:    4.18 [#/sec] (mean)
Time per request:       2392.439 [ms] (mean)
Time per request:       239.244 [ms] (mean, across all concurrent requests)
Transfer rate:          205.35 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    1   9.1      0      91
Processing:   857 2349 818.5   2303    4230
Waiting:      836 2298 810.8   2278    4216
Total:        857 2350 820.1   2303    4230
Percentage of the requests served within a certain time (ms)
  50%   2303
  66%   2548
  75%   2883
  80%   3167
  90%   3572
  95%   3764
  98%   4126
  99%   4230
100%   4230 (longest request)

Second Test

* Applied patch from: http://drupal.org/node/367595#comment-1953376
* Dropped the whole database, re-installed Drupal.
* Again, created 30 article nodes, configured to no cache, 30 nodes in front page.
(Nodes were a bit different in length, filled with random "Lorem Ipsum" text)
* Ran ab. Results:

E:\UV-L00329342\Laboral\xampp\apache\bin>ab -n 100 -c 10 http://localhost/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient).....done
Server Software:        Apache/2.2.11
Server Hostname:        localhost
Server Port:            80
Document Path:          /
Document Length:        50692 bytes
Concurrency Level:      10
Time taken for tests:   24.844 seconds
Complete requests:      100
Failed requests:        0
Write errors:           0
Total transferred:      5115100 bytes
HTML transferred:       5069200 bytes
Requests per second:    4.03 [#/sec] (mean)
Time per request:       2484.449 [ms] (mean)
Time per request:       248.445 [ms] (mean, across all concurrent requests)
Transfer rate:          201.06 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.9      0       8
Processing:   931 2445 1034.9   2194    5632
Waiting:      919 2403 1024.8   2182    5560
Total:        932 2445 1034.8   2195    5632
Percentage of the requests served within a certain time (ms)
  50%   2195
  66%   2836
  75%   3304
  80%   3454
  90%   3854
  95%   4242
  98%   5047
  99%   5632
100%   5632 (longest request)

As per @catch's request, ran again with different settings for ab: -n1000 -c1

My feeling is that there's discernible difference under my test conditions...

Results:
Unpatched HEAD

E:\UV-L00329342\Laboral\xampp\apache\bin>ab -n1000 -c1 http://localhost/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests
Server Software:        Apache/2.2.11
Server Hostname:        localhost
Server Port:            80
Document Path:          /
Document Length:        51711 bytes
Concurrency Level:      1
Time taken for tests:   378.725 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      52170000 bytes
HTML transferred:       51711000 bytes
Requests per second:    2.64 [#/sec] (mean)
Time per request:       378.725 [ms] (mean)
Time per request:       378.725 [ms] (mean, across all concurrent requests)
Transfer rate:          134.52 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.4      0       1
Processing:   352  378  41.4    373    1242
Waiting:      341  367  41.2    361    1231
Total:        353  379  41.4    373    1242
Percentage of the requests served within a certain time (ms)
  50%    373
  66%    382
  75%    388
  80%    391
  90%    401
  95%    411
  98%    437
  99%    469
100%   1242 (longest request)

Patched:

E:\UV-L00329342\Laboral\xampp\apache\bin>ab -n1000 -c1 http://localhost/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests
Server Software:        Apache/2.2.11
Server Hostname:        localhost
Server Port:            80
Document Path:          /
Document Length:        50692 bytes
Concurrency Level:      1
Time taken for tests:   395.848 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      51151000 bytes
HTML transferred:       50692000 bytes
Requests per second:    2.53 [#/sec] (mean)
Time per request:       395.848 [ms] (mean)
Time per request:       395.848 [ms] (mean, across all concurrent requests)
Transfer rate:          126.19 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.5      0       9
Processing:   357  395  35.6    386     648
Waiting:      346  383  34.1    375     603
Total:        357  396  35.7    387     649
Percentage of the requests served within a certain time (ms)
  50%    387
  66%    397
  75%    407
  80%    416
  90%    440
  95%    467
  98%    500
  99%    527
100%    649 (longest request)

Status:Needs review» Reviewed & tested by the community

Test bot is still happy, and those benchmarks look fine to me. If I don't get beaten to it with a commit before Sunday, i'll try to run some as well just to verify, but putting back to RTBC since there's no serious regression here in terms of performance for untranslated fields. If anyone has time to create a translated field in a couple of languages and benchmark that, it might also help - I guess we'd have to benchmark a translated field with two translations, vs. and untranslated field, which isn't really a fair comparison given locale will be enabled and etc., but not many other ways I can see to do it.

Status:Reviewed & tested by the community» Fixed

Awesome! Thank you so much for jumping in here, janusman!

Committed to HEAD! WOOHOO!! :)

OMG.

W O W !!!

Status:Fixed» Needs work
Issue tags:+Needs Documentation

This needs documenting in the upgrade guide, and maybe a CHANGELOG.txt entry.

Awesome :), Thanks you guys

Status:Needs work» Needs review
StatusFileSize
new1.17 KB
Failed: Failed to apply patch.
[ View ]

I've made a CHANGELOG.txt patch.

Status:Needs review» Reviewed & tested by the community

Looks good.

Status:Reviewed & tested by the community» Needs work

Awesome, thank you! Committed to HEAD.
Back to needs work for docs.

@webchick: should I write the documentation myself or am I supposed to get in touch with the documentation team?

Status:Needs work» Needs review
StatusFileSize
new829 bytes
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch translatable_fields-367595-239.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Er, I just noticed that the patch that went in had a missing hunk :(

It was a one-line change in template_preprocess_node, the attached patch should fix it. I did not open a new thread as the missing line was part of the patch until #183 and had yched's approval, it was lost during the proof of concept work. I'm really sorry.

Status:Needs review» Reviewed & tested by the community

Looks good.

Status:Reviewed & tested by the community» Needs work

Thanks, committed! Back to needs work for documentation.

plach: Up to you. The customary thing is for one of the people who worked on the patch to amend the upgrade docs accordingly. But since you already put in approximately one bazillion hours on this patch, if you want to leave it for a member of the docs team to clean up, they're normally quite attentive to the "Needs documentation" queue.

Assigned:Unassigned» plach

@webchick: I would be glad to provide the documentation myself, but I am really full with the TF side-issues. Moreover a skilled doc team member is going to achieve a far better result than me. However I am fully available for any needed feedback.

Assigned:plach» Unassigned

Assigned:plach» Unassigned

Hm, we should probably add the field_attach_preprocess() call to all other fieldable entities (user, comment, terms), even if they're not translatable.

This patch broke taxonomy term fields with autocomplete widgets. It may also have broken options (radio/check) widgets, because I get WSOD when I try to edit a node with a list field and that widget.

On taxonomy term autocomplete widgets, the function taxonomy_autocomplete_validate is looking in $form_state['values'][$element['#field_name']]['value'] but it needs to look in $form_state['values'][$element['#field_name']][$langcode]['value']. However, it doesn't know a $langcode. Do I just force it to be untranslateable and use the FIELD_LANGUAGE_NONE constant?

@yched: perhaps we should open a new issue?

@bangpound: in $form[$element['#field_name']]['#language'] there should be the needed $langcode. I'll try to give a look to the options widgets ASAP. If you could file two new issues and tag them translatable fields it would be great.

There's a conceptual problem with translatable fields for references, and I think sun alluded to it in #218. I'm considering it for taxonomy terms, but it could apply to any reference field to a translatable entity.

If a vocabulary and its terms are translatable, the fields whose allowed values include that vocabulary shouldn't be translatable. For example, a toponym (place name) vocabulary lists names of countries. The countries are the same whether the language of the user is French or English or Japanese. A taxonomy term reference in an English node to "Bangladesh" should not change its value for the French version of the node.

However, tags are a different problem. The user's judgment determines how something is classified, and individual judgment can be affected by culture such that (with multilingual social sites for example) the tags for a node in English cannot necessarily match those in other languages. Tags could be slang, impressionistic...

We can talk more about this at Drupalcon. This may ultimately be a documentation issue.

I'd say some taxo fields (taxo-term-reference, actually) should be translatable, some others not, that's hardly predictable and really depends on the kind of categorization. It should ultimately be left to site admins.

A taxonomy term reference in an English node to "Bangladesh" should not change its value for the French version of the node.

This might be true in some circumstances while not at others.

Take USA for example:

English: “United States of America”
French: “États-Unis d’Amériques”

Some sites would require localization of the countries whereas some other sites will use the original country name.

@yched and @xmacinfo are totally right, the site admins should decide which fields are translated and how. This absolutely depends on the use case.

The current approach should allow us to translate the single terms of an untranslatable taxo field or make the taxo field translatable and provide different values for each node translation.

I agree that it's a problem that's specific to each site. So I've enabled multiple languages, set up one of my content types to be multilingual and translatable. How do I change the translatability of a field? Is this a missing feature?

@xmacinfo your example is what I'm talking about. The relationships between the node and the term is not translated but the name of the term is translated. The "thing" isn't language specific but the name of the thing is.

The translability of an item is not currently handled by the UI, obviously it should. The UI have also to take into account the presence of translation handlers which enable translatable fields to be actually translated (see field_multilingual_available_languages and field_multilingual_check_translation_handler for details).

@bangpound: I think I do understand what you mean. :-)

What are all the UI elements missing? Are there things that can be done later, after the API is frozen?

This patch introduced a new blocker for #391330: File Field for Core. See #559506: Field Widgets Need $langcode variable. If some of the authors of this patch could give that a review I'd appreciate it.

The UI for translatable fields is tackled in #539110: TF #4: Translatable fields UI. Please follow-up over there.

If you want to participate in other i18n issues:

http://drupal.org/project/issues/search/drupal?issue_tags=i18n%20sprint
http://drupal.org/project/issues/search/drupal?issue_tags=translatable%2...

Please only comment on this issue if you want to point to regressions or other issues that need help (like quicksketch did). Thanks.

Please, leave this issue for the documentation work.

For any bug report file an issue tagged translatable fields: I keep checking the queue for issues tagged that way.

- re #244 / 246 : #560608: add missing calls to field_attach_preprocess()

- @plach: I think at some point we discussed the fact that only nodes would hold translatable fields for now, to limitate UI work. Sorry if I'm on crack and that was dropped later on ?

Issue tags:+Exception code freeze

add tag

Issue tags:+d7docs

added tag

Status:Needs work» Needs review

Status:Needs review» Needs work

@geto:

The documentation has not been updated yet, so we should keep the status to needs work.

Hi, I have an alternative and less Traumatic solution for this issue, I developed it for Drupal 6 and is working very good in a real project.

The propossed module futfill most of the requirements and is currently working in a real Drupal 6 instalation, the idea can serve as inspiration to solve this issue on Drupal 7.
The module has some advantages, like these:
1) Its a module and doesn't need to modify the core.
2) The translation logic is hidden to the entire system, so it works whit most of the existing modules without additional handling.

The module is in "proposed" state so tell me your opinions here: http://drupal.org/node/752122

@ricardodpsx:

This issue has been commited to Drupal 7 core in August 2009, it is in status 'needs work' because it lacks documentation as some related issues are still pending.

Keeping track of this thread.

Status:Needs work» Fixed

We have the documentation finally: Translatable Fields.

Subscribing

Status:Fixed» Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

The TF UI is provided by http://drupal.org/project/translation.

Wow, long shot, but this actually semi broke 'default values' : #1253820: It's impossible to submit no value for a field that has a default value

(fixed wrong issue link in previous comment)

Status:Closed (fixed)» Needs review

The last submitted patch, 211: drupal-tf.essentials-minus-tests-d6.patch, failed testing.

The last submitted patch, 211: drupal-tf.essentials-d6.patch, failed testing.

The last submitted patch, 215: drupal-tf.212.essentials-DREDITOR-d6.patch, failed testing.

Status:Needs review» Needs work

The last submitted patch, 239: translatable_fields-367595-239.patch, failed testing.

Issue summary:View changes
Status:Needs work» Closed (fixed)