Currently there is no general solution in core for making fields holding user-defined strings translatable. Some are outlined here: http://groups.drupal.org/node/18735.
Probably the most promising is to add built-in support for translation to the Fields API.
Key advantages would include:
* No need for custom handling of each translatable field (as some other approaches would require).
* Greater incentive for use of the new Fields API.
* Translations would be loaded on initial object load, rather than waiting until later and loading each translation as a separate override.
We're preparing for an internationalization focused code sprint in the next two weeks. In preparation, it would be great to get some direction from Field API experts on what adding translation support to the API would mean. What are our options for implementation?
Adding a language field to each field table (and adding language to the primary key array)? Separate storage for multiple languages?
Use the existing locale API for translation? Add translation to the UI for object editing?
Comment | File | Size | Author |
---|---|---|---|
#239 | translatable_fields-367595-239.patch | 829 bytes | plach |
#235 | changelog.txt_translatable_field.patch | 1.17 KB | xmacinfo |
#223 | translatable_fields-367595-220.patch | 167.91 KB | catch |
#220 | translatable_fields-367595-220.patch | 167.91 KB | plach |
#217 | drupal-tf.217.patch | 168.21 KB | sun |
Comments
Comment #1
nedjoComment #2
nedjoComment #3
catchI've been mulling over some of the various options this week, and this seems like the ideal one to me for a number of reasons. Thanks for beating me to it posting the issue.
# If fields are content, then the translations of a field are equally content - and should be loaded and stored as part of the object rather than tacked on later. If I want to migrate/syndicate content around Drupal installs, being able to load objects and save them as one single entity is much nicer.
# By doing loading everything in the same palce - the locale system only has to handle rendering the correct language, not pulling in language information from custom storage as an afterthought.
# Having proper multilingual fields means that we can cache objects and/or fields safely without having to wonder about language-dependent cache keys or anything.
# We don't need any additional queries or joins. If there's lots of translations we'll end up with some very large objects being loaded sometimes - but that particular issue is only going to affect sites with lots of translations.
So the main question seems to be how to get this to fit in with the proposed field storage model, and how the UI for it might look.
My initial inclination would be to have a 'multilingual/translatable' property which would be part of any field definition- which would map roughly to the difference between single or multiple. So rather than have multiple fields with numeric indexes/weights, you'd have multiple fields keyed by language. And then when displaying this to the user, we'd have something along the lines of $node->field_name[$langcode]->view instead of $node->field_name[0]->view
Comment #4
nedjoAn implementation might support:
* Admin adds some fields to an object.
* Some of these fields are translatable.
* To make a translation, click 'Translate into French' and get an edit
form for the object with just the translatable fields.
* When loading the object for display in a multilingual environment,
fields are loaded from the start for a given language (the current
language).
(Possibly, if we took this route, nodes too could optionally use it. A
node type could use the translation module *or* just feature
translatable fields.)
Some of the questions:
* How should a field be designated translatable? Is it by field type
(e.g., text = translatable) or is it a property applied to field instances?
* What are the options for storage, and of them which are best?
Comment #5
catchWe're working on this as part of the i18n sprint. Here's a summary of the current discussion, at least from my point of view ;)
'translatable' is a new property, which can be applied to any field.
If a field is translable, a few things happen:
1. The field_sql_storage module adds a 'language' column to the table (in addition to object id, delta and field-provided columns)
2. The field is treated as multiple with FIELD_CARDINALITY_UNLIMITED. In reality there will be one row per language in the table, but we can't predict how many languages are installed on a site.
3. All data for the field is loaded with the object it's attached to - same as any other field, this avoids query rewriting and language dependent cache keys.
4. On rendering, only the value in the current language will actually be displayed.
Here's a very initial patch which adds the 'translatable' property to the field definition documentation, and adds the column in field_sql_storage.module
Unanswered questions:
1. Should we always add the language column for any field, but just have it default to an empty string and treat that as language neutral, or should the column be added only if a field is marked as translatable?
2. sun mentioned a possible alternative data structure like having columns for data_en, data_fr, data_de - so all languages are held in one row - but so far in discussion we don't see a way to do this without dynamic/admin triggered schema manipulation.
Very basic patch to show where the changes in field.crud.inc and field_sql_storage.module would start. Initial opinions on this idea from the FiC team would be great of course.
Comment #6
sunIf 2) turns out to be not possible, I'd say yes to 1).
Comment #7
catchHere's work so far.
language column is added to all per-field tables by default, if a field isn't translatable, then it just gets populated with an empty string (same as nodes do now). This means that say body becomes a field, and then a site wants to make 'body' translatable, then while it wouldn't be a straight switch, it also wouldn't require actual schema changes either. It might be considered crufty though, so up for discussion.
All translatable fields get FIELD_CARDINALITY_UNLIMITED - because we never know how many languages might be on a site. This also means that adding the language column won't make any difference if fields API moves to a default per-bundle model, because FIELD_CARDINALITY_UNLIMITED always means a single dedicated table for that field.
On loading, the multiple field is loaded like any other, except, it's keyed by language.
so:
Instead of
Which saves a lot of messing about later hopefully (i.e. no array_search() no extra looping) .
Not in the patch yet by any means, but a question which needs answering is this:
Where do we select the language-specific content to display to the user?
There's no query rewriting going on, and no t() or anything here - all fields are available all the time as an inherent property of the object, which shifts the responsibility to somewhere in rendering.
My inclination would be to have somewhere, maybe even in drupal_render() itself, where if an element has a '#translate' property set, the rendering function assumes that the element children are keyed by language, and only renders $elements[$language] for display.
That'd make the whole rendering issue separate from this patch though (i.e. not a formatter or anything like that). And would give us a generic method which could be used for stuff which currently isn't fields for whatever reason.
If that sounds like a possible approach, then the initial groundwork laid here might be approaching enough for a first patch - with a bunch of related/followup issues to open alongside it. So marking to needs review for some more eyes.
Comment #8
yched CreditAttribution: yched commentedmeans multiple fields cannot be translatable / a translatable field can only have one value (per language) ?
Comment #9
catchyched: yes, at the moment that's exactly what it means. I'm having trouble thinking of use cases for translatable multiple fields, and also just generally wrapping my head around how it'd happen.
We could do something like
$additions[$row->entity_id][$field_name][$item->language][] = $item;
of course, but that's probably going to add some overhead elsewhere. Another possibility would be only doing that for "translatable multiple" fields - so we have 'translatable' and 'translatable multiple' as two possible options. Ideas welcome of course.
Comment #10
sunWe definitely need fields that are multiple and translatable. But indeed, I too have problems to wrap my head around how the schema and the resulting object would have to be structured for this.
Maybe having 1-extra-column-per-language-value would be easier to understand (and track) for this? (as mentioned in #5)
Comment #11
plachsubscribe
Comment #12
catchHere's a stab at supporting multiple translatable fields.
Only change is we do
Instead of
I also added the #translatable property per field where it's marked as translatable.
This should now match up fine with the rendering approach at #368823: Add #translatable support to content rendering. I don't think the schema would need to change - we'd still load all fields for an object, and if there's more than one entry for a language, these would be displayed. Where it'd get more complicated is supporting this in the UI.
Comment #13
catchOne other thing to note, if you needed to list 10s of items - Views being the obvious example, then it ought to be very possible to do so by adding a language condition to the field table if a full object load absolutely needs to be avoided. Also, while I think we should avoid dynamic schema at all costs (at least until there's something cooked up for fields in general), either the per-bundle module or materialized views could probably cook up a similar table to the column_language_1 column_language_2 idea suggested by sun - these would need to rebuilt when adding or removing a language, as I understand it mv is designed to handles those changes on some level. So for specific cases, I don't think we're precluding loading only the data you need - even if I think that's not a good fit for 'first class object' loads themselves.
Comment #14
catchComment #15
yched CreditAttribution: yched commentedYes, I see no reason why the code in #12 wouldn't technically work.
Two issues though :
- Unpredictability of what you find in $object->field_name. Currently it's always an array (possibly empty) of multiple values (possibly only one). This is really important for DX. With the approach taken here, it depends on whether the field is translatable. We need to find a way that doesn't force you to test translatability and have two code branches in the may places where we refer to a field's values.
- As http://drupal.org/node/361683#comment-1236964 shows, there's a limit on how many columns you can add to a db. So this approach of translatable fields make the issues there make denormalized tables more hazardous.
Also, yes, it is critical that the primary storage tables (PF in the current state of HEAD, PB if it becomes the default at some point) do not have schema changes. This means that enabling a new language cannot add a new column, so they cannot have field_name_value_[language] columns.
Comment #16
catchyched: yes that'd be a big issue with having translation of multiple value fields done this way (part of why I'd initially hoped I could leave it out since only allowing one value per language per field would mean the existing structure could be re-used with only named keys added).
One way around it would be to populate the field values as usual, but add $object->field['translations'] additionally - which would contain the data at a low level. I think it's critical for performance and ease of access that we don't have to keep iterating over fields once they're initially loaded to see if they've got languages associated or not, so it needs to be done early, once - but doing it that way means two copies of the data instead of one.
Then in rendering, we'd do:
The 'translations' array would then replace the #translatable property (or something similar). Here's a patch just for comparison.
Comment #17
catchMerged #368823: Add #translatable support to content rendering in with this one, since we're now using #pre_render, and while I'm leaving the #pre_render callback in common.inc for now since it's potentially a generic approach we might want to take later, it doesn't really make sense out of context. Also tests added for this.
By using #pre_render we keep drupal_render() agnostic about translation, and it also gives contrib the chance to change the callback before we get there, say to handle proper language fallbacks.
Comment #18
janusman CreditAttribution: janusman commentedsubscribing
Comment #19
catchFixed up some test failures (other than the ones already in HEAD). When all core field tests pass, I'll add some more here.
Comment #20
catchplach did some thorough review of this in irc, $language wasn't ever going to get properly populated, changed to $item->language
Comment #21
bjaspan CreditAttribution: bjaspan commentedI've only quickly scanned the discussion here. I do not love the idea of intermingling translation and cardinality. If we cannot keep these unrelated concepts orthogonal, the design is wrong.
What about a separate field data table for each field for each language? I haven't thought through it, but it would allow per-language storage without taking over the cardinality capability.
Each field indicates if it is translation-enabled. Each object specifies the languages it should be loaded/saved in via an object property identified by hook_fieldable_info(). The field storage engine is responsible for maintaining separate tables for each field per-language, and for reading/writing them as appropriate. Changing table schemas in flight is hard, but creating or dropping tables is easy.
Just a thought.
I have no idea about how to represent multi-lingual field data on the object itself at the moment.
Comment #22
catch@bjaspan, if I'm loading an object, I should get the actual object including all content, not a slice of it depending on the language of the current user. Adding a table per language per field, means potentially multiplying the number field tables by 5, 10, 20 depending on the number of languages on a site. If we need to show all values in all languages in all fields for an object (very possible to do with Views), then we'll quickly hit MySQL's 61 table join limit. So I'm very much against adding an extra table per language per field + query rewriting, it's going to mean a massive slowdown. For example a lot of sites need language fallbacks, sometimes multiple levels of fallback - having to hit a number of different tables in turn to get the fallback value for a field is going to be very, very nasty.
As to cardinality, I don't see a partciular reason why we couldn't separate the two with the current model - the only issue is that we have to allow for adding a language to a site without schema changes, so the maximum number of records per object is going to be variable over time.
Comment #23
plachTested the patch with a
demofield
from FIC bzr repository.There were some things not working, I tried to fix them without screwing the whole picture we discussed yesterday.
There are some aspects yet to be fixed but I were able to:
* create an article with a demofield (text) field
* after inserting manually in the DB a couple of field translation, make them show up according to selected UI language
* show nothing if no translation is avalaible for the selected UI language
* show the field value with any UI language if it is language neutral
I'll be glad to discuss the details tomorrow.
Comment #24
bjaspan CreditAttribution: bjaspan commented@catch: I am not specifically supporting my field-table-per-language idea; I have not thought about the problem in enough detail to be confident the idea is any good. However, I want to disagree with a couple assertions you made.
1. If you are storing 5, 10, or 20 times the unique data, you are going to need 5, 10, or 20 times the amount of space to put it in and should expect to do a lot more work. I do not see why loading multiple languages ought to come "for free" from the database. If there is a way to make it happen that does not break the architecture, great.
2. Loaded fields can be cached per object, so storing field data in different tables does not slow down load performance. As for write performance, this is related to but not quite the same situation as the great hybrid-storage debate. I do not think there are very many real-world scenarios in which there is an extremely high volume of objects being created in 20 languages simultaneously.
3. The 61-join limit for Views is a red herring. Unless you are actually filtering/sorting by >61 fields, there is no reason Views needs to actually join to select-only fields every time; that is not the most efficient way to retrieve the data, especially given the field cache. Views could just retrieve the object ids and use load-multiple (which uses the cache) to get the full objects, or if you really need a live query it could retrieve the object ids and the first 60 fields in one query, then the next 60 fields in the next query, etc.
Comment #25
sunI strongly agree with catch. One of our sites translates content into 14 languages. It has 25 content-types, with a total of 115 fields. So, that alone would result in 1610 tables. Adding the remaining, non-CCK-related tables (~180) of this site would result in a total of 1790 tables.
Comment #26
catch@bjaspan, @1 I think we can make it happen without breaking the architecture, IMO sun's example strongly points to a table per field per language being not a good option at all.
@2 great ;)
@3 this is true, but it'd require the administrator choosing an row style which loads the full object (and also that all languages get loaded into the object all the time - not sliced by current language within the load functions) - my point is that it'd be even easier than usual for an admin to do this configuring a field row style per D6 Views, and we shouldn't make situations like that any easier than they already are to walk into.
Here's a minor update with the FIELD_CARDINALITY_UNLIMITED change removed, however there's a couple of issues with cardinality we need to solve, and since this patch is my first really serious look through the field API I'm not sure how best to do it:
The main one, is that If a field has cardinality of 4 for example, we need a way to make the delta/cardinality ratio per-language. This needs to both be handled in the UI, and enforced in validation. Lines like this for example aren't going to work if the cardinality is 4 and there's 10 languages (i.e. $delta could hit 40):
We need to find a way to make deltas work on a per language basis - at this point it seems likely to require ignoring the actual delta, and doing row counts for entries-per-language-per-object in validation somewhere.
Comment #27
yched CreditAttribution: yched commentedI just gave the patch a cursory look, no time for an in-depth review right now.
- In field_sql_storage_field_storage_load()
I'm not sure about the 2nd line - turning each $object->field_name[delta] in a #-array ? maybe, but probably needs some pondering... Basically means you need element_children() each tilme you want to iterate over field values (as you do in text_field_sanitize(), but there are *many* more places). Not sure of the other consequences right now.
- A definite -1 on the first line, though. Filling a render property at load time really sounds like a bad idea :-)
- Right, if we go this way, the code to limit the number of values written or loaded will need to be updated to limit per language.
- More generally, what's the impact of this 'translatable fields' approach on D6's 'translated nodes are different nodes' approach to node translations ?
I'm pretty sure you explained that somewhere in one of the Translation Sprint threads, can you link to a post I could read ?
Comment #28
nedjoIt's a good question. Here's my understanding.
The focus on fields came out of a process where we looked at various possible approaches to making non-nodes translatable, see http://groups.drupal.org/node/18735.
Option #2 on that page was to follow the translated nodes approach with other objects, like users. But multiple records per user, one per language, each at a different URL?
For any non-node object that fields can attach to, translatable fields would make it possible for the object to exist in multiple languages, without the various problems associated with having separate objects per language.
For nodes, multilingual fields would provide an *alternate* way of making nodes translatable. A particular site could keep the translation module disabled and use purely fields for multilingual nodes. It could have only unilingual fields and use translation module. Or it could use a combination of the two--some node types handled by translation module, others by multi-language fields.
Probably there will be some quirks to work out to get this working though. For example, if this is the scenario, is it a requirement that whether a field is multilingual or not is by field instance, so that a particular field could be used with translation module handled node types and those not handled by translation module?
There are also UI considerations. Do we have the same UI for translation module (create a new node based on this node but in a different language) and field-based node translation (bring up a copy of this node with translatable fields editable in a different language)?
In the long term, it's *possible* that multilingual fields will render our current nid/tnid approach unneeded, but they could equally coexist.
Comment #29
catchThanks yched, I'm also not sure about requiring element_children() for this. plach, could you explain why that change was necessary?
The #pre_render in load, yes it doesn't really belong there, put that down to me reading through Field API as I go along...
Nedjo pretty much covered translatable fields vs. translatable nodes. The main reason for trying this approach is that trying to share anything between nodes in a translation set (nodequeues, votes, CCK fields) currently requires doing everything twice - we've been working on adding that support to contrib modules and it's non-trivial to implement and requires a lot of complexity. I also have an early draft of a patch for attaching fields to translation sets (i.e. proper shared fields as opposed to copying values over) here: #132075: Attach fields to translation sets. although that takes a bit of fakery to get going. So this would be a much simpler approach to take for things like taxonomy terms and users, and an optional approach for nodes (obviously you wouldn't want both node translation and translatable fields on the same node type).
Comment #30
sunAll very good points. To summarize:
- If fields are translatable and stored by language, then we are not going to store different nodes (objects) for translations. I.e. node/1 contains all translations on load.
- Separate nodes would mean that the additional language column for fields would not be required at all. However, as catch already mentioned, those separated translation nodes were/are a major flaw in the current translation system in many cases.
- OTOH, separate nodes have valid use-cases, f.e. to have only French comments on the French translation. I.e. the scheme of having a node/1 for EN and separate node/2 for FR has some pro points, but also a major con point: We can't have user/1 for EN and (the same) user/2 for FR.
- Upgrading existing nodes (with separate translation nodes) to translatable fields is rather impossible without rebuilding your Drupal 7 site from scratch. Users must, however, be able to use both node translations and field translations in parallel.
In short: We need translatable fields (e.g. for users), but we also need to keep the existing translation method for nodes.
Comment #31
plach@catch: I was forced to use element_children() because the $object->fieldname fields are everywhere treated as integer-keyed arrays of field values, as yched was pointing out: the introduction of the
#pre_render
andtranslation
keys broke this assumption. If we decide to keep this approach we'll have to use element_children() almost everywhere in the Fields code (I wish we could use getters and setters to access object properties...)As for my understanding there are three main aspects treated in this discussion:
I am not sure that any of my proposal is viable, but I think we should find a shared solution for all the aspects above to get somewhere.
Comment #32
plachWe spent all the 3rd day in discussing, this is the plan for the 4th day:
#languages
, i.e.field[0]['#languages']['en']['value']
-- we provide NO default value (for translatable fields, obviously)hook_field_attach_load()
default implementation which performs language negotiation, i.e. shifts the proper values intofield[0]['value']
drupal_render
always gets a plain old node with all the field values in place (#languages
is not dropped)read coding reviewing testing
Comment #33
plach@bjaspan: the storage pattern issue is still to be considered, we decided that, due to the field storage pluggability, we can ignore it for the moment: we'll be glad to discuss that on #drupal-i18n with you and yched.
Comment #34
catchSo the idea here, is that for multilingual fields, we want the initial loading of fields/objects (and caching) to be agnostic about what actually gets rendered - so you always get a consistent field[$delta]['#languages'] array from database loading or the field cache regardless of what the current language is.
We also don't want drupal_render() to have to worry about translation. So, for multilingual fields only, we store the values in #languages, then populate the actual field value to be used in field_field_attach_load() - which means other modules can hook in later to do proper language fallbacks as needed.
Then any rendering/theming is going to just do field[0]['view'] same as now - doesn't need to worry about translation at all, and nor does the caching layer - all the hard work is done in hook implementations.
This seems like the best compromise between keeping the existing field array consistent and having enough flexibility. i.e. we won't need to use element_children() as in the existing patch or anything like that.
Comment #35
catchHere's a new version in line with yesterday's discussion. Main changes:
1. For multi-lingual fields, we load and cache all multilingual values for a field in a new $field[$delta]['#languages'] property in field_sql_storage_field_storage_load().
This has the following advantages:
* Regardless of the language for the current request, this information is always the same, so can be loaded from cache or the database safely.
* As a #property, it'll be 'invisible' to all field modules (see later) so shouldn't mess up existing code
* We'll be able to track delta values per language (language has been added to the primary key) - so field cardinality should still work (although saving delta per language is still a TODO).
* Language selection and/or fallbacks can be implemented in PHP instead of requiring complex JOINs (see 2)
2. In field_field_attach_load() we populate $field[$delta] with the best fit value fr the language language (per-delta, to allow for partially translated multiple value fields).
Once this is run, other modules implementing the hook could implement their own language fallbacks (use en-US, if not use en-CA, if not use en-GB) relying on the #languages array to see what's available. If we require locale and/or translation modules for translatable fields, then this hook implementation could also be moved there, but it gives an idea of how this would work.
After this point, all field modules and field API will see a single value per delta, as with non-translated fields, in the desired language for that request.
The main things still to resolve:
# Setting delta values on a per-language basis on save, and using the delta to enforce exactly the same cardinality rules on validation and load as happen for mono-lingual fields.
# Some idea of how this would map to a UI. My initial idea would be node/n/edit/fr node/n/edit/en with only fields in the available languages shown on those tabs, maybe with a side-side view so the existing values can be shown alongside the form.
# Tests.
However this is at the point where it needs more architectural review, so setting to CNR.
Comment #36
sunAdded another fallback to an empty ('') language value, which basically equals the "language neutral" setting we have for nodes now.
Comment #37
bjaspan CreditAttribution: bjaspan commentedI haven't looked at this yet.
I'm curious how this affects per-bundle storage (see http://drupal.org/node/368674, the patch keeps getting rejected but in fact is fine, and also pbs.module in contrib). Is there a defined list of all possible languages that can exist at any given time? If so, should per-bundle storage create a separate column in its wide tables for every possible language for every field that it otherwise puts in those tables? If so, there also needs to be hooks that pbs.module can implement for "create language" and "delete language" (and "rename language" if appropriate") so that it can know when it has to update the per-bundle table columns.
If there is a not a defined list of possible languages at any given time with hooks for when it changes, then I think this approach kills per-bundle storage, and that is going to be a deal-breaker for a very vocal contingent.
Comment #38
catch"Is there a defined list of all possible languages that can exist at any given time?"
Yes, it's available from language_list(), I don't know if locale module provides hooks for when languages are added or removed, but if it doesn't then it ought to ;)
More generally, were it somehow impossible to add fields to per-bundle storage, I don't see it 'killing per-bundle storage' as such, but it would kill per-bundle storage for translatable fields - meaning they'd have similar restrictions to fields with unlimited cardinality on that front, at least as I understand it. Even then, they'd only be subject to the limitations of per-field-only storage as it is now, as opposed to query rewriting, additional joins with conditions, or hits to the database on every load which most other options are going to impose.
Comment #39
bjaspan CreditAttribution: bjaspan commentedOkay.
Comment #40
bjaspan CreditAttribution: bjaspan commentedQuick read-only review:
This is wrong. field_info_instances() takes a bundle name, not an object type. This code will work for users but not for nodes. See other calls to this function in (e.g.) field.attach.inc.
I don't love having field_field_attach_load() in field.module do the language thing. If this is going to be a core behavior of Field API, it should probably be in field_attach_load() directly. If not, it should probably be in a different module (locale? something else?).
The primary key for field_config and field_config_instance is going to change to serial ids. The current primary key will become a unique key. Anyway, it isn't clear to me where 'language' belongs in the key, if it does, or if we should have a separate index for it.
I haven't really thought about the changes to field_sql_storage.module.
Comment #41
yched CreditAttribution: yched commented[edit : how can I crosspost with a comment posted half an hour ago ? - bah]
Architecturally this looks fine.
field_field_attach_load() :
is not right. field_info_instances() takes the bundle of the object, which can be extracted using field_attach_extract_ids().
Actually, the common way to 'run an operation on a object's field values for every instance in the bundle' is through _field_invoke_default() / field_default_*(), so field_field_attach_load() might be worth implementing as a field_default_* operation, for consistency.
About impact on PB storage, the main issue is limitations on the number of columns. I don't know if pbs.module intends to account for this (don't know if that's even possible), but adding one set of columns per language, per delta (pbs stores all limited-cardinality fields) makes you reach the limit much faster.
Comment #42
bjaspan CreditAttribution: bjaspan commentedpbs.module could choose to store only single-value fields if they are also translatable. It could even make that decision dynamically based on the database's per-table column limit and the number of fields it wants to combine. Or it could provide a UI to let the admin decide.
catch: FYI, hook_field_attach_load() is going to change as part of the hybrid storage patch to be completely different than it is now. The current hook behavior will probably be provided by hook_field_attach_load_alter() or somesuch. This will be an easy update to make, obviously. :-)
Comment #43
catchfield_field_attach_load() - I'd be fine with moving it to locale.module - since locale is required to have languages, then it can realistically be required for translatable fields to work.
field_info_instances() - thanks for that, now fixed. (both these changes in this patch)
primary keys - do you mean translatable rather than language? Language is added to per-field tables and in those tables, to allow for one delta to have a record for each language, we added language to the primary key there. But yeah there's nothing for translatable yet, probably needs more discussion.
Thanks for the feedback, looking forward to more.
Comment #44
catchSounds good with hook_field_attach_load() changing, it looks misplaced at the moment since it doesn't get added to the field cache but runs in the 'if not cached bit' (and also not multiple, tsk tsk). Is it always going to be reliably not added to the field cache? Also, if the cache node load patch gets in at some point, that'll be caching the full output of field_attach_load() - so we might need to move hook_field_attach_load_alter() out of that, somehow. A bit tricky but a fairly minor issue compared to the other ones that need solving. I'm glad this version is going down better than what we came up with last week ;)
Hoping to add some very basic low level tests for this, if possible later tonight.
Comment #45
plachMade some small fix to make it work, we are still stuck with the field cache issue, but apart from that the patch seems to work.
I tested it again with demofield: I manually inserted three demofield entries (en, it, 'language neutral'), then I configured three languages for the site (en, it, fr) and made fr the default one: all seemed to work as expected.
You can find some screenshot attached.
Comment #47
plachWindows newlines strike again...
Comment #48
plachComment #49
plachCrossposting UI proposal: http://groups.drupal.org/node/19057
Comment #50
yched CreditAttribution: yched commentedA few more remarks :
-
means that if a field is 'translatable' but locale.module is disabled, we get no useful data for the field at all (stays hidden in '#languages', only locale knows how to fetch it from there).
I think the overall approach is fine but there are some details to iron out about code separation and who does what - like : 'translatable' is an "official" $field property, stored in the field_config_* tables (not sure we need a dedicated column, btw), and storage modules need to support it, but the actual feature is hooked in by another, non-required module.
- The current code doesn't let you specify "I want to load the fields for a given object in a given language" - the language is automatically decided for you based on page context.
- So far the discussion mainly focused about loading the data, not too much about saving incoming data. We'll need to be able to read the language of the incoming values somewhere, not sure how this will look like. The patch currently sets $obj->field_name[$delta]['language'] on load (field_sql_storage_field_storage_write() needs to be fixed, uses $item->language instead of $item['language']).
Means you can write an object with a french field_a and an english field_b ? or even field_a value 1 in french and value 2 in english ?
Means you have to manually specify a language for all field values when doing a programmatic node_save() ?
+ detail : $item['language'] might clash with a field-type 'column' named 'language'.
Comment #51
plachnedjo, catch and I managed to address the field cache issue, now all seems to work fine: the screenshots are the same as before but no need to truncate
cache_field
before switching language; I think now we are ready for a serious review.Comment #52
catchYes, we need to make this decision fairly soon. We could move locale_field_attach_load() back into field.module somewhere, so at least we're always guaranteed a full field, and then other modules can still override the handling. Depends how much we want core field module to know about it I guess.
One possible option would be to have field module show all values for multilingual fields by default - so ignoring language apart from populating the #languages array early, and otherwise it just builds the array as usual (so field_sql_storage_field_storage_load() would just populate the array as usual in addition to adding the #languages for the deltas. So in load, every field would get it's own array key, no empty values or filtering at all, then locale module comes along, says
That means, if you have a multilingual site, then one day just disable locale module, suddenly your multilingual content is showing all values for every language - might make your nodes look ugly, but we don't have panicky admins with 'missing' content that way.
The field value for rendering is, but you still have all the translations in [$delta]['#languages'] if you need to get to them. Do you have an idea how we could offer both here?
Yes this is where it gets tricky, and where it might be partially informed by our UI. For a programmatic node_save() - if your field is translatable, then IMO yes, it'd be like any other field value when saving - just always there regardless of the field providing module (or if not set, it's saved as language neutral).
Comment #53
suna) Translated field values need to use the same deltas like the original/source field values. This means there's not much we can stack into field values when a site or object suddenly is no longer translatable. I am not sure whether Translation module in core supported this before. As a user, I would expect that only language neutral values will remain when I disable translatability. That's the default case for both the 'language' column as well as programmatically created objects that do not specify a language.
b) Locale module is for user interface translations/localization. Translatable fields belong to content translation, so if we happen to move the functionality, Translation module has to get them. This particular issue boils down to the more generic question whether Drupal core will provide support for multilingual content by default in the future. Doing this would vastly increase the sensibility and awareness for languages throughout core and contrib. I think this should be discussed in IRC or in a separate issue. Please NOT in this issue.
In this patch:
- Moved implementation of hook_field_attach_load() into Translation module.
- We should be able to safely use $row->language, as the column is defined as varchar not null default ''. Solving the potential $item['language'] name clash.
- Minor clean-ups.
Comment #54
plachTested sun's patch and works as expected, attaching fresh new screenshots (with the user!).
@sun:
with catch we were worried about the case that NO language neutral value survived the translation process, that's why we came up with that proposal; personally, I am more oriented on displaying just the language neutral content and applying some fallback logic on translation module disabling/uninstallation; also maybe: the first value inserted for a field is duplicated both in a language neutral row and in its language specifc row, this way we always have the language neutral value populated (the famous source).
Comment #55
sunComment #57
sunLast one introduced {field_object_language}, a generic table that stores the source/default language for each object.
This one focuses on fetching the information in _field_attach_load(), before objects are passed to the storage method. Needs work though. But leaving PNR, since all of this needs to be reviewed.
Comment #58
sunOptimized method for checking for (overall) object (field) translatability.
Comment #59
yched CreditAttribution: yched commentedin _field_attach_load() :
There's a misleading
// Make sure empty fields are present as empty arrays.
comment.- shouldn't that be
if ($field['translatable'])
?@sun : suggestion - could you name your patches after the issue and comment # ? E.g drupal7.translatable-fields-367595-58-sun.patch. Would make checking progress between patches easier :-)
Comment #61
plachJust tested the patch and fixed some minor problems.
I added the
language_default
option as the second best fit choice intranslation_field_attach_load
.I was thinking about one thing: if we used the newborn
{field_object_language}
table as a registry for all the translations of an object we could store the default/source language, but also some translation metadata (e.g. creation date, author, last modified date, ...).This way we could actually emulate the current node translation functionality through translatable fields without having to convert each single node field to
Field
.The
{field_object_language}
table might become something like a{field_object_translations}
:Comment #63
plachfixed windows newlines...
Comment #64
plachComment #65
catchUpdated to move the #translations key into $object->field_translations as discussed last week.
Quick summary of where this issue is:
All translatable fields are stored in a per-field table with a language column. On load (i.e. in field_sql_storage module), we load all the values in each language into an $entity->field_translations property. So for example:
$entity->field_translations['my_text_field']['en'][0]
$entity->field_translations['my_text_field']['fr'][0]
Then in field_attach_load() itself, we pull the values for the object's default language. This is stored in a new table, which has the object type, ID, and the language it was originally entered in (plach's ideas in #61 look like good possible extensions to this). We decided it doesn't make sense to associate a language with a user (and we definitely can't have tnids for users), but this metadata allows us to populate the field value for each object with a sensible default value before any field hooks need to operate.
This is in the update patch, and it's one simple query to get this information per set of objects being loaded, and only if they're loading translated fields. At this point we can safely cache both the $additions object in the field cache, and the $object itself for something like node_load() caching - because this data is always going to be persistent across requests. No need for extra cache keys or anything like that.
The translations are still available in $entity->field_translations at all times. This means that hook_field_attach_load() and later hooks can pull out translations in the user's language, implement complicated fallbacks like Czech -> Slovak -> Croatian etc. - and they only have to deal with an array of field values keyed by language code - no going back to load different values from the db, no query rewriting. Rendering and everything else always deals with the values in $entity->field_name - so this process is transparent to field modules, themers etc. (although once again the data is available for advanced use cases).
What's still missing is saving of the data (the language for each field value) and a UI of course (and tests, but that really relies on us having the data saving working to an extent so we can load it again). At this point I think it'd be good to know how much we need to do in this initial patch in order to have something committable. I don't think we should try to tackle the UI issue here (and there's a groups discussion open if anyone wants to have a crack at that), probably we need to get saving working.
Comment #67
yched CreditAttribution: yched commentedGreat ! No time to look at the patch right now, just a minor point :
Could we abstract the current code in translation_field_attach_load() into a function that does 'switch $obj->field_name front values with $langcode (with fallbacks)' - it seems we'll need that function anyway - and then simply call that function in translation_field_attach_load() ?
Comment #68
plach@yched: nice idea, I'll work on it and on reviewing the last patch tonight
Comment #69
catchFixed typo in translation.module. I agree with abstracting out the code in translation_field_attach_load(), also translation.module isn't the place for it to live - locale() is required to have languages enabled, so while that's technically for interface translation, it seems like a less heavy dependency.
Comment #70
yched CreditAttribution: yched commented1) I'm still bugged by the fact that if translation.module (or locale, if this ends up in there) gets disabled, you get no $entity->field_name value for translatable fields, which are then 'lost'. field_sql_storage_field_storage_load() should at least populate $entity->field_name with the values for the object's language_default. You'd still get a value, even though you disabled the 'translation' feature.
That would mean moving the code that retrieves the default_value *before* the call to field_storage_load() in field_attach_load(), instead of just after.
2) There's a now a DX difference between :
a) entity types that handle their own caching outside of field_attach_load :
they get to cache an object where the current page's $best_fit language has already been put upfront by translation_field_attach_load() at the end of field_attach_load(), so they need to remember to do the language $best_fit reshuffle themselves after reading from their cache.
b) those that don't handle their own caching :
field cache then stores 'current language'-agnostic values + all translations, and $best_fit negociation happens on the values that get out of the field cache, before we return from field_attach_load().
Makes me think that maybe what currently lives in translation_field_attach_load() does in fact not belong to hook_field_attach_load(), but rather should be the job of a separate field_attach_*() func, field_attach_get_translation() or whatever.
- Fieldable entities call field_attach_load() (or fetch from their cache); this gives an object populated with fields values for the object's default language, plus $entity->translations.
- They then call field_attach_get_translation('fr'), repopulates $entity->field_name entries with translated values. This function does *no* db hit, just invokes hook_field_attach_get_translation(). The current translation_field_attach_load() then just becomes translation_field_attach_get_translation(). And field_attach_get_translation() is then the function I mentioned in #67.
Opinions ?
3) Bikeshed point :
Since we're talking about translations, seems natural to find an array keyed by languages; then, in $entity->field_translations['en'], you find the same data format that you're used to all over the field API.
Comment #71
plachI reviewed and tested #69, didn't notice yched's #70 until now, but incidentally some of the issues raised there are covered by this one:
1) Now
field_sql_storage_field_storage_load()
loads the language default field transations directly in place, if they are available.2) Too late for me to think about this, just note that this patch introduces a
field_attach_switch_translation()
function which accepts a series of language preferences and which is used intranslation_field_attach_load()
for the "heavy work".3) +1 for
$entity->field_translations['en']->my_text_field[0]
; moreover I have the bad feeling that$entity->field_translations
might clash with a field namedtranslations
, what about another name?Comment #72
bjaspan CreditAttribution: bjaspan commentedJust got back from vacation and have not had time to review the new patch or design, so I'll just comment on the bikeshed issues :-).
I agree with $entity->field_translations['en']->fieldname instead of $entity->field_translation['field_name']['en'].
I'm not sure I like the property name field_translations. Currently, the "field_" prefix for fields is imposed only by the CCK UI; if you create a field named (e.g.) "body" then you get $object->body, not $object->field_body. Using field_ as a prefix with CCK and as a special case prefix for field_translations seems poor. However, I do not presently have a better suggestion. $object->translations would work, but that would be the first case of the Field API "polluting" an object's namespace with a constant property name. Perhaps fieldable info should indicate the property name to use, and then the field API would refuse to create a field whose name matches that property.
Actually, that brings up the important point that field api should ALREADY be refusing to create field names that override other property names identified by fieldable info (e.g. for node: nid, vid, type) but is not. Time for a new issue.
Comment #73
bjaspan CreditAttribution: bjaspan commentedFYI, I created #382464: Disallow reserved field names.
Comment #74
plachNedjo, Sun and I had an IRC meeting in which we discussed most of the open issues, these are our conclusions:
field_attach_post_load()
for the future (if the cache node_load() issue does not beat us to do it).field_attach_switch_translation()
tofield_attach_translate()
hook_field_attach_load()
implementation in Translation module, because it's content, because Translation module is swappable (easier than Locale), and because the field translation UI must be provided by some module. But we hope to get rid of Translation module, or better, the old-fashioned entity-translation.$entity->field_translations['en']->my_text_field[0]
$entity->field_translations
sounds like the way to go: it would be good but not essential to be able to define a default property name, since there's no strong reason to expect that this needs to differ between object types.field_attach_translate()
we simply iterate over all fields separately and check whether there are values for the given language (default is the current language) in our field translations. If there are values, we use them - (even) if they are empty. If there are no values, the source/default language values remain.Comment #75
plachThis should cover the points discussed yesterday, the main issue still open is saving.
Comment #76
Anonymous (not verified) CreditAttribution: Anonymous commentedsubscribe
Comment #77
plachIn this one I addressed the saving issue: at the moment all fields being saved are assumed to share the same language or have none, I'm not sure this is the right approach but needing feedback on this.
This is the first patch fully-testable with no need to insert data in the DB with a SQL client: just install the translatable version of Demofield and enable the multilanguage support for the article content type, then start creating some articles and enjoy your translatable (demo) fields :)
(Obviously you'll have to enable the language switch block to see the translations)
Comment #79
Gábor HojtsyBarry got this issue to my attention. With all the Drupalcon DC craze, I'll only have time to come back to this later.
Comment #80
plachI was unhappy with the way the previous patch handled the field language saving, primarily because it involved the field item property
_language
: IMHO that one would have made supporting thenode_save(node_load($nid))
requirement pretty hard to achieve.Moreover I realized that the
$object->language_default
property was another case of entity namespace pollution, following barry's definition :)The current patch:
$form_state['values']['language']
$object->{info['translations key']}[$form_state['values']['language']]
field_sql_storage_field_storage_write
in order to be able to write all the field translations available in$object->{info['translations key']}
, if any (for translatable fields the actual value is ignored and only translations are taken into account)This way on field save all the translations found are just written to the db, allowing us to perform a
node_save(node_load($nid))
(with all the available translations in place) or anode_submit()
(with just the current translation in place) having a consistent behavior.I think the next step is writing down some simpletests but obviously I'd like to have your feedback on this one before starting.
Comment #82
plachUnable to reproduce any error, re-testing...
Comment #83
yched CreditAttribution: yched commentedI don't think that's the right approach for saving.
In this approach, it's not the content of the canonical $object->$field_name that gets written, but the values in the nested $object->translations[$lang]->$field_name.
The patch accounts for that in the case of a regular node form submission by populating form values into $object->translations[form_lang]->$field_name instead of $object->$field_name in field_default_submit().
- it means the values are not where we expect them (in $object->$field_name) for the rest of the submit workflow : hook_field_validate(), hook_field_attach_validate(), hook_field_presave(), hook_field_attach_presave() are screwed. Our current approach for 'load' is also broken on that aspect : if they only look in $object->$field_name, then hook_field_load() and hook_field_attach_load() only see one language.
- it doesn't account for programmatic saves : developers need to remember 'I should put the values in $object->field_foo for non-translatable fields, but in $object->translations['fr']->field_bar for translatable fields'. That's terrible DX IMO.
This is leading to a really convoluted API.
I don't think I see a way to have the three of :
1) $objects that load and save all their translated values in one operation and carry them for all their lifecycle.
2) a reasonably simple DX where field values live in a predictable spot in the $object, translatable or not, for all apis and hooks...
3) an $object structure similar to CCK D6 to ease DX transition to D7
I'm starting to wonder if the solution is something like how multiple values are handled : when it comes to the format of the data in $object, all fields are 'multiple', and single fields are just a degenerate case that only come with delta 0. Likewise, all fields are 'translatable', and have a structure like : $object->$field_name[$lang] = array of values keyed by numeric delta. Non translatable fields only have an $object->$field_name[''] entry.
We sacrifice point 3) above and all the APIs (field-type modules, hooks, templates, etc) have to account for that new structure, and do their stuff accordingly. Maybe supporting natively translatable fields comes at that cost...
Er, is this where we started from, and we just moved in circles ?
Comment #85
KarenS CreditAttribution: KarenS commentedI agree with yched that we need a global method similar to what we do with multiple values so we have a predictable structure to the array whether or not there are translations. Then we just have to adapt all the functions that touch the array to expect the new structure.
I haven't tested it yet, but I'm worried about option_widgets which has to transpose the array structure during FAPI processing -- I'm hoping that won't break completely. I'm guessing no one has tested that yet because I'm pretty sure that code will need some adaptation.
Comment #86
KarenS CreditAttribution: KarenS commentedOops, it's not option_widgets any more, its 'options'. And where I said 'array' I mean 'object', but you get my drift.
Comment #87
plachHere is a rerolled version of the last one just in case anyone wishes to test it.
@yched: all your points are good, I think the main reason we decided to go this way was to preserve the field/cck code from massive refactoring (#3); I think I can see a way to concile the three points, but after what you and karen stated I am not so sure that #3 is a priority anymore. Anyway:
Right, there was a lack of simmetry between load and save: currently after field load the canonical fields are populated, but on field submit we did not match this behavior. In the current patch both
$object->translations[form_lang]->$field_name
and$object->$field_name
are populated, this way we should be ok.Yes, it seems that currently also the load hooks might be screwed up, but if we perform the translation switch-in trick by calling a
hook_translate
before invokinghook_field_attach_load
we should be fine: all the modules implementing the load hook will see the right translation according to the current UI language and fallback rules, if they need a different language they should know where to find it.Absoultely right. I think this is one of the most evident examples of the need of using getters/setters to access object properties now that we have PHP5 objects, but also this solution would involve a massive code refactoring.
So I am thinking about an api function that we might use to populate the object fields, something like
field_store_field_values($fields, $object, $language = NULL)
.The
$object->$field_name[$lang][$delta]
is also fine for me, but I am a little bit scared about the amount of work it would take.@KarenS: as you expected I did not test the options widget, to be honest I never really got into that transpose stuff, if you could provide me some clue or even better test it that would be great :)
Comment #88
KarenS CreditAttribution: KarenS commentedI still see the following code in the patch:
This means the node object will be structured differently for translatable and non-translatable fields, which is what we are concerned about. We need a consistent structure in both situations so we don't continually have to test whether there is a language or not when we manipulate the node. There should be a empty value using the same structure for no language rather than creating a different structure in that case.
However I'm also wondering about losing the top-level position of fields in the bundle. In the current code, you can always be sure that the value of 'field_example' is located in $node->field_example and with this new structure you will need a way to figure out what languages are available to be able to find your field values in the bundle. I am now wondering if it might be better to reverse that so we have (with an empty placeholder for the language when it is not translatable) so that we can still have an easy way to find the values for 'field_example' in any situation:
And a field with no language would be:
Comment #89
plach@KarenS: I understand your and yched's concerns but I would like to be sure that the current approach, which is the result of many endless discussions in IRC with catch, sun and nedjo (at least for the loading part), is totally clear before marking it as not viable, so excuse me if I repeat myself and/or repeat things you're fed up with :)
The main guidelines we followed while designing this were:
A) Avoid per language entity caching
B) Avoid to break anything is currently working
C) Support
node_save(node_load($nid))
D) Support any kind of language fallback rules that one might need
To support A we decided to load and cache all the available field translations at field load.
In order to support B we decided to leave untouched the current
$object
structure and store the field translations in a dedicated property: the current$object
data structure is designed to hold a single language and we need nothing more, unless we have to show more than one translation at a single time. So we decided we can load the best suited values at each content view in the$object
canonical fields and store all the field translations elsewhere; currently this is what happens on field load:$object->$translations[$lang]->field_value
; if a particular field translation matches the source language (the one set while creating the entity), it is also stored in the canonical place:$object->field_value
. These values get cached. When the process ends we are ready to invokehook_field_attach_load
with an$object
in which all the canonical fields should be populated. This should guarantee that nothing breaks.Translation
module implementshook_field_attach_load
and switch into the canonical fields the tranlations for the current language if available in$object->$translations[$current_lang]
. All the following modules see a translated$object
, again with all the canonical fields populated.$object
, we have just to show it.To support C I tried to match as much as possible the load behavior: on entity save anything found in
$object->$translations
is written to the DB plus all the untranslatable field values found in the canonical object properties.On entity submit all the field values are stored in their canonical fields, if translatable they are also stored into
$object->$translations[$form_lang]
.D might be achieved through contrib modules implementing
hook_field_attach_load
asTranslation
does, otherwise we might introduce a new hook, e.g.hook_field_attach_translate
, to be called before attach load.If we have done the things right, the only places you should need to look into are the canonical
$object
fields.Comment #90
yched CreditAttribution: yched commentedplach ; no worries. My suggestion in #83 indeed has a high API change cost, so there's no harm in trying to see if the current approach floats. Actually, I have myself pushed in this direction, so I would not lightheartedly discard it. Plus #89 provides a really welcome summary of where we currently stand.
The problem is that populating one language at a time in the canonical $object->$field_name is not enough for hook_field_[attach_]?_load, hook_field_[attach_]?_validate, hook_field_[attach_]?_presave, etc. If something has to be done about field values in there, it has to be done for all languages, not just the current one.
Take hook_field_load(). This is where file field reads from the {file} table and merges the info into the field value - which natively only stores an fid. This has to be done for all languages, not just the object's default language, or any definition of 'current language'.
Same goes for validation : if all languages are written, then all need to be validated, not just the language that sits in front at the time validation is performed - and we do support validation for programmatic saves too (even if no 'core' fieldable entity uses it right now), so this not just a question of form workflow.
Which means that, in the curent approach, the hook implementations need to take care of both the front location ($object->$field_name) and the translations bucket ($object->$translations[$lang]->$field_name). I really think this is no good, and, again, take my fair share of the blame for pushing in what I now think is the wrong direction :-(
Comment #91
plachSubscribing to Field API Progress and mentoring initiative.
Comment #92
plachI am not so sure we have to apply all the hooks to all the field translations at each entity view/save and I would like to explore this idea.
I'd like to insist on the point that during a single entity view we need just a slice/facet of the
$object
data: if we have the proper translations (according to all the defined fallback rules) in place before calling thefield_[attach_]load
hooks I think we can safely ignore the other translation data.Let's resume the file field example: say the field is translatable, we have an english PDF and a french PDF, each one with its fid; when we are viewing the english version the load hooks will get into the
$object
the english PDF data, viceversa when viewing the french translation we'll get the french PDF data. Why bother about the other translations in this case? Entity-level cache, maybe?On the save side if we let just one translation be created/modified at a single time, we'll have that all the validation/presave/uknow hooks are correctly invoked and again can safely ignore any other existing translation data: if it exists it must have passed through all the above hooks previously to get there, so it must be valid and complete. This way, when saving an
$object
holding many translations we should be pretty sure that all of them passed through the hooks each time they got modified.To ensure this probably we should slightly modify the current save process to be completely symmetric with the load one: we should be guaranteed that the we can perform a switch-out trick (pulling out the data from the canonical place and store it in
$object->$translations[$save_lang]
) after all the other hooks are over.Let's say a basic switch-out is performed by the
Translation
module and let's say the data found in the canonical object fields is considered to be in the source language (and written onto the DB accordingly), if we disableTranslation
we gracefully degrade to the simple no-translation case in which the data is pushed into the DB and pulled out of it without having to worry about its language.Having all these assumptions satisfied would allow us to have a completely language agnostic
field_save
: all the translation logic would be delegated to hook implementations, e.g. theTranslation
one.Talking about good DX? :)
Comment #93
catchI think plach is right about hook_field_load() - the only place we'd need all hooks run on all translations of a field would be for entity level caching - in that case we can just not cache hook_field_load() in the object - which currently isn't cached anyway. This is already accounted for in the cache node_load() patch which has a hook_nodeapi_post_load() for those situations - an equivalent field_attach_load_uncached() could be introduced as well - we'll hit the same problem for user specific information in fields, not just languages - i.e. poll module's $allowvotes. If there are other use-cases apart from that for having everything run on all translations that'd be good to know though.
$object->fieldname[$lang] - means every piece of code which has to grab a field value manually has to account for languages - i.e. $object->fieldname[$lang][0]['view'] in node templates. Do I need to know if a field is translatable to use [''] or [$lang]? I also disagree that this is a predictable array structure - if a field is translatable, but untranslated - then I'll have empty values for $lang all over the place - leading to fallback logic needing to be implemented at a much higher level. We tried really hard to keep the heavy lifting for translations as internal to the API as possible to avoid situations like this. It seems like plach has a plan to avoid too much complexity on programmatic saves - but even if it's a bit more complex, better that than putting the fallback logic on everyone who needs to access bits of the node object directly.
Comment #94
yched CreditAttribution: yched commentedSummary of IRC meetup with plach and sun yesterday - new approach :
1) field_attach_X() has to operate equally on all languages. All hooks and function calls (field_default_X(), hook_field_X(), hook_field_attach_X()) have to be run on each language.
2) No more $object->translations. We go with :
3) We keep this new 'keyed by language then by delta' structure transparent to field_default_X() and hook_field_X(), by changing _field_invoke() so that it calls them for each language separately. What they receive in $items are the values for *one* language (they probably need to know which one, with an additional $language param), and they act as they do know.
3rd party modules that act directly on the object through hook_field_attach_X() have to deal with the new double-nested structure.
4) Still to be ironed out : field_attach_view() / field_attach_form().
Usual use case is : 'display $object in spanish' / 'display the form for $object in french'.
field_attach_view() : we need to extract *one* set of values of each field (witch language fallback logic), and display it.
field_attach_form() : we need to extract values of non transtatable fields and values in translatable fields in a given language
But it would also be useful to be able to have several languages in one form (side by side translation). Even if we don't implement it, we probably need to make sure its possible, possibly with some form massaging.
Comment #95
KarenS CreditAttribution: KarenS commentedI like the sound of this approach, it sounds in concept like something that will provide the flexibility needed but still have a predictable structure that other parts of the code can deal with sanely. Thanks for all the work hammering this out!
Comment #97
plachThis is the first draft patch implementing the new approach described in #94. It needs lots of work (I did not even try to make it pass the tests) but the basic form render/field save/field load/field display code seems to work correctly.
However there are already a few basic points that I'd like to discuss before going on: the current approach has the great advantage of not needing the concept of current language. Basically fields are natively multilingual, so the concept of current translation is something that might be built on top of the basic field layer by other modules (I am looking at Translations).
As fields become multilingual I think Field API should not enforce any kind of restriction on the values that get entered/displayed, just as much as it does not say "only odd values are entered/displayed for multiple fields". IMHO the language selection on field enter/display should be performed by the Translation module or any other one that might need to hook in to implement complex visualization/fallback/form manipulation logics.
The current patch goes this way: on form rendering all the field values for all the available languages are displayed, on entity save all the field translations are written to the DB, on entity load every field translation is loaded and cached, on entity view all the available field translations are displayed.
Anyone wishing to test the patch while reviewing it (I'd appreciate it a lot :) can use the translatable version of Demofield.
Comment #98
yched CreditAttribution: yched commentedYay ! Great work, plach ! I do feel we're getting there.
My code-related remarks so far:
1) Minor : can you set up your editor to remove trailing whitespaces (including on empty lines) ? Removing them in followup patches is a pain.
2) field_create_field() : please use
'translatable' => FALSE
instead of 03) _field_invoke() :
The incoming object might have values for a language and not for others (think: submission of a form with only one language)
So the
// Initialize field translations according to the available languages.
part should act language per language :4) _field_available_languages():
what's $instance['languages'] ? I don't see it anywhere else in the patch. Probably related : hook_field_languages ? Could you elaborate on the need for this ?
invalid by-reference param at call-time :
$function($field, $instance, &$languages);
5) db storage : now would be a good time to actually write the code that limits the number of values *per language* on read / write
(current patch doesn't count values as soon as a field is translatable)
6) field_sql_storage_field_storage_write() / _field_sql_storage_field_storage_write_items()
I don't think we need a separate _field_sql_storage_field_storage_write_items() function.
It currently does one insert query per language, which is a performance loss compared to:
7) Also: it's probably time we start to write some tests ;-) ?
Comment #99
yched CreditAttribution: yched commentedLess technical : The points about form / view will need "more i18n / less Field API' oriented people like nedjo or Jose to come back and provide feedback / insights for UI strategies. Maybe a couple screenshots would help to convey a better idea of where the patch currently stands ?
Also : I don't think we should tie this patch to *the* perfect UI for field translations, nor that both discussions can happen in the same thread. Ideally, support for translatable fields should be committed first with a 'reasonable / least intrusive' UI, and let discussion about the ideal UI ake place in a separate thread. So we need to agree on this minimal acceptable UI for 1st commit.
In short: i18n gurus needed, now :-)
What we can be pretty sure about IMO is :
- we'll need to view 'de + neutral' values only, when current page request (according to lang negociation rules decided somewhere else) is 'de'.
- we'll need to handle the case of a form submission for one language only, and make sure the values for this language are validated and saved properly, while the other languages are untouched. (plach : right now, it can be roughly tested by *not* filling the $form definition with elements for a given language)
So we can start thinking about the internal mechanism for Field API to support this.
Comment #100
plachI am working on yched's suggestions from #98, meanwhile here are some screenshots: the test has been made adding the French language and attaching two fields to the article content type: a translatable demofield and an untranslatable one, as you can see all values from all languages are displayed.
I adopted the approach of displaying all the available values during view/submit to have a consistent behavior: this way all the field_attach_X functions act on all the field translations. I think that with some minor adjustments (e.g. reorganize fieldsets/display the related language on field labels) this could be the basic official Field API UI. As I was saying in #97 other modules might easily hook in field_attach_form and field_attach_view and decide which values/form elements get actually displayed and how.
This way we could support with little effort the common use case of having one single language on object viewing/submitting and the less common, but probably desirable, case of having many languages during a single object view/submission. IMHO supporting both of them from the opposite scenario (Field API displays one single translation at a time by default) would be far harder to achieve.
Anyway I fully agree on discussing the translation UI elsewhere.
On #98/4: sorry for not commenting that part: I thought about a couple of ways for other modules to determine which languages are available for a given field instance: one might be somehow populating the
$instance['languages']
setting, another one might be having a hook invocation right there. Not sure about this point.About tests: before starting, should we fix the existing ones?
Comment #101
yched CreditAttribution: yched commentedHaving several languages on one form raises a technical issue : right now it generates incorrect HTML (duplicate ids) and thus breaks JS behaviors ('add more', possibly d-n-d reorder). This would require adding the language code into id strings, but it means field_default_form() needs to receive the language being treated. That's
// TODO: Should we introduce a $language parameter?
. Note that this language param will be needed by text_field_sanitize() as well, for the check_markup() call.Supporting multiple language in a form might not be that easy (fieldgroups, ordering between translatable and non translatable fields...). Needs some UI thinking.
I think meanwhile we can start thinking about how we can limit form and view to a single language. The $options param for field_invoke() introduced in #392494-18: Fields API needs to allow searching of scalars might be a solution : $options['language'] = 'de' only iterates on 'de' values.
About $instance['languages'] : I'm not sure we need to support a list of languages per field. Going with the set of active languages should be fine in a first pass. If the need arises, we'll see...
About tests : Well, I guess for a start we could add a set of separate FieldTranslation tests, and leave the existing ones broken.
Comment #102
plachImplemented yched's suggestions from #98, added the
$language
parameter, removed$instance['languages']
, and fixed field attach tests (found a couple of bugs in the previous patch thanks to this).TODO:
* iron out the field view/submission multilingual vs monolingual issue
* write field translations tests
Comment #103
plachIn the current patch I worked on tests: I wrote a couple of simpletest for the multilingual field logic and had all the non form-related tests fixed.
I left alone the form-related tests as we still have to iron out the field view/submission issue.
Still to be considered yched's suggestion below, as I need to discuss it in some detail before coding:
Reviews, especially from the i18n gurus, at this point are welcome.
Comment #105
plachA brief summary for people needing to get up-to-date: the issue is currently assigned to me (plach) as I subscribed to the to Field API Progress and mentoring initiative (http://drupal.org/community-initiatives/drupal-core/fields), yched agreed to be my mentor and is providing the main Field API-related design suggestions. The current patch implements the approach that came out from the latest IRC meeting (http://drupal.org/node/367595#comment-1412738), which has been attended by yched, sun, catch and me. The new approach radically differs from the previous one but did not subvert the design guidelines that inspired the latter (see points A, B, C, D from http://drupal.org/node/367595#comment-1364062).
We currently stand at this point: we have pretty functional load/save/delete/validate/sanitize behaviors and simpletests are in place (http://drupal.org/node/367595#comment-1574020), but we are completely stuck on view/submission. The new approach has the great advantage of not needing the concept of current language, as a matter of fact all the mentioned operations are currently performed on all the available field translations, view and submission are no exception (see the screenshots on http://drupal.org/node/367595#comment-1485130). What we have to decide right now is: do we want to keep this multilingual approach for field view/submission, at least when there is no translation module enabled having the responsibility to introduce the concept of current (UI? content?) language, or should we go for a monolingual approach right from the Field API core? Details are available on http://drupal.org/node/367595#comment-1471596.
Comment #106
plachAn IRC meeting has been held today about the open issues reported in #105, the attendees were bjaspan, catch, nedjo, sun and me.
The main conclusions were:
Locale
module not enabled, orLocale
module enabled but only one language enabled) the field view/submission behavior should be exactly the same as the current D7 HEAD: language is implied. For multilingual sites we should try to emulate the currentTranslation
module behavior.field_attach_form
/field_attach_view
. We are confortable with the parameter solution as$language
is not a property of the object: it indicates the part of the object to be rendered as a form or viewed.field_attach_form
fills in a$form
array it receives as parameter andfield_attach_view
generates a structured content array and returns, so both of them should be re-callable provided that no hook alters the$object
.$language
parameter should come from, in the standard workflow, but our initial inclination is a language code as part of thenode/%node/edit
path.The next step after implementing 1 and 2 is defining minimum RTBC. Probably we'll have a patch that is only the framework in
Field
module, with actual translation workflow to follow in later patches, likely toTranslation
module.Comment #107
nedjoNice summary plach. This was a valuable meeting, particularly as it brought together Field API and multilingual expertise.
The idea is pretty similar to what we have now with node translation. Node module has the infrastructure needed for multilingual nodes (e.g., language, tnid, and translate fields) but doesn't on its own implement multilingual functionality. Similarly, field module would provide the minimal infrastructure needed for multilingual fields but not itself provide any UI or implementation.
So this initial patch, probably, should have no UI reflection. It should provide the minimal backend infrastructure needed for other modules to implement multilingual fields.
Then followup patches can put that multilingual support in place, likely in the translation module, and likely closely paralleling what we already have in the current "one node per translation" model.
I've started a wiki page, http://groups.drupal.org/node/22478, to begin to sketch out what that future work might include.
Comment #108
yched CreditAttribution: yched commentedSorry for missing the meetup, and thanks plach and nedjo for these summaries.
While I agree that this patch should not be held by UI issues, the following questions come to my mind :
- how can we define a notion of a 'committable state' for this backend patch ?
- how do we make sure that there is enough community momentum on the followup supporting patches so that we this initial commit doesn't end as a bunch of dead code ? I, for one, have my hands more than full with 'strictly Field API' stuff :-)
Comment #109
plachFor what is worth I give my availability to work on the following patches, obviously I would be more than glad if other big ones joined us. Maybe we could try to get webchick involved, perhaps she is more clear about who might be interested/available to help us with the remaining work.
Edit: I think that when
field_attach_view
andfield_attach_form
and all the simpletests will be fixed we'll be ready for some serious review after which go RBTC.Comment #110
plachThis one implements the design guidelines reported in #106 and yched's suggestion cited in #103.
I added the
$language
parameter tofield_attach_view
,field_attach_form
andfield_attach_preprocess
.The
$language
parameter defaults to the current UI language, for now I left untouched the Node module as it was unclear how to set it and its possibile fallback values, I think this should be addressed in the following patches. ATM the view/submission/preprocess language is implied (actually it's the current UI language).All the field tests are fixed. I think we are ready for review.
Comment #111
yched CreditAttribution: yched commentedThanks plach !
No time for a proper review right now, couple remarks from a really quick look at the patch :
- could you expand on the need for $option['merge_results'] in _field_invoke() ?
-
$return[$id][$language] = isset($return[$id][$language]) ? array_merge($return[$id][$language], $result) : $result;
looks like unneeded precaution, since$return[$id][$language]
is initialized a few lines earlier ?Comment #112
plachNo real need ATM, in the first patch it seemed to me sensible having per-language separate results from
_field_invoke
, in this one the only functions using the result relied on a "merged" result, so I decided to keep the possibility to have separate results for the future but currently we don't need that option.You're right, I fixed it in
_field_invoke
e_field_invoke_multiple
.Comment #113
yched CreditAttribution: yched commentedIf we have no actual use for $option['merge_results'] for now, I'd suggest we remove it and keep only the behavior we need.
Comment #114
plachFine, removed the option and the per-language results of _field_invoke e _field_invoke_multiple.
Comment #115
plachComment #117
plachRerolled #114 but #372743: Body and teaser as fields broke everything :(
Implementing node body and teaser as translatable fields is a task too complex for me to accomplish alone. If at least Dries, Webchick, Gabor and Nedjo don't give their feedback on this one I am afraid I'll have to give up: I can't go on working without knowing if this is going to be supported by the community.
Comment #118
catchplach - it doesn't seem like body and teaser need to be translatable as part of this patch - can't that be left as a non-translatable field?
Comment #119
plachcatch, AFAICT for this one to be committed all the node/blogapi/book tests need to be fixed and tests will work only when all the $node->body[0]['values'] become $node->body[$language][0]['values'] and who knows what else; and you know better than me what a pain that $language may be.
If I understand the current status well, at this point translatable fields involve relevant changes in the Drupal core thus I'd like to know if this issue is considered interesting and deserving further work before going on.
Comment #120
yched CreditAttribution: yched commentedplach: It's true that the patch won't get the attention it needs if tests are broken and it can't stay in CNR state.
But the new node body field doesn't have to be translatable for now, so fixing tests mainly involves changing $node->body[0]['value'] to $node->body[FIELD_LANGUAGE_NEUTRAL][0]['value']. Tedious but possibly not as complex as it sounds. The latest patch in #372743: Body and teaser as fields can be used to identify places in .test files that were altered and thus need to be fixed again - not that much AAMOF.
Comment #121
catchAhh, yes. That's a big reason I didn't want to change the object structure in the first place... let's try to get some 'early' committer feedback then.
Comment #122
plachyched: I hope you're right because I already tried to search/replace $node->body[0]['value'] to $node->body[FIELD_LANGUAGE_NEUTRAL][0]['value'] while rerolling the patch in #117 and nothing seemed to change in the test results...
To be as explicit as possible: fixing all those tests may involve lots of work and I won't be the one who will do that if I am not reasonably sure that the needed time is well spent.
Edit: please forget the sentence above, I had a couple of troubled days
Comment #123
yched CreditAttribution: yched commentedplach : if you have done some of the replacements already, why don't you upload what you have so that we know where we stand and people can help ?
Comment #124
plachI forgot not to panic (and my towel too, probably...)
Here's a rereolled patch which uses FIELD_LANGUAGE_NEUTRAL where needed and should fix all the broken tests.
Let's see what the testbot has to say.
Comment #126
plachOops, wrong patch. Let's try again...
Comment #128
plachthis one should work
Comment #129
yched CreditAttribution: yched commentedplach: ;-)
Comment #130
BerdirDBTNG review ;)
I know it's already wrong, but you should place each method call on it's own line when there are 2+. See http://groups.drupal.org/node/17906#comment-77003
EDIT: Oh, and IN is the default now when passing arrays to condition(), so that isn't necessary anymore.
db_result() should be replaced by fetchField(), but you don't change it, and I did it already in #491556: DBTNG the rest so I'd say you can leave it for now, that patch is already huge ;)
Comment #131
plachHere is a new patch implementing Berdir's suggestions.
@Berdir: any review is welcome ;)
@yched: :)
Comment #133
plachrerolled, testbot?
Comment #135
plachI think I forgot "please"
Comment #136
yched CreditAttribution: yched commentedEr, I just got a reversed epiphany I wish I had several weeks ago :-(.
What are actual use cases for 'translatable fields' if objects "titles" (node title, term name) are not fields and are thus not translatable ?
Translatable user profiles, probably, since user names are language neutral. But how many sites have users willing or able to fill their profile in more than one language ?
Of all 'X as field' potential tasks, 'node title as field', 'term name as field' are probably the most problematic performance wise...
Comment #137
plach@yched: I had a discussion with catch on IRC: he believes there are possible workarounds to the non-field entity "titles" but obviously this needs a separate discussion.
He suggests we should talk with webchick and decide if this issue is core-worthy, if the patch gets in we'll address your concerns in a separate issue.
[My point] A review before the meeting with webchick would be precious :)
Comment #139
plachrerolled
Comment #141
plachHere is a brief description of the current (rerolled) patch: the aim of the patch is providing multilingual capabilities to the Field API. During an IRC meeting attended by bjaspan, catch, nedjo, sun and me it was decided that in this patch we would lay the basic groundwork, simply making Field API support multiple translations of the same field value, letting other modules, most likely Translation, implement the actual translation logic.
We decided to refactor the current entity field structure from:
to:
This approach allow us to consistently handle both translatable fields and untranslatable ones, which get a dummy language code defined into
FIELD_LANGUAGE_NEUTRAL
, just as much as single fields share the$delta
key with multiple ones.This refactoring involved lots of changes in many places, thus increasing the patch size, but most of them are trivial: simply introducing the new language level into the various data structures.
The main relevant changes were to
_field_invoke
and_field_invoke_multiple
(field.attach.inc): as per yched's suggestion, to ensure that each Field API hook/function is called on all the available field translations they now act on each one found in the passed entity data structure which has an allowed language code; the concept of allowed/available language is provided by_field_available_languages
(field.attach.inc), which currently returns a list of the enabled UI languages.This way we don't have to rely on any function call sequence for the whole thing to work, thus preserving Field's API nature. Moreover no language-specfic cache is needed.
The only exceptions to this otherwise full multilanguage handling are
field_attach_view
andfield_attach_form
, which use an additional$language
parameter to narrow the result of the field hook invocations to the single language specified. With this choice the only ones needing to care about the change in the data structure are the field hook implementations, while the theme layer and the Form API data structures are language agnostic.On the performance side this changes should be pretty unrelevant for unilingual sites, for which they should be fairly transparent, for multilingual sites we have an additional level of iteration whose entity is dependent on the number of languages enabled, not that much, probably.
Comment #142
moshe weitzman CreditAttribution: moshe weitzman commentedvery nice patch summary. sounds reasonable to me.
Comment #143
yched CreditAttribution: yched commentedThanks for the summary, plach.
I did another pass of review a couple days ago, and didn't find the time to polish and post so far.
Code-related remarks below, I'll keep higher-level remarks in a separate comment.
- we tried to avoid this kind of stuff in the BAF patch:
Should we use the actual value of FIELD_LANGUAGE_NEUTRAL ? (not sure about '_0', BTW. Can NULL, or emty string be used as an array key ?)
- detail: field_sql_storage_field_storage_load() - I'm not sure that part is actually needed:
I think
$delta_count[$row->entity_id][$field_name][$row->language]++
works even if$delta_count[$row->entity_id][$field_name][$row->language]
is not defined (didn't check, though). Same thing in field_sql_storage_field_storage_query()- field_sql_storage_field_storage_write() - I'm unsure about that part:
IMO the behavior we want is:
no $object->field_name : leave data untouched
$object->field_name = array() or = NULL: empty the field for all (enabled) languages
no $object->field_name[$lang]: leave data untouched for $lang
$object->field_name[$lang] = array() or = NULL: empty the field for $lang
Maybe the code does this, but I can't say for sure.
- _field_invoke() / _field_invoke_multiple(): I don't think we need $options['language*s*'] as an array right now. A single language should be enough.
Also, I'd suggest taking $suggested_languages out of _field_available_languages(), and let field_invoke_* handle $options['language'] itself.
- field_add_more_submit():
seems a bit hackish. I guess we'd need the 'add_js' button to have a #language entry so that field_add_more_submit() can retrieve
$form_state['clicked_button']['#field_name']
?Comment #144
yched CreditAttribution: yched commentedHigher level remarks:
- Can we summerize (er, again ?) how much the feature is tied to locale and/or translation modules ?
what happens with translatable fields before locale/translations is enabled ?
what happens when locale gets disabled / a node is edited / locale gets re-enabled ? any chance of data loss ?
what happens when a language gets disabled / a node is edited / the language gets re-enabled ? any chance of data loss ?
are there other scenarios we need to watch out for data loss ?
- Currently sql_storage_load returns values for all languages it finds in the db. If a language is actually disabled, the data still end up in $object, but escape field_invoke() treatments, which is unsafe. sql_storage_load should only return values for active languages. Similarly, sql_storage_write shouldn't touch values for disabled languages.
I think this should address a couple data loss cases above.
- We have no notion of "translation of the object exists for $lang", only "object exists" and "field_foo has (or has no) values for $lang"
For instance, we don't know which 'see in $lang' links we can provide.
- Probably related: a node is usually created in a single language at first. How do translatable fields appear in other languages: empty ? with the values of the default language ?
- Also possibly related: Gotchas with default values.
Default values are added on object *creation* (not update) - currently, for all enabled languages. Happens in field_default_insert()
Imagine: 'en'+'de' are enabled; field_foo has default value 'default'; node 1 is created in 'en' with field_foo = 'bar'; *then* langage 'fr' gets enabled:
a) field_foo is 'bar' in en, 'default' in de, empty in fr
b) the default value won't show up on the form to 'create' the fr translation.
- Also: do we need default values to be per language ? (ouch)
Some other field settings ('list of allowed values' for "list" fields ?) could be problematic as well, I guess...
Comment #145
peximo CreditAttribution: peximo commentedSubscribe
Comment #146
plachThis weekend an IRC meeting will be likely held with webchick, I'll work on yched's suggestions from #143 after that one.
About #144:
_field_available_languages
currently useslanguage_list
(bootstrap.inc) to determine the allowed languages: with Locale disabled translatable fields get$language = 'en'
and untranslatable ones get$language = FIELD_LANGUAGE_NEUTRAL
. Enabling Locale would allow_field_available_languages
to return all the enabled languages. Translation should implement the basic content translation interface.$language
into account if necessary._field_available_languages
during field load.Comment #147
Gábor HojtsyAgree with Moshe, great summary. I think this approach is sound and could support all kinds of translation approaches with core and contrib modules. It does not in itself provide any translation features whatsoever, just lays down the groundwork, but without the groundwork, we step back from Drupal 6's achievement of translatable nodes.
One thing not clear to me: how does this work with a multilingual site when you add taxonomy terms or users for example, for which I guess there will be no core UI to translate? Will all data be stored in _0 fields, or would we have some problems with data stored under different languages?
Comment #148
plach@Gábor Hojtsy: thanks for reviewing :)
I think we could refactor User and Taxonomy modules to default to FIELD_LANGUAGE_NEUTRAL or the default site language and provide the minimum hook infrastructure needed to allow contrib modules to properly support taxonomy term/user profile field translations.
Another possibility would be: fields are content, Translation provide the basic content translation UI not just for nodes but for any fieldable entity (still emulating the plain old translatable nodes system).
Comment #149
yched CreditAttribution: yched commentedI think maybe one possibility to limitate UI issues and 'how do we handle the huge amount of features/issues we just added by making fields translatable' could be to state that only some *object types* are translatable - starting with nodes.
Thus, a field on an object can carry multilinguial values if
(field is translatable && obj_type is translatable)
.Note that all fields on all object types, translatable or not, would still come with the same structure :
$object->field_name[$lang || LANGUAGE_NEUTRAL] = (array of values).
That's important for DX and predictability of what you find in $object->field_name.
Comment #150
catchWe could do that in hook_fieldable_info() (or hook_fieldable_info_alter()) quite happily I think, sounds like a good plan for now unless someone comes up with a magic UI which works for any entity.
Comment #151
plachI am not sure I understand the need of explicitly marking the entity as (un)translatable: if taxonomy, profile or whatever declare their fields as not translatable don't we get the same result?
If we added some hooks to let other modules change this setting and provide a translation logic we should be fine: with no contrib module taxonomy and profile behave just as field weren't translatable but, say, i18n might mark user fields as translatable and implement the missing UI.
Comment #152
yched CreditAttribution: yched commentedre plach #151:
"if taxonomy, profile or whatever declare their fields as not translatable don't we get the same result ?"
Taxonomy just declares 'taxo terms' as 'fieldable', it doesn't control what fields get actually added to them. Once an entity is fieldable, anyone (Field UI, any contrib module...) can add any field type with any setting to any entity.
So currently taxonomy has no way to say 'Hey, you cannot add translatable fields to terms'.
"If we added some hooks to let other modules change this setting"
Already there: hook_fieldable_info() comes with hook_fieldable_info_alter().
Comment #153
plachHow dumb of me! For whoknows which reason I thought that Taxonomy would declare its fields, Profile the same. End of the story.
Works for me. This is THE solution. I'll implement it as soon as we talk to webchick.
Comment #154
bjaspan CreditAttribution: bjaspan commentedre #136 and the concern that lack of translatable titles means this whole patch is doomed to uselessness: We do not need to turn "titles into fields" to make titles into fields. :-) See http://drupal.org/node/82751#comment-876871. Brief summary: We have a contrib module that adds a new field named "title_field" or something that is translatable (and formattable) and hides the actual title widget on the edit form.
Comment #155
catchAgreed with #154. Additionally - my user might be called 'Barry Jaspan' in nine languages, but might need to have their biography translated. Some workarounds being necessary for translating title for nodes are nothing compared to the workaround that would be required trying to implement tnids for users.
Comment #157
plach[deleted content: wrong issue]
Comment #158
plachI've worked on yched's suggestions from #143/#144:
Using the actual value makes defining the constant pretty useless IMHO, I'd prefer to keep using the constant if there is no strong need to use its value.
About the actual value using
NULL
or''
causeselement_children
to throw warnings, moreover I think that using a 2 characters string is more consistent. Obviously_0
might be replaced by any other 2 characer string not already taken by language codes. Suggestions?field_sql_storage_field_storage_load
throw notices on line 248 e 259.field_sql_storage_field_storage_write()
and its tests to implement the described behavior. I also modifiedfield_sql_storage_field_storage_load()
to load only available languages. This should guarantee that no data loss is possible if a language or even Locale gets disabled.$options['languages']
has become$options['language']
, but taking out suggested languages from_field_available_languages
is problematic because we have to test if the field is translatable and decide if it should use the suggested languages orFIELD_LANGUAGE_NEUTRAL
. IMO this logic should be kept confined into_field_available_languages
.#language
fixed.The only open issues should be:
Marking CNR to see what the testbot has to say.
Comment #159
plachIn this one I implemented the
translatable
fieldable info property which defaults to FALSE. Now every entity wishing to allow its translatable fields to be treated as multilingual has to explicitly set itstranslatable
property to TRUE or rely to ahook_fieldable_info_alter
implementation, as nodes are likely to do.I also added some missing PHP docs. If we fix the default values issues I think we are done :)
Comment #161
sunStraight re-roll.
Comment #163
webchickHere were some stuff that stood out to me on an initial pass w/ yched and sun on IRC. Some notes:
- I haven't read the issue yet: I'm only about pass 2-3 on http://webchick.net/6-pass-patch-reviews. If I'm asking/pointing out stuff that's already been discussed, please point me to the
- I spent about an hour and only have a couple of things to show for it, but I think I have the general gist of what's going on. I'll need another pass to read the issue summaries more thoroughly and then will probably have another list of things.
- Thanks a lot for your perseverance on this patch!!
A few questions...
1. The PHPDoc says that this is "untranslatable" fields, but yet the variable is called "language neutral." Which is it?
2. Is "language neutral" a well-understood term in the i18n world? Would "FIELD_LANGUAGE_NONE" be better? (sun's suggestion) Then it would make sense to module developers who don't know anything about i18n.
3. '_0'? What the heck? :) Can we not just use something like NULL? Or '' as the $languages dropdown does in node.admin.inc?
Either sun or yched (I forget which) pointed out that drupal_render() raises notices with with [''] as an array index. BUt I think fixing that in another issue would be preferable to doing this sort of hack here.
I'm concerned about the impact for themers with this change. Luckily we got render($content['foo']) in, because without it, it would be impossible to print out a node field like you could in D6.
In D6, you would
print_r($node)
and get something like:$foo_value = $node->field_foo[0]['value'];
I don't see any possible way to get that in the current system, because the location of the language code is *before* the delta and index. For translatable fields, yched points out that the $langcode variable could be used:
But this will not work for non-translatable fields. :\
yched is going to ponder some about how we could structure the renderable array so that themers would not be out of luck.
Also, knowing that language-neutral means non-translatable.. why is the body field not translatable?
We try not no longer do COUNT(*) as this is tremendously slow on InnoDB, which is the default storage engine. See #196862: COUNT(*) is an expensive query in InnoDB.. Will counting an explicit column work by chance? Oh. Actually, I see that this appears elsewhere in the tests. Let's file a follow-up to fix this then.
Should be like: "Implement hook_field_languages()." (Implement ... end with () and .) Weird, but true.
Comment #164
catchDoing count(*) inside a test isn't an issue IMO, it's only slow in the sense of "don't want to do it on every page load" slow.
On the node structure issue - early on I'd suggested something like $node->field_translations[$lang][$field_name][$delta] to hold all the field values in all languages, and leaving the $node->$field_name structure the same - just containing the current language or the language neutral version.
But there's two issues with that - yched rightly pointed out that that sort of structure doesn't allow field modules to run properly on translations - for example we want hook_field_load() to run on each translation so everything that's needed is there before we actually display it. We also want translation module or similar to do the work of displaying the most appropriate value of the current language (fallbacks etc.), not have any of that work done in Field API itself.
Between field theme functions, field templates and render() it seems like it shouldn't be necessary for themers to use values directly like that any more, but I don't know how realistic that is.
Comment #165
plach@webchick:
Despite the amazing work going on, without this one D7 simply won't fit my needs :)
FIELD_LANGUAGE_NEUTRAL
: I adopted language neutral as it is the current definition of "no language" in the Translation module; an untranslatable field is content with no language soFIELD_LANGUAGE_NONE
works for me.About its value, as I was saying in #158, the main reason I used a non empty one is consistency: built-in language codes are 2 character strings. Moreover a NULL/empty value might be harder to use with equality/
empty()
/isset()
operators than an actual value.Apart from this I totally agree that
'_0'
is ugly and we can find something better. I used the underscore as I wanted to be sure that no clash with built-in language codes was possible, for the 0, well, you get it :) Suggestions are welcome.$foo_value = $node->field_foo[0]['value']
: a possible approach to this problem is providing the current field language code as a template variable; this way no matter the field being translatable or not we get:$foo_value = $object->field_foo[$field_language][0]['safe']
$language
key).COUNT(*)
s: thanks for linking the other issue but, I agree with catch, in this case we are talking about tests on two-rows tables, if we decide counts are evil then we'll need to polish all the tests...The attached patch introduces
FIELD_LANGUAGE_NONE
, adds the$field_language
template variable, fixes PHP docs and adds a simpletest for a yet untested behavior: multilingual fields storage/retrieval.It also ships a little change in
_field_valid_language
: now it's possible to choose if empty/invalid language codes are replaced with the default language or the current one, this ensures a more predicable behavior forfield_attach_form
while allowingfield_attach_view
calls not to explictly specify the$language
parameter.Comment #166
yched CreditAttribution: yched commentedThanks for the fast update, plach:
- FIELD_LANGUAGE_NONE / '_0': on IRC we agreed that '' (empty string) would make the most sense, if possible. If it *works* (but just produces notices in drupal_render), we might want to go for it and fix drupal_render in a separate patch. I'm not sure I get your concerns about equality/empty()/isset() operators.
- About templates: AFAICT, you added a variable in the field-templates, right ? We probably need it as well, but the main issue is in node templates. There are some cases (less often in D7 than in D6 though) where themers need to access a field's raw values. Right now they need to know the language for each field, to access $node->field_foo[$lang][0]['value'].
My first guess is they need to pick between FIELD_LANGUAGE_NONE and 'the language asked for the display'. But (stop me if I'm on crack) this might become a little more complicated if complex language fallback rules are applied:
when node 1 is displayed in 'fr', field_a values displayed will be FIELD_LANGUAGE_NONE, field_b values displayed will be 'fr', field_c values displayed will be 'en' because there are no 'fr' values...
Even if it's just 'either FIELD_LANGUAGE_NONE or the requested language', it seems that we need have some variables for 'the raw values for field_a in the language being used in the template'
Curently in node templates, the $field_a variable is strictly equal to $node->field_a (raw values for all languages). Maybe we could re-purpose $field_a to be the 'values for currently displayed language only' ?
- "It also ships a little change in _field_valid_language: now it's possible to choose if empty/invalid language codes are replaced with the default language or the current one, this ensures a more predicable behavior for field_attach_form while allowing field_attach_view calls not to explictly specify the $language parameter."
Sounds like it makes sense, but it's a little complex :-) Could you expand the PHPdoc for _field_valid_language() a bit, and add inline comments explaining why field_attach_form() uses one form and field_attach_view uses the other ?
Comment #167
Damien Tournoud CreditAttribution: Damien Tournoud commentedISO639-2 has reserved the
zxx
code forNo linguistic content / Not applicable
.Comment #168
plach@yched:
FIELD_LANGUAGE_NONE: Damien's suggestion suits perfectly my design, using it would allow us to avoid the tricky situations I mentioned above, e.g.:
I reintroduced
field_attach_preprocess
which makes available in $variables the field values as currently displayed.I also fixed the PHPdocs.
Comment #169
yched CreditAttribution: yched commentedI wasn't aware of the ISO639-2 convention, thanks Damien. I fear that 'xzz' will raise massive "WTF is this ?" when people start seeing this in their nodes (all the more than most fields won't be translatable...), but I'll wait to hear what others think.
field_attach_preprocess() looks fine, minor points:
- The PHPdoc should just make it clear that the function adds to the $variables argument by reference. See how we documented that in the PHPdoc for field_attach_load(). It should also explicitely say what variables we add and what they contain.
- We should document those variables in node.tpl.php and user.tpl.php (also recommending the use of $field_name vs $node->field_name)
Comment #170
plachAbout
'zxx'
, IMHO it's the best solution, simply people will have to learn one simple thing among the others. But I won't insist on this point, if you think we should go with''
I'll do it.Now PHPdocs should be ok, except for possible syntax errors :)
I also added a simpletest for
field_attach_preprocess
and the variable$field_translatable
to field.tpl.php.Comment #172
plachrerolled
Comment #173
plachWe agreed with yched to leave 'localized settings' (allowed values, default values) for a later patch. So the last pending issue (apart from those raised by nitpick reviews) should be the "default values on translation creation" one.
Currently if
field_attach_insert
is invoked with$object->$field_name
unset and the field instance defines a default value function callback a default value it's provided. This cannot happen with translations unless we add a similar functionality infield_attach_update
. My (modest) proposal is to leave the current behavior untouched: I think it's acceptable that a language added later has empty values, unless the field values are required, but this introduces the need of some field-level validation which AFAICT it's not properly implemented yet.Default values are strictly connected to language fallback rules for what concerns translation, so I'd suggest that it's Translation responsibility to provide the appropriate values in case of missing ones: it might call
$instance['default_value_function']
asfield_default_insert
does and even take$language
into account. Perhaps I'd just modifyfield_default_insert
to handle missing translations and not only the entire property.Comments?
Comment #174
plachWednesday an IRC meeting as been held, the partecipants were sun, yched and me.
Initially we tried to find a way to provide default values when inserting a translation rather than on entity creation, but we didn't reach an agreement on the approach to follow; however we agreed on the following requirements:
At first we agreed on my statement it's acceptable that a language added later has empty values, where they are not explictly required.
We also agreed that
field_default_insert
should consider if a particular field translation is defined before assigning it a default value, thus when creating a new entity only actually "empty-populated" field translations get a default value. This should ensure a consistent behavior between standard workflow and free API calls.Then yched pointed out that having default values on translation creation rather than on entity creation involves having a way to track entity translations and their status, i.e. more or less what I described in #61. AFAICT sun agreed with him.
This table would allow Field API to know if any value for a given entity has already been entered for a particular language, thus letting it call the appropriate default providers in
field_attach_presave
. If Field API handles that table, it will handle it for every entity that wants to be translatable, core or contrib.My approach is slightly different: I agree that tracking the information yched refers to involves the use of a table like the one cited above, what I don't agree on is the assumption that entity translation is the only kind of translation we might need.
I don't think that Field API should natively implement the concept of entity translation, i.e. a collection of field values sharing a common language, because there are other use cases in which we might need fields with different languages showed during the same page view, and I am not thinking of fallback rules. Rather think of a couple of "Citation" fields which hold two citations in two different fixed languages, say "Latin" and "Greek", among other content fields being translatable in the common sense; if we were able to explicitly associate a language to the citation fields we could also mark them up semantically with an
xml:lang
attribute, otherwise we would have to demote them to untranslatable status, i.e. fields with no language associated, and we would have no way to provide the semantic information language represents.Another possible alternative translation mechanism might be one implemented on a multicurrency multilingual site: think of a site showing different prices for different locales where prices are stored as different values, not converted from a source currency; the content language might be totally unrelated to the currency locales.
I propose a solution which emulates the current Locale/Translation modules relationship:
If we go this way we would make Field API lighter and we would allow different, pluggable, translation systems to hook in, each one having the ability/responsibilty to choose wich entity and/or field handle; this could be easily done by invoking a hook which would collect entities interested in being handled by a certain translation system.
And, if we decide to go this way, to close this patch we just need to slightly modify
field_default_insert
to take into account field translations, as we already agreed to do (the attached patch does exactly this), everything else would be delegated to the following patches.Any feedback is greatly appreciated, however I am working on a proof-of-concept patch.
Comment #176
plachrerolled
Comment #177
plachComment #178
plachAs part of the proof-of-concept patch work I created this side-issue: #533924: Allow different language types to be enabled separately
Comment #180
plachrerolled again
Comment #182
yched CreditAttribution: yched commentedre #174: I don't really know what to think about the 'alternate uses of value languages'.
The 'display price in different currencies' example is moot IMO, because there's no direct and simple relation between languages and currencies.
The other example ('Citation field; this citation is latin, this citation is greek') is a little mind bending, but I guess there are use cases.
Well, this is entering territories that make my head spin. Not to say that the idea that object 'translations' are just one way to handle value languages is bad, just that I'm probably not the right guy to have an opinion about this.
Just to make sure I understand your proposal:
- no Locale.module, no Translation.module: all field values are language-neutral
- Locale.module without Translation.module: translatable entities (nodes...) get *one* language ($node->language or something), pickable by some UI on the node create form (not sure whether this language can be changed afterwards). This is the language assigned to all values for translatable fields in the object. Nodes that existed before Locale was enabled remain 'language neutral', and so do their field values.
- Locale.module + Translation.module: you can edit a node in several languages. Default values are inserted for a given language when the node is first submitted with values in this language. Translation.module maintains a table of 'existing languages' for each entity. It has to work for *any* entity type that wants to be 'translatable' (including contrib entity types), not just nodes.
About the patch itself: please add a comment about the new logic in field_default_insert(). I'm probably tired, but right now I really can't wrap my head around it, even though I very possibly asked for that change ;-) [edit: I mean a comment even more detailed that the one you already added :-p. Something like 'what's the underlying logic, why is it the right way']
Comment #183
plach@yched:
Yes, we definitely need the i18n folks' feedback on this.
More or less that's it, you can find attached my proof of concept work implementing #174. The code is organized in three patches depending on one another (a single big patch is also attached for reviewers' convenience):
field.multilingual.inc
and renaming the function names accordingly.locale.field.inc
file which extends the current multilingual node approach to fields, basically when creating/editing a node all the translatable fields get the node language; any node language change is correctly handled and reflected on the single fields.I didn't touch Translation, instead I created a new Field Translation module which is meant to substitute it; however they could also live together, as nedjo was pointing out in his g.d.o writeup. Field Translation implements almost the same functionality of Translation but without creating a node for each translation, translatable fields are used instead; it relies on a separate module called Entity Translation which exposes an API which allows to track translations for any entity.
From the UI point of view they are slightly different as Field Translations does not use a node form for translations, but it simply exploits
field_attach_form
to let users enter just the field values; however there are a couple of translation-specific options "inherited" from the node form.To test everything (please, do!) just enable Field Translation, multilingual support for the
article
content type, install the Demofield module and start translating content.The main issues left out, as they involve deep i18n expertise, are:
Again your feedback is greatly appreciated, especially if you are i18n experts :)
Comment #184
catchOK I read through the latest patch, and generally it looks great to me. I spoke with plach and had a couple of issues, mainly that #533924: Allow different language types to be enabled separately shouldn't be a dependency - since it's a change with a lot of implications by itself, and it's more about language negotiation than anything else, which already causes problems with our current translation model and doesn't really stop this from being committed.
While the patch is 143k, I'd estimate only around 30-40k is field API changes, the rest is fixing up tests and adding the language key for field values
Comment #185
plachThanks catch! I rerolled the patch to remove the dependencies from #533924: Allow different language types to be enabled separately. I also attached a "light" version of the patch with all the test fixes plus trivial language key addings stripped out to ease reviewing.
I prepared a demo site with the proof-of-concept patch installed, just contact me on IRC or by email and I'll give you the access credentials. However in #539110-7: TF #4: Translatable fields UI I posted some screenshots.
Comment #187
plachI missed the "taxonomy term field" commit. The no-test version of the patch is the same of #185.
Comment #189
plachReroll + some phpdoc improvement.
@yched: It seems we are getting good reviews in #539110: TF #4: Translatable fields UI :)
Comment #190
Gábor HojtsyI've read up on #174 above on (repeated) requests and here is my feedback:
I'd say that implementing field language support and translation on separate levels is a good idea. Just like locale does, if you enable language support, you might not want to also have translations (think a simple multilanguage blog, where you write different posts in different languages, but not translate them; or a more complex site where you have subsections for different languages, but they are independently edited, not related).
On designated fields with a language while the whole node or some of the fields do not have a language (the Greek quote example), I think the language support in itself would ideally cover that. You should be able to set fields language neutral while others might have a language in your content type.This applies to simple product types, where a price might be language neutral (not your product example). In your product example, I think using "price translation" to display different prices for different languages might easily not work, since prices are usually different per region, country or something else, and not language. I don't think we should support that.
I'm not sure I'm totally up to speed on the discussion, so let's see if I was useful at all :)
Comment #191
plach@Gábor Hojtsy
Thanks for finding some time, Gábor! I am really really sorry for having to be so annoying, I tried to avoid bugging you until it was avoidable, but now I feel we are almost done and for me it would be really frustrating seeing months of work trashed just because I left anything untried.
That said, I'd like to thank again you and all the people giving their feedback here, especially yched, catch, sun and nedjo for having subverted their priorities to allow me to work on this one :)
It seems we are agreeing on the approach but not on the use cases, better that than the opposite :)
yched will tell us ;)
Comment #192
nedjoI've read the patch through for the first time in several months.
Most of this patch comes down to a very simple change: adding a language code key to the field array structure. I didn't completely follow the various iterations and back and forths on array structure, but the result here is straightforward and field module authors should find the changes easy to grok and work with.
Together with the accompanying patches, I'm convinced this is the right direction and a major step forward for multilingual support in Drupal. This would be the single biggest multilingual improvement in D7.
I read the full patch and it seems to be in very good shape. Here are the issues I found. Most are trivial, and the rest should be easy to fix.
In this documentation of an example field form, we should document what the $language variable means. Maybe:
// The language code of the object being edited. We key the form array by language code.
Should be two sentences:
Is sending an uninitialized $null variable as an argument what we do elsewhere when we need to pass a null variable by reference?
Should have a period:
+ * Whether the field is translatable.
'[' . $language . '][0][value]' in many places in the tests is a bit hard to read and might be better with double quotes:
"[$language][0][value]"
field_multilingual_available_languages should be field_multilingual_available_languages().
"Test multilanguage fields logic." might as well have single quotes.
This should be full sentences:
Do we need to add $language to the function signature if it is not used in the function?
This new hook needs to be documented in field.api.php.
Since what we appear to be doing is allowing modules to alter the element, I wonder if this could/should this instead be a drupal_alter() style hook.
In various places we have hard-coded the language code for the body field. Could we instead load this from the field definition? Doing so would make it easier in future to change this to a translatable field.
On second thought, though, I guess this wouldn't help, since, if the field is translatable, what we want is not the default value but the language code for the specific node. I guess any change will need to wait until body is translatable.
Good explanation. One of the sentences is a run on. Here's a fixed version:
Many existing core messages lack punctuation, but I think this (and various others introduced in the patch) should have a period: 'Field translation in an unavailable language ignored.'
It isn't self-explanatory what we're doing here. Suggested comment:
// Add the field form element as a child keyed by language code.
I don't understand this comment. Where might an empty string come from?
A final comment on variable naming. This patch generally uses $language as the variable name for the language code of a field's language. Elsewhere in core $language is often used for a language object while $langcode (or, occasionally, $lang) is used for a language code. The use of $langcode helps prevent unintended conflicts with the global $language variable (an object). Should we consider using $langcode here?
Again, on my reading only minor tweaks are needed for this patch to be ready. Thanks Francesco for your dedicated and innovative work on. I think we're in the home stretch.
Comment #193
plachApart from the following all the suggestions from nedjo's #192 are implemented.
I've asked on #drupal-dev and it seems it's an accepted way, I used that form because I had seen it somewhere in Drupal, however I added an explicit initialization.
Yes, it's a hook implementation.
Yes, this is what we decided in #117 and following.
We have an ok from the i18n point of view, we need a final ok from the Field API side.
yched?
(and a green light from the test bot, sure :)
Comment #194
plachI forgot to document the new $langcode parameter in field.api.php. Code is the same as #193.
Comment #195
yched CreditAttribution: yched commentedThe patch is ready for me on the Field API side.
The change is fairly intrusive, though (field values move from $object->field_name to $object->field_name[$langcode] for *every* field).
So the decision to actually commit it IMO depends on whether we are confident that #539110: TF #4: Translatable fields UI has enough momentum to ensure a stable and sound 'content translation' feature for D7. It seems to be getting positive feedback from the right folks so far, and plach shows great dedication and turnaround times. Disclaimer: I myself lack time and i18n wits to take a decisive part in that other thread or assess its viability.
Once again, kudos to the *amazing* work and energy plach is delivering on these threads.
Comment #196
alexanderpas CreditAttribution: alexanderpas commentedsubscribe
Comment #197
plachAlthough I'd hate not seeing this committed I perfectly agree with yched. As one might expect #539110: TF #4: Translatable fields UI is showing all the strength but also the weak points of the current approach. I don't know if we should wait until that's rtbc to commit this one but surely we should have a good answer for all the open issues.
@yched: thanks again for finding the time for this, despite your amazing work on the "pure" field api (and ui :)!
Comment #198
nedjo@platch, "surely we should have a good answer for all the open issues" in #539110: TF #4: Translatable fields UI
Well, only if those issues suggest problems here. #539110 is fleshing out the UI and data storage for field-based translation, including what could be viewed as limitations, e.g., it doesn't handle non-field data. But so far as I can see none of the issues there suggest limitations or errors in this patch.
My view is that translatable fields can go in even if for we don't get #539110 for D7. Yes, a full solution in core would be ideal. But if necessary a field translation module could live in contrib for D7. What can't be done in contrib is the architectural work of this patch. And for that, #539110 has IMO only confirmed the approach taken here.
One question I'm not clear on is the upgrade path. Does this patch need an update for existing fields?
The recent changes look great and I'm happy with the explanations in #193.
There are a couple of places where a tab has crept in instead of a space, e.g. in testPagePreview():
Comment #199
sinasalek CreditAttribution: sinasalek commentedsubscribe
Comment #201
mattyoung CreditAttribution: mattyoung commentedsubscribe
Comment #202
sunSorry, I got sidetracked into some other stuff. My plan is to help roll this in today.
@plach: Tests seem to fail for the last patch in the meantime. If you could re-roll during the day, that would be awesome! :) I'll most probably start work this evening, ~19:00 UTC.
Comment #203
plachRerolled patch, I fixed the tabs and the tests. Feel free to set RTBC again once the testbot gives its ok.
@nedjo:
IMO, this should be part of #366364: [meta] Data migration from D6 contrib CCK fields to D7 Field API. Probably all the old cck fields should be converted to untranslatable fields.
Comment #205
plach#516138: Move CCK HEAD in core went in. I did an almost straight reroll, I had to manually merge
field_default_insert
and add a$langcode
parameter tofield_get_default_value
. No time to test locally. Let's see if the testbot is happy. In that case we can go RTBC again.Comment #207
sun3 fails in "Enable/disable modules"
Comment #208
yched CreditAttribution: yched commentedI've sometimes seen "Enable/disable modules" tests act a little edgy. Requiring retest.
Comment #209
webchickIndeed, something's effed with testing bot atm. I've shut it off.
Comment #210
sunVery nice.
Cleaned up some PHPDoc and variable names.
Comment #211
sunSome further mini tweaks.
I'm also attaching two convenience patches, since more than 50% of those 168 KB are changes in other modules and tests to account for the new field array structure:
- drupal-tf.essentials-d6.patch: All essentials, including relevant test files.
- drupal-tf.essentials-minus-tests-d6.patch: All essentials, only including the really relevant changes in field tests.
Comment #212
sunThose convenience patches were really helpful. Found some more coding-style issues and reworded some PHPDoc.
Comment #214
webchickOh, testing bot...
Comment #215
sunTests pass!
Re-rolled for removed $check argument to check_markup().
Adding another convenience patch for final review.
Comment #216
zoo33 CreditAttribution: zoo33 commentedSorry, but I wasn't able to test this because the field configuration page is broken when the patch is applied. (It works without the patch.)
Steps to reproduce: Update CVS. Apply patch. Run install.php. Go to Structure -> Content types -> Page -> Manage fields. Click edit on body field (or a newly created field)
Error message:
PHP Fatal error: Cannot pass parameter 8 by reference in /path/drupal-7/modules/field_ui/field_ui.admin.inc on line 1216
The call to
field_default_form()
on line 1216 seems to miss the new $langcode parameter:Comment #217
sunThanks for testing and catching that! Of course, the new Field UI wasn't completely covered yet.
Back to RTBC.
Comment #218
sunCross-posting a summary of this patch I sent to Dries: (please correct me where I may be wrong)
- Currently, translations of content are stored in different entities, which means that the same content exists multiple times for each language. To achieve content translations, Drupal tries to put all translations into a translation set, which in effect, are just different views on the same content (translatable strings change).
- The current translation approach of multiple entities leads to major issues for additional information that is attached to nodes by contributed modules: For example, file attachments or votes/rating may need to be properly (and manually) synchronized between the nodes in the translation set. Or when considering node references, it becomes obvious that one needs heavy workarounds to try to find a suitable translation for a node that was referenced in the original node (in the source language). In case no translation of the referenced node existed when the translation of the referencing node was created, but the referenced node was translated later on, then the referencing node may still point to the untranslated node that existed when the translation of the referencing node was created.
Long story short: Our current content translation approach is not canonical. Translations of content are displaying the same entity, just in a different language.
- We now have field API in core, which allows us to solve this issue by only storing translated values for translatable fields.
- This means that a user is able to specify which fields should be translatable and which should not. A Link, Node Reference, or User Reference field may not need to be translated, while most probably all text fields will.
- The "Translatable Fields" (TF) patch lays the ground-work for some further patches in the queue (most are at 90%). It is a low-level patch with no UI changes or whatsoever.
- With TF, we allow all field values to be translatable. To achieve that, they are stored and treated like multiple value fields, i.e. they are normalized into own field tables and get an additional 'language' column for each value (all of this is handled by the field storage engine).
- To account for language negotiation and yet untranslated but translatable fields, we store, load, and cache all field values for all languages. This also allows for advanced use-cases, where one might want to display information in another language, or even multiple languages at the same time. A good example for last advanced use-case would probably be a translation UI that allows to translate fields into multiple languages, which is kinda nice :)
- Since we load/render/store all languages, the conclusion is to make Drupal's field system fully language-aware. That means, field values are no longer stored in $object->field_foo[0], but in $object->field_foo['en'][0], and we pass on the "default" language to use for rendering - which is probably the reason why I am writing this mail (because that leads to a major API change we badly need to do).
- Since the i18n sprint, Francesco Placella (plach) focused on the TF patch and did an tremendous, awesome job in puzzling all parts together. The field API maintainers/developers AND the major i18n developers held countless of follow-up sprints in IRC to discuss the concept, related issues, and the resulting solution that plach implemented in TF. The patch has strong support from both field API and i18n developers and will be a major step towards a sane handling of translations in Drupal.
- Again, the patch only deals with low-level changes. It provides no UI to use the feature. However, I am confident that at least some of the other patches will make it in (I'll personally focus on those, because it is the top priority for almost all of our clients) - and even if those other patches won't make it: The functionality added by them can live in contrib. But TF cannot - because it is the building ground.
- Aside from mind-blowing discussions about translations, I just ensured that the patch is in a committable state. It's insane to review this monster; therefore a "convenience" patch has been created that contains the bare essentials (API changes) only.
Long story even shorter: In the name of many, international Drupal users: this is what we want. I kindly ask you to push the button. :)
Comment #219
webchickOk, spent another couple hours with this patch tonight, with tha_sun in IRC. I read through the whole thing, but the vast majority of it is brain-dead stuff: adding the new $langcode parameter to various field API functions and reformatting arrays to use it, adjusting the tests so that they pass, etc. The underlying patch itself seems like it is actually not much; just nesting things a level deeper, underneath langcode so that properties can be accessed independently. Yet, this simple approach has very nice benefits as outlined by sun in #218.
I want to extend a sincere thank-you to all the people who worked on this patch, particularly plach for being an amazing and constant force throughout the effort. This patch has support from a major chunk of our i18n team, both field API experts, as well as our Drupal 6 maintainer who knows this stuff better than just about anyone. This issue is a marvelous example of team work!! :)
Here are the fairly minor things I managed to find in going through, but other than those I really can find no objections. I am a n00b when it comes to the translation system, though, and because of that (as well as the fact that it's 168 K :P) I would really like to get Dries's sign-off on this before committing.
Ah! Great! This resolves my one major issue with the patch the last time I reviewed it, which is that there's no way for themers who know they want to print a field in a template but don't know what the langcode is, to print it.
(minor) How about just remove the comment rather than fixing it, since it's totally useless? :P And in fact, how about doing this as a separate "novice" patch since it has nothing to do with this one, and this one is quite big enough, tyvm. :)
I'm happy for us to conform to existing standards. However, what this means in practice is that var_dump($node) is going to show stuff like
$node->body['zxx'][0]['value'] = 'bananas';
We'll need to make absolutely sure that we document this well in the theme/module upgrade guide as well as the general module/themer information because that will be a complete WTF.(minor) "Consider storing language as integer."
Why are you doing this in two places?
I'm on crack. Are you, too?
Comment #220
plachThe current patch is a straight reroll of #217, it just implements the
// Ivnalid
and(minor) "Consider storing language as integer."
webchick's suggestions. The code is the same as in #217.Well, actually your concerns where addressed in
field_attach_preprocess
by populating the$variables
array with the field translations to be displayed, using a similar technique, though :).The first place is testing that no unavailable field translation is loaded. The second one is not a real test by itself, rather it's preliminary insertion to test the missing data behaviors.
Comment #221
Dries CreditAttribution: Dries commentedI actually looked at this patch the other day, but didn't write a review (yet). Fundamentally, I agree with the approach taken and support this patch being committed. My big question was around performance -- I'd like to make sure it doesn't significantly degrade performance for both non-translated and translated sites.
Comment #222
webchickThis patch needs a quick re-roll after the regional path change.
And is someone available to run basic benchmarks before/after the patch with/without a language installed?
Comment #223
catchRe-rolled.
On performance, I'm a bit pushed for time at the moment, but I think due to the field upgrade path, we'd need the same Drupal 6 database with some devel generated content updated to Drupal 7 (one with and one without the patch) - then benchmark 30 nodes (= 30 fields) on the front page. Ideally the cost for an untranslated field should be close to zero. This is going to be a bit tricky, since it's currently impossible to upgrade from Drupal 6 to Drupal 7 in HEAD, so not sure how best we can deal with that until the upgrade path is back to working.
With translated fields the way this is designed, the idea is that you absolutely avoid either 1. additional queries when loading data 2. additional joins/conditions to determine which language/tnid to display 3. adding language keys to caches to avoid lots of misses. So the main penalty would be from loading more data from the database and a bit of processing - although if we consider all translations part of the same entity, this is much the same as loading a very long article compared to a very short one, or any content type with a additional fields attached to it which might not always display on every page. I'm not sure how easy it'd be to benchmark a translatable field with additional languages installed since there's not an easy way to populate the data (without filling it out manually), I'd be more comfortable if we could compared HEAD-HEAD with standard mono-lingual data before a commit though, will find people in irc I guess.
Comment #224
catchBotty.
Comment #225
webchickcatch informed me that the upgrade path between D6 and D7 might be hosed atm: #517742: Upgrade failing if blocked_ips doesn't exist If that happens:
a) Try applying the patch in the other issue and report back your results.
b) If that fails to fix it, do a HEAD / HEAD comparison.
Ask around in #drupal if you need help!
Comment #226
catchHEAD / HEAD comparison you can do with real data by just manually adding ten articles to the front page in each install, a bit clunky but good enough for a sanity check.
Comment #227
janusman CreditAttribution: janusman commentedRan the benchmark: HEAD (unpatched) vs. Patched HEAD.
Note: I have not read thru the patch and just got directions from @Catch on IRC, I hope this is correct and somewhat helpful.
First test:
* Fresh install, "Full" configuration, MySQL
* Created 30 article nodes, using lorem ipsum generator.
* Configured to show 30 nodes in front page, anonymous user cache is disabled.
* Ran ab -n 100 -c 10. Results::
Second Test
* Applied patch from: http://drupal.org/node/367595#comment-1953376
* Dropped the whole database, re-installed Drupal.
* Again, created 30 article nodes, configured to no cache, 30 nodes in front page.
(Nodes were a bit different in length, filled with random "Lorem Ipsum" text)
* Ran ab. Results:
Comment #228
janusman CreditAttribution: janusman commentedAs per @catch's request, ran again with different settings for ab: -n1000 -c1
My feeling is that there's discernible difference under my test conditions...
Results:
Unpatched HEAD
Patched:
Comment #229
catchTest bot is still happy, and those benchmarks look fine to me. If I don't get beaten to it with a commit before Sunday, i'll try to run some as well just to verify, but putting back to RTBC since there's no serious regression here in terms of performance for untranslated fields. If anyone has time to create a translated field in a couple of languages and benchmark that, it might also help - I guess we'd have to benchmark a translated field with two translations, vs. and untranslated field, which isn't really a fair comparison given locale will be enabled and etc., but not many other ways I can see to do it.
Comment #230
webchickAwesome! Thank you so much for jumping in here, janusman!
Committed to HEAD! WOOHOO!! :)
Comment #231
yched CreditAttribution: yched commentedOMG.
Comment #232
plachW O W !!!
Comment #233
catchThis needs documenting in the upgrade guide, and maybe a CHANGELOG.txt entry.
Comment #234
sinasalek CreditAttribution: sinasalek commentedAwesome :), Thanks you guys
Comment #235
xmacinfoI've made a CHANGELOG.txt patch.
Comment #236
catchLooks good.
Comment #237
webchickAwesome, thank you! Committed to HEAD.
Back to needs work for docs.
Comment #238
plach@webchick: should I write the documentation myself or am I supposed to get in touch with the documentation team?
Comment #239
plachEr, I just noticed that the patch that went in had a missing hunk :(
It was a one-line change in
template_preprocess_node
, the attached patch should fix it. I did not open a new thread as the missing line was part of the patch until #183 and had yched's approval, it was lost during the proof of concept work. I'm really sorry.Comment #240
sunLooks good.
Comment #241
webchickThanks, committed! Back to needs work for documentation.
plach: Up to you. The customary thing is for one of the people who worked on the patch to amend the upgrade docs accordingly. But since you already put in approximately one bazillion hours on this patch, if you want to leave it for a member of the docs team to clean up, they're normally quite attentive to the "Needs documentation" queue.
Comment #242
plach@webchick: I would be glad to provide the documentation myself, but I am really full with the TF side-issues. Moreover a skilled doc team member is going to achieve a far better result than me. However I am fully available for any needed feedback.
Comment #243
plachComment #244
yched CreditAttribution: yched commentedHm, we should probably add the field_attach_preprocess() call to all other fieldable entities (user, comment, terms), even if they're not translatable.
Comment #245
Anonymous (not verified) CreditAttribution: Anonymous commentedThis patch broke taxonomy term fields with autocomplete widgets. It may also have broken options (radio/check) widgets, because I get WSOD when I try to edit a node with a list field and that widget.
On taxonomy term autocomplete widgets, the function taxonomy_autocomplete_validate is looking in $form_state['values'][$element['#field_name']]['value'] but it needs to look in $form_state['values'][$element['#field_name']][$langcode]['value']. However, it doesn't know a $langcode. Do I just force it to be untranslateable and use the FIELD_LANGUAGE_NONE constant?
Comment #246
plach@yched: perhaps we should open a new issue?
@bangpound: in
$form[$element['#field_name']]['#language']
there should be the needed $langcode. I'll try to give a look to the options widgets ASAP. If you could file two new issues and tag them translatable fields it would be great.Comment #247
plachCrossposting bangpound's issues:
#557924: Options widget broken
#557932: Taxonomy term field autocomplete widgets never validate, always lose values.
Comment #248
Anonymous (not verified) CreditAttribution: Anonymous commentedThere's a conceptual problem with translatable fields for references, and I think sun alluded to it in #218. I'm considering it for taxonomy terms, but it could apply to any reference field to a translatable entity.
If a vocabulary and its terms are translatable, the fields whose allowed values include that vocabulary shouldn't be translatable. For example, a toponym (place name) vocabulary lists names of countries. The countries are the same whether the language of the user is French or English or Japanese. A taxonomy term reference in an English node to "Bangladesh" should not change its value for the French version of the node.
However, tags are a different problem. The user's judgment determines how something is classified, and individual judgment can be affected by culture such that (with multilingual social sites for example) the tags for a node in English cannot necessarily match those in other languages. Tags could be slang, impressionistic...
We can talk more about this at Drupalcon. This may ultimately be a documentation issue.
Comment #249
yched CreditAttribution: yched commentedI'd say some taxo fields (taxo-term-reference, actually) should be translatable, some others not, that's hardly predictable and really depends on the kind of categorization. It should ultimately be left to site admins.
Comment #250
xmacinfoThis might be true in some circumstances while not at others.
Take USA for example:
English: “United States of America”
French: “États-Unis d’Amériques”
Some sites would require localization of the countries whereas some other sites will use the original country name.
Comment #251
Gábor Hojtsy@yched and @xmacinfo are totally right, the site admins should decide which fields are translated and how. This absolutely depends on the use case.
Comment #252
plachThe current approach should allow us to translate the single terms of an untranslatable taxo field or make the taxo field translatable and provide different values for each node translation.
Comment #253
Anonymous (not verified) CreditAttribution: Anonymous commentedI agree that it's a problem that's specific to each site. So I've enabled multiple languages, set up one of my content types to be multilingual and translatable. How do I change the translatability of a field? Is this a missing feature?
@xmacinfo your example is what I'm talking about. The relationships between the node and the term is not translated but the name of the term is translated. The "thing" isn't language specific but the name of the thing is.
Comment #254
plachThe translability of an item is not currently handled by the UI, obviously it should. The UI have also to take into account the presence of translation handlers which enable translatable fields to be actually translated (see
field_multilingual_available_languages
andfield_multilingual_check_translation_handler
for details).Comment #255
xmacinfo@bangpound: I think I do understand what you mean. :-)
What are all the UI elements missing? Are there things that can be done later, after the API is frozen?
Comment #256
quicksketchThis patch introduced a new blocker for #391330: File Field for Core. See #559506: Field Widgets Need $langcode variable. If some of the authors of this patch could give that a review I'd appreciate it.
Comment #257
sunThe UI for translatable fields is tackled in #539110: TF #4: Translatable fields UI. Please follow-up over there.
If you want to participate in other i18n issues:
http://drupal.org/project/issues/search/drupal?issue_tags=i18n%20sprint
http://drupal.org/project/issues/search/drupal?issue_tags=translatable%2...
Please only comment on this issue if you want to point to regressions or other issues that need help (like quicksketch did). Thanks.
Comment #258
plachPlease, leave this issue for the documentation work.
For any bug report file an issue tagged translatable fields: I keep checking the queue for issues tagged that way.
Comment #259
yched CreditAttribution: yched commented- re #244 / 246 : #560608: add missing calls to field_attach_preprocess()
- @plach: I think at some point we discussed the fact that only nodes would hold translatable fields for now, to limitate UI work. Sorry if I'm on crack and that was dropped later on ?
Comment #260
int CreditAttribution: int commentedadd tag
Comment #261
LeeHunter CreditAttribution: LeeHunter commentedadded tag
Comment #262
Joe Dreno CreditAttribution: Joe Dreno commentedComment #263
plach@geto:
The documentation has not been updated yet, so we should keep the status to needs work.
Comment #264
ricardodpsx CreditAttribution: ricardodpsx commentedHi, I have an alternative and less Traumatic solution for this issue, I developed it for Drupal 6 and is working very good in a real project.
The propossed module futfill most of the requirements and is currently working in a real Drupal 6 instalation, the idea can serve as inspiration to solve this issue on Drupal 7.
The module has some advantages, like these:
1) Its a module and doesn't need to modify the core.
2) The translation logic is hidden to the entire system, so it works whit most of the existing modules without additional handling.
The module is in "proposed" state so tell me your opinions here: http://drupal.org/node/752122
Comment #265
plach@ricardodpsx:
This issue has been commited to Drupal 7 core in August 2009, it is in status 'needs work' because it lacks documentation as some related issues are still pending.
Comment #266
BenK CreditAttribution: BenK commentedKeeping track of this thread.
Comment #267
plachWe have the documentation finally: Translatable Fields.
Comment #268
LaurentAjdnik CreditAttribution: LaurentAjdnik commentedSubscribing
Comment #270
plachThe TF UI is provided by http://drupal.org/project/translation.
Comment #271
yched CreditAttribution: yched commentedWow, long shot, but this actually semi broke 'default values' : #1253820: It's impossible to submit no value for a field that has a default value
Comment #272
yched CreditAttribution: yched commented(fixed wrong issue link in previous comment)
Comment #273
umbilu CreditAttribution: umbilu commented239: translatable_fields-367595-239.patch queued for re-testing.
Comment #278
sun