The idea is to conditionally load translated values when loading values.

Menu links, variables, and any other arbitrary module data - as long as the values are stored in own columns.

This actually works. Trust me. :)

function ot_query_menu_links_alter(QueryAlterableInterface $query) {
  $tables = &$query->getTables();
  $fields = &$query->getFields();
  $expressions = &$query->getExpressions();

  // Replace ml.* with actual columns, excluding 'link_title'.
  $columns = db_query("SHOW COLUMNS FROM {menu_links} WHERE Field <> :exclude", array(':exclude' => 'link_title'))->fetchCol();
  unset($tables['ml']['all_fields']);
  $query->fields('ml', $columns);

  // Join on menu link translation table.
  $query->leftJoin('ot_menu_links', 'tml', 'tml.mlid = ml.mlid AND tml.language = :language', array(':language' => $GLOBALS['language']->language));
  // Conditionally use translation or source value.
  $query->addExpression('CASE WHEN tml.link_title IS NOT NULL THEN tml.link_title ELSE ml.link_title END', 'link_title');
}

The patch just adds a query tag "menu_links" to all menu link queries and adds the current language code to menu tree/links caches.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Status: Needs review » Needs work

The last submitted patch failed testing.

catch’s picture

subscribing.

mattyoung’s picture

.

Crell’s picture

Amazingly I think it would actually work. :-) The CASE statement in particular worries me for performance, though. I'd want to see core benchmarks on this to see what if any impact there is when this hook runs.

plach’s picture

subscribe

sun’s picture

Assigned: Unassigned » sun

Further discussion with Crell in IRC:

  • We want to use Schema API to "tag" all table columns (and perhaps tables) that contain translatable columns.
  • We use Schema API to replace table fields wildcards in the known tables in the query alter, replacing the translatable fields with CASE statements.
  • Someone should verify that CASE actually is a proper ANSI statement (AFAIK, it is).
  • This hook_query_alter(), along with a UI for it, will live in contrib.

What we will do as preparation for this in D7 core though is:

  1. Add query tags to all database queries that retrieve translatable values.

    Note that the original patch already used an optimized approach: The tag "menu_links" was added to three different queries in menu.inc, which all belong to different functionality, but all retrieve menu links.

  2. Replace db_query() invocations with db_select(), when there are translatable values in the fields, because only db_select() queries are alterable.

    For example: Filter formats have a user-customizable title, so when formats are queried, db_select() should be used.

  3. Add the global $language->language code to cache ids, where query results are processed and cached in the database.

If we can agree on that, I'd roll a patch.

Thoughts?

sun’s picture

oh, and we may need a schema change in {menu_links}, because menu links not only have titles, but also descriptions, but the description is serialized into the 'options' column. We don't want to translate the options, just the title + description.

sun’s picture

Title: Can we abstract this? » Prepare Drupal core for dynamic data translation

nedjo raised the question whether we need to add a condition for queries that (also) run on editing pages. A very good point. As of now, I assume that we have sufficiently separated queries running for different cases; for example, menu.inc's menu_tree* functions only run when entire menu trees are retrieved, and (some of) menu.module's queries only run when you intend to edit a menu link. The latter we don't want to alter, because it's the job the contrib module to load translated values and attach and provide those (somehow) in the UI.

Damien Tournoud’s picture

* you want COALESCE, not CASE
* I believe it would make a lot of sense to add additional columns instead of JOINing to a different table (especially because it allows you to properly support conditions on translated columns...)
* Note that there are technical limits to the size of a row in MySQL that might make the previous idea impossible

nedjo’s picture

This is a great idea.

Some questions I discussed with sun (no solid answers yet):

1. Language type. If/when #282191: TF #1: Allow different interface language for the same path is applied, we'll need to distinguish between which language type we're wanting to translate to. E.g. if we're using this approach for field titles, we'll want the UI language when editing but the content language when outputting a field label for node view.

2. Serialization. We have the challenge of what to do with serialized fields, which might mix some translatable strings with other non-translatable data. E.g., serialized views and field data.

Other thoughts:

I think ideally we'd have a single query tag to use. E.g., 'translate'. If we've marked translatable fields in the schema, that should be enough. (Or possibly we need one for each language type per #282191.)

We'll need to do something to facilitate storage of the original strings for translation. We need to know when a source string has been entered or updated, so that a contrib module can respond accordingly by e.g. adding the source string to the list of strings to be translated or marking existing translations as outdated. We could consider adding an invocation to drupal_write_record(). module_invoke_all('write_record', $object, $primary_keys) or possibly a translation-specific invocation.

sun’s picture

Issue tags: +D7 API clean-up

Tagging absolutely critical clean-ups for D7. Do not touch this tag. Please either help with this or one of the other patches having this tag.

nedjo’s picture

Alongside translatable fields for content, this issue looks to be the breakthrough in thinking we've been needing for localization in Drupal.

Localization of UI strings has two major problems:

1. Because translation is done after the fact, it's possible only through overriding previously-loaded data. If - as is the case for many strings - no explicit opening is provided for such overrides, we get a mish-mash of overrides using numerous different hooks, when in fact localization is possible at all. The D6 Internationalization module reflects this problem, requiring a new module for each type of data to be localized.
2. Performance: translations usually require a new query for every string being translated.

sun's approach solves both problems: localization is done as part of the original load operation, meaning no subsequent overriding and no performance impact.

Yes, we should aim to put the minimum needed in D7 to enable a solution in contrib.

The three minimum steps in #6 look right, with the addition of: add a schema property to fields containing translatable strings.

Suggested schema field property and query tag: 'translate'.

sun’s picture

FileSize
15.98 KB

Things I left out:

- Aggregator: Feed titles are usually unique/untranslatable.

- Block: #430886: Make all blocks fieldable entities

- Blog, Book, Color, Comment, DB Log, Forum, Help, Image, Locale, OpenID, Path, PHP, SimpleTest, Statistics, Syslog, Tracker, Translation, Update.

- Field, Field UI: Dunno where field labels/descriptions are stored

- Profile: uhm.

- Search: Not sure, whether needed.

- Taxonomy: Needs work -- I have no idea how the new relation to fields works now (maybe catch can help here)

- Trigger: uh oh - what about configurable actions?

- Upload: meh, why does this still exist?

- User: Basically, user role names would be translatable. Furthermore, user signatures could be translatable (though a bit tricky).

Interestingly, the menu_tree* functions and variables seem to be the only places where retrieved values are cached.

That said. Major problem. This patch also includes a change to the variables query. But we cannot translate variables on the fly, because variables are loaded before the language system is initialized. It's a race condition, because the language initialization requires variables... So that will not work. The only solution I can think of would be to make language.inc pluggable (like cache.inc, password.inc, and others) and overload $conf in there - which means that we wouldn't have the desired effect/pattern for variables, but would still be able to translate them.

andypost’s picture

FileSize
3.42 KB

Here is patch for contact, is it ok to mark as duplicate? #251482: Contact categories can't be translated with i18n

sun’s picture

Status: Needs work » Needs review

@andypost: Thanks for trying to help here, but please read the issue before posting arbitrary patches. That Contact module issue basically won't fix, because of the reasons I provided over there.

Not sure why I didn't change the status... so let's have it a crack at it.

sun’s picture

FileSize
15.16 KB

Removed the variables query change. We should consider to open an issue about that plugging of language.inc *now*. ;)

catch’s picture

On taxonomy, I'm hoping field translation helps there, however that requires 'taxonomy term name and description as field', which is a Drupal 8 task. There's probably enough of an API in taxonomy.module to workaround it in Drupal 7, but it's not solved. Term loading does/should have a query tag though.

sun’s picture

OTOH, why don't do the pluggable language.inc right in here. Benefit: Direct context for the already expected "WTF?" question of core maintainers. :)

nedjo’s picture

This is looking great. So simple. So powerful.

Two questions:

1. What would tests for this look like? I guess a sample query alter that translates one of the given relevant queries based on schema information?

2. Could we conceivably skip adding the query tag and instead do this at the db api level? We have what we need in the schema. But - and no doubt this is the reason why this won't work - we'd need to consult the schema for every query.

And I didn't quite follow why we need language.inc to be pluggable. Could you explain again? How does this solve the variables problem?

sun’s picture

@all: Please also help over in #187398: Re-split locale module.

Re #19:

1) Tests, we don't need at this point. The QueryAlterableInterface of DBTNG is already covered by tests. We will just make use of it.

2) Yeah, that was the result of the discussion with Crell in #6 -- we won't touch Drupal core for this. I'm not yet 100% sure whether a single 'translatable' query tag may be sufficient, and whether we shouldn't pass the "type" of translatable query as query meta information instead. DBTNG supports tags and meta information, and while tags just work similarly to taxonomy, the meta information allows you to even pass complex data structures (even entire node objects) along with the query to hook_query_alter() implementations. Of course, our contrib module will parse the schema only once and store the translatable tables/columns in a more efficient variable (whenever the schema is rebuilt).

3) We need to make language.inc pluggable, because the system's variables are initialized very early in Drupal's bootstrap, when the language system is not yet initialized. That means, we cannot apply the concept of DDT to system variables. But what we can do is to replace language.inc with a customized version that, when the language sub-system initializes, retrieves translated variables and overloads the existing. At least, that's all I can think of currently. So the idea behind making language.inc pluggable is just to be able to do this at the appropriate point in time during Drupal's bootstrap.

Crell’s picture

Wait, how are you planning to make language.inc pluggable? Swappable include files are a poor approach. Lazy-load classes (a la handlers or the cache system or the stream wrappers system) are a much more stable solution. Then the physical location on disk is irrelevant.

sun’s picture

@Crell: I fear that we won't have sufficient time for a handler-alike approach... unless you can provide us a really quick way to apply this to the language and localization systems. :-/ Since I doubt that a bit, and we (the i18n team) need a solution here to be able to work on this in contrib, a variable-based approach should be sufficient. If absolutely not doable, then we definitely need help here.

sun’s picture

Discussed with Crell...

We no longer want to make language.inc pluggable/swappable, but instead, we just squeeze a single module_invoke_all('language_init') onto the end of http://api.drupal.org/api/function/drupal_language_initialize/7

Help with rolling this patch into RTBC as quickly as possible would be highly appreciated.

sun’s picture

FileSize
15.72 KB

And aforementioned solution translates into this.

What's left? Taxonomy, for sure. Any thoughts/comments on the list provided in #13 ?

Status: Needs review » Needs work

The last submitted patch failed testing.

tomsm’s picture

subscribing

sun’s picture

Status: Needs work » Needs review
FileSize
16.06 KB

Of course we need to use bootstrap_invoke_all(), because the module system is not initialized yet.

Status: Needs review » Needs work

The last submitted patch failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
16.78 KB

chx was so kind to clarify that this is an install.php problem only.

sun’s picture

FileSize
16.71 KB

Removed the additional query tags that described the data being queried. The "translatable" tag is sufficient, because we will consult the schema to figure out all translatable table columns anyway.

Taxonomy module is still to be done.

sun’s picture

Now also went through Taxonomy module.

Taxonomy module is kind of a special case scenario: It uses the entity controller for almost all load operations. That means, our tiny contrib module will have to figure out on which pages to apply the dynamic translation and on which not... ;) In general, I guess that we'll skip all queries that happen on admin/* anyway (because this is where we always have to load original values and the contrib module's translation UI needs to kick in in some way), so this is just FYI.

However, this entity controller issue leads to an interesting recursion issue:

  1. Taxonomy implements hook_entity_info()
  2. taxonomy_entity_info() creates bundles for each vocabulary, including the (translatable) vocabulary name
  3. entity_get_info(), which invokes hook_entity_info(), caches that information in the database
  4. taxonomy_entity_info() registers the administrative path for each bundle where fields can be configured
  5. Field UI module grabs this information from the cached entity info - ...to display a "label" (here: vocabulary name) for each bundle (somewhere; not sure where that is)

To account for this edge-case, we may also have to add the global $language_interface to the entity_info cache key. But I'm not sure about that - and shouldn't hold up this patch.

On a related note, taxonomy terms are a very nice showcase for the new translation functionality in D7, as a quick discussion with plach revealed:

  • By just attaching taxonomy terms to an entity, DDT will kick in on display, and will display the terms in the interface language (which makes sense, if you think about it).
  • If you want to "translate" the attached terms, then you can enable field translations for the particular taxonomy term field, so you can attach entirely different terms per language.

So this one should be near to being ready to fly.

Status: Needs review » Needs work

The last submitted patch failed testing.

plach’s picture

Status: Needs work » Needs review

This patch is looking very good and we badly need what it would make possible: a unique, coherent, approach to user-entered string translation.
No remark to make besides the fact we have red light from the bot and we are not supporting some core elements.

Among the others:

  • Aggregator: probably we don't need to support this
  • Block: I ain't sure #430886: Make all blocks fieldable entities will make it, so perhaps we should support translation for block titles
  • Field, Field UI: labels are stored in {field_config_instance}.data
  • Search: I can't remember of user entered strings here
  • Trigger: this probably needs translation
  • Upload: I can't remember of user entered strings here either
  • Profile: who knows :)
  • User: ok for translating role names, signature sounds really fieldish to me

However if we miss something we'll have time to fix it later, as this does not seem to involve heavy changes.

plach’s picture

Status: Needs review » Needs work

didn't mean to change the status

sun’s picture

Status: Needs work » Needs review
FileSize
21.72 KB

This should pass again.

Status: Needs review » Needs work

The last submitted patch failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
21.73 KB

Status: Needs review » Needs work

The last submitted patch failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
21.77 KB

Now for real.

I'm falling half asleep now, so I'm not sure whether I'll be able to re-roll with the further changes to Block module.

Field module seems to store its data in a serialized column, so I'm not sure whether we'll be able to do anything about that.

sun’s picture

FileSize
20.37 KB

Re-rolled against HEAD. (Taxonomy synonyms are gone)

andypost’s picture

Configurable Actions/Triggers saves it's data in serializable {actions}.parameters
I think it's a same as field labels

sun’s picture

- Actions: I don't think that's something we can handle cleanly. We'll have to use workarounds for that.

The only things we may want to tackle are:

- Blocks

- Field labels (looks equally undoable due to serialization)

- Upload

andypost’s picture

FileSize
1.25 KB

Suppose only the block title should be translatable

Upload description... what for?

sun’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
22.45 KB

Merged that into the patch.

I'm marking this RTBC, because we horribly, badly need this to work on a contributed module during D7 (to possibly aim for core inclusion in D8).

It's clear that we won't get any further API/schema changes in for D7, so if we missed something, we have to workaround that...

Dries’s picture

I'm happy to commit this patch to Drupal 7 still, but I feel we need to document addTag('translatable') somewhere. It doesn't have to be a long explanation, but without any context in core, it could be a bit of a WTF. Let's make sure to add the minimal documentation to core.

sun’s picture

Thanks - I'm running a bit out of ideas where it would be sane to document this.

I think that Schema API properties are only covered in d.o handbooks, and likewise, detailed documentation about DBTNG select queries and query altering I've equally seen in the handbooks only so far. The concept really only boils down to:

1) DDT translatable data (usually strings) needs to be stored "normalized", i.e. in a separate column.

2) All translatable table fields use the additional property 'translatable' => TRUE in the Schema API definition, i.e. hook_schema().

3) All queries that load translatable data for display in the frontend should use db_select() along with the query tag "translatable".

I guess that only 2 + 3 would have to be documented somewhere in core, but we can certainly add some background info about 1 (the idea/pattern) as well.

Are those "node_access" and "term_access" query tags documented somewhere?

So any pointers to the proper location for documenting this would be helpful.

Crell’s picture

http://drupal.org/node/310077 - The DB API documentation includes reference to some common tags. I vote we just add "translatable" there. If there's some better place to put all of them later (in code or otherwise) we can do that, but that's where we're documenting common tags for the time being.

andypost’s picture

Issue tags: +QueryTags

Good idea to summarize query tags, so new tag for issue QueryTags (another candidate #605630: Comment language administration UI with comment_filter tag)

documentation same as #46 is needed - is there i18n page docs?

nedjo’s picture

A good place for documentation would be http://drupal.org/node/304002, Making your custom data translatable, in the module developer's guide.

As I suggested in #10, I still think we need a hook invocation in drupal_write_record(). The issue is that we don't just need to be able to switch in translations; we also need to know when we need to create them in the first place, and know when we need to update them because the original has changed. A hook invocation would allow a contrib solution to respond to data inserts and updates, storing source strings for future translation and marking existing translations as outdated. Of course, that could be done as a follow up patch.

andypost’s picture

@nedjo why not use a common {entity}_save and not drupal_write_record() ? seems that data-level is much better then row-level to sync translations

Dries’s picture

Priority: Critical » Normal
Status: Reviewed & tested by the community » Needs work

I don't see any major downsides of this patch at this point, so I decided to go ahead and commit it. I'm marking this code needs work, so (1) we can follow-up with documentation and (2) we create a CHANGELOG.txt entry.

webchick’s picture

sun’s picture

Issue tags: +D7 API clean-up

Re-adding the critical tag, since this is one of the most important issues for the i18n team to be able to build the envisioned module in contrib during D7 (and make it core-worthy for inclusion in D8).

@nedjo: That's again a good point you're raising - sorry, I didn't really recognize it earlier. Not sure whether dwr() is the proper place, because not all insert/update queries are run through it. I guess that we'll still need an API that allows modules to integrate with this approach upon source data changes. Regarding data in core itself, I think that everything is sufficiently hook'ified, so the contrib module itself could implement the hooks on behalf of core modules. But yeah, let's brainstorm this some more.

andypost’s picture

I think that hook_query_TAG_alter() is enough for presentation layer but for monitoring a data change events hook_{entity}_[insert|update|delete] is common practice in core.

insertion of drupal_alter() into dwr() or db_insert and others is bad idea because this approach needs tagging all queries in core.

yched’s picture

Awesome to see this committed.

It's a bit sad to have Field label and description out of this insanely neat mechanism just because they're stored in a serialized array (until this patch, there was just no use case for dedicated columns).
Splitting those in their own columns should be a simple patch (instance read and write are encapsulated in API functions), I can write this if there's a chance to have this accepted post freeze.

plach’s picture

@yched: can't we file this as API cleanup?

yched’s picture

API cleanup I'm not sure, but IMO this stands as a simple followup keeping Field API to-date with a newly added core feature.
I'll post a patch over at #549698: Prepare field label and description for DDT (translatable queries).

plach’s picture

any chance to adress also the allowed values translation?

sun’s picture

@yched: That would be plain awesome. I didn't approach it in the patch, because I had no idea where these changes would have to be applied... ;)

sun’s picture

FYI: Damien just pointed out that there are insane restrictions on the row size AND table definition: http://dev.mysql.com/doc/refman/5.0/en/column-count-limit.html

Hence, like he already suggested, our little module will better go with a JOIN-approach. Although that has limitations, too: http://dev.mysql.com/doc/refman/5.0/en/joins-limits.html - but I think it's less likely that we reach those ;)

sun’s picture

Issue tags: -D7 API clean-up

.

yched’s picture

FYI, I posted an initial patch in #549698-16: Prepare field label and description for DDT (translatable queries), but it actually raises more questions than it solves :-/

andypost’s picture

So is there any docs about using queryTags for dynamic translation?
Should we put a someTag for every translatable query to identify them?

sun’s picture

@andypost: Nope, not required for DDT. Only the 'translatable' tag is. Basically.

Dave Reid’s picture

BTW, thanks for leaving an undocumented hook and not adding hook_language_init to bootstrap_hooks since its invoked before a full bootstrap... #642782: hook_language_init is a bootstrap hook and needs bootstrap_invoke_all()

andypost’s picture

claudiu.cristea’s picture

In #6, @sun:

We want to use Schema API to "tag" all table columns (and perhaps tables) that contain translatable columns.

I see in the patch that you added: 'translatable' => TRUE in schema for translatable fields. Is only a question.... What is the meaning of this tagging? I cannot see any processing or storage operations with this key. No updates to the Schema API class. I don't understand for what this stands for.

drifter’s picture

subscribing

klonos’s picture

subscribing...

drifter’s picture

Wrote up some documentation at http://drupal.org/node/304002 - this is the first time I've edited a handbook page, so it could use some review...

sun’s picture

Thanks! I've slightly corrected that.

Should this be documented anywhere else, too?

sun’s picture

Status: Needs work » Fixed

Let's mark this fixed for now.

6aKa’s picture

Status: Fixed » Needs review

#13: drupal.translatable.patch queued for re-testing.

sun’s picture

Status: Needs review » Fixed

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Berdir’s picture

FYI, I made a proof of concept module that can create translation tables and does the necessary query mangling to display translations. There is no UI yet to enter translations.

I've tested it with roles at admin/permissions/roles, so it kinda works. I'm not sure that it's going to work in general, though. I'm pretty sure that not all necessary queries have been correctly converted, see for example http://api.drupal.org/api/drupal/modules--menu--menu.module/function/men....

Edit: Forgot the link to the project: http://drupal.org/sandbox/berdir/1122562

plach’s picture

@berdir:

I cited your project in the Drupal 7 multilingual support matrix.