Make all UI strings translatable
tituomin - October 19, 2009 - 09:03
| Project: | Millennium Integration |
| Version: | 6.x-2.x-dev |
| Component: | Code |
| Category: | task |
| Priority: | normal |
| Assigned: | tituomin |
| Status: | fixed |
Description
The material type and language strings were not translatable. Patch below.

#1
#2
Also, some strings were in ISO-8859-1 (at least here, after a CVS checkout); I converted them to UTF-8.
#3
Well, actually, your patch translates strings that go INTO the database. The result is that strings are imported already translated into whatever language is active at import time, not the language in use when someone visits the site.
The correct way is to translate in the viewing portion, inside hook_nodeapi() (when $op is 'view') and theme_millennium_biblio_data().
I already did a portion in the huge patch over at http://drupal.org/node/576784#comment-2184824, inside theme_millennium_biblio_data():
'teaser' => 'series,translated_title,edition,imprint,lang',
);
+ $translateable_fields = array('type', 'lang');
$content = "";
foreach (explode(",", $map[$mode]) as $fieldname) {
- $fieldvalue = $biblio_data[$fieldname];
- if (trim($fieldvalue) == "") {
+ $fieldvalue = trim($biblio_data[$fieldname]);
+ if ($fieldvalue == "") {
continue;
}
+ if (in_array($fieldname, $translateable_fields)) {
+ $fieldvalue = t($fieldvalue);
+ }
$rows[] = array(
'data' => array(
array('data' => $_millennium_field_labels[$fieldname], 'class' => 'fieldname'),
- array('data' => trim($fieldvalue), 'class' => 'fieldvalue')
+ array('data' => $fieldvalue, 'class' => 'fieldvalue')
)
);
}
When I/we properly review that patch (I think it'll block other things for a bit)... then we could tackle this one =)
#4
Ah, you're right!
And we should also remember to make sure that the translated strings are available when the nodes are being indexed for searching.
#5
It seems that you have to provide a literal string inside the t-function, not a variable. See http://api.drupal.org/api/function/t/6 -- this is what I got:
I suggest that we do the following: let's save the language code, not the language name in English, to the database. Same goes for the item type. Then, the UI should map the language code to the language at node view time. And the mapped strings will be surrounded by t().
*edit* It seems that in this case you actually can get the translations to work by using dynamic strings inside t(). (But I'm not sure if we can get rid of the warnings.) Still, this requires that we have the literal string versions somewhere in the code so that they can be extracted. So we would actually need two arrays in the code: (1) the mapping of the language codes to English versions (without t(), so the English version would be saved in the database), and (2) the English strings wrapped in t() in an array somewhere. But since this is the case, it's simpler to just wrap the English versions inside t() in the first place, and not save them in the database. Saving the untranslated language codes would seem better.
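To illustrate why literal strings matter for extraction, here is a toy sketch. It is not how Drupal's real template extractor works; the function name example_extract_t_literals() and the regular expression are assumptions made only for this illustration. It shows that a scanner reading source code can pick up t('English') but has nothing to collect from t($fieldvalue).

```php
<?php
// Toy extractor (an assumption for illustration, not Drupal's real one):
// collect single-quoted literals that appear as the first argument of t().
function example_extract_t_literals($source) {
  preg_match_all("/\\bt\\(\\s*'((?:[^'\\\\]|\\\\.)*)'/", $source, $matches);
  return $matches[1];
}

// Sample source text to scan. It is never executed, only read as a string.
$source = <<<'EOT'
$fixed = array('eng' => t('English'), 'spa' => t('Spanish'));
$dynamic = t($fieldvalue);
EOT;

// Only the two literal strings are found; the dynamic call yields nothing.
print_r(example_extract_t_literals($source));
```

The dynamic call may still translate correctly at run time if the same string was extracted from a literal elsewhere, which is the "two arrays" situation described above.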
#6
Side note: for the language code translations, it would be easiest for the translator, if we had something like this somewhere in the code:
'eng' => t('eng'),
'mul' => t('mul'),
Although I'm not sure this is the Drupal Way (the rules say you have to have English strings inside t). This would require a translation .po file even for English. But it would make automated translation easier, because you could see the standard codes in the .pot file and easily generate the correct translations if you have the list of mappings somewhere in a different format. At least that's the case here.
#7
Right... I guess the approach is like you mentioned:
1) store language CODES in the database. The "code" could easily be the language name in english, or we could go with the ISO codes used in MARC.
2) have an array that converts that code into the t() equivalent.
e.g.: inside theme_millennium_biblio_data()
<?php
// Sketch: map the stored language code to a translatable name at view time.
$lang_code = $biblio_data['lang']; // the code read from the database
$lang_human_readable = array(
  'eng' => t('English'),
  'spa' => t('Spanish'),
  // ... etc etc.
);
echo $lang_human_readable[$lang_code];
?>
This way the pot extractor would work.
#8
Here's my second try.
In this solution, a code string is saved into the database for both the material type and the language. By "code string" I mean a string which is not meant to be displayed to the user at any point, but is used internally as a unique id string.
The code string for a language is the ISO code string. For the material type, it's just a single lowercase word (like 'cassette').
There is one new function, _millennium_human_string(), which takes as parameters a biblio_data array and a field key. It maps the field contents (the code string) to a human-readable string wrapped in t() and returns that string. It uses a global array containing the mappings.
This new function is called from two places: (1) the UI displaying the record and (2) the code which maps the fields into taxonomies. So the taxonomy terms will still get translated in this model.
Personally, I think this is a clean solution. There is a downside, though: the data is stored differently, so this isn't compatible with databases from previous versions. Of course I could make some kind of migration script.
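A minimal sketch of the mapping function described above, runnable outside Drupal. The names example_millennium_code_strings() and example_millennium_human_string() are hypothetical, as are the table contents and the fallback to the raw code; the actual patch uses a global array that is not shown in this thread. The t() here is only a stand-in that returns its input.

```php
<?php
// Stand-in for Drupal's t() so the sketch runs outside Drupal.
// It returns the string unchanged.
if (!function_exists('t')) {
  function t($string, $args = array(), $langcode = NULL) {
    return $string;
  }
}

// Hypothetical mapping table: field key => code string => readable string.
function example_millennium_code_strings() {
  return array(
    'lang' => array(
      'eng' => t('English'),
      'spa' => t('Spanish'),
    ),
    'type' => array(
      'cassette' => t('Cassette'),
    ),
  );
}

// Maps the code string stored in $biblio_data[$field] to a readable string.
// Unknown fields or codes fall back to the raw stored value (an assumption).
function example_millennium_human_string($biblio_data, $field) {
  $code = isset($biblio_data[$field]) ? trim($biblio_data[$field]) : '';
  $map = example_millennium_code_strings();
  if ($code !== '' && isset($map[$field][$code])) {
    return $map[$field][$code];
  }
  return $code;
}

$biblio_data = array('lang' => 'eng', 'type' => 'cassette');
echo example_millennium_human_string($biblio_data, 'lang'), "\n";
echo example_millennium_human_string($biblio_data, 'type'), "\n";
```

Because every readable string sits inside a literal t() call in one table, the extractor can find it, and code that makes decisions on record types can compare the stable code strings instead of UI text.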
#9
This portion inside millennium_add_taxonomy_to_node() will have problems, since _millennium_human_string() would return a string in the current language at the time of node import; and what you want is to always create a term in English.
+++ millennium.module 2 Nov 2009 17:13:01 -0000
@@ -1244,15 +1765,18 @@ function theme_millennium_biblio_data($b
@@ -1307,7 +1831,7 @@ function millennium_add_taxonomy_to_node
);
// Material type
- $term_mat_type = millennium_marcleader_to_mattypename($record);
+ $term_mat_type = _millennium_human_string($nodeobject->millennium_biblio_data, 'type');
if ($term_mat_type) {
millennium_add_node_taxonomy_terms($nodeobject, variable_get('millennium_marc_vid_leader_item_type', -1) , array($term_mat_type));
}
@@ -1322,8 +1846,7 @@ function millennium_add_taxonomy_to_node
if (is_array($tmpfields)) {
// Language
- $lang_code = drupal_substr($tmpfields[0]["rawdata"], 35, 3);
- $lang_term = _millennium_langcode_to_human($lang_code);
+ $lang_term = _millennium_human_string($nodeobject->millennium_biblio_data, 'lang');
if ($lang_term!= "") {
millennium_add_node_taxonomy_terms($nodeobject, variable_get('millennium_marc_vid_language', -1), array($lang_term));
}
@@ -1401,7 +1924,7 @@ function millennium_add_taxonomy_to_node
}
// Other subjects: 600s, 61xs, 63xs
- $tmpfields = millennium_getFields($record, "6[013]..."); #get'em
+ $tmpfields = millennium_getFields($record, "6[013]..."); //get'em
if (is_array($tmpfields)) {
foreach ($tmpfields as $field) {
An idea: as we need to store English terms in taxonomy, use the English word as the code itself. (E.g. "Projected medium" could be the code, instead of "projected" for that item type.)
The only special case is language, which would be a 2-step process: (1) change the ISO 3-letter language code into the English-language name (like the original function _millennium_langcode_to_human($code) did), and (2) translate that with _millennium_human_string() into the localized string for display.
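The 2-step process for language could look roughly like this. It is a sketch under assumptions: the function names example_langcode_to_english() and example_localized_language() are hypothetical, the t() below is a toy lookup with a two-entry Finnish table, and $GLOBALS['example_language'] stands in for the current interface language.

```php
<?php
// Pretend the visitor is browsing the Finnish interface.
$GLOBALS['example_language'] = 'fi';

// Toy stand-in for Drupal's t(): looks the string up in a tiny table.
if (!function_exists('t')) {
  function t($string, $args = array(), $langcode = NULL) {
    $translations = array(
      'fi' => array('English' => 'englanti', 'Spanish' => 'espanja'),
    );
    $lang = $langcode ? $langcode : $GLOBALS['example_language'];
    return isset($translations[$lang][$string]) ? $translations[$lang][$string] : $string;
  }
}

// Step 1: 3-letter code to English name. This is what taxonomy would store.
function example_langcode_to_english($code) {
  $names = array('eng' => 'English', 'spa' => 'Spanish');
  return isset($names[$code]) ? $names[$code] : '';
}

// Step 2: localize the English name, for display only.
function example_localized_language($code) {
  $english = example_langcode_to_english($code);
  return $english === '' ? '' : t($english);
}

echo example_langcode_to_english('eng'), "\n"; // stored term
echo example_localized_language('eng'), "\n";  // shown to the visitor
```

Note that step 2 passes a variable to t(), so the English names would still have to appear as literals somewhere for the extractor to find them.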
#10
I have to disagree with this. To me, the translation of taxonomy terms is a required feature. The primary language for my site is not English but Finnish. This means the content of the site is in Finnish, and taxonomy terms are just one kind of content.
(Optionally, in the future, it would be nice to have the site fully translatable, so the user could choose between Finnish, Swedish, and English. But I have read that Apache Solr integration doesn't really support this yet.)
So, the terms will have to be translated at some point. Of course I might be wrong about where the translation should take place. In this patch the translation takes place inside millennium_add_taxonomy_to_node(). Maybe there are other options? We could for example
But I have to say, I haven't yet had the time to see how the taxonomy term translation works in Drupal in general. Would you say we should use the general taxonomy term translation mechanisms in Drupal? There are some open questions I don't know the answers to, if we choose this path:
If you can give the best approach to taxonomy term translations straight away, please share! =) Otherwise, I can dig deeper into the options..
BTW, one easy option: let's add a tick box in the settings, where the user can choose if he wants the taxonomy terms translated at record import time.
#11
Excuse the long post =)
You're right, instead of "english" I should have said "base language".
That said, another clarification: I'm not saying we will NOT have translation. On the contrary, we must first add the term in the site's base language into taxonomy, so that you can then use the i18n module to localize or translate those terms.
Let me also say that I am *mostly* doing multilingual sites (so I know your pain!) One of these sites uses the millennium module too =) So I have, so to say, "ample" (meaning painful) experience getting taxonomy terms to work in i18n or localized environments =)
Now, another thing: after all this time, I think I have only scratched the surface =) So I'm willing to learn =)
Let me tell you how I have done it:
I have always done it using English as the base. Once I tried to do it another way (define the base language as, say, Spanish, then handle English as a second language) and it was a mess (to me)... maybe that doesn't mean it's impossible, just that I don't know how to do it =)
If we use t() when adding terms to nodes, they will get localized to the interface language at the time of import (I think??). We don't want that, because the result depends on the current interface language, which might get switched at any time. Example: let's say you go into the Finnish-version interface and import a node from Millennium, and that node gets the language term "englanti" for an English-language book. You now switch to the English-language version of your site, import another item from Millennium, and another English-language book gets the term "English". Now you have two terms for the same thing, depending on what the interface language was when you imported (which is bad). Also bad is that Drupal's taxonomy module doesn't know that "englanti" is in Finnish and "English" is in... English. It's just characters.
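The duplicate-term problem can be reproduced with a small simulation. This is only a sketch: the vocabulary is a plain array rather than Drupal's taxonomy tables, example_import_language_term() is a hypothetical helper, and t() is a toy lookup driven by $GLOBALS['example_language'].

```php
<?php
$GLOBALS['example_language'] = 'en';

// Toy stand-in for Drupal's t(), with one Finnish translation.
if (!function_exists('t')) {
  function t($string, $args = array(), $langcode = NULL) {
    $translations = array('fi' => array('English' => 'englanti'));
    $lang = $langcode ? $langcode : $GLOBALS['example_language'];
    return isset($translations[$lang][$string]) ? $translations[$lang][$string] : $string;
  }
}

// Adds a language term to the vocabulary unless an identical one exists.
function example_import_language_term(array &$vocabulary, $english_name, $translate_at_import) {
  $term = $translate_at_import ? t($english_name) : $english_name;
  if (!in_array($term, $vocabulary)) {
    $vocabulary[] = $term;
  }
}

$translated = array();
$untranslated = array();
// Import one English-language book under each interface language.
foreach (array('fi', 'en') as $interface_language) {
  $GLOBALS['example_language'] = $interface_language;
  example_import_language_term($translated, 'English', TRUE);
  example_import_language_term($untranslated, 'English', FALSE);
}

print_r($translated);   // two different terms for the same language
print_r($untranslated); // a single term
```

Storing the untranslated term keeps the vocabulary stable, and localization can then happen at display time or through i18n.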
Translated terms, when you are dealing with MULTIPLE languages, is i18n's job. For instance Apache Solr has some pending patches to review that let it use i18n functions to show the facets correctly (I'm running those patches here (spanish version)).
If you only have one site language, I suppose you *might* want to let the admin import using t() if they wish it... or we could make the "fixed" taxonomy terms (language, item type) configurable via a form in the millennium module...
But using t() to *always* create voc. terms is out of the question, since I (and others) want multiple languages in the site, and taxonomy terms get localized via i18n.
Now, I just noticed that t() can be called with an optional $langcode argument? Would that fix everything, if we force it to invoke t() to create terms in whatever the site's base language is? Again, this hasn't worked for me so I'd need testing. I'm SURE we can make it work using english taxonomy terms and then apply i18n to it (it will work with Apache Solr when those patches go through also).
... and, after all this typing, I still am a bit lost =) Hope you have some insight to make this work so it makes everyone happy =)
#12
Hey, thanks! This is very useful information regarding my needs. I had a feeling that you probably had a good reason to be careful with this issue..
Personally, I have two different goals for my site: in the short term, I would be happy to get just a one-language site (Finnish). But in the long term, on-the-fly language switching between Finnish, Swedish and English would be absolutely great.
I understand that because English has been chosen as the standard language in Drupal, it's probably simpler to stick with English as the basis for i18n.
I'll get back to you on this one, after reading the solr thread and making some tests!
#13
Here's another patch. Now, the user can choose if the taxonomy terms should be translated to the site default language during import, or if they should be kept in English.
The idea is that there are three groups of sites with different needs:
I added an option in the settings panel to reflect these different needs. The default is English, of course.
_millennium_human_string() now supports different modes for translation:
Now, this is just a basis for discussion. I think this code has a couple of good things to speak for it. First of all, the English literal strings for languages and material types are kept in one place only. (It took some time to figure out how this is possible; that's why each string has the 'en' parameter to go with it.) Moreover, the _millennium_human_string() function is general: it implements the same translation functionality for all desired fields (more translatable fields could be added later). This also makes the translation decisions explicit in the code.
I have kept the decision not to save human-readable UI strings in the Millennium Integration metadata arrays in the database. You could also always save the English versions to the array (just use t('...', array(), 'en'))... This would maybe simplify things a bit. If you want, I can quickly make the modifications for this one. It's just that the UI strings might change over time if somebody wants to change "Projected medium" to "Movie" or "Greek, modern" to "Greek", for example.
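As a rough sketch of how one string could be rendered in different target languages, the following uses t()'s optional third $langcode argument and language_default('language'), both of which exist in Drupal 6. Everything else is an assumption: the mode names, the function example_human_string(), and the toy stubs that let the code run outside Drupal. It is not the code from the patch.

```php
<?php
// Pretend the current interface language is Swedish.
$GLOBALS['example_language'] = 'sv';

// Toy stand-in for Drupal's t() that honours the $langcode argument.
if (!function_exists('t')) {
  function t($string, $args = array(), $langcode = NULL) {
    $translations = array(
      'fi' => array('English' => 'englanti'),
      'sv' => array('English' => 'engelska'),
    );
    $lang = $langcode ? $langcode : $GLOBALS['example_language'];
    return isset($translations[$lang][$string]) ? $translations[$lang][$string] : $string;
  }
}

// Toy stand-in for language_default(): the site default is Finnish here,
// whatever property is requested.
if (!function_exists('language_default')) {
  function language_default($property = NULL) {
    return 'fi';
  }
}

// $mode values are hypothetical names used only in this sketch.
function example_human_string($english, $mode) {
  switch ($mode) {
    case 'english':
      // Always the English string, e.g. for stable taxonomy terms.
      return t($english, array(), 'en');
    case 'site_default':
      // The site's default language, independent of who is importing.
      return t($english, array(), language_default('language'));
    default:
      // The current interface language, for display.
      return t($english);
  }
}

echo example_human_string('English', 'english'), "\n";
echo example_human_string('English', 'site_default'), "\n";
echo example_human_string('English', 'interface'), "\n";
```

With a fixed $langcode the result no longer depends on the interface language at import time, which is the property the import setting needs.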
When making this patch, a couple of general ideas about the Millennium Integration surfaced.. I'll make a couple of tickets for them.
#14
Looks good on first glance; will test soon.
#15
Maybe a small performance gain could result from only allocating an empty array once inside millennium_init(). So instead of
'aar' => t('Afar', array(), 'en'),
we would have
$empty = array();
[...]
'aar' => t('Afar', $empty, 'en'),
#16
Finally I've had some time to look at this. It looks ok by me, but you mention this code is "just a basis for discussion".
Are we missing something? Or is the code ready to go?
#17
I think we probably should incorporate the changes in #15; it just seems silly to allocate hundreds of empty arrays. =) Unfortunately I can't do this at the moment, but I can get back to you.
This patch should work, at least according to my tests a while back. Re: #16, maybe I was just expecting some more discussion about my decision to save the "code" strings into the database instead of the English UI strings.
(The reason being: what we save in the database is part of the "data model", not part of the UI -- and the UI strings could change more easily, but there's no reason not to keep these string tokens the same. Plus, I have some of my own code which makes decisions based on the types of records, and it just feels more robust if I don't have to compare English language UI strings and think about letter case, spaces between words, etc.)
But anyway, I think this is ready to go after incorporating #15.
#18
Oh, it was in fact very silly.. t() accepts null just fine as the second parameter. =)
Also, had to re-roll the patch against the current version, here it is! Tested, works.
#19
Thanks, testing.
#20
Great job!
Will commit this soon.
#21
Committed to 6.x-2.x-dev