This patch includes a bare minimum i18n module and most of the core patches needed for the system to work. Some general idea of what we are trying can be found here: http://drupal.org/node/77266 [DEP: Multilanguage support in Drupal core]
This is based on the existing Internationalization (i18n) module and is a first try for full integration of some of its features into Drupal core. Anyway, most of the code here has been rewritten from scratch, no copy & paste, but rewrite to try to comply with Drupal core coding standards. I said 'try'... :-)
Features:
------------
- Language management in path and url functions
- Basic content selection depending on language
- Language switcher block
- Language dependent variables, so textual ones can have a different value for each language
- Pluggable support for language icons -through modules & theme system
- Ready for multiple url aliases depending on language
- Content types can be enabled to have language in content type administration page
Main code changes:
------------------
- Some reorganization of path management functions. Includes a multilanguage-enabled 'path.inc' replacement and also the mechanism to switch from one file to the other, when module enabled and 'activated' (see Home » administer » site configuration > multi-language)
- Bootstrap hooks to early initialize i18n system, so variables can be replaced just after they're loaded depending on which language is used.
- Got some language functions out of locale module, to language.inc, thus language can be used before module loading
Some data model changes:
----------------------------------------------
+ Language field in 'variable' table
+ Language field in 'url_alias' table -ready for language dependent url aliases
+ Language field in 'node' table
I know this is a big and complex patch and is not close to be ready for commitment.
But please, try and/or provide feedback.
The plan is to produce a set of smaller ones that will address single issues and features and will be easier to review. But I will try to have this one up to date for anyone who wants to give a try and also to have an overview of how all this could work together.
| Comment | File | Size | Author |
|---|---|---|---|
| #2 | i18n_module_02.patch | 43.55 KB | jose reyero |
| i18n_module_01.patch | 41.15 KB | jose reyero |
Comments
Comment #1
gábor hojtsySome stuff I found while looking through the code is below. I have not tried the code in action yet, just reviewed the patch. Hope my findings/tips/questions help. Note that indentation problems I tried to point out will now show due to the nature of the filtering here...
- Why is language empty here? If it is because this allows for querying language independent variables, then this should be documented with a comment on the previous line. A space should be between = and ''.
- Seems to be a bogus change, the WHERE should not have been removed:
- Why is i18n replaceable?
- It would be nice to know why drupal_get_title() and drupal_set_title() was in path.inc, maybe they should stay there.
- System.install should have the CREATE TABLE statements also changed, not only the update path. BTW it is very nice from you to think update the update path up front :)
- Space should be around the = sign (in all SQL statements) AFAIK:
- $conf['i18n_variables'] should probably include a lot more, like user registration/activation/approval email text and such.
- I would write "i18n system bootstrap" and would simplify wording here. There is also a typo in there.
- This behaviour should be noted in the settings.php comment. People might not expect to loose variables, if they comment out some of them for testing or some other reason in settings.php.
- Performance wise this is surely a good idea, we will see how core comitters see it.
- This only unsets variables in the current language. This is not consistent with the above deletion of variables in all languages. We should think about this.
- This trims out the language from the path, but does not store it in $language, since i18n_get_lang_prefix() returns the language, and the return value is not used.
- There are some serious indentation problems here. Two space indentation should be used (some lines are overly indented, some are under-indented). There are more similar problems later in the same function.
- This small code is probably easier to integrate into the url() function directly, rather then as a separate function called with a conditional from url().
- Typo in phpdoc comment:
- Overindentation in the comment, plus attach the comment directly to the function, without a separating newline.
- browser_get_language() should be named differently. Either put it to the drupal_...() camp, or the i18n_...() functions. The browser prefix is confusing here and should not be used.
- The accept-language checking in browser_get_language() is quite simplistic and will not work in all cases. We tackled the problem at php.net with implementing what the RFC tells us about the format of that header. Look for the filling of the $browser_langs array in this code: http://www.php.net/source.php?url=/include/langchooser.inc Language codes are not necceserily two letter (even Drupal has pt and zh variants which are not two letter). So this assumption does not hold either.
- This is a rather broad reaching statement in i18n.module (and thus is not really true).
- Comment indentation and attaching to function again.
- i18n_get_lang() is not named according to Drupal conventions. Abbreviations are not used in Drupal. It should be i18n_get_language(). Apply this convention elsewhere (like i18n_get_lang_path()) if I missed other abbreviations.
- This should say *on* the settings page shouldn't it?
- This seems to be just a start :) BTW as you can see in other core modules a "You can" section is added to help texts, not a features list.
- The description should point out that multiple language *content* and *settings* are handled by the module, not some general all covering multiple language support (like menus, taxonomy and such). There is also an indentation problem at the end.
- Maybe add a link to the other language interfaces to edit the variable in other languages, or just add: "Switch language to edit different translations.".
- i18n_form_alter() and _i18n_language_select() seem to have identation problems with form API arrays.
- Is _i18n_language_select() going to be used elsewhere? It is only used at one place now. Is there a need to have a separate function for it?
- i18n_supported_languages() seems to be quite simple to not exist IMHO. Removing it would also clear any possible confusion that Drupal supported languages and i18n supported languages are not the same.
- I would call $target as $path instead, which is in line with Drupal naming.
- Why is a setting for enabling the i18n functionality? I would expect i18n to work after I enable the module, not that I need to enable it, and then disable it before disabling the module. Drupal can tell you if a module is enabled, there is no need to have a variable for it which a user needs to check.
Comment #2
jose reyero commentedI've followed most of Goba suggestions. Thanks also to chx for some ones. Also updated the patch for latest Drupal-HEAD.
I'll post something else, including replies to Goba's comments tomorrow. Btw, most of the indentation looks ok when actually applying the patch.. (?)
Changes:
------------
- Fixed a bug in bootstrap.inc:variable_del()
- Fixed some comments
- Renamed some functions
- Reworked i18n variable API
- Expanded i18n_variable functions with language parameter, so they can be used to handle multiple languages in a page request
- Don't delete disabled multilingual variables anymore
- Cleaner integration with normal variable functions
- Reworked common.inc:locale_initialize
Comment #3
killes@www.drop.org commentedI am sorry to say so, but this patch isn't better than what is going aroung now as i18n, ie a "solution" in the sense of "it works" more than in the sense of "it is good and core worthy".
Guess everybody expected that. :p
Comment #4
dries commentedCan we split this up in smaller patches, and go one step at the time? It's kinda big right now.
Comment #5
killes@www.drop.org commentedBy "expected that" I meant my reaction to the patch, not the patch itself, I had hoped for more.
Comment #6
gábor hojtsyKilles, I see this solution is not perfect. The i18n project contributors refined several solutions and came to this one. Nobody came around for years to suggest a better method. We can delay this indefinitely waiting for something perfect, or we can live with what we can do.
IMHO this is the same question as configurable node types. That feature is not perfect nor complete in core. Contrib can build on it and provide additional stuff. Now Drupal is not up to i18n for menus, variables, taxonomy, nodes, user profiles and stuff. We can try to tackle these all at once, but it is going to be a very huge (impossible) undertaking. Forms API wawing might mean something still :) A full i18n compatible Drupal would take a complete release or two. It is not going to happen in one good and core worthy patch. Also the remaining days until code freeze does not allow us to do more for the next release.
Comment #7
killes@www.drop.org commentedLast year in Amsterdam I've suggested two methods that I deem to be better to Jorge in person.
Comment #8
killes@www.drop.org commentedJose, of course.
Comment #9
gábor hojtsy@Killes: are those written down somewhere? Could we look into the concept at least (no code required :)?
Comment #10
killes@www.drop.org commentedNo, the ideas were all lost in the subsequent drinking binge... :p
Basically, one idea was to introduce another table similar tothe node_revisions table. The body would be moved the that table and there would be one to many relationship between the node and the node_revisions table on the one side and that third table on the the other.
We also considered to not add a third table but to add a language field to the node_revisions table.
Another possibility would be to add the one to many relationship as a key on (language, nid) directly in the node table. Basically a nid would not longer be unique in this case.
Comment #11
gábor hojtsyYou mean that a revision of a node will have different language variants stored? What if I would like to fix text in a translated version and keep the revision history? If we advance the revision counter, then the original language variant and other translations will not be in sync (will relate to previous revisions).
How would you relate different translations to different taxonomy items (translated taxonomies supposed to have different tids)? Even different authors? Or just attaching the translator uid to the translation? Would you need to upload files (ie. illustrations in different languages) to the same original language node? How do you handle permissions (they are node level, not revision level). These last two also apply to the previous idea.
Once again, how would you handle permissions for translators to their language variants (permissions are bound to nids)? Revisions are tied to nids, uploads are tied to nids, so the above attachment problem still applies.
I also gave this some second thought yesterday, and basically come to the conclusion that what we are aiming to here is another kind of multisite setup. We need first class nids, tids, variables, menu items and such, so we can reuse revisions support, node level permissions and all the node/taxonomy modules already implemented. Any translation could cover the original language version fully or can diverge from it significantly (different menu, different taxonomy, different nodes). Our solution differs from multisite setups in that:
At first thought, it might be possible to implement i18n with multisite, if multisite is made more versatile (eg. runtime selection of the site used). We would need to share some tables, like modules used, users, sessions. We would need relation support for tables such as nodes, terms, menus etc.
But if you remember, i18n in a previous iteration used table prefixes (which is the basic of the multisite setup I am talking about) and it did not turn out to be working fine. Due to having different node tables for example, you would need to set up node level permissions in the different sites separately. You also need to upgrade the different datases separately (this is extremely tricky with shared tables since one language subsite would upgrade the shared tables and others will also try to but will fail). This is a general problem of how multisite is handled though. BTW another advantage of having one single site instead of multiple ones, is that you have a good language independent search index.
Jose prepared this patch, so we see where i18n is going. You see that the variables already have this multisite-like look in this patch (language column added, language dependent variables identified in an array and language based selection is used). Similar functionality is implemented for other objects in contrib.
Comment #12
killes@www.drop.org commentedIt is completely natural that the individual revisions won't be in sync. This just happens as you edit a node. What is important is that the current revision the one shown to the end-user. Complex workflow arrangements aren't supported by Drupal, yet. This is another issue to be dealt with elsewhere.
That is what is wrong with the current approach in the first place: Taxonomy terms (or anything else) should have the same ID for the same thing.
The uid of people doing the revisions is already stored.
Yes.
Permissions would stay as they are. It doesn't make sense to allow access to a node in one language and not in another one. If you are concerned about editing rights: This could be easily checked through a variety of ways.
Well, I don't see it as a problem. You can enforce the loading of the correct language per translator, for example.
Well, you are welcome to try what you describe. I wouldn't buy this as a multisite concept, though.
See above, I don't think multisite is a viable solution. Sharing of certain tables might be, though. I however would greatly prefer to have everything consolidated into one site.
The results are rather disappointing.
I've had a look at the implementation of variable_set to get a feel for what Jose is trying to do. What do I see?
if (function_exists('i18n_variable') && i18n_variable($name)) {
+ i18n_variable_set($name, $value, $language);
+ } else {
The whole approach has a hackish air about it. It somehow remembers me of mambo, don't ask me why.
I feel very sorry that I asked Dries to accept the patch which introduced the current i18n hook. Why? It allowed the current hackish solution to exist and didn't force the people to think about a real solution.
Comment #13
gábor hojtsyKilles, nobody forces us not to think. We are moving with i18n closer to core, and this luckily opens up some discussions. You see you did not write your ideas down before (or you did not care to point me to them instead of reiterating them). But now we see them. This is great.
IMHO what you miss is that there are different multilanguage scenarios.
Use case 1:
Site of a town in Germany. Local residents get news, information about local offices, opening hours and such. English/French visitors get an overview of the city, the major tourist attractions, some offices and opening hours. You see that the menu and taxonomy structure will overlap a bit, and to help translators, we should relate different language versions together. Complete subsites will only exist in one language. Compare http://www.berlin.de/ and http://www.berlin.de/english/index.html for a real world example.
Use case 2:
International newspaper having different editions around the globe, or different language local variants. Think of National Geographic or the Belgian Metro newspaper which is published in French and Dutch. You can also think of product manuals on a website. These ideally have the same content published, so original language and translated language variants always find their matching pairs. Structure should be the same, menus, taxonomy should be equal, just the actual text differs.
I think Drupal should support both. As far as I see from your intentions, you are solely focused on multilingual nodes. But multilingual sites are much farther from there. It can happen that a different language site variant is a complete translation, but it rather happens that whole categories should be removed/added in the translated variant, that the menu structure is at least slightly different and such.
You say people working on different languages for the town, newspaper or product manuals should have the rights to the original text. Newspaper teams or European governmental sites having stricter workflows (which is possible if we use first class nodes with the workflow module, if we explicitly not deny it with a restictred design) will not agree with you. I see translation projects having multistep workflows defined, so that a text's translated version needs to go through some hoops before being published.
Why I say is that we should think of the translated version as a multisite scenario (conceptually!) is because different language variants of a site will not overlap with each other in their structure, forum topics, permissions, themes (logo and other images at least). They should overlap in users, and should fall back to a default language is several circumstances though. I perfectly agree that it would be best to have everything consolidated into one site. I just tried to point out that conceptually we are introducing multisite-like features (regardless of whether Jose's or some other route is taken).
OK, it is possible to integrate the language dependant variable handling directly into variable_get/set. You will not be happy about performance, but then so be it.
You are free to come up with ideas for this issue beyond to the node system. What we are trying to do here is to intorduce a language column into relevant tables, so we keep all objects first class (which is a required outcome), and then apply language dependant data handling throughout Drupal. I don't see how your patterns on the node table extend to the menu, taxonomy, variables, profiles, cck fields and other objects. The language aware first class objects way offered here is one possible solution. I see you are not happy with the concept and not happy with the implementation.
We should probably discuss the broader concept first, before picking on implementation and then refine an implementation for the aggreed on concept. I don't think code is golden then discussion here. At least until we have such fundamentally different thinking about what should an i18n system be capable of. I would like to get the most out of the system.
Comment #14
jose reyero commentedWell, it seems that we have all kind of opinions here :-) I fully agree with most of Goba's answers above so only some comments:
Sure. That's the idea from the beginning, as stated in the first post.
This is just an overview of 'how everything fits together' and my intention is to maintain both, this one and a set of smaller patches, they're coming...
Strongly disagree. You can not like any of them , but this patch is a much cleaner and better code than the module. Just comparing 'relative' quality.
No, they weren't. Actually I've been considering both approaches -and more- and eventually I went for this one. Not only because it was already implemented, but because IMHO it's more workable, fits much better with Drupal data model and is much more flexible.
Gerhard had quite a different -I have to say also interesting- view of the whole thing. I think all this is summarized here: http://www.developmentseed.org/blog/internationalization/further -Drupal i18n report I wrote some months ago.
So please, let's be constructive and let's try to get something 'committable' at least. Not this big patch but the smaller ones -in the works-.
While I like these kind of philosophical discussion I think it would be better somewhere else, like in a threaded discussion on the DEP page or the forum. So I will try to focus on this approach here and on the code, but I really would like to see some other approaches -and patches- somewhere else.
Like most of the people, I just want the features. Not specially this code or this approach. Only it happens to be the more workable one I can think of.
About the 'hackish' code, I somehow agree about some parts. And for that I will try to rethink some features. The main problem here was trying to get something that had nearly zero performance impact when disabled, but it seems to be difficult -maybe for my PHP skills- to get it working without some kind-of hacks like that ones for variable functions.
So I might post a next version dropping the whole concept of 'i18n layer' and going for a more integrated one. Thinking about it... -Please help, comment-
Note: I will be off-line for a few days, but coming back with more patches around mid next week.
Comment #15
sunSubscribing... and: Regarding to use case 1 in #13 such kind of "internationalization" should be done by path redirects. I don't see any magic in creating a menu item "English" (/english) displaying a custom page or taxonomy along with the submenu entries of /english. You even do not need i18n to accomplish this. However, menu items or taxonomy should support forcing a specific language so that blocks like in the given example won't be displayed in the default language.
Comment #16
Frando commentedComment #17
sunAfter discovering Localizer module I'd rather prefer to get that complete module into core someday.
Comment #18
jose reyero commentedWe've already got many of this features with different patches so this one doesn't make sense anymore