Hello Drupal community,
could you please consider and criticize the following?
Yesterday I finished fixing synchronization bugs in the i18n module, and this morning an idea came to me on how to build true multi-language support in Drupal. It is a fresh idea, and maybe a stupid one.
In short: we can translate queries, not tables.
You see, the approach of the i18n module is to translate tables by prepending a language code to the table name. What worries me is the number of things this affects. For example, the menus module: i18n turns $db_prefix into an array, which produces a PHP notice, which in turn produces an error in the menus module. The i18n module synchronization had some bugs itself (see http://drupal.org/node/view/10708), and you can probably find more bugs here at Drupal.
And what is probably worse is the approach of the i18n module itself: we are duplicating objects. Instead of having one document with language versions, we have two or more versions of tables that should contain the same objects. But this is true only at the beginning; later, the objects start to live their own lives, which makes synchronization difficult.
My point is this: we do not need to create different tables for different languages. We need only some content to be translated, and therefore we can alter queries before they are executed.
instead of
cz_node
=======
nid | type  | title | story  | score | created ...
------------------------------------------------
1   | story | můj-X | tady.. | 5245  | 101245
en_node
=======
nid | type  | title | story  | score | created ...
------------------------------------------------
1   | story | myX   | text   | 1255  | 101245
we could have
node
=======
nid | type  | title | cz_title | story | cz_story | score | created ...
------------------------------------------------
1   | story | myX   | můj-X    | text  | tady...  | 1255  | 101245
Doing this, we keep our objects un-duplicated: they share the same administrative data and they are updated properly. The user can choose what he/she wants to translate (exactly which columns of the table should be translated). And the other modules? The other modules will be totally unaware that something has changed; nothing changes for them.
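To make the single-table layout concrete, here is a minimal sketch in Python with SQLite (not Drupal's PHP, and not code from the i18n module). The helper name `load_node` and the `TRANSLATED_LANGS` set are assumptions made only for illustration; the column names follow the table above. The translated columns are aliased back to their plain names, which is one possible way the calling code could stay unaware of the language.

```python
import sqlite3

# Hypothetical set of languages that have extra columns in the table.
TRANSLATED_LANGS = {"cz"}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (nid INTEGER PRIMARY KEY, type TEXT, "
    "title TEXT, cz_title TEXT, story TEXT, cz_story TEXT, "
    "score INTEGER, created INTEGER)"
)
conn.execute(
    "INSERT INTO node VALUES "
    "(1, 'story', 'myX', 'můj-X', 'text', 'tady...', 1255, 101245)"
)


def load_node(conn, nid, lang=None):
    """Load one node; translated columns are aliased to their plain names."""
    if lang is not None and lang not in TRANSLATED_LANGS:
        raise ValueError(f"no translated columns for language {lang!r}")
    prefix = f"{lang}_" if lang else ""
    sql = (
        f"SELECT nid, {prefix}title AS title, {prefix}story AS story, "
        "score, created FROM node WHERE nid = ?"
    )
    row = conn.execute(sql, (nid,)).fetchone()
    return dict(zip(("nid", "title", "story", "score", "created"), row))


default_node = load_node(conn, 1)
czech_node = load_node(conn, 1, "cz")
```

Both results come from the same row, so `score` and `created` cannot drift apart between languages the way they do in the `cz_node` / `en_node` example.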
Consider this query - it creates new vocabulary:
INSERT INTO vocabulary (name, nodes, description, multiple,
required, hierarchy, relations, weight, vid) VALUES ('New vocabulary',
'page', '', '0', '1', '2', '0', '0', '88')
before it is issued (that means - somewhere before _db_query() ) it may be altered to this
INSERT INTO vocabulary (name, nodes, en_description, multiple,
required, hierarchy, relations, weight, vid) VALUES ('New vocabulary',
'page', '', '0', '1', '2', '0', '0', '88')
Do you see "en_description"? Only one column needs to be translated, not the whole table. We can change the original query; we do not need to make separate tables. Of course, the columns must exist, but this would be a task for the administrative part of the module: in the administration stage, when the user enables multi-language versions of something, the columns will be created automatically and their names stored.
the problems are:
1. to make an administrative API so users can see their actual tables and choose the columns for translation
2. to identify the table and then rewrite the columns inside the query before it is executed (and to do it efficiently)
3. to guide users when selecting columns - it is not a good idea to synchronize "id"
I think I can see how to do it, therefore it is not difficult (it cannot be difficult when I am able to see it :) and I will probably try to finish it (unlikely before the end of the month).
What is your opinion? Will it work? Is there some similar approach?
thank you for any comments
romca
Comments
Dear romca, I am a frequen
Dear romca,
I am a frequent user of the i18n functionality, but I have indeed found it to be a difficult module to use.
I think your approach is indeed very good and well designed.
I am wondering what your "approach" for solving the db issue is.
Would that be using another database layer?
Something like database.mysql.i18n.inc ?
And the other thing I am wondering is how the interface should be used and built. I think your idea for the backend is very good, but the main problem now does not really lie with the backend, but more with the front-end.
I am willing to work on something for that. I will post some screen mockups after the weekend on how I think the UI could be changed to make translation easy and structured.
With kind regards,
[Ber | Drupal Services webschuur.com]
Hi, Would that be using an
Hi,
Would that be using another database layer?
I think we will not need any extra database layer; we can use the existing ones. The idea is to change the SQL query after it has been resolved, but before it is executed - that means inside the existing db_query(), or _db_query(), using not SQL but rather regex
for example, we could keep an array of user-selected translation rules
And during replacement, at first identify what table we are going to work with
I am not SQL-wise, but I would guess it will be after the keyword WHERE and maybe some other ones
If it is one of our tables, we would target the area(s) after the keyword WHERE and replace all occurrences of "node" with "cz_node" - the key is normal, the value depends on the selected language
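A rough sketch of these steps, written in Python rather than Drupal's PHP and not taken from any existing module. The names `TRANSLATION_RULES`, `find_tables` and `rewrite_query` are hypothetical. One correction to the guess above: in SQL the table name normally follows FROM, INTO, UPDATE or JOIN rather than WHERE, so the sketch looks there. It also assumes the query has already been resolved and that column names do not appear as bare words inside string values.

```python
import re

# Hypothetical user-selected rules: table name -> columns to translate.
TRANSLATION_RULES = {
    "vocabulary": ["description"],
    "node": ["title", "story"],
}

# Table names follow these keywords in INSERT, SELECT, UPDATE and JOIN clauses.
TABLE_PATTERN = re.compile(r"\b(?:FROM|INTO|UPDATE|JOIN)\s+(\w+)", re.IGNORECASE)


def find_tables(query):
    """Return the table names mentioned in the query, in order."""
    return [match.group(1) for match in TABLE_PATTERN.finditer(query)]


def rewrite_query(query, lang):
    """Prefix every translated column of a known table with the language code."""
    for table in find_tables(query):
        for column in TRANSLATION_RULES.get(table, ()):
            # \b keeps an already prefixed name such as en_description intact.
            query = re.sub(rf"\b{re.escape(column)}\b", f"{lang}_{column}", query)
    return query


original = (
    "INSERT INTO vocabulary (name, nodes, description, multiple, "
    "required, hierarchy, relations, weight, vid) VALUES ('New vocabulary', "
    "'page', '', '0', '1', '2', '0', '0', '88')"
)
rewritten = rewrite_query(original, "en")
```

Queries that touch no table listed in the rules come back unchanged.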
This, of course, adds some execution time. Because the function is called before every query, it would be best to avoid regex and perhaps only split the strings and concatenate them again
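The split-and-concatenate variant could look roughly like this (again a Python sketch, with the hypothetical name `rewrite_without_regex`). It first does a cheap substring check so that most queries are returned untouched, then rewrites whole tokens only. It is deliberately naive: it collapses runs of whitespace, and it would also rewrite a column name that appears as a separate word inside a quoted string value.

```python
# Punctuation that may be glued to a column name in a query.
PUNCTUATION = "(),;="


def rewrite_without_regex(query, table, columns, lang):
    """Prefix the given columns with the language code using only string methods."""
    # Fast path: a query that never mentions the table is returned as-is.
    if table not in query:
        return query
    tokens = []
    for token in query.split():
        core = token.strip(PUNCTUATION)
        if core in columns:
            # Replace only the name itself, keeping the attached punctuation.
            token = token.replace(core, f"{lang}_{core}", 1)
        tokens.append(token)
    return " ".join(tokens)


query = (
    "INSERT INTO vocabulary (name, description, weight) "
    "VALUES ('New vocabulary', '', '0')"
)
czech_query = rewrite_without_regex(query, "vocabulary", {"description"}, "cz")
untouched = rewrite_without_regex("SELECT * FROM users", "vocabulary", {"description"}, "cz")
```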
please, forgive any error in the code above, I do not have any book to hand :)
romca
ack, I forgot to reply to your second, more important question
I think this is really an issue, and I currently have no idea how to do it - there is the dba module that displays the tables, perhaps we could use it to accomplish this work
the help message would instruct users to choose the correct columns - e.g. "please, think before you click - is there any sense in translating numbers? do you want your vocabularies to have language-dependent synonyms? ..."
and, after the column is activated, it will be created
SQL command? I do not know
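For what it is worth, the usual SQL command for this is ALTER TABLE ... ADD COLUMN. Below is a small sketch using Python and SQLite; the helper name `add_translation_column` is hypothetical, and the `PRAGMA table_info` lookup is SQLite-specific, so a MySQL version would have to read the existing column type some other way.

```python
import sqlite3


def add_translation_column(conn, table, column, lang):
    """Create <lang>_<column> on the table; return False if it already exists."""
    new_column = f"{lang}_{column}"
    # Identifiers cannot be bound as SQL parameters, so validate them first.
    for name in (table, column, new_column):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    # Each table_info row is (cid, name, type, notnull, default, pk).
    info = {row[1]: row[2] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column not in info:
        raise ValueError(f"{table}.{column} does not exist")
    if new_column in info:
        return False
    # Reuse the declared type of the original column for its translation.
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {new_column} {info[column]}")
    return True


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vocabulary (vid INTEGER PRIMARY KEY, name TEXT, description TEXT)"
)
created = add_translation_column(conn, "vocabulary", "description", "cz")
created_again = add_translation_column(conn, "vocabulary", "description", "cz")
column_names = [row[1] for row in conn.execute("PRAGMA table_info(vocabulary)")]
```

Existing rows get NULL in the new column, so a fallback to the untranslated column would still be needed when reading.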
romca
Same thoughts
the i18n module synchronization had some bugs
Sure. It may have some incompatibilities with some node types. And it doesn't work with the current CVS version of Drupal because of changes in the Drupal API. Anyway, that's what I call an 'experimental feature'
And again, what is probably worse, is the approach of i18n module - we are duplicating objects
I really prefer the all-or-nothing translation for each object -node-. Yes, it means a little more space in the db but probably a cleaner implementation.
That idea of translating individual fields, I really don't see how you could cleanly implement it without reworking *all of* the node api and the database layer.
IMHO it is much better to have a different object for each language and I'd suggest something like having 'nid,lang,etc...' in the node table and then some other table to keep the relationships between different languages of a module.
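A minimal sketch of what such a layout might look like, in Python with SQLite. The table `node_translation`, its `trid` column and the helper `translations_of` are made-up names for illustration, not an existing Drupal schema. Each language version is its own node, and a second table groups the nodes that are translations of each other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE node (nid INTEGER PRIMARY KEY, lang TEXT NOT NULL, title TEXT);
    -- Nodes sharing a trid are translations of one another (hypothetical table).
    CREATE TABLE node_translation (trid INTEGER NOT NULL, nid INTEGER NOT NULL);
    INSERT INTO node VALUES (1, 'en', 'myX');
    INSERT INTO node VALUES (2, 'cz', 'můj-X');
    INSERT INTO node VALUES (3, 'en', 'English-only story');
    INSERT INTO node_translation VALUES (10, 1);
    INSERT INTO node_translation VALUES (10, 2);
    """
)


def translations_of(conn, nid):
    """Map language code -> nid for every node in the same translation group."""
    rows = conn.execute(
        "SELECT n.lang, n.nid FROM node_translation AS mine "
        "JOIN node_translation AS other ON other.trid = mine.trid "
        "JOIN node AS n ON n.nid = other.nid "
        "WHERE mine.nid = ?",
        (nid,),
    )
    return dict(rows.fetchall())


group = translations_of(conn, 1)
standalone = translations_of(conn, 3)
```

A node that exists in only one language, like nid 3 here, simply has no row in the relationship table.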
About the multi-table idea, which I borrowed from some other module - Walkah's translate_node -, I think it is the simplest cleanest way of implementing i18n without having to heavily patch the core.
I mean, I'm not trying to defend the module I wrote at all. I'd actually love to see a different, better implementation. But I really think what I'm reading here lacks concreteness at the implementation level, and I don't see a clear way to do it either. Which doesn't mean 'drop it', but 'keep on working, but better elaborate the idea before implementing it'.
Anyway, I must admit I sometimes go the other way, which is 'get something working, then elaborate...' ;-)
Regards
Jose
https://reyero.net
I don't like your proposal to
I don't like your proposal too much. It would require changes to the core where the current approach works without. Also, you need to consider that most sites only offer one language, so any additional overhead for those sites is frowned upon.
Also, with the new locale module we just turned away from a similar design for the locale db tables. It just is not very user-friendly.
reply
hi,
thanks for all your thoughts (that is why I wrote this call)
changes to the core:
I do see your point, but I think the changes would not be as huge as you think. The change is great and little at the same time: only these lines inside the db_query function are needed
// rewrite the SQL only when the i18n module is present and enabled
if (function_exists('i18n_rewrite_query') && $i18n_translation_enabled) {
  $query = i18n_rewrite_query($query);
}
so, nothing happens for single-language sites, no overhead
concerning Jose's comment about translating everything:
I really do not prefer to translate everything and duplicate tables, not only because of space, but mostly because of administration. Administrative data really should be the same for both language versions; IMHO, administration would be much easier.
Think about this situation: you create one story in two languages and assign it to several categories (keywords): "facet analysis", "subject analysis", "information architecture", "20th century" - and now you would like to add one more category. How would you do it with i18n?
You must open *all* the created stories and reassign them one by one - if you have 4 languages, it will be a pain. If you make a mistake in one object, it will no longer match the other three.
And even more - when a new module comes along, you must implement i18n support yourself if there is no appropriate hook inside (for example, flexinode has no appropriate hook, so when you need synchronization, you must do it yourself, or hardcode it).
On the other hand, if we could do it by rewriting queries, no hooks are needed (I know, you probably don't like this idea) and the user must only create another column and select synchronization for his/her preferred table-column pair. It is simple and should work for every object/module.
I don't like your proposal too much. It would require changes to the core where the current approach works without.
I know, the current approach works and I like Jose's module; without it, I could not use Drupal at all. But look - I spent many hours before I understood the i18n module (I am not a coder) and only then could I rewrite the synchronization functions to work with Drupal 4.4.1.
I will have to do the same for Drupal 4.5 (you have said that it doesn't work with it) and I will have to do it for every new module that I want to be multi-lingual. And I will have to administer copies of objects whose content is the same. This is what your "current working approach" looks like.
anyway, your critique is highly appreciated
romca
An idea for editing a node with several languages
When the 'edit' option is activated, a bar below it would show the languages available for the content. This bar should look like a sub-menu.
An example: http://fsilva.online.pt/images/nodes/mockup_edit_node_idiom.jpg
Owwch!
That approach would not work too well for all languages. String translation in a query would produce myriads of Unicode problems. Not all DBs allow Unicode on a per-table or per-column basis. This would not solve the basic problem, which is that the core is not open to translation and so needs to have its extensions (modules) translated.
Though it is more work, the actual core of Drupal needs to be reworked.
In the meantime, using fields/additions to the tables would be a better solution than adding tables.
My own feeling is that there is too much hardcoded information in the core, which makes it a mess to deal with. Externalizing that content into easily translatable variables would be a big step and not that hard to do. This would be the only way I would use extra tables: using them to hold variables and their values.
Has anyone tried running Drupal on a database with full Unicode support? MySQL 4.1 is close but still has some quirks. MySQL 5?
Carl McDade
Information Technology Consult
Team Macromedia
www.heroforhire.net
What about non-synchronised usage?
As I see it, there is a problem with this view of things.
But what about sites where the content is NOT synchronised? In other words, where we have DIFFERENT content for different languages (or some content the same and some different).
Personally, I would much prefer the approach of attaching the language to the node (new column in node table)