There are already connectors ready for drupal.org projects and for projects maintained in local tarballs. These are nicely separated from the locale tables (and should be), so having our own tables is the right direction.

l10n_server database structure

Our tables implement the following relations:

projects - releases - files - lines - strings - translations

and also

releases - extraction errors

So a string is connected to a line of a specific file, which is part of a release of a specific project, and we also store the extraction errors found when a release is scanned for translatables. This works nicely for the existing two connectors. Our model also stores strings with plural versions as one row, i.e.:

1 new\0@count new

And the translation as

1 nowy\0@count nowe\0@count nowych

in the case of Polish translations. This shows nicely that although English plural originals always have two versions, translations can have as many as the language requires. We don't need to look up different plural versions on their own, so this "serialized" storage does not lead to any performance problems, but at the same time allows easy relations between source and translations/suggestions and easy counting of strings, untranslated strings, and so on. When trying to make l10n_server use the internal Drupal database, this causes the most problems (apart from suggestions not being possible, which we could easily disable in the code if we wanted to).
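As a minimal sketch of the serialized storage described above, the plural variants can be joined and split on the NUL character. This is illustrative Python, not l10n_server code; the helper names `pack_plurals` and `unpack_plurals` are hypothetical.

```python
# Sketch only: these helper names are hypothetical, not l10n_server functions.
PLURAL_SEPARATOR = "\0"  # the "\0" shown in the examples above


def pack_plurals(forms):
    """Join the plural variants of one string into a single stored value."""
    return PLURAL_SEPARATOR.join(forms)


def unpack_plurals(value):
    """Split a stored value back into its plural variants."""
    return value.split(PLURAL_SEPARATOR)


# The English source always has two variants, the Polish translation has three.
source = pack_plurals(["1 new", "@count new"])
translation = pack_plurals(["1 nowy", "@count nowe", "@count nowych"])
```

Because each source and each translation stays a single value, counting strings or untranslated strings remains a plain row count.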

Built-in locale table structure

The built-in locale database uses the following database structure:

lid  | source
-----+----------------
163  | 1 new
276  | @count new
1953 | @count[2] new

lid  | translation      | language | plid | plural
-----+------------------+----------+------+---------
163  | 1 nowy           | pl       | 0    | 0
276  | @count nowe      | pl       | 163  | 1
1953 | @count[2] nowych | pl       | 276  | 2

So a few observations on what is not good for l10n_server mapping:
- sources and translations are stored in different rows (this is good for translation lookups for Drupal, but not good for l10n_server)
- we have no idea of plural relations just by looking at the source table (eg. when there are no translations of these strings yet)
- if more than two plural forms are required, even the source table needs to be modified, so the list of sources depends on the language we need to translate to (ugh)
- "extra" plural versions use @count modified with an index to distinguish between different versions
- even the translation table only stores backreferences for plural strings, so we need to look back to find all plurals
- we also need to look at the plural indexes (last column) because it is not guaranteed that translations get into the table in order, so we need to order by plural index too
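The last few observations can be illustrated with a small sketch that collapses the example translation rows above into one serialized string. It is written in Python for illustration only; the function names are hypothetical, and stripping the `[2]` index from `@count` is an assumption about how the two formats would line up. It follows the plid backreferences to find the singular row, then orders by the plural index.

```python
import re

# Example rows from the translation table above, deliberately out of order:
# (lid, translation, language, plid, plural)
targets = [
    (1953, "@count[2] nowych", "pl", 276, 2),
    (163, "1 nowy", "pl", 0, 0),
    (276, "@count nowe", "pl", 163, 1),
]


def chain_root(lid, plid_of):
    """Follow plid backreferences until the singular row (plid == 0).

    Assumes the backreferences contain no cycles.
    """
    while plid_of.get(lid, 0) != 0:
        lid = plid_of[lid]
    return lid


def collapse_translations(rows, language):
    """Group plural rows by their singular lid and serialize each group."""
    rows = [r for r in rows if r[2] == language]
    plid_of = {lid: plid for lid, _, _, plid, _ in rows}
    groups = {}
    for lid, text, _, _, plural in rows:
        root = chain_root(lid, plid_of)
        groups.setdefault(root, []).append((plural, text))
    result = {}
    for root, forms in groups.items():
        # Insertion order is not guaranteed, so sort by the plural index.
        ordered = [text for _, text in sorted(forms, key=lambda f: f[0])]
        # Assumption: the indexed placeholder becomes a plain @count again.
        ordered = [re.sub(r"@count\[\d+\]", "@count", t) for t in ordered]
        result[root] = "\0".join(ordered)
    return result


serialized = collapse_translations(targets, "pl")
```

Note that this only works for translations: the source table carries no plid column, so source rows cannot be grouped this way until at least one translation exists.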

Our problem is to map these to each other

The question is how we should map the table design optimized for string translation lookup to the table design optimized for metadata storage and translation interface presentation.

Comments

Gábor Hojtsy

So we could go two basic ways about this:

- map the l10n_server functionality to the built-in locale database
- have the data in both places and copy stuff over at all times

Mapping

Mapping would mean we store strings and translations in the Drupal built-in locales_source and locales_target tables. The problem with this is that plurals are stored differently (as explained above). So we would need mapping tables for the source and target data, which would store the plurals (and refer to locales tables for single string translations). The locales tables also have no possibility to support suggestions and storage of editing history, which might or might not be an issue on local sites. I would say local sites need simpler functionality, so we might as well disable suggestions there even if our data model supports it (or at least disable by default and allow it to be enabled).

Copying over data

The obvious problem here is, of course, maintaining consistency. When the module is installed, we need to copy over the data from the locale source and target tables. We need to disable the locale module's translation (search) and import interfaces (via form_alter or menu overrides) and tell people to use the l10n_server interface instead. Then l10n_server is the only entry point for importing or editing translations, BUT the locale module still collects translatable strings itself whenever t() is called with something that is not in the DB yet, so we should still be ready at all times to sync again with the locales DB. This means we never gain full control over the locale tables.
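A rough sketch of the re-sync check this implies, using an in-memory SQLite database in Python for illustration. Everything here is an assumption except the locales_source table name: the table is reduced to two columns, and `l10n_copied` is a hypothetical bookkeeping table recording which lids were already copied.

```python
import sqlite3


def find_unsynced_sources(conn):
    """Return locale source rows that have not been copied over yet."""
    return conn.execute(
        "SELECT s.lid, s.source FROM locales_source s "
        "LEFT JOIN l10n_copied c ON c.lid = s.lid "
        "WHERE c.lid IS NULL ORDER BY s.lid"
    ).fetchall()


conn = sqlite3.connect(":memory:")
# Simplified stand-in for the built-in source table (real one has more columns).
conn.execute("CREATE TABLE locales_source (lid INTEGER PRIMARY KEY, source TEXT)")
# Hypothetical bookkeeping table: lids already copied to the l10n_server side.
conn.execute("CREATE TABLE l10n_copied (lid INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO locales_source VALUES (?, ?)",
    [(163, "1 new"), (276, "@count new")],
)
conn.execute("INSERT INTO l10n_copied VALUES (163)")

# Rows that t() added since the last copy and that still need syncing.
unsynced = find_unsynced_sources(conn)
```

A check like this would have to run repeatedly, which is exactly the consistency burden described above.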

I don't necessarily like either of these two ways, although admittedly data copying is easier to implement with the current system. I would welcome any feedback, and especially better ideas though.

Gábor Hojtsy

Status: Postponed (maintainer needs more info) » Closed (works as designed)

This is not going to happen. Instead, l10n_server is now extended to be able to receive remote translation submissions from l10n_client and l10n_client can submit translations remotely to one server.