Problem/Motivation

  • DeepL introduced the option to use a custom glossary when translating text and documents

Proposed resolution

  • Allow creation of custom glossaries via the DeepL API in Drupal
  • Add a config option to specify the glossary ID in the tmgmt_deepl UI
  • Pass glossary_id in API calls when requesting translations
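
A minimal sketch of the last point, written in Python for brevity rather than as the module's PHP code. The function name and the example glossary ID are hypothetical. It assumes, as the DeepL API documentation describes, that a request carrying glossary_id must also set source_lang, because a glossary is bound to one language pair.

```python
from typing import Optional


def build_translate_params(
    text: str,
    source_lang: str,
    target_lang: str,
    glossary_id: Optional[str] = None,
) -> dict:
    """Build the form parameters for a translate request (illustrative)."""
    params = {
        "text": text,
        # source_lang is always sent explicitly, since a glossary only
        # applies to the language pair it was created for.
        "source_lang": source_lang.upper(),
        "target_lang": target_lang.upper(),
    }
    # Only add glossary_id when a glossary was configured or selected.
    if glossary_id:
        params["glossary_id"] = glossary_id
    return params


# "abc-123" is a made-up ID used only for illustration.
params = build_translate_params("Eignungsprüfung", "de", "en", "abc-123")
```

Leaving the key out entirely when no glossary is chosen keeps requests without a glossary identical to what the translator sends today.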

User interface changes

  • Add select field for selecting glossary on the translator config form
  • Administration page for glossaries

Comments

SteffenR created an issue. See original summary.

SteffenR’s picture

Version: 2.0.0 » 2.1.x-dev
SteffenR’s picture

Title: Add new option to specify the glossary to use for the translation » Add new option to specify a glossary_id used for the translation
Assigned: SteffenR » Unassigned
Status: Active » Postponed

Due to limitations of the DeepL API, we cannot use glossaries created on the website or in the DeepL apps.
Instead, we have to implement a custom solution based on Drupal entities that creates glossaries via the API documented at https://www.deepl.com/docs-api/managing-glossaries/.

More information about the topic can be found here: https://support.deepl.com/hc/en-us/articles/4405021321746-Managing-gloss...

SteffenR’s picture

Title: Add new option to specify a glossary_id used for the translation » Allow use of glossaries for translations
Issue summary: View changes
SteffenR’s picture

Issue summary: View changes
daveiano’s picture

Any updates on this?

Would love to see this feature added :)

SteffenR’s picture

@daveiano
Feel free to contribute.
Right now, I have no plans to integrate the feature, because I'm involved in a time-consuming project that leaves little time for extra contributions.

While implementing the feature, we should take care of the following things

  • administer glossaries as custom drupal entities (allow multiple glossaries depending on API key of configured deepL translator)
  • glossary entity with following fields
    • Internal name (only used for drupal)
    • glossary id (used as unique identifier for drupal and API)
    • Source language
    • Target language
    • Glossary entry with Source Target values (Multiple value field)
  • possibility to manage 1-x glossaries
  • restrict selectable source/ target languages
  • custom validation for glossary entries
  • API integration
    • TBD: possibility to sync glossary to API or "add new + delete old" on saving glossary entity
    • check if entries have changed
    • show latest sync date
    • sync glossaries (multi user scenario) / kind-of content locking if multiple users try to edit same glossary
  • passing glossary while translating content
    • auto-select latest glossary while translating based on source/ target language
    • let user select glossary (restricted to available glossaries based on source/ target language)

As you can see, there are several things we need to take care of when integrating the glossaries.
For storage purposes, a custom entity type would be the best fit. Syncing between Drupal and DeepL could be achieved by a cron job or a manual sync process.
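
To make the field list above concrete, here is a rough sketch of the glossary data and how its entries could be serialized for the API. It is Python for illustration, not the Drupal entity definition. The class and method names are hypothetical, and it assumes the tab-separated entries format (one source/target pair per line) described in the DeepL glossary documentation.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryEntry:
    subject: str      # term in the source language
    definition: str   # translation to enforce in the target language


@dataclass
class Glossary:
    name: str                 # internal label, only used in Drupal
    source_lang: str
    target_lang: str
    glossary_id: str = ""     # assigned by the API after the first sync
    entries: list[GlossaryEntry] = field(default_factory=list)

    def to_tsv(self) -> str:
        """Serialize entries as tab-separated lines, one pair per line."""
        rows = []
        for entry in self.entries:
            subject = entry.subject.strip()
            definition = entry.definition.strip()
            if not subject and not definition:
                continue  # fully emptied row, treated as deleted
            if not subject or not definition:
                raise ValueError("Each entry needs a subject and a definition")
            if "\t" in subject or "\t" in definition:
                raise ValueError("Entries must not contain tab characters")
            rows.append(f"{subject}\t{definition}")
        return "\n".join(rows)
```

Raising on half-filled rows, instead of dropping them, is one way to cover the "custom validation for glossary entries" item so that a mistake does not fail silently.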

API documentation: https://www.deepl.com/de/docs-api/managing-glossaries/

SteffenR’s picture

Status: Postponed » Active

Since I'm currently working on the issue, I'll set the status to Active.

Current progress is available at https://git.drupalcode.org/issue/tmgmt_deepl-3232134/-/tree/3232134-glos...

daveiano’s picture

I had a quick look over the commits. It seems the Glossary management part on the Drupal side is done, also the sync to Deepl?

What I don't get is why the deepl_glossary entity has translatable = FALSE? How to define the translations for the glossary items?

SteffenR’s picture

@daveiano:

The deepl_glossary entity won't get translated, cause you build up a glossary entity with its entries for a specific language pair. While running checkout, those glossary entities will be available for selection in the translation checkout process (since we have access to source/ target language in the checkout, we can check for possible glossary entities to let the user choose from).

Just have a look at the API documentation https://www.deepl.com/docs-api/managing-glossaries/

This is done roughly on my dev machine, but needs some polishing to get pushed to the branch.
Furthermore a cron will be added, to fetch glossaries from the deepl API automatically.

daveiano’s picture

The deepl_glossary entity won't get translated, cause you build up a glossary entity with its entries for a specific language pair. While running checkout, those glossary entities will be available for selection in the translation checkout process (since we have access to source/ target language in the checkout, we can check for possible glossary entities to let the user choose from).

Just have a look at the API documentation https://www.deepl.com/docs-api/managing-glossaries/

I am afraid I don't get it. How will the glossary entries then be translated?

I cloned the current state of the branch and tested a bit around. I noticed that the implementation of the "selectable glossaries" is implemented in the checkoutSettingsForm. To be honest, I had never seen this form. In my use case, people usually translate a node directly via the node translate tab and the "Request translation" button. The checkoutSettingsForm is only visible when adding a "Continuous Job", so no way to use glossaries with a "manual translation"?

Also: The adding of a glossary entry without adding a definition fails silently (I don't think the definition should be required).

Sorry for asking so many questions, just want to understand it :D

SteffenR’s picture

@daveiano:

Let me explain.
The glossary entity stores entries for a specific language combination e.g. EN (English) → DE (German).
A translation of those entities would make no sense for this case.

DeepL uses the entries of the glossary and simply replaces our defined subject with the given definition.
This can be used, if you want a really specific translation for a given term, that differs from the translation deepl would provide while translating.

For that purpose, we have to provide a matching glossary with its entries while requesting a translation.
You can think of multiple glossaries per language pair or just single combination. But in all use cases, we need a selection of the glossary, we want to use for a given job.

The checkout form is the best place to add this information to the current translation job (you won't need to build a continuous job for that purpose). It's just the default translation workflow provided by the underlying tmgmt module. You may take tmgmt_file, or other tmgmt translator providers, as an example of how to apply custom checkout settings.

Hope it helps to understand, how the whole glossary "thing" is working. The API documentation also provides lots of infos, how this is working internally. Just check the link in my previous comments.
A recent blog post of deepL also describes the use of glossaries https://www.deepl.com/en/blog/translate-your-way-with-the-deepl-glossary

daveiano’s picture

@SteffenR

Ah, I got you, thank you for the explanation.

With that in mind I had a look over MR10 and tested it again:

One reason for confusion on my side was the TMGMT setting "Allow quick checkout" (/admin/tmgmt/settings). If this is enabled, you will never see the checkout form (only for continuous jobs, hence my confusion). I disabled this and now I can see and select a glossary for a translation job. Will we be able to select multiple glossaries per translation job?

And I think I found a Bug, I created a glossary with one entry (DE > EN):
subject: Eignungsprüfung (allgemein)
definition: aptitude test

this is my translation source: Eignungsprüfung (allgemein)
and this is the translated string from deepl: Aptitude test (general)

So the translation is correctly taken from the glossary, but the part after the space gets also translated.

So far my testing results, thank you for the explanation and the good work on this!

SteffenR’s picture

@daveiano
This is not a bug in the module. The module only stores the subjects/ definitions in Drupal and pushes those entries into a DeepL glossary.
While translating, we only pass the ID of the glossary that should be used for the translation.
The replacement itself happens on DeepL's side.
You can use the DeepL desktop app to check how glossary entries get replaced. With your example, you'll get the same result. The word (allgemein) is also getting translated unless it should be part of the glossary entry.
This issue can only be addressed by DeepL.

mrshowerman’s picture

Thanks @SteffenR for working on this feature!

I had the same confusion as @daveiano in #15; does this mean that if Quick Checkout is enabled, no glossary will be used?
If that's the case, couldn't we auto-select one, especially if there's only one glossary available for the current language combination?

I was also wondering why there's no option to remove glossary entries. Is this due to API restrictions? It is possible to delete entries on DeepL's website, though.

SteffenR’s picture

@mrshowerman
Right now the quick checkout (auto-accept finished translations) doesn't support glossaries. But your suggestion could be integrated.
We should also think about the case of multiple glossary entities with the same source and target language for a translation job.
Which glossary should be used in this case?
Do we need a setting to enable/ disable automatic glossary selection during quick checkout (auto-accept)?

Editing / deleting of glossary entries is possible, since the module just uses a standard multiple field widget for editing the entries.
Deleting a single item in a multiple items field is another issue, but this is already being addressed in https://www.drupal.org/project/drupal/issues/1038316.

The API of deepL is "kind of limited" in its editing capabilities. You cannot edit existing glossaries directly, that's why, we delete/ recreate the glossary on "deepL side".
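
A rough sketch of that delete/ recreate cycle, in Python rather than the module's PHP. GlossaryClient, replace_glossary and InMemoryClient are hypothetical names, and the in-memory client only stands in for the remote API. Creating the new glossary before deleting the old one is an assumption of this sketch, not necessarily the order the module uses.

```python
import itertools
from typing import Dict, Protocol


class GlossaryClient(Protocol):
    """Hypothetical minimal client interface, not the module's real one."""

    def create(self, name: str, source_lang: str, target_lang: str,
               entries_tsv: str) -> str: ...

    def delete(self, glossary_id: str) -> None: ...


def replace_glossary(client: GlossaryClient, old_id: str, name: str,
                     source_lang: str, target_lang: str,
                     entries_tsv: str) -> str:
    """Recreate a glossary with new entries and return the new glossary_id."""
    # Create first, so a failed request never leaves the site without a glossary.
    new_id = client.create(name, source_lang, target_lang, entries_tsv)
    if old_id:
        client.delete(old_id)
    return new_id


class InMemoryClient:
    """Stand-in for the remote API, for demonstration only."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}
        self._ids = itertools.count(1)

    def create(self, name, source_lang, target_lang, entries_tsv) -> str:
        glossary_id = f"glossary-{next(self._ids)}"
        self.store[glossary_id] = entries_tsv
        return glossary_id

    def delete(self, glossary_id: str) -> None:
        del self.store[glossary_id]
```

The caller has to store the returned ID on the Drupal entity, since every edit produces a new glossary_id.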

mrshowerman’s picture

Thanks for pointing out how to delete entries. I think this is another good reason to fix that UI issue in #1038316: Allow for deletion of a single value of a multiple value field sooner rather than later; even experienced users tend to forget you simply have to empty the input field…

Concerning auto-selection of glossaries: I had been thinking about what to do in cases where more than one glossary matches, and the option to mark a glossary as default also came to my mind. But there can be cases where this still isn't unique, that's why I suggested to auto-select only when there's exactly one match.
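
The "exactly one match" rule could look roughly like this. It is a Python sketch with hypothetical names, where each glossary is reduced to a (glossary_id, source_lang, target_lang) tuple.

```python
from typing import Optional, Sequence, Tuple

# (glossary_id, source_lang, target_lang)
GlossaryRef = Tuple[str, str, str]


def auto_select_glossary(
    glossaries: Sequence[GlossaryRef],
    source_lang: str,
    target_lang: str,
) -> Optional[str]:
    """Return a glossary ID only if exactly one glossary fits the pair."""
    matches = [
        glossary_id
        for glossary_id, src, tgt in glossaries
        if src.lower() == source_lang.lower()
        and tgt.lower() == target_lang.lower()
    ]
    # Zero or several matches: make no automatic choice.
    return matches[0] if len(matches) == 1 else None
```

Returning None for the ambiguous case leaves room for a later "default glossary" flag or a manual choice in the checkout form.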

vistree’s picture

Hi SteffenR - nice work!!! Is there any release plan? Is there a way to support with testing, ...?

SteffenR’s picture

@vistree
We don't have a release plan for the feature, since most of the development happens alongside our main projects.
But the current MR can be tested.
Feel free to do so and add comments in this issue.

Short instructions for testing:

  • install tmgmt_deepl and configure the free and/ or pro endpoint with your API key
  • install tmgmt_deepl_glossary
    • a new section DeepL Glossaries will be available at admin/tmgmt
    • start adding new glossaries at /admin/tmgmt/deepl_glossaries
    • create new translations via default tmgmt workflow - in case glossary is existing for source -> target language it will be used in your translation
    • if you have multiple glossaries for source -> target language combination, you can select between those glossaries (option needs to be set in tmgmt translator settings)

There is still need for some unit tests and polishing of error messages, but the main functionality is working.
I'll rebase against the latest 2.1.x/ 2.2.x release, so don't be surprised about the commits in the next comment.

  • SteffenR committed 6ac06a3a on 2.2.x
    Issue #3232134: Allow use of glossaries for translations
    
SteffenR’s picture

Version: 2.1.x-dev » 2.2.x-dev
vistree’s picture

Hi SteffenR,
really nice work!! Thank you!!!
One question: we want to use DeepL with glossaries in multiple projects. Is it somehow possible to uninstall the module without deleting the glossary in the DeepL backend? So, remove it only in Drupal, but let it stay on DeepL?

SteffenR’s picture

@vistree
Currently this is not possible, because deleting a glossary would result in an API Call, which also deletes the glossary on deepL to keep things consistent.
If you have multiple sites, which should use the same glossary, you can run into some problems due to the API.

  • glossary entries cannot be edited directly via the API - instead, we have to do a delete/ add operation -> leads to new glossary_ids with every change
  • need of a cron to synchronize glossaries, instead of manual action on the glossary overview page
  • not deleting would lead to obsolete glossaries on the API side (we cannot delete those via any UI besides the module)
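
To illustrate the first two points: if a second site edits a shared glossary, the ID stored on the first site stops existing. A sync job could detect this by comparing the stored IDs with the IDs the API currently lists. The helper below is a hypothetical Python sketch of that comparison only, and the IDs in the example are made up.

```python
from typing import Iterable, List


def find_stale_glossary_ids(
    local_ids: Iterable[str],
    remote_ids: Iterable[str],
) -> List[str]:
    """Return locally stored glossary IDs that no longer exist remotely."""
    remote = set(remote_ids)
    return sorted(gid for gid in set(local_ids) if gid not in remote)


# Site B still references "g-old", which site A replaced with "g-new".
stale = find_stale_glossary_ids(["g-old", "g-fr"], ["g-new", "g-fr"])
```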

Since the glossary is directly tied to the deepL API token (free or paid), you can create one glossary per project and use it standalone.

Another approach for sharing multiple glossaries would be a "main site", where the glossaries can be edited/ administered for all projects. But for this approach the module needs some additional features like:

  • more granular permissions for deleting/ editing glossaries
  • cron for automatic synchronisation of glossaries
SteffenR’s picture

Status: Active » Needs review

I set this issue to "Needs review" to get some feedback from the community.
Please play around with the feature and leave comments in the issue for getting a 2.2.0 release.
Alpha release is available for testing.

  • SteffenR committed 92f900d7 on 2.2.x
    Issue #3232134: Allow use of glossaries for translations
    
DuaelFr’s picture

I'll have to implement that in the next months. I'll provide feedback as soon as it reaches the top of my to-do list.
Thank you for implementing that!

SteffenR’s picture

@DuaelFr: I just wanted to ask whether everything went fine while implementing the feature?

mrshowerman’s picture

@SteffenR, thanks very much once more for the great work on the glossary integration.
We started to use the new submodule, and it's working nicely. I was also surprised that we're now able to keep the "quick checkout" setting active, much appreciated!

I'd like to give feedback on one thing that came to my mind:

In order to enable editors to add, edit and delete glossary entries, it seems that not only the appropriate permissions are needed, but also the "Administer DeepL glossary entities" permission; otherwise they don't see the menu item. We might want to change this, so editors are able to edit glossary entries while not being permitted to edit the glossary entity itself.

SteffenR’s picture

@mrshowerman: Further work on this topic should be covered within a new issue. Could you check the Proposed resolution there and comment in case I'm missing something?
Thx.

mrshowerman’s picture

Left a comment in the related issue.

I also noticed a few wording issues concerning the new glossary functionality. Since this issue is still in status Needs Review, should I leave comments here, or do you prefer a separate issue, @SteffenR?

SteffenR’s picture

Yep. Just leave a comment in here. I'll take care of those issues. Thx.

mrshowerman’s picture

Ok, here we go:

  • In modules/tmgmt_deepl_glossary/src/Entity/DeeplGlossary.php, line 147:
    // The entries of of the glossary.
    

    => Duplicate "of". Same in line 151 of the same file, and also in modules/tmgmt_deepl_glossary/src/DeeplGlossaryApiBatchInterface.php (line 32) and modules/tmgmt_deepl_glossary/src/DeeplGlossaryApiInterface.php (line 57).

  • There are the terms "DeepL glossary" and "DeepL Glossary" (also in plural form), we might want to use only one of the two forms.
  • Similarly, we're mixing the terms "Glossary Id" and "DeepL glossary Id".
  • In modules/tmgmt_deepl_glossary/tmgmt_deepl_glossary.module, line 74:
        '#description' => t('By default it is possible to create one glossary for a source/ target language combination and the matching glossary will be selected automatically in the translation workflow. This will also enable selection of glossaries in the checkout form of translation job.'),
    

    This text is hard to read. What about this:

        '#description' => t('By default, it is possible to create only one glossary per source/target language combination, and the matching glossary will be selected automatically in the translation workflow. This setting will also enable selection of glossaries in the checkout form of a translation job.'),
    
  • In modules/tmgmt_deepl_glossary/tmgmt_deepl_glossary.permissions.yml, line 2:
    delete deepl_glossary entities:
      title: Delete DeepL glossary entities.
    

    => Either remove the trailing dot, or add it to the other permission titles as well.

  • In modules/tmgmt_deepl_glossary/src/Form/DeeplGlossarySyncForm.php, line 57:
        return $this->t('Sync deepL glossaries');
    

    Should be

        return $this->t('Sync DeepL glossaries');
    

    Same applies to lines 64 and 71.

  • Similar issue in modules/tmgmt_deepl_glossary/config/install/views.view.tmgmt_deepl_glossary.yml, lines 9 and 440.
SteffenR’s picture

Assigned: Unassigned » SteffenR
Status: Needs review » Active

  • SteffenR committed 3a90dcd8 on 2.2.x
    #3232134: fix phpstan/ phpcs issues in tmgmt_deepl_glossary submodule
    

  • SteffenR committed 5aa1ff70 on 2.2.x
    #3232134: fix typos - use 'DeepL' instead of 'deepL'
    

  • SteffenR committed a80c4341 on 2.2.x
    #3232134: update title of permissions - set title in single quotes and...

  • SteffenR committed ab62170c on 2.2.x
    #3232134: fix typo in DeeplGlossaryListBuilder - title for glossary_id...

  • SteffenR committed c117f162 on 2.2.x
    #3232134: fix typos - use "DeepL glossary" instead of "deepL glossary"
    

SteffenR’s picture

@mrshowerman Thx for the findings. I've changed the code according to your suggestions.
All occurrences of deepL were changed to DeepL (as in their API documentation) and the description text was updated.
I also ran phpstan/ phpcs against the tmgmt_deepl_glossary submodule and added some fixes.

The time was sponsored by my lovely daughter, who is currently at the swimming competition. (Dad taxi 🚕🚕🚕)

SteffenR’s picture

Status: Active » Needs review
mrshowerman’s picture

Status: Needs review » Reviewed & tested by the community

Thanks a lot! Also to your lovely sponsor 🏊‍♀️🏊

SteffenR’s picture

Status: Reviewed & tested by the community » Fixed

  • SteffenR committed 8ab1dda4 on 2.2.x
    #3232134: improve method getValidSourceTargetLanguageCombinations
    
     - we...

  • SteffenR committed 17f96d15 on 2.2.x
    #3232134: add glossary languages: Portuguese, Russian, Chinese
    
SteffenR’s picture

Assigned: SteffenR » Unassigned

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.