The strings to remove are human-language words. Either a user has to enter the words for all their languages, or we need to add translation support so that words in other languages can be included as well. That would also help with shipping translations for the module: people could translate the words on localize.drupal.org for various languages.

Comments

Gábor Hojtsy created an issue.

gábor hojtsy’s picture

Project: Token » Pathauto
gábor hojtsy’s picture

Title: Strings to remove missing translation support » "Strings to remove" option missing translation support

Make title clearer.

berdir’s picture

The challenge with this is that we need to be able to pick the right language depending on the entity/language that we're currently working with. E.g. bulk generate with multiple translations.

Simple config translation doesn't help much here, I think we'd actually need to store it explicitly per language.

gábor hojtsy’s picture

You can load the language you need with config translation; it is just that by default it loads in the language of the request. You can set a different language. This snippet from user_mail, annotated for this issue:

  // Remember config language.
  $language_manager = \Drupal::languageManager();
  $original_language = $language_manager->getConfigOverrideLanguage();
  
  // Set whatever language we need the config in.
  $language_manager->setConfigOverrideLanguage($language);

  $mail_config = \Drupal::config('user.mail');

  // Do stuff with the mail config in $language :)
  // [.....]

  // Restore config language.
  $language_manager->setConfigOverrideLanguage($original_language);
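
The same remember/set/restore pattern can be wrapped so that the restore always runs, even if the code in between throws. This is only a sketch in plain PHP: StubLanguageManager and withConfigLanguage() are hypothetical names, and the stub uses langcode strings where Drupal's real language manager works with language objects.

```php
<?php
// Hypothetical stand-in for Drupal's language manager. Only the two
// override-language methods used in the snippet above are modelled, and
// they take langcode strings here instead of language objects.
final class StubLanguageManager {
  private string $overrideLangcode = 'en';

  public function getConfigOverrideLanguage(): string {
    return $this->overrideLangcode;
  }

  public function setConfigOverrideLanguage(string $langcode): void {
    $this->overrideLangcode = $langcode;
  }
}

/**
 * Runs $callback with the config override language set to $langcode and
 * restores the previous language afterwards, even if $callback throws.
 */
function withConfigLanguage(StubLanguageManager $manager, string $langcode, callable $callback) {
  $original = $manager->getConfigOverrideLanguage();
  $manager->setConfigOverrideLanguage($langcode);
  try {
    // Everything that reads config must happen inside the callback.
    return $callback();
  }
  finally {
    $manager->setConfigOverrideLanguage($original);
  }
}

$manager = new StubLanguageManager();
$seen = withConfigLanguage($manager, 'es', fn() => $manager->getConfigOverrideLanguage());
```

The point of the wrapper is that all reads happen while the override language is active, and the original language is back in place once the call returns.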
rodrigoaguilera’s picture

Status: Active » Needs review
Attachment: new, 729 bytes

I noticed that the "ignore_words" configuration inside pathauto.settings is defined in the schema as "string" but intuitively it should be declared as "label".

I think it is the only setting that should be translatable inside pathauto.settings.
It will also need some configuration to show a menu item to translate this.

I understand that the configuration should apply only to aliases that match the language in the configuration but at least with this patch users creating content with an active language that matches the configuration language will have the appropriate strings removed.

The patch is also a small start.

rodrigoaguilera’s picture

Issue tags: +Needs tests
Attachments: new, 9.2 KB; new, 9.61 KB

There is some tricky static cache there. I added the langcode as a context for that static cache but a real refactoring is needed for that piece of code.
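
To illustrate what "langcode as a context for the static cache" means, here is a minimal plain-PHP sketch. PerLanguageCache and its build callback are hypothetical and are not the code in the patch; the stop words are illustrative values only.

```php
<?php
// Hypothetical sketch: settings are cached per langcode, so values built
// for one language are never served for another.
final class PerLanguageCache {
  /** @var array<string, array> Cached settings, keyed by langcode. */
  private array $cache = [];

  /** Number of times the build callback actually ran. */
  public int $builds = 0;

  public function get(string $langcode, callable $build): array {
    if (!isset($this->cache[$langcode])) {
      $this->cache[$langcode] = $build($langcode);
      $this->builds++;
    }
    return $this->cache[$langcode];
  }
}

$cache = new PerLanguageCache();
// Illustrative builder returning different stop words per language.
$build = fn(string $langcode): array => [
  'ignore_words' => $langcode === 'es' ? ['con', 'sin'] : ['a', 'the'],
];

$es = $cache->get('es', $build);
$en = $cache->get('en', $build);
// Second request for 'es' is served from the cache; no rebuild.
$esAgain = $cache->get('es', $build);
```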

damienmckenna’s picture

Component: Code » I18n stuff
manuel.adan’s picture

Re-roll against the latest -dev. Interdiff fails due to changes in the base code. There is only one real change, in pathauto.config_translation.yml, from:

pathauto.settings:
  title: 'Patterns settings'
  base_route_name: path.admin_overview
  names:
    - pathauto.settings

to:

pathauto.settings:
  title: 'Pathauto settings'
  base_route_name: pathauto.settings.form
  names:
    - pathauto.settings
s3b0un3t’s picture

It seems that the patch was applied to version 1.7 of the module.
However, it does not appear in the release notes.

extralooping’s picture

It seems that the patch was applied to version 1.7 of the module.

Unfortunately this is not the case: pathauto.config_translation.yml is not present in 1.7 & 1.8.

rodrigoaguilera’s picture

No, this patch was not applied.

Here is a reroll of #9

berdir’s picture

What is the use case here exactly? You would need to have a scenario where string xyz should be removed from some language but not another or you could just store a combined list of all languages?

This likely won't be committed without test coverage

+++ b/src/AliasCleaner.php
@@ -159,10 +159,24 @@ class AliasCleaner implements AliasCleanerInterface {
+
+    if (empty($this->cleanStringCache[$langcode])) {
+      // Remember config language.
+      $original_language = $this->languageManager->getConfigOverrideLanguage();
+      // Set the language we need to fetch config from.
+      $this->languageManager->setConfigOverrideLanguage($this->languageManager->getLanguage($langcode));
       // Generate and cache variables used in this method.
       $config = $this->configFactory->get('pathauto.settings');

This could be optimized to check whether the language really differs from the currently configured language; switching has some overhead that we can avoid in many cases. Also, what happens if a node has neutral/not applicable as its language code?

rodrigoaguilera’s picture

Bad reroll, I attach the fix.

I don't remember the exact use case I had in 2018, but it was related to translating prepositions into Spanish. Spanish has some, like 'con' or 'sin', which are actual words in English, so I use this patch to remove them from Spanish aliases but not from English ones.
Do you think this is a valid use case?

The behavior for neutral/not applicable should be to fall back to the default language.
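
A minimal plain-PHP sketch of that fallback, assuming the langcode arrives as a string. resolveIgnoreWordsLangcode() is a hypothetical helper, not part of the patch; 'und' and 'zxx' are the values of Drupal's LanguageInterface::LANGCODE_NOT_SPECIFIED and LanguageInterface::LANGCODE_NOT_APPLICABLE.

```php
<?php
// Values of Drupal's LanguageInterface::LANGCODE_NOT_SPECIFIED and
// LanguageInterface::LANGCODE_NOT_APPLICABLE, redefined here so the
// sketch runs without Drupal.
const LANGCODE_NOT_SPECIFIED = 'und';
const LANGCODE_NOT_APPLICABLE = 'zxx';

/**
 * Hypothetical helper: picks the langcode whose stop words should apply.
 *
 * Missing, empty, neutral and not-applicable langcodes all fall back to
 * the site default language.
 */
function resolveIgnoreWordsLangcode(?string $langcode, string $defaultLangcode): string {
  $neutral = [LANGCODE_NOT_SPECIFIED, LANGCODE_NOT_APPLICABLE];
  if ($langcode === null || $langcode === '' || in_array($langcode, $neutral, true)) {
    return $defaultLangcode;
  }
  return $langcode;
}
```

With a Spanish default language, a node marked "not specified" would then get the Spanish stop words, while an explicitly English node keeps the English ones.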

Status: Needs review » Needs work

The last submitted patch, 14: 630382-strings-remove-translatable-14.patch, failed testing.

s3b0un3t’s picture

@rodrigoaguilera, the use case you had at the time corresponds to mine today.
We must be able to configure different stopwords per language. As you said, the stopwords are not the same in English, Spanish, French, etc.

My mistake, I was wrong: the module does not yet support per-language configuration, because it still caches a single configuration in the "AliasCleaner" class.

manuel.adan’s picture

Status: Needs work » Needs review
Attachments: new, 8.08 KB; new, 1.39 KB
  1. +++ b/src/AliasCleaner.php
    @@ -159,10 +159,24 @@ class AliasCleaner implements AliasCleanerInterface {
    +    if (!empty($options['language'])) {
    +      $langcode = $options['language']->getId();
    +    }
    +    elseif (!empty($options['langcode'])) {
    +      $langcode = $options['langcode'];
    +    }
    +
    +    if (empty($this->cleanStringCache[$langcode])) {
    

    Variable $langcode is required even if not received in options. Fallback to the site default language should be safe.

    Also, the "language" options value is not documented, not sure why to check it here. Even when the source of options is a token callback, the language is given in the "langcode" option.

  2. +++ b/src/AliasCleaner.php
    @@ -233,49 +239,51 @@ class AliasCleaner implements AliasCleanerInterface {
         // Replace or drop punctuation based on user settings.
         $output = strtr($output, $this->cleanStringCache['punctuation']);
    ...
    +    // Replace or drop punctuation based on user settings.
    +    $output = strtr($output, $this->cleanStringCache[$langcode]['punctuation']);
    

    Missed the langcode key when getting the punctuation settings, which produces a lot of undefined index notices.

    Also, this line seems to be duplicated. Might be due to the re-roll.

google01’s picture

Unfortunately this patch has stopped working for version 1.11

The ability to translate the "strings to remove" into other languages by translating the "Pathauto settings" is essential for multilanguage projects.

pflora’s picture

Assigned: Unassigned » pflora
Status: Needs review » Needs work

I'll try to make the patch work for version 8.x-1.11.

pflora’s picture

Assigned: pflora » Unassigned
Status: Needs work » Needs review
Attachment: new, 8.08 KB

I had to do a reroll. It now applies to 8.x-1.x.

The biggest change I had to make was on line 216 of AliasCleaner.php.

    $langcode = 'en';
    if (!empty($options['language'])) {
      $langcode = $options['language']->getId();
    }
    elseif (!empty($options['langcode'])) {
      $langcode = $options['langcode'];
    }

I removed this chunk of code because $langcode was already being defined at the top of the getCleanSeparators() method by the following:

    $langcode = isset($options['langcode'])
      ? $options['langcode']
      : $this->languageManager->getDefaultLanguage()->getId();

If there is anything I've missed, feel free to give me feedback!

google01’s picture

Thanks to @pflora for providing us with the patch for version 8.x-1.11 so quickly.

The translations of "Pathauto settings" have been recovered and its operation in several languages has been validated.

jsricardo’s picture

Assigned: Unassigned » jsricardo

I will review it.

jsricardo’s picture

Assigned: jsricardo » Unassigned

Hello
I tested it and it seems to be ok.
I will move to RTBC

jsricardo’s picture

Status: Needs review » Reviewed & tested by the community
nelo_drup’s picture

Although the patch works, I have the following question.
I now have this option to translate the string, as seen in the image.

But if I select Spanish, for example, it takes me to the normal route, as seen in the image. So how would I separate the strings to be removed per language? (general interface here)

hgeshev90’s picture

Hello,

I encountered an issue with the max length of the translation field Strings to Remove. It allows only 128 characters, which is not enough for most of the suggestions per language (https://www.drupal.org/node/847370).

I am proposing a solution which is to change the type of the field from label to text. Tested it and it seems to be ok. Here is a patch for the change.

berdir’s picture

Status: Reviewed & tested by the community » Needs work

Thanks, this should be done as a merge request now so that the current tests run. A test for this would be really useful: this changes a lot of code, and even though it's fairly simple, it's pretty easy to miss one of those cases.

mably made their first commit to this issue’s fork.

mably’s picture

Status: Needs work » Needs review
Issue tags: -D8MI, -Needs tests

Pathauto: "Strings to remove" translation support — code review fixes

The original commit (834a143) added translation support for the
ignore_words ("Strings to remove") setting, allowing site administrators
to configure different stop words per language. Our review identified several issues
and we applied the following fixes.

Bug fixes in AliasCleaner.php

1. Replaced global state swapping with targeted config override lookup

The original approach temporarily changed the site-wide config language using
setConfigOverrideLanguage(), fetched the config object, then restored
the original language. This had two problems:

  • The restore happened before the $config->get() calls,
    so the translated values were never actually used (Drupal resolves overrides
    at get() time, not at construction).
  • Swapping global state is fragile and could affect other code running in the
    same request.

We replaced this with a targeted call to
getLanguageConfigOverride($langcode, 'pathauto.settings'), which
directly fetches the translated ignore_words value for the given
language without touching global state. A ConfigurableLanguageManagerInterface
check ensures compatibility with monolingual sites where the language module is
not installed.
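
As a rough plain-PHP model of the targeted lookup (not the merge request's code): the langcode is passed in explicitly and the stop words for that language are read directly, so no shared state is changed. ignoreWordsFor() and cleanStringSketch() are hypothetical names. The comma-separated storage format and the fallback to the base value when a language has no override are assumptions. The real cleaner also handles punctuation, transliteration and length limits, which are left out here.

```php
<?php
/**
 * Hypothetical lookup: returns the stop words for one language.
 *
 * $overridesByLangcode stands in for the per-language config overrides.
 * Languages without an override use the base value (an assumption).
 */
function ignoreWordsFor(string $langcode, string $baseIgnoreWords, array $overridesByLangcode): array {
  $raw = $overridesByLangcode[$langcode] ?? $baseIgnoreWords;
  $words = array_map('trim', explode(',', $raw));
  return array_values(array_filter($words, fn(string $word): bool => $word !== ''));
}

/**
 * Simplified cleaner: lowercases, drops stop words, joins with a separator.
 */
function cleanStringSketch(string $text, string $langcode, string $baseIgnoreWords, array $overridesByLangcode, string $separator = '-'): string {
  $ignore = ignoreWordsFor($langcode, $baseIgnoreWords, $overridesByLangcode);
  $words = preg_split('/\s+/', strtolower(trim($text)), -1, PREG_SPLIT_NO_EMPTY);
  $kept = array_filter($words, fn(string $word): bool => !in_array($word, $ignore, true));
  return implode($separator, $kept);
}

// Illustrative data: English base words and a French override.
$base = 'the, a, an';
$overrides = ['fr' => 'le, la, les, un, une'];
```

Because the language is an argument rather than global state, two calls for different languages cannot interfere with each other within the same request.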

2. Only ignore_words is translated, not the entire config

The original code fetched the entire pathauto config under the language
override, meaning a translator could accidentally override site-wide settings like
separator, transliterate, or case for a
specific language. Now only ignore_words is read from the language
override; all other settings come from the base config.
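
The restriction can be pictured with plain arrays. This is a sketch only: settingsForLanguage() is a hypothetical helper and the config values are illustrative, not the module's defaults.

```php
<?php
// Illustrative base config and a per-language override that (wrongly)
// also tries to change the separator.
$baseConfig = [
  'separator' => '-',
  'transliterate' => true,
  'ignore_words' => 'a, an, the',
];
$overrides = [
  'fr' => ['ignore_words' => 'le, la, les, un, une', 'separator' => '_'],
];

/**
 * Hypothetical helper: builds the effective settings for one language.
 *
 * Only ignore_words is taken from the override; every other key keeps
 * its base value even if the override contains it.
 */
function settingsForLanguage(array $base, array $overrides, string $langcode): array {
  $settings = $base;
  if (isset($overrides[$langcode]['ignore_words'])) {
    $settings['ignore_words'] = $overrides[$langcode]['ignore_words'];
  }
  return $settings;
}

$fr = settingsForLanguage($baseConfig, $overrides, 'fr');
$de = settingsForLanguage($baseConfig, $overrides, 'de');
```

Here the French settings pick up the French stop words but keep the base separator, and a language without an override gets the base config unchanged.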

3. Simplified and corrected langcode resolution

The original code dropped support for the $options['language']
parameter (a Language object). After investigation, we confirmed that pathauto
always passes a string langcode in its options — never a Language object — so this
parameter was dead code. The langcode resolution was simplified to:

$langcode = $options['langcode'] ?? $this->languageManager->getDefaultLanguage()->getId();

This also fixes the original hardcoded 'en' default, which would
have been wrong on non-English sites.

Test fixes in PathautoKernelTest.php

4. Fixed incorrect use of t() in test assertion

The previous commit changed a test assertion message from
new FormattableMarkup(...) to t(...).
t() is for translatable UI strings, not test messages — reverted to
FormattableMarkup.

5. Added testCleanStringTranslatedIgnoreWords()

New kernel test verifying the core feature: that ignore_words is
translated per language. It sets up English stop words ("the, a, an") and French
stop words ("le, la, les, un, une") via config override, then asserts:

  • "the big house" with langcode=en yields "big-house" ("the" stripped)
  • "la grande maison" with langcode=fr yields "grande-maison" ("la" stripped)
  • "the big house" with langcode=fr yields "the-big-house" ("the" is not a French stop word)
mably’s picture

Assigned: Unassigned » berdir