Pathauto 1.6 no longer checks whether a 'new' alias is identical to an existing alias when the Update action is set to 'Create a new alias. Leave the existing alias functioning.' This results in the creation of a new, identical alias entity every time a node is saved, even if no changes were made to the node content or resulting alias.

This isn't immediately obvious as the aliases continue to work as expected. If pathauto's 'Verbose' setting is enabled, you'll get a "Created new alias..." message every time a node is edited.

Steps to reproduce

  1. Ensure Update action is set to Create a new alias. Leave the existing alias functioning. (/admin/config/search/path/settings)
  2. Create a new content item for an entity with pathauto enabled
  3. Visit /admin/config/search/path and find the alias you just created
  4. Edit and save the content item, making no changes
  5. Refresh /admin/config/search/path and confirm duplicate alias entities

Issue fork pathauto-3107144

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

justcaldwell created an issue. See original summary.

justcaldwell’s picture

justcaldwell’s picture

Status: Active » Needs review

Sorry -- updating status.

Renrhaf’s picture

Priority: Normal » Major
Status: Needs review » Reviewed & tested by the community

Same configuration here, I can confirm the bug. Drupal 8.8.2 and pathauto 8.x-1.6.
After applying the patch and testing again, no more duplicated path alias were created.
Thanks @justcaldwell for the patch !
Setting priority to major as it creates a lot of duplicated data. Site owners would need to clean things without this patch.

Berdir’s picture

Issue tags: +Needs tests

A test would be very useful there.

To be honest, I'm not sure what the use case for that setting is. Why would you want multiple aliases for the same thing? Using redirect.module to create redirects for changing aliases is IMHO much better for SEO for example.

hkirsman’s picture

Wow, just picked up random nodes and they do have lots of duplicate.

IMHO we should also clean up database from duplicates and what about having unique index on langcode, path and alias?
CREATE UNIQUE INDEX index_name ON path_alias(langcode,path,alias);

Is that core proposal? Is index only for db performance or is it also ok for blocking duplicates and making sure it doesn't happen again on db level?

Berdir’s picture

yes, that's definitely core, we just generate them.

That said, I'm still not sure what the use case of this feature is, see #5. Also, as said there, a test would make it easier to commit this.

hkirsman’s picture

You are right, I don't see the point of that feature either if you have Redirect module with "Automatically create redirects when URL aliases are changed." option enabled. It's usability is also much better as you can see from the node itself what redirects you have. With this current Pathauto feature they are all hidden. Probably you could find them on the /admin/config/search/path page but that isn't that good.

Here I initially created /foobar link, then started adding up numbers to it. All redirects are visible in admin and they redirect to current alias. All what I would need.
Foobar links

Still, today I've been figuring out how to remove the duplicates from database. I have more than 600 of them.

Here's SQL query to find them:

SELECT id, 
       revision_id, 
       langcode, 
       path, 
       alias, 
       status, 
       Count(alias) 
FROM   path_alias 
GROUP  BY langcode, 
          path, 
          Cast(alias AS BINARY) 
HAVING Count(alias) > 1 
ORDER  BY id ASC 

Note that it needs BINARY to be case sensitive.

Here's query that deletes all of those duplicates, keeping the one with highest id:

DELETE t1 
FROM       path_alias t1 
INNER JOIN path_alias t2 
WHERE      t1.id < t2.id 
AND        t1.langcode = t2.langcode 
AND        t1.path=t2.path 
AND        binary(t1.alias) = binary(t2.alias)

This is based on https://www.mysqltutorial.org/mysql-delete-duplicate-rows/

Didn't create any patch yet as wanted to get some feedback.

It still keeps behind aliases that are unique. If we would go down the path of Redirect then those would be needed to move to redirects, right? At least it would make sense to me to clean up path_alias table and only have 1 alias per entity. I don't see this change being put only in update hook as site could not be having Redirect module. What could also be done would be to add special page for converting and message on status page? Or only message on status page with link to 'fix the issue' or something like that.

I guess all in all it's 50% "political" issue.

Edgar Saumell’s picture

Hi,

Same here using Drupal 8.9.7 and Pathauto 8.x-1.8

Since I'm also using Redirect 8.x-1.6 https://www.drupal.org/project/redirect I solved it by setting 'Create a new alias. Delete the old alias.' on /admin/config/search/path/settings and 'Automatically create redirects when URL aliases are changed.' on /admin/config/search/redirect/settings
Then I deleted all aliases and bulk generated them again.

joseph.olstad’s picture

+1 Thanks for the patch, great work.

joseph.olstad’s picture

Version: 8.x-1.6 » 8.x-1.8
kkaya’s picture

Thank you all for the work on this issue. I found that the patch in #2 did not work in the following case:

1) Have pathauto setting "Create a new alias. Leave the existing alias functioning" and basic page aliases are automatic based on the page title.

2) Add new basic page with title "New Page" and path_alias table entry is created for /new-page

3) Edit page and change title to "New Page Test" and path_alias table entry is created for /new-page-test and correctly keeps /new-page

4) Edit page and change title back to "New Page" and now there's a duplicate entry in the path_alias table for /new-page (/new-page-test is still kept)

deneus18’s picture

The patch #2 does work for me. Thanks for that.
Drupal 9.2.1 and pathauto 8.x-1.8.

jweowu’s picture

The patch in #2 also worked here.

Can this be committed? The bug bloats the path_alias tables which can in turn cause significant performance problems in Drupal 8 and 9 (see #2988018: [PP-1] Performance issues with path alias generated queries on PostgreSQL ), so it's a shame that it hasn't been released at any point in the 1.5 years that it's been RTBC.

The issue in #12 is more complex and I think it should be deferred to a separate issue. Drupal just picks the most recent alias for the path, so we can't not create a new alias in that scenario. You would instead have to delete the older duplicate, which seems like a more significant decision, and I don't think aliases ping-ponging back and forth in that way is a problem which will affect very many sites.

jweowu’s picture

For postgres I used the following query to purge the duplicates:

WITH
  valid_path_alias AS (
    SELECT MAX(id) AS id
      FROM path_alias
     GROUP BY alias, langcode, path, status
  ),
  unwanted_path_alias AS (
    SELECT a.id
      FROM path_alias a
           LEFT JOIN valid_path_alias valid
                     ON valid.id = a.id
     WHERE valid.id IS NULL
  ),
  delete_from_path_alias AS (
    DELETE FROM path_alias a
     USING unwanted_path_alias u
     WHERE u.id = a.id
  )
  DELETE FROM path_alias_revision a
   USING unwanted_path_alias u
   WHERE u.id = a.id
;

That should be relatively quick even with extremely large tables (tested with 1.3 million rows).

I should note that while path aliases are nowadays revisionable entities, nothing in my site has utilised that ability, and so in my database the revision data is effectively a needless duplication / waste of space. Rather than duplicate aliases (or aliases for a given path in general) being related revisions, each is actually a completely independent entity with a distinct id and only a single 'revision'. For me, the path_alias table rows were 100% identical to the path_alias_revision rows, save for the single column each table has that the other does not: path_alias.uuid and path_alias_revision.revision_default; and the value of the latter is consistent across all rows. Only uuid actually varies between duplicates, and the uuid is not used to determine the alias for a path; alias lookup is simply taking the newest matching row. I have not seen anything which references these uuids at all, in fact. (EntityInterface requires that a uuid is included, and fulfilling this requirement might be the sole purpose of this column.)

You may wish to verify that you see the same in your database, as if something is using alias revisions in your site, you may need to take a different approach to purging unwanted rows.

joseph.olstad’s picture

a lot safer way to clean out aliases is to use the api

/**
 * Implements hook_update().
 * Delete the duplicate url aliases for the node.
 */
function mymodule_update_9000() {

  $query = \Drupal::entityQuery('node');
  $group = $query->orConditionGroup()
    ->condition('type', 'page', '=')
    ->condition('type', 'article', '=')
  $query->condition($group);
  $nids = $query->execute();

  $path_alias_storage = \Drupal::entityTypeManager()->getStorage('path_alias');
  foreach ($nids as $nid) {
    // langcode = en & langcode = fr
    $actual_alias_e = \Drupal::service('path_alias.manager')
      ->getAliasByPath('/node/' . $nid, 'en');
    $actual_alias_f = \Drupal::service('path_alias.manager')
      ->getAliasByPath('/node/' . $nid, 'fr');

    // Load all path alias for this node.
    $alias_objects_e = $path_alias_storage->loadByProperties([
      'path' => '/node/' . $nid, 'langcode' => 'en',
    ]);
    if (count($alias_objects_e) > 1) {
      //keep the latest revision
      array_pop($alias_objects_e);
      foreach ($alias_objects_e as $alias_object_e) {
        // delete duplicate record only
        if ($alias_object_e->get('alias')->value == $actual_alias_e) {
          $alias_object_e->delete();
        }
      }
    }

    // langcode = French
    // Load all path alias for this node.
    $alias_objects_f = $path_alias_storage->loadByProperties([
      'path' => '/node/' . $nid, 'langcode' => 'fr',
    ]);
    if (count($alias_objects_f) > 1) {
      //keep the latest revision
      array_pop($alias_objects_f);
      foreach ($alias_objects_f as $alias_object_f) {
        // delete duplicate record only
        if ($alias_object_f->get('alias')->value == $actual_alias_f) {
          $alias_object_f->delete();
        }
      }
    }
  }
}

this example could obviously be improved with getting the enabled languages and putting them in an array looping through.
but get the jist of it.

jweowu’s picture

Agreed -- if you're dealing with sufficiently small amounts of data that the APIs are fast, do that.

In my case using the APIs to delete aliases this way would have taken around seven hours (extrapolating from a test run of your code with some added timing), versus 30 seconds for my query. (Using the APIs to do everything else I needed to do would have taken about a week, so I was already well past the point of API calls.)

joseph.olstad’s picture

@jweowu

Yes we're both correct, I would have done your approach had there been a huge amount, only had maybe 16000 or so to process so not bad.

eleonel’s picture

Hi, using the patch from #2 doesn't solves the problem for me (even using Create a new alias. Delete the old alias.) so I created the following patch to control if we already have an alias for the current source using loadBySourcePrefix($source) function:

if ($existing_alias) {
      /** @var \Drupal\path_alias\PathAliasInterface $existing_alias */
      $existing_alias = $this->entityTypeManager->getStorage('path_alias')->load($existing_alias['pid']);
    }
    elseif ($pid = $this->loadBySourcePrefix($source)) {
      /** @var \Drupal\path_alias\PathAliasInterface $existing_alias */
      $existing_alias = $this->entityTypeManager->getStorage('path_alias')->load(reset($pid));
    }
jweowu’s picture

#19 looks strange to me. $existing_alias is obtained via loadBySource() for a specific path. By contrast, loadBySourcePrefix() returns an array of (potentially) multiple matching paths, so (a) why wouldn't you have been dealing with a specific path in the first place? and (b) why are you confident that the first item of that array is the one you want?

What is the scenario where you needed to do that?

eleonel’s picture

What is the scenario where you needed to do that?

During a node creation I can see duplicated path alias (two) are being created for the same node, and $existing_alias is NULL for both aliasStorageHelper->save() calls.

Why are you confident that the first item of that array is the one you want?

Because that elseif will be executed only when $existing_alias is NULL, meaning when no other path alias for the given node exists.

jweowu’s picture

Because that elseif will be executed only when $existing_alias is NULL, meaning when no other path alias for the given node exists.

But $existing_alias was obtained using loadBySource() not loadBySourcePrefix(). Is it guaranteed that there's only one alias with the given source prefix?

Moreover, is it guaranteed that the alias you get from that has the intended source at all?

(I haven't tried to analyse the code at all; it just sounds like a dodgy assumption from the method names.)

Neslee Canil Pinto’s picture

Had same problem, #2 solved the issue for me 👍🏼

Berdir’s picture

Version: 8.x-1.8 » 8.x-1.x-dev
Status: Reviewed & tested by the community » Needs work

What I said in #5 is still true, I don't fully understand the use case and changes around this, also looks like different people have different problems, so I will require a test to demonstrate the problem and show that the patch fixes it.

awm’s picture

@berdir, right now pathauto creates identical aliases every time a node is saved. There are settings form that set "update_action" where one may prevent this but I still don't know if path auto should allow duplicates. Is there a use case for it? I believe the patch is to prevent that. In my case, I have millions of duplicates in path_alias and path_alias_revisions. Also by duplicates I mean identical path and alias. Is this by design or a bug?

Berdir’s picture

That's not a behavior that I'm seeing, which is why I'm asking for automated tests to reproduce the problem. It might happen only in combination with other core patches or contrib modules or something.

awm’s picture

@berdir, make sure to set update_action to to 1 in pathauto settings (pathauto.settings.yml).

justcaldwell’s picture

I just created a fresh Drupal instance (9.3.6) with no patches or contrib modules other than Pathauto 8.x-1.9 and its dependencies. The issue persists.

Here's a more detailed set of steps to reproduce the behavior. I hope they clarify the problem, and perhaps help someone write a supporting test 🙏:

  1. Enable pathauto.
  2. Visit /admin/config/search/path/settings and change the Update action to Create a new alias. Leave the existing alias functioning. Save the configuration. Screenshot
  3. Visit /admin/config/search/path/patterns and create a test pattern for the Article content type. I used article/[node:title]. Screenshot
  4. Create a new Article node.
  5. Visit /admin/config/search/path and you should see a single alias for the new node. Screenshot
  6. Visit the edit form for the Article created above, do not make any changes, simply click Save. Screenshot
  7. Visit /admin/config/search/path and you should see two exact duplicate aliases for the node. Screenshot

Given the settings and pattern used, a new alias should be created only if the node title changes, but with this setting, Pathauto creates a duplicate alias every time the node is updated, even if there are no changes to the content/fields. Go ahead, re-save that node a few times with no changes — you'll get a duplicate alias each time. Put Pathauto in Verbose mode, and you'll get a "Created new alias" message every time you save. I can't think of any use case for creating an unlimited number of exact duplicate alias entities.

This bug is potentially responsible for some pretty significant database bloat (millions of dupes!?!? 😬). The patch in #2 still fixes it. So, I respectfully suggest that we focus on getting the fix committed, and that we…

Move other concerns to separate issues

RE #5: I agree that it makes more sense to use Redirect, but I think that's a larger discussion — whether to deprecate an existing feature, provide an upgrade path (or not), add a new dependency, etc. — more suited to a separate issue.

RE #6: Cleaning up the database seems like the way to go, but if coding/testing the update will unduly delay getting a fix committed, it may be better to handle in another issue.

RE #12: This is a related (but probably less common) bug and should be addressed — you guessed it — in its own issue.

Patch #19 didn't fix the issue in my testing. The comment references a different setting (Create a new alias. Delete the old alias), so maybe it's trying to solve a different problem?

If you made it this far, thanks for reading! 👋

awm’s picture

@berdir can you point us to the code where pathauto prevent (supposed to prevent) duplicates. Is it here: https://git.drupalcode.org/project/pathauto/-/blob/8.x-1.9/src/PathautoS...

update that's not it.

kevin.pfeifer’s picture

Alright my fellow Drupal DEVs, I have some more info.

Since we are also plagued by this issue which creates thousands of duplicate path aliases I had to go deep and find out why.

The first thing which stood out to me was the fact, that no matter where I set the breakpoint inside the pathauto module it doesn't get triggered when updating a node with no changed path field.

So I had to add a breakpoint into the Core Database Connection Class where the insert query is being created and I found some evidence.

See https://screenshot.sunlime.at/4befe7ff83aaf3270dcfceb611a30826.png

As you see in the screenshot this happens inside Drupal Core docroot/core/modules/path/src/Plugin/Field/FieldType/PathItem.php inside the postSave($update) function.

$update is indeed true as you see at the bottom right but $this->pid is not present, therefore the $path_alias_storage->create() method is called.

I can't tell you why this check is in there but that is in my opinion the root cause of this whole mess.

My used versions:
Drupal Core 9.3.6
Pathauto 1.9.0

Berdir’s picture

pathauto should prevent PathItem from saving at all. Are you using the pathauto widget and field item class?

See https://git.drupalcode.org/project/pathauto/-/blob/8.x-1.9/src/PathautoI..., postSave() must not be called if the pathauto state is not skip.

kevin.pfeifer’s picture

Well I do fall into https://git.drupalcode.org/project/pathauto/-/blob/8.x-1.9/src/PathautoI... because in my code I set

'path' => array('alias' => '/produkte/'.$name, 'pathauto' => \Drupal\pathauto\PathautoState::SKIP)

which in my opinion is the correct value to set for pathauto since I don't want drupal to generate a path alias for me since I am already setting it myself

kevin.pfeifer’s picture

Also these values are the same in the backend if one looks into the generated checkbox and input name attribute values.

Therefore we thought its the same if we programmatically fill nodes with these field data like so

$entity = $this->node_storage->load($id);
$entity->set('path', array('alias' => '/produkte/somename', 'pathauto' => \Drupal\pathauto\PathautoState::SKIP));
if($entity->save()) {
  // some more logic
} else {
  
}

Or is that not correct?

Berdir’s picture

Well, if you set the path like that yourself then why do you think that it's the fault of pathauto? Yes, you can skip it (not sure why you even would have a pattern then in that case), but if you set it like that on updating content, then the core path field is likely the one creating duplicates and your problem has nothing to do with this module.

The path field lazy-loads the existing path, but if you set it yourself you're probably skipping that. You'll need to check the current path, which should load it including pid and then that should result in not creating duplicates.

Berdir’s picture

Going back to the reported issue here, I was able to confirm it with the update setting set to generate new alias. Yes, there is a bug, it should not create duplicate identical aliases.

However, I still struggle to see use case for that specific setting.

In almost every case, you should set pathauto to update the alias and let redirect.module create a redirect from the old path. Then you have a single canonical alias and any old path redirects.

The UI of that setting is very confusing, I think we could improve that and make the distinction and recommended setting clearer.

What hasn't changed is that to commit this, tests are needed. It shouldn't be that complicated to extend the existing coverage in \Drupal\Tests\pathauto\Kernel\PathautoKernelTest::testUpdateActions, resave the entity once more and verify that there is no extra alias for that. \Drupal\Tests\Traits\Core\PathAliasTestTrait::loadPathAliasByConditions can't be used directly as it only returns one, but you can just copy the entity query and assert that exactly one match is found.

junaidpv’s picture

#2 helps!

rex.barkdoll’s picture

Hey all, Thank you for all your hard work on this. Has anyone made progress on coming up with tests for this so that we can have a working patch to test? It's out of my skillset, but these duplicates are killing my sites as well.

plopesc’s picture

We have experienced this issue and patch in #2 worked for us.

Providing new patch including tests to push this one.

Thank you!

The last submitted patch, 38: 3107144-38-test-only.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

podarok’s picture

Status: Needs review » Reviewed & tested by the community

#38 looks good
Thank you

jigarius’s picture

In the end, will a @hook_update_N()@ be included in the fix to remove existing duplicates?

jweowu’s picture

In extreme cases, such an update hook might be problematic. E.g.:

https://www.drupal.org/project/pathauto/issues/3107144#comment-14278822

DamienMcKenna’s picture

Could people who said the original patch did not work please test out patch #38? Thank you.

kevin.pfeifer’s picture

In the end, will a @hook_update_N()@ be included in the fix to remove existing duplicates?

We had multiple websites with millions of duplicate alias entries.
In the end we cleaned it up ourselfs and adjusted our sync scripts to delete duplicate aliases manually on each sync run.

joseph.olstad’s picture

Patch #38 works for us. We've been all in on this solution for quite some time. Has tests, it's RTBC for us.

jweowu’s picture

Status: Reviewed & tested by the community » Needs work

#38 says it's the patch from #2 plus tests, but comparing:

- https://www.drupal.org/files/issues/2020-01-17/pathauto-3107144-2.patch
- https://www.drupal.org/files/issues/2023-06-07/3107144-38.patch

#38 retains some now-redundant code which was removed in the original patch. This looks like a mistake (albeit a functionally-harmless one).

I still see that code in the current 8.x-1.x HEAD commit, so it seems like a going concern:
https://git.drupalcode.org/project/pathauto/-/blob/ed2d4d30479ddbc051281...

xurizaemon’s picture

I tried to make an interdiff of those two patches, but interdiff failed. First time I've seen that.

$ interdiff pathauto-3107144-2.patch 3107144-38.patch > interdiff-3107144-2-to-38.txt
1 out of 1 hunk FAILED -- saving rejects to file /tmp/interdiff-1.Y0ZoFC.rej
interdiff: Error applying patch1 to reconstructed file
joseph.olstad’s picture

joseph.olstad’s picture

Status: Needs work » Needs review
joseph.olstad’s picture

I feel bad for all the tens of thousands of instances just accumulating aliases and compounding a mistake into a larger and larger mess until eventually they just complain but have no clue what is the root cause of the issue.

Status: Needs review » Needs work

The last submitted patch, 48: 3107144-48.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

jweowu’s picture

Status: Needs work » Reviewed & tested by the community

Back to RTBC.

The only suggestion from the testbot in #51 which is in any way connected to this patch is:

+  /**
+   *
+   */
   public function assertAliasIsUnique($conditions) {

As that function is in line with the others in the file, I don't think that should keep this patch from being RTBC. (Adding docs to all of those test functions could always be a separate issue.)

xurizaemon’s picture

Status: Reviewed & tested by the community » Needs work

Sorry, not RTBC due to test failure.

@joseph.olstad including an interdiff (docs, more info) between the patch you're improving on and the one you're proposing helps fixes to land faster by making review easier.

jweowu’s picture

Ok, but test failures which have nothing to do with the patch should not be shoe-horned into the patch, so I believe only that single function header doc is needed. Anything else is a different issue.

xurizaemon’s picture

OK, if it's really unrelated then it shouldn't block merging - apologies if that's the case.

I saw that #38 above did pass, and #48 didn't, and took from that that #48 introduced test failures. If the main branch tests were failing (most recent run passed) then the tests would not run in this issue, therefore tests failing in this issue (IMO) are introduced in this issue. Caveats there - the most recent run on 8.x-1.x was when the last merge happened, back in June.

If we want the maintainer to merge this, we need to make it easy for them to determine as a pass.

Will leave to maintainer discretion (and I won't set this issue back from RTBC for that reason ... although the d.o testbot may).

Have requeued tests for both Drupal 9.5/PHP7 and Drupal10.1/PHP8.1

xurizaemon’s picture

This failing test behaves differently when tested with 10.1 vs earlier versions, via #3328670: PHP 8.2 compatibility (97e46eed5a).

The failing test is on /admin/config/search/path/patterns at the time of failure, and I think that failure is unrelated to the code and tests changed in this issue.

That does however explain why we see it pass when tested with 10.1, and fail on MR runs against 9.5 or 10.0 😀

jweowu’s picture

Status: Needs work » Needs review

Thanks for digging into that, xurizaemon.

I've re-rolled the patch and I went ahead and fixed all the phpcs issues I was getting. This includes ones which the testbot hadn't complained about, so maybe this project has more relaxed testing settings?

I've split it into three commits (and hence I've used a fork/MR): one for the testbot complaints, one for the actual issue patch, and one for the additional phpcs fixes. If the maintainer doesn't want the last commit, it'll be easy to omit that.

I've queued up the same tests as before, and I presume we'll get the same failure on 9.5.

xurizaemon’s picture

FYI to those keen to land this - thanks for your persistence especially, and I hope my pushback here didn't feel like I was trying to slow progress.

Last night I opened #3390330: Failing test in module's existing coverage prevents unrelated patch submissions passing tests to take a look at it, and based on feedback to that issue opened #3390491: Switch to Gitlab CI which it is hoped will clear the PathautoUiTest test failure blocker on this and any other Pathauto issues.

xurizaemon’s picture

Status: Needs review » Reviewed & tested by the community

Now that #3390330: Failing test in module's existing coverage prevents unrelated patch submissions passing tests is fixed, I think this can go to go back to RTBC; I've tested that failing test locally on Drupal 9.5.x (on MySQL), and it passed. I believe that test failure on versions below 10.1 is more "DrupalCI being weird" and not an actual issue with earlier versions of Drupal.

jweowu’s picture

Excellent, thank you xurizaemon; duly ignoring results for versions < 10.1 and I've queued up another couple of 10.1 test runs.

I see there was a failure for "PHP 8.2 & MySQL 8, D10.1" previously, so I'm also re-testing that in case that was temporary. It's probably not temporary, but I think not connected to this patch either, so I've queued up a test with the same parameters on your no-op patch in #3390330: Failing test in module's existing coverage prevents unrelated patch submissions passing tests to verify whether that's the case.

...and in fact that's finished already with the same failure; so I believe we can also ignore that one for the purposes of this issue.

jweowu’s picture

My new test here with PHP 8.1 + MySQL 8 failed as well, along with the repeat test for PHP 8.2 + MySQL 8 and the equivalent tests on the no-op patch, so it looks like there's an issue with pathauto and MySQL 8? (There's no test option for PHP 8.2 + MySQL 5.7 though, so I can't completely isolate it to that.)

jweowu’s picture

Just to summarise the last set of tests wrt recent discussion:

Passed for Drupal 10.1 (the only currently-testable version for this project):

PHP 8.1 & pgsql-14.1, D10.1 48 pass
PHP 8.1 & pgsql-13.5, D10.1 48 pass
PHP 8.1 & MySQL 5.7, D10.1 48 pass
PHP 8.2 & pgsql-14.1, D10.1 48 pass
PHP 8.2 & pgsql-13.5, D10.1 48 pass

Failing on account of unrelated issues testing Drupal versions < 10.1:

PHP 7.3 & MySQL 5.7, D9.5 46 pass, 2 fail
PHP 8.1 & pgsql-13.5, D9.5 46 pass, 2 fail
PHP 8.1 & pgsql-13.5, D10.0.7 46 pass, 2 fail

Failing on account of unrelated issues testing with MySQL 8:

PHP 8.1 & MySQL 8, D10.1 43 pass, 1 fail
PHP 8.2 & MySQL 8, D10.1 43 pass, 1 fail

  • Berdir committed ea4c655b on 8.x-1.x authored by jweowu
    Issue #3107144 by justcaldwell, joseph.olstad, eleonel, hkirsman:...
Berdir’s picture

Status: Reviewed & tested by the community » Fixed

I'd appreciate if someone could open a separate issue about the MySQL 8 test fails.

Also, I don't think I've ever gotten any feedback on #34, the fact that there are very, very few use cases for using this setting that I can think of. Why would you want multiple historical aliases (there are some use cases for multiple aliases for the same thing, but less so for old aliases due to content changes).

The strongly recommended setting is to update the alias and use redirect.module to set up redirects for the old path. As asked in #34, a follow-up issue with the goal of better explaining that setting would be appreciated. With this configuration, you won't have duplicate identical aliases anymore but you'll still end up with possibly many *different* aliases for the same source path.

And last, all the coding standard changes make this a *lot* more complicated to review, as the patch/MR is 3-4x as large as it needs to be, so a reviewer needs to go through and find the actually relevant changes. The general rule is to only fix coding standard on lines you need to change.

All that out of the way, merged.

joseph.olstad’s picture

joseph.olstad’s picture

@berdir, thanks for the commit, hope to see a tag/release also soon.

jweowu’s picture

Thank you Berdir!

And last, all the coding standard changes make this a *lot* more complicated to review, as the patch/MR is 3-4x as large as it needs to be, so a reviewer needs to go through and find the actually relevant changes. The general rule is to only fix coding standard on lines you need to change.

Ah, sadly you didn't notice that there were three separate commits, specifically for that reason :/

https://git.drupalcode.org/project/pathauto/-/merge_requests/55/commits was intended to be extremely easy to review.

I'd noted in #58 that "I've split it into three commits (and hence I've used a fork/MR): one for the testbot complaints, one for the actual issue patch, and one for the additional phpcs fixes."

In general I wouldn't ever try to review a MR as a single diff unless there was only one commit, so I suggest always checking the commits list of any MR.

kevin.pfeifer’s picture

Sorry for not reporting back to your #34 question.

Yes, you can skip it (not sure why you even would have a pattern then in that case)

In our case we have a product sync with an external database (which should be rather self explenatory so the title of the product results in the path alias)
But there is also the possibility to manually add products to the drupal website (which is of course stupid but still our customer requires it) therefore the path auth pattern was required so that the alias is also generated automatically for those manually added products.

In the end it was our fault to expect the path field from core and/or the path auto module to automatically handle path changes when doing

$entity = $this->node_storage->load($id);
$entity->set('path', array('alias' => '/produkte/somename', 'pathauto' => \Drupal\pathauto\PathautoState::SKIP));
if($entity->save()) {
  // some more logic
} else {
  
}

instead (for everyone else checking this) you have to do this

$path_alias_manager = \Drupal::entityTypeManager()->getStorage('path_alias');
$aliasObjects = $path_alias_manager->loadByProperties([
  'path'     => '/node/' . $entity->id(),
  'langcode' => 'de' // <== set your language here if you have translatable nodes/aliases
]);

if(!empty($aliasObjects)) {
  $lastElement = end($aliasObjects);
  foreach($aliasObjects as $alias_object) {

    if($alias_object === $lastElement) {
      $alias_object->alias = '/produkte/something';
      $alias_object->save();
    } else {
      // Delete duplicate alias
      $alias_object->delete();
    }

  }
} else {
  // If no path alias are present create it "normally"
  $entity->set('path', ['alias' => '/produkte/something', 'pathauto' => \Drupal\pathauto\PathautoState::SKIP]);
}
jweowu’s picture

In general I wouldn't ever try to review a MR as a single diff unless there was only one commit, so I suggest always checking the commits list of any MR.

I should add that the same applies to squashing commits at merge time. I'm mostly sure I configured that MR to not have its commits squashed, but I see that squashing has occurred, which means that anyone subsequently reviewing the code history doesn't get the benefit of the individual commits. Whether or not to squash commits for a MR ought to be considered on a case-by-case basis -- sometimes it makes total sense, but in other cases squashing only destroys useful information for no benefit.

jweowu’s picture

Also: Thank you for getting a new release out so promptly.

Berdir’s picture

@jweowu: That's not how drupal.org works. Merge requests are merged through the drupal.org UI and they are always squashed and the commit message defined in the issue is used.

I did in fact not see the split commits, that does help, but in the end, I still need to review all the changes to merge a merge request, and it is preferred to keep things strictly separate. In contrib, it depends on the maintainer and the change, but merge requests against Drupal core will definitely get rejected over unrelated changes.

jweowu’s picture

Merge requests are merged through the drupal.org UI and they are always squashed and the commit message defined in the issue is used.

Oh, wow... that's awful. I know that's how things have always worked for patch files as almost invariably there is just one patch file, but I just assumed that with the advent of gitlab and merge requests there would also have had been option of merge commits!

With the testbot regularly rejecting contributions over unrelated changes, such unrelated changes are inevitable in the process of getting patches a green light; and even without unrelated test failures, large change sets are typically better as a series of atomic commits.

Thank you for clarifying this. I can't fathom why anyone made that decision, but it's good to know.

joseph.olstad’s picture

@jwoewu, @berdir

Are there any other followup tickets needed for this other than the HEAD tests failing for MySQL 8?
#3392302: HEAD tests failing against MySQL 8

jweowu’s picture

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.