Problem/Motivation

f you try and add a redirect to an existing URL alias, the redirect module correctly prevents you, with the message: "You are attempting to redirect the page to itself. This will result in an infinite loop."

However, nothing prevents you from editing an entity and setting its URL alias to be the same as an existing redirect's source for that entity. Since 7.0-1.0-rc1 of redirect, this causes the following Drupal message when you then view the page:

"Oops, looks like this request tried to create an infinite loop. We do not allow such things here. We are a professional website!"

This situation is quite common in the wild - a user merely needs to change the URL of a page (causing a redirect to be added automatically), then either change their mind and set it back, or turn pathauto's Generate automatic URL alias back on and Pathauto will do that for them.

Steps to Reproduce

The following steps will reproduce the problem from scratch:

  1. Create and save a test node and set its URL alias to "test" under URL Path Settings.
  2. Edit your node and in the "URL Redirects" vertical tab press Add URL redirect to this node and
    fill in "test2" as the From field, then hit Save. You will be returned to the node edit form.
  3. Now visit the URL path settings vertical tab and change the URL alias to also be "test2" and hit Save.
  4. Your node will be loaded at path /test2 but you will see the error message: Oops, looks like this request tried to create an infinite loop. We do not allow such things here. We are a professional website!

Proposed resolution

I propose we fix this problem by having Redirect implement hook_entity_update(). From there we can automatically delete duplicate redirect rows and prevent the circular redirect from occurring. This automated behaviour is similar to the automatic alias added to old node and term paths, and would involve minimal new code, not impact existing code, and involve no UI changes.

We should also properly remove old duplicate redirect rows in a Database update, using redirect_delete_multiple().

Remaining tasks

  • Determine best approach for both preventing entity URL aliases from duplicating Redirect sources.
  • Determine best approach for fixing existing bad data (or whether to leave that as a manual process).
  • Create patch combining the above.
  • Testing, including with multi-language data and multiple url_alias entries for a path.
  • Add proper test coverage to the Simpletest at RedirectFunctionalTest::testPathChangeRedirects(). Currently that test doesn't appear to be doing any asserting, but it looks to me like it's performing the exact steps that would cause this error so it would be easy to add test coverage for this bug by asserting that the Oops!... error message was not shown after the submits.
    Edit: Test coverage here: redirect-detect_prevent_circular_redirects_test_only-1796596-18.patch from #18 of [#1796596] .

User interface changes

None.

API changes

None.

Original reports

This issue has been forked from #1796596: Fix and prevent circular redirects, which aims to detect circular redirects in progress and stop them, and originally that issue was forked from #1263832: Remove session-based flood / loop detection since it performs poorly, and has unfixable race conditions which discusses whether to roll back the new flood protection features in rc1. This is not a duplicate of either of those issues, but involved in preventing the problem from happening in future, and fixing the bad data from when it has happened in the past.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

damien_vancouver’s picture

Continuing the discussion from #14 and #15 at #1796596: Fix and prevent circular redirects:

@Eric_A said in #15:

the clean-up query needs work. Would you be willing to open a separate issue? I'd be happy to help on getting that one done. There are at least two things to deal with when running such a database query:
1) Language.
2) The other is the fact that there may be multiple aliases in the database for one system path. The primary one is the one to check against, but it might be the case that (at times) Redirect selects a different one as the primary one than Drupal Core does.

I think it would be wise to keep this issue focused on just one problem space: the last line of defense for one looping case: redirect to self.

Here are improved cleanup query that also joins on language:

Show records to be deleted:

SELECT r.rid, r.language, r.source, r.redirect  FROM redirect r INNER JOIN url_alias u ON r.source = u.alias AND r.redirect = u.source AND r.language = u.language;

Delete redirects shown in above query:

DELETE r FROM redirect r INNER JOIN url_alias u ON r.source = u.alias AND r.redirect = u.source AND r.language = u.language;

As for the multiple aliases, I was assuming there would be no way to have duplicates. But I see they are allowed by the table schema (though I don't understand how one is supposed to edit them in that case - the node edit form just uses the first one in the DB).

The query above will join on those duplicates in url_aliases, and that will prevent the Oops message from appearing everywhere. Would that not be the correct behaviour?

For a database update, I think we should be doing it procedurally anyway, using redirect_delete_multiple(). That is how Redirect module is deleting redirects right now if you delete an entity, in redirect_entity_delete().

matglas86’s picture

I had this problem with a client of ours I fixed it now in a custom module with the hook_entity_update. Here is my code.

function hook_entity_update($entity, $type) {
  
  // If we have path information continue.
  if(isset($entity->path) && isset($entity->path['original'])) {
    $current = NULL;
    $current->source = $entity->path['source'];
    $current->language = $entity->path['language'];
    $current->alias = drupal_get_path_alias($current->source, $current->language);
    
    $original = NULL;
    $original->source = $entity->path['original']['source'];
    $original->language = $entity->path['original']['language'];
    $original->alias = $entity->path['original']['alias'];
    
    //If current is different then what it was check on.
    if ($current != $original && isset($current->alias)) {
      
      // If we find a redirect that is the same as the new alias
      // delete the redirect.
      $redirect = redirect_load_by_source($current->alias, $current->language);
      if (isset($redirect) && is_numeric($redirect->rid)) {
        redirect_delete($redirect->rid);
      }
    }
  }
}
matglas86’s picture

Looking at http://drupalcode.org/project/redirect.git/blob/refs/heads/7.x-1.x:/redi... and think that redirect_path_update() or redirect_path_delete(). Could react also instead of hook_entity_update().

What do you guys think would be best?

Eric_A’s picture

With new entities come new aliases. Existing entities have their aliases changed often enough when editors can't make up their mind. Content language changes at times and thus the alias.
I don't think it makes sense to boldly delete a redirect entity every time there's a (temporary) conflict. We just need to make sure to not proceed with the redirect at a moment in time when it clearly makes no sense.

(One of the challenges with the current system is that node redirects do not simply redirect to a hard coded node path. It redirects to whatever the currently active alias may be.)

Eric_A’s picture

And another note on the proposed query. The options part should be taken into account as well. Those columns are serialized arrays, though. So I'm not a 100% convinced you will always catch them all. Only processing in PHP will.

azinck’s picture

There are a number of ways collisions between aliases and redirects like this can occur. I'm not sure there's a blanket set of logic that we can apply for resolving such collisions and since the path-related hooks (like hook_path_update) are often the only reliable place to detect the collisions that's too late to give a user a chance to make a decision on how to resolve the problem.

What if, instead, we were to build a report to show alias and redirect collisions? We might also give provide some tools there for cleaning up the collisions. To build such a report we need to think through exactly what entails a collision. I started thinking that through but my head began to hurt so I'll leave it for another day :).

azinck’s picture

matglas86’s picture

@azinck thats a good idea. Even if its just to get our head around the different ways collisions can happen. We can work out the different situations and test we could build to try and fix it.

rooby’s picture

Either this issue or #1796596: Fix and prevent circular redirects needs to be closed as a duplicate.

I would assume closing this one as the other one has more comments and patches.

You can edit the first post of that one to be like this one if you need to.

azinck’s picture

#1796596: Fix and prevent circular redirects is pretty much exclusively to do with stopping circular redirects from redirecting indefinitely, which doesn't get at the source of the problem at all.

This issue is more about trying to fix the source of the problem. To make it clearer, I propose retitling this issue: "How to resolve conflicts and undesired interactions between url aliases and redirects".

damien_vancouver’s picture

Status: Active » Closed (duplicate)

I'm marking this issue as a duplicate of the original #1263832: Remove session-based flood / loop detection since it performs poorly, and has unfixable race conditions. I'll post a summary of the recent conversation here on that issue. For rationale behind this change, see comments #32 through #41 at #1796596: Fix and prevent circular redirects.

If you are searching for a patch that prevents the "Oops!" circular redirect error message from appearing, please help us test the latest patch on #1796596: Fix and prevent circular redirects, which should fix your problem.

damien_vancouver’s picture

Title: Updating an entity with an URL alias that exactly matches an existing redirect causes Circular Redirect Oops message » Updating an entity with an URL alias that matches an existing redirect causes bad data and circular redirect
Status: Closed (duplicate) » Needs work

as per #53 at #1796596: Fix and prevent circular redirects, I'm re-opening and re-titling this issue to discuss fixing and/or preventing the bad data that causes Circular Redirects.

Here is some new discussion from that thread, reposted here for convenience.

#49
Posted by DamienMcKenna on January 7, 2013 at 2:42am
Here's an update script I wrote to purge redirects that are causing an infinite loop:

/**
 * Fix infinite redirect loops.
 */
function mymodule_update_700() {
  $results = db_query("SELECT
    r.rid, u.alias
    FROM {url_alias} u
      INNER JOIN {redirect} r
        WHERE u.alias = r.source");

  foreach ($results as $record) {
    db_delete('redirect')
      ->condition('rid', $record->rid)
      ->execute();
    drupal_set_message(t("Removed infinite redirect loop for the path @path", array('@path' => $record->alias)));
  }
}

I decided to do it as two separate steps rather than one large query so there can at least be some logging of the pages that were affected.

#50
Posted by heddn on January 7, 2013 at 2:10pm
Status: needs review » reviewed & tested by the community
Patch in #48 seems to resolve the issue. I tested using system path example.com/node/123 and redirect example.com/foo. Redirects are stopped. If I rename the path alias to foo2, then the redirect logic doesn't get fired, which is also appropriate. Let's get this thing committed.

#51
Posted by heddn on January 7, 2013 at 2:13pm
BTW, I like having #49 available but not included in redirect. Not so sure I want it in a hook_update_N that is provided by redirect because I want to be able to pick and choose which redirects should be deleted.

#52
Posted by a.ross on January 7, 2013 at 3:16pm
Hm, it also seems that removing redirects like that (#49) won't remove all possible loops. It just removes a redirect to itself.
On the other hand IMO some dedicated diagnostics interface just to find redirect loops is overkill - at least for now. It could always be added later if it turns out to be a popular request or something.

#53
Posted by damien_vancouver on January 7, 2013 at 7:49pm new
- The query should join on the "language" column as well or the join may be ambiguous on multilanguage sites.

- Fixing the bad data doesn't make the underlying problem go away. As soon as users are in there working and changing URL paths, then circular redirects start happening again in the database. It happens almost immediately in the Real World.

heddn’s picture

Here's an approach to resolve this issue. If the alias or redirect would create a duplicate, we could gracefully/silently delete the alias or redirect, however I'm taking the stance that I'd rather throw an error.

alias
If a path alias would duplicate a redirect for the current path, don't create the alias and give an error instead

  • implement hook_path_insert/hook_path_update
  • roll back the creation of the alias
  • throw an exception to give feedback to the caller of path_save().

If a path alias would duplicate a redirect for another path, don't create the alias and give an error instead

  • implement hook_path_insert/hook_path_update
  • roll back the creation of the alias
  • throw an exception to give feedback to the caller of path_save().

redirect
If a redirect would duplicate a path alias for the current path, don't create the redirect and give an error

  • throw an exception in redirect_save so the redirect_save rolls back

If a redirect would duplicate a path alias for another entity, don't create the redirect and give an error

  • throw an exception in redirect_save so the redirect_save rolls back
granticusiv’s picture

What I haven't noticed in any of these discussions is a way to remove the alias without any coding. I found that with the latest version of the Metatag module installed, I could edit a node, and delete the extra 'URL redirects'.

rooby’s picture

@granticusiv:

Go to the redirects admin page, which is at admin/config/search/redirect
Then you can find and delete whatever redirect you need to.

granticusiv’s picture

@rooby
Interesting... I thought I'd deleted all the redirects using that method, for the page I was having trouble with. It didn't seem to work when I tried it, only the other method I mentioned. Perhaps there was something else going on I wasn't aware of.

heddn’s picture

I did some more poking around and think that the solution proposed in#13 won't work for path aliases. It should work for redirects. We really need a way to "fail" the path alias creation or update but there aren't any hooks for that in core. But look, there's #683510: Add hook_path_alter() that might help! I think we need to wait for that to go into core before we can see much progress here. In the meantime, some help is offered by #1288768-10: Conflict between an old redirect (with Redirect module) and a new automatic alias (with Pathauto). if you use pathauto.

joshf’s picture

joshf’s picture

This one adds redirect_free_source() to the api (overkill?) which simply deletes any redirect records which match the passed source and issues a warning.

nlisgo’s picture

#19 works great for me. Just altered the patch slightly to fix a php warning:

Warning: htmlspecialchars() expects parameter 1 to be string, array given in check_plain() (line 1545 of /Users/nlisgo/Sites/aeopus/clients/lexisnexis/hsw/hswd7/www/hsw/includes/bootstrap.inc).

David_Rothstein’s picture

+    $warning = t('A @type record using the path @source (pointing to @redirect) has been removed.', array('@type' => $redirect_record->type, '@source' => $redirect_record->source, '@redirect' => $redirect_record->redirect));
+    drupal_set_message($warning,'warning');

What is the point of displaying this warning message on the screen? That is only going to confuse the person creating the content.

Maybe a watchdog message instead?

torotil’s picture

I've re-rolled the patch without the user warning and also deleted a superfluous check for the previously deleted redirect in hook_path_update(). I also made the comments less verbose.

grndlvl’s picture

Minor documentation formating.

brant’s picture

#23 appears to have fixed the issue I was having (renaming page w/ automatic alias from a to b and back to a resulting in redirect loop).

torotil’s picture

Fixed an issue with duplicate redirects being added in the previous patch.

GoZ’s picture

There is an Integrity constraint error if following condition is deleted from redirect_path_update() :

if (!redirect_load_by_hash($hash)) {
  redirect_save($redirect);
}
 PDOException : SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'hQRvApeRCxYTOodkPcd-72whrW7UDHGeIHN9w5_fdSQ' for key 'hash': INSERT INTO {redirect} (hash, type, uid, source, source_options, redirect, redirect_options, language, status_code, count, access) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4, :db_insert_placeholder_5, :db_insert_placeholder_6, :db_insert_placeholder_7, :db_insert_placeholder_8, :db_insert_placeholder_9, :db_insert_placeholder_10); Array ( [:db_insert_placeholder_0] => hQRvApeRCxYTOodkPcd-72whrW7UDHGeIHN9w5_fdSQ [:db_insert_placeholder_1] => redirect [:db_insert_placeholder_2] => 1 [:db_insert_placeholder_3] => bouger/velo/loire-velo-de-fou-0 [:db_insert_placeholder_4] => a:0:{} [:db_insert_placeholder_5] => node/3322 [:db_insert_placeholder_6] => a:0:{} [:db_insert_placeholder_7] => fr [:db_insert_placeholder_8] => 0 [:db_insert_placeholder_9] => 0 [:db_insert_placeholder_10] => 0 ) in drupal_write_record() (line 7136

I also add language parameter in redirect_free_source(). Without that, localized redirects will never be deleted because redirect_load_by_source will take LANGUAGE_NONE by default.

This patch is part of the #1day1patch initiative.

GoZ’s picture

duplicate comment. Why drupal issues doesn't take care of this ??

Elijah Lynn’s picture

Status: Needs review » Reviewed & tested by the community

Thanks GoZ,

#26 works in my tests.

p.s. I am going to cross reference a similar issue for Pathauto that was closed #2065091: Infinite loop when changing alias to existing redirect.

acbramley’s picture

+1 confirm that this fixes this issue

acbramley’s picture

Status: Reviewed & tested by the community » Needs work

I just ran into a problem with this patch (I think).

Steps:

1) Created a node called "test" with alias using node title. Auto generated alias was "test-0" as there was already a test path.
2) Saved node. Got taken to /test-2 which is another node with a redirect from test-0 -> test-2
3) Scratch your head wondering what is going on.

acbramley’s picture

Status: Needs work » Reviewed & tested by the community

Apologies, wrong issue

Dave Reid’s picture

Status: Reviewed & tested by the community » Needs review
+++ b/redirect.module
@@ -408,6 +410,41 @@ function redirect_path_update(array $path) {
+ * Similar to redirect_delete_by_path(), but doesn't recurse sub-paths. This is

Can we instead add an extra parameter to redirect_delete_by_path() that disables the sub-path check rather than adding a new API function?

How should this work if I created a redirect at ferzle?foo=bar and then created an URL alias for 'ferzle'? It looks like this wouldn't delete that redirect in that case, but probably should?

--

I'm wondering if I should add a follow-up issue to add a 'disable' functionality where redirects shouldn't be deleted, but instead disabled?

torotil’s picture

How should this work if I created a redirect at ferzle?foo=bar and then created an URL alias for 'ferzle'? It looks like this wouldn't delete that redirect in that case, but probably should?

I don't think it should delete sub-paths or even path with GET-parameters. The patch was created to avoid redirect loops and it should do exactly that.

I'm wondering if I should add a follow-up issue to add a 'disable' functionality where redirects shouldn't be deleted, but instead disabled?

In most cases the redirects that are deleted by this patch have been created automatically because the node-title (and therefore the automatic alias) for a node was changed. For us having it only disable those redirects would lead to the redirect-table being cluttered with automatically produced disabled redirects. So I'm voting for deletion by default. Including an option can't hurt if someone actually needs it.

GoZ’s picture

+1 for deleting and not disabling redirect. Adding disable information only make table and datas more heavy without really needs.

vinmassaro’s picture

Another vote to delete instead of disable. We get questions about this bug from internal university customers a few times a week and have to have them manually delete their redirects. Thanks to all who are working on this issue!

torotil’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
1.84 KB

Following the suggestion from Dave Reid: Here is a patch that generalizes the redirect_delete_by_path() instead of introducing redirect_free_source().

torotil’s picture

Status: Reviewed & tested by the community » Needs review
GoZ’s picture

Status: Needs review » Needs work

This patch doesn't take care of language anymore. See #26

torotil’s picture

@GoZ: are you sure? If I'm correct your version of the patch was about also deleting redirects with $language!=LANG_NONE. redirect_delete_by_path() doesn't filter at all by language.

If that's a problem I'd vote for the patch in #27 with redirect_free_resource() instead of making redirect_delete_by_path() still more complex by adding a language parameter.

GoZ’s picture

It's a problem :
You can see in patch #26 that to load redirect id, we have to give language parameter to redirect_load_by_source($source, $language) function because redirects can be same for differents languages but not redirect to same nodes.
Example of redirects which work by default with language :
Redirect page1 (FR) => node/2
Redirect page1 (EN) => node/3

So in patch #26, we load redirect id for a path AND a language. And then we can delete it thanks his redirect id and not only path : redirect_delete($redirect_record->rid);.

In patch #36, you don't care of language or redirect id but only path redirect_delete_by_path($path['alias'], FALSE), so it will delete all paths without filtering on language.

I'll make tests to see if it's possible to adapt redirect_delete_by_path() to take care of language.

torotil’s picture

Ah I've overlooked that. Thanks for your explanation!

torotil’s picture

Status: Needs work » Needs review
FileSize
1.98 KB

Here is the patch from #36 with language support added.

jofitz’s picture

Patch applied and solves my problem with redirects conflicting with aliases and causing infinite loops.

gnucifer’s picture

I had a strange issue after applying this patch.

If creating a node (1) with alias "test".

Creating another node (2), and creating a redirect to this from "test".

/test will then redirect to node (2), but if editing the redirect, source and destination will both be set to "node/nid" and "node/nid", not "test", "node/nid" as expected. Don't have time to look into this i further detail right now, but though I could at least flag for this issue.

torotil’s picture

@gnucifer This doesn't sound like something that's affected by this patch. Could you try to find a test-case that works without this patch and doesn't work with it?

j0rd’s picture

Status: Needs review » Needs work

I applied patch 42 and I've been checking the logs for the error, and have not found any problems thus far.

One thing this patch will need to do, is erase the old bad circular redirects which previously were created.

I believe a query like this will help you find any redirects, which match URL aliases and then can be deleted.

SELECT r.rid,r.source FROM redirect r INNER JOIN url_alias a ON r.source=a.alias;

Then delete these bad redirects in an hook_update().

a.ross’s picture

I think it's better not to stall this issue any longer for the benefit of an upgrade script that won't really work anyway. What if you have a redirect like this:
A -> B -> C -> A

Ideally, redirects should be clearly constrained. As in, the URL redirect SHOULD NOT match the source URL of another redirect. When that is all done and adopted, upgrade scripts or whatever could take care of bad redirects based on that.

To illustrate, when you have this situation:

Redirect loop:       A -> B -> C -> D -> A
System URL or alias:           C

this should happen (after removing the existing redirects):

Redirects:
A -> C
B -> C
D -> C

This will of course cause a bit more strain on the database, as existing redirects need to be kept updated, but has the benefits of data integrity and that users won't receive 301 or 302 redirects more than once for any given real URL.

rp7’s picture

I can confirm patch in #42 works.

To clean up the faulty redirects (which we're over 4000 on my installation), following snippet did the job:

$rids = db_query('SELECT r.rid FROM {redirect} r INNER JOIN {url_alias} a ON r.source=a.alias')->fetchCol();
redirect_delete_multiple($rids);
fenstrat’s picture

Tempted to mark this as a duplicate of #1796596: Fix and prevent circular redirects which has a working patch and seems to address the issues raised here.

fenstrat’s picture

Issue summary: View changes
deviantintegral’s picture

Issue summary: View changes
Status: Needs work » Closed (duplicate)
Related issues: +#1796596: Fix and prevent circular redirects