Problem/Motivation
f you try and add a redirect to an existing URL alias, the redirect module correctly prevents you, with the message: "You are attempting to redirect the page to itself. This will result in an infinite loop."
However, nothing prevents you from editing an entity and setting its URL alias to be the same as an existing redirect's source for that entity. Since 7.0-1.0-rc1 of redirect, this causes the following Drupal message when you then view the page:
"Oops, looks like this request tried to create an infinite loop. We do not allow such things here. We are a professional website!"
This situation is quite common in the wild - a user merely needs to change the URL of a page (causing a redirect to be added automatically), then either change their mind and set it back, or turn pathauto's Generate automatic URL alias back on and Pathauto will do that for them.
Steps to Reproduce
The following steps will reproduce the problem from scratch:
- Create and save a test node and set its URL alias to "test" under URL Path Settings.
- Edit your node and in the "URL Redirects" vertical tab press Add URL redirect to this node and
fill in "test2" as the From field, then hit Save. You will be returned to the node edit form. - Now visit the URL path settings vertical tab and change the URL alias to also be "test2" and hit Save.
- Your node will be loaded at path /test2 but you will see the error message: Oops, looks like this request tried to create an infinite loop. We do not allow such things here. We are a professional website!
Proposed resolution
I propose we fix this problem by having Redirect implement hook_entity_update(). From there we can automatically delete duplicate redirect rows and prevent the circular redirect from occurring. This automated behaviour is similar to the automatic alias added to old node and term paths, and would involve minimal new code, not impact existing code, and involve no UI changes.
We should also properly remove old duplicate redirect rows in a Database update, using redirect_delete_multiple().
Remaining tasks
- Determine best approach for both preventing entity URL aliases from duplicating Redirect sources.
- Determine best approach for fixing existing bad data (or whether to leave that as a manual process).
- Create patch combining the above.
- Testing, including with multi-language data and multiple url_alias entries for a path.
Add proper test coverage to the Simpletest at RedirectFunctionalTest::testPathChangeRedirects(). Currently that test doesn't appear to be doing any asserting, but it looks to me like it's performing the exact steps that would cause this error so it would be easy to add test coverage for this bug by asserting that the Oops!... error message was not shown after the submits.
Edit: Test coverage here: redirect-detect_prevent_circular_redirects_test_only-1796596-18.patch from #18 of [#1796596] .
User interface changes
None.
API changes
None.
Original reports
This issue has been forked from #1796596: Fix and prevent circular redirects, which aims to detect circular redirects in progress and stop them, and originally that issue was forked from #1263832: Remove session-based flood / loop detection since it performs poorly, and has unfixable race conditions which discusses whether to roll back the new flood protection features in rc1. This is not a duplicate of either of those issues, but involved in preventing the problem from happening in future, and fixing the bad data from when it has happened in the past.
Comments
Comment #1
damien_vancouver CreditAttribution: damien_vancouver commentedContinuing the discussion from #14 and #15 at #1796596: Fix and prevent circular redirects:
@Eric_A said in #15:
Here are improved cleanup query that also joins on language:
Show records to be deleted:
Delete redirects shown in above query:
As for the multiple aliases, I was assuming there would be no way to have duplicates. But I see they are allowed by the table schema (though I don't understand how one is supposed to edit them in that case - the node edit form just uses the first one in the DB).
The query above will join on those duplicates in url_aliases, and that will prevent the Oops message from appearing everywhere. Would that not be the correct behaviour?
For a database update, I think we should be doing it procedurally anyway, using redirect_delete_multiple(). That is how Redirect module is deleting redirects right now if you delete an entity, in redirect_entity_delete().
Comment #2
matglas86 CreditAttribution: matglas86 commentedI had this problem with a client of ours I fixed it now in a custom module with the hook_entity_update. Here is my code.
Comment #3
matglas86 CreditAttribution: matglas86 commentedLooking at http://drupalcode.org/project/redirect.git/blob/refs/heads/7.x-1.x:/redi... and think that redirect_path_update() or redirect_path_delete(). Could react also instead of hook_entity_update().
What do you guys think would be best?
Comment #4
Eric_A CreditAttribution: Eric_A commentedWith new entities come new aliases. Existing entities have their aliases changed often enough when editors can't make up their mind. Content language changes at times and thus the alias.
I don't think it makes sense to boldly delete a redirect entity every time there's a (temporary) conflict. We just need to make sure to not proceed with the redirect at a moment in time when it clearly makes no sense.
(One of the challenges with the current system is that node redirects do not simply redirect to a hard coded node path. It redirects to whatever the currently active alias may be.)
Comment #5
Eric_A CreditAttribution: Eric_A commentedAnd another note on the proposed query. The options part should be taken into account as well. Those columns are serialized arrays, though. So I'm not a 100% convinced you will always catch them all. Only processing in PHP will.
Comment #6
azinck CreditAttribution: azinck commentedThere are a number of ways collisions between aliases and redirects like this can occur. I'm not sure there's a blanket set of logic that we can apply for resolving such collisions and since the path-related hooks (like hook_path_update) are often the only reliable place to detect the collisions that's too late to give a user a chance to make a decision on how to resolve the problem.
What if, instead, we were to build a report to show alias and redirect collisions? We might also give provide some tools there for cleaning up the collisions. To build such a report we need to think through exactly what entails a collision. I started thinking that through but my head began to hurt so I'll leave it for another day :).
Comment #7
azinck CreditAttribution: azinck commentedA closely related discussion: #1288768: Conflict between an old redirect (with Redirect module) and a new automatic alias (with Pathauto).
Comment #8
matglas86 CreditAttribution: matglas86 commented@azinck thats a good idea. Even if its just to get our head around the different ways collisions can happen. We can work out the different situations and test we could build to try and fix it.
Comment #9
rooby CreditAttribution: rooby commentedEither this issue or #1796596: Fix and prevent circular redirects needs to be closed as a duplicate.
I would assume closing this one as the other one has more comments and patches.
You can edit the first post of that one to be like this one if you need to.
Comment #10
azinck CreditAttribution: azinck commented#1796596: Fix and prevent circular redirects is pretty much exclusively to do with stopping circular redirects from redirecting indefinitely, which doesn't get at the source of the problem at all.
This issue is more about trying to fix the source of the problem. To make it clearer, I propose retitling this issue: "How to resolve conflicts and undesired interactions between url aliases and redirects".
Comment #11
damien_vancouver CreditAttribution: damien_vancouver commentedI'm marking this issue as a duplicate of the original #1263832: Remove session-based flood / loop detection since it performs poorly, and has unfixable race conditions. I'll post a summary of the recent conversation here on that issue. For rationale behind this change, see comments #32 through #41 at #1796596: Fix and prevent circular redirects.
If you are searching for a patch that prevents the "Oops!" circular redirect error message from appearing, please help us test the latest patch on #1796596: Fix and prevent circular redirects, which should fix your problem.
Comment #12
damien_vancouver CreditAttribution: damien_vancouver commentedas per #53 at #1796596: Fix and prevent circular redirects, I'm re-opening and re-titling this issue to discuss fixing and/or preventing the bad data that causes Circular Redirects.
Here is some new discussion from that thread, reposted here for convenience.
#49
Posted by DamienMcKenna on January 7, 2013 at 2:42am
Here's an update script I wrote to purge redirects that are causing an infinite loop:
I decided to do it as two separate steps rather than one large query so there can at least be some logging of the pages that were affected.
#50
Posted by heddn on January 7, 2013 at 2:10pm
Status: needs review » reviewed & tested by the community
Patch in #48 seems to resolve the issue. I tested using system path example.com/node/123 and redirect example.com/foo. Redirects are stopped. If I rename the path alias to foo2, then the redirect logic doesn't get fired, which is also appropriate. Let's get this thing committed.
#51
Posted by heddn on January 7, 2013 at 2:13pm
BTW, I like having #49 available but not included in redirect. Not so sure I want it in a hook_update_N that is provided by redirect because I want to be able to pick and choose which redirects should be deleted.
#52
Posted by a.ross on January 7, 2013 at 3:16pm
Hm, it also seems that removing redirects like that (#49) won't remove all possible loops. It just removes a redirect to itself.
On the other hand IMO some dedicated diagnostics interface just to find redirect loops is overkill - at least for now. It could always be added later if it turns out to be a popular request or something.
#53
Posted by damien_vancouver on January 7, 2013 at 7:49pm new
- The query should join on the "language" column as well or the join may be ambiguous on multilanguage sites.
- Fixing the bad data doesn't make the underlying problem go away. As soon as users are in there working and changing URL paths, then circular redirects start happening again in the database. It happens almost immediately in the Real World.
Comment #13
heddnHere's an approach to resolve this issue. If the alias or redirect would create a duplicate, we could gracefully/silently delete the alias or redirect, however I'm taking the stance that I'd rather throw an error.
alias
If a path alias would duplicate a redirect for the current path, don't create the alias and give an error instead
If a path alias would duplicate a redirect for another path, don't create the alias and give an error instead
redirect
If a redirect would duplicate a path alias for the current path, don't create the redirect and give an error
If a redirect would duplicate a path alias for another entity, don't create the redirect and give an error
Comment #14
granticusiv CreditAttribution: granticusiv commentedWhat I haven't noticed in any of these discussions is a way to remove the alias without any coding. I found that with the latest version of the Metatag module installed, I could edit a node, and delete the extra 'URL redirects'.
Comment #15
rooby CreditAttribution: rooby commented@granticusiv:
Go to the redirects admin page, which is at admin/config/search/redirect
Then you can find and delete whatever redirect you need to.
Comment #16
granticusiv CreditAttribution: granticusiv commented@rooby
Interesting... I thought I'd deleted all the redirects using that method, for the page I was having trouble with. It didn't seem to work when I tried it, only the other method I mentioned. Perhaps there was something else going on I wasn't aware of.
Comment #17
heddnI did some more poking around and think that the solution proposed in#13 won't work for path aliases. It should work for redirects. We really need a way to "fail" the path alias creation or update but there aren't any hooks for that in core. But look, there's #683510: Add hook_path_alter() that might help! I think we need to wait for that to go into core before we can see much progress here. In the meantime, some help is offered by #1288768-10: Conflict between an old redirect (with Redirect module) and a new automatic alias (with Pathauto). if you use pathauto.
Comment #18
joshf CreditAttribution: joshf commentedEDIT: Disregard; I didn't see #1796596: Fix and prevent circular redirects
Comment #19
joshf CreditAttribution: joshf commentedThis one adds redirect_free_source() to the api (overkill?) which simply deletes any redirect records which match the passed source and issues a warning.
Comment #20
nlisgo CreditAttribution: nlisgo commented#19 works great for me. Just altered the patch slightly to fix a php warning:
Warning: htmlspecialchars() expects parameter 1 to be string, array given in check_plain() (line 1545 of /Users/nlisgo/Sites/aeopus/clients/lexisnexis/hsw/hswd7/www/hsw/includes/bootstrap.inc).
Comment #21
David_Rothstein CreditAttribution: David_Rothstein commentedWhat is the point of displaying this warning message on the screen? That is only going to confuse the person creating the content.
Maybe a watchdog message instead?
Comment #22
torotil CreditAttribution: torotil commentedI've re-rolled the patch without the user warning and also deleted a superfluous check for the previously deleted redirect in hook_path_update(). I also made the comments less verbose.
Comment #23
grndlvl CreditAttribution: grndlvl commentedMinor documentation formating.
Comment #24
brant CreditAttribution: brant commented#23 appears to have fixed the issue I was having (renaming page w/ automatic alias from a to b and back to a resulting in redirect loop).
Comment #25
torotil CreditAttribution: torotil commentedFixed an issue with duplicate redirects being added in the previous patch.
Comment #26
GoZ CreditAttribution: GoZ commentedThere is an Integrity constraint error if following condition is deleted from redirect_path_update() :
I also add language parameter in redirect_free_source(). Without that, localized redirects will never be deleted because redirect_load_by_source will take LANGUAGE_NONE by default.
This patch is part of the #1day1patch initiative.
Comment #27
GoZ CreditAttribution: GoZ commentedduplicate comment. Why drupal issues doesn't take care of this ??
Comment #28
Elijah LynnThanks GoZ,
#26 works in my tests.
p.s. I am going to cross reference a similar issue for Pathauto that was closed #2065091: Infinite loop when changing alias to existing redirect.
Comment #29
acbramley CreditAttribution: acbramley commented+1 confirm that this fixes this issue
Comment #30
acbramley CreditAttribution: acbramley commentedI just ran into a problem with this patch (I think).
Steps:
1) Created a node called "test" with alias using node title. Auto generated alias was "test-0" as there was already a test path.
2) Saved node. Got taken to /test-2 which is another node with a redirect from test-0 -> test-2
3) Scratch your head wondering what is going on.
Comment #31
acbramley CreditAttribution: acbramley commentedApologies, wrong issue
Comment #32
Dave ReidCan we instead add an extra parameter to redirect_delete_by_path() that disables the sub-path check rather than adding a new API function?
How should this work if I created a redirect at ferzle?foo=bar and then created an URL alias for 'ferzle'? It looks like this wouldn't delete that redirect in that case, but probably should?
--
I'm wondering if I should add a follow-up issue to add a 'disable' functionality where redirects shouldn't be deleted, but instead disabled?
Comment #33
torotil CreditAttribution: torotil commentedI don't think it should delete sub-paths or even path with GET-parameters. The patch was created to avoid redirect loops and it should do exactly that.
In most cases the redirects that are deleted by this patch have been created automatically because the node-title (and therefore the automatic alias) for a node was changed. For us having it only disable those redirects would lead to the redirect-table being cluttered with automatically produced disabled redirects. So I'm voting for deletion by default. Including an option can't hurt if someone actually needs it.
Comment #34
GoZ CreditAttribution: GoZ commented+1 for deleting and not disabling redirect. Adding disable information only make table and datas more heavy without really needs.
Comment #35
vinmassaro CreditAttribution: vinmassaro commentedAnother vote to delete instead of disable. We get questions about this bug from internal university customers a few times a week and have to have them manually delete their redirects. Thanks to all who are working on this issue!
Comment #36
torotil CreditAttribution: torotil commentedFollowing the suggestion from Dave Reid: Here is a patch that generalizes the redirect_delete_by_path() instead of introducing redirect_free_source().
Comment #37
torotil CreditAttribution: torotil commentedComment #38
GoZ CreditAttribution: GoZ commentedThis patch doesn't take care of language anymore. See #26
Comment #39
torotil CreditAttribution: torotil commented@GoZ: are you sure? If I'm correct your version of the patch was about also deleting redirects with $language!=LANG_NONE. redirect_delete_by_path() doesn't filter at all by language.
If that's a problem I'd vote for the patch in #27 with redirect_free_resource() instead of making redirect_delete_by_path() still more complex by adding a language parameter.
Comment #40
GoZ CreditAttribution: GoZ commentedIt's a problem :
You can see in patch #26 that to load redirect id, we have to give language parameter to
redirect_load_by_source($source, $language)
function because redirects can be same for differents languages but not redirect to same nodes.Example of redirects which work by default with language :
Redirect page1 (FR) => node/2
Redirect page1 (EN) => node/3
So in patch #26, we load redirect id for a path AND a language. And then we can delete it thanks his redirect id and not only path :
redirect_delete($redirect_record->rid);
.In patch #36, you don't care of language or redirect id but only path
redirect_delete_by_path($path['alias'], FALSE)
, so it will delete all paths without filtering on language.I'll make tests to see if it's possible to adapt redirect_delete_by_path() to take care of language.
Comment #41
torotil CreditAttribution: torotil commentedAh I've overlooked that. Thanks for your explanation!
Comment #42
torotil CreditAttribution: torotil commentedHere is the patch from #36 with language support added.
Comment #43
jofitz CreditAttribution: jofitz commentedPatch applied and solves my problem with redirects conflicting with aliases and causing infinite loops.
Comment #44
gnucifer CreditAttribution: gnucifer commentedI had a strange issue after applying this patch.
If creating a node (1) with alias "test".
Creating another node (2), and creating a redirect to this from "test".
/test will then redirect to node (2), but if editing the redirect, source and destination will both be set to "node/nid" and "node/nid", not "test", "node/nid" as expected. Don't have time to look into this i further detail right now, but though I could at least flag for this issue.
Comment #45
torotil CreditAttribution: torotil commented@gnucifer This doesn't sound like something that's affected by this patch. Could you try to find a test-case that works without this patch and doesn't work with it?
Comment #46
j0rd CreditAttribution: j0rd commentedI applied patch 42 and I've been checking the logs for the error, and have not found any problems thus far.
One thing this patch will need to do, is erase the old bad circular redirects which previously were created.
I believe a query like this will help you find any redirects, which match URL aliases and then can be deleted.
Then delete these bad redirects in an hook_update().
Comment #47
a.ross CreditAttribution: a.ross commentedI think it's better not to stall this issue any longer for the benefit of an upgrade script that won't really work anyway. What if you have a redirect like this:
A -> B -> C -> A
Ideally, redirects should be clearly constrained. As in, the URL redirect SHOULD NOT match the source URL of another redirect. When that is all done and adopted, upgrade scripts or whatever could take care of bad redirects based on that.
To illustrate, when you have this situation:
this should happen (after removing the existing redirects):
This will of course cause a bit more strain on the database, as existing redirects need to be kept updated, but has the benefits of data integrity and that users won't receive 301 or 302 redirects more than once for any given real URL.
Comment #48
rp7 CreditAttribution: rp7 commentedI can confirm patch in #42 works.
To clean up the faulty redirects (which we're over 4000 on my installation), following snippet did the job:
Comment #49
fenstratTempted to mark this as a duplicate of #1796596: Fix and prevent circular redirects which has a working patch and seems to address the issues raised here.
Comment #49.0
fenstratreference test posted to #1796596: Fix and prevent circular redirects
Comment #50
deviantintegral CreditAttribution: deviantintegral commentedI've merged this patch with #1796596: Fix and prevent circular redirects over at https://drupal.org/comment/8220263#comment-8220263