Add Wildcards to Redirect Paths
| Project: | Path Redirect |
| Version: | 6.x-1.x-dev |
| Component: | Code |
| Category: | feature request |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | needs work |
This patch adds support for entering a wildcard in the from path.
For instance, if you have pathauto set up to create image nodes at photos/[author-username]/[node-title], then you might want to create a redirect from:
photos/<*>
which would redirect to a View displaying a tab at
user/<*>/photos
I've chosen to use <*> as the wildcard because * by itself is a valid URL character, however < and > are not. Therefore, we're guaranteed that this will never conflict with a valid path.
Some points for discussion:
Right now we're limited to one wildcard and it has to be a distinct path segment (delimited by /). For instance, you can't redirect from user<*> or our_old_legacy/url/<*>.html. The reason for this is below.
For performance reasons, I've opted NOT to use the SQL "LIKE" selector to match the path. My feeling is that this would be too heavy a query to impose on every page load (including those with page caching enabled). However, this point is open for debate. Additionally, we'd need to do the LIKE query in reverse, because the wildcards are in the database and we're matching against the known path value. If we could get it to work (and we felt it wasn't a performance problem), we could gain much more complex selectors.
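To make the "reverse LIKE" idea concrete, here is a minimal sketch in Python with an in-memory SQLite database. It is not the patch's code: the `find_redirect` helper and the two-column table are assumptions for illustration, and the query turns each stored `<*>` into `%` so the known request path can be compared against the stored pattern.

```python
import sqlite3

def find_redirect(conn, path):
    """Reverse LIKE: the known path is the value, the stored row is the pattern."""
    return conn.execute(
        "SELECT path, redirect FROM path_redirect "
        "WHERE ? LIKE REPLACE(path, '<*>', '%') LIMIT 1",
        (path,),
    ).fetchone()

# Hypothetical table with just the two columns needed for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE path_redirect (path TEXT, redirect TEXT)")
conn.execute(
    "INSERT INTO path_redirect VALUES ('our_old_legacy/url/<*>.html', 'node/<*>')"
)

print(find_redirect(conn, "our_old_legacy/url/about.html"))
print(find_redirect(conn, "some/other/page"))
```

A query like this cannot use the index on the path column, which is the performance concern raised above. Note also that `_` is itself a single-character wildcard in LIKE, so a real implementation would need to escape it.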
Right now, the query is built by building an array of the possible path matches (substituting the wildcard in each segment position) and then using the IN selector to do an exact match on the (indexed) path column. So for instance, visiting node/3/edit builds this query:
SELECT path, redirect, query, fragment, type FROM path_redirect WHERE path IN ('node/3/edit','<*>/3/edit','node/<*>/edit','node/3/<*>') LIMIT 1
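The candidate-building step can be sketched as follows. This is a Python illustration rather than the module's PHP, and the function names are hypothetical; it only shows how substituting the wildcard into each segment position yields the list for the IN clause.

```python
WILDCARD = "<*>"

def candidate_paths(path):
    """Return the exact path plus one variant per segment with the wildcard substituted."""
    segments = path.split("/")
    candidates = [path]
    for i in range(len(segments)):
        variant = list(segments)
        variant[i] = WILDCARD
        candidates.append("/".join(variant))
    return candidates

def build_query(path):
    """Build a parameterized IN query over the candidate paths."""
    candidates = candidate_paths(path)
    placeholders = ", ".join(["?"] * len(candidates))
    sql = (
        "SELECT path, redirect, query, fragment, type FROM path_redirect "
        "WHERE path IN (" + placeholders + ") LIMIT 1"
    )
    return sql, candidates

sql, params = build_query("node/3/edit")
print(sql)
print(params)
```

The number of candidates grows linearly with the number of path segments, so the exact-match lookup stays cheap even for deep paths.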
We're okay with "IN" on PostgreSQL, right? And LIMIT?
Another nice bit is that you can use <*> not only in the normal to path, but you can also use it in the query or the fragment. So you could redirect user/<*> to userlist#<*> or userlist?user=<*>. Makes for some interesting possibilities.
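As a rough illustration of how the captured segment could flow into the path, query, and fragment, here is a Python sketch. The helper names are hypothetical and the matching rule (exactly one wildcard occupying a whole segment) is taken from the limitations described above.

```python
WILDCARD = "<*>"

def match_wildcard(pattern, path):
    """Return the segment matched by the single wildcard, or None if there is no match."""
    pattern_segments = pattern.split("/")
    path_segments = path.split("/")
    if len(pattern_segments) != len(path_segments):
        return None
    if pattern_segments.count(WILDCARD) != 1:
        return None
    value = None
    for expected, actual in zip(pattern_segments, path_segments):
        if expected == WILDCARD:
            value = actual
        elif expected != actual:
            return None
    return value

def build_target(value, redirect, query="", fragment=""):
    """Substitute the captured value into the to path, query, and fragment."""
    url = redirect.replace(WILDCARD, value)
    if query:
        url += "?" + query.replace(WILDCARD, value)
    if fragment:
        url += "#" + fragment.replace(WILDCARD, value)
    return url

captured = match_wildcard("user/<*>", "user/alice")
print(build_target(captured, "userlist", fragment="<*>"))
print(build_target(captured, "userlist", query="user=<*>"))
```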
| Attachment | Size |
|---|---|
| path_redirect-wildcard.patch | 6.01 KB |

#1
Subscribing; very cool. I double-checked, and this worked fine in Postgres 8.1.9:
SELECT nid, title FROM {node} WHERE nid IN (3,1,4) LIMIT 2
#2
This improvement is MUCH-NEEDED.
But...
I applied the patch. But I can't save a redirection with <*> in the destination field. This error is displayed: "The redirect to path does not appear valid."
Checking the code, it seems to be lacking a way to allow this non-standard URL content (much as it allows ?).
#3
Oop. Yes. I have a fix for this. Will upload soon.
-j
#4
With a site I'm working on, I encountered a situation where we essentially need mod_rewrite with user permissions. My solution for this problem was to use preg_replace and to allow the user to put in a regular expression. This works really well because it allowed us to do more complex patterns such as:
Source URL -> Destination URL
categories/services/(.+) -> $1 // Allows us to redirect taxonomy pages to the service page, so we don't have to change any taxonomy URL's
(.+/courses/.+)/20[0-1][0-9]-[0-9][0-9]-[0-9][0-9] -> $1 // Redirects users from a specific course date to the course info page.
This presumably gets around the issues you mentioned above (every page load is just a SELECT src, dst...) and the heavy work is done in PHP's preg_replace.
The biggest problem with this approach is that ATM it's possible for an admin to break things by putting in a regex which won't compile. I suppose that could be fixed if PHP4 supports try{} catch{}, but I need to find that out.
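One way to guard against patterns that won't compile is to validate them when the admin saves the rule. The sketch below shows the idea in Python rather than PHP (as far as I know, exceptions only arrived in PHP 5, so a PHP 4 version would have to check the return value of the preg functions instead). The function names are hypothetical, and Python writes the backreference as `\1` where the rules above use `$1`.

```python
import re

def validate_pattern(pattern):
    """Return None if the pattern compiles, otherwise the error message to show the admin."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None

def apply_rule(pattern, replacement, path):
    """Return the rewritten path if the whole path matches the pattern, else None."""
    match = re.fullmatch(pattern, path)
    if match is None:
        return None
    return match.expand(replacement)

print(validate_pattern("categories/services/(.+)"))
print(validate_pattern("categories/services/(.+") is not None)
print(apply_rule(r"categories/services/(.+)", r"\1", "categories/services/plumbing"))
```

Rejecting a broken pattern at save time means the page-load path only ever sees rules that are known to compile.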
Anyways, I've attached my module for reference in case you find anything that's useful. "As is" it works without conflicting with path_redirect, but I'd much rather integrate this functionality here :)
Thanks,
--Andrew
#5
Bleh. I can't upload a tgz or upload multiple files apparently. Here's the rest of the module.
Or not. I apparently can't upload a .module either. Here's a link to the archive: http://www.cs-club.org/~andrew/files/url_redirect-10182007.tar.gz
#6
I think the attached patch fixes this, but it's my first attempt, so go easy on me :)
Patch is against the 11.05.07 5.x-1.x-dev version.
#7
I'll reroll this against the latest 5.x dev version, hopefully incorporating hawkdrupal's related request.
#8
I suggest storing wildcards in the database as spaces, since the space has the lowest printable character code (32). The exclamation mark (character code 33) is, according to [RFC1738](http://www.faqs.org/rfcs/rfc1738.html), allowed in URLs. Spaces aren't allowed and will be encoded as %20 or +. This means we can use spaces for wildcards without problems. With "ORDER BY path DESC" in queries, wildcards will sort last.
And by adding a weight to redirects you can also override this.
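The ordering argument can be checked with a small Python sketch. It assumes plain byte-wise comparison; a real database collation may order strings differently, so treat this as an illustration of the idea only. The function name is hypothetical.

```python
def most_specific_first(paths):
    """Sort descending, so paths with a space wildcard come after concrete paths."""
    return sorted(paths, reverse=True)

# A space stands in for the wildcard, as suggested above.
stored = ["node/ /edit", " /3/edit", "node/3/edit", "node/3/ "]
for path in most_specific_first(stored):
    print(repr(path))
```

Because the space compares lower than any other printable character, an exact match always sorts ahead of a wildcard in the same position, and taking the first row favors the most specific redirect.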
#9
I've installed the latest 5.x-1.x-dev version (December 16th), and there are no visible differences in the admin interface. Wildcards are also not accepted.
I've checked the code, and it seems path_redirect-5.x-1.x_i18n_15.patch (which, as I understand it, should also resolve the Add Wildcards to Redirect Paths issue I am mainly concerned about) was not applied. I've applied it to this latest version, but there are still no noticeable changes: paths are displayed as before, although some differences were announced (showing URLs as saved in the database), and if I enter an asterisk, either as * or as <*>, it is not accepted and only the error message is displayed.
I am also quite confused now about which patch should resolve which issue (there are 3 or 4 "active" now). It would help to define more clearly which issues will be resolved by which approach: either each issue separately, with a patch provided there, or all together (if something like mod_rewrite will be used to resolve the Add Wildcards to Redirect Paths and/or Obscuring true paths of redirects is confusing and buggy issues as well). That way we would know which issues to follow when looking for a solution. At the moment I mostly find references to patches attached to other issues, which in turn mostly end in references to yet other issues.
So, about my problem: is there any code (a patch or a new development version) ready that enables the use of wildcards? If yes, what exactly should be used (or done) to enable it? If it should already work, how exactly should the wildcard character be entered? I would suggest putting this info into the "Enter a Drupal path or path alias to redirect" text displayed when adding (or editing) a redirect.
#10
Thanks for keeping this issue active, LUTi! Here is a patch against the very latest dev version, which as of this posting probably isn't quite available for download yet. (They're generated every 12 hours, I believe, so you should be able to get it soon!)
I haven't tested the patch extensively, but it appears to basically work. The only significant change I made from the previous patch was to have the "test" link in the admin interface still work by having it replace the wildcard with the string test.
#11
For what it's worth, my thinking is that 5.x-1.1 should come out very soon, but without the wildcard feature. I'm eager to get 1.1 out because Postgres is fixed in dev but still broken in beta1.
I have some changes I need to port to 6.x dev, too; I'm sure we'll have a stable release ready by the time D6 is out!
#12
HorsePunchKid, thank you for providing the patch.
I've downloaded the latest version of module (1.3.2.16 2007/12/21) from CVS first, and applied the patch to it. When I try to use the wildcard <*> in "From:" only, but not in "To:" as well (to redirect all partially saved paths to a single page, informing the visitor that such URL doesn't exist...), such a redirect can not be saved - I get the message "The redirect to path does not appear valid." only.
So, I would suggest:
1. To allow many URLs to be redirected to a single URL only (= use of a wildcard in "From:" field only)
2. To make wildcards work as in filesystem paths (to respect other characters in a path segment and to have a single character wildcard as well)
I need a second feature for partially cut URLs (paths as http://.../image123.jpg, http://.../image234.jpg, http://.../image345.jpg etc. wrongly saved by some robot(s) as http://.../image123, http://.../image234, http://.../image345 etc.), which I would like to redirect to a single page with a message (mainly to prevent filling my logs with errors, but possibly also to inform the administrator over there about his errors). Of course I have to keep links to images - so in my example, I would need to redirect http://.../image???, but not http://.../*.
Or, to introduce another field with exceptions - redirect http://.../*, but not http://.../*.jpg and http://.../*.png and http://.../*.gif (or, at least http://.../*.*). Maybe to simply use regular expressions would be the best, but probably there would be quite a lot of users not familiar enough with them (I am the first one like that), so there should be a good help with typical examples practically a must.
I am aware I ask quite a lot (with the 2nd suggestion), but by my opinion wildcards would expand the possibilities (usability) of this module really significantly.
In any case, could you implement at least the 1st suggestion soon, please.
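For reference, filesystem-style wildcards with an exclusion list could look roughly like this. The sketch uses Python's standard `fnmatch` module rather than anything from the module itself, and `should_redirect` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

def should_redirect(path, include, exclude=()):
    """Shell-style matching: ? is one character, * is any run of characters."""
    if any(fnmatchcase(path, pattern) for pattern in exclude):
        return False
    return fnmatchcase(path, include)

# Single-character wildcards: catches the truncated URLs but not the real images.
print(should_redirect("image123", "image???"))
print(should_redirect("image123.jpg", "image???"))

# Catch-all with exceptions for image files.
images = ("*.jpg", "*.png", "*.gif")
print(should_redirect("image123", "*", exclude=images))
print(should_redirect("image123.jpg", "*", exclude=images))
```

Either form covers the truncated-image case: `image???` matches only names of exactly that length, while the exclusion list keeps the real image URLs out of a catch-all rule.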
#13
I tried to apply patch #10 against today's dev version and it gives:
patching file path_redirect.module
Hunk #3 FAILED at 212.
Hunk #4 succeeded at 214 (offset -11 lines).
Hunk #5 FAILED at 298.
2 out of 5 hunks FAILED -- saving rejects to file path_redirect.module.rej
[root@host path_redirect]#
#14
Thank you for the reminder; I'll try to get this patch updated to apply to the 1.1 release within the next couple of days. If anybody else would like to help with this feature, have at it!
#15
Here is some grease for the squeaky wheel. :)
This should apply cleanly to the latest dev version and stands a good chance of applying to the 1.1 release, too. Get it while it's hot!
#16
patch worked this time. but, unfortunately, did not solve my problem described also on http://drupal.org/node/195853#comment-677007
current patch makes it possible to redirect paths with only one single wildcard in the url. for example this works:
Drupal URL: 'articles/<*>'
Redirect path: 'http://oldversion.mysite.com/articles/<*>'
only in case if wildcard <*> replaces a single word.
But it does not work, if you try to replace path with several sections in between symbols '/'. For example, this kind of addresses from my pre-drupal site are not possible to redirect: http://oldversion.mysite.com/articles/2006/06/16/greenline
I also tried:
Drupal URL: 'articles/<*>/<*>/<*>/<*>/<*>'
Redirect path: 'http://oldversion.mysite.com/<*>/<*>/<*>/<*>/<*>'
Unfortunately, it did not work.
I am looking for a way to redirect thousands of old URLs on my site, like: http://www.mysite.com/articles/[year]/[month]/[day]/[short_title] to http://old.mysite.com/articles/[year]/[month]/[day]/[short_title]
Can you please advise on the best way?
#17
It sounds like you need mod_rewrite. I don't think this module will ever duplicate all of the functionality of mod_rewrite; I'm not sure what the point would be.
#18
+1 subscribing, and waiting until this thread pans out into something simpler for my tiny mind to read :)
#19
+1
#20
Any news about this patch? When will it be applied? Or has it already been applied?
Thanks
#21
+1
#22
+1. I'd very very very much like to see this feature make it into the D6 version as well.
#23
Not sure if I should file this as a separate bug report, but this feature is not working for me in the 5.x-1.2 version (it has worked for me with various patches in the past). I can confirm that the wildcard code is present in path_redirect.module, but when I enter:
foo/<*>
in the from field and
bar/<*>
in the to field, I get the error message:
The redirect to path does not appear valid.
#24
I suggest using '%' for wildcards rather than '<*>' for more consistency with hook_menu().
#25
I'd like to resurrect this issue. I could really use this feature on a site I'm working on, but I'd rather not downgrade to the old version to use this patch. Can someone give a status update? Is this feature going to be included in a future version?
#26
Marked #291504: Redirect wildcard and #312708: "Templated" redirection as a dup of this issue. Existing code from #306475: Regular Expression Path Matching may also be useful.
#27
please.....
Could someone post the latest working module for Drupal 5 with wildcard support? Even one that works only partially would help.
I attempted to patch 5.1.2, but the code has changed, obviously.
Thanks in advance.
#28
subscribing
#29
Is there a patch for D6?
#30
Not yet. If this gets in, it will be accepted for 6.x-1.x only. I don't use Drupal 5 myself anymore so I don't have motivation to fix anything in 5.x besides bug fixes. If someone provides a patch that backports the change from 6.x-1.x, I'll review it.
#31
Hey -
I needed a quick fix for the D6 version and - doh - I didn't look at the issue queue before patching it myself against 6.x-1.0-beta1.
So, here's what I have done. I hope it doesn't confuse the readers of this thread.
I promise to have a look at merging/working in the code/ideas expressed above shortly...
Cheers
fp
#32
Is there a patch for beta 3?
#33
beta 1
#34
This thought would be great for me as well. Though it's not a big deal for me yet, so not worth patching my copy, but I'm excited to see that this may be in future releases! Yay!
#35
Hey, this is great.
Is it possible to implement token support?
Something like a redirect from user/profile/[realname] to user/[realname] would be great.
Case-study: I use core.profile for birthday.module functionality and content profile because of cck fields, taxonomy etc. I merged both in one content profile and would like to redirect automatically from core.profile to content profile.
That would be great!
Regards,
Stefan
#36
Subscribing.
#37
Just subscribing and adding my two cents.
I think this would be great to have as a feature. I would prefer regular expressions over wildcards personally. That mirrors a bit better the typical way of doing it with the .htaccess. Additionally allowing users to specify weights in some situations may be useful. That way someone could override one particular rewrite being done through a regular expression with one that is done through a written out path.
#38
I'm VERY interested in this functionality as well. I currently run Drupal 6.12, but I'm coming over from an old PHPNuke site. According to my site traffic stats, I'm losing a ton of traffic when people click on links to URLs that were valid on the PHPNuke site. I moved to Drupal when my old ISP suddenly made changes that broke my old site.
Anyway, I've put up the corpse of my old site as well, but instead of:
http://www.whatever
it's now located at:
http://old.whatever
To give a more specific example, I want to redirect incoming requests for:
http://www.comicsonline.com/modules.php?[wildcard]
to:
http://old.comicsonline.com/modules.php?[wildcard]
so that:
http://www.comicsonline.com/modules.php?name=News&file=article&sid=1205
and
http://www.comicsonline.com/modules.php?name=Forums&file=viewtopic&t=60
redirect to:
http://old.comicsonline.com/modules.php?name=News&file=article&sid=1205
and
http://old.comicsonline.com/modules.php?name=Forums&file=viewtopic&t=60
...respectively. Will fp's module at http://drupal.org/files/issues/path_redirect-182512-6.x-1.x.patch do this or will I need to do more to it or wait for that functionality to be implemented?
#39
I'm half-guessing, but from first glance it looks like you should be able to do this with mod_rewrite rules in htaccess - indeed it may be more appropriate to do it that way. Find a rewrite expert and get a second opinion.
#40
I created a patched module based on latest dev version.
Files patched:
path_redirect.install
path_redirect.module
path_redirect.admin.inc
See attached:
#41
I 2nd the Perl style regular expressions!! That would provide me total control over the URI's coming into drupal as well as prevent me from mucking with the .htaccess file.
Thanks!
#42
Here is a patch that accepts wildcards with minimal changes to the module by utilizing REGEXP (mysql).
If the admin adds the redirect, from=drupal* and to=http://drupal.org*, a user will be directed like so...
q=drupal/node/182512 to http://drupal.org/node/182512
q=drupalnode to http://drupal.orgnode
q=drupal to http://drupal.org
If the admin adds the redirect, from=drupal* and to=http://drupal.org, a user will be directed like so...
q=drupal/node/182512 to http://drupal.org
q=drupalnode to http://drupal.org
q=drupal to http://drupal.org
If the admin adds the redirect, from=drupal and to=http://drupal.org, then it is the default behavior.
Tested in Drupal 6.14 and is in a live site with no problem.
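The behavior described in the examples above amounts to a prefix match with the remainder appended. Here is a minimal Python sketch of that description; it is not the patch's REGEXP-based code, and the `resolve` helper is hypothetical.

```python
def resolve(path, source, target):
    """Apply a trailing-* rule: match on the prefix, optionally carry the remainder over."""
    if not source.endswith("*"):
        # No wildcard: default exact-match behavior.
        return target if path == source else None
    prefix = source[:-1]
    if not path.startswith(prefix):
        return None
    if target.endswith("*"):
        # Append whatever followed the prefix in the requested path.
        return target[:-1] + path[len(prefix):]
    return target

for requested in ("drupal/node/182512", "drupalnode", "drupal"):
    print(requested, "->", resolve(requested, "drupal*", "http://drupal.org*"))
```

As the `drupalnode` case shows, the wildcard is not tied to a path segment boundary, so a rule meant for `drupal/...` also catches any path that merely starts with the same letters.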
We initially have the redirections in the .htaccess as rewrite rules (see below), but it was getting a pain to manage, etc.
RewriteRule ^drupal(.*)$ http://drupal.org$1 [R=302,L]
#43
ronn,
thanks for sharing the code.
Unfortunately, it doesn't seem to work for my case, where I would need to redirect from:
http://<mywebsite>/catalogue/current*
to
http://<mywebsite>/catalogue/2009/english*
where * shall cover also everything after a ? as for example URL:
http://<mywebsite>/catalogue/current?item=100&colour=1
(should be redirected to: http://<mywebsite>/catalogue/2009/english?item=100&colour=1)
I've tried putting an asterisk into the To field or into the ? field (destination), without success.
#44
I did not take into account the query strings, only paths - it did not even execute the changes because of the check for query strings.
Anyway, this patch should work to do the above, but you may have to revert path_redirect.module back before applying this patch.
Then, you just set the following url redirect:
from=catalogue/current*
to=http://site.com/catalogue/2009/english*
?=
#=
Let me know if you still have problems.
#45
The attached patch has an important tweak to the regexp query. So please use this one instead of the previous two.
#46
Sorry to report, that (after applying #45), it still doesn't quite work.
The first issue I am noticing is that instead of (URLs are entered as that):
from=catalogue/current*
to=catalogue/2009/english*
the following is saved in Drupal (URLs are listed like that later):
from=catalogue/current*
to=catalogue/2009/english%2A
(but, don't know if it has anything to do with the issue below).
And, while URL: http://site.com/catalogue/current seems to work (redirected to http://site.com/catalogue/2009/english),
as soon as there should be replacements, I get "Page not found" ERROR instead; probably because URL:
http://site.com/catalogue/current?item=100
is converted into:
http://site.com/catalogue/2009/english%3Fitem%3D100
instead of:
http://site.com/catalogue/2009/english?item=100
#47
The query (item=100&...) was being passed to Drupal's goto function incorrectly. Sorry about that.
This patch should fix that.
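The symptom reported in #46 (`?` and `=` turning into `%3F` and `%3D`) is what happens when a destination containing a query string is encoded as if it were all path. The Python sketch below reproduces the effect and one way around it; it is an illustration with hypothetical function names, not the code from the patch.

```python
from urllib.parse import quote, urlsplit

def encode_as_path(destination):
    """Encodes the whole string as a path, so the query separator gets escaped too."""
    return quote(destination, safe="/")

def encode_split(destination):
    """Split off the query first, encode only the path, then reattach the query."""
    parts = urlsplit(destination)
    url = quote(parts.path, safe="/")
    if parts.query:
        url += "?" + parts.query
    return url

destination = "catalogue/2009/english?item=100"
print(encode_as_path(destination))
print(encode_split(destination))
```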
#48
Thanks, ronn. Works like a charm now.
#49
ronn,
maybe a couple of "cosmetic" issues should be fixed before considering your patch just great (at least from my point of view; I've really needed that functionality):
1. In the list of redirects, source "From" URL is listed OK (catalogue/current*) while "To" URL still contains %2A instead of asterisk (catalogue/2009/english%2A); not only that it doesn't look nice and is a bit of confusing, it should probably be fixed also as a basis for issue Nr. 2 (below)
2. Clicking to a "From" (as well as "To"...) URL in the list of redirects doesn't work well (as URL contains asterisk, and such page of course doesn't exist...) - probably it would be a good idea to simply cut away the asterisk from the URL if there is nothing after it (right to it)?
Everything is functional also as it is now, but that would make this solution really complete.
At the same moment, I suggest to include this patch (functionality) in the new development version (and, the next final version, of course).
#50
Thanks for the comment. I was thinking of updating the admin panel a bit too but didn't have time.
Anyway, here is the patch to tweak the admin panel the way you suggested; the patch also includes path_redirect_regex.3.patch changes in it as well.
#51
I'm glad to confirm that it seems to work perfectly now. Thanks again, ronn.
#52
Working for me so far, as well, on 6.14.
Testing on pre-prod site.
Great job!
#53
Love the patch from #50. Hoping this gets integrated into the module itself.
#54
Sorry, that query does not validate as SQL-99 and does not work on PostgreSQL.
http://developer.mimer.com/validator/index.htm
#55
Which query, Dave? I can't find any query in the patch from #50...
#56
@LUTi: There's not a direct query since it's actually performed in path_redirect_load(), but the query that is generated by the patch contains
(path like '%%*' AND '%s' REGEXP CONCAT('^', path)))
which is not cross-DB compatible.
#57
Subscribing
#58
I believe the specific issue is the 'REGEXP CONCAT'. Concat is not supported in Postgresql and pattern matching works differently. To keep this style there will need to be a check against the global $db_type to see if it's mysql/mysqli or postgresql and create the appropriate queries.
@Dave Reid - Do you know what the correct part of the query for postgresql would be?
#59
I think this needs something like:
switch ($GLOBALS['db_type']) {
  case 'pgsql':
    // Postgresql where clause goes here.
    break;
  default:
    $where[] = "(path = '%s' OR (path like '%%*' AND '%s' REGEXP CONCAT('^',path)))";
    break;
}
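As a starting point for the PostgreSQL branch, one untested possibility is PostgreSQL's `~` regular-expression operator together with `||` for string concatenation. The Python sketch below only assembles the two clause strings side by side; the `wildcard_where` helper is hypothetical, and the PostgreSQL clause is an assumption that would need checking against a real database. In particular, `~` is case-sensitive, whereas MySQL's REGEXP usually is not.

```python
def wildcard_where(db_type):
    """Return the WHERE fragment for the given database type (placeholders left unfilled)."""
    exact = "path = '%s'"
    if db_type == "pgsql":
        # Assumption: POSIX regex match (~) and || concatenation on PostgreSQL.
        regex = "'%s' ~ ('^' || path)"
    else:
        # The MySQL form used by the patch under discussion.
        regex = "'%s' REGEXP CONCAT('^', path)"
    return "(" + exact + " OR (path LIKE '%%*' AND " + regex + "))"

print(wildcard_where("mysql"))
print(wildcard_where("pgsql"))
```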
#60
Subscribe - useful feature