If a node has two aliases in the url_alias table, Global Redirect redirects the system path (node/25) to the first alias, but doesn't redirect the second alias to the first. Therefore two URLs still provide access to the same node: the two aliases in the url_alias table. This is not ideal for SEO, which is the problem Global Redirect exists to solve. This feature is required to fulfill 301 redirect needs for many large sites where one URL per content item is important and multiple aliases per node are required.
Ideally it would be possible to choose which alias is the primary URL for nodes / system paths. This could involve another settings/pathauto option:
How is the primary alias selected?
- most recently added alias
- oldest alias
- [others?]
[checkbox] can an author manually select the primary URL (and over-ride the auto-selection above)?
(maybe this one should be a permission?)
Comments
Comment #1
nicholasthompson
Interesting idea - I really like it...
A node could have a primary alias and "n" secondary ones, which could be an alternative way of redirecting old URLs to new content (eg, when migrating sites into Drupal from old CMSs or static sites).
This should be fairly easy - my only concern would be overhead. Currently Global Redirect causes up to 3 extra hits per page load. Hits for multiple URLs could/will cause even more. At least these hits are small though...
Thanks for the suggestion! I'll try to include it in a release soon. Although this is tagged as 5.x - I'll get this done for 4.7 and 5.x.
Comment #2
Bevan commented
Glad you like it! A simple early implementation could exclude the UI / permissions part and focus on handling multiple aliases by always redirecting to the most recently added one.
Why does it cause THREE extra hits? There should only be one extra: on the first hit to the system path, or a secondary URL, Drupal returns a 301 permanent redirect to the primary alias, and then comes the real hit to the primary alias.
Comment #3
nicholasthompson
Sorry - I don't think I explained that entirely clearly. Currently Global Redirect causes up to 3 "extra" hits per page load, in the sense that there would be no extra hits if the module was disabled...
Hit 1 (COMPULSORY): Check the current request against the system paths for an alias.
Hit 2 (OPTIONAL): Check if the current request ends in a slash - if so, check system paths for an alias against the current request WITHOUT its slash.
Hit 3 (IF NO MATCH ON 1 OR 2): Check if the current request matches the system's frontpage path - if so, redirect to the frontpage... This is currently causing issues with logins, as reported in another issue.
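A minimal sketch of the three checks listed above, written in Python rather than Drupal's PHP. The function name `resolve_redirect`, the `aliases` dictionary (a stand-in for the alias table, mapping system path to alias) and the `frontpage` argument are all assumptions made for illustration; this is not the module's actual code.

```python
def resolve_redirect(request_path, aliases, frontpage):
    """Return the path to redirect to, or None if no redirect applies.

    aliases:   hypothetical dict of system path -> alias (stands in for
               the alias table; each lookup below is one "hit").
    frontpage: the system path configured as the site's front page.
    """
    # Hit 1 (always): is the request a system path that has an alias?
    if request_path in aliases:
        return aliases[request_path]

    # Hit 2 (only when the request ends in a slash): retry without it.
    if request_path.endswith("/"):
        trimmed = request_path.rstrip("/")
        if trimmed in aliases:
            return aliases[trimmed]

    # Hit 3 (only when 1 and 2 found nothing): the front page's system
    # path redirects to the site root, represented here as "".
    if request_path == frontpage:
        return ""

    return None


aliases = {"node/25": "about-us"}
print(resolve_redirect("node/25/", aliases, frontpage="node/1"))
```

A request that is already an alias, such as `about-us`, falls through all three checks and returns `None`, so each page load pays for the lookups even when no redirect happens.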
If this feature request were implemented, there would be further hits... My current idea would be to scan all destination URLs for the current request, then - upon match - pull up the system path for that alias. Then look up all aliases for THAT system path, ordering by alias ID so the first one is used. This would be an initial release.
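The two-step lookup described above could look roughly like this. It is a sketch only: it uses Python's sqlite3 in place of Drupal's database layer, and it assumes the alias table has the columns pid (alias ID), src (system path) and dst (alias). The function name `primary_alias_for` is hypothetical.

```python
import sqlite3


def primary_alias_for(conn, request_path):
    """Return the alias the request should be redirected to, or None."""
    # Step 1: find the system path that the requested alias points at.
    row = conn.execute(
        "SELECT src FROM url_alias WHERE dst = ?", (request_path,)
    ).fetchone()
    if row is None:
        return None  # the request is not a known alias

    # Step 2: take the alias of that system path with the lowest alias ID.
    first = conn.execute(
        "SELECT dst FROM url_alias WHERE src = ? ORDER BY pid ASC LIMIT 1",
        (row[0],),
    ).fetchone()[0]

    # No redirect is needed when the request already is the primary alias.
    return None if first == request_path else first


# Assumed schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE url_alias (pid INTEGER PRIMARY KEY, src TEXT, dst TEXT)"
)
conn.executemany(
    "INSERT INTO url_alias (src, dst) VALUES (?, ?)",
    [("node/25", "about-us"), ("node/25", "company/about")],
)
print(primary_alias_for(conn, "company/about"))
```

This costs two queries per request to an alias, which is the extra overhead the comment is worried about.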
I think the best way to implement this feature would be to form_alter the node's fieldset for path aliases and have a set of radio buttons below the textfield to show extra URLs. If there is only 1 alias then the radio button is disabled, as there will be no need for a primary. If there is more than 1, then you can select which is primary using a radio button.
This would then go into a table generated by Global Redirect. This table would simply be a 1 column table with Node ID as the field.
The section of the init() function of this module which will be dealing with secondary aliases can do something like a left join with this new table onto a query similar to the alias lookup query and then do an ORDER BY so that the row which exists as primary goes to the top (not sure if NULL goes to bottom with ASC or DESC - trial and error will tell). You'd also need to do a secondary order on alias ID so that if no primary is set, it reverts to its default rule of lowest alias ID becoming primary.
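The left join and two-level ordering could be sketched as follows, using Python's sqlite3 as a stand-in for Drupal's database. Two things here are assumptions, not part of the proposal: the alias table is taken to have columns pid, src and dst, and the new table (hypothetically named `globalredirect_primary`) stores the alias ID of the chosen primary rather than the node ID, because the join needs a per-alias key. Ordering on `p.pid IS NULL` avoids having to know where the database puts NULLs.

```python
import sqlite3

# Chosen primary first (IS NULL evaluates to 0 for it), then lowest alias ID.
QUERY = """
SELECT a.dst
FROM url_alias a
LEFT JOIN globalredirect_primary p ON p.pid = a.pid
WHERE a.src = ?
ORDER BY (p.pid IS NULL) ASC, a.pid ASC
LIMIT 1
"""


def primary_alias(conn, src):
    """Return the primary alias for a system path, or None if it has none."""
    row = conn.execute(QUERY, (src,)).fetchone()
    return row[0] if row else None


# Assumed schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE url_alias (pid INTEGER PRIMARY KEY, src TEXT, dst TEXT);
CREATE TABLE globalredirect_primary (pid INTEGER PRIMARY KEY);
INSERT INTO url_alias (pid, src, dst) VALUES
  (1, 'node/25', 'about-us'),
  (2, 'node/25', 'company/about'),
  (3, 'node/30', 'contact'),
  (4, 'node/30', 'get-in-touch');
INSERT INTO globalredirect_primary (pid) VALUES (2);
""")
print(primary_alias(conn, "node/25"), primary_alias(conn, "node/30"))
```

Here node/25 has an explicit primary (alias ID 2), while node/30 has none and falls back to its lowest alias ID.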
This is just a quick mental sketch being "dumped" here - thoughts would be greatly appreciated.
I hope this makes sense!
Comment #4
RayZ commented
If I understand what you want to accomplish, it sounds like Path Redirect may provide a manually administered workaround.
Comment #5
nicholasthompson
That looks interesting - although it might be more convenient if all that was node-based rather than administered separately.
Plus this feature would help you control what happens if Pathauto decides to make multiple URLs for the same node (easily possible if someone creates a term, it gets indexed by Google for 2 months and then they rename the term - you don't want to lose the page that has been indexed for 2 months).
Comment #6
Bevan commented
Ah! Now I see where we misunderstood. I thought by 'hit' you meant a client request to the server.
I don't have much programming experience, but what you said sounds like it would work. My only concern is that you'd use nid on the globalredirect table. Remember that user profiles, terms, views, and other non-node objects can also have aliases, and multiple aliases. In fact pathauto can quite easily generate multiple aliases for terms if you change the term name, for example. Although nodes are the primary concern here, other pages are also important and get indexed.
RayZ: Path Redirect would do the trick with a lot of manual work (for large sites), however an automated solution is necessary for large sites that run pathauto. In our case, our content writers don't understand URL management enough to do all that manually, and I don't have time for laborious manual tasks like that.
Comment #7
nicholasthompson
Good point - nid is not the right identifier. It will have to be the system or source path... But this involves duplicating rows out of the alias table... Ideally I don't want to be modifying the url_alias table.
Comment #8
hass commented
Take "path_redirect" module...
Aside - if you have different url aliases pointing to the same node - you have "duplicate content". Don't do this!!!
Comment #9
nicholasthompson
Correct - you do, but what if you could set a primary alias for a node and all the others redirected to it?
How useful would it be if you were implementing a new site based on an old one and you wanted to preserve URLs but redirect them to new, neater, nicer ones? By adding the old and the new URL and setting the new one as the primary, not only would node/123 be redirected to the right place, but so would the old URL.
I will look into the path_redirect module - however I think I've seen it before and it's kind of a halfway-house between what we're talking about and nothing at all. I think it just allows you to specify a URL to redirect to another URL. I'm not sure it gives you control on a per-node basis... But I'll look into it.
Comment #10
Tobias Maier commented
I know, setting an alias as the default alias is a missing feature in Drupal.
But I don't think global redirect is the place to fix this.
Global redirect should be easy and do only its job: prevent duplicate content for SEO.
If you move over from an old site to a new site, path redirect is your module. I'm using it happily on all of my sites.
So please don't "over enhance" your module; keep it simple!
If you asked me, I would set the status to won't fix.
Think about one thing: the more queries you have on every page request, and the more complex these queries get, the more time it will take to generate a page.
A true solution would fix it at the root: url(), or better drupal_get_path_alias(), is where a patch should start.
Comment #11
hass commented
Yes, Tobi, that's the case. Keep it simple and focused on what it is made for, with as few SQL requests as possible.
An interim redirection task from old URL aliases to new ones can really be solved with path_redirect. I'm using path_redirect and it works well for such URL transitions.
Comment #12
Bevan commented
I disagree that Global Redirect is the wrong module for this feature. Global redirect's 'reason' and purpose, as taken from http://drupal.org/project/globalredirect, is:
This feature request fits perfectly in with this purpose, and in fact completes it. Globalredirect currently does not achieve the goal of one-URL-per-node for large sites requiring multiple redirects or aliases per node.
In the end it's up to the module owner if it gets included in this module or not.
Other rebuttals:
pathauto (with certain configurations) does this already (automatically), as do certain business models (manually) for websites with specific marketing and SEO needs.
pathredirect does that well for site 'imports' etc. -- but that's not the purpose of this feature. This feature would, in addition to its main purpose, provide an automated way to do this, possibly a faster and easier way.
Good point. What's a better way? This feature need not be 'on' by default.
I'm not a programmer or sysadmin, so maybe this is a stupid idea: what about adding lines to .htaccess when/as aliases are created/edited? This would offload the work to Apache, which will handle it much better than Drupal, and not require ANY extra SQL queries. Permissions to write to .htaccess, and security, could be major issues with this method. Could they be overcome?
Duplicating rows is not ideal -- neither is changing core tables. You could look at it as row-duplication, or you could view it as a relationship in which path/to/a/page is the unique ID; after all, the path is just part of a URI (URL), and a URI is an ID: Uniform Resource IDentifier. Additionally, all aliases in the url_alias table should be unique. Alternatively, the .htaccess method above would mean that no database changes are required -- although they could be useful to restore broken .htaccess files.
Possible solutions I can see at this point:
Comment #13
Bevan commented
I just noticed that table url_alias has an ID column: pid -- it has auto_increment and is the primary index. Therefore the issue of repeating rows, or 'pretending' the path is an ID, is not an issue. A relationship table, specific to globalredirect, that doesn't repeat rows or change core tables could be as simple as two columns:
Such a small table would probably make more sense as one extra column on the url_alias table, but that would best be done in the path module, which would have to wait for a major release version of that module.
I think I'm really getting out of my depth here. Probably someone with more Drupal and programming experience needs to give me some feedback and tell me why I should learn how to do this properly...
Comment #14
Tobias Maier commented
I don't have much time right now, but here comes my suggestion for you, hass:
write a _separate_ module, which runs by cron.
This module looks at the url_alias table and moves the duplicate ones with the lowest pid (=oldest entries) to the path_redirect module.
But please let the admins define exceptions, which should not be moved and stay as the default path.
Then talk with the pathauto maintainer to include an option which moves old paths directly to the path_redirect.module.
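The selection step of such a cron job could be sketched like this, using Python's sqlite3 as a stand-in for Drupal's database. The column names (pid, src, dst), the function name `stale_aliases` and the `keep` argument for admin-defined exceptions are assumptions for illustration. The sketch only identifies which aliases would be moved; it does not write to path_redirect, since that module's storage is not described in this thread.

```python
import sqlite3


def stale_aliases(conn, keep=()):
    """Return (old_alias, newest_alias) pairs for duplicated system paths.

    keep: aliases the admin has excluded; they are left untouched.
    """
    rows = conn.execute(
        """
        SELECT a.dst, newest.dst
        FROM url_alias a
        JOIN url_alias newest
          ON newest.src = a.src
         AND newest.pid = (SELECT MAX(m.pid) FROM url_alias m
                           WHERE m.src = a.src)
        WHERE a.pid <> newest.pid
        ORDER BY a.pid
        """
    ).fetchall()
    return [(old, new) for old, new in rows if old not in keep]


# Assumed schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE url_alias (pid INTEGER PRIMARY KEY, src TEXT, dst TEXT);
INSERT INTO url_alias (pid, src, dst) VALUES
  (1, 'node/25', 'about-us'),
  (2, 'node/25', 'company/about'),
  (3, 'node/30', 'contact');
""")
print(stale_aliases(conn))
```

Each returned pair is an older alias together with the newest alias of the same system path, which is the information a redirect entry would need.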
Comment #15
Bevan commented
That sounds like a very tidy solution indeed. There would be full utilization of existing code that way. I need to check out path redirect to see how it works.
Can someone tell me why writing to .htaccess would or wouldn't work? I understand it's duplicating the purpose of path_redirect. But the server-load advantages might justify it. Does anyone agree or disagree?
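To make the .htaccess idea concrete, here is a rough sketch of generating the lines, using Apache mod_alias's `Redirect` directive. The function name `htaccess_redirects`, its `site` argument and the example host are assumptions for illustration; it also assumes aliases contain no whitespace, and it leaves aside the file-permission and security questions raised earlier in the thread.

```python
def htaccess_redirects(pairs, site):
    """Build 'Redirect 301' lines from (old_alias, primary_alias) pairs.

    site: scheme and host of the site, without a trailing slash.
    Assumes aliases contain no whitespace (they would need quoting).
    """
    lines = []
    for old, new in pairs:
        # Redirect takes a URL-path to match and a target URL.
        lines.append(f"Redirect 301 /{old} {site}/{new}")
    return "\n".join(lines)


print(htaccess_redirects([("about-us", "company/about")], "http://example.com"))
```

Note that `Redirect` matches by path prefix, so a rule for /about-us would also catch requests below it, such as /about-us/team.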
Comment #16
Bevan commented
I've started a new topic as this one is changing direction and form: http://drupal.org/node/118575
Comment #17
nicholasthompson
As much as using .htaccess seems like a more efficient way of doing things (ie, Drupal is slow), bear in mind that the .htaccess will be used (AFAIK) for EVERY hit, including images and static files - even CSS. That's quite an overhead. It'd be fine if you have, say, fewer than 100 aliases... But for a large site (eg www.sportbusiness.com) with something short of 27,000 aliases - it's not really feasible.
Comment #18
Bevan commented
chx is working on a feature for Drupal 6 that will do this. http://drupal.org/node/106094
There is no point in developing this unless it is for, and only for, Drupal 5. Any objections to closing it?
Comment #19
grendzy commented
It looks like this never made it into Drupal 6. I think this would be an awesome feature, especially for those sites that use pathauto's setting for "Create a new alias. Leave the existing alias functioning."
Edit: never mind, it looks like pathauto already does this when path_redirect is enabled.