I've seen a few discussions in the forum section, and one or two sandbox projects, that attempt to work around this, but I wonder if the solution belongs in this module to avoid duplicating content.
Drupal 7 introduced the concept of a URL for a comment, that takes the form /comment/[comment-id]#[comment-id]. This allows you to create views that link to individual comments, where you get taken to the anchor at the start of that comment. The full node is displayed - it just gives a way to navigate to the particular comment when the page loads.
This was necessary to overcome a difficulty with Drupal 6. If comments ran to more than one page, a comment could be on page 2 of the comments. The correct URL for the comment was then /node/[nid]?page=2#[comment-id]. But instead the navigation went to /node/[nid]#[comment-id]. The result was that the comment someone was looking for wasn't on the webpage that loaded, and consequently neither was its anchor.
So Drupal 7's solution to this is to be welcomed. However, it creates another scenario where search-engine sandboxing could occur. If a node has 10 comments, each page with the URL /comment/[comment-id] loads identical content to the node itself. We thus get 11 different URLs, all with identical content.
So there needs to be a way to rewrite /comment/[comment-id]#[comment-id] as /node/[nid](?page=[pagenum])?#[comment-id]
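The target of that rewrite can be sketched as a small URL-building function. This is Python used purely to illustrate the logic, not Drupal code. The function name `comment_target_url`, the zero-based `page` argument, and the `comment-<cid>` fragment format are assumptions made for the example.

```python
def comment_target_url(nid, cid, page=0):
    """Build the node URL that a /comment/<cid> request should resolve to.

    Assumption: `page` is a zero-based pager index, so the first page
    needs no query string at all.
    """
    url = f"/node/{nid}"
    if page > 0:
        # Only later pages carry the pager query parameter.
        url += f"?page={page}"
    # The fragment jumps to the comment's anchor once the page has loaded.
    return f"{url}#comment-{cid}"
```

For a comment on the first page this gives a plain node URL plus fragment; for later pages the `?page=` part is added before the fragment.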
I'm guessing that the page numbering bit is the trickiest detail of that to implement. That aside, Drupal is already inserting a rel="canonical" link tag into the head section of those comment URLs, and the given canonical URL is the correct one. If that is already being done, this should be a solvable problem.
| Comment | File | Size | Author |
|---|---|---|---|
| #5 | Rewrite_Comments-1680978-5.patch | 2.49 KB | jamesoakley |
Comments
Comment #1
nicholasthompson
I *REALLY* want to get this fix into GR for personal reasons; the comment module has completely arsed up the SEO on my own blog (http://www.thingy-ma-jig.co.uk) - Google has indexed all the comment pages instead of the Node IDs. Damned annoying! ;)
In *theory* it should be relatively trivial to calculate... Especially as a "one off" hit. You'd just need to figure out how many comments in the list and where, in that list, the comment falls. Threaded comments might make it interesting.
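For the flat (unthreaded) case, that calculation can be sketched as below. It is an illustration in Python, not the code Drupal uses; `comment_page`, `ordered_cids`, and `per_page` are hypothetical names, and the zero-based page index is an assumption. Threaded comments would need the list supplied in their actual display order, which this sketch does not work out.

```python
def comment_page(ordered_cids, cid, per_page):
    """Return the zero-based page that holds `cid`, or None if it is absent.

    Assumption: `ordered_cids` lists the node's comment ids in the order
    they are displayed (flat, unthreaded).
    """
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    try:
        # Where in the displayed list does this comment fall?
        position = ordered_cids.index(cid)
    except ValueError:
        return None
    # Integer division maps a list position onto a page index.
    return position // per_page
```

With five comments shown two per page, the fourth comment in the list sits at position 3 and therefore on page index 1.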
In my personal case, I don't paginate comments anyway, so it's less of a problem.
Comment #2
jamesoakley
Basically, I'm seeing people starting to land on my site at the comment pages. That's flagging up the problem for me. It's only a matter of time before my own SEO takes a big hit.
I've just started playing around with http://drupal.org/sandbox/Ayesh/1578662.
It doesn't rewrite the URLs, but it uses drupal_goto to send any visitor who lands on /comment/[cid] to the right page. And - it seems to get the comment pagination right by calling comment_get_display_page().
IIRC, drupal_goto can send a 301 if that response code is passed explicitly (its default in Drupal 7 is 302), so that should mitigate the SEO hit almost entirely. Including it in the URL rewriting would be even better, but even this would be better than nothing.
How would you feel about incorporating the few lines of code in that sandbox module (with due credit to Ayesh)?
I'd happily have a go at writing a patch for G-R for it.
Comment #3
nicholasthompson
If you could do a patch, that would be *awesome*. I would do it, but some bright spark thought 24 hours in a single day was sufficient. Clearly they weren't employed, didn't have a commute and didn't have a 4 month old baby. ;-) hehe.
Thanks :)
Comment #4
jamesoakley
Well, ours has reached 12 months now, so I'll give it a go ;-)
Comment #5
jamesoakley
See attached.
Two things that I think could be better, but I don't know (quickly - which is all the time I've got right now) how to solve them.
What d'ya think?
Comment #6
jamesoakley
One more thought - do we need to check that what follows the "comment/" line is only numeric, to prevent potential injection flaws?
Comment #7
nicholasthompson
We could just do a menu_rebuild, rather than an entire cache clear (e.g., we don't need to rebuild the theme registry ;) heh).
Rewriting the URLs would be nice (maybe hook_url_outbound_alter?) - however the "scope" of GlobalRedirect is to maintain site structure via URL redirects, not to rewrite the content on the site. I feel that when features like that creep in, the weight of the module starts to increase. Bearing in mind this is something that gets included on every page and runs all the time, it needs to be lightweight. (IMHO)
An alternative to altering the menu would be to check the path in the main redirect function (e.g., if the request path matches comment/([0-9]+)).
Comment #8
nicholasthompson
In theory, no check should be needed. The parameter gets passed into your function which then tries to load it. Non-numeric would fail to load a comment and therefore nothing would happen.
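The path check discussed in comments #6 through #8 can be sketched as follows. This is an illustrative Python sketch, not code from Global Redirect; `COMMENT_PATH` and `comment_id_from_path` are hypothetical names. It assumes the whole request path must match, so that paths with extra segments after the id are left alone.

```python
import re

# Assumption: only a bare "comment/<digits>" path should be redirected.
COMMENT_PATH = re.compile(r"comment/([0-9]+)")


def comment_id_from_path(path):
    """Return the comment id as an int, or None if the path is not a match."""
    # fullmatch rejects anything before or after the pattern, which also
    # keeps non-numeric input from ever reaching a comment lookup.
    match = COMMENT_PATH.fullmatch(path)
    return int(match.group(1)) if match else None
```

Either approach ends in the same place for bad input: a non-numeric id yields no comment, so no redirect is issued.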
Comment #9
jamesoakley
Agreed, in principle. Using admin_menu, it certainly seems that the menu cache is the specific cache to flush in order to make the change effective.
For some reason, though, putting a call to menu_rebuild at the end of globalredirect_settings_submit_save doesn't have this result. If the comment-url rewriting feature is turned off at the settings page, then comment URLs stop being rewritten straight away. However when it's turned on, it needs a subsequent call to rebuild the menu. Perhaps I'm calling it too early, but I can't see where else I can hook a menu rebuild. (menu_rebuild returns true in either case)
I agree with you, there.
And agree there too. The drupal_not_found() call should take care of all such cases.
Comment #10
ayesh commented
I too think Global Redirect would be the better place for this functionality.
Most Global Redirect users are using it for SEO, so they will most likely need this functionality as well.
Although canonical URL is set to the node path, most of us have problems with these comment paths I guess.
However, unlike node.module, the comment module is optional. So maybe add a check (to make sure the module is enabled) before adding this option to the admin form?
hook_menu_alter will not insert any paths, so we will not need a module_exists() check before altering the callback. I'm not sure about this though.
Comment #14
jamesoakley
Something's not right with this.
Periodically, the setting seems to be being turned off. The check-box on Global Redirect's settings page remains checked, but the /comment/cid pages stop being redirected as they should be. Using drush cc all doesn't fix this, but using admin_menu to flush all caches in a web browser instantly fixes it and pages redirect again as they should do.
Putting a few calls to watchdog shows that the hook_menu_alter function isn't being called. That's getting too deep into the core hook for me to understand exactly what's failing. Why would a module register a new hook function like this, and then it suddenly stops being invoked?
Can @ayesh or @nicholasThompson shed any light on this one?
Comment #16
lolandese commented
Found this functionality to be present in the current dev. Seems to work well on a clean install.
If any issues arise, re-open and change issue category to 'task' or, better yet, open a new issue.
Comment #17
jamesoakley
Just for the sake of a complete record, this went in at http://drupalcode.org/project/globalredirect.git/commit/c500ede
It's flagged as an experimental implementation, so I guess review and feedback is still welcomed to help the maintainers work out whether this is the correct way to implement this functionality.
Comment #18
ayesh commented
That's really great! I added a note to that sandbox module to look for Global Redirect module.
Cheers!
Comment #19
RobertOak commented
The Comment redirect solves a lot of problems and so far works great! Thank you! What a navigation mess before this was available!
I have an additional request: allowing tokens to be added to the path before the comment fragment.
i.e. [node aliased path URL]#comment-xxxx
an additional possibility is:
[node aliased path URL][token-tab]#comment-xxxx
with that option available in the global redirect administration form and let's just for sanity, keep it to one token per content type. Maybe token isn't the right word, field group or maybe just straight up appending of text.
i.e.
[node aliased path URL]/text/#comment-xxxx
I can hack the code to do this, but it's unclear where I would add this in the $options array. Plus, well, I just don't want to hack up another module that I know will be updated, so that I'll then have to patch it again. It also actually works and is tested, so why should I hack it in the first place? ;)
This is for putting node associated comments in a separate tab. It's a hacked up solution to the "talk" module (which doesn't really work too well) for comments displayed on a separate vertical tab associated with the node.
i.e. http://www.example.com/content/this-is-my-content-aliased-path#comment-xxxx
would then become:
http://www.example.com/content/this-is-my-content-aliased-path/talk/#com...
using the talk module (which will be hacked up and made custom) for what I'm wanting to do.
Anybody have a better way to put comments on a separate vertical tab, plus have global redirect navigate to the associated node plus fragment comment, which allows the user to end up in the right spot on the page when clicking on comment links, please suggest.
I don't want to add path aliasing and everything else to a derived custom module from talk, gets redundant.
Thanks again for this great module.
Comment #20
jamesoakley
It looks like this got lost when Global Redirect merged into Redirect. #3495377: /comment/ID does not redirect to the canonical URL