For a custom content type I have the location of comment submission set to 'Display on a separate page'. Google is indexing the comment/reply pages even though I have 'Disallow: /comment/reply' in robots.txt. According to respondents on the Google Webmaster forum, this is because of the reply link on the content-type page (http://www.google.com/support/forum/p/Webmasters/thread?tid=19a66f0cf4f9...).
How can I stop Google from indexing these reply pages?
From my understanding so far there seem to be two possible routes:
1) Include a robots meta tag (name=robots, content=noindex) in the head of each reply page. This is my preferred route, but how could I do this (is there a .tpl.php file I should add some code to)?
2) Would changing the location of comment submission to 'Display below post or comments' stop the indexing by removing the reply pages entirely? I would prefer not to do this for aesthetic reasons but will do so if needed.
I have searched for a solution and found nothing - so I suspect I might be looking at this completely wrong. Any assistance appreciated.
Phil
Comments
I found an answer that might help someone. In a nutshell: I called the drupal_set_html_head() function in box.tpl.php.
My problem was that my comment/reply pages were being indexed by Google even though I had "Disallow: /comment/reply" in robots.txt. Google was indexing the reply page because of the reply link on the content page. The solution was to put the noindex meta tag in the reply pages.
I copied box.tpl.php into my theme directory (box.tpl.php seems to be used only for comments and search; see http://drupal.org/node/11814).
I added the following
Now all my comment + reply submission pages have the noindex meta tag!
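The snippet itself is not reproduced above. As a rough sketch of the idea only, assuming Drupal 6 and a theme named "mytheme" (a placeholder), the same drupal_set_html_head() call could be made from a preprocess function in template.php rather than inline in box.tpl.php. The helper mytheme_is_reply_path() and the exact tag content are illustrative assumptions, not the original code:

```php
<?php
/**
 * Hypothetical helper: TRUE if an internal Drupal path is a comment reply page.
 */
function mytheme_is_reply_path($path) {
  return preg_match('#^comment/reply(/|$)#', ltrim((string) $path, '/')) === 1;
}

/**
 * Runs just before box.tpl.php is rendered (Drupal 6 preprocess naming).
 * "mytheme" is a placeholder for the actual theme name.
 */
function mytheme_preprocess_box(&$variables) {
  // In Drupal 6 the internal path of the current page is held in $_GET['q'].
  $path = isset($_GET['q']) ? $_GET['q'] : '';
  if (mytheme_is_reply_path($path)) {
    // drupal_set_html_head() queues markup for the page's <head>.
    drupal_set_html_head('<meta name="robots" content="noindex" />');
  }
}
```

The theme registry would need to be rebuilt (for example by clearing the caches) before Drupal picks up a new preprocess function.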
I think you need to remove "Disallow: /comment/reply" from your robots.txt so that the robot is allowed to crawl the page and see that it should not be indexed.
Have same problem.
However, I tried this in 6.19, but the meta tag never got written to the page when I visited a URL in the http://www.site.com/comment/reply/123456 format.
subscribing. Haven't got time to pursue this right now but it certainly looks helpful and I'm curious to follow the thread. Thanks,
subscribe
I'm not totally sure, but allowing lots of comment/reply pages to be crawled will be performance intensive.
If the above is true, then I wouldn't mind the links being indexed for the sake of better performance. I'm just worried that Google may see them as duplicate content?
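...Here's my attempt in template.php. It sets rel attributes on the comment links rather than adding meta tags:
function phptemplate_links($links, $attributes = array('class' => 'links')) {
  // rel="nofollow" is the link-level hint; "noindex" only has meaning in a robots meta tag.
  foreach (array('comment_add', 'comment_comments') as $key) {
    if (!empty($links[$key])) {
      $links[$key]['attributes']['rel'] = 'nofollow';
    }
  }
  // Hand the modified links back to the default implementation for rendering.
  return theme_links($links, $attributes);
}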
I am using D7. Will this code work on D7?
Over a year late but ...
I'm using Drupal 7 (and I know next to nothing about template files or PHP) but after way too long I came up with this code:
I placed it in the head section of html.tpl.php, in a copy of the template stored in my subtheme's templates subdirectory.
I read somewhere that $_GET wasn't going to be supported by Drupal 8, so I used the current_path() function instead.
I've tested this, and am under the impression that all is working well. If anyone reading this sees some obvious reason why this isn't a good solution, I would hope you would post and give me a heads up.
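The code itself is not shown in this post. A minimal sketch of the same idea, assuming Drupal 7 and a subtheme named "mysubtheme" (a placeholder), is given here as a preprocess function in template.php instead of inline PHP in html.tpl.php. The helper name and the key passed to drupal_add_html_head() are made up for illustration:

```php
<?php
/**
 * Hypothetical helper: TRUE if an internal Drupal path is a comment reply page.
 */
function mysubtheme_is_reply_path($path) {
  return preg_match('#^comment/reply(/|$)#', ltrim((string) $path, '/')) === 1;
}

/**
 * Preprocess function for html.tpl.php in a theme named "mysubtheme" (placeholder).
 */
function mysubtheme_preprocess_html(&$variables) {
  // current_path() returns the internal path of the current page in Drupal 7.
  if (mysubtheme_is_reply_path(current_path())) {
    $element = array(
      '#type' => 'html_tag',
      '#tag' => 'meta',
      '#attributes' => array('name' => 'robots', 'content' => 'noindex'),
    );
    // The second argument is an arbitrary unique key for this head element.
    drupal_add_html_head($element, 'mysubtheme_noindex_reply');
  }
}
```

Keeping the path check in a small function makes it easy to try against sample paths such as comment/reply/123 without loading a page.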
I also don't understand why this isn't a big issue to every Drupal user.
I just did a Google search.
site:drupal.org/comment/reply
There were 327 pages in the Google index. How is that good for SEO?
site:drupalgard.....com/comment/reply shows 307,000 indexed pages.
site:acqu....com/comment/reply shows 2,130
How is it bad for SEO?
Contact me to contract me for D7 -> D10/11 migrations.
Duplicate content. The snippet on the reply page is the same one shown on the page that displays the whole comment thread.
Thin content. A comment is usually a paragraph or two at most (less than 100 words).
The result is tens, hundreds, or even thousands of pages in the Google index that are low quality (per the two factors above). That's what the Panda update was all about.
While I don't know for certain, from my experience with my site, this single Drupal.org page could be generating 8 comment/reply pages that get indexed by Google, since there are 8 replies to the initial comment.
Don't search engines count comments as a good thing?
How do they do that?
It makes sense.
I think, however, that if the comment page is disallowed in robots.txt and Google is still indexing it, then it's a problem within Google. Maybe contact them? You could also remove these pages using Webmaster Tools.