Closed (outdated)
Project:
Global Redirect
Version:
6.x-1.x-dev
Component:
Code
Priority:
Normal
Category:
Feature request
Assigned:
Unassigned
Reporter:
Created:
28 Feb 2009 at 19:22 UTC
Updated:
25 Sep 2020 at 07:21 UTC
Comments
Comment #1
nicholasthompson
Technically this is possible.
The problem is "how do you define a bad entry in the query string". Maybe a module on the page requires it? It could be anything...
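One way to make "a bad entry in the query string" concrete is an allowlist: each module declares which parameters it expects on which paths, and anything else counts as unknown. The sketch below is only an illustration in Python, not code from the Global Redirect module; the `ALLOWED_PARAMS` registry, its paths, and the `unknown_params` helper are all hypothetical.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical registry: path prefix -> query parameters that some module
# expects on pages under that prefix. A real site would have to build this
# from whatever its modules actually use.
ALLOWED_PARAMS = {
    "/search": {"keys", "page"},
    "/admin": {"destination"},
}


def unknown_params(url, registry=ALLOWED_PARAMS):
    """Return the query parameter names in `url` that no matching prefix allows."""
    parts = urlsplit(url)
    allowed = set()
    for prefix, names in registry.items():
        # Match the prefix itself or any path below it.
        if parts.path == prefix or parts.path.startswith(prefix + "/"):
            allowed |= names
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    return [name for name, _ in pairs if name not in allowed]
```

The weak point is the one raised above: the registry is only as good as each module's declaration, and a parameter that a module needs but never registered would be wrongly treated as bad.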
Comment #2
giorgio79 commented
Good idea!
I just noticed that Google Webmaster Tools reported one of my simple node pages for duplicate title tags. When I checked, the reported URLs looked like this:
mysite.com/mynode
mysite.com/mynode?page=1
mysite.com/mynode?page=2
mysite.com/mynode?page=1205
Weird. I have no clue how Google picked those up, as those page variables do not exist. The page is simply "mynode".
Comment #3
avpaderno
Google Webmaster Tools will always report those pages as duplicates, whether or not the passed query string is actually used.
The only solution to that problem is to add a meta tag to those pages.
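The comment does not say which tag is meant. One plausible reading, and it is only an assumption, is the `rel="canonical"` element (strictly a `link` element rather than a `meta` tag), which tells search engines which URL is the preferred one for a page. A minimal Python sketch of building such a tag follows; the `canonical_link_tag` helper is hypothetical and is not part of Drupal or Global Redirect.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(url):
    """Build a rel="canonical" tag pointing at `url` without query or fragment."""
    parts = urlsplit(url)
    # Assumption: for a plain node page, the canonical URL has no query string.
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return '<link rel="canonical" href="%s" />' % escape(clean, quote=True)


print(canonical_link_tag("http://mysite.com/mynode?page=1205"))
```

Dropping the whole query string is only safe on pages where no parameter changes the content; a paged listing would need to keep its `page` value.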
Comment #4
giorgio79 commented
Thanks Kiam, I think I understand, but the problem is that those ?page=xxx URLs don't exist!
I have no idea how Google picked those up, as they all show the same page!
mysite.com/mynode?page=1
mysite.com/mynode?page=2
mysite.com/mynode?page=1205
are all equivalent to mysite.com/mynode
This is not a views page with paging, it is a simple node page :)
Comment #5
avpaderno
That is really odd. Google should pick up links used in Drupal nodes, not attach random strings to the URLs.
Comment #6
hd commented
I see the same in the web server log files. Google is crawling pages with ?page=xxx appended out of the blue, and so could in theory crawl the same pages over and over indefinitely. What a bottomless mess! I wonder how this brain-dead Googlebot is or was picking these up.
For example, this very page can be called with any nonsensical query string, such as http://drupal.org/node/386928?page=123, and Drupal silently ignores it. This can lead to significant overhead and wasted bandwidth.
The issue is also discussed at http://drupal.org/node/309804
I was hoping that this module could do something about it, but I understand it is a much wider problem and not at all limited to Drupal. One can add ?page=123 to the URLs of perhaps most websites without any consequence at all.
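As a rough illustration of what "doing something about it" could look like, the Python sketch below decides whether a request should be redirected to a URL with the unexpected parameters removed. It is not how Global Redirect works; the `redirect_location` helper and the idea of passing in an `allowed` set are assumptions made for the example.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def redirect_location(url, allowed=frozenset()):
    """Return the cleaned URL to redirect to, or None if nothing needs removing.

    `allowed` is a hypothetical set of parameter names the page really uses.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(name, value) for name, value in pairs if name in allowed]
    if len(kept) == len(pairs):
        return None  # every parameter is expected, so serve the page as usual
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


target = redirect_location("http://drupal.org/node/386928?page=123")
if target is not None:
    print("301 Moved Permanently ->", target)
```

A permanent redirect would at least collapse the endless ?page=xxx variants onto one URL, but it still costs a request per variant and depends on knowing which parameters each page legitimately uses.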
Comment #7
avpaderno
I am closing this issue, as it was filed for an unsupported Drupal version.