This causes false alarms. For example, the Google Webmaster Tools stats will show many errors logged (correctly) as connection errors, because the bot thinks your site is offline or partially/temporarily broken. Since we deny access to some URLs that bots should never attempt to crawl anyway, we should respond with a standard 403 error instead, so the bot understands it as "nothing to see here!" and gives up, rather than "let's try again, they are probably offline".
For reference, here is the full list of URLs denied by default for all known bots, because they are never cached, are known to cause serious system overload when crawled (often to index hundreds of empty pages, as is the case with the calendar module), or are simply not expected to be crawled and indexed:
location ^~ /search
location ~* /(?:autocomplete|ajax|ahah)/
location ^~ /admin
location ^~ /audio/download
location ~* (?:cgi-bin|vti-bin|wp-content)
location ~* (?:calendar|event|validation|aggregator|vote_up_down|captcha)
location ~* ^/(?:.*/)?(?:user|cart|checkout|logout|flag)
location ~* /(?:node/[0-9]+/edit|node/add|comment/reply|approve|users)
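As a rough illustration of the 403 approach described above, one of the listed locations could be handled along these lines. This is only a sketch, not the actual Provision template: the `$is_crawler` variable, the user-agent pattern, and the `example.com` server name are assumptions made up for the example.

```nginx
# Both blocks below belong inside the http {} context.

# Hypothetical map: flags requests whose User-Agent looks like a crawler.
# The variable name and the pattern list are assumptions for this sketch.
map $http_user_agent $is_crawler {
    default                          0;
    "~*(?:bot|crawl|spider|slurp)"   1;
}

server {
    listen      80;
    server_name example.com;  # placeholder name

    location ^~ /search {
        # Answer crawlers with a standard 403 instead of dropping
        # the connection with 444, so they treat the URL as forbidden
        # rather than as a sign that the site is offline.
        if ($is_crawler) {
            return 403;
        }
        # ... normal handling for regular visitors goes here ...
    }
}
```

The same `if` block would be repeated in each of the locations listed above; `nginx -t` can be used to check the syntax before reloading.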
Comments
Comment #1
omega8cc commented
Fixed in:
6.x-1.x - http://drupalcode.org/project/provision.git/commit/f40bcbb
6.x-2.x - http://drupalcode.org/project/provision.git/commit/036affb
Comment #3
jami3z commented
Hey omega8cc, sorry to reopen, but I have this exact same issue and some very unhappy clients. I have taken over management of a server that is running nginx and Aegir. It doesn't seem to have Provision installed, so where else in the config can I try to implement a fix like this, so that Googlebot won't receive 444 errors and will stop failing?
cheers
Comment #4
steven jones commented
Sorry @jamienotweet, please don't re-open issues for unrelated reasons.
Have a look at: http://drupal.org/support for your support options, or post your question on http://drupal.stackexchange.com/ Good luck!
Comment #5
jami3z commented
Unrelated reasons??? This is the exact issue I am having.
Thanks for the help, I will try to find it elsewhere.
Comment #6
steven jones commented
My apologies, I misread your comment. You probably want to upgrade to the latest Aegir version: http://community.aegirproject.org/upgrading
Comment #7
omega8cc commented
@jamienotweet Provision must be installed, but if it is a BOA-based install, please open an issue in the Barracuda or Octopus queue.