robots.txt

sp_key - November 9, 2009 - 10:10

Hi all,

I just managed to install my staging site online.
Currently my structure is as follows:

main domain: http://example.com (This is still "under construction")
subdomain: http://example.com/staging
my blog: http://example.com/blog (wordpress installation)

I would like to stop Google from indexing my staging site, so I'm wondering whether I could do this with a robots.txt file.

I don't know whether I need one robots.txt file for each of the above installations (one for my blog, one for my live site and one for my staging site) or just a single robots.txt file.

Any thoughts?


HershelSR - November 9, 2009 - 12:14

Even though you used the word "subdomain," the URL you gave is not actually a subdomain, it's just a URL on your main domain. A subdomain would be like http://staging.example.com/

Given the URLs you provided above, I do believe you would need only one robots.txt file, because the search engines should perceive your site as one site. Anyone should perceive it that way, because it's all under one domain. :)

HTH

--
CiviHosting - Drupal Hosting at its Best


sp_key - November 9, 2009 - 14:04

OK, good to know.
I need to investigate then, because according to my hosting company this is my subdomain.


sp_key - November 9, 2009 - 16:13

I have to admit,
I love this community :)

So problem sorted :)
This was my own error. I had incorrect base URL info in my settings.php file.

Now I got it sorted.
Do I need a separate robots.txt file for my staging site?


fighter75 - November 10, 2009 - 03:31

Hello sp_key,

If your staging site is located at http://example.com/staging, you do not need a separate robots.txt file.

Just add the following line to the existing robots.txt file used for http://example.com/:


Disallow: /staging/

If you do create a real subdomain, you will need a separate robots.txt file. It must be located in the root directory of your subdomain (e.g. /public_html/staging/). Make sure it can be accessed at http://www.staging.example.com/robots.txt.

Then add the following lines to that robots.txt file:

User-agent: *
Disallow: /
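
If you want to double-check either set of rules before relying on them, here is a minimal sketch using Python's standard urllib.robotparser module. The allowed() helper, the "Googlebot" agent name and the sample URLs are only assumptions for illustration, and real crawlers may interpret robots.txt slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text.

    Hypothetical helper for illustration; it parses the rules from a string
    instead of downloading them from a live site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Case 1: staging lives in a subdirectory, so the rule goes into the
# robots.txt of the main site (http://example.com/robots.txt).
MAIN_SITE_RULES = """\
User-agent: *
Disallow: /staging/
"""

# Case 2: staging lives on a real subdomain, which needs its own robots.txt
# that blocks everything.
SUBDOMAIN_RULES = """\
User-agent: *
Disallow: /
"""

print(allowed(MAIN_SITE_RULES, "http://example.com/staging/node/1"))
print(allowed(MAIN_SITE_RULES, "http://example.com/blog/"))
print(allowed(SUBDOMAIN_RULES, "http://staging.example.com/"))
```

The first and third checks should come back as blocked, while the blog URL stays allowed. Note that rules are matched as path prefixes, so "Disallow: /staging/" covers everything below /staging/ but not the bare URL http://example.com/staging without the trailing slash.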

It can be done quickly! Good luck :)

 
 
