robots.txt: how to manage it across domains?

matteoraggi - November 5, 2009 - 16:33
Project: Domain Access
Version: 6.x-2.0-rc9
Component: Documentation
Category: support request
Priority: normal
Assigned: Unassigned
Status: active
Description

In the online documentation, I haven't found anything about robots.txt. How can I manage it using the Domain Access module?

#1

matteoraggi - November 10, 2009 - 18:52

I have done some tests (Domain Access was configured by a friend, so I could only check what I have access to), and these are my results:
1) If robots.txt is placed in the root of the main domain, then with Domain Access the same robots.txt is served on all other subdomains.
2) Putting robots.txt into maindomain/sites/all/themes/themename, maindomain/sites/default/domain_name/, or sites/default/files/domain-N (where N is the ID number of the subdomain) does not work.
Conclusion: It looks like it is currently impossible to serve a different robots.txt per site with the Domain Access module. I think this could be useful in some cases, and the more domains there are, the more likely it is to be useful.

#2

agentrickard - November 10, 2009 - 23:02

There is a robots.txt module that is designed for multisite setups and may be helpful. You need some manner of redirect to point the bots to the proper version of the file.

#3

matteoraggi - November 11, 2009 - 11:34

I found the module, but since its description mentions multisite, I assumed it could not work with the Domain Access module:
http://drupal.org/project/robotstxt
I don't feel confident configuring it in a Domain Access setup. Could you please give more detailed instructions?

#4

agentrickard - November 11, 2009 - 10:42

I cannot. I have not used that module.

#5

jeremyr - November 13, 2009 - 16:13

I recently looked into this module and found that it only works for multi-site installs, that is, multiple instances of Drupal running from the same code base. You would have to enable the module for each instance to get it to work properly.

What I have done to prevent search engines from seeing the same content on my other domains is enable the search engine rewrite option in the domain settings. I assume this works fine, but if you need to specify different rules in robots.txt for each domain, I think we'll have to create a new module.

And if you are into "hacking" and non-recommended practices, you could do what I do on my development site. In my index.php file I added some PHP logic to identify robots and redirect them to the live site. You could add something similar to redirect them to the primary domain if they try to crawl the other domains. I'm sure there are all sorts of things wrong with doing that, but it would work.

#6

agentrickard - November 13, 2009 - 17:34

You might also be able to do a rewrite rule in .htaccess that points to different versions of robots.txt based on the inbound HTTP_HOST.
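
A minimal sketch of what such a rule could look like, not a tested configuration. The host names (example.com, example.org) and the per-domain file names are hypothetical placeholders, and the sketch assumes Apache with mod_rewrite enabled and the per-domain files sitting in the Drupal root next to the default robots.txt:

```apache
<IfModule mod_rewrite.c>
  RewriteEngine on

  # Hypothetical hosts and file names; replace them with your own.
  # Each pair serves a different file when /robots.txt is requested
  # on the matching host (with or without a leading "www.").
  RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
  RewriteRule ^robots\.txt$ robots-example-com.txt [L]

  RewriteCond %{HTTP_HOST} ^(www\.)?example\.org$ [NC]
  RewriteRule ^robots\.txt$ robots-example-org.txt [L]
</IfModule>
```

Requests on any host that matches none of the conditions fall through to the default robots.txt. The rules would probably be safest placed before Drupal's own rewrite rules in .htaccess. Note that these patterns do not match a Host header that includes a port number.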
