I've always added a simple robots.txt file to my Drupal sites, because I kept seeing 404 errors in the logs from search engines requesting one.

Should we add a very basic robots.txt file by default? Perhaps:

User-agent: *
Disallow: /admin

# To customize this robots.txt file, follow the examples...
# Disallow: /path/to/directory/or/file
# Allow: /path/to/directory/or/file

# If you do not disallow a file or directory, it is allowed by default.
# The User-Agent string can be used to target specific search engines. Use Google to find out more about that.
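
One way to sanity-check rules like these before deploying them is to feed them to Python's standard-library `urllib.robotparser`. This is only a minimal sketch: the sample paths (`/admin/settings`, `/node/1`) and the `ExampleBot` agent name are made up for illustration, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The two active rules from the proposed file above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin
"""


def is_allowed(path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the rule.
    for path in ("/admin", "/admin/settings", "/node/1"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Note that this parser matches by path prefix, so `Disallow: /admin` also covers everything underneath `/admin`.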

Comments

Steven’s picture

This is a good idea. There are a lot of pages on a Drupal site that do not need to be spidered, such as the comment reply page or the node add page. Drupal.org has one too.

--
If you have a problem, please search before posting a question.

bertboerland’s picture

We discussed this some time ago, and there was even a time when Drupal included a robots.txt. I am still in favour; however, Dries didn't agree. See this thread. It is included in the documentation, however.

Update: corrected wrong link!
--
groets
bertb

--
groets
bert boerland

adrian’s picture

Other things that could/should be disallowed are the tracker (recent posts) page, and maybe /user (although that could be a site-specific thing).

--
The future is so Bryght, I have to wear shades.
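
The extra pages suggested in the comments (comment reply, node add, tracker, /user) could be folded into the file along these lines. This is a rough sketch only: the path list is an assumption that presumes clean URLs and default Drupal paths, the helper names are hypothetical, and whether `/user` belongs in it is site-specific, as noted above.

```python
from urllib.robotparser import RobotFileParser

# Assumed paths based on the suggestions in this thread; adjust per site.
DISALLOWED = ["/admin", "/comment/reply", "/node/add", "/tracker", "/user"]


def build_robots_txt(paths, agent="*"):
    """Build robots.txt text with one Disallow line per path."""
    lines = [f"User-agent: {agent}"]
    lines += [f"Disallow: {path}" for path in paths]
    return "\n".join(lines) + "\n"


def is_blocked(robots_txt, path, agent="ExampleBot"):
    """Return True if `agent` may NOT fetch `path` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, path)


if __name__ == "__main__":
    text = build_robots_txt(DISALLOWED)
    print(text)
    # Hypothetical URLs used only to exercise the rules.
    for path in ("/tracker", "/comment/reply/12", "/node/12"):
        print(path, "blocked" if is_blocked(text, path) else "allowed")
```

Generating the file from a list keeps the site-specific choices in one place, so a path such as `/user` can be dropped without touching the rest.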