Posted by raym0nd on September 6, 2010 at 7:31am
Hi,
How can I exclude a Drupal site from search engines without using robots.txt?
Is there a way to set this in the administration menu or in the settings.php file?
Please advise.
Thank you :)
Comments
Is the website you are talking about public?
If yes, I guess robots.txt is the only way to suggest to the search engines not to index your website.
Yup it is for public.
How about any module for SEO?
SEO does exactly the opposite!
It could perhaps be used to lower the rank of your pages in searches, but I do not think that it's able to block search engines.
Oic, thanks for the advice.
Robotstxt project perhaps
In case you literally mean not using the robots.txt file, but are OK with using the same technology, you could use the Robotstxt contributed module (http://drupal.org/project/robotstxt). With this module, the search engine's spider is not served a static robots.txt file; Drupal generates the file itself. The purpose of the module is to serve different robots.txt files for different sites in case you run Drupal in a multisite configuration.
Mark
_
... just a thought. I use the following method to keep websites that are under construction out of search engines:
I password protect the root (public / www) directory on the server. That blocks the search engines - so far - and adds a second security layer giving access to those working on the project / site only.
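Why this keeps crawlers out can be sketched in a few lines. The sketch assumes the directory is protected with HTTP Basic authentication (one common way to password protect a directory; the post does not say which method is used), and the function names and credentials are made up for illustration. A crawler sends no credentials, so it only ever gets a 401 answer:

```python
import base64


def basic_auth_header(user, password):
    """Build the Authorization header value a browser sends after login."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def status_for(request_headers, user, password):
    """Status a protected directory would answer with: 200 or 401.

    This only imitates the server-side check; it is not a real server.
    """
    sent = request_headers.get("Authorization")
    return 200 if sent == basic_auth_header(user, password) else 401


if __name__ == "__main__":
    # Hypothetical credentials, for illustration only.
    crawler_request = {}  # a spider sends no Authorization header
    member_request = {"Authorization": basic_auth_header("editor", "s3cret")}
    print(status_for(crawler_request, "editor", "s3cret"))
    print(status_for(member_request, "editor", "s3cret"))
```

Since the crawler never receives the page content, there is nothing for it to index, whether or not it honours robots.txt.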
-----------
Good luck .....
the results of trying Drupal just once are
www.mallsandmore.com
www.sds-i.com
www.proRotaTherm.com
Robots.txt / SEO
Create a text file called robots.txt, paste in the text below, change 'privatefolder1' and 'privatefile1' to the files/folders you wish to block, and drop the robots.txt into the root directory of your Drupal site:
User-agent: *
Disallow: /privatefolder1/
Disallow: /privatefolder1/privatefile1.htm
(If you want to disallow everything use this - use with caution):
User-agent: *
Disallow: /
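To see what rules like these tell a well-behaved crawler, they can be run through a parser before the file goes live. A minimal sketch using Python's standard urllib.robotparser module; the helper name is_allowed and the sample path /node/1 are made up for illustration, and the result only shows how a compliant crawler would read the file, not what any particular search engine will actually do:

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, path, agent="*"):
    """Return True if a compliant crawler may fetch `path` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# The selective rules from the post above.
SELECTIVE = """\
User-agent: *
Disallow: /privatefolder1/
Disallow: /privatefolder1/privatefile1.htm
"""

# The "disallow everything" variant.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

if __name__ == "__main__":
    for path in ("/node/1", "/privatefolder1/privatefile1.htm"):
        print(path, is_allowed(SELECTIVE, path), is_allowed(BLOCK_ALL, path))
```

With the selective rules only the listed folder is off limits; with "Disallow: /" every path is refused, which is why that variant needs caution on a site that should stay findable.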
http://benacheson.com
_
ben,
just a quick Q - what is the story behind your highlighted warning?? "(If you want to disallow everything use this - use with caution):"
Thanks ....
You are simply telling all search engine spiders: "Go away!"
That might not be what you want. The effect will be that your site will not show up any more in any of the better known search engines.
Mark
_
Thanks Mark,
it's what I expected and what was part of the discussion. Since there was no explanation on ben's note, I was wondering if there is another issue hiding somewhere.
The point was discussed that, for virtual office websites for logged-in members, SEO and crawling spiders are usually not wanted.
And to prove the point I recently changed the robots.txt file on an existing website that never really went live:
User-agent: *
Disallow: /
But Google ignores the change and keeps crawling the site. So using the Troll module (http://drupal.org/project/troll) and banning these offenders at http://www.yoursite.org/admin/reports/visitors seems to be the only way right now.
Thanks for all the replies.
I had thought there was a way to disable it in the Drupal settings itself.
_
Web crawlers browse the site as the anonymous user: uncheck the 'access content' permission for the anonymous role and web crawlers can't see anything.
_
Don't be a Help Vampire - read and abide the forum guidelines.
If you find my assistance useful, please pay it forward to your fellow drupalers.
_
.... but then users who are not logged in cannot see anything either.
I guess it depends on what you want to do.
We are creating some websites for (smallish) companies at present that need to have a "virtual office website".
These sites are used to exchange information between teams and members of teams. Thus the sites are private and the content should not be crawled = SEOed.
Apart from not doing any SEO work on these sites, we also use the double login described earlier and the robots.txt exclusion, and are always happy to learn some new tricks ....
_
huh? Just grant 'authenticated users' the 'access content' permission.
_
.... as I said - it depends what you want to do.
So far we do not know what raym0nd wants to do.
Maybe he wants a site that, for some reason or other, is not indexed, but lets those who know where to find it see at least the front page.
As far as I understand it, without the 'access content' permission you would not be able to see anything but the login page.
I would assume that is exactly what one would want, and it is what we do on some of the projects mentioned.
Let's see, maybe raym0nd will enlighten us a bit more and tell us what he is up to and how he succeeded in the quest.
.....
_
I totally read your previous comment wrong, lol. I thought you were saying that logged-in users couldn't see anything, duh.
In any case, you can also use the http://drupal.org/project/front module to let anonymous users see just the front page, so you can add some text there about the site and how to 'register for more info' and such.
But no, there's really no 100% reliable way to stop search engine indexing while allowing anonymous 'user' browsing afaik.
Actually, one of the Drupal sites I am working on is for internal company use, and I do not want search engines to find it.
I know that robots.txt can do the job, but I heard that this can also be set in Drupal's own settings, which I am curious to find out about.
Would unchecking the 'access content' permission for the anonymous role ensure that search engines won't see anything? My site requires authentication.
_
... I would recommend not using the anonymous role for an internal website. Having only logged-in (= authorized) roles gives you a better record of what people are doing (or have done).
In my mind the anonymous role is for public websites and is at the forefront of SEO, promotion, marketing ....
In other words, if you want to keep a website private, then there should be no room for the anonymous role.
Thanks for the explanation :)
_
robots.txt is not 100%: search engines are not required to obey it. The only 100% guaranteed way to keep your content from being indexed is to remove access for the 'anonymous' user role.