When I searched for Drupal earlier, one of the sitelinks was "Site off-line". IMHO this gives a really bad impression, and it should be removed via the sitelinks section in Google Webmaster Console. Image attached showing the issue.

File | Size | Author
drupalSerp.jpg | 73.32 KB | Fintan

Comments

kbahey’s picture

That needs to be done, but we also need a patch that makes the site off-line page non-indexable (temporary redirect?).

Fintan’s picture

Well, if you're taking the site offline, your best bet is to actually ban spiders, i.e. have a robots.txt (User-agent: * Disallow: /) that stops all robots while the site is offline.

But it's also true that the robots.txt is occasionally cached during the period that the site is offline, so it's probably worth doing both: serve the robots.txt and add a noindex to the page. However, this does not mean the URL will not appear; it only means the content will not be indexed. It's not quite a guarantee that it will not appear, but it's as close as you will get.
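To make the two measures above concrete, here is a rough sketch in Python (not Drupal code) that checks what a blanket robots.txt does to crawlers, using the standard library's `urllib.robotparser`. The `is_blocked` helper, the example.com URLs, and the `NOINDEX_META` constant are made up for illustration; the robots.txt rules are the ones quoted above.

```python
from urllib.robotparser import RobotFileParser

# The blanket rule quoted above: every robot, every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# Page-level fallback in case a crawler is still using a cached robots.txt.
# (Standard robots meta tag; where it gets emitted is up to the theme.)
NOINDEX_META = '<meta name="robots" content="noindex">'


def is_blocked(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt forbids user_agent from fetching url.

    Hypothetical helper, only here to illustrate the rule set.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "SomeOtherBot"):
        print(agent, is_blocked(agent, "http://example.com/node/1"))
```

Note that this only shows how a well-behaved crawler reads the rules; it says nothing about how long a search engine keeps an older copy of robots.txt around.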

kbahey’s picture

But that requires manually messing with files using FTP or SSH on the server. The operation of taking a site offline is done from the Drupal admin interface. So, it would be nice to have this without the need to mess with files.

The old way of taking a site offline was to have a different index.php that does not call Drupal, but that was a fully manual, multi-step process too, and it also required FTP/SSH access.

We need to find out whether there is a way to keep the site off-line message from being indexed by search engines by putting something in the response (a return code of 302 or 5xx?).

Fintan’s picture

Best bet then is to return a 503 with a Retry-After header.
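For illustration, a minimal sketch of what such a response looks like, written as a tiny Python WSGI app rather than Drupal code. The `maintenance_app` name, the message text, and the one-hour `RETRY_AFTER_SECONDS` value are assumptions picked for the example, not anything Drupal ships.

```python
# Hypothetical delay; pick whatever matches the expected downtime.
RETRY_AFTER_SECONDS = 3600


def maintenance_app(environ, start_response):
    """Answer every request with 503 plus a Retry-After header."""
    body = b"Site off-line for maintenance. Please try again later.\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # Tells clients and crawlers how many seconds to wait before retrying.
        ("Retry-After", str(RETRY_AFTER_SECONDS)),
    ]
    start_response("503 Service Unavailable", headers)
    return [body]


if __name__ == "__main__":
    # Serve locally for a quick look with: curl -i http://localhost:8000/
    from wsgiref.simple_server import make_server

    make_server("", 8000, maintenance_app).serve_forever()
```

The point of the 503 is that the off-line page is marked as a temporary condition instead of being returned as a normal 200 page.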

moshe weitzman’s picture

I searched but could not find an open issue for this. I agree that we should be sending that http header.

greggles’s picture

Drupal already sends a 503 when the site is in maintenance mode...

I don't know why that page is cached as "Site Offline". Perhaps the Site Links titles get updated less often/less reliably than other parts of the index?

Fintan’s picture

Oops, sorry, forgot about that. It does indeed give the correct 503.

killes@www.drop.org’s picture

Is there anything left to do? Note that I don't know which actual URL is behind the "Site offline" message. I don't see it shown.

killes@www.drop.org’s picture

Status: Active » Fixed

OK, I've told Google not to use that link anymore. It will be blocked for 3 months.

dww’s picture

FWIW: I remember that in the past Squid had problems caching the site off-line pages. Perhaps that's part of what went wrong here? Maybe Squid a) didn't ignore the 503 itself and b) served up a site off-line page without the 503 when Google came asking? Just shooting in the dark here...

Anonymous’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.