Google Drupal SERP showing "Site off-line"

Fintan - May 6, 2008 - 08:33
Project: Drupal.org infrastructure
Component: Other
Category: task
Priority: normal
Assigned: Unassigned
Status: fixed
Description

When I searched Google for Drupal earlier, one of the sitelinks was "Site off-line". IMHO this gives a really bad impression, and the link should be removed via the sitelinks section in the Google webmaster console. Image attached showing the issue.

Attachment        Size
drupalSerp.jpg    73.32 KB

#1

kbahey - May 6, 2008 - 14:38

That needs to be done, but we also need a patch for the site offline page to make it non-indexable (temporary redirect?).

#2

Fintan - May 6, 2008 - 14:54

Well, if you're taking the site offline, your best bet is to actually ban spiders, i.e. have a robots.txt ("User-agent: *" followed by "Disallow: /") that stops all robots while the site is offline.

But it's also true that crawlers occasionally cache the robots.txt across the period when the site is taken offline, so it's probably worth doing both: the robots.txt and a noindex on the page. However, noindex does not mean that the URL will not appear, only that the content will not be indexed. It's not quite a guarantee that it will not appear, but it's as close as you will get.
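For anyone who wants to sanity-check the blanket rule, here is a minimal sketch using Python's standard urllib.robotparser. The example.com URLs, the "ExampleBot" agent name and the is_crawlable helper are placeholders made up for illustration, not anything from drupal.org.

```python
from urllib.robotparser import RobotFileParser

# The "ban all spiders" rules described above, one directive per line.
OFFLINE_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URLs; any path on the site is covered by "Disallow: /".
    for url in ("http://example.com/", "http://example.com/node/1"):
        print(url, is_crawlable(OFFLINE_ROBOTS_TXT, url))
```

This only shows what a well-behaved crawler would conclude from the file; it says nothing about how long a given search engine keeps an older copy of robots.txt cached.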

#3

kbahey - May 6, 2008 - 15:17

But that requires manually messing with files using ftp or ssh on the server. The operation of taking a site offline is done from the Drupal admin interface. So, it would be nice to have this without the need to mess with files.

The old way of taking a site offline was to have a different index.php that does not call Drupal, but that was a fully manual multi-step process too, and also required ftp/ssh access.

We need to find out whether there is a way of not getting the site offline message indexed in search engines, by putting something in the response (return code of 302 or 5xx?).

#4

Fintan - May 6, 2008 - 15:41

Best bet then is to return a 503 with a Retry-After header.
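A rough sketch of what such a response looks like, written as a tiny Python WSGI app rather than Drupal code. The one-hour Retry-After value, the body text, and the offline_app and probe names are all assumptions made up for illustration.

```python
from wsgiref.util import setup_testing_defaults

RETRY_AFTER_SECONDS = 3600  # assumed maintenance window, purely illustrative


def offline_app(environ, start_response):
    """WSGI app that answers every request with 503 plus a Retry-After header."""
    body = b"Site off-line for maintenance.\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
        ("Retry-After", str(RETRY_AFTER_SECONDS)),
    ]
    start_response("503 Service Unavailable", headers)
    return [body]


def probe(app, path="/"):
    """Call the app in-process and return (status line, headers as a dict)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    b"".join(app(environ, start_response))  # consume the body
    return captured["status"], captured["headers"]


if __name__ == "__main__":
    status, headers = probe(offline_app)
    print(status, headers["Retry-After"])
```

The 503 tells crawlers the outage is temporary, and Retry-After hints at when to come back; how each search engine actually treats the hint is up to the engine.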

#5

moshe weitzman - May 6, 2008 - 15:45

I searched but could not find an open issue for this. I agree that we should be sending that http header.

#6

kbahey - May 6, 2008 - 16:02

#7

greggles - May 6, 2008 - 16:24

Drupal already sends a 503 when the site is in maintenance mode...

I don't know why that page is cached as "Site Offline". Perhaps the Site Links titles get updated less often/less reliably than other parts of the index?

#8

Fintan - May 7, 2008 - 08:57

Oops, sorry, I forgot about that; it does indeed give the correct 503...

#9

killes@www.drop.org - May 7, 2008 - 18:28

Is there anything left to do? Note that I don't know which actual URL is behind the "Site offline" message. I don't get it shown.

#10

greggles - May 7, 2008 - 18:49

#11

killes@www.drop.org - May 7, 2008 - 19:34
Status: active » fixed

OK, I've told Google not to use that link anymore. It will be blocked for 3 months.

#12

dww - May 8, 2008 - 03:52

FWIW: I remember that in the past Squid had problems caching the site offline pages. Perhaps that's part of what went wrong here? Squid a) didn't ignore the 503 itself and b) served up a site-offline page without the 503 when Google came asking? Just shooting in the dark here...

 
 
