googlebot and unavailable

jusyjim - September 1, 2009 - 14:38
Project:Scheduler
Version:5.x-1.9
Component:Miscellaneous
Category:support request
Priority:normal
Assigned:Unassigned
Status:closed
Description

Pages that I have scheduled to be unpublished continue to be indexed by Google after the unpublish date. I've posted a question in the Google forums, and the answers indicate that the meta tag is being written improperly. According to the page at http://googleblog.blogspot.com/2007/07/robots-exclusion-protocol-now-wit... they say:
-----
We have introduced a new META tag that allows you to tell us when a page should be removed from the main Google web search results: the aptly named unavailable_after tag. This one follows a similar syntax to other REP META tags. For example, to specify that an HTML page should be removed from the search results after 3pm Eastern Standard Time on 25th August 2007, simply add the following tag to the first section of the page:

The date and time is specified in the RFC 850 format.
-----

The way the meta tag is written in my pages is as follows:

Is this something that I can fix?

Thanks.

#1

jusyjim - September 1, 2009 - 14:42

Apparently I should have used the code tags to include the sample meta tags so here they are:

Example Google gives: <META NAME="GOOGLEBOT" CONTENT="unavailable_after: 25-Aug-2007 15:00:00 EST">
My Tag sample: <META NAME="GOOGLEBOT" CONTENT="unavailable_after: 06-Sep-2009 19:02:48 America/New_York">

#2

Eric Schaefer - September 7, 2009 - 19:06
Status:active» needs review

Looks like Google doesn't like the "fancy" time zone.
Try this: Use the current release (5.x-1.17) and go to scheduler.module line number 366. It should look like this:

          $unavailable_after = date ("d-M-Y H:i:s e", $node->unpublish_on);

Change this line to this:

          $unavailable_after = date ("d-M-Y H:i:s T", $node->unpublish_on);

('e' replaced by 'T' in the format string)
Please check whether this change actually makes your nodes expire in Google.
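
For anyone following along, here is a minimal standalone sketch of what the two format characters produce. It assumes the site timezone is America/New_York (taken from the tag in #1) and uses a hypothetical timestamp in place of $node->unpublish_on, so treat it as an illustration rather than the module's actual code:

```php
<?php
// Assumption: the site timezone matches the tag shown in comment #1.
date_default_timezone_set('America/New_York');

// Hypothetical timestamp standing in for $node->unpublish_on
// (6 September 2009, 19:02:48 local time).
$unpublish_on = mktime(19, 2, 48, 9, 6, 2009);

// 'e' prints the full timezone identifier.
$with_identifier = date('d-M-Y H:i:s e', $unpublish_on);

// 'T' prints the timezone abbreviation instead.
$with_abbreviation = date('d-M-Y H:i:s T', $unpublish_on);

echo $with_identifier, "\n";   // 06-Sep-2009 19:02:48 America/New_York
echo $with_abbreviation, "\n"; // 06-Sep-2009 19:02:48 EDT
```

The second form is the one that resembles Google's own example (a short abbreviation such as EST), which is why the format string swap is worth trying.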

#3

jusyjim - September 8, 2009 - 17:22

Thanks Eric, this does seem to do the trick!!!

We will see if Google abides now.

Thanks again!

Jim

#4

Eric Schaefer - September 28, 2009 - 18:36
Status:needs review» fixed

Committed: http://drupal.org/cvs?commit=268626 (D5) and http://drupal.org/cvs?commit=268624 (D6)

#5

jusyjim - October 8, 2009 - 15:10

Eric, just thought you would like to follow the thread that I am working with in the Google help forums, since I'm still having trouble. One comment there is that, since the page is XHTML, the tag should be in lower case. Here is the link to the discussion if you would like to follow it:

http://www.google.com/support/forum/p/Webmasters/thread?fid=6c5bdbed8313...

#6

jusyjim - October 8, 2009 - 15:35

Eric,

Sorry to post twice in a row, but would it be OK to just change the line in the module from:
drupal_set_html_head('<META NAME="GOOGLEBOT" CONTENT="unavailable_after: '. $unavailable_after .'">');
to
drupal_set_html_head('<meta name="googlebot" content="unavailable_after: '. $unavailable_after .'">');

??

thanks.
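
To make the combined effect of both changes easier to check outside Drupal, the tag string can be built in a small standalone function. This is only a sketch: build_unavailable_after_tag() is a hypothetical helper that does not exist in the Scheduler module, the timezone is assumed to be America/New_York as in #1, and in the module itself the resulting string would still be passed to drupal_set_html_head().

```php
<?php
// Assumption: the site timezone matches the tag shown in comment #1.
date_default_timezone_set('America/New_York');

/**
 * Hypothetical helper (not part of Scheduler): builds the lower-case
 * googlebot meta tag for a given unpublish timestamp.
 */
function build_unavailable_after_tag($timestamp) {
  // 'T' yields the timezone abbreviation, as suggested in comment #2.
  $unavailable_after = date('d-M-Y H:i:s T', $timestamp);
  return '<meta name="googlebot" content="unavailable_after: ' . $unavailable_after . '">';
}

// Hypothetical timestamp standing in for $node->unpublish_on.
$tag = build_unavailable_after_tag(mktime(19, 2, 48, 9, 6, 2009));
echo $tag, "\n";
```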

#7

Eric Schaefer - October 9, 2009 - 19:39

Changed it for both D5 and D6. It will be part of tomorrow's dev release.

#8

jusyjim - October 12, 2009 - 13:28

Good news, thanks Eric!

#9

System Message - October 26, 2009 - 13:30
Status:fixed» closed

Automatically closed -- issue fixed for 2 weeks with no activity.

 
 
