All 'page does not exist' pages are indexed by Google, resulting in potentially thousands of pages of search engine spam.

This page needs to have the 'robots' meta tag in the <head> of the document set to 'noindex,nofollow' (see Wikipedia's version of this page for an example).

This is a particularly important patch for the search-engine conscious.

Comments

groovypower’s picture

Have not tried it yet, but was wondering if this could just be added to the function theme_wikitools_page_does_not_exist? Not sure if you can add to the header from a theme function or if you even need to.

chrism2671’s picture

It's worked for Google to put this tag into the <head> of the document, which I've done with a simple hack, but as with all these things, it's better if it's done properly.

groovypower’s picture

Do you mind sharing the hack?

chrism2671’s picture

As I recall, if you just find the 'page not found' function in the wikitools code and have it dump the meta tag in with the output for that page it seems to work.

groovypower’s picture

That will dump the meta tag in the body, not the head, which I believe could invalidate the page and confuse the spiders even more.

I finally found a snippet. Add this as the first line of the function:

drupal_set_html_head('<meta name="robots" content="noindex,nofollow" />');
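
For context, here is a rough sketch of where that line might sit in a theme override. The override name mytheme_wikitools_page_does_not_exist, its $page_name parameter, and the returned markup are assumptions for illustration, not the actual wikitools code. The small stand-in for Drupal 6's drupal_set_html_head() is only there so the sketch runs outside Drupal.

```php
<?php
// Stand-in for Drupal 6's drupal_set_html_head() so this runs outside Drupal.
// Inside Drupal the real function is already defined and this is skipped.
if (!function_exists('drupal_set_html_head')) {
  function drupal_set_html_head($data = NULL) {
    static $stored_head = '';
    if (!is_null($data)) {
      $stored_head .= $data . "\n";
    }
    return $stored_head;
  }
}

// Hypothetical theme override: the name, parameter, and markup are assumed.
function mytheme_wikitools_page_does_not_exist($page_name) {
  // First line of the function: queue the robots tag for the document <head>.
  drupal_set_html_head('<meta name="robots" content="noindex,nofollow" />');

  // Placeholder body output; the real function builds its own message.
  return '<p>The page "' . htmlspecialchars($page_name) . '" does not exist.</p>';
}
```

Because the tag is queued with drupal_set_html_head() instead of being returned with the page output, it should end up in the head, not the body.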
chrism2671’s picture

Oh brilliant, I will try that out later! That should really be part of the next release...

jpmckinney’s picture

Status: Active » Closed (duplicate)

See #1045432: Missing wiki page does not set http status code 404. Fixed by returning a 404 header. Google respects HTTP headers.
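
As a minimal sketch of the 404 approach, assuming Drupal 6 (this is not the actual patch from #1045432): the callback name example_wiki_missing_page and its output are hypothetical, and the stand-in for drupal_set_header() only records headers so the sketch runs outside Drupal.

```php
<?php
// Stand-in for Drupal 6's drupal_set_header(); it records headers instead of
// sending them. Inside Drupal the real function is used and this is skipped.
if (!function_exists('drupal_set_header')) {
  function drupal_set_header($header = NULL) {
    static $stored_headers = array();
    if ($header !== NULL && $header !== '') {
      $stored_headers[] = $header;
    }
    return implode("\n", $stored_headers);
  }
}

// Hypothetical callback for a missing wiki page: name and markup are assumed.
function example_wiki_missing_page($page_name) {
  // Tell crawlers the page is missing via the HTTP status line.
  drupal_set_header('HTTP/1.1 404 Not Found');

  return '<p>The page "' . htmlspecialchars($page_name) . '" does not exist.</p>';
}
```

With a 404 status, search engines normally drop the URL without needing a robots meta tag on the page.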