I'm looking at the headers returned by a page with a forced error:

16:19:30 mike@dev ~$ HEAD http://example.ca/NoURLHere-GoElsewhere
200 OK
Cache-Control: public, max-age=21600
Connection: close
Date: Tue, 27 Nov 2012 21:23:32 GMT
ETag: "1354051412-0"
Server: Apache
Vary: Accept-Encoding
Content-Language: en
Content-Type: text/html; charset=utf-8
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Last-Modified: Tue, 27 Nov 2012 21:23:32 +0000
Client-Date: Tue, 27 Nov 2012 21:23:33 GMT
Client-Peer: 198.72.101.116:80
Client-Response-Num: 1
X-Drupal-Cache: MISS
X-Generator: Drupal 7 (http://drupal.org)
X-Powered-By: PHP/5.3.3

It should clearly return a "404 Not Found" status so that a tool like Xenu or a crawler like Google knows it's a 404. Right now it returns 200 OK instead.
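One way to check this from the command line, as a minimal sketch: the helpers below read HEAD-style output on standard input (status line first, as in the listing above) and report whether the response was a real 404. The function names status_from_head and is_real_404 are made up for illustration and are not part of any tool.

```shell
# Print the numeric status code from the first line of HEAD-style output,
# e.g. "200 OK" -> 200. Reads from standard input.
status_from_head() {
  awk 'NR == 1 { print $1; exit }'
}

# Succeed (exit status 0) only when the response on standard input is a 404.
is_real_404() {
  [ "$(status_from_head)" = "404" ]
}
```

Possible usage, assuming the HEAD command from libwww-perl is installed: `HEAD http://example.ca/NoURLHere-GoElsewhere | is_real_404 && echo "real 404" || echo "not a 404"`.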

I searched for header calls, but unlike Fast404, I didn't see one in the search404_page() function. Adding this didn't seem to help either, but I'm pretty sure it needs to be added:
header('HTTP/1.0 404 Not Found');


Comments

zyxware’s picture

I am not able to replicate this. Can you please try on a default installation and check this again?

$ wget http://localhost/~user/search404/drupal7/does-not-exist
--2012-11-29 03:17:18-- http://localhost/~user/search404/drupal7/does-not-exist
Resolving localhost... 127.0.0.1
Connecting to localhost|127.0.0.1|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2012-11-29 03:17:25 ERROR 404: Not Found.

mgifford’s picture

I'll send you the full domain, but this is my result:

15:29:35 mike@dev ~$ wget http://example.ca/en/NoFilesHere2
--2012-11-30 15:30:15-- http://example.ca/en/NoFilesHere2
Resolving example.ca... 198.72.101.119
Connecting to example.ca|198.72.101.119|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://example.ca/en/recherche-search/NoFilesHere2 [following]
--2012-11-30 15:30:15-- http://example.ca/en/recherche-search/NoFilesHere2
Connecting to example.ca|198.72.101.119|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `NoFilesHere2'

[ <=> ] 31,811 --.-K/s in 0.02s

2012-11-30 15:30:17 (1.62 MB/s) - `NoFilesHere2' saved [31811]

zyxware’s picture

Assigned: Unassigned » zyxware
Category: bug » support

@mgifford - What menu is this - "recherche-search"? There seems to be a redirection to that URL.

mgifford’s picture

We'll look into this and get back to you, thanks!

zyxware’s picture

Was this problem sorted out?

mgifford’s picture

Haven't had a chance, sorry. December can be crazy.

mropanen’s picture

I have the same problem if the option "Do a "Search" with custom path instead of a Drupal Search when a 404 occurs" is selected. Without it everything works and a 404 header is set.

yang_yi_cn’s picture

The custom page search option doesn't seem to be implemented very well.

zyxware’s picture

Title: 404 Header not being sent » 404 Header not being sent with custom page option

With the custom page option, the module currently redirects the request to the custom page, so the client receives a 301/302 redirect rather than a 404. If we instead execute the menu for the custom page directly, it should be possible to send a 404 status on custom pages. I will have to check how that would work.
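To see the difference from the client side, here is a rough sketch that follows redirects with curl and reports whether a missing URL ends in a 404 or in a redirect to some other status. The function names are hypothetical; the only tool assumed is curl with its -w variables http_code and num_redirects.

```shell
# describe_404_behaviour FINAL_STATUS REDIRECT_COUNT
# Summarise what a client saw when it requested a URL that does not exist.
describe_404_behaviour() {
  final_status=$1
  redirects=${2:-0}
  if [ "$final_status" = "404" ] && [ "$redirects" -eq 0 ]; then
    echo "404 sent directly"
  elif [ "$final_status" = "404" ]; then
    echo "404 sent after redirect"
  elif [ "$redirects" -gt 0 ]; then
    echo "redirected to $final_status, no 404"
  else
    echo "$final_status returned directly, no 404"
  fi
}

# check_missing_url URL
# Fetch URL with curl, following redirects, and describe the outcome.
check_missing_url() {
  # -w prints the final status code and the number of redirects followed;
  # the unquoted substitution splits them into $1 and $2.
  set -- $(curl -s -o /dev/null -L -w '%{http_code} %{num_redirects}' "$1")
  describe_404_behaviour "$1" "$2"
}
```

For the wget run earlier in this thread (a 302 followed by a 200), `describe_404_behaviour 200 1` prints "redirected to 200, no 404", which is the case a search engine cannot tell apart from a page that exists.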

zyxware’s picture

Status: Active » Closed (works as designed)

Closing ticket assuming that this issue has been clarified. Please feel free to re-open if you are still facing problems.

Charles Belov’s picture

Status: Closed (works as designed) » Active

Reopening. The custom page needs the option to be the 404 page. That is, it looks to humans like it is providing search results (which it is), but it looks to search engines like the page sought does not exist (because it doesn't).

We're not giving the site visitor a relocated page, we're giving them a hopefully useful 404 page. However, we have no guarantee that the search is actually providing useful results (and often, if it involves an old URL that is from a previous, non-Drupal website, it's totally useless). Therefore we don't want search engines to continue sending people to this URL; we want the search engine to remove the bad URL from its index.

What I would request is that the option
"Use a 301 Redirect instead of 302 Redirect"
get a companion option:
"Use a 404 Not Found instead of 302 Redirect"

The only workaround right now with a custom page is to also check Disable Auto Search. Then the 404 will be sent.

GaëlG’s picture

Status: Active » Needs review
FileSize
2.19 KB

This seems to work, at least in my use case (Custom search path: recherche?search_api_views_fulltext=@keys).

simonyeldon’s picture

Status: Needs review » Reviewed & tested by the community

Applied the patch in comment 12 and it works perfectly, thanks very much.

Vincenzo’s picture

I agree with Charles Belov in comment #11.
Indeed, I even think that the 404 code should always be returned. However, I am happy enough if the new setting gets merged in.
We had to apply patch #12 to a production platform serving 90 sites.

simonyeldon’s picture

The patch in #12 didn't work as well as I thought: it ended up outputting the page twice for us.

I have made a slight modification to the patch to prevent this from happening. Please feel free to review.

simonyeldon’s picture

Status: Reviewed & tested by the community » Needs review

mrded’s picture

Version: 7.x-1.2 » 7.x-1.x-dev
Priority: Normal » Major

mrded’s picture

Category: Support request » Bug report

Vincenzo’s picture

Status: Needs review » Reviewed & tested by the community

Patch #15 has been used on our 100+ sites for over a year now.
I guess that makes it "reviewed and tested".

  • anish_zyxware committed d1e1a84 on
    #1852240 : Fix 404 header not being sent with custom page option
    
anish_zyxware’s picture

Assigned: zyxware » anish_zyxware
Status: Reviewed & tested by the community » Closed (fixed)
FileSize
2.19 KB

The issue is fixed, and the fix is available on the 7.x-1.x branch. The final patch is attached.
It will be included in the next release, which should happen soon.