When running Drupal on a server with PHP as CGI, you have to change line 288(?) in /includes/common.inc from
drupal_set_header('HTTP/1.0 404 Not Found');
to
drupal_set_header('Status: 404 Not Found');
Otherwise Drupal will not send the correct 404 Not Found header. This affects popular hosts such as Site5 and Bluehost. More information can be found here: http://us3.php.net/header
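For anyone who would rather not hard-code one form or the other, here is a minimal sketch of choosing the header from the SAPI name. The helper name not_found_header is made up for illustration and is not part of Drupal, and the assumption that CGI-style setups report a SAPI name starting with "cgi" should be checked on your own host.

```php
<?php
// Hypothetical helper, not part of Drupal: build the 404 status header
// for a given SAPI name as reported by php_sapi_name().
function not_found_header($sapi) {
  // Assumption: CGI and FastCGI builds report names such as "cgi" or "cgi-fcgi".
  if (substr($sapi, 0, 3) == 'cgi') {
    return 'Status: 404 Not Found';
  }
  // Everything else (for example "apache2handler") gets the HTTP status line form.
  return 'HTTP/1.0 404 Not Found';
}

// Print the header that would be chosen for the PHP binary running this script.
echo not_found_header(php_sapi_name()), "\n";
```

In a plain PHP script the result could be passed to header(); inside Drupal it would take the place of the string given to drupal_set_header() on the line mentioned above.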
Also, the rewrite rules included in Drupal 4.7 do not work correctly. When I uncommented the rules that redirect from the non-www to the www form of the domain name, they did not work properly. If you requested a page like this:
http:// my_site.com/asdf (a nonexistent page)
it would just redirect to the home page with the www, like this:
http:// www. my_site.com
[spaces inserted to prevent linking]
This is bad for search engines: it can cause serious problems and send your site into the supplemental results. I'm speaking from experience with a badly configured server that did exactly the same thing, redirecting 404 errors to the home page instead of sending correct 404 errors.
This is the faulty default version:
# If you want the site to be accessed WITH the www. only, adapt and uncomment the following:
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule .* http://www.example.com/ [L,R=301]
I don't understand rewrite rules well, but the following version is the one that I use, and it prevents the problem mentioned above. It will redirect something like http:// example.com/asdf to http:// www. example.com/asdf, leading the visitor either to the correct page (instead of the home page) or to a correct 404 Not Found error:
# Redirect to www
RewriteCond %{HTTP_HOST} ^example.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
If that doesn't make sense let me know and I'll try to explain it better.
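To make the difference between the two rules concrete, here is a rough PHP simulation of what each one does to a requested path. This is only a sketch and not how Apache is implemented: the function name simulate_rewrite is made up, the [L,R=301] flags and the RewriteCond are not modeled, and it assumes that in .htaccess context the pattern sees the path without its leading slash.

```php
<?php
// Rough simulation only; simulate_rewrite is a made-up name for illustration.
// Assumption: in .htaccess context the RewriteRule pattern is matched
// against the request path without its leading slash (e.g. "asdf").
function simulate_rewrite($path, $pattern, $substitution) {
  $matches = array();
  if (!preg_match('#' . $pattern . '#', $path, $matches)) {
    return NULL; // The rule does not apply to this path.
  }
  // Replace the $1 back-reference with the first captured group, if any.
  $captured = isset($matches[1]) ? $matches[1] : '';
  return str_replace('$1', $captured, $substitution);
}

// Default rule: nothing is captured and there is no $1, so the path is dropped.
echo simulate_rewrite('asdf', '.*', 'http://www.example.com/'), "\n";
// Modified rule: (.*) captures the path and $1 appends it to the new host.
echo simulate_rewrite('asdf', '(.*)', 'http://www.example.com/$1'), "\n";
```

The first call yields the bare home page URL, while the second keeps /asdf, so Drupal on the www host still gets the chance to answer with a real 404.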
Comments
Comment #1
bradlis7 commented
I think I have this problem. Google won't stop hitting my xtracker page, even though I took out the module.
Comment #2
bradlis7 commented
Another way to avoid having to edit code is to add it to a custom page, and use it as the default 404 in admin->settings.
Does anyone know how to get the original URL when you get to the 404 page? If I print out $_GET['q'], it gives me /node/47, the 404 node, instead of /xtracker, which is what I want. I was trying to do a redirect to /tracker, but this makes it complicated.
Comment #3
tenrapid commented
The original value of 'q' is saved in $_REQUEST['destination'].
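As a minimal sketch of how that could be used for the /xtracker case, the helper below looks up the originally requested path in a small redirect table. The function name redirect_for_missing_path and the table are hypothetical, and how the redirect itself is then issued inside the 404 node is deliberately left out.

```php
<?php
// Hypothetical helper: given the request array and a map of old paths to
// new paths, return the new path for the originally requested URL, or NULL.
function redirect_for_missing_path($request, $map) {
  // Per the comment above, the original 'q' is expected in 'destination'.
  $original = isset($request['destination']) ? trim($request['destination'], '/') : '';
  return isset($map[$original]) ? $map[$original] : NULL;
}

// Example with a hand-built request array; on a real 404 page the first
// argument would be $_REQUEST.
$target = redirect_for_missing_path(
  array('destination' => 'xtracker'),
  array('xtracker' => 'tracker')
);
echo $target === NULL ? 'no redirect' : $target, "\n";
```

Paths that are not in the table return NULL, so the normal 404 page is still shown for them.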
Comment #4
Z2222 commented
Will that work? I tried it and it takes my 404 page and 'themes' it so that it is within the overall drupal theme. Doesn't the 404 header have to be the first thing sent?
Comment #5
Z2222 commented
To find out, use Firefox with the Live HTTP Headers extension. After installing the extension, restart the browser. Use Alt-l (letter L, lowercase) to open LiveHTTPheaders in the sidebar. Then load a nonexistent URL from your site in the browser. If the header in the sidebar says something like 'HTTP 1.x 404 OK' then you have this problem. It should say '404 Not Found', not '404 OK'.
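The same check can be scripted instead of done by eye. The sketch below parses a status line and flags the '404 OK' case; the function names are made up for illustration, and fetching the status line from a live site (for example with PHP's get_headers(), whose first array element is the status line of the first response) is left as a commented-out usage line because it needs network access.

```php
<?php
// Hypothetical helper: split an HTTP status line into code and reason phrase.
function parse_status_line($line) {
  $m = array();
  // Accepts forms such as "HTTP/1.1 404 Not Found" or "HTTP/1.x 404 OK".
  if (!preg_match('#^HTTP/\S+\s+(\d{3})\s*(.*)$#', trim($line), $m)) {
    return NULL;
  }
  return array('code' => (int) $m[1], 'reason' => $m[2]);
}

// Hypothetical helper: TRUE when the line shows a 404 code with an "OK" reason.
function has_404_ok_problem($line) {
  $status = parse_status_line($line);
  return $status !== NULL
    && $status['code'] == 404
    && strcasecmp($status['reason'], 'OK') == 0;
}

// Usage against a live site (placeholder URL, requires network access):
// $headers = get_headers('http://www.example.com/asdf');
// var_dump(has_404_ok_problem($headers[0]));
var_dump(has_404_ok_problem('HTTP/1.1 404 OK'));
var_dump(has_404_ok_problem('HTTP/1.1 404 Not Found'));
```

If the site first answers with a redirect, the first status line will be the redirect's, so the later status lines in the returned array would need checking as well.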
I think the .htaccess problem that I mentioned above is a pretty serious bug. If someone activates those rewrite rules and gets the wrong URL spidered, it could wreck their search engine rankings. When my server did something like this to me (due to bad server configuration by the hosting company), it killed my rankings in Google and MSN Search on that particular site. With the new Drupal 4.7 rewrite rules, instead of sending a 404 error, the server tells search engines that your nonexistent page has moved to your home page and that everything is OK. Search engines don't seem to be able to handle that. I hope someone will take a look at it and fix the .htaccess file. Is this the right place to be reporting these kinds of bugs?
Example: visit http:// www. drupal.org/asdf (a nonexistent page; I added spaces to prevent creating a link).
It redirects to http:// drupal.org/asdf (no www), which then sends a 404 error. That is the best way. If this site were using the 4.7 .htaccess file, it would send the browser to the home page and say everything is OK. Does that make sense?
Comment #6
bradlis7 commented
Yep, it works. I checked it using my Web Developer extension.
I also checked another install of Drupal, and it gave me 404 OK, so I know that the change made the difference.
Comment #7
Z2222 commented
The Web Developer Toolbar does everything. I use the toolbar all the time but didn't know it could check headers so I was using the LiveHTTPheaders extension.
Thanks for checking. I'll try the 404 page again. Where did you put the 404 page? In the root directory? 404.html or something like that?
Comment #8
bradlis7 commented
No, I created a Drupal node and set it as the default 404 page in administer->settings. You can look at the header information on the toolbar at Information->View Response Headers.
Comment #9
Z2222 commented
That must have been my mistake: I created a 404 HTML page.
:S
Comment #10
Z2222 commented
bradlis7, were you getting "404 Ok" header errors before you added the 404 page through the Drupal admin? I tried the Drupal admin settings and still get the "404 Ok" error. I think if you are running PHP as CGI you must use the following syntax in PHP, which means changing common.inc:
Comment #11
magico commented
I run Drupal on a server with lighttpd and FastCGI, and 404s work fine without any need to change that line.
@guitarmiami: any news?
Comment #12
Z2222 commented
I still have to make the change to common.inc on all of my sites to avoid the "404 Ok" header problem.
http://tips.webdesign10.com/drupal-seo-404-ok-and-htaccess
Comment #13
magico commented
Try to draw some attention to this so that a senior developer can make a decision.
Comment #14
Z2222 commented
I've mentioned it a few times but never got a reply.
I've also recently pointed out a similar "301 Ok" error on my blog post here:
http://tips.webdesign10.com/drupal-seo-404-ok-and-htaccess
Another user there mentions a "403 Ok" error.
Comment #15
Z2222 commented
I just upgraded a site to Drupal 5.1 (from 4.7.3) and it looks like the common.inc file was updated to have this line:
drupal_set_header('Status: 404 Not Found');
... but I'm getting the 404 OK problem still.
Any ideas?
Comment #16
mdlueck commented
About your suggestion concerning the URL rewrite rules in .htaccess:
I added $1 to the stock Drupal 5.1 .htaccess file, and that is enough tinkering to keep the rest of the URL from being dropped when the URL is rewritten. I noticed that bad behavior too. I will open a bug report against it suggesting simply adding the $1.
So, my working solution is as follows:
Comment #17
mdlueck commented
Well, my proposed fix did not work at one domain / hosting provider, but your proposed solution does work.
I created and updated a bug report, which is as follows:
http://drupal.org/node/158224
Comment #18
scoutbaker commented
The fix for this is documented in http://drupal.org/node/109150 and committed to D5 and D6.
Comment #19
Anonymous (not verified) commented
Automatically closed -- issue fixed for two weeks with no activity.