404 hits to /files directory cached as homepage with broken form actions

joshk - December 10, 2008 - 23:16
Project: Boost
Version: 5.x-1.x-dev
Component: Caching logic
Category: bug report
Priority: normal
Assigned: Unassigned
Status: active
Description

To replicate:

Hit a 404 within the files directory on your Drupal site. For instance:

http://site.org/files/ima404.foo

You should get the front page, possibly with some other artifacts (e.g. broken images).

Try to use a form on that page. It won't work: the form action points to the 404 URL, which Drupal won't process.

This is particularly bad when the page gets cached, because the login form on your front page then stops working for anonymous users.

I think the quick fix is for Boost not to cache any requests made into the /files directory...

#1

joshk - December 10, 2008 - 23:53

The following code at the top of boost_cache_set() resolves the issue, btw:

<?php
global $base_path;

// Skip caching for any request whose URI starts with the files directory.
$files = $base_path . variable_get('file_directory_path', 'files');
if (strpos($_SERVER['REQUEST_URI'], $files) === 0) {
  return FALSE;
}
?>

Open to other ideas, and will keep working through this.
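
One caveat with a plain strpos() prefix test is that it also matches sibling paths such as /files-old. Below is a minimal sketch of a stricter check. The function name example_is_files_request() and its parameters are hypothetical, chosen only for illustration; nothing here is part of Boost or of any attached patch.

```php
<?php
/**
 * Hypothetical helper (not part of Boost): returns TRUE when $request_uri
 * points at the files directory itself or at something beneath it.
 */
function example_is_files_request($request_uri, $base_path, $files_dir) {
  // Drop the query string so only the path part is compared.
  $parts = explode('?', $request_uri, 2);
  $path = $parts[0];

  // Build the prefix without a trailing slash, e.g. "/files".
  $prefix = rtrim($base_path . $files_dir, '/');

  // Match "/files" exactly or "/files/...", but not "/files-old".
  return $path === $prefix || strpos($path, $prefix . '/') === 0;
}
```

Inside boost_cache_set() this would presumably be called with $_SERVER['REQUEST_URI'], $base_path and variable_get('file_directory_path', 'files'), returning FALSE from the cache function when the helper returns TRUE.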

#2

EvanDonovan - April 27, 2009 - 19:05

This problem doesn't occur in the latest 6.x-1.x-dev, does it?

I followed your steps to replicate and it didn't seem to create a file for the 404, but I just wanted to double-check.

#3

mikeytown2 - April 27, 2009 - 20:48

I have not encountered this problem in the 6.x branch.

#4

swellbow - October 26, 2009 - 17:29

Verified that this occurs with boost 6.x-1.13, drupal 6.14. While I agree that this behavior is a real bummer, it's not entirely boost's fault, since Drupal is handling a folder path request and serving up the homepage. This seems quite odd to me. I thought the better solution might be changes in the files .htaccess, but that might screw up public/private downloads etc. Not sure who wants to tackle this. I'll try the change in #1 though.

#5

swellbow - October 26, 2009 - 17:43

Change in #1 worked great for version 6.x-1.14. Attaching as a patch made with TortoiseSVN.

Attachment: boost6x-1.14_files_path_caching.patch (500 bytes)

#6

mikeytown2 - October 26, 2009 - 22:27
Version: 5.x-1.x-dev » 6.x-1.x-dev

@swellbow
How do I reproduce this?
Go to example.com/sites/default/files/file.that.is.not.here.ext and this gets cached?

What is set for your 404 handler?
admin/settings/error-reporting

Public or Private download?

#7

swellbow - October 27, 2009 - 20:50

Doh. OK, mine is not for 404s but for how requests to folders in /files are handled. My apologies if I have muddied the waters by replying into this bug -- maybe mine should be separate? I just tried this same process but with no 404 handler page at admin/settings/error-reporting and DID NOT get this behavior. Looks like I am on to something else, and again perhaps this isn't necessarily a problem with boost but one with drupal's handling of requests to the /files directory.

Having said that, here's what I am experiencing.

This only happens for requests to folders that actually exist. 404s to files or folders that don't exist get thrown into Drupal 404 handling.

Here's how I reproduced this. Log in as admin and clear the Boost cache. Then log out and, as an anonymous user, go to example.com/files or any existing subdirectory under it. You should see the homepage, but with example.com/files in the URL. Then visit the homepage directly as anonymous. View source for the homepage and it will have the Boost cache comment at the bottom. Any forms on the homepage will have actions pointing to the originally requested /files folder.

Also, to clarify, files that exist will simply get served (as expected), while files OR folders that don't exist will produce 404s, also as expected. Only folders or subfolders that exist produce this behavior. What was curious to me is that further requests to the same folder or subfolder will NOT get the boost comment at the bottom of the page -- but the homepage will.

404 handler is set to a custom page /page-not-found.

Public download.

Thanks! This module rules.

#8

mikeytown2 - October 27, 2009 - 22:50
Status: active » needs review

On a shared server it hits the server's default 404 and seems to bypass the ErrorDocument 404 /index.php handler. I have the search404 module running, so I get a search for missing.html when I hit sites/default/files; if I upload a missing.html then I get that file's content.

On a dedicated server, I get the homepage when I hit sites/default/files; I should be getting a 404 but I don't.

This has to do with a core bug and its handling of clean URLs.

  # Rewrite URLs of the form 'x' to the form 'index.php?q=x'.
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]

RewriteCond %{REQUEST_FILENAME} !-d is the problem. It skips the rewrite to index.php if the directory exists, but the .htaccess file also has this:

# Set the default handler.
DirectoryIndex index.php

So when a directory without an index.php file is hit, Apache would normally show a directory index, but that's not possible because of this:
# Don't show directory listings for URLs which map to a directory.
Options -Indexes

Apache is thoroughly confused at this point and falls back to the default configuration in httpd.conf; this is what happens on my shared hosting, since it points to missing.html. Why the dedicated server hits the homepage on a 404 probably has to do with:
# Make Drupal handle any 404 errors.
ErrorDocument 404 /index.php

Which is good, it's what we want, but by this point we have lost our original path, so we get the homepage for a 404 when hitting a directory that exists without an index.php file inside it. Drupal doesn't send out a 404.

There are 2 ways of fixing this.
Option A - This only works if you're having this issue, which would mean this rule is at the core of the problem: a 404 for a missing index.php on directory lookups.

# Make Drupal handle any 404 errors.
ErrorDocument 404 /index.php?q=404

With this you will get a page not found error in the watchdog for the path 404. I don't know if there is a default 404 path that one can access with the q= parameter; if there is, setting this to that path would be ideal.

Option B - Tested & it works

  # Rewrite URLs of the form 'x' to the form 'index.php?q=x'.
  RewriteCond %{REQUEST_FILENAME} !-f
  #RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]





Solving this issue correctly, for the here and now, will probably require a simple check. Try this patch. Also, I'm fairly certain this bug will affect the core cache, but I'm not 100% sure. Doing a drupal_not_found() in the init seems to break the layout, but a 404 is returned. If I were to postpone the 404 until later, that might be a better option... thoughts on this?

Attachment: boost-345484.patch (1.24 KB)
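
For readers without the patch at hand, here is one way such a check could look. This is only a sketch under stated assumptions, not the contents of the attached patch: it assumes that q arrives empty after the ErrorDocument redirect while REQUEST_URI still holds the originally requested path (the form actions described in #7 suggest it does). The function name example_is_mismatched_front() and its parameters are hypothetical.

```php
<?php
/**
 * Hypothetical check (an assumption, not the committed patch): returns TRUE
 * when Drupal received no path in q but the browser asked for something
 * other than the front page.
 */
function example_is_mismatched_front($request_uri, $q, $base_path) {
  // A non-empty q means Drupal was handed a real path; nothing to flag.
  if ($q !== NULL && $q !== '') {
    return FALSE;
  }

  // Ignore query strings so that "/?foo=bar" still counts as the front page.
  $parts = explode('?', $request_uri, 2);
  $path = $parts[0];

  // The front page is reachable as the base path or as index.php under it.
  $front_paths = array($base_path, $base_path . 'index.php');
  return !in_array($path, $front_paths, TRUE);
}
```

A caller could pass $_SERVER['REQUEST_URI'], the value of $_GET['q'] (or NULL if unset) and $base_path, then skip caching or return a 404 when the result is TRUE.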

#9

mikeytown2 - October 27, 2009 - 23:38

Fixes an issue with URL variables on the front page.

Attachment: boost-345484.1.patch (1.33 KB)

#10

mikeytown2 - October 28, 2009 - 00:08

#12

mikeytown2 - October 28, 2009 - 04:19
Version: 6.x-1.x-dev » 5.x-1.x-dev
Status: needs review » active

Committed; moving back to 5.x.

 
 
