I thought I knew how Drupal's "clean" URLs interacted with Apache's mod_rewrite and the file system. The logical flow seems obvious after observing the following code in .htaccess:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
The Apache 1.3 documentation for RewriteCond says:
'-d' (is directory) Treats the TestString as a pathname and tests if it exists and is a directory.
'-f' (is regular file) Treats the TestString as a pathname and tests if it exists and is a regular file.
The code checks for a file or directory that matches the request and, if neither is found, rewrites the clean URL such that Drupal can handle it. But all is not as it seems. I expected RewriteCond %{REQUEST_FILENAME} !-f to require a complete match between the file system and the URL before failing.
Here is the caveat:
Apache will not rewrite a clean URL if Apache finds a file at the base of the Drupal installation whose name matches everything up to the first slash (/) in the request. As an added twist, Apache does the same if the file ends with .txt, .diff, and perhaps some other common extensions.
Here is how to demonstrate the quirk:
- Create an empty file called node.txt at the base of the Drupal installation.
- Browse to the Drupal installation via a clean URL such as node/1.
- Notice the 404 error response from Apache (not drupal_not_found()).
Because this is a quirk in Apache, I don't expect the Drupal team can fix this. Would documentation be appropriate?
Nic
Comments
I've hit that before in
I've hit that before in non-drupal situations. My random testing showed that mod_rewrite treats everything up to a '.' as a directory name, then tests the filename properly. The front end of the filename is essentially subject to both sets of rules, regardless of what the Apache docs say.
Sucks. No, I don't have a fix.
Something like that
It looks to be a bit more complicated than that in two respects.
1. RewriteCond %{REQUEST_FILENAME} !-f fails when a file at the base of the Drupal installation partially matches the request. For example, a request for node/5 will return a 404-error if there is a file called node. A directory of the same name does not yield a 404, however.
2. Not all file extensions are treated equally. It appears that files with common extensions like .txt will also produce the unexpected behavior. Continuing the same example, a file with any of the following names will result in a 404-error:
But a file with any of the next names will not result in a 404-error:
Does that reflect your experience? Is this behavior intentional?
Nic
This is due to Apache's
This is due to Apache's Options +MultiViews, which lets a file such as "node.txt" be requested without its extension to make a cleaner URL. (The feature ultimately comes from language determination, where index.html.en and index.html.de are served to browsers sending the corresponding language preferences.) MultiViews is, in my opinion, quite a nice feature, and I recommend that nearly anyone interested in "clean URLs", or in the tenets behind Cool URIs don't change, turn it on and use it. I, for example, have used http://www.disobey.com/about/morbus since time began to refer to, originally, morbus.htm, then morbus.html, then morbus.shtml, and finally a Drupal node that has been path-aliased. The URL has never changed even though my technology has, and that's one of the prime benefits of MultiViews. You should be able to turn off this behavior by modifying Drupal's default .htaccess from "Options -Indexes" to "Options -Indexes -MultiViews".
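For reference, the change described above would look roughly like the fragment below. This is only a sketch: it assumes the Options line sits in the .htaccess at the base of the Drupal installation, and that the server's AllowOverride setting permits Options to be changed from .htaccess (if it does not, Apache will reject the directive and the change has to go in the main server configuration instead).

```apache
# Before (as described above): Options -Indexes
# After: also switch off MultiViews, so a file such as node.txt is no
# longer considered a match for a request like node/1.
Options -Indexes -MultiViews
```

With MultiViews off, the node.txt test from the original post should fall through to the RewriteRule and reach Drupal rather than producing an Apache 404.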
http://www.disobey.com/
http://www.gamegrene.com/
Developer of Drupal's GameAPI
Nice fix!
Turning off the MultiViews option fixed my clean URLs problem very neatly, after nasty hours of trying to find the reason for the 403 "permission denied" error. THANKS!!!!
One question, though: is MultiViews treating every subdirectory in the URL as a file if it is not a directory? And if you have directory browsing turned off, it seems that when the server tests a clean URL it looks for a file, or something, in the subdirectory, and because directory browsing is turned off it returns a 403 error for not finding the file.
I am not sure I got this right but I was tearing my hair at this problem and now it's fixed. Thank god...