Firefox does not handle filenames with spaces . When a user clicks on an attachment with spaces, the filename is truncated to the first whitespace. While IE & Safari both handle this, Firefox refuses to accept mime headers with unquoted filename parameters. According to Firefox's bugzilla/knowledgebase, Firefox's behavior is the correct behavior and it's a problem with most webservers or web applications. This problem can be easily corrected by surrounding the filename parameter with double quotes.
Current Implementation
In upload_file_download()
'Content-Disposition: '. $disposition .'; filename='. $name
Sould be
'Content-Disposition: '. $disposition .'; filename="'. $name . '"'
However, this breaks if the filename has double quotes within the name.
I'm not sure how to fix this. I don't know if this is the responsibility of mime_header_encode() or upload_file_download(). I'm reporting this bug, and also soliciting help.
| Comment | File | Size | Author |
|---|---|---|---|
| #48 | quoted-string.patch | 3.88 KB | mfb |
| #46 | quoted-string.patch | 3.8 KB | mfb |
| #44 | quoted-string.patch | 2.96 KB | mfb |
| #33 | quoted-string.patch | 1.8 KB | mfb |
| #28 | quoted_string_3.patch | 1.76 KB | mfb |
Comments
Comment #1
AjK commentedI failed to reproduce this on FF 1.5.0.6
I could both upload a file with a space in the name and then download it again.
If FF can't handle filenames with spaces in surely your upload wouldn't work either. Can you upload? (if so, ur FF is half broken by definition!).
Like I said thou, failed to reproduce this.
regards,
--AjK
for info:-
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.0.6) Gecko/20060728 Firefox/1.5.0.6
Comment #2
John Hwang commentedThanks for testing this.
I think I forgot to mention a crucial detail. This only happens on PRIVATE downloads. If private upload is turned on, Drupal handles writing the MIME header.
Upload works fine in both cases.
Comment #3
AjK commentedI changed my test server to use "private". Now when I test this, it still works, downloads no problems.
I note that when I click the link, drupal adds a + symbol into the space (ie, it gets files/test+file_0.txt ) even thou there deff is a space in the file name??
regards,
--AjK
Comment #4
John Hwang commentedAjK,
Thanks once again for testing this.
Please note that the files directory must not be accessible, ie it must not be a directory within the directory of your Drupal installation. If you simply switched from "public" to "private", this does not make the directory unavailable.
Move the "files" directory outside the Drupal directory, then turn on private file uploads. The URL will change to "[domain name]/system/files/[filename]".
Also note that "+" is synonymous with "%20" or "whitespace".
Please try again and let me know the results. Thanks again for the help.
Comment #5
AjK commentedYour not going to like this. I moved my files/ directory well out of the web tree (/usr/local/TEST82614/files) and I still cannot reproduce your bug. I even ran Ethereal and captured the packets between my workstation and test server,
regards,
--AjK
Comment #6
jonbob commentedI can reproduce this issue. I wonder if the difference is that in AjK's case, the file in question is a text file and so is being served with an inline disposition? It would be worth testing again with a file that is served as "attachment" instead.
Comment #7
Steven commentedThis might have something to do with the fact that mime_header_encode() does not consider spaces as characters that need encoding. However, I remember many people were annoyed when we applied this blindly to e.g. email subjects, so we can't just make it stricter. Perhaps we need to define a strict and loose version of this?
Comment #8
bmargulies commentedI definitely experienced this with PDF files.
Comment #9
toeofdestiny commentedHere's what upload.module should do (I have here the same problem as described above) :
- See page 3 of http://www.ietf.org/rfc/rfc2183.txt :
So, there is 3 ways to encode the file name, and Drupal only handles 2 of them.
There should be in upload.module something like this :
Where :
See page 12 of http://www.ietf.org/rfc/rfc2045.txt .
This include spaces because, as stated in rfc2183 : parameter value containing only non-`tspecials' characters SHOULD be represented as a single `token'.
Wish this helps solve the problem, because the problem is real around here too (Firefox, Ubuntu).
Comment #10
toeofdestiny commentedHere's my current patch. Seems to work :
Comment #11
zok27 commentedhello.
i also had a problem opening files with spaces in it.
oddly enough though, the problem only keeps appearing with the public setting.
when i changed to the private setting, it was okay. which is fine by me for now.
using 4.7.3 based off a wamp setup.
uploaded using core upload.module
localhosted, tested remotely.
Comment #12
Steven commentedAs I said before, this should be added to mime_header_encode() so it can be re-used (optionally!) by other code.
Comment #13
martin_halford commentedI was also having problems with filenames with spaces being truncated in Firefox and the above patch has worked for me - thanks! My files are Private in a directory outside of the web directories. Just a thought - I have the Firefox extension DownThemAll! installed (brilliant when downloading from a document management repository) and I wondered whether this might have changed Firefox's default behaviour?
Comment #14
JamieR commentedAn issue for me as well. Here is a patch file.
Comment #15
killes@www.drop.org commentedthis should be fixed in HEAD and then backported.
The patch doesn't conform to our coding style.
Comment #16
Chris Johnson commented@Steven: How you propose adding it to mime_header_encode() so that its use is optional?
@killes: What specifically is wrong with the coding style? The all caps CASE comments?
I've got 50 sites running 4.6 and 4.7, and this is affecting my users.
Comment #17
RobRoy commentedSome things I caught:
- First comment should end with a period.
- Comments need cleanup, no CASE stuff, just explain in a clean sentence.
- String concat should look like '"'. $file->filename .'"'; with spaces between dots and non-quotes.
- Control structures should have a break after } so like...
if (cond) {
// Do something.
}
else {
// Something else.
}
http://drupal.org/node/318
Comment #18
Chris Johnson commentedHere is the 4.7 patch re-rolled to fix the coding style, including the picayune missing period on the first comment (but not missing periods on other comments pre-existing in the module).
I would have created a 5.0 patch too, but this area of the upload.module has been extensively modified in 5.0, and the bug may not even exist there anymore. I don't have time to test it in 5.0 right now.
Comment #19
killes@www.drop.org commentedI'll backport once this is committed to 5/HEAD.
Comment #20
killes@www.drop.org commented...
Comment #21
jasoncd commentedI am unfortunately very confused. I just updated from Drupal 4.7.4 to 4.7.6. I see that the Content-Disposition lines have been removed, but I can't find any record of why. This causes files attached to content to always be downloaded as index.php, instead of the actual file name. I had previous hacked the module to fix the space problem by using the technique I posted at http://drupal.org/node/23759. I only now realize that that upload module is not the same as this upload module, which is frustrating.
Comment #22
mfbThis patch adds an optional second parameter to mime_header_encode for quoted-string encoding of ASCII strings containing special characters (e.g. space). This would be helpful for contrib modules such as audio.
Comment #23
mfbThis tweak also escapes the backslash, in addition to double quote.
Comment #24
Chris Johnson commentedWhat needs to happen to get some movement on this issue? We have patches for both 4.7 and 5 posted here. This bug causes lots of problems with Firefox, because all modern file systems support filenames with embedded spaces.
Comment #25
Steven commentedI think the regexp needs to be inverted. Instead of a blacklist of characters, let's have a whitelist. Only if the string consists of characters XYZ should we not quote it. Quoting isn't harmful anyway.
Also please pay attention to code style and readability... there are loads of unnecessarily escaped characters in there, especially in the regexp. The choice of regexp delimiter is also poor, as it forces more escaping. Regexps are unreadable enough as it is, let's not complicate them more.
We should also avoid phrasings like "special characters". This can mean a couple hundred things depending on context. Let's be clear what we're talking about: "make it safe for embedding as a MIME header token according to RFC XXXX" or something.
Comment #26
mfbThe only reason I used a blacklist was to follow RFC 2183 -- http://drupal.org/node/82614#comment-143416 -- with regards to its "tspecials" characters. We could come up with our own more readable whitelist (maybe A-Za-z0-9 and _) but that wouldn't be strictly inline with the RFC..
Comment #27
mfbThe only reason I used a blacklist was to follow RFC 2183 -- http://drupal.org/node/82614#comment-143416 -- with regards to its "tspecials" characters. We could come up with our own more readable whitelist (maybe A-Za-z0-9 and _) but that wouldn't be strictly inline with the RFC..
Comment #28
mfbHere's an updated patch for HEAD. I used a whitelist of OK alphanumerics (and underscore) rather than blacklisting the known "tspecials" characters.
Comment #29
drewish commentedsubscribing. looks like this'll get bumped to 7.
Comment #31
liquidcms commentedwow.. this has been around for 3 yrs and still no fix.. scary.. subscribing..
Comment #32
catchStill applies with offset but needs updating for coding standards concat operator spacing changes.
Comment #33
mfbHere ya go.
Comment #34
matt2000 commentedPatch looks good and applies cleanly, but doesn't seem to actually solve the original issue...
The original issue, by the way, no longer applies to Firefox, as described, but does occur in Opera.
Comment #35
cburschka1.) Looks like you've extended the API, but not actually used the extension. If you define a new parameter for the function, shouldn't it be used wherever the content-disposition header is generated?
2.) Perhaps the parameter should be called "$strict" or "$alphanumeric" instead of "$quoted"...
3.) Does the RFC forbid filenames being quote-encased if they *don't* contain a space? If it does not, why not apply the quotes universally and forget about checking this? All you need to do is
As in the original issue, but with addslashes to escape double-quotes and backslashes.
Comment #36
mfbThere is no longer any place in Drupal core where this is a problem. It's an API extension for contrib modules which do pass along arbitrary file names and such in headers.
I don't care what the parameter is called. Basically it should not be set for headers such as an ASCII e-mail subject where spaces do not need to be wrapped in quotes.
Yes according to the RFC you should only add the quotes when needed. Of course the main standard for e-mail and HTTP-related RFCs is interoperability not strictly following RFCs to the letter.
Comment #37
matt2000 commentedI haven't tested D7 thoroughly, but this is definitely still a problem in 6.6 for certain browsers (Opera, FF<3), and neither upload_file_download() nor mime_header_encode() have changed according to api.drupal.org.
I've tried several approaches and combinations of approaches in this thread, and have not yet found a way to reliably allow spaces in filenames when the download method is private.
Can anyone else confirm the issue still exists?
Comment #38
mfbYou can see here http://api.drupal.org/api/function/upload_file_download/4.7 that file names were passed in the HTTP headers but I was under the impression they no longer are in drupal core. Contrib modules such as audio module on the other hand do need to pass arbitrary file names in headers.
Comment #39
matt2000 commentedOK, my mistake; apparently the old 'bug' has been replaced with a new one; I get same problem now even with a 'space-free' filename. I'll report back after further testing and/or start a new issue.
As for mfb's latest patch, it does extend the API as needed and applies cleanly, so I'm marking RTBC.
Comment #40
drewish commentedSorry but this really needs some unit tests.
Comment #41
mfbWell actually it looks like all of unicode.inc needs unit tests..
Comment #42
matt2000 commentedJust reporting back as promised: my look-a-like issue was actually being caused by an error in fckeditor_file_download(), which has been fixed in the most recent rc.
Comment #43
drewish commentedsee #276453: Tests needed: unicode.inc
Comment #44
mfbAdded a bare minimum of unit tests..
Comment #45
drewish commentedThe unit tests look good but I don't think the docs for the $quoted parameter will pass snuff with the core committers:
The boolean after @param doesn't seem to match the coding standards. And the description doesn't really make sense. What is quoted-string encoding? When and why would I want to use it? When would I not?
Edit: I forgot to mention that once that's corrected I'd say this is RTBC.
Comment #46
mfbbleh, too many coding standards.. :p
BTW it might be cool, if mime_header_encode() supported an array of header field parameters as input; these would automagically be assembled into a valid header string. I.e.
mime_header_encode(array('attachment' => NULL, 'filename' => $filemime, 'modification-date' => date('r')));instead of'attachment; filename=' . mime_header_encode($filename, TRUE) . '; modification-date=' . mime_header_encode(date('r'), TRUE);Comment #47
drewish commentedThat looks a lot better, could we reference the appropriate RFC in the comments? I'm assuming that would be 2045...
Comment #48
mfbThere are at least 4 RFCs that come into play, but seems like 2045 is the best.
Comment #50
mfbComment #51
cburschkaIt's a good idea to improve standard compliance. I'm not sure about the name of the "quoted" parameter, but I can't think of a better one.
There are no code-style problems and the tests pass, so RTBC.
However, you may have to replace test strings like "my test.txt" with something randomly generated, which seems to be the norm,
Comment #52
agerson commentedAny chance of getting this into a drupal 6.x release?
Comment #54
webchickComment #55
webchickNot to drag out a 3-year old issue even longer, but...
Could someone help me out with some clear steps and/or sample code to reproduce this? Tried the steps described by John Hwang (private files, non-web-accessible directory) with upload.module in both Firefox 3.5rc1 and Safari 4.0, the file gets auto-%20ed.
I saw in #36 that this is no longer reproduceable in core. Not to be dense, but then why can't contrib do whatever upload module is doing to avoid the problem?
You can also read this as "We need a more 'real world' test." The current tests really are just testing PHP at this point.
Comment #56
mfbAs I mentioned at #36, this is an API feature that currently has no bearing on core that I know of, but is of use to contrib modules that need to pass along arbitrary strings in headers.
Comment #57
mfbSince this isn't used in core I should add a test module, similar to file_test.module. Or, perhaps I could just reuse file_test.module and add a test to file.test.