Mime filename should be double quoted to handle filenames with spaces

John Hwang - September 5, 2006 - 21:35
Project:Drupal
Version:7.x-dev
Component:base system
Category:bug report
Priority:normal
Assigned:Unassigned
Status:needs work
Description

Firefox does not handle filenames with spaces . When a user clicks on an attachment with spaces, the filename is truncated to the first whitespace. While IE & Safari both handle this, Firefox refuses to accept mime headers with unquoted filename parameters. According to Firefox's bugzilla/knowledgebase, Firefox's behavior is the correct behavior and it's a problem with most webservers or web applications. This problem can be easily corrected by surrounding the filename parameter with double quotes.

Current Implementation

In upload_file_download()

'Content-Disposition: '. $disposition .'; filename='. $name

Sould be

'Content-Disposition: '. $disposition .'; filename="'. $name . '"'

However, this breaks if the filename has double quotes within the name.

I'm not sure how to fix this. I don't know if this is the responsibility of mime_header_encode() or upload_file_download(). I'm reporting this bug, and also soliciting help.

#1

AjK - September 6, 2006 - 22:57

I failed to reproduce this on FF 1.5.0.6

I could both upload a file with a space in the name and then download it again.

If FF can't handle filenames with spaces in surely your upload wouldn't work either. Can you upload? (if so, ur FF is half broken by definition!).

Like I said thou, failed to reproduce this.

regards,
--AjK

for info:-
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.0.6) Gecko/20060728 Firefox/1.5.0.6

#2

John Hwang - September 7, 2006 - 13:20
Title:Mime filename should be double quoted to handle filenames with spaces. Currently breaks in firefox» Doesn't work for PRIVATE downloads

Thanks for testing this.

I think I forgot to mention a crucial detail. This only happens on PRIVATE downloads. If private upload is turned on, Drupal handles writing the MIME header.

Upload works fine in both cases.

#3

AjK - September 7, 2006 - 21:08

I changed my test server to use "private". Now when I test this, it still works, downloads no problems.

I note that when I click the link, drupal adds a + symbol into the space (ie, it gets files/test+file_0.txt ) even thou there deff is a space in the file name??

regards,
--AjK

#4

John Hwang - September 7, 2006 - 21:30

AjK,

Thanks once again for testing this.

Please note that the files directory must not be accessible, ie it must not be a directory within the directory of your Drupal installation. If you simply switched from "public" to "private", this does not make the directory unavailable.

Move the "files" directory outside the Drupal directory, then turn on private file uploads. The URL will change to "[domain name]/system/files/[filename]".

Also note that "+" is synonymous with "%20" or "whitespace".

Please try again and let me know the results. Thanks again for the help.

#5

AjK - September 7, 2006 - 22:13

Your not going to like this. I moved my files/ directory well out of the web tree (/usr/local/TEST82614/files) and I still cannot reproduce your bug. I even ran Ethereal and captured the packets between my workstation and test server,

GET /system/files/test+file+again.txt HTTP/1.1
Host: d5.x
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.0.6) Gecko/20060728 Firefox/1.5.0.6
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://d5.x/node/4
Cookie: PHPSESSID=43lb61vok0fp0u8lt9gvrm4f31

HTTP/1.1 200 OK
Date: Thu, 07 Sep 2006 23:05:52 GMT
Server: Apache/1.3.36 (Unix) PHP/5.1.4
X-Powered-By: PHP/5.1.4
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Last-Modified: Thu, 07 Sep 2006 23:05:52 GMT
Cache-Control: no-store, no-cache, must-revalidate
Cache-Control: post-check=0, pre-check=0
Pragma: no-cache
Content-Length: 11
Content-Disposition: inline; filename=test file again.txt
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/plain; name=test file again.txt

test file 2

regards,
--AjK

#6

JonBob - September 8, 2006 - 14:37
Title:Doesn't work for PRIVATE downloads» Mime filename should be double quoted to handle filenames with spaces

I can reproduce this issue. I wonder if the difference is that in AjK's case, the file in question is a text file and so is being served with an inline disposition? It would be worth testing again with a file that is served as "attachment" instead.

#7

Steven - September 8, 2006 - 16:30

This might have something to do with the fact that mime_header_encode() does not consider spaces as characters that need encoding. However, I remember many people were annoyed when we applied this blindly to e.g. email subjects, so we can't just make it stricter. Perhaps we need to define a strict and loose version of this?

#8

bmargulies - September 16, 2006 - 01:40

I definitely experienced this with PDF files.

#9

toeofdestiny - October 6, 2006 - 19:31

Here's what upload.module should do (I have here the same problem as described above) :

- See page 3 of http://www.ietf.org/rfc/rfc2183.txt :

NOTE ON PARAMETER VALUE LENGHTS: A short (length <= 78 characters)
   parameter value containing only non-`tspecials' characters SHOULD be
   represented as a single `token'.  A short parameter value containing
   only ASCII characters, but including `tspecials' characters, SHOULD
   be represented as `quoted-string'.  Parameter values longer than 78
   characters, or which contain non-ASCII characters, MUST be encoded as
   specified in [RFC 2184].

So, there is 3 ways to encode the file name, and Drupal only handles 2 of them.

There should be in upload.module something like this :

      if (node_access('view', $node)) {
        $name = mime_header_encode($file->filename);

        // BEGIN PATCH ****
        if ( there is a tspecials character in $name ) $name = '"' . $name . '"';
        // END PATCH ****

        $type = mime_header_encode($file->filemime);
        // Serve images and text inline for the browser to display rather than download.
        $disposition = ereg('^(text/|image/)', $file->filemime) ? 'inline' : 'attachment';
        return array(
          'Content-Type: '. $type .'; name='. $name,
          'Content-Length: '. $file->filesize,
          'Content-Disposition: '. $disposition .'; filename='. $name
        );
      }

Where :

tspecials :=  "(" / ")" / "<" / ">" / "@" /
                   "," / ";" / ":" / "\" / <">
                   "/" / "[" / "]" / "?" / "="
                   ; Must be in quoted-string,
                   ; to use within parameter values

See page 12 of http://www.ietf.org/rfc/rfc2045.txt .

This include spaces because, as stated in rfc2183 : parameter value containing only non-`tspecials' characters SHOULD be represented as a single `token'.

Wish this helps solve the problem, because the problem is real around here too (Firefox, Ubuntu).

#10

toeofdestiny - October 6, 2006 - 19:59

Here's my current patch. Seems to work :

<?php
// PATCH : upload.module - drupal 4.7.3 ----------------------------------------
function upload_file_download($file) {
 
$file = file_create_path($file);
 
$result = db_query("SELECT f.* FROM {files} f WHERE filepath = '%s'", $file);
  if (
$file = db_fetch_object($result)) {
    if (
user_access('view uploaded files')) {
     
$node = node_load($file->nid);
      if (
node_access('view', $node)) {
       
// PATCH! ==============================================================

        // CASE 1 : only ascii, one token, no tspecials.
       
$name = $file->filename;
        if (
preg_match( '/[^\x20-\x7E]/', $name ) )
        {
         
// CASE 2 : non-ascii --> needs encoding.
         
$name = mime_header_encode($file->filename);
        }
        else if (
preg_match( '/[ \(\)\<\>\@\,\;\:\\\"\/\[\]\?\=]/', $name ) )
        {
         
// CASE 3 : n tokens, tspecials characters, SHOULD be 'quoted-string'.
          // tsp : "(" ")" "<" ">" "@" "," ";" ":" "\" <"> "/" "[" "]" "?" "="
         
$name = "\"$name\"";
        }
       
       
// =====================================================================
       
$type = mime_header_encode($file->filemime);
       
       
// Serve images and text inline for the browser to display rather than download.
       
$disposition = ereg('^(text/|image/)', $file->filemime) ? 'inline' : 'attachment';
        return array(
         
'Content-Type: '. $type .'; name='. $name,
         
'Content-Length: '. $file->filesize,
         
'Content-Disposition: '. $disposition .'; filename='. $name
       
);
      }
      else {
        return -
1;
      }
    }
    else {
      return -
1;
    }
  }
}
?>

#11

zok27 - October 10, 2006 - 18:54

hello.
i also had a problem opening files with spaces in it.
oddly enough though, the problem only keeps appearing with the public setting.
when i changed to the private setting, it was okay. which is fine by me for now.

using 4.7.3 based off a wamp setup.
uploaded using core upload.module
localhosted, tested remotely.

#12

Steven - October 14, 2006 - 11:00

As I said before, this should be added to mime_header_encode() so it can be re-used (optionally!) by other code.

#13

martin_halford - October 26, 2006 - 13:44

I was also having problems with filenames with spaces being truncated in Firefox and the above patch has worked for me - thanks! My files are Private in a directory outside of the web directories. Just a thought - I have the Firefox extension DownThemAll! installed (brilliant when downloading from a document management repository) and I wondered whether this might have changed Firefox's default behaviour?

#14

JamieR - November 15, 2006 - 23:23
Status:active» needs review

An issue for me as well. Here is a patch file.

AttachmentSizeStatusTest resultOperations
upload.module.patch_2.txt1.12 KBIgnoredNoneNone

#15

killes@www.drop.org - November 17, 2006 - 10:12
Version:4.7.3» 5.x-dev
Status:needs review» needs work

this should be fixed in HEAD and then backported.

The patch doesn't conform to our coding style.

#16

Chris Johnson - December 22, 2006 - 13:28

@Steven: How you propose adding it to mime_header_encode() so that its use is optional?

@killes: What specifically is wrong with the coding style? The all caps CASE comments?

I've got 50 sites running 4.6 and 4.7, and this is affecting my users.

#17

RobRoy - December 24, 2006 - 07:27

Some things I caught:
- First comment should end with a period.
- Comments need cleanup, no CASE stuff, just explain in a clean sentence.
- String concat should look like '"'. $file->filename .'"'; with spaces between dots and non-quotes.
- Control structures should have a break after } so like...

if (cond) {
// Do something.
}
else {
// Something else.
}

http://drupal.org/node/318

#18

Chris Johnson - January 5, 2007 - 11:15
Version:5.x-dev» 4.7.5
Status:needs work» reviewed & tested by the community

Here is the 4.7 patch re-rolled to fix the coding style, including the picayune missing period on the first comment (but not missing periods on other comments pre-existing in the module).

I would have created a 5.0 patch too, but this area of the upload.module has been extensively modified in 5.0, and the bug may not even exist there anymore. I don't have time to test it in 5.0 right now.

AttachmentSizeStatusTest resultOperations
upload47.patch.txt1.23 KBIgnoredNoneNone

#19

killes@www.drop.org - January 26, 2007 - 00:14
Status:reviewed & tested by the community» needs work

I'll backport once this is committed to 5/HEAD.

#20

killes@www.drop.org - January 26, 2007 - 00:33
Version:4.7.5» 5.x-dev

...

#21

jasoncd - January 30, 2007 - 23:06
Version:5.x-dev» 4.7.6

I am unfortunately very confused. I just updated from Drupal 4.7.4 to 4.7.6. I see that the Content-Disposition lines have been removed, but I can't find any record of why. This causes files attached to content to always be downloaded as index.php, instead of the actual file name. I had previous hacked the module to fix the space problem by using the technique I posted at http://drupal.org/node/23759. I only now realize that that upload module is not the same as this upload module, which is frustrating.

#22

mfb - February 17, 2007 - 23:42
Version:4.7.6» 5.x-dev
Status:needs work» needs review

This patch adds an optional second parameter to mime_header_encode for quoted-string encoding of ASCII strings containing special characters (e.g. space). This would be helpful for contrib modules such as audio.

AttachmentSizeStatusTest resultOperations
quoted_string_0.patch1.48 KBIgnoredNoneNone

#23

mfb - February 18, 2007 - 05:33
Component:upload.module» other

This tweak also escapes the backslash, in addition to double quote.

AttachmentSizeStatusTest resultOperations
quoted_string_2.patch1.47 KBIgnoredNoneNone

#24

Chris Johnson - March 28, 2007 - 12:28

What needs to happen to get some movement on this issue? We have patches for both 4.7 and 5 posted here. This bug causes lots of problems with Firefox, because all modern file systems support filenames with embedded spaces.

#25

Steven - March 29, 2007 - 01:13
Status:needs review» needs work

I think the regexp needs to be inverted. Instead of a blacklist of characters, let's have a whitelist. Only if the string consists of characters XYZ should we not quote it. Quoting isn't harmful anyway.

Also please pay attention to code style and readability... there are loads of unnecessarily escaped characters in there, especially in the regexp. The choice of regexp delimiter is also poor, as it forces more escaping. Regexps are unreadable enough as it is, let's not complicate them more.

We should also avoid phrasings like "special characters". This can mean a couple hundred things depending on context. Let's be clear what we're talking about: "make it safe for embedding as a MIME header token according to RFC XXXX" or something.

#26

mfb - March 29, 2007 - 03:56

The only reason I used a blacklist was to follow RFC 2183 -- http://drupal.org/node/82614#comment-143416 -- with regards to its "tspecials" characters. We could come up with our own more readable whitelist (maybe A-Za-z0-9 and _) but that wouldn't be strictly inline with the RFC..

#27

mfb - March 29, 2007 - 03:56

The only reason I used a blacklist was to follow RFC 2183 -- http://drupal.org/node/82614#comment-143416 -- with regards to its "tspecials" characters. We could come up with our own more readable whitelist (maybe A-Za-z0-9 and _) but that wouldn't be strictly inline with the RFC..

#28

mfb - June 19, 2007 - 00:46
Version:5.x-dev» 6.x-dev
Component:other» base system
Status:needs work» needs review

Here's an updated patch for HEAD. I used a whitelist of OK alphanumerics (and underscore) rather than blacklisting the known "tspecials" characters.

AttachmentSizeStatusTest resultOperations
quoted_string_3.patch1.76 KBIgnoredNoneNone

#29

drewish - July 27, 2007 - 18:15
Version:6.x-dev» 7.x-dev

subscribing. looks like this'll get bumped to 7.

#31

liquidcms - June 4, 2008 - 05:36

wow.. this has been around for 3 yrs and still no fix.. scary.. subscribing..

#32

catch - June 4, 2008 - 11:59
Status:needs review» needs work

Still applies with offset but needs updating for coding standards concat operator spacing changes.

#33

mfb - June 10, 2008 - 00:32
Status:needs work» needs review

Here ya go.

AttachmentSizeStatusTest resultOperations
quoted-string.patch1.8 KBIdleFailed: Failed to install HEAD.View details | Re-test

#34

matt2000 - November 9, 2008 - 06:24

Patch looks good and applies cleanly, but doesn't seem to actually solve the original issue...

The original issue, by the way, no longer applies to Firefox, as described, but does occur in Opera.

#35

Arancaytar - November 9, 2008 - 15:00
Status:needs review» needs work

1.) Looks like you've extended the API, but not actually used the extension. If you define a new parameter for the function, shouldn't it be used wherever the content-disposition header is generated?

2.) Perhaps the parameter should be called "$strict" or "$alphanumeric" instead of "$quoted"...

3.) Does the RFC forbid filenames being quote-encased if they *don't* contain a space? If it does not, why not apply the quotes universally and forget about checking this? All you need to do is

<?php
'Content-Disposition: '. $disposition .'; filename="'. addslashes($name) . '"'
?>

As in the original issue, but with addslashes to escape double-quotes and backslashes.

#36

mfb - November 9, 2008 - 20:17

There is no longer any place in Drupal core where this is a problem. It's an API extension for contrib modules which do pass along arbitrary file names and such in headers.

I don't care what the parameter is called. Basically it should not be set for headers such as an ASCII e-mail subject where spaces do not need to be wrapped in quotes.

Yes according to the RFC you should only add the quotes when needed. Of course the main standard for e-mail and HTTP-related RFCs is interoperability not strictly following RFCs to the letter.

#37

matt2000 - November 9, 2008 - 20:32
Status:needs work» needs review

There is no longer any place in Drupal core where this is a problem.

I haven't tested D7 thoroughly, but this is definitely still a problem in 6.6 for certain browsers (Opera, FF<3), and neither upload_file_download() nor mime_header_encode() have changed according to api.drupal.org.

I've tried several approaches and combinations of approaches in this thread, and have not yet found a way to reliably allow spaces in filenames when the download method is private.

Can anyone else confirm the issue still exists?

#38

mfb - November 9, 2008 - 20:49

You can see here http://api.drupal.org/api/function/upload_file_download/4.7 that file names were passed in the HTTP headers but I was under the impression they no longer are in drupal core. Contrib modules such as audio module on the other hand do need to pass arbitrary file names in headers.

#39

matt2000 - November 9, 2008 - 21:14
Status:needs review» reviewed & tested by the community

OK, my mistake; apparently the old 'bug' has been replaced with a new one; I get same problem now even with a 'space-free' filename. I'll report back after further testing and/or start a new issue.

As for mfb's latest patch, it does extend the API as needed and applies cleanly, so I'm marking RTBC.

#40

drewish - November 10, 2008 - 00:28
Status:reviewed & tested by the community» needs work

Sorry but this really needs some unit tests.

#41

mfb - November 10, 2008 - 01:38

Well actually it looks like all of unicode.inc needs unit tests..

#42

matt2000 - November 12, 2008 - 11:02

Just reporting back as promised: my look-a-like issue was actually being caused by an error in fckeditor_file_download(), which has been fixed in the most recent rc.

#43

drewish - November 16, 2008 - 12:07

#44

mfb - March 2, 2009 - 02:39
Status:needs work» needs review

Added a bare minimum of unit tests..

AttachmentSizeStatusTest resultOperations
quoted-string.patch2.96 KBIdleFailed: Failed to install HEAD.View details | Re-test

#45

drewish - March 2, 2009 - 04:36

The unit tests look good but I don't think the docs for the $quoted parameter will pass snuff with the core committers:

+ * @param boolean $quoted
+ *   If TRUE, perform quoted-string encoding of non-alphanumeric ASCII.

The boolean after @param doesn't seem to match the coding standards. And the description doesn't really make sense. What is quoted-string encoding? When and why would I want to use it? When would I not?

Edit: I forgot to mention that once that's corrected I'd say this is RTBC.

#46

mfb - March 2, 2009 - 06:09

bleh, too many coding standards.. :p

BTW it might be cool, if mime_header_encode() supported an array of header field parameters as input; these would automagically be assembled into a valid header string. I.e.

<?php
mime_header_encode
(array('attachment' => NULL, 'filename' => $filemime, 'modification-date' => date('r')));
?>
instead of
<?php
'attachment; filename=' . mime_header_encode($filename, TRUE) . '; modification-date=' . mime_header_encode(date('r'), TRUE);
?>

AttachmentSizeStatusTest resultOperations
quoted-string.patch3.8 KBIdleFailed: Failed to install HEAD.View details | Re-test

#47

drewish - March 2, 2009 - 06:35

That looks a lot better, could we reference the appropriate RFC in the comments? I'm assuming that would be 2045...

#48

mfb - March 2, 2009 - 06:48

There are at least 4 RFCs that come into play, but seems like 2045 is the best.

AttachmentSizeStatusTest resultOperations
quoted-string.patch3.88 KBIdleInvalid PHP syntax in unicode.inc.View details | Re-test

#49

System Message - March 4, 2009 - 07:55
Status:needs review» needs work

The last submitted patch failed testing.

#50

mfb - March 17, 2009 - 08:34
Status:needs work» needs review

#51

Arancaytar - May 12, 2009 - 20:57
Status:needs review» reviewed & tested by the community

It's a good idea to improve standard compliance. I'm not sure about the name of the "quoted" parameter, but I can't think of a better one.

There are no code-style problems and the tests pass, so RTBC.

However, you may have to replace test strings like "my test.txt" with something randomly generated, which seems to be the norm,

#52

agerson - May 21, 2009 - 18:46

Any chance of getting this into a drupal 6.x release?

#53

System Message - May 25, 2009 - 03:20
Status:reviewed & tested by the community» needs work

The last submitted patch failed testing.

#54

webchick - May 25, 2009 - 04:50
Status:needs work» reviewed & tested by the community

#55

webchick - June 20, 2009 - 08:24
Status:reviewed & tested by the community» needs work

Not to drag out a 3-year old issue even longer, but...

Could someone help me out with some clear steps and/or sample code to reproduce this? Tried the steps described by John Hwang (private files, non-web-accessible directory) with upload.module in both Firefox 3.5rc1 and Safari 4.0, the file gets auto-%20ed.

I saw in #36 that this is no longer reproduceable in core. Not to be dense, but then why can't contrib do whatever upload module is doing to avoid the problem?

You can also read this as "We need a more 'real world' test." The current tests really are just testing PHP at this point.

#56

mfb - June 20, 2009 - 08:53

As I mentioned at #36, this is an API feature that currently has no bearing on core that I know of, but is of use to contrib modules that need to pass along arbitrary strings in headers.

#57

mfb - June 24, 2009 - 19:55

Since this isn't used in core I should add a test module, similar to file_test.module. Or, perhaps I could just reuse file_test.module and add a test to file.test.

 
 

Drupal is a registered trademark of Dries Buytaert.