As it stands, Imagefield saves an uploaded image file with its original filename, character for character. For example:
"this%20file.jpg" is saved as "this%20file.jpg"
"that+file.jpg" is saved as "that+file.jpg"
etc...
Filenames with encoded entities (or %'s in general) pose a problem when trying to access the image via a URL. Currently they can break the new image upload previews for Imagefield, direct access to the image via URL, scaled versions via imagecache, thickboxed versions, etc.
All that is really needed is a little filename cleanup in _imagefield_widget_prepare_form_values() to avoid this.
Right after:
// Attach new files
if ($file = file_check_upload($fieldname . '_upload')) {
$file = (array)$file;
We can clean things up by adding:
// urldecode and clean up the filename to get rid of problems
$file['filename'] = str_replace("%","",urldecode($file['filename']));
It's a very simple change that avoids problems down the line and (if nothing else) cleans up some very ugly filenames people seem to get when using saved web images.
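To make the effect of that one-liner concrete, here is a minimal standalone sketch of the same cleanup. The function name `example_clean_upload_filename()` is hypothetical and is not part of Imagefield; only the `str_replace()`/`urldecode()` expression comes from the proposal above.

```php
<?php
// Hypothetical helper, for illustration only (not an Imagefield function).
function example_clean_upload_filename($filename) {
  // Decode URL-encoded sequences ("%20" and "+" both become a space),
  // then drop any literal "%" that is left over.
  return str_replace('%', '', urldecode($filename));
}

echo example_clean_upload_filename('this%20file.jpg'), "\n"; // this file.jpg
echo example_clean_upload_filename('that+file.jpg'), "\n";   // that file.jpg
```

Note that the result can still contain spaces and other characters, so this only addresses the % problem described here.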
| Comment | File | Size | Author |
|---|---|---|---|
| #2 | imagefield-enc-chars.patch | 633 bytes | Moonshine |
Comments
Comment #1
catch
Patch needs to be in unified diff format. This change looks sensible though.
Comment #2
Moonshine commented
Well, hopefully this will do. I've made several other changes to imagefield, so the line numbering may be off. It's just the simple addition described above.
Comment #3
sun
Good catch; however, that is not enough. I was a bit shocked that there is neither such a filename cleanup in Drupal's file.inc/upload.module nor a function to clean a filename yet, something like file_make_url_safe().
Any special char needs to be removed from a filename. For example, try downloading this:
http://testdrupal/files/1 Copy of %58 2% &# in €ur%C3%B6.jpg (Example link)
Ran some tests:
While %20 seems to be correctly converted, and @ or $ seem to be no problem, any other character, especially ones such as #, & or €, will break a link pointing to such a file.
Bear in mind that almost any filesystem is able to save files with such filenames. In my tests, Drupal correctly decoded almost all chars except € and ö before generating the file. Regarding German umlauts, I'm not sure whether they were misinterpreted by Apache running on Windows, since I've seen some of our clients using such filenames on production sites running on Unix.
However, characters like # will definitely break the URI (and not the filename).
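A small sketch of why # breaks the URI rather than the filename, using only PHP's built-in `parse_url()` and `rawurlencode()`. The filename `a#b.jpg` is a made-up example, and the host is the test host from the link above.

```php
<?php
// Made-up filename containing "#"; the base URL is the test host used above.
$base = 'http://testdrupal/files/';
$name = 'a#b.jpg';

// Building the link naively: everything after "#" is parsed as a fragment,
// so the path that reaches the server is cut short.
$parts = parse_url($base . $name);
echo $parts['path'], "\n";     // /files/a
echo $parts['fragment'], "\n"; // b.jpg

// Percent-encoding the filename keeps "#" inside the path.
$encoded = rawurlencode($name);
echo $encoded, "\n";           // a%23b.jpg
```

The file on disk is untouched in both cases; only the link pointing at it changes.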
I wonder if this bug should really be discussed solely for Imagefield. (Tests run with upload.module)
I'd suggest moving this issue to Drupal's file system queue.
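As a rough idea of what such a cleanup function could look like, here is a minimal sketch. The name `example_file_make_url_safe()` and the allowed character set are assumptions made for illustration; no such function exists in Drupal's file.inc, and real code would likely want transliteration instead of simply replacing characters.

```php
<?php
// Hypothetical sketch; neither the name nor the whitelist comes from Drupal.
function example_file_make_url_safe($filename) {
  // Decode percent-escapes first, so "%20" is handled as a space.
  $name = rawurldecode($filename);
  // Replace each run of characters outside a conservative ASCII whitelist
  // (letters, digits, dot, underscore, hyphen) with a single underscore.
  $name = preg_replace('/[^A-Za-z0-9._-]+/', '_', $name);
  // Drop underscores at either end; fall back to a fixed name if empty.
  $name = trim($name, '_');
  return $name === '' ? 'file' : $name;
}

echo example_file_make_url_safe('this%20file.jpg'), "\n"; // this_file.jpg
```

Because only unreserved URL characters survive, the returned name needs no further encoding when it is placed in a link.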
Comment #4
sun
See also http://drupal.org/node/153574
Comment #5
smk-ka commented
There is a new module that takes care of transliteration and cleaning of filenames:
Transliterate filenames
Comment #6
sun
Excellent! file_translit works for me.