When uploading files whose names contain special characters (e.g. ç, ã, é, ü, ì), the files are saved with different characters in their names, and afterwards the file cannot be retrieved.

"çãéüì.txt" turn in "çãéüì.txt"

Sorry for my English, I'm Brazilian.
And it's my first bug report, so sorry if I did anything wrong.


Comments

anderson.ribeiro’s picture

Status: Needs work » Active
marcvangend’s picture

Version: 7.0-alpha6 » 7.x-dev

Filing under 7.x-dev so it will get the attention it needs. I'm not sure this is critical though.

lotyrin’s picture

Are we not transliterating filenames in core? I tried uploading a file named "ウィキペディア.txt" (contains only Japanese katakana) and get:

For security reasons, your upload has been renamed to txt..
Error message The specified file txt. could not be uploaded. Only files with the following extensions are allowed: txt.
lotyrin’s picture

Title: File names problems » File names aren't transliterated.
Assigned: anderson.ribeiro » Unassigned

+1 for this being critical. I'd say it's pretty common for users to need to be able to upload files with names in their own language.

cleaver’s picture

This functionality can be provided by the transliteration module: http://drupal.org/project/transliteration

If it is decided that this needs to be in core, then some of the transliteration code could be adopted.

anderson.ribeiro’s picture

Since file upload is in core, I think it is important for transliteration to be in core too.

catch’s picture

Version: 7.x-dev » 8.x-dev
Category: bug » task
Priority: Critical » Normal

It's a good idea but it's too late for Drupal 7.

lotyrin’s picture

Maybe doing transliteration is a future task, but for now these files need to at least get uploaded and stored in a way that we can use them in combination with things like image formats.

Even if what we do is detect when there are invalid characters and replace the whole file name with a number.
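
As a rough illustration of that fallback (a sketch only, not code from Drupal core; the function name `example_fallback_filename` and the numeric argument are hypothetical), something like this would keep the extension and swap a non-ASCII name for a number:

```php
<?php
/**
 * Hypothetical helper, not part of Drupal: replaces a file name that contains
 * bytes outside printable ASCII with a caller-supplied number.
 */
function example_fallback_filename($filename, $number) {
  // Split on the last dot by hand; a leading dot is not treated as an extension.
  $dot = strrpos($filename, '.');
  if ($dot === FALSE || $dot === 0) {
    $stem = $filename;
    $extension = '';
  }
  else {
    $stem = substr($filename, 0, $dot);
    $extension = substr($filename, $dot);
  }
  // Any byte outside 0x20-0x7E marks the name as "not safe" for this sketch.
  if (preg_match('/[^\x20-\x7E]/', $stem)) {
    return $number . $extension;
  }
  return $filename;
}
```

With this sketch, `example_fallback_filename('ウィキペディア.txt', 42)` gives `42.txt`, while a plain ASCII name such as `report.txt` is returned unchanged. Where the number comes from (a counter, a file ID) is left open here.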

lotyrin’s picture

Version: 8.x-dev » 7.x-dev
Category: task » bug
Priority: Normal » Critical
kotnik’s picture

FileSize
26.73 KB

I just tried to upload çãéüì.jpg to an article, and it completely ate the file. This is critical.

andypost’s picture

I always use the Transliteration module. But I suppose it is better to leave it in contrib, because not all sites use file uploads.

lotyrin’s picture

I don't see filename transliteration as a feature that belongs in contrib, but rather a bug fix that belongs in the file module (and therefore, belongs in core).

Unless, of course, there is a simpler alternative for fixing this.

andypost’s picture

@lotyrin Transliteration may need to change more often than core, because of the nature of languages and character sets. So it would be better to add a line to the README pointing to the project page.
Also, some users probably won't want transliteration, so in any case this should be configurable.

webchick’s picture

Priority: Critical » Normal
Status: Active » Postponed (maintainer needs more info)

I can't reproduce #10? I named the image as specified and it shows up fine for me:

çãéüì.jpg

When I view the URL in my browser (Firefox or Safari), I see the name of the file as expected:

URL

What are the steps to actually break this?

webchick’s picture

Oh. And please follow those same steps in Drupal 6 with Upload module and report the results. I have a hunch the behaviour is the same. If so, this is a Drupal 8 issue.

andypost’s picture

This mostly depends on the file system you are using on the server. I can't reproduce it now.

lotyrin’s picture

Title: File names aren't transliterated. » File names aren't transliterated, so server configuration needs to support UTF8 filenames.
Component: file.module » documentation
Assigned: Unassigned » lotyrin
Category: bug » task

Hmm. I am on a new server, so I guess I can't guarantee that this isn't an issue with my server configuration. If #14 worked, I guess it works on at least some Apache/PHP/filesystem configurations. In that case it isn't a bug in File and seems more like a documentation issue (making sure users know how to configure their server for UTF8 filenames).

I'll try to figure out exactly what needs to exist server-side to allow these files to be saved properly and make sure those steps exist in documentation somewhere.

andypost’s picture

Attaching this file causes the first word of its name to be cut off:
Original filename: Андрей аАЕёЁиИйЙоОуУъЪыЫьЬэЭюЮяЯ.png
Saved filename: аАЕёЁиИйЙоОуУъЪыЫьЬэЭюЮяЯ.png

The screenshot is attached under the original filename, so D6 behaves exactly the same.

kotnik’s picture

I did a bit of research: this is not caused by Drupal itself, but by PHP's environment not being set up properly.

The basename() function strips characters because its result depends on your locale setting.

To reproduce, go to line 1196 of file.inc and add something like this before it:

setlocale(LC_ALL, 'C');

That was the case with my system, and that's how I got the screenshot above.

After setting locale to:

setlocale(LC_ALL, 'en_US.UTF8');

Everything works fine.

So, this should at least be documented, or perhaps Drupal could set the locale depending on the site language?
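
For what it's worth, a locale-independent way of getting the file name could look roughly like the sketch below. It is an illustration only: `example_basename` is a made-up name, and it assumes that both `/` and `\` should be treated as path separators. It works on bytes, which is safe for UTF-8 because the `/` byte never occurs inside a multi-byte character.

```php
<?php
/**
 * Hypothetical helper: returns the last path component without relying on
 * the current locale, unlike PHP's built-in basename().
 */
function example_basename($path, $suffix = '') {
  // Assumption: treat backslashes as separators too, then drop trailing slashes.
  $path = rtrim(str_replace('\\', '/', $path), '/');
  // Take everything after the last slash, or the whole string if there is none.
  $pos = strrpos($path, '/');
  $name = ($pos === FALSE) ? $path : substr($path, $pos + 1);
  // Optionally strip a suffix, but never reduce the name to an empty string.
  if ($suffix !== '' && $name !== $suffix && substr($name, -strlen($suffix)) === $suffix) {
    $name = substr($name, 0, -strlen($suffix));
  }
  return $name;
}
```

Called as `example_basename('/tmp/çãéüì.txt')`, this returns `çãéüì.txt` whatever the locale is, since no locale-aware function is involved.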

Rob Carriere’s picture

You can set locale, but that only helps if the underlying stack (PHP and OS) actually supports that locale, so by itself it is not a reliable solution.

I can confirm that Drupal 6 has the same issue if the underlying OS has the issue (been there multiple times...).

kotnik’s picture

PHP's setlocale() returns FALSE if the locale functionality is not implemented (or the requested locale does not exist), and in that case Drupal could notify the user on the Status Report page.

I think that this should go to 8.x-dev, where Drupal would set the correct locale depending on the site language, and if that fails, the user would be notified on the Status Report page. From my experience, almost all systems support all locales (all Windows systems do by default), and this would prevent future issues of this kind.
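
A minimal sketch of that check, assuming a hypothetical helper `example_check_locale()` (not an existing Drupal function) whose result a status page could display. It uses LC_CTYPE on the assumption that only character handling matters here; the examples earlier in this thread used LC_ALL.

```php
<?php
/**
 * Hypothetical helper: tries a list of candidate locales and reports
 * whether any of them could be set.
 */
function example_check_locale(array $candidates) {
  // setlocale() accepts an array and returns the first locale it managed to
  // set, or FALSE if none of them is available.
  $result = setlocale(LC_CTYPE, $candidates);
  if ($result === FALSE) {
    return array(
      'ok' => FALSE,
      'message' => 'None of the requested locales is available: ' . implode(', ', $candidates),
    );
  }
  return array(
    'ok' => TRUE,
    'message' => 'Locale set to ' . $result,
  );
}
```

For example, `example_check_locale(array('en_US.UTF-8', 'en_US.UTF8'))` would try both spellings and report which one, if any, the server accepted.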

Damien Tournoud’s picture

Status: Postponed (maintainer needs more info) » Closed (duplicate)

So this is a duplicate of #278425: Using basename() is not locale safe. Let's help get this other issue in.