Swedish characters in file names
Anders Östberg - May 14, 2009 - 21:21
| Project: | Filebrowser |
| Version: | 6.x-2.0-rc10 |
| Component: | Directory Listing Pages |
| Category: | bug report |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | postponed (maintainer needs more info) |
Description
Swedish characters (Å, Ä, Ö, å, ä, ö) in file names are displayed as squares.
My web site uses UTF-8; is that perhaps the problem?

#1
No, it's because the fopen function and its relatives in PHP are written for ANSI strings, not UTF-8.
There is no real workaround for this, so it's probably best to rename the files you want to download.
#2
Thanks.
This is a major problem, and if that is how Filebrowser will work I'll have to look for another solution.
#3
At least that is how the old version used to work.
Greetings
Zewa
#4
Yes, this is a major problem, but I can't reproduce it. I also use UTF-8 and French, and I have no problem with our specific characters (in filenames and descriptions).
Can you have a look at the page source code to see whether this is a font problem or an encoding problem?
#5
mmhm ... can you tell me what version of PHP you are using Yoran?
I use PHP 5.2.5 + PHP 4.4.8 + PEAR with the Xampp Package 1.6.6a for development.
Greetings
Zewa
#6
The page's charset is utf-8 and font is Arial. If I force the browser to display the page using "Western European (Windows)" encoding, the national characters are displayed correctly. PHP version is 5.2.9-1.
#7
Additional info: I tried adding utf8_encode() to the display-name output, and the characters are then displayed correctly, so I assume the cause is that the file names are not being converted to UTF-8. I couldn't make this work properly for the file URL, though. I don't know how and where in the code to convert all the characters correctly, so I'll have to leave this to the maintainer.
#8
Still a problem with rc10
#9
Sorry for my late answer.
I tried to understand what was going on, and my guess is that your filesystem is not using UTF-8 as its filename encoding; PHP's readdir just passes along whatever the filesystem gives it, in whatever encoding that is. I tried an EXT3 filesystem with ISO-8859-15 encoding, and I can reproduce the issue.
So the problem with your solution (utf8_encode) is that it will not work in every situation, since a filesystem can use many kinds of encoding. Perhaps a better solution is to use mb_convert_encoding instead. Can you give this a try:
Replace (in filebrowser.module)
'display-name' => $file_name,
by
'display-name' => mb_convert_encoding($file_name, "UTF-8","ISO-8859-1"),
If this works, perhaps I can add a new setting, "filesystem encoding", with UTF-8 as the default. What do you think?
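For illustration, the proposed "filesystem encoding" setting could be applied roughly as follows. This is only a sketch: the function name sketch_display_name() and the $fs_encoding parameter are hypothetical and are not part of filebrowser.module, and the setting itself does not exist yet.

```php
<?php
// Hypothetical helper, not part of filebrowser.module.
// $fs_encoding stands in for the proposed "filesystem encoding" setting.
function sketch_display_name($file_name, $fs_encoding = 'UTF-8') {
  // Nothing to convert when the filesystem already stores names as UTF-8.
  if (strcasecmp($fs_encoding, 'UTF-8') === 0) {
    return $file_name;
  }
  // Convert the raw bytes returned by readdir() to UTF-8 for display.
  return mb_convert_encoding($file_name, 'UTF-8', $fs_encoding);
}

// "\xC5\xE4\xF6" is "Åäö" when the bytes are read as ISO-8859-1.
$display = sketch_display_name("\xC5\xE4\xF6.txt", 'ISO-8859-1');
```

With UTF-8 as the default, sites whose filesystem already uses UTF-8 would see no change in behaviour.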
#10
That works, but I'm afraid this is only half the solution.
The trickier part is the URL: I can't make the links work with any kind of encoding or conversion.
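One direction that might be worth testing for the links, sketched here under assumptions: leave the raw filesystem bytes untouched in the link and only percent-encode them, so that decoding the request gives back exactly the bytes readdir returned. The function names below are hypothetical, and whether Drupal's menu and path handling passes such bytes through unchanged has not been verified in this thread.

```php
<?php
// Hypothetical helpers, not part of filebrowser.module.

// Percent-encode each path segment of the raw (unconverted) file path.
function sketch_href_from_raw($raw_path) {
  return implode('/', array_map('rawurlencode', explode('/', $raw_path)));
}

// Reverse step: recover the raw filesystem bytes from the requested path.
function sketch_raw_from_href($href) {
  return implode('/', array_map('rawurldecode', explode('/', $href)));
}

// "\xC5" and "\xF6" are "Å" and "ö" in ISO-8859-1.
$raw  = "docs/\xC5ngstr\xF6m.txt";
$href = sketch_href_from_raw($raw);
$back = sketch_raw_from_href($href);
```

In this sketch the converted UTF-8 name is used only as the visible link text, while the href carries the original bytes.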
#11
Well, this part can be handled via Pathauto. It can rewrite your URL links.
Filebrowser then has to recode them to the encoding needed.
Greetings
Zewa
#12
Well, I see :/ I could use some kind of transliteration, but I fear duplicates with this solution.
Perhaps a hash can do the job...
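A rough sketch of the hash idea, assuming the link carries a hash of the raw file name and the module looks the name up again in a per-directory index. The function names and the choice of md5 are illustrative assumptions, not existing Filebrowser code.

```php
<?php
// Hypothetical helpers, not part of filebrowser.module.

// Build a lookup table from hash to raw file name for one directory listing.
function sketch_build_hash_index(array $raw_names) {
  $index = array();
  foreach ($raw_names as $raw_name) {
    // Hash the raw bytes, so the result does not depend on the encoding.
    $index[md5($raw_name)] = $raw_name;
  }
  return $index;
}

// Resolve a hash taken from the URL back to the raw file name, or NULL.
function sketch_resolve_hash(array $index, $hash) {
  return isset($index[$hash]) ? $index[$hash] : NULL;
}

// "\xE5.txt" ("å.txt" in ISO-8859-1) and "a.txt" would collide under
// transliteration, but they get distinct keys here.
$index = sketch_build_hash_index(array("\xE5.txt", "a.txt"));
$found = sketch_resolve_hash($index, md5("\xE5.txt"));
```

Unlike transliteration, two names that differ only in their accented characters stay distinct, at the cost of URLs that are no longer readable.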
#13
I'm unfortunately running out of time to get my site into production so I'll have to give up on this now and implement some other Windows/Swedish-specific solution. I'll keep an eye on filebrowser though, it's a great idea and exactly what I need if it only could handle UTF-8 and national characters. Thanks for your efforts in building this and trying to resolve the problem.
#14
Replace (in filebrowser.module)
'display-name' => $file_name,
by
'display-name' => mb_convert_encoding($file_name, "UTF-8","ISO-8859-9"),
This workaround works for Turkish characters. As you pointed out in your post, a new "filesystem encoding" setting would be useful, IMHO.
Thanks, Yoran.
#15
I've tried that, and many similar conversions, but it only solves half the problem. The display name is correct in the browser, but the link to the file or subdirectory is still incorrect.