I noticed that patch files are sent as ISO-8859-1 rather than UTF-8. See for example http://drupal.org/files/issues/content-type-descriptions.patch (the quotes). As Drupal is completely UTF-8, we should probably send those patch files as UTF-8 as well.
| Comment | File | Size | Author |
|---|---|---|---|
| #20 | Screenshot-5.png | 55.73 KB | brianV |
Comments
Comment #1
morbus iff
Server configuration, unrelated to Drupal. See http://httpd.apache.org/docs-2.0/mod/mod_mime.html#addcharset. The default for Apache 2 is controlled via "AddDefaultCharset On", which enables a default charset of iso-8859-1. To fix .patch file display, probably something along the lines of:
AddCharset utf-8 .patch
Comment #2
gerhard killesreiter commented
Could we make utf-8 the default charset for all text documents?
Comment #3
kkaefer commented
Yes, this is not related to Drupal. That's why I categorized it as Drupal.org maintenance. I don't mind sending all text files as utf-8.
Comment #4
kkaefer commented
Is there a consensus on changing the default encoding to UTF-8?
Comment #5
kkaefer commented
(recategorizing)
Comment #6
killes@www.drop.org commented
I think that's a good idea. What would I need to change?
Comment #7
bdragon commented
Adding
AddDefaultCharset utf-8
in .htaccess should do the trick. (Tested locally)
Comment #8
moshe weitzman commented
FYI, security.drupal.org needs some changes too: it fails to display patches inline and offers a download instead.
Comment #9
bdragon commented
I believe that the code to solve THAT problem would be:
AddType text/plain .patch
Comment #10
pwolanin commented
Yes please - it's very annoying with s.d.o
Comment #11
grendzy commented
+1 on this. Patches with non-Latin characters get corrupted by Firefox. (see http://drupal.org/files/issues/212130-decode-entities-support-all-entiti... ).
At least on drupal.org, the server doesn't send any charset at all:
Comment #12
pwolanin commented
security.drupal.org now displays patches inline, BUT I can no longer download them.
Update: this seems (maybe) to have been caused by my having disabled third-party cookies. Not sure that makes sense, but...
Comment #13
grendzy commented
Now that the D6 upgrade is complete, maybe it's a good time to revisit this?
Comment #14
nnewton commented
I'm currently discussing this with the OSL. I'd like to change this globally. Assigning this to me.
Comment #15
brianV commented
Is there any status update for this? The patch in #582534: Free tagging vocabularies: Treatment of comma in Chinese and Unicode is breaking because of this issue.
Comment #16
killes@www.drop.org commented
Was this fixed? The example given in #11 downloads fine for me.
Comment #17
johnalbin
No, it wasn't fixed. Safari seems to auto-correct and guess at the correct encoding. But if you use Firefox, you'll see the patch in #11 is still corrupted.
instead of
Comment #18
johnalbin
:-\
Although, maybe it's a Firefox bug.
Because I do see the correct charset in the headers.
Comment #19
gerhard killesreiter commented
#11 works for me in FF.
Comment #20
brianV commented
Not working here yet.
Another example is this one here:
http://drupal.org/files/issues/582534-freetagging-fullwidth-comma_0.patch
which displays as in the attached screenshot. I also suspect that the testbots are getting the ISO-8859-1 version of this patch, and that is why it is failing miserably. Full issue is #582534: Free tagging vocabularies: Treatment of comma in Chinese and Unicode.
Comment #21
gerhard killesreiter commented
This patch also works for me in FF, but I get "Content-Type: text/plain" using curl -I for _both_ of them...
Comment #22
brianV commented
@Gerhard Killesreiter,
I am using Firefox 3.5.
On further reading, I determined the following:
You are right that curl returns 'text/plain', but that doesn't indicate anything about the encoding. A bare 'text/plain' says nothing about which charset is used and leaves the choice up to the browser. It seems that in my case, at least, ISO-8859-1 is used by default.
If nothing else, the patch server should send the content type as 'text/plain;charset=utf-8' to prevent the browser from applying ISO-8859-1 or whatever charset it seems to be using.
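To illustrate why the missing charset matters, here is a minimal Python sketch of what happens when UTF-8 bytes are read as ISO-8859-1. The fullwidth comma (U+FF0C) is only an assumed sample character, chosen because of the patch's file name; it is not taken from the patch itself.

```python
# Illustrative only: U+FF0C (fullwidth comma) is an assumed sample character.
utf8_bytes = "\uff0c".encode("utf-8")

# A client that is told (or correctly guesses) UTF-8 gets one character back.
as_utf8 = utf8_bytes.decode("utf-8")

# A client that falls back to ISO-8859-1 maps each byte to its own
# character, so the single comma turns into three unrelated characters.
as_latin1 = utf8_bytes.decode("iso-8859-1")

print(utf8_bytes)        # the raw bytes sent over the wire
print(repr(as_utf8))     # one character
print(repr(as_latin1))   # three characters of mojibake
```

The bytes on the wire are identical in both cases; only the charset the client assumes differs, which is why declaring it in the Content-Type header should resolve the display problem.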
Comment #23
gerhard killesreiter commented
Yes, this is what we should send.
I've now added what bdragon proposed to .htaccess, and curl gives me
Content-Type: text/plain; charset=UTF-8
Does it work for you too?
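For anyone who wants to check a header value in a script rather than by eye, a small sketch using Python's standard library follows. The helper name `charset_of` is hypothetical, and the header strings are the two values quoted in this thread.

```python
from email.message import Message


def charset_of(content_type):
    """Return the lowercased charset parameter of a Content-Type value,
    or None when the header declares no charset."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


# The header as seen before the change and after it.
print(charset_of("text/plain"))
print(charset_of("text/plain; charset=UTF-8"))
```

A result of None corresponds to the ambiguous case described in #22, where the browser is left to pick an encoding on its own.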
Comment #24
brianV commented
@killes
Fixed it for me! Thanks!
Comment #25
grendzy commented
Yes!
Comment #26
nnewton commented
This is now merged into our vhost files.
Thanks guys.