I noticed that patch files are sent as ISO-8859-1 rather than UTF-8. See for example http://drupal.org/files/issues/content-type-descriptions.patch (the quotes). As Drupal is completely UTF-8, we should probably send those patch files as UTF-8 as well.

Comment  File              Size      Author
#20      Screenshot-5.png  55.73 KB  brianV

Comments

morbus iff’s picture

Server configuration, unrelated to Drupal. See http://httpd.apache.org/docs-2.0/mod/mod_mime.html#addcharset. The default for Apache 2 is controlled via "AddDefaultCharset On", which enables a default charset of iso-8859-1. To fix .patch file display, probably something along the lines of:

AddCharset utf-8 .patch

gerhard killesreiter’s picture

Could we make utf-8 the default charset for all text documents?

kkaefer’s picture

Yes, this is not related to Drupal. That's why I categorized it as Drupal.org maintenance. I don't mind sending all text files as utf-8.

kkaefer’s picture

Component: web site » Site organization

Is there a consensus on changing the default encoding to UTF-8?

kkaefer’s picture

Project: Drupal.org site moderators » Drupal.org infrastructure
Component: Site organization » Webserver

(recategorizing)

killes@www.drop.org’s picture

I think that's a good idea. What would I need to change?

bdragon’s picture

Status: Active » Needs review

Adding
AddDefaultCharset utf-8
in .htaccess should do the trick. (Tested locally)
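
As a rough illustration of what this directive does to the response header, here is a small Python model. It assumes the behaviour described in the Apache documentation, where a default charset is added only to text/plain or text/html responses that do not already declare one. The function name `apply_default_charset` is hypothetical and not part of Apache or Drupal.

```python
def apply_default_charset(content_type, default="utf-8"):
    """Model of AddDefaultCharset (a sketch, not Apache's code).

    Appends a charset parameter only when the media type is text/plain
    or text/html and no charset parameter is present yet.
    """
    parts = [p.strip() for p in content_type.split(";")]
    media_type = parts[0].lower()
    has_charset = any(p.lower().startswith("charset=") for p in parts[1:])
    if media_type in ("text/plain", "text/html") and not has_charset:
        return f"{content_type}; charset={default}"
    return content_type


print(apply_default_charset("text/plain"))
print(apply_default_charset("text/plain; charset=ISO-8859-1"))
print(apply_default_charset("image/png"))
```

Under this model, a response that already carries a charset, or that is not a text type, is left untouched.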

moshe weitzman’s picture

fyi, security.drupal.org needs some changes too as it fails to even display patches inline but rather offers a download.

bdragon’s picture

I believe that the code to solve THAT problem would be:
AddType text/plain .patch

pwolanin’s picture

yes please - it's very annoying with s.d.o

grendzy’s picture

+1 on this. Patches with non-latin characters get corrupted by Firefox. (see http://drupal.org/files/issues/212130-decode-entities-support-all-entiti... ).

At least on drupal.org, the server doesn't send any charset at all:

HTTP/1.0 200 OK
Date: Wed, 31 Dec 2008 01:44:06 GMT
Server: Apache
Last-Modified: Tue, 30 Dec 2008 14:46:06 GMT
ETag: "1c7ba4-2850-45f44a6b2e780"
Accept-Ranges: bytes
Content-Length: 10320
Cache-Control: max-age=1209600
Expires: Wed, 14 Jan 2009 01:44:06 GMT
Vary: Accept-Encoding
Content-Type: text/plain
X-Cache: MISS from www4.drupal.org
X-Cache-Lookup: HIT from www4.drupal.org:80
Via: 1.0 www4.drupal.org:80 (squid/2.6.STABLE17)
Connection: keep-alive

pwolanin’s picture

security.drupal.org now displays patches inline, BUT I cannot download them any longer.

update: seems (maybe) to have been the fact that I disabled 3rd party cookies. Not sure that makes sense, but...

grendzy’s picture

Now that the D6 upgrade is complete, maybe it's a good time to revisit this?

nnewton’s picture

Assigned: Unassigned » nnewton

I'm currently discussing this with the OSL. I'd like to change this globally. Assigning this to me.

brianV’s picture

Is there any status update for this? #582534: Free tagging vocabularies: Treatment of comma in Chinese and Unicode is breaking because of this issue.

killes@www.drop.org’s picture

Was this fixed? The example given in #11 downloads fine for me.

johnalbin’s picture

No, it wasn't fixed. Safari seems to auto-correct and guess at the correct encoding. But if you use Firefox, you'll see the patch in #11 is still corrupted.

+  '&and;' => 'âˆ§',

instead of

+  '&and;' => '∧',
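
This kind of corruption can be reproduced outside the browser. The sketch below encodes the logical-and sign (U+2227) as UTF-8 and then decodes those bytes with windows-1252, on the assumption that this is what a browser falls back to when it treats the response as ISO-8859-1. The variable names are only for illustration.

```python
# The character as it should appear in the patch.
original = "\u2227"  # logical-and sign

# The bytes actually stored in a UTF-8 patch file: e2 88 a7.
utf8_bytes = original.encode("utf-8")

# Assumption: the browser decodes with windows-1252 when it believes the
# content is ISO-8859-1. Each byte then becomes its own character.
garbled = utf8_bytes.decode("cp1252")

# Decoding with the right charset gives the original character back.
restored = utf8_bytes.decode("utf-8")

print(utf8_bytes.hex(" "))
print(garbled)
print(restored)
```

The file on disk is identical in both cases; only the charset the client applies differs.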

johnalbin’s picture

:-\

Although, maybe it's a Firefox bug.

$ curl -I http://drupal.org/files/issues/212130-decode-entities-support-all-entities.patch
HTTP/1.1 200 OK
Server: Apache/2.2.3 (CentOS)
Last-Modified: Wed, 31 Dec 2008 00:33:42 GMT
ETag: "20aa33-27e7-45f4cdc1ec580"
Cache-Control: max-age=1209600
Expires: Tue, 19 Jan 2010 14:02:54 GMT
Vary: Accept-Encoding
Content-Type: text/plain; charset=UTF-8
Content-Length: 10215

Because I do see the correct charset in the headers.

gerhard killesreiter’s picture

#11 works for me in FF.

brianV’s picture

Status  File              Size
new     Screenshot-5.png  55.73 KB

Not working here yet.

Another example is this one here:

http://drupal.org/files/issues/582534-freetagging-fullwidth-comma_0.patch

which displays as in the attached screenshot. I also suspect that the testbots are getting the ISO-8859-1 version of this patch, and that is why it is failing miserably. Full issue is #582534: Free tagging vocabularies: Treatment of comma in Chinese and Unicode.

gerhard killesreiter’s picture

This patch works as well for me in FF but I get "Content-Type: text/plain" using curl -I for _both_ of them...

brianV’s picture

@Gerhard Killesreiter,

I am using Firefox 3.5.

On further reading, I determined the following:

You are right that curl returns 'text/plain', but that says nothing about the encoding. A bare 'text/plain' leaves the charset unspecified, so the choice is up to the browser. In my case, at least, ISO-8859-1 seems to be used by default.

If nothing else, the patch server should send the content type as 'text/plain;charset=utf-8' to prevent the browser from applying ISO-8859-1 or whatever charset it seems to be using.
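
To make the difference between the two header values concrete, here is a minimal Python sketch that extracts the declared charset from a Content-Type value using the standard library's `email.message.Message`. The helper name `declared_charset` is made up for this example.

```python
from email.message import Message


def declared_charset(content_type):
    """Return the charset parameter of a Content-Type value, lowercased,
    or None when the header does not declare one."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


# No charset parameter: the client has to guess.
print(declared_charset("text/plain"))

# Charset declared: nothing is left to guess.
print(declared_charset("text/plain; charset=UTF-8"))
```

A result of None corresponds to the ambiguous case, where the browser picks its own default.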

gerhard killesreiter’s picture

Yes, this is what we should send.

I've now added what bdragon proposed to .htaccess and curl gives me

Content-Type: text/plain; charset=UTF-8

Does it work for you too?

brianV’s picture

@killes

Fixed it for me! Thanks!

grendzy’s picture

Status: Needs review » Fixed

Yes!

nnewton’s picture

This is now merged into our vhost files.

Thanks guys.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Component: Webserver » Servers