Problem:
http://www.alphagalileo.org/rss_channels/?feed=all . This URL uses redirect.
The redirect happens in this way:
GET /rss_channels/?feed=all HTTP/1.0
User-Agent: Wget/1.10.2 (Red Hat modified)
Accept: */*
Host: www.alphagalileo.org
Connection: Keep-Alive
HTTP/1.1 307 Temporary Redirect
Connection: close
Date: Thu, 04 Sep 2008 08:25:16 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Set-Cookie: CFID=4574600;expires=Sat, 28-Aug-2038 08:25:16 GMT;path=/
Set-Cookie: CFTOKEN=67833421;expires=Sat, 28-Aug-2038 08:25:16 GMT;path=/
location: http://www.alphagalileo.org/rss_channels/all.xml
Content-Type: text/html; charset=UTF-8
(i just recorded the http traffic with wireshark)
wget handles http response header field names differently as Drupal does. (see resp_header_locate in http.c at wget source code)
Drupal treats these field names case sensitive, wget does the opposite. I read the RFC 2616, but it's not clear me that Drupal does or does not follow the RFC in the question. But for practical reasons it should work as wget.
I attached a patch to fix this issue. It basically makes all the header field names lowercase and fix the places in drupal core where it matters.
Comments
Comment #1
dries commentedLet's write a simpletest for this?
Comment #2
aron novakI wrote a test for this bug. I re-rolled the patch, now it contains the test too.
Comment #3
aron novakI rerolled the patch to make it apply without any notices.
Comment #5
lilou commentedSee: #335122: Test clean HEAD after every commit and http://pastebin.ca/1258476
Comment #7
aron novakRerolled against HEAD.
Comment #9
mr.baileysMessage header field names are case insensitive (see Message Headers in RFC 2616), so it looks like wget got it right:
I'm confused about the patch though:
If we just established that header field names are case insensitive, then why are we changing the case of these field names, especially since the RFC uses capitalized field names itself (for example: If-Modified-Since)? Note: I'm not saying it's the wrong approach, I'm just not sure I fully understand the problem.
Comment #10
aron novakWell, you can do the strtolower() at every comparison, but this sucks. I think the right approach is to use it always lower-case in Drupal (except when we issue a request) and use strtolower() only when parsing the http requests.
I have to take a look why the patch failed.
Comment #11
mr.baileysI looked closer at your patch and the test and think I now understand the problem a bit better. I also agree that using strtolower() or strcasecmp() isn't the most elegant solution. I can't shake off the feeling that the proposed patch treats the symptoms rather than the actual problem though, but I understand your reasoning.
And now on to the actual patch:
I think this can become a bit messy: if I'm not mistaken, we can end up with two values, $result->headers['Set-Cookie'] and $result->headers['set-cookie'] which have different values?
as well as when the actual lowercase conversion happens.
Comment #12
c960657 commentedWell, I think the actual problem is that PHP doesn't support an array-like datatype with case-insensitive keys. The strtolower() approach is also used in SimpleTest in DrupalWebTestCase::drupalGetHeaders().
This changes headers that are sent by drupal_http_request(). There is no need to lower-case them. It is only received headers that should be lower-cased.
Comment #13
c960657 commentedUpdated patch.
Comment #15
c960657 commentedAhem.
Comment #17
c960657 commentedIt's been a while, but here is a reroll.
Comment #18
c960657 commentedReroll.
Comment #19
c960657 commentedReroll.
Comment #20
c960657 commentedReroll.
Comment #21
damien tournoud commentedRTBC if the bot passes.
Comment #23
c960657 commented#20: drupal_http_request-6.patch queued for re-testing.
Comment #24
c960657 commentedRTBC as per #21.
Comment #25
dries commentedCan we add a code comment that references the RFC (see comment #9)?
Comment #26
c960657 commentedAdded a reference to RFC 2616 in the Doxygen comment.
Comment #27
dries commentedThanks a lot. Committed to CVS HEAD.