When drupal_page_cache_header() compares the client's If-Modified-Since header to $cache->created, it converts $cache->created to a string and compares it to the string sent by the client.

HTTP/1.1 allows several variations of the date format, i.e. the same time may be represented in slightly different ways. If the client sends the date in a different format than the one generated by Drupal, it will never receive a 304 Not Modified response.

This patch makes drupal_page_cache_header() compare Unix timestamps instead.

Firefox appears to send the same string as received in the Last-Modified header, so Firefox users don't experience this problem, but AFAICT this behaviour is not required by the HTTP spec.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Damien Tournoud’s picture

Well, the behavior is not forbidden by the HTTP/1.1 specification (section 14.25):

      Note: When handling an If-Modified-Since header field, some
      servers will use an exact date comparison function, rather than a
      less-than function, for deciding whether to send a 304 (Not
      Modified) response. To get best results when sending an If-
      Modified-Since header field for cache validation, clients are
      advised to use the exact date string received in a previous Last-
      Modified header field whenever possible.
c960657’s picture

I assume that “exact date comparison function” means comparing the date represented by the header, not the verbatim string.

Anyway, not returning a 304 is never forbidden, but it will give us better caching if we return 304 whenever possible.

Dries’s picture

Damiwn, are you suggesting this patch is a bad idea? It looks like an improvement over the current situation to me. It would be great if it came with tests though.

Damien Tournoud’s picture

Category: bug » feature
Priority: Minor » Normal

@Dries: I'm only saying that this is not a bug. Requalifying as a feature request.

c960657’s picture

I added a test that looks for 304 Not Modified, even if If-Modified-Since uses another date format that the one generated in drupal_page_cache_header().

In order to do this, drupal_web_test_case now allows setting request headers and inspecting response headers. I added type-hinting to the methods I touched (as suggested in #318016: Type hint array parameters).

The existing header parser in drupal_http_request() is moved to a seperate function. I changed drupalHead() to return a header => value array instead of a string — this seems more natural, now that we are able to parse the HTTP headers.

It may be a bit confusing that request headers are specified as array("header: value", …), while response headers are returned as array("header" => "value", …). Drupal already uses different conventions in drupal_set_header() and in $result->headers in drupal_http_request(). I don't know whether we should standardize on one format.

gmdate() in drupal_page_cache_header() now uses the named constant, DATE_RFC1123 (available as of PHP 5.1.1). RFC 1123 dates are mandated by section 3.3.1 in the HTTP/1.1 spec.

c960657’s picture

Chasing HEAD.

Dries’s picture

Status: Needs review » Needs work

- The "header:" stuff is a little weird indeed until I saw the examples. It is weird to refer to it as "header: " -- it is just the regular HTTP-header format. Maybe it is a terminology thing in the phpDoc.

- curlExec() could use more/better code comments.

- Would be good to add a test case for another date format.

- Would be good to add a test case for a scenario that should _not_ return a 403. Do you consider the test cases to be complete?

c960657’s picture

- The "header:" stuff is a little weird indeed until I saw the examples. It is weird to refer to it as "header: " -- it is just the regular HTTP-header format.

I have changed the variable names and documentation to refer to name/value rather than header/value. The HTTP spec refers to these as field name and field value. Though the "name: value" format is the regular format used by header(), I think it needs to be explicitly documented, because associative arrays are also sometimes used when working with HTTP headers.

I added some comments to curlExec() and some more tests. I think the test coverage is now more or less complete wrt If-Modified-Since. If-None-Match is a rather complicated subject that should have some more tests of its own. Would you like these in this issue?

c960657’s picture

This needs a reroll. Much of the header voodoo can probably be avoided by hooking into drupal_web_test_case::curlHeaderCallback() that was just added in #243532: Catch notices, warnings, errors and fatal errors from the tested side.

c960657’s picture

Status: Needs work » Needs review
FileSize
19.63 KB

Reroll.

Now hooks into drupal_web_test_case::curlHeaderCallback() and leaves drupal_http_request() alone, except that it changes the $header/$name wording to $name/$value for consistency ($name/$value is already used in drupal_mail_send()).

Status: Needs review » Needs work

The last submitted patch failed testing.

c960657’s picture

Status: Needs work » Needs review
c960657’s picture

Status: Needs review » Postponed

The work on adding the ability to add and inspect headers in simpletests is continuing over in #330582: Retrieve HTTP response headers. I'll postpone this issue until that other issue is fixed.

[edit: I originally linked to this issue instead of #330582]

c960657’s picture

Status: Postponed » Needs review

Now that #330582: Retrieve HTTP response headers has landed, the patch has been reduced in size. The patch now fixes drupal_page_cache_header(), adds a test, and adds the ability to specify HTTP request headers in drupal_web_test_case. The latter is relevant in other issues also, e.g. #43462: cache_set and cache_get base_url brokenosity.

c960657’s picture

Dries’s picture

Status: Needs review » Fixed

Thanks for the re-roll and the additional tests. This looks like the right thing to do and the code looks solid too. Committed to CVS HEAD. Thanks!

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.