function page_get_cache() in bootstrap.inc only checks the page cache table on GET requests. This seems like a mistake; if a browser or proxy has a cached version of a page, it's not uncommon for it to make a HEAD request to see if it should fetch a new copy. Since HEAD requests never go to the page cache, Drupal will return headers indicating that the page has changed even if it hasn't.

In general, a HEAD request should return the same headers as a GET request. From RFC2616:

The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. ... The response to a HEAD request MAY be cacheable in the sense that the information contained in the response MAY be used to update a previously cached entity from that resource. If the new field values indicate that the cached entity differs from the current entity (as would be indicated by a change in Content-Length, Content-MD5, ETag or Last-Modified), then the cache MUST treat the cache entry as stale.

I suggest changing

  if (!$user->uid && $_SERVER['REQUEST_METHOD'] == 'GET' && count(drupal_set_message()) == 0) {

to

  if (!$user->uid && ($_SERVER['REQUEST_METHOD'] == 'GET' || $_SERVER['REQUEST_METHOD'] == 'HEAD') && count(drupal_set_message()) == 0) {
CommentFileSizeAuthor
#26 d6-head.patch2.21 KBmfb
#23 d6-head.patch2.61 KBmfb
#18 d6-head.patch2.71 KBmfb
#7 d7-head.patch7.02 KBmfb
#2 d7-head.patch2.73 KBmfb
page_cache_head.patch535 byteseli

Comments

mfb’s picture

Most browsers will make a conditional GET request (If-Modified-Since), not a HEAD request. But tailing my logs, I do see HEAD requests, some from browsers and many from various bots and scripts.

So this seems like a good idea to me. It may also be a good idea to set the cache on a HEAD request (in common.inc); since we're going to all the effort to bootstrap and generate a page we may as well get something useful out of it.

mfb’s picture

Version: 5.7 » 7.x-dev
StatusFileSize
new2.73 KB

Patch (for head) to support HEAD requests.

moshe weitzman’s picture

Status: Needs review » Reviewed & tested by the community

Yes, there is no reason why these HEAD requests should be excluded.

dries’s picture

It would be great if we could write a test for this.

mfb’s picture

Should the tests just check that page cache works with HEAD requests or anything more?

damien tournoud’s picture

@mfb: we don't have a page cache at the moment, so this will have to be implemented. Moreover, there is no standardized way to call HEAD requests for now, so you will have to suggest an API extension to drupal_web_test_case.

mfb’s picture

Status: Reviewed & tested by the community » Needs review
StatusFileSize
new7.02 KB

OK this adds drupalHead() and a very basic page cache test.

drupalHead() should be quite useful for future HTTP header tests.

dries’s picture

Status: Needs review » Fixed

I've made some minor changes to this patch and committed it to CVS HEAD. Thanks.

mfb’s picture

Status: Fixed » Needs work

Looks like something got screwed up, drupalHead() function is repeated twice in drupal_web_test_case.php

mfb’s picture

By the way, I just did some testing and actually, it was pointless to add the if $_SERVER['REQUEST_METHOD'] == 'HEAD' logic to page_set_cache(), because on a HEAD request, PHP dies as soon as you call ob_end_flush(), just before cache_set() would be called.

Won't cause any trouble to leave it in but it doesn't do anything useful after all.

dropcube’s picture

Priority: Normal » Critical

Marking it critical for the fact that drupalHead() function is repeated twice in drupal_web_test_case.php

dries’s picture

Status: Needs work » Fixed

Fixed in CVS HEAD.

damien tournoud’s picture

As a side note, Dries: #168909: Drupal messages (status/error) are cached along with nodes never made it to 7.x.

eli’s picture

mfb- Then something must have changed since Drupal 5. It definititely made a difference in 5.

mfb’s picture

@eli: I was talking specifically about my patch to page_set_cache() in common.inc, not your patch to page_get_cache() in bootstrap.inc.

I think it's ok to leave the HEAD logic in common.inc so there is some uniformity to how HEAD requests are handled. If you skip this loop for HEAD requests then the HEAD request moves on to the exit hook, whereas HEAD requests normally (with caching disabled or a cache already set) die before reaching the exit hook.

Anonymous’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.

mfb’s picture

Version: 7.x-dev » 6.x-dev
Priority: Critical » Normal
Status: Closed (fixed) » Needs review
StatusFileSize
new2.71 KB

This fix could get backported if desired.

webchick’s picture

Version: 6.x-dev » 7.x-dev
Priority: Normal » Critical
Status: Needs review » Active

This change broke SimpleTest all over the place in #308186: cURL 7.18.2 and SimpleTest module bitterly despise one another. After removing the CURLOPT_NOBODY => FALSE from curlExec() in drupal_web_test_case.php, things are working okay.

What larger consequence does that have, and can we get some tests in place to make ensure that whatever was changed here is doing what it needs to do?

chx’s picture

We ran nto http://curl.haxx.se/mail/tracker-2008-07/0006.html . If you read the manpage though http://pwet.fr/man/linux/fonctions_bibliotheques/curl_easy_setopt When setting CURLOPT_POST to a non-zero value, it will automatically set CURLOPT_NOBODY to 0 (since 7.14.1). As 7.14.1 was released on 09-06-2005 I believe we can conclude this fixed. You can http://ca.php.net/manual/en/function.curl-version.php use this to complain hard when simpletest gets enabled if you so want.

chx’s picture

I compiled a libcurl 7.19 to confirm that's the culprit and indeed it is. Behold this:

root@Tumbler:/usr/local/experimental# php -i |grep libcurl
cURL Information => libcurl/7.18.0 OpenSSL/0.9.8g zlib/1.2.3.3 libidn/1.1
root@Tumbler:/usr/local/experimental# bin/php -i |grep libcurl
cURL Information => libcurl/7.19.0
root@Tumbler:/usr/local/experimental# bin/php x.php
array(0) {
}
root@Tumbler:/usr/local/experimental# php x.php
array(1) {
  ["username"]=>
  string(7) "johndoe"
}
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,'http://localhost/d.php');
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS,"username=johndoe");
curl_setopt($ch, CURLOPT_NOBODY, FALSE);
curl_exec($ch);
curl_close ($ch);

where d.php just dump $_POST

webchick’s picture

Version: 7.x-dev » 6.x-dev
Priority: Critical » Normal
Status: Active » Needs review

Ok, sorry for the freakout. I couldn't quite parse that referenced changelog entry, but chx summed it up as:

"if you set FOO 1 then BAR gets set to 0 but if you set BAR to 0 then FOO gets set to 0 as well.
WTF?"

So this is basically a workaround we need to do in our curlExec function to accommodate cURL's wacky, wacky ways.

And since SimpleTest is not part of D6, this can go back where it belongs.

mfb’s picture

StatusFileSize
new2.61 KB

Reroll due to offset warnings.

dpearcefl’s picture

Is this still an issue using current Drupal 6?

Status: Needs review » Needs work

The last submitted patch, d6-head.patch, failed testing.

mfb’s picture

Status: Needs work » Needs review
StatusFileSize
new2.21 KB

Yes, re-rolled

dpearcefl’s picture

Thanks!

Status: Needs review » Closed (outdated)

Automatically closed because Drupal 6 is no longer supported. If the issue verifiably applies to later versions, please reopen with details and update the version.