Hi,

I'm writing a very simple module to enable browser caching of private files.

  • I searched for a solution for a long time before posting here.
  • I'm talking about the browser cache.
  • I work on my own dedicated server, with root access to the Apache and Drupal settings.
  • I want to cache images, both the original files and their image style derivatives.
  • I have no problem caching CSS/JS files or images served from the public file system.

As you can read here, the image_file_download() function doesn't set the Expires, Cache-Control and ETag headers for images:

// By not explicitly setting them here, this uses normal Drupal
// Expires, Cache-Control and ETag headers to prevent proxy or
// browser caching of private images.

So I implement hook_file_download() to set them myself:

function privatefilescache_file_download($uri) {
  $year = 86400 * 365;
  // Get the file info.
  $info = image_get_info($uri);
  // Only handle image files.
  if (isset($info['mime_type']) && strpos($info['mime_type'], 'image/') === 0) {
    $header = array(
      'Cache-Control' => 'max-age=' . $year . ', private',
      'Expires' => gmdate('D, d M Y H:i:s', time() + $year) . ' GMT',
      'ETag' => strtr(md5($info['file_size']), 0, 10),
    );
    drupal_set_message('<pre>'. print_r($header, TRUE) .'</pre>');
    return $header;
  }
}
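
A side note on the ETag line: strtr() translates characters rather than truncating, so the value is not shortened to 10 characters, which is probably what was intended. That would explain why the ETag in the headers further down is a full 32-character hash. The standalone sketch below (plain PHP, no Drupal; the function names and the sample input are made up for illustration) contrasts strtr() with substr() and adds the double quotes that the ETag header syntax expects.

```php
<?php
// Plain PHP, no Drupal required. Function names are hypothetical.

// What the posted code does: strtr() maps characters. The arguments 0 and 10
// are treated as the strings '0' and '10'; only as many characters as the
// shorter one are used, so each '0' becomes '1' and nothing is cut off.
function etag_with_strtr($hash) {
  return strtr($hash, '0', '10');
}

// What was probably meant: keep the first 10 characters, wrapped in the
// double quotes that the ETag header syntax requires.
function etag_with_substr($hash) {
  return '"' . substr($hash, 0, 10) . '"';
}

$hash = md5('example'); // made-up input
echo strlen(etag_with_strtr($hash)), "\n";  // 32: same length as the md5 hash
echo strlen(etag_with_substr($hash)), "\n"; // 12: 10 characters plus 2 quotes
```

Neither variant is the cause of the missing 304, but a quoted ETag is the safer choice if any proxy or browser validates the header syntax.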

In the HTTP response, I can see that Cache-Control, Expires and ETag are set correctly.
The validators in the request and the response match, but Last-Modified is automatically set to the current date, and I get a 200 status code instead of a 304.

Even if I set Last-Modified explicitly, for example to "Mon, 13 Aug 2012 14:35:00 GMT", I still get a 200 status code.

The request:

Cache-Control:max-age=0
If-Modified-Since:Mon, 13 Aug 2012 14:35:00 GMT
If-None-Match:bec86aeee7d11a957751af645155a115

The response:

Cache-Control:max-age=31536000, private
Date:Sun, 18 Nov 2012 18:11:07 GMT
ETag:bec86aeee7d11a957751af645155a115
Expires:Mon, 18 Nov 2013 18:11:07 GMT
Last-Modified:Mon, 13 Aug 2012 14:35:00 GMT

I also tried 'public' instead of 'private' in Cache-Control. It still doesn't work.

Could it be an error in my Apache configuration?
How can I fix my problem?

I'm working on a private image bank, so you can see why I have to deal with caching on the private file system.

Comments

goofus’s picture

Hello,
I am not an expert on HTTP 304 messages :) However, I did a quick bit of research.

Here's a link to the W3 web site:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

Here is what it says about 304 messages:

10.3.5 304 Not Modified

If the client has performed a conditional GET request and access is allowed, but the document has not been modified, the server SHOULD respond with this status code. The 304 response MUST NOT contain a message-body, and thus is always terminated by the first empty line after the header fields.

Note the "no body" requirement (my emphasis). In your code, aren't you explicitly sending a body with the following statement: drupal_set_message('<pre>'. print_r($header, TRUE) .'</pre>');? Check the documentation for drupal_set_message here:
http://api.drupal.org/api/drupal/includes%21bootstrap.inc/function/drupal_set_message/7
Note that the source code contains:

// Mark this page as being uncacheable.
    drupal_page_is_cacheable(FALSE);

Thus, this could be part of your issue :)

Good Luck :)

jexperton’s picture

Thank you for your message but it doesn't fix the problem.

goofus’s picture

Hello,
Glad to have helped.

Again, I'm not an expert on generating an HTTP 304. However, if the W3 doc I referenced is current, then you need to explicitly instruct Drupal to finish (write, flush, close) as soon as the headers are written. No content should be added to the response after the headers.

In addition, you need to make sure the client (web browser) is performing a "conditional get".

My prior point was that issuing the drupal_set_message() call is one factor in the failure. Yes, you should remove it, but you have more to do. Again, I'm not an expert, so I'll leave this to other folks to help you further.
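
To make the "conditional GET" point concrete: setting ETag or Last-Modified only gives the browser something to send back on its next request. The server still has to compare the If-None-Match and If-Modified-Since request headers against the current file and answer with the 304 itself. Below is a minimal sketch of that comparison in plain PHP. The helper name is_not_modified() is hypothetical, and it assumes the caller already knows the file's current ETag (including quotes) and modification time.

```php
<?php
/**
 * Decides whether a conditional GET can be answered with 304 Not Modified.
 *
 * @param array  $server Request variables in the shape of $_SERVER.
 * @param string $etag   Current ETag of the file, including the quotes.
 * @param int    $mtime  Last modification time of the file (Unix timestamp).
 *
 * @return bool TRUE if the client's cached copy is still valid.
 */
function is_not_modified(array $server, $etag, $mtime) {
  // Drop a weak-validator prefix so that W/"abc" and "abc" compare equal.
  $strip_weak = function ($tag) {
    $tag = trim($tag);
    return strpos($tag, 'W/') === 0 ? substr($tag, 2) : $tag;
  };

  // If-None-Match takes precedence over If-Modified-Since when both are sent.
  if (isset($server['HTTP_IF_NONE_MATCH'])) {
    $candidates = array_map($strip_weak, explode(',', $server['HTTP_IF_NONE_MATCH']));
    return in_array('*', $candidates, TRUE)
      || in_array($strip_weak($etag), $candidates, TRUE);
  }

  if (isset($server['HTTP_IF_MODIFIED_SINCE'])) {
    $since = strtotime($server['HTTP_IF_MODIFIED_SINCE']);
    // Not modified if the file is no newer than the date the client sent.
    return $since !== FALSE && $mtime <= $since;
  }

  // Not a conditional request: send the full response.
  return FALSE;
}

// Typical use, before any output is produced:
//   if (is_not_modified($_SERVER, $etag, $mtime)) {
//     header($_SERVER['SERVER_PROTOCOL'] . ' 304 Not Modified');
//     exit();
//   }
```

The 304 response must end right after the headers, which is why the usage comment exits immediately instead of letting the normal page rendering continue.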

Good Luck :)

kris digital’s picture

Hi,
I had the same problem... Not caching is not an option. Here is what works for me. I adjusted your example so that all managed files are cached:

function privatefilescache_file_download($uri) {
  // Answer conditional requests with a 304 if the file has not changed.
  if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE'])) {
    $ifs = strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']);

    // Look up the managed file record to get its timestamp.
    $file = db_select('file_managed', 'fm')
      ->fields('fm')
      ->condition('uri', $uri)
      ->execute()
      ->fetchAssoc();

    if (!empty($file)) {
      $modified = $file['timestamp'];
      if ($modified < $ifs) {
        header('HTTP/1.1 304 Not Modified');
        exit();
      }
    }
  }

  // Otherwise send the file with long-lived caching headers.
  $year = 86400 * 365;
  $header = array(
    'Cache-Control' => 'max-age=' . $year,
    'Expires' => gmdate('D, d M Y H:i:s', time() + $year) . ' GMT',
    // substr(), not strtr(): keep the first 10 characters of the hash.
    'ETag' => substr(md5($uri), 0, 10),
  );

  return $header;
}

feng-shui’s picture

Make sure you set the Content-Type header, or you will find files will download as plain text.
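
As a rough illustration of that advice, here is a plain PHP sketch that builds the caching headers together with Content-Type and Content-Length. The function name private_file_headers() and its parameters are hypothetical. In a Drupal 7 module the mime type, size and timestamp would typically come from the filemime, filesize and timestamp columns of the file_managed record.

```php
<?php
/**
 * Builds response headers for a cacheable private file.
 *
 * @param string $mime    Mime type of the file, e.g. 'image/jpeg'.
 * @param int    $size    File size in bytes.
 * @param int    $mtime   Last modification time (Unix timestamp).
 * @param int    $max_age Cache lifetime in seconds.
 * @param int    $now     Current time (Unix timestamp), passed in for testability.
 *
 * @return array Header name => header value.
 */
function private_file_headers($mime, $size, $mtime, $max_age, $now) {
  $http_date = 'D, d M Y H:i:s';
  return array(
    // Without this the browser may treat the download as plain text.
    'Content-Type' => $mime,
    'Content-Length' => (string) $size,
    'Cache-Control' => 'private, max-age=' . $max_age,
    'Expires' => gmdate($http_date, $now + $max_age) . ' GMT',
    'Last-Modified' => gmdate($http_date, $mtime) . ' GMT',
    // Changes whenever the size or the modification time changes.
    'ETag' => '"' . md5($size . ':' . $mtime) . '"',
  );
}

// Example with made-up values.
print_r(private_file_headers('image/jpeg', 2048, 1344868500, 86400 * 365, time()));
```

Deriving the ETag from size and modification time is one possible choice. A hash of the URI alone stays the same when the file is replaced.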

Angry Dan’s picture

Correct me if I'm wrong, but isn't hook_file_download() also used to control file access? And if so, doesn't your code make all files accessible all of the time?
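
For context, a simplified model of how the return values of hook_file_download() implementations are combined, as I understand Drupal 7's file_download(): -1 from any module denies access, an array of headers grants it, and NULL means "no opinion". The function combine_file_download_results() is a hypothetical stand-in written in plain PHP, not Drupal's actual code.

```php
<?php
/**
 * Simplified model of combining hook_file_download() results.
 *
 * @param array $results One return value per module: -1, NULL, or an array
 *                       of headers.
 *
 * @return array|false Merged headers if access is granted, FALSE if denied.
 */
function combine_file_download_results(array $results) {
  $headers = array();
  foreach ($results as $result) {
    // A single explicit denial wins over everything else.
    if ($result === -1) {
      return FALSE;
    }
    // Any returned headers count as a grant and are merged together.
    if (is_array($result)) {
      $headers = array_merge($headers, $result);
    }
  }
  // No module returned headers: nobody vouched for the file, so deny.
  return count($headers) ? $headers : FALSE;
}

// A hook that always returns headers grants access on its own:
var_dump(combine_file_download_results(array(NULL, array('ETag' => '"abc"'))) !== FALSE);
// bool(true)
```

Under this model, a caching hook that returns headers for every URI would grant access to every file unless another module explicitly denies it.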

Angry Dan’s picture

And based on my comments and the code above, here's what I've ended up doing:

/**
 * Implements hook_file_download().
 */
function mymodule_system_file_download($uri) {
  if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE'])) {
    $ifs = strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']);
    $file = db_select('file_managed', 'fm')
      ->fields('fm')
      ->condition("uri", $uri)
      ->execute()
      ->fetchAssoc();
    if (!empty($file)) {
      $modified = $file['timestamp'];
      if ($modified < $ifs) {
        header($_SERVER['SERVER_PROTOCOL'] . ' 304 Not Modified');
        exit();
      }
    }
  }

  // Don't return headers to avoid granting access to all files.
  $max_age = variable_get('page_cache_maximum_age', 0);
  drupal_add_http_header('Cache-Control', 'private, max-age=' . $max_age);
  drupal_add_http_header('Last-Modified', gmdate(DATE_RFC1123, REQUEST_TIME));
  drupal_add_http_header('ETag', substr(md5($uri), 0, 10));
}

defconjuan’s picture

Thanks, tested and working without granting access to all files.

jim_at_miramontes’s picture

... presumably because I'm unable to get a value into $_SERVER['HTTP_IF_MODIFIED_SINCE']. Initially, HTTP_IF_MODIFIED_SINCE wasn't showing up in $_SERVER at all. I was able to get it in via a modification to my Apache config file suggested by http://serverfault.com/questions/342143/image-caching-using-http-if-modi... The variable is now present in $_SERVER, but never gets a value. As far as I can tell, the drupal_add_http_header() call that is meant to generate the Last-Modified header is working properly.

As an aside, if I load the image into a browser and temporarily hack the above code so that I'm guaranteed to go down the path to generate a 304 header, I do in fact get such a header -- "GET /system/files/the_test_file.jpg HTTP/1.1" 304 - "-" ..., but the image does not appear in the browser.

Any thoughts about this? It's kinda driving me crazy, and I really need to get it working....

EDIT: It seems to be working, in fact. I was testing my code by simply putting an appropriate private image into a browser and refreshing it over and over again; this failed as described. Oops -- this was forcing a reload of the image; the browser was doing just what I had asked it to do. When I check the state of the image inside a page, the caching is working properly. Sorry for the distraction, but maybe this will be helpful to someone else in the future.

jduhls’s picture

I had a page with tons of private download thumbnails, averaging about 400ms each, with a total page load of about 13s. This custom hook cut the thumbnail average to about 230ms and the total page load to about 9s. Good improvement. Thanks!

Anonymous’s picture

I rolled it into a custom module, and now my browser queries Drupal and gets a 304 for cached images.

I also found this module https://www.drupal.org/project/private_image_cache but did not investigate.