Let's try to make a good VCL for people using Varnish 3 and Drupal 7.

There is not enough documentation on the internet yet, and the code below is a combination of dozens of articles that I have read, so there may be some confusion in it.

About the code at the bottom:
1. Is anything good or important missing?
2. Is anything redundant that could be removed for cleaner reading or better performance?
3. Is anything in the wrong order that could be moved for better performance?
4. Where is it better to remove backend headers that are not needed: in vcl_fetch or in vcl_deliver?
5. Based on https://www.varnish-cache.org/docs/trunk/tutorial/purging.html, is it better to ban or to purge? What value should the ban_lurker_sleep parameter have? Does this module call ban or purge, and when?

backend default {
  .host = "127.0.0.1";
  .port = "9880";
  .connect_timeout = 600s;
  .first_byte_timeout = 600s;
  .between_bytes_timeout = 600s;
}

acl purge {
  "localhost";
  "127.0.0.1";
}

sub vcl_recv {
  
  # Cloudflare
  remove req.http.X-Forwarded-For;
  if (req.http.cf-connecting-ip) {
    set req.http.X-Forwarded-For = req.http.cf-connecting-ip;
  } else {
    set req.http.X-Forwarded-For = client.ip;
  }
	
  if (req.request != "GET" &&
    req.request != "HEAD" &&
    req.request != "PUT" &&
    req.request != "POST" &&
    req.request != "TRACE" &&
    req.request != "OPTIONS" &&
    req.request != "DELETE") {
      /* Non-RFC2616 or CONNECT which is weird. */
      return (pipe);
  }
  
  if (req.http.Expect) {
    return (pipe);
  }

  if (req.url ~ "^/admin/content/backup_migrate/export") {
    return (pipe);
  }

  # Not cacheable by default
  if (req.request != "GET" && req.request != "HEAD") {
    return (pass);
  }

  # Do not cache these paths.
  if (req.url ~ "^/install\.php$" ||
      req.url ~ "^/update\.php$" ||
      req.url ~ "^/cron\.php$" ||
      req.url ~ "^/admin\.php$" ||
      req.url ~ "^/batch\.php$" ||
      req.url ~ "^/status\.php$" ||
      req.url ~ "^/batch/.*$") {
    return (pass);
  }

  if (req.http.Authorization || req.http.Cookie) {
    return (pass);
  }

  if (req.url ~ "(?i)\.(png|gif|jpeg|jpg|ico|swf|css|js|htm|html)(\?[a-z0-9]+)?$") {
    unset req.http.Cookie;
  }
   
  # Remove cookies. has_js, google analytics, drupal related, google ads, piwik, cloudflare-uid
  set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|has_js|Drupal.toolbar.collapsed|Drupal.tableDrag.showWeight|__gads|_pk|__cfduid)=[^;]*", "");
  # Remove a ";" prefix, if present.
  set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
  # Remove empty cookies.
  if (req.http.Cookie ~ "^\s*$") {
    unset req.http.Cookie;
  }
  
  # Normalize the Accept-Encoding header
  if (req.http.Accept-Encoding) {
    if (req.url ~ "\.(jpg|png|gif|gz|tgz|bz2|tbz|mp3|ogg)$") {
      # No point in compressing these
      remove req.http.Accept-Encoding;
    }
    elseif (req.http.Accept-Encoding ~ "gzip") {
      set req.http.Accept-Encoding = "gzip";
    }
    else {
      # Unknown or deflate algorithm
      remove req.http.Accept-Encoding;
    }
  }
  
  # Do not allow outside access to cron.php or install.php.
  if (req.url ~ "^/(cron|install)\.php$") {
    error 404 "Page not found.";
  }
  
  # fixing global redirect
  if (req.url ~ "node\?page=[0-9]+$") {
    set req.url = regsub(req.url, "node(\?page=[0-9]+$)", "\1");
    return (lookup);
  }

  if (req.request == "PURGE") {
    if (!client.ip ~ purge) {
      error 405 "Not allowed.";
    }
    ban("req.url == " + req.url);
    error 200 "Purged.";
  }

  # Let's have a little grace
  set req.grace = 30s;
  
  return (lookup);
}


sub vcl_fetch {
  if (req.url ~ "(?i)\.(png|gif|jpeg|jpg|ico|swf|css|js)(\?[a-z0-9]+)?$") {
    unset beresp.http.set-cookie;
  }

  unset beresp.http.Server;
  unset beresp.http.X-drupal-cache;
  unset beresp.http.Etag;

  remove req.http.X-Forwarded-For;
  set req.http.X-Forwarded-For = req.http.rlnclientipaddr;
  if (req.url ~ "^/w00tw00t") {
    error 403 "Not permitted";
  }

  # Allow items to be served stale if needed: keep objects for up to 1800s (30 min) past their TTL.
  set beresp.grace = 1800s;

  return (deliver);
}

sub vcl_deliver {
   remove resp.http.X-Varnish;
   remove resp.http.Via;
   remove resp.http.Age;
   remove resp.http.X-Mod-Pagespeed;
   remove resp.http.X-Powered-By;
}

sub vcl_pipe {
    # http://www.varnish-cache.org/ticket/451
    # This forces every pipe request to be the first one.
    set bereq.http.connection = "close";
}
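
On question 3, two ordering effects in the vcl_recv above can be read directly from the code: a PURGE request is piped by the first method check and so never reaches the PURGE block, and any request carrying a Cookie header is passed before the cookie-stripping lines run. A minimal sketch of one possible ordering follows, assuming the same purge ACL; the cookie list is shortened for illustration, and this is not a complete replacement for the function above.

```vcl
sub vcl_recv {
  # 1. Handle PURGE first, before any pipe/pass decision can catch it.
  if (req.request == "PURGE") {
    if (!client.ip ~ purge) {
      error 405 "Not allowed.";
    }
    ban("req.url == " + req.url);
    error 200 "Purged.";
  }

  # 2. Pipe methods that are not in the standard list.
  if (req.request != "GET" && req.request != "HEAD" &&
      req.request != "PUT" && req.request != "POST" &&
      req.request != "TRACE" && req.request != "OPTIONS" &&
      req.request != "DELETE") {
    return (pipe);
  }

  # 3. Only GET and HEAD are candidates for caching.
  if (req.request != "GET" && req.request != "HEAD") {
    return (pass);
  }

  # 4. Strip cookies that do not affect the page (list shortened here).
  if (req.http.Cookie) {
    set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|has_js)=[^;]*", "");
    set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
    if (req.http.Cookie ~ "^\s*$") {
      unset req.http.Cookie;
    }
  }

  # 5. Only now pass requests that still carry credentials.
  if (req.http.Authorization || req.http.Cookie) {
    return (pass);
  }

  return (lookup);
}
```

With this ordering, an anonymous visitor who only has analytics cookies can still be served from cache, while a visitor with a remaining session cookie is passed to the backend.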

Comments

lochii’s picture

Are you using this in production now?

As for your banning vs. purging question, I believe that banning is ultimately more powerful, which is why the Drupal module uses it (e.g., try doing a full cache flush with purge(); it wouldn't easily be possible).

The problem is the ban lurker (see This Blog Article).

The important point the article makes is that the ban lurker can only operate on bans that use obj.* and not req.* (which is unfortunately what Drupal issues). Worse still, obj.* doesn't contain the URL or the Host fields! You need a bit of VCL to copy these from the request into the object via beresp (and then you can strip them out again on delivery):

sub vcl_fetch {
  set beresp.http.x-url = req.url;
  set beresp.http.x-host = req.http.host;
}

sub vcl_deliver {
  unset resp.http.x-url; # Optional
  unset resp.http.x-host; #Optional 
}

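sub vcl_recv {
  if (req.request == "PURGE") {
    if (client.ip !~ purge) {
      error 401 "Not allowed";
    }
    # Assumes req.url is a regex. This might be a bit too simple.
    # Varnish 3 requires "+" for string concatenation.
    ban("obj.http.x-url ~ " + req.url);
    # Answer the PURGE here instead of falling through to the default logic.
    error 200 "Banned";
  }
}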

(source: This Article)

You can then happily issue a ban like such:

ban obj.http.x-host == "example.com" && obj.http.x-url ~ "\.png$"

and it will be ban-lurker friendly and get tidied up for you once it is no longer needed.
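
The same idea can be applied to the PURGE handler itself, so that each ban is scoped to one host. This is only a sketch for Varnish 3: it assumes the purge ACL from the original VCL and the x-host / x-url headers copied in vcl_fetch as shown above, and it treats req.url as a regex just like the earlier snippet.

```vcl
sub vcl_recv {
  if (req.request == "PURGE") {
    if (client.ip !~ purge) {
      error 405 "Not allowed";
    }
    # Both conditions test obj.* only, so the ban lurker can
    # evaluate this ban in the background.
    ban("obj.http.x-host == " + req.http.host +
        " && obj.http.x-url ~ " + req.url);
    error 200 "Banned";
  }
}
```

Matching the host exactly keeps a purge for one site from invalidating the same path on other sites served by the same Varnish instance.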

halcyonCorsair’s picture

I've posted a good basic Drupal 7, Varnish 3 configuration that I'm using to github here:

https://github.com/halcyonCorsair/varnish-for-drupal

jahwe2000’s picture

The above code to remove Analytics cookies does not work, as it expects an equals sign ('=') somewhere. To remove things without this, we use this code:

 // Remove has_js and Google Analytics cookies.
  set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|has_js)=[^;]*", "");

  // Remove cookies. has_js, google analytics, drupal related, google ads, piwik, cloudflare-uid
  set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(Drupal.toolbar.collapsed|Drupal.tableDrag.showWeight|__gads|_pk|__cfduid)[^;]*", "");

I did not modify the '__[a-z]+' part of the original line, as I do not know whether GA sets an equals sign as well. With this, only the Drupal session cookie remains in our setup with D6. I know this issue is about D7, but for D6 users, anonymous users can be left without a cookie by following this guide:
http://www.advomatic.com/blogs/marco-carbone/drupal-privacy-configuring-...
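
A possible single-pattern variant, offered as a sketch only. It assumes that the Piwik cookies carry a _pk_ prefix followed by a site-specific suffix (for example _pk_id.1.abcd), which would explain why a pattern requiring "_pk=" never matches. The dots in the Drupal cookie names are escaped so that they match literally; __gads and __cfduid are already covered by the "__[a-z]+" alternative.

```vcl
sub vcl_recv {
  if (req.http.Cookie) {
    # "_pk_[^=;]*" is treated as a name prefix; all other names must match exactly.
    set req.http.Cookie = regsuball(req.http.Cookie,
      "(^|;\s*)(__[a-z]+|has_js|Drupal\.toolbar\.collapsed|Drupal\.tableDrag\.showWeight|_pk_[^=;]*)=[^;]*", "");
    # Remove a leading ";" left behind by the substitution.
    set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
    # Drop the header entirely if nothing is left.
    if (req.http.Cookie ~ "^\s*$") {
      unset req.http.Cookie;
    }
  }
}
```

Checking the actual cookie names in the browser first is advisable, since the prefix assumption may not hold for every tracker version.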

klezmer41’s picture

Is this assuming use of the Varnish contrib module? Is there any downside to using that vs. whatever Pressflow 7 comes with?

mgifford’s picture

I'm curious what the impact is of using:

  # fixing global redirect
  if (req.url ~ "node\?page=[0-9]+$") {
         set req.url = regsub(req.url, "node(\?page=[0-9]+$)", "\1");
         return (lookup);
   }

if the Global Redirect module isn't enabled. I don't want to add to debugging problems, particularly with multilingual sites.

Mark Theunissen’s picture

Issue summary: View changes

added new question

mgifford’s picture

Would be great if some of the insights here could be included in the README.

@Mark Theunissen thanks for keeping the Varnish script you published at Four Kitchens up to date. I noticed the date was Jul 03, 2013.

oxy86’s picture

I've published a somewhat lengthy how-to for Varnish 3 and Drupal 7 on my blog along with performance test results and useful commands for monitoring Varnish.
You can download my own default.vcl file here.

vegardx’s picture

Your mileage might vary, but we've had good success with this configuration on numerous sites. You can find the latest iteration of it here: https://gist.github.com/vegardx/b4482d01f9801c841d67

Remember to set the variables that tell Drupal it is behind a reverse proxy, including the upstream proxy address. Also configure this module, set the cache expiration under Development -> Performance, and check the box to serve cached pages for anonymous users.

// Read the correct headers, etc. 
$conf['reverse_proxy'] = TRUE;
$conf['reverse_proxy_addresses'] = array('127.0.0.1');
$conf['reverse_proxy_header'] = 'HTTP_X_FORWARDED_FOR';
$conf['omit_vary_cookie'] = FALSE;

// Specific for this module
$conf['cache_backends'][] = 'sites/all/modules/varnish/varnish.cache.inc';
$conf['cache_class_cache_page'] = 'VarnishCache';

This configuration and module, in combination with the expires module, should be able to give you a hit rate between 0.90 and 1, depending on how many pages are cleared and on your traffic in general.

damienmckenna’s picture

Category: Support request » Task

alx_benjamin’s picture

Guys,

has anyone used Varnish with multilingual support and the language_cookies module?

I need to set the site language based on a language cookie. This can be achieved with the language_cookies module.
But when running under Varnish, the cookie keeps resetting to the site's default language.

My guess is that my Varnish configuration is missing some cookie setting.

Any help is appreciated.

Thank you

manarth’s picture

I noticed that the example VCL at the top has extremely long timeouts (10 minutes):

backend default {
  .host = "127.0.0.1";
  .port = "9880";
  .connect_timeout = 600s;
  .first_byte_timeout = 600s;
  .between_bytes_timeout = 600s;
}

These values are very common in examples all over the web…I think that someone published an example without thinking through those values, and they've simply been replicated everywhere thanks to the miracles of ctrl-c, ctrl-v.

The varnish documentation provides an example with much lower values:

backend www {
  .host = "www.example.com";
  .port = "http";
  .connect_timeout = 1s;
  .first_byte_timeout = 5s;
  .between_bytes_timeout = 2s;
}

Those values are probably a little low for a Drupal site, but 10 minutes also feels far too long.

Here are the values I'm using at the moment:

backend default {
    .host = "127.0.0.1";
    .port = "http";
    .connect_timeout = 5s; 
    .first_byte_timeout = 180s;
    .between_bytes_timeout = 10s;
}

misc’s picture

Status: Active » Fixed

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.