Varnish HTTP accelerator

Intro

Varnish is a lightweight HTTP caching service with a small footprint that is highly customizable.

A basic Varnish installation consists of Varnish sitting in front of your web server (referred to as the back-end). Clients interact solely with Varnish and never know there is a back-end web server (nor should they have direct access to it). When a request for a cacheable page comes in to Varnish over a standard HTTP connection, Varnish serves the page directly from its cache if it is there, without sending a request to the back-end. If the page is not in cache, Varnish sends an HTTP request to the back-end over its TCP socket, serves the response to the client, and holds it in cache for as long as it is configured to.

As you may have gathered from the above, when a page is in cache Varnish sends it to the client without ever contacting the back-end. This makes your back-end access logs less valuable, as they only record requests that were not served from cache. Varnish’s solution is the “varnishncsa” daemon, which writes NCSA-formatted logs to a separate Varnish log file (it is included in the Varnish package provided by EPEL, but runs as a separate service).

Another good tool in the default Varnish package is “varnishstat”, a statistics tool that can be used with a running copy of Varnish. Varnishstat provides useful figures such as the cache hit-to-miss ratio, along with a lot more which I have yet to dive in to.
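As a rough sketch of pulling a hit ratio out of varnishstat, the shell function below reads the one-shot output of varnishstat -1 on standard input and prints hits as a percentage of hits plus misses. The function name varnish_hit_ratio is made up for this example, and the counter names cache_hit and cache_miss (optionally with a prefix such as MAIN.) are an assumption about your Varnish version, so check your own varnishstat output first.

```shell
# Print the cache hit ratio from "varnishstat -1" output read on stdin.
# Assumption: the counters are named cache_hit and cache_miss, possibly
# with a dotted prefix, and the value is the second column.
varnish_hit_ratio() {
    awk '
        $1 == "cache_hit"  || $1 ~ /\.cache_hit$/  { hit  = $2 }
        $1 == "cache_miss" || $1 ~ /\.cache_miss$/ { miss = $2 }
        END {
            total = hit + miss
            if (total == 0) { print "n/a"; exit }   # nothing counted yet
            printf "%.1f%%\n", 100 * hit / total
        }
    '
}
```

On a server with Varnish running it could be used as: varnishstat -1 | varnish_hit_ratio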

Yet another useful tool, though not provided in the package, is “curl”. With curl we can check the HTTP headers Varnish returns to determine whether a page was pulled from cache or from the back-end, and how long it has been in cache. Using curl with the -I option pulls only the headers from a page, using a HEAD request rather than a GET request (example: curl -I http://website.com/).

HTTP/1.1 200 OK
Server: Apache
X-Powered-By: PHP/5.2.9
Content-Type: text/html; charset=UTF-8
Content-Length: 7040
Date: Thu, 13 Aug 2009 22:52:08 GMT
X-Varnish: 1462348119
Age: 0
Via: 1.1 varnish
Connection: keep-alive

 
HTTP/1.1 200 OK
Server: Apache
X-Powered-By: PHP/5.2.9
Content-Type: text/html; charset=UTF-8
Content-Length: 7040
Date: Thu, 13 Aug 2009 22:52:09 GMT
X-Varnish: 1462348120 1462348119
Age: 2
Via: 1.1 varnish
Connection: keep-alive

If, looking at the above responses, you guessed the first was not served from cache and the second was, you would be correct. Varnish adds an X-Varnish header in which the first number is the ID of the current request and the second is the ID of the request which generated the cached object (if the second ID is not present, the response did not come from cache). Another useful piece of information is the Age header, which shows in seconds how long the item has been held in cache.
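The X-Varnish check described above can be scripted. The sketch below is a small shell function (the name varnish_cache_status is hypothetical) that reads response headers on standard input and prints HIT when X-Varnish carries two IDs, MISS when it carries one, and UNKNOWN when the header is absent. It assumes the header format shown in the samples above.

```shell
# Classify a response as a cache HIT or MISS from its headers (read on stdin).
# Assumption: a hit carries two IDs in X-Varnish and a miss only one.
varnish_cache_status() {
    awk '
        BEGIN { status = "UNKNOWN" }
        { sub(/\r$/, "") }                    # strip the CR ending each HTTP header line
        tolower($1) == "x-varnish:" {
            status = (NF >= 3) ? "HIT" : "MISS"
        }
        END { print status }
    '
}
```

Combined with the curl example from earlier it could be run as: curl -sI http://website.com/ | varnish_cache_status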

Installation

As mentioned above, the Varnish package is available from EPEL, and it should be safe to subscribe the server to this channel. To subscribe a server to EPEL, visit https://fedoraproject.org/wiki/EPEL/FAQ#howtouse. Once the server is subscribed you can install Varnish by running yum install varnish.

Key files

/etc/sysconfig/varnish : The default Varnish daemon’s configuration.
/etc/varnish/default.vcl : The default Varnish VCL (Varnish Configuration Language) file.
/etc/init.d/varnish : The Varnish initialization script.
/etc/init.d/varnishncsa : The Varnish NCSA Logging daemon’s initialization script.

Configuration

After installation you will first want to have a look at the /etc/sysconfig/varnish configuration. I will go over a few key pieces of this file below.

DAEMON_OPTS="-a 0.0.0.0:80 \
-T localhost:6082 \
-f /etc/varnish/default.vcl \
-u varnish -g varnish \
-s file,/var/lib/varnish/varnish_storage.bin,128M"

Above is the DAEMON_OPTS variable, which should be the only portion we need to modify to get Varnish working as intended, as it holds the options passed to the Varnish daemon command. The first option (-a) sets the TCP socket Varnish listens on (in this case all interfaces on port 80). The -T option sets the telnet management socket (you can telnet in to Varnish to reach an admin CLI, which can be used to clear the cache and much more while keeping the service up and running). The -f option names the VCL file to use as the configuration for this Varnish daemon, and of course -u and -g set the user and group to run Varnish under. The -s option sets where to keep the Varnish cache file and how large it may grow (this is the cache size limit).

Next we will look at /etc/varnish/default.vcl. This is the default file and can be renamed; if you rename it, you will need to update the -f option in /etc/sysconfig/varnish to match. At the top of the VCL file you will find the backend block, which sets the back-end’s listening address and port and should be configured to match your web server.

backend default {
.host = "127.0.0.1";
.port = "8080";
}

As the VCL file uses a programming-like syntax it is very flexible and can get complex. I will not go into detail here, but I have placed a sample VCL file below.

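backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

# Runs when a request arrives from a client.
sub vcl_recv {
    # Send this host straight to the back-end, bypassing the cache.
    if (req.http.host == "nocache.DOMAIN.com") {
        pass;
    }

    # Everything else is looked up in the cache.
    lookup;
}

# Runs after a response has been fetched from the back-end.
sub vcl_fetch {
    if (req.request == "GET") {
        # Drop cookies so the response can be cached, and keep it for 30 minutes.
        unset obj.http.Set-Cookie;
        set obj.ttl = 30m;
        deliver;
    }
}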

Lastly, you will want to enable the varnish and varnishncsa services using chkconfig (chkconfig varnish on && chkconfig varnishncsa on). After enabling these services, and assuming everything is configured properly and you are ready to begin testing, start them up (service varnish start && service varnishncsa start). Once varnishncsa is running you will find its log file at /var/log/varnish/varnishncsa.log, which you can use for review and/or analytics.
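As one small example of putting the varnishncsa log to use, the sketch below counts responses per HTTP status code. The function name ncsa_status_counts is made up for this example, and it assumes the status code is the 9th whitespace-separated field, as in the usual NCSA common/combined layout; adjust the field number if your log lines differ.

```shell
# Count responses per HTTP status code in NCSA-format log lines read on stdin.
# Assumption: the status code is the 9th whitespace-separated field.
ncsa_status_counts() {
    awk '{ count[$9]++ } END { for (s in count) print s, count[s] }' | sort
}
```

It could be run against the log as: ncsa_status_counts < /var/log/varnish/varnishncsa.log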

One last thing to note: because Varnish serves a raw copy of a past response, cookies do not work with cached pages, so it is quite possible Varnish may break key dynamic portions of your web site. In this situation it is best to pass those requests directly to the back-end (notice the pass; statement used in the sample above). One page I have found not to work through Varnish is my web mail, as it relies heavily on cookies.

Author: Jeffrey Ness <jness@flip-edesign.com>

Date: 2010-03-22 10:13:00 CDT
