Why Pound is awesome in front of Varnish

We all know Varnish is awesome. I went as far as presenting a topic on Varnish then writing about it. This is a known fact.

However, what happens to all that caching goodness when you want to run your entire site over SSL? Out of the box, Varnish doesn't support it. While I've heard some mention that not supporting SSL is an oversight, there is some very sound reasoning for why not.

So how do people terminate SSL?

What is Pound?

Without copying exactly what's on the Pound documentation, or the Wikipedia entry about Pound, it's essentially a reverse proxy, SSL terminator and load balancer but NOT a webserver. It's small, easy enough to install and has minimal configuration. Stunnel is similarly simple, but since I have quite extensive experience using Stunnel, I decided to learn something new.

On my load balancing servers, Pound listens on port 443 and Varnish listens on port 80. When traffic comes in on port 443, it hits Pound, gets decrypted using my server certificate and then gets passed to Varnish on port 80. By putting all traffic through Varnish, I'm able to take advantage of its caching ability for both HTTP and HTTPS traffic.

It's almost that simple. I had to make some minor changes to my VCL to receive and cache mixed mode traffic. Prior to these changes, I would sometimes deliver resources using the HTTP scheme to pages delivered over HTTPS. This had the understandable effect of causing my browser to complain about insecure resources.

Getting Varnish and Pound to play nicely

Realising that we need to handle HTTP/HTTPS traffic differently in Varnish, even though it all comes in on port 80, I decided to use a separate cache hash key for each. Varnish uses hashes of the URI as a key to store and look up data by. My VCL implements the vcl_hash subroutine to detect HTTPS traffic and alter the hash key. We add a header in Pound to tell Varnish that the traffic came in over SSL and then watch the magic happen.

pound.cfg

ListenHTTPS
  Address 0.0.0.0
  Port 443
  HeadRemove "X-Forwarded-Proto"
  AddHeader "X-Forwarded-Proto: https"
  Cert "/etc/ssl/certs/adammalone.net.pem"
End
 
Service
  HeadRequire "Host:.*adammalone.net.*"
    Backend
      Address 127.0.0.1
      Port 80
    End
End

default.vcl - vcl_hash {}

sub vcl_hash {
  hash_data(req.url);
  if (req.http.host) {
    hash_data(req.http.host);
  } else {
    hash_data(server.ip);
  }
  # Use special internal SSL hash for https content
  # X-Forwarded-Proto is set to https by Pound
  if (req.http.X-Forwarded-Proto ~ "https") {
    hash_data(req.http.X-Forwarded-Proto);
  }
  return (hash);
}

The hash_data function allows us to add further information to the hash. By adding 'https' to the host and uri information, we're altering the hash in such a way that it is different from just the host + uri that an http request would use.
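
To make the effect of the extra hash_data call concrete, here is a small Python model of the idea. It is only a sketch and not how Varnish builds its keys internally: the cache_key function and the choice of SHA-256 are assumptions made for illustration. It shows that feeding the extra 'https' value into the digest gives the HTTPS variant of a page a different key from the HTTP variant.

```python
import hashlib


def cache_key(url, host, forwarded_proto=None):
    """Toy model of the vcl_hash logic: url + host, plus the proto header for HTTPS."""
    digest = hashlib.sha256()
    digest.update(url.encode("utf-8"))
    digest.update(host.encode("utf-8"))
    # Mirrors: if (req.http.X-Forwarded-Proto ~ "https") { hash_data(...); }
    if forwarded_proto and "https" in forwarded_proto:
        digest.update(forwarded_proto.encode("utf-8"))
    return digest.hexdigest()


# Same URL and host, once over plain HTTP and once via Pound (header set to https).
http_key = cache_key("/blog", "adammalone.net")
https_key = cache_key("/blog", "adammalone.net", "https")
print(http_key != https_key)
```

Because the two keys differ, a page cached for an HTTP visitor is never handed to an HTTPS visitor, and vice versa.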

I've also attached a downloadable copy of my full Pound config and the puppet manifest that generates it for people who are interested in replicating this functionality. I'm using my Pound puppet class located at typhonius/puppet-pound, a fork of mrintegrity/puppet-pound.

Drupal configuration

The final thing to do is to inform Drupal it needs to be in SSL mixed mode and to enter a small snippet in my settings.php so it can be turned on or off based on the incoming request. If Varnish is running on the same server as your Drupal installation, you'll need to replace www.xxx.yyy.zzz with 127.0.0.1. Otherwise it'll be the IP of your load balancing server.

// Varnish Settings
$conf['reverse_proxy'] = TRUE;
$conf['reverse_proxy_addresses'] = array('www.xxx.yyy.zzz');
$conf['reverse_proxy_header'] = 'HTTP_X_FORWARDED_FOR';
$conf['page_cache_invoke_hooks'] = FALSE;
 
if (isset($_SERVER['HTTP_X_FORWARDED_PROTO']) && $_SERVER['HTTP_X_FORWARDED_PROTO'] == 'https') {
  $_SERVER['HTTPS'] = 'on';
}

This is how I allow SSL through Varnish. If you do it differently, add a comment!

Attachments
pound.cfg_.txt (1011 bytes)
puppet_pound.pp_.txt (1004 bytes)

Comments

Submitted by John_B (not verified) on

I set up a Pound > Varnish > Apache stack a couple of days ago. It was kind of a pain: the constant flux in Varnish VCL (it just changed again in Varnish 4) was one problem, turning the SSL cert into a .pem that Pound recognized was another, and setting up and debugging redirects (variously in the Apache conf, in .htaccess, and in the .vcl) was a third.

I followed a suggestion where Varnish tests for incoming port, rather than the "X-Forwarded-Proto: https", to identify HTTPS requests: do you think there is any reason to prefer one or the other?

After getting the whole thing working, I started to think, 'what was all that about? nginx could have done it far more simply.' It is a pity no one seems to have done any benchmarks. Like you I happen to like Apache and Varnish, and am less keen on or familiar with nginx, but that is not a good enough reason to use an over-complex and possibly less efficient setup, with three pieces of software (where using Pound to detect https is pretty much a hack to deal with a Varnish shortcoming) instead of one piece of software.

I wouldn't say Varnish is in a state of flux. Varnish 3 has been around as a stable release for over three years now and I doubt it will be obsolete any time soon.

Changing a .crt and .key into a .pem is just a case of concatenating the files (provided they're in pem format).
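
As a rough sketch of that concatenation step in Python: the build_pem helper and the example file names below are hypothetical, and the certificate-then-key order is an assumption, so check what your Pound version expects. The placeholder contents only stand in for real PEM data.

```python
import os
import tempfile


def build_pem(part_paths, pem_path):
    """Concatenate PEM-encoded files, in the order given, into one .pem file."""
    with open(pem_path, "w", encoding="ascii") as out:
        for path in part_paths:
            with open(path, "r", encoding="ascii") as part:
                text = part.read()
            # Make sure each part ends with a newline so that the
            # END/BEGIN marker lines of adjacent parts never run together.
            out.write(text if text.endswith("\n") else text + "\n")


# Demonstration with placeholder content in a temporary directory.
workdir = tempfile.mkdtemp()
crt_path = os.path.join(workdir, "example.crt")
key_path = os.path.join(workdir, "example.key")
pem_path = os.path.join(workdir, "example.pem")
with open(crt_path, "w", encoding="ascii") as f:
    f.write("-----BEGIN CERTIFICATE-----\nplaceholder\n-----END CERTIFICATE-----")
with open(key_path, "w", encoding="ascii") as f:
    f.write("-----BEGIN PRIVATE KEY-----\nplaceholder\n-----END PRIVATE KEY-----\n")

build_pem([crt_path, key_path], pem_path)
```

The resulting file is what the Cert directive in pound.cfg would point at.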

I usually keep redirects in .htaccess and then add a conditional in my VCL to cache 301s for a bit to keep them from hitting the web.

I've not actually heard of Varnish being able to distinguish the port a request is coming in on. Wouldn't that require it bind to more than just port 80 for mixed mode traffic? Most methods that pass data through Varnish will pipe it into port 80 anyway (the Pound method does just that in its backend declaration).

I use nginx locally for development, and to keep myself in touch with the configs. One of the reasons I like Pound is that it's very limited in its scope. It does a couple of things and is overall pretty simple. If you're running a large hosting platform, like Acquia, then some of the additional features nginx provides will totally make the difference. For my limited purposes (and the limited memory on my tiny servers), Pound keeps it simple. 

Submitted by Diogo (not verified) on

Thank you for sharing your tip. Today I use SSL with Apache mod_ssl, but I will migrate to Pound soon.

I don't understand whether it is really necessary to use vcl_hash. If I want to detect requests from Pound, I can check the header in vcl_recv, right?

Using vcl_hash stores the HTTP traffic and the HTTPS traffic with separate cache hashes. If the same hash key is used, then a request to http://example.com will be equivalent to one for https://example.com.

This is undesirable if a user expects an SSL page and then receives static assets from a non-SSL source. Usually browsers prohibit this and pages can look unstyled or be missing media assets.

Altering the hash key generated within vcl_hash (which gets called anyway after vcl_recv) allows us to keep SSL and non-SSL apart even with mixed mode traffic.

Submitted by John_B (not verified) on

Yes, updating Varnish takes *me* too long. More skilled people will get there quicker! Still it was annoying having to change VCL from Varnish 2 - 2.1 - 3.0 - 4.0 (not that I need 4 but you get it anyway if relying on apt-get, unless you make a point of asking for 3.0).

I guess if I redirect all traffic to https, there is no risk of mixed http and https in static resources?

Getting Varnish listening on two ports: "DAEMON_OPTS="-a :9080,:9443 \" (thanks to http://blog.ajnicholls.com/varnish-apache-and-https/ for that).

In case others are scratching their heads over updating the VCL snippet in the article I linked for Varnish 4.0, here it is.
Varnish 3 version:
sub vcl_recv {
# Set the director to cycle between web servers.
if (server.port == 9443) {
set req.backend = default_ssl;
....

Varnish 4 version:
# add to header
import std;
# no change needed to setting up a backend default_ssl

sub vcl_recv {
# change syntax as follows
if (std.port(server.ip) == 9443) {
set req.backend_hint = default_ssl;

Oh interesting! I never knew Varnish could listen on more than one port. That itself brings even more flexibility to the table; perhaps I'll throw Varnish in front of some more services! The method described on the other blog seems to get round using a different hash key by using ports instead; many ways to skin a cat!

If all requests come in via HTTPS then there'll be no issues with HTTP resources leaking in (unless an http:// URL is hard-coded in the markup of the site). Resources and internal links created with relative links or without a scheme (//example.com) should be served by default with the same scheme that the site is accessed by.
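
A quick way to see how relative and scheme-less references inherit the page's scheme is Python's standard urljoin, which follows the usual URL resolution rules. The page URL and file paths below are made up for illustration, and resolve is just a thin hypothetical wrapper.

```python
from urllib.parse import urljoin


def resolve(page_url, reference):
    """Resolve a reference found in a page against that page's own URL."""
    return urljoin(page_url, reference)


page = "https://example.com/node/1"

# Scheme-less reference: inherits https from the page.
scheme_less = resolve(page, "//example.com/sites/default/files/style.css")
# Relative path: also resolved against the https page URL.
relative = resolve(page, "/sites/default/files/logo.png")
# Hard-coded http reference: stays http, which is what triggers browser warnings.
hard_coded = resolve(page, "http://example.com/sites/default/files/script.js")

print(scheme_less)
print(relative)
print(hard_coded)
```

Only the hard-coded reference keeps the http scheme; the other two follow whatever scheme the page itself was loaded with.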

Submitted by Anonymous (not verified) on

This is interesting, I have been researching options for SSL termination in front of varnish to see if there's something more advantageous than nginx or HAproxy - thanks for the writeup!

I have been partial towards HAProxy for simplicity and load balancing advantages.

Would you see your solution with pound more for single-server-setups, or would you add it as another player in the stack, or go with something else for multi-node production deployments?

Submitted by Adam Malone on

Pound can be used for multi-node setups. At heart it's a load balancer that just has some nice SSL termination built in. Potentially traffic on 443 could get bounced through Varnish, then back to Pound:80 for a load balancing step before being directed to one backend server.

I haven't personally tested this in production, however, so your mileage may vary. A lot of larger companies opt for something like NGINX, possibly due to the endless configuration that can be altered.

From the pound webpage:
"Quite a few people have reported using Pound successfully in production environments. The largest volume reported to date is a site with an average of about 30M requests per day, peaking at over 600 requests/sec."

I'd say an average of 30M requests/day and >600 requests/sec should cover MOST use cases.

Submitted by Thrawn (not verified) on

Recent versions of HAProxy can terminate SSL for you. However, the most scalable solution is still to delegate SSL termination to a farm. Something like this:

- Layer 4 (raw TCP) load balancer distributes traffic to a farm of Apache/Nginx servers doing SSL termination
- Farm servers all send traffic to Varnish
- Optional: Varnish sends traffic to a layer 7 (HTTP) load balancer, so the backend application can be scaled too.

HAProxy can do both layer 4 and layer 7 load balancing.

Submitted by Thrawn (not verified) on

Are you sure it's a good idea to cache SSL resources? There's a reason they normally are not cached. What if the cache serves up a sensitive, user-specific resource to another user - or even *all* users?

SSL is used to serve an encrypted message between the server and the browser. There is no requirement for a user to be logged in or authenticated.

We ensure that content delivered over SSL follows the same rules as content delivered as standard HTTP content; that is, that any logged in user bypasses cache so user specific content can't be stored.
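
The bypass rule can be modelled in a few lines of Python. This is only a sketch of the decision, not the actual VCL: should_bypass_cache is a hypothetical helper, and the cookie pattern assumes Drupal 7 style session cookies named SESS<hash> (or SSESS<hash> over HTTPS).

```python
import re

# Assumption: Drupal 7 style session cookie names, SESS<hash> or SSESS<hash>.
SESSION_COOKIE = re.compile(r"(^|;\s*)S?SESS[a-z0-9]+=")


def should_bypass_cache(cookie_header):
    """Return True when the request carries a session cookie (a logged-in user)."""
    if not cookie_header:
        return False
    return SESSION_COOKIE.search(cookie_header) is not None


print(should_bypass_cache("has_js=1"))                  # anonymous visitor
print(should_bypass_cache("has_js=1; SSESS9f3a=token"))  # logged-in visitor over HTTPS
```

The same test applies whether the request arrived over HTTP or HTTPS, which is why terminating SSL in front of the cache does not change what is safe to store.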

Caching over SSL for logged in users does exist, however. I can't imagine Facebook is 100% uncached; it's just done in a well engineered manner so as not to expose data to the wrong users.

Submitted by ADDISON (not verified) on

I would like to find a technical explanation for why the SSL module has to stay enabled in Apache.

I set up Pound correctly in front of Varnish to deal with SSL. In Apache I made the following changes:

- removed 443 from Listen in ports.conf
- commented out all SSL statements (like SSLProtocol and SSLCipherSuite) in the virtual host file and apache2.conf

Everything works as expected over SSL. Because Apache was relieved of all SSL tasks, I wanted to disable the SSL module. Once it is disabled, the whole configuration stops working, and I had to enable it again.

Why is the SSL module needed in Apache2 when Pound is dealing with the SSL certificate and ciphers and passing an unencrypted connection through Varnish to Apache?

It'd be the same command that you use to purge Varnish without Pound. All Pound does is terminate SSL and pass requests/responses to and from Varnish.
