Web Server Performance Part II: Varnish and Pound

Web Server Part I: General Caching and Apache Connections

In part I of this series, we talked about setting up an Apache server with a smaller memory footprint and a PHP opcode cache. This speeds up our Drupal and other PHP web sites quite a bit, but we could still use some help with CSS, images, JavaScript and other web objects. For this we need a more general caching system, like Varnish.

In this article, we’ll set up Varnish to listen on the normal web port (80) and pass traffic internally to Apache listening on 127.0.0.1:8080. We’ll also tweak Varnish to avoid problems with some common PHP web sites. Finally, we’ll install Pound, which decrypts SSL traffic to enable our HTTPS sites to benefit from Varnish caching as well.

1. Install Varnish

Get the latest version from the Varnish Ubuntu repos:

$ curl http://repo.varnish-cache.org/debian/GPG-key.txt | sudo apt-key add -
$ echo "deb http://repo.varnish-cache.org/ubuntu/ precise varnish-3.0" | sudo tee -a /etc/apt/sources.list
$ sudo apt-get update
$ sudo apt-get install varnish

As of this writing, this gives you Varnish 3.0.4.

2. Set up Varnish to listen for HTTP traffic on port 80

Edit /etc/default/varnish (I’ll only list the settings I’m interested in):

START=yes
DAEMON_OPTS="-a :80 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,256m"

3. Set up Varnish to pass HTTP traffic to Apache

Edit /etc/varnish/default.vcl and add this to the top of the file (not in sub vcl_recv or other subroutines):

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

4. Configure Apache to listen on 127.0.0.1:8080 instead of 80

Edit /etc/apache2/ports.conf to change its listen port:

NameVirtualHost 127.0.0.1:8080
Listen 127.0.0.1:8080

If you have any virtual hosts in /etc/apache2/sites-available, change the address in each <VirtualHost> line to 127.0.0.1:8080 as well.
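
With many virtual hosts, the change can be scripted. The sketch below assumes each vhost opens with a single-address line such as <VirtualHost *:80> (yours may differ), and rewrite_vhost_port is a helper name invented for this example. It reads a file on stdin and writes the result to stdout, so nothing is edited in place.

```shell
#!/bin/sh
# Hypothetical helper: rewrite "<VirtualHost ADDR:80>" to
# "<VirtualHost 127.0.0.1:8080>". Lines with any other port are left alone.
rewrite_vhost_port() {
    sed -E 's/^([[:space:]]*<VirtualHost[[:space:]]+)[^:>[:space:]]+:80([[:space:]]*>)/\1127.0.0.1:8080\2/'
}

# Usage (review the .new file before moving it into place):
#   rewrite_vhost_port < /etc/apache2/sites-available/default > /tmp/default.new
```

Vhosts that list several addresses on one line are not handled and would need editing by hand.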

5. Restart everything

$ sudo service apache2 restart && sudo service varnish restart
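
A typo in either config can leave port 80 with nothing listening, so it may be worth validating both before restarting. This is a minimal sketch: apache2ctl configtest and varnishd -C -f are standard commands, while safe_restart is a name made up here. It needs to run as root.

```shell
#!/bin/sh
# Hypothetical helper: check the Apache and Varnish configs, then restart
# both services. Returns non-zero and restarts nothing if a check fails.
safe_restart() {
    vcl="${1:-/etc/varnish/default.vcl}"
    apache2ctl configtest || { echo "Apache config error" >&2; return 1; }
    # -C translates the VCL to C and prints it; only the exit code matters here.
    varnishd -C -f "$vcl" >/dev/null || { echo "VCL error in $vcl" >&2; return 1; }
    service apache2 restart && service varnish restart
}

# Usage:
#   safe_restart                      # checks /etc/varnish/default.vcl
#   safe_restart /path/to/other.vcl   # checks a different VCL file
```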

6. (Optional) Enable the Varnish logging system

Edit /etc/default/varnishncsa and start the logging service:

VARNISHNCSA_ENABLED=1
$ sudo service varnishncsa start

Once this is running, Varnish will start writing logfiles in NCSA format to /var/log/varnish/. (I don’t find Varnish logging to be all that interesting except for debugging, so I typically leave it disabled.)

Now you have a basic Varnish install listening on port 80 on your server and passing traffic to Apache. You can type varnishstat at the command line to see the cache working in real time. Plenty of other utilities are available to assess how much of your content is being served from cache. If you run the Apache benchmark command from the shell:

$ ab -n 10 -c 5 http://www.mysite.com/

You should see a much higher “Requests per second” figure when Varnish is running and serving cached content than when the same request goes straight to Apache.
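
Response headers give a quicker check than a benchmark. Varnish 3 adds an X-Varnish header to each response; it carries two transaction IDs when the object came from cache and one when it was fetched from the backend. The helper below (varnish_hit is a name invented for this sketch) reads headers on stdin and reports which case applies.

```shell
#!/bin/sh
# Hypothetical helper: read HTTP response headers on stdin and print
#   "hit"     if X-Varnish carries two transaction IDs,
#   "miss"    if it carries one,
#   "unknown" if the header is absent.
varnish_hit() {
    tr -d '\r' | awk '
        tolower($1) == "x-varnish:" { found = 1; n = NF - 1 }
        END {
            if (!found)      print "unknown"
            else if (n >= 2) print "hit"
            else             print "miss"
        }'
}

# Usage (the hostname is the same placeholder as in the ab example):
#   curl -sI http://www.mysite.com/ | varnish_hit
```

Run it twice against a cacheable URL: the first request typically populates the cache and the second should come back as a hit.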

But Varnish still likely needs some tweaking to work properly.

Issues With Varnish

1. My server logs are screwed up

In your Apache access logs, all the client connections are now showing as 127.0.0.1. This is understandable since Varnish is the proxy passing traffic to Apache, but it’s not particularly useful. To fix this, install the Apache RPAF (reverse proxy add forward) module and make a small adjustment to the Varnish config file.

$ sudo apt-get install libapache2-mod-rpaf
$ sudo vi /etc/apache2/mods-available/rpaf.conf
<IfModule rpaf_module>
RPAFenable On
RPAFsethostname On
RPAFproxy_ips 127.0.0.1
RPAFheader X-Forwarded-For
</IfModule>
$ sudo a2enmod rpaf

Finally, add this to the vcl_recv subroutine of your /etc/varnish/default.vcl file and restart everything:

sub vcl_recv {

  remove req.http.X-Forwarded-For;
  set req.http.X-Forwarded-For = client.ip;

  # ... the rest of your vcl_recv rules ...
}

$ sudo service apache2 restart && sudo service varnish restart
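
To confirm the fix, tally the client addresses in the Apache access log. If nearly every request still comes from 127.0.0.1, RPAF is not picking up the header. This sketch assumes the client address is the first field on each line, as in the common and combined log formats, and uses the stock Ubuntu log path; top_clients is a name made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: count requests per client address (first field of
# each log line) and print "count address", busiest address first.
top_clients() {
    awk 'NF { print $1 }' | sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}

# Usage:
#   tail -n 500 /var/log/apache2/access.log | top_clients
```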

2. My code to print the user’s IP address doesn’t work

In the previous step you told Varnish to add an X-Forwarded-For header showing the client’s IP address to be read by the Apache RPAF module. So any code to detect a visitor’s IP address should ask for that header (example here in PHP):

<?php

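// The header may hold a comma-separated list; the client is the first entry.
if( isset( $_SERVER[ 'HTTP_X_FORWARDED_FOR' ] ) ) {
  $forwarded = explode( ',', $_SERVER[ 'HTTP_X_FORWARDED_FOR' ] );
  $real_ip   = trim( $forwarded[0] );
} else {
  // No proxy header: fall back to the address of the connecting socket.
  $real_ip = $_SERVER[ 'REMOTE_ADDR' ];
}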

?>

3. I can’t log in to WordPress/Drupal/phpMyAdmin/etc. -OR- Varnish doesn’t seem to be caching anything

By default, Varnish won’t serve a cached object for a request that carries a Cookie header, and it won’t cache a response that sets a cookie. Since many web applications rely heavily on cookies to let users interact with them, much of your content won’t get cached, which defeats the purpose of running a caching system. At the same time, if you’ve adjusted your .vcl to strip most cookies in order to cache aggressively, you’re probably removing cookies that your web application needs to handle logins or sessions.

Ideally Varnish should remove any cookies that your applications don’t require, so you’re caching as much content as possible while allowing requests with essential cookies to pass through to Apache. There are various configuration options (and complete files) available to tell Varnish to behave this way, and the Lullabot VCL for Drupal is a good place to start.

To optimize your caching, you may need to watch your HTTP headers as you interact with your web applications to see which cookies are being passed back and forth. Utilities like the Firebug or Live HTTP Headers plugins for Firefox can help with this.
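
The server side of this can also be inspected from the shell. The sketch below lists the names of the cookies a response tries to set; set_cookie_names is a hypothetical helper and the URL is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every cookie set in the HTTP
# response headers read from stdin, one per line, without duplicates.
set_cookie_names() {
    tr -d '\r' |
        sed -n -E 's/^[Ss][Ee][Tt]-[Cc][Oo][Oo][Kk][Ii][Ee]:[[:space:]]*([^=;[:space:]]+)=.*/\1/p' |
        sort -u
}

# Usage (-D - dumps the response headers to stdout, -o discards the body):
#   curl -s -o /dev/null -D - http://www.mysite.com/ | set_cookie_names
```

This only shows cookies set by the server. Cookies created by JavaScript in the browser, such as analytics cookies, still have to be observed with a browser tool.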

For WordPress, you could tell Varnish not to cache content accompanied by WordPress login and settings cookies, for example:

sub vcl_recv {

  if (req.http.Cookie ~ "(wordpress_logged_in|wp-settings)") {
    return(pass);
  }

  # ... the rest of your vcl_recv rules ...
}

After that check, you’d want some commands to remove all other unnecessary cookies, like ‘unset req.http.Cookie;’. This ensures that your admin and authenticated users can work normally with your site while you’re still caching a lot of content.

If you want to start from scratch, some basic config templates are available from GitHub.

phpMyAdmin is a bit trickier since it tries to POST to the port Apache is running on (8080), which is only reachable from the server itself. First, add a directive like this to the phpMyAdmin config.inc.php:

$cfg['PmaAbsoluteUri'] = 'http://myserver.com/path-to-phpmyadmin';

Then, assuming phpMyAdmin lives in a directory like myserver.com/phpmyadmin, you could tell Varnish to just ignore it in sub vcl_recv:

  if (req.url ~ "phpmyadmin") {
     return(pass);
  }

And in sub vcl_fetch:

  if (req.url ~ "phpmyadmin") {
     return(hit_for_pass);
  }

Getting all your web sites and applications to cache as much content as possible will take some experimentation. For example, I found it desirable to eliminate has_js cookies and Google’s __utm* tracking cookies in order to cache more content (in sub vcl_recv):

set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|has_js)=[^;]*", "");
set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
if (req.http.Cookie == "") {
    unset req.http.Cookie;
}
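
To see what those two substitutions do before loading them into Varnish, the same patterns can be tried on a sample Cookie value from the shell. In this sketch sed -E stands in for VCL’s regsuball and regsub (an approximation, since Varnish uses PCRE), and strip_tracking_cookies is a name invented for the example.

```shell
#!/bin/sh
# Hypothetical helper: apply the two cookie substitutions to a Cookie header
# value read from stdin. [[:space:]] replaces \s, which plain ERE lacks.
strip_tracking_cookies() {
    sed -E \
        -e 's/(^|;[[:space:]]*)(__[a-z]+|has_js)=[^;]*//g' \
        -e 's/^;[[:space:]]*//'
}

# Usage:
#   echo '__utma=1.2; has_js=1; SESSabc=xyz' | strip_tracking_cookies
```

If the result is an empty line, the VCL version would reach the unset branch and the request becomes cacheable.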

What About SSL?

Varnish doesn’t handle SSL traffic. It probably never will. So if you want an SSL-encrypted web site to be cached with Varnish, you need to intercept and decrypt the SSL traffic before it gets passed to Varnish, with a utility like the Pound reverse proxy and load balancer.

Since we only need Pound to decrypt SSL traffic, we’ll set it up to listen on port 443 and pass to Varnish on port 80.

$ sudo apt-get install pound

Set Pound to start on system boot (in /etc/default/pound):

startup=1

Edit your Pound configuration (/etc/pound/pound.cfg):

User            "www-data"
Group           "www-data"

ListenHTTPS
        Address 10.0.0.1  # put your server's public IP address here
        Port 443
        Cert "/etc/ssl/private/myserver.com.pem"
        HeadRemove "X-Forwarded-Proto"
        AddHeader "X-Forwarded-Proto: https"
        Service
                BackEnd
                        Address 127.0.0.1
                        Port 80
                End
        End
End

The Cert directive points to your web site’s certificate for decrypting SSL traffic, and the “BackEnd” points to the address and port where Varnish is listening. Thus, SSL traffic goes from 443 (Pound) to 127.0.0.1:80 (Varnish) to 127.0.0.1:8080 (Apache). We also add an HTTP header to indicate that we’re forwarding SSL traffic.
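
Pound reads the certificate and its private key from one PEM file. If yours are in separate files, something like the following can combine them. The file names are placeholders, build_pound_pem is a hypothetical helper, and putting the certificate before the key is an assumption that should be checked against your Pound version.

```shell
#!/bin/sh
# Hypothetical helper: concatenate a certificate and its private key into
# one PEM file. Arguments: certificate, key, output path.
build_pound_pem() {
    cert="$1"; key="$2"; out="$3"
    [ -r "$cert" ] && [ -r "$key" ] || { echo "cannot read cert or key" >&2; return 1; }
    # A newly created output file gets owner-only permissions, since it holds the key.
    ( umask 077 && cat "$cert" "$key" > "$out" )
}

# Usage (run as root; file names are placeholders):
#   build_pound_pem myserver.com.crt myserver.com.key /etc/ssl/private/myserver.com.pem
#   pound -c -f /etc/pound/pound.cfg   # parse the config and exit
```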

Incidentally, Pound passes along the client IP address correctly, so for accurate logging we want to adjust our previous Varnish rule to leave X-Forwarded-For untouched on requests that arrive with the ‘X-Forwarded-Proto: https’ header:

if (req.http.X-Forwarded-Proto !~ "https") {
  remove req.http.X-Forwarded-For;
  set req.http.X-Forwarded-For = client.ip;
}

If Apache is already listening on port 443, we need to move it elsewhere. Edit /etc/apache2/ports.conf and any virtual hosts:

<IfModule mod_ssl.c>
   Listen 44333
</IfModule>

Start/restart Apache, Varnish, and Pound:

$ sudo service apache2 restart && sudo service varnish restart && sudo service pound start
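
A quick way to check that each daemon ended up on the intended port is to list the listening TCP ports. The sketch below parses netstat -ltn output, assuming the local address is the fourth column; listening_ports is a name made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct TCP ports found in the
# local-address column (field 4) of `netstat -ltn` output on stdin.
listening_ports() {
    awk '$4 ~ /:[0-9]+$/ { sub(/.*:/, "", $4); print $4 }' | sort -n -u
}

# Usage:
#   netstat -ltn | listening_ports
```

With the setup in this article the list should include 80 (Varnish), 443 (Pound) and 8080 (Apache), plus 44333 if Apache’s SSL listener was moved there.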

Now we have an Apache Worker/PHP-FPM/FastCGI web server running APC as a PHP opcode cache, Varnish acting as a general cache, and Pound acting as a wrapper for SSL traffic. In our next installment we’ll talk about some performance improvements specific to Drupal and WordPress.

The post Web Server Performance Part II: Varnish and Pound appeared first on GeoffStratton.com.

