This question was asked a lot more frequently 10 years ago. In fact, we were asked it so often that we used it as the title on the home page. It amazed me that people would go to all the effort of configuring a cluster of multiple application servers but only use ONE load balancer! That's just plain stupid.

"Thou shalt not buy just one"

These days people understand a lot more about high availability best practices and patterns (N+1 redundancy etc.). The primary function of a load balancer is to keep your application running with no downtime. It is an invaluable tool for systems architects and it has the benefit of being pretty simple to understand.

Before I discuss the fundamentals of ADC methodology, were you actually just looking for information on how to compare solutions from Kemp Technologies, Barracuda Networks &

If you are still with me let's cover a brief bit of history:

What did we do before we had load balancers?

Before load balancers we used DNS servers with multiple A records. The flaw with using DNS was the lack of health checks on the servers. So if a hard drive failed on one of your servers, clients would still be directed to that server (which we all know is very bad).

What is a load balancer?

A load balancer is simply a device that sends internet traffic to a group of application servers with an even load distribution. The key advantage that load balancers have over DNS is that they can ensure the server is responding AND that it is actually giving the correct response!
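To make the "responding AND giving the correct response" point concrete, here is a minimal sketch of that kind of health check. The function names, the 200-only rule and the expected-text check are my own assumptions for illustration, not how any particular appliance implements it.

```python
from http.client import HTTPException
from urllib.request import urlopen


def response_is_healthy(status: int, body: str, expected_text: str) -> bool:
    """Healthy only if the server answered 200 AND the body contains the
    text we expect, so an error page served with a 200 still fails."""
    return status == 200 and expected_text in body


def probe(url: str, expected_text: str, timeout: float = 2.0) -> bool:
    """Fetch the URL and apply the check; any failure counts as unhealthy."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return response_is_healthy(resp.status, body, expected_text)
    except (OSError, HTTPException):
        # Covers refused connections, timeouts and HTTP error statuses.
        return False


def healthy_servers(urls: list[str], expected_text: str) -> list[str]:
    """Keep only the servers that pass the probe. Plain DNS round robin
    has no equivalent of this step."""
    return [url for url in urls if probe(url, expected_text)]
```

Calling `healthy_servers` on a timer and only sending traffic to the returned list is the basic idea; real products add retries and thresholds on top.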

So is it not pretty obvious why we need two of them?

Yes. These days systems engineers have a much better grasp on fundamental high-availability requirements. If your load balancer fails then ALL of your customers will be very upset. So you definitely want a couple of them in a clustered pair.

Lori MacVittie (who is way better at writing than me) has an excellent explanation here:

In fact most of our customers have at least 3 load balancers and preferably 5 or more.
Otherwise how could they run a full development and testing environment before they move to production?

So what makes different?

The appliance is one of the most flexible load balancers on the market. The design of the appliance allows different load balancing modules to utilize the core high availability framework of the appliance. Multiple load balancing methods can be used at the same time or in combination with each other.

How exactly does a load balancer work, and which methods should I use?

All load balancers (Application Delivery Controllers) use the same load balancing methods. I will try to explain some of the advantages and disadvantages of each method. It is very common that people choose a particular method because it is what they were told to do/expected to do rather than what would actually suit their application.

Load balancers traditionally use a combination of routing based OSI Layer 2/3/4 techniques (generally referred to as Layer 4 load balancing). All modern load balancers also support Layer 7 techniques (full application reverse proxy). However, just because the number is bigger doesn't mean it is a better solution for you! A good analogy is - 'would 7 blades on your razor actually be better than 4?'

  • Layer 4, DR (Direct Routing): Ultra-fast local server based load balancing. Requires handling the ARP issue on the real servers.
  • Layer 4, NAT (Network Address Translation): Fast Layer 4 load balancing; the appliance becomes the default gateway for the real servers.
  • Layer 4, TUN: Similar to DR but works across IP encapsulated tunnels.
  • Layer 7, SSL Termination: Usually required in order to process cookie persistence in HTTPS streams on the load balancer. Processor intensive.
  • Layer 7, SNAT (HAProxy): Layer 7 allows great flexibility including full SNAT and WAN load balancing, HTTP or RDP cookie insertion and URL switching.

Direct Routing (DR) load balancing method

The one-arm direct routing (DR) mode is the recommended mode for installation because it's a very high performance solution that requires very little change to your existing infrastructure. NB. Foundry Networks calls this Direct Server Return and F5 calls it N-Path.


  • Direct routing works by changing the destination MAC address of the incoming packet on the fly which is very fast.
  • However, it means that when the packet reaches the real server, the server is expected to own the VIP. You therefore need to make sure the real server accepts traffic for the VIP but does not respond to ARP requests for it.
  • On average, DR mode is 8 times quicker than NAT for HTTP, 50 times quicker for terminal services and much, much faster for streaming media or FTP.
  • Direct routing mode enables servers on a connected network to access either the VIPs or RIPs. No extra subnets or routes are required on the network.
  • The real server must be configured to respond to both the VIP & its own IP address.
  • Port translation is not possible in DR mode, i.e. the RIP port cannot differ from the VIP port.
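The MAC rewrite described above can be sketched in a few lines. This is a toy model working on raw Ethernet frame bytes, and the helper names `mac_to_bytes` and `forward_dr` are hypothetical; a real load balancer does this in the kernel, not in Python.

```python
def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def forward_dr(frame: bytes, real_server_mac: str) -> bytes:
    """Rewrite only the destination MAC (the first 6 bytes of the frame).

    Everything after it is passed through untouched, including the IP
    header (destination is still the VIP) and the TCP header (same port).
    """
    return mac_to_bytes(real_server_mac) + frame[6:]


# Example frame: destination MAC, source MAC, EtherType (IPv4), payload.
frame = (
    mac_to_bytes("00:00:00:00:00:01")    # load balancer's MAC (owns the VIP)
    + mac_to_bytes("00:00:00:00:00:aa")  # sender's MAC
    + b"\x08\x00"
    + b"ip-packet-addressed-to-the-vip"
)
forwarded = forward_dr(frame, "00:00:00:00:00:02")
```

Because nothing above layer 2 is modified, the real server has to accept packets addressed to the VIP, and there is no opportunity to change the port on the way through.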

When using a load balancer in one-arm DR mode all load balanced services can be configured on the same subnet as the real servers. The real servers must be configured to respond to the virtual server IP address as well as their own IP address.

Quite often vendors of application delivery controllers will give a long list of problems with DSR (Direct Server Return) load balancing. Normally, however, they are trying to sell you features that a properly designed application like Exchange 2013 should not need, and then, "surprisingly enough", tell you that you need the most expensive load balancer model in order to use the unnecessary feature they are recommending. A good example that always makes me laugh is the Kemp Technologies sizing tool for Exchange 2013. Kemp insist on enabling SSL termination by default, even though Microsoft spent a lot of time and effort designing Exchange 2013 specifically so that you don't need to terminate SSL on the load balancer! The result is that changing practically ANY variable in the sizing tool recommends the most expensive load balancers!

Network Address Translation (NAT) load balancing method (two arm)

Sometimes it is not possible to use DR mode. The two most common reasons are that the application cannot bind to the RIP and VIP at the same time, or that the host operating system cannot be modified to handle the ARP issue. In those cases the second choice is Network Address Translation (NAT) mode. This is also a fairly high performance solution, but it requires the implementation of a two arm infrastructure with an internal and external subnet to carry out the translation (the same way a firewall works). Network engineers with experience of hardware load balancers will often have used this method.

  • In two-arm NAT mode the load balancer translates all requests from the external virtual server to the internal real servers.
  • The real servers must have their default gateway configured to point at the load balancer.
  • For the real servers to be able to access the internet on their own, e.g. to browse the web, the setup wizard automatically adds the required MASQUERADE rule in the firewall script (some vendors incorrectly call this S-NAT).
  • If you want real servers to be accessible on their own IP address for non-load balanced services, e.g. SMTP, you will need to set up individual SNAT and DNAT firewall script rules for each real server. Or you can set up a dedicated virtual server with just one real server as the target.
  • Please see the advanced NAT considerations section of our administration manual for more details on these two issues.
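The translation in the bullets above can be sketched as a toy model. The `Packet` and `NatBalancer` names are hypothetical, ports are ignored, and connections are tracked by client IP only, which is a simplification I have assumed to keep the example short.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Packet:
    src: str  # source IP address
    dst: str  # destination IP address


class NatBalancer:
    def __init__(self, vip: str, real_servers: list[str]):
        self.vip = vip
        self._next_server = cycle(real_servers)  # simple round robin
        self._connections: dict[str, str] = {}   # client IP -> chosen RIP

    def inbound(self, pkt: Packet) -> Packet:
        """client -> VIP becomes client -> RIP (destination NAT)."""
        rip = self._connections.get(pkt.src)
        if rip is None:
            rip = next(self._next_server)
            self._connections[pkt.src] = rip
        return Packet(src=pkt.src, dst=rip)

    def outbound(self, pkt: Packet) -> Packet:
        """RIP -> client becomes VIP -> client on the way back out.

        This only happens if the reply actually passes through the load
        balancer, hence the default gateway requirement.
        """
        return Packet(src=self.vip, dst=pkt.dst)


lb = NatBalancer("203.0.113.10", ["10.0.0.11", "10.0.0.12"])
request = lb.inbound(Packet("198.51.100.7", "203.0.113.10"))
reply = lb.outbound(Packet(request.dst, "198.51.100.7"))
```

The client only ever sees the VIP in `reply.src`; if the reply took any other route it would arrive from the RIP and be dropped by the client.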

When using a load balancer in two-arm NAT mode, all load balanced services can be configured on the external IP. The real servers must also have their default gateways directed to the internal IP. You can also configure the load balancers in one-arm NAT mode, but in order to make the servers accessible from the local network you need to change some routing information on the real servers.

Now to be honest, we have always found two arm mode a pain in the neck to support and configure. Why? Because it always involves network downtime as you play with configurations and try to get new subnets, VLANs and even DMZs talking to each other.
We had a bank screw this up on a live web site recently...and it was not pretty.

The only reason you usually can't do NAT in one arm mode (same subnet) is that your local servers won't be able to talk to the load balanced cluster unless they change their routing table to use the load balancer as a default gateway (which is very difficult on Windows Server, but that's another story). If, however, all of the clients accessing the cluster are on a different network (like the Internet) then one arm NAT mode is great!
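One way to see the local routing issue: a client on the real servers' own subnet is reachable without going via the default gateway, so the reply never passes back through the load balancer. Here is a small sketch of that check. The function name is hypothetical, and it assumes a single flat subnet for the real servers.

```python
from ipaddress import ip_address, ip_network


def reply_bypasses_load_balancer(client_ip: str, server_subnet: str) -> bool:
    """True if the real server can answer the client directly.

    A client on the real servers' own subnet is reached without using the
    default gateway, so the reply skips the load balancer and arrives from
    the RIP instead of the VIP the client originally connected to.
    """
    return ip_address(client_ip) in ip_network(server_subnet)


local_client = reply_bypasses_load_balancer("192.168.1.50", "192.168.1.0/24")
internet_client = reply_bypasses_load_balancer("198.51.100.7", "192.168.1.0/24")
```

For the local client the result is `True` (the problem case), and for the Internet client it is `False`, which is why one arm NAT works fine when everyone connects from outside.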

One Arm - Network Address Translation (NAT) load balancing method

Now that a lot of our customers are moving to the Amazon cloud with our AWS based load balancer, we are getting a lot more configurations in one-arm NAT mode. The reason is that in AWS your clients are usually on a different network (i.e. the Internet), so any return traffic is always routed correctly.

OK, who spotted the deliberate error? Yes, I don't have a picture for one arm NAT mode ... I'll update it later (honest).
Configuring your servers in AWS can be a real mind bender, but once you get the hang of it - it becomes really easy. I'm a bit rusty so I virtually always have to go and look at Robert Coopers' excellent documentation.

One arm NAT mode is a fast, flexible and easy way to deliver transparent load balancing for your application. The only drawback is the local routing issue.
We use it heavily in the Amazon cloud - Why? Because you can't use DR mode in the Amazon cloud - that's why.

So when would you not use L4 DR mode or L4 NAT mode?

I will give you 3 reasons to use L7 instead:

  1. You need to route traffic to servers on different networks (watch out for high latency!)
  2. You need some kind of fancy application persistence like cookies etc.
  3. Um...lots of other things that you should really fix in your application and not on the load balancer...
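As a rough summary of the advice so far, the choice between the modes could be written down like this. It is only a sketch of the rules of thumb in this article; the function and parameter names are made up for illustration and real deployments will have more inputs than four booleans.

```python
def suggest_mode(
    servers_on_other_networks: bool,
    needs_cookie_persistence: bool,
    can_fix_arp_on_servers: bool,
    running_in_aws: bool,
) -> str:
    """Pick a load balancing mode using the rules of thumb from the text."""
    # Reasons 1 and 2 above: only a Layer 7 proxy can do these.
    if servers_on_other_networks or needs_cookie_persistence:
        return "Layer 7 SNAT"
    # DR is the recommended default when the ARP issue can be handled
    # and the platform allows it (it is not usable in the Amazon cloud).
    if can_fix_arp_on_servers and not running_in_aws:
        return "Layer 4 DR"
    # Otherwise fall back to NAT (two arm, or one arm in AWS).
    return "Layer 4 NAT"
```

For example, `suggest_mode(False, False, True, False)` gives `"Layer 4 DR"`, while the same application moved into AWS gives `"Layer 4 NAT"`.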

Source Network Address Translation (SNAT) load balancing method (layer 7 load balancing)

If your application requires that the load balancer handles cookie insertion then you need to use the SNAT configuration. This also has the advantage of a one arm configuration and does not require any changes to the application servers. However, as the load balancer is acting as a full proxy it doesn't have the same raw throughput as the routing based methods.
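Cookie insertion is easier to picture with a small example. The sketch below is a toy model of the idea only: the cookie name `SERVERID`, the class name and the round robin choice are all assumptions, and it is not HAProxy's actual implementation.

```python
from itertools import cycle

COOKIE_NAME = "SERVERID"  # hypothetical cookie name


class CookiePersistence:
    def __init__(self, servers: dict[str, str]):
        self.servers = servers          # server id -> server address
        self._next_id = cycle(servers)  # round robin over the server ids

    def route(self, cookies: dict[str, str]) -> tuple[str, dict[str, str]]:
        """Return (server address, cookies to set on the response)."""
        server_id = cookies.get(COOKIE_NAME)
        if server_id in self.servers:
            # Returning client: stick to the server named in the cookie.
            return self.servers[server_id], {}
        # New client (or the cookie names an unknown server): choose a
        # server and insert a cookie so later requests come back to it.
        server_id = next(self._next_id)
        return self.servers[server_id], {COOKIE_NAME: server_id}


proxy = CookiePersistence({"s1": "10.0.0.11", "s2": "10.0.0.12"})
first_address, cookies_to_set = proxy.route({})
repeat_address, nothing_to_set = proxy.route(cookies_to_set)
```

The load balancer has to read and rewrite HTTP headers to do this, which is why it needs to be a full proxy rather than a packet router.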

The network diagram for the Layer 7 HAProxy SNAT mode is very similar to the Direct Routing example except that no re-configuration of the real servers is required. The load balancer proxies the application traffic to the servers so that the source of all traffic becomes the load balancer.

  • As with other modes a single unit does not require a Floating IP.
  • SNAT is a full proxy and therefore load balanced servers do not need to be changed in any way.

Because SNAT is a full proxy, any server in the cluster can be on any accessible subnet, including across the Internet or WAN. SNAT is not TRANSPARENT by default, i.e. the real servers will see the source address of each request as the load balancer's IP address. The client's source IP address will be in the X-Forwarded-For header (see TPROXY method).

NB. Rather than messing around with TPROXY, did you know you can load balance based on the X-Forwarded-For header? (pretty cool eh?)
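On the real server side, recovering the client address from that header might look like the sketch below. The function name is hypothetical, and it assumes the request really did come via your own load balancer; the header is just text and can be forged by anyone who connects directly.

```python
def client_ip_from_headers(headers: dict[str, str], peer_ip: str) -> str:
    """Return the original client address for a request behind a proxy.

    X-Forwarded-For is a comma-separated list. Each proxy appends the
    address it received the request from, so the first entry is the
    client as reported by the first proxy in the chain.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    forwarded = lowered.get("x-forwarded-for", "")
    first_entry = forwarded.split(",")[0].strip()
    # Fall back to the TCP peer address when the header is absent or empty.
    return first_entry or peer_ip
```

With `{"X-Forwarded-For": "198.51.100.7, 10.0.0.1"}` and a peer address of `10.0.0.1` this returns `198.51.100.7`; without the header it returns the peer address, which in SNAT mode is the load balancer itself.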

Transparent Source Network Address Translation (SNAT-TPROXY) load balancing method

If the source address of the client is a requirement then HAProxy can be forced into transparent mode using TPROXY. This requires that the real servers use the load balancer as the default gateway (as in NAT mode) and only works for directly attached subnets (as in NAT mode).

  • As with other modes a single unit does not require a Floating IP.
  • SNAT acts as a full proxy but in TPROXY mode all server traffic must pass through the load balancer.
  • The real servers must have their default gateway configured to point at the load balancer.

Transparent proxy is impossible to implement over a routed network, i.e. a wide area network such as the Internet. To get transparent load balancing over the WAN you can use the TUN load balancing method (Direct Routing over secure tunnel) with Linux or UNIX based systems only.

SSL Termination or Acceleration (SSL) with or without TPROXY

All of the Layer 4 and Layer 7 load balancing methods can handle SSL traffic in pass through mode, i.e. the backend servers do the decryption and encryption of the traffic. This is very scalable as you can just add more servers to the cluster to gain higher transactions per second (TPS). However, if you want to inspect HTTPS traffic in order to read or insert cookies you will need to decode (terminate) the SSL traffic on the load balancer. You can do this by importing your secure key and signed certificate to the load balancer, giving it the authority to decrypt traffic. The load balancer uses standard Apache/PEM format certificates.
You can define a Pound/Stunnel SSL virtual server with a single backend: either a Layer 4 NAT mode virtual server or, more usually, a Layer 7 HAProxy VIP, which can then insert cookies.

Pound/Stunnel-SSL is not TRANSPARENT by default, i.e. the backend will see the source address of each request as the load balancer's IP address. The client's source IP address will be in the X-Forwarded-For header. However, Pound/Stunnel-SSL can also be configured with TPROXY to ensure that the backend can see the source IP address of all traffic.