Nigel,

In building a test lab at work, I was faced with the same dilemma. Initially I used round-robin DNS (RRDNS) to balance load between two web servers, but found that IE cached the IP addresses. This made for disappointing results.

I did find two interesting software products though: Balance <http://balance.sf.net/> and Pound <http://www.apsis.ch/pound/index.html>. I tried Balance first since it is written in C, has a small footprint, and has very few features (it's only a TCP proxy with round-robin balancing and failover).

For the last month, I've used it successfully under intense loads and I've been perfectly happy with it. On a 1GHz Athlon (no doubt overkill), I can sustain hundreds of connections and many megabits of throughput without appreciable latency or CPU load. I use it to proxy HTTP and HTTPS connections flawlessly. To top it off, it compiled and was usable within a few minutes. Reading the documentation took less time than the compile, and configuring it was quicker still. As a C programmer, I appreciated the source code and was satisfied with the competence of the authors (the munich.net folks).
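For anyone who wants to see the idea rather than the product, here is a rough Python sketch of round-robin selection with failover, which is the behaviour I described above. It is not Balance's code (Balance is written in C), and the names `make_picker` and `pick` as well as the addresses in the usage note are made up for illustration.

```python
import itertools
import socket


def make_picker(backends):
    """Return a function that connects to the next reachable backend.

    `backends` is a list of (host, port) tuples. Hypothetical helper,
    not part of Balance or any other product.
    """
    ring = itertools.cycle(backends)

    def pick(connect=socket.create_connection, timeout=1.0):
        # Try each backend at most once per call, continuing from
        # wherever the previous call stopped (round robin).
        for _ in range(len(backends)):
            host, port = next(ring)
            try:
                conn = connect((host, port), timeout)
                return (host, port), conn
            except OSError:
                continue  # backend is down: fail over to the next one
        raise ConnectionError("no backend reachable")

    return pick
```

A real proxy would then shuttle bytes between the client socket and the returned connection. Calling `pick = make_picker([("10.0.0.1", 80), ("10.0.0.2", 80)])` once and then `pick()` for each new client would alternate between the two hosts and skip whichever one refuses the connection.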

I'm planning on putting Pound through its paces too, but since Balance works so well for exactly what I need, I'm not terribly motivated. Pound has many more features and seems comparable to the Arrowpoint systems we use for load balancing and failover in our production environment. So maybe someday I'll have a Pound v. Arrowpoint showdown.

I'd love to hear from others with similar experience.

--John

Nigel Hamilton wrote:

Hi,

Imagine for a moment ... you're stuck on a deserted island and there's no hardware load balancer vendor in sight. With only coconuts, five linux servers and an Internet connection you need to come up with a software-based load balancing scheme.

Which would you choose and why?

1) Round Robin DNS
__________________

BIND enables you to map a domain name to multiple IP addresses (e.g., search.turbo10.com -> 130.54.23.24, 130.54.23.34). Setting the DNS time to live (TTL) to 60 seconds limits how long resolvers may cache any one IP address.

The disadvantages of DNS round-robin include:
* client DNS caching(?) may reduce the random distribution of traffic in the cluster (is it that bad?)
* it does not take account of load on any individual server
* a server may fail and users may experience a dropped connection

The advantages of DNS round-robin include:
* simple to implement
* no load is borne by the server to distribute the load (e.g., no need to process Apache RewriteMap)
* no need for a 'point' machine to accept all incoming hits
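To make the caching question concrete, here is a toy Python model of round-robin DNS. It is only a simulation under simple assumptions: the name server rotates its address list on every query, and a client keeps the first address it received until the TTL runs out. `RoundRobinZone` and `CachingClient` are invented names, and no real DNS traffic is involved.

```python
from collections import deque


class RoundRobinZone:
    """Toy name server that rotates its A records on every query."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def resolve(self):
        answer = list(self._records)
        self._records.rotate(-1)  # the next query sees the next address first
        return answer


class CachingClient:
    """Toy client that reuses the first address until the TTL expires."""

    def __init__(self, zone, ttl):
        self.zone = zone
        self.ttl = ttl
        self._cached = None
        self._expires = 0.0

    def lookup(self, now):
        # `now` is a timestamp in seconds, passed in to keep the model testable.
        if self._cached is None or now >= self._expires:
            self._cached = self.zone.resolve()[0]
            self._expires = now + self.ttl
        return self._cached
```

With a 60 second TTL, a client that looks the name up at t=0 and again at t=59 hits the same server both times and only moves on at t=60. A client that ignores the TTL altogether would stay pinned to one server for much longer.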

... or ...

2) Apache Rewrite Map
_____________________
This alternative can distribute the load randomly and also selectively based on the URL 
(e.g., http://turbo10.com/?u=1212 --> 130.54.23.24, http://turbo10.com/?u=1211 --> 
130.54.23.25).
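As a sketch of the selective case, the Python below picks a backend from the `u` query parameter. The modulo rule is purely an assumption chosen because it happens to reproduce the two example URLs above; a real setup would express whatever mapping it wants in the RewriteMap itself. `backend_for` and `BACKENDS` are hypothetical names.

```python
from urllib.parse import parse_qs, urlparse

# The two addresses from the example above.
BACKENDS = ["130.54.23.24", "130.54.23.25"]


def backend_for(url, backends=BACKENDS):
    """Choose a backend from the `u` query parameter (assumed numeric)."""
    query = parse_qs(urlparse(url).query)
    # Requests without a `u` parameter fall back to the first backend.
    u = int(query.get("u", ["0"])[0])
    return backends[u % len(backends)]
```

The appeal of keying on the URL is that the same `u` always lands on the same machine, which helps if that machine holds cached data for that user.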

The disadvantages include:
* a machine must act as a proxy for all requests.
* there is an extra network socket connection required between proxy and server.
* all HTTP headers need to be inspected and rewritten.
* ProxyPassReverse is needed so that response headers (e.g., redirects) are rewritten correctly on the way back to the client.
* more complex to implement than DNS round-robin

The advantages include:
* the RewriteMap file can be updated based on load averages in the cluster, allowing heavily loaded machines to receive less traffic and lightly loaded machines to receive more.
* a cron job could update the map file based on load averages.
* if a server goes down it can be removed from the RewriteMap (and therefore from the cluster) immediately, making the loss of a server invisible to the end user.
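The cron job idea could look something like the Python sketch below. It assumes the map is used as an Apache `rnd:` RewriteMap, where one line holds a key followed by `|`-separated alternatives, and it gives a lightly loaded host more weight simply by repeating it. The weighting formula, the `servers` key, and the function names are all assumptions, and collecting the load averages from each machine is left out.

```python
import os


def build_map_line(key, loads, max_weight=4):
    """Build one line of a rnd: map from per-host load averages.

    `loads` maps host -> 1-minute load average, or None if the host is down.
    """
    alive = {host: load for host, load in loads.items() if load is not None}
    if not alive:
        raise ValueError("no live servers")
    entries = []
    for host, load in sorted(alive.items()):
        # Lighter load means more repeats; every live host appears at least once.
        weight = max(1, min(max_weight, round(max_weight / (1.0 + load))))
        entries.extend([host] * weight)
    return f"{key} {'|'.join(entries)}"


def write_map(path, line):
    """Write the map to a temporary file, then swap it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        fh.write(line + "\n")
    os.replace(tmp, path)  # atomic rename, so Apache never reads a half-written file
```

For example, loads of 0.0, 3.0 and "down" for three hosts would list the idle host four times, the busy host once, and leave the dead host out entirely. Writing to a temporary file and renaming it avoids the proxy picking up a partial map mid-update.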

... or ...

3) Other Software-Based Load Balancing Solution
_______________________________________________
????


Regards,



Nige


Nigel Hamilton





