Craig R. McClanahan wrote:
> "[EMAIL PROTECTED]" wrote:
>
>> why not do it in shared storage and implement SSI? That's what the
>> mod_jserv shm file was for... a shared hunk of disk store.
>> -Ys-
>> [EMAIL PROTECTED]
>
>
> This is certainly one valid approach. It works for cases where the servers are
> all on the same machine. But you still need an approach that works in a
> multiple-machine environment as well.
Yes! I want a system where you can have as many Tomcats as you
want on each machine, and as many of those machines as you want,
and they all act together as one big distributed farm of Tomcats.
The administrator should be able to configure which Tomcat instance
replicates to which other Tomcat instance(s). And, this should be
transparent to users regardless of the web server being used.
Unless I'm missing something, I don't see why we should need to change the
C side of the AJP code to make all this work.
> It would be worth someone taking the time to articulate proposed use cases, from
> which we can identify desired features.
Yes, and I'll see what I can do.
> My bet is that we will end up wanting a
> variety of pluggable implementations with different functionality and performance
> characteristics.
I sure do.
Some features I'd like to add to Tomcat 4 (and use!):
* Add functionality to Catalina so that, when running stand-alone, it can
operate as a distributed Servlet container as described in the latest
Servlet specification. This should support pluggable communication
mechanisms (plug in the protocol of your choice at runtime via a custom
config), including RMI and IP-Multicast technologies like Weblogic uses.
(A rough sketch of such a plug-in point follows this list.)
* Add a load balancing and fault tolerance Valve to Catalina that makes
the web server act as a load balancing HTTP redirect server. The goal
is to make it easy to install, run, and maintain a cluster of Tomcat "web
containers". This Valve should later be able to take advantage of Avalon
if Catalina is running within Avalon, and if not it should still work in
standalone mode.
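To make the pluggable communication idea a bit more concrete, here is a
minimal sketch of the kind of plug-in point I have in mind. The names
below are made up for illustration (nothing like this exists in Catalina
today), and a multicast or RMI implementation would be selected at
runtime through the server config:

    import java.io.Serializable;

    /** Hypothetical plug-in point for session replication transports. */
    public interface SessionReplicationTransport {

        /** Open the transport (e.g. join a multicast group, look up RMI peers). */
        void start() throws Exception;

        /** Send a serialized session (or session delta) to the configured peer(s). */
        void replicate(String sessionId, Serializable sessionState) throws Exception;

        /** Register the callback invoked when a peer sends us a session. */
        void setReceiver(ReplicationReceiver receiver);

        /** Leave the group / release network resources. */
        void stop() throws Exception;
    }

    /** Callback used by a transport to hand received sessions to the container. */
    interface ReplicationReceiver {
        void sessionReceived(String sessionId, Serializable sessionState);
    }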
More detail on the load balancing Valve:
This Valve makes the web server serve HTTP redirects (HTTP 302 - Moved
Temporarily) when a request should be handled by another Tomcat instance.
It acts in concert with one or more other Tomcat instances in a completely
decentralized way: no single machine acts as the central brain for the
system. Each redirect server tries to cooperate with the other redirect
servers in the system, but each one has authority over its own requests
if it cannot communicate with the others. The redirect servers keep track
of how many content servers there are in the system, and distribute HTTP
requests among those content servers in a manner determined by the load
balancing module.
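Here is a rough sketch of that redirect behavior, written against the
plain Servlet API rather than the actual Catalina Valve interface just to
keep the example self-contained; pickContentServer() is a hypothetical
stand-in for whatever the load balancing module decides:

    import java.io.IOException;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class RedirectSketch {

        /** Serve the request locally, or bounce it to another content server. */
        public void service(HttpServletRequest request, HttpServletResponse response)
                throws IOException {
            String target = pickContentServer(request); // host:port chosen by the balancer
            if (target == null) {
                return; // no better candidate (or peers unreachable): serve it here
            }
            // Rebuild the URL on the chosen content server and send a 302.
            StringBuilder url = new StringBuilder("http://").append(target)
                    .append(request.getRequestURI());
            if (request.getQueryString() != null) {
                url.append('?').append(request.getQueryString());
            }
            response.sendRedirect(url.toString()); // HTTP 302 Moved Temporarily
        }

        /** Hypothetical hook into the load balancing module. */
        private String pickContentServer(HttpServletRequest request) {
            return null; // placeholder
        }
    }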
The load balancing module should be able to use many different load
balancing algorithms, and the system administrator should be able to
configure which algorithm(s) to use for any given context. Load
balancing algorithms may include (see the sketch after this list):
- Least Loaded Server: load can be calculated in one of many ways,
including counting requests per unit of time, asking the operating
system for the load average, counting the number of request Threads
currently running, etc.
- Weighted Percentage: larger, more powerful servers in the cluster
are given a higher percentage of the requests that each redirect
server must forward. Smaller, less capable servers will receive
a smaller percentage of the requests/workload.
- Fastest Path: a module at each data center uses ICMP pings to
determine which data center has the fastest path back to the client.
The fastest data center's own cluster of content servers then becomes
the only remaining set of candidates for forwarding and serving the
request. Note that the pinger can't be written in the Java
programming language (no support for ICMP -- a potential security
issue for consumers), so that component needs to be written in
another programming language.
- Sticky Sessions: any given user's request is sent to the same content
server that the user's last request was sent to. Identification of
the user is usually done via cookies or jsessionid values.
- Round Robin: perhaps the least intelligent algorithm, but the simplest
to understand, set up, and try out. Redirect servers simply send each
request to the next server in the list, looping back to the first
server after the last server in the list has been used.
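As a concrete example of a pluggable algorithm, here is a minimal Round
Robin sketch. The LoadBalancingAlgorithm interface is hypothetical (it is
just the plug-in point I would like to see), and a server is identified by
a simple host:port string:

    import java.util.List;
    import java.util.concurrent.atomic.AtomicInteger;

    /** Hypothetical plug-in point for load balancing algorithms. */
    interface LoadBalancingAlgorithm {
        /** Pick the content server (host:port) that should receive the request. */
        String chooseServer(List<String> liveServers);
    }

    /** Round Robin: hand each request to the next live server in the list. */
    class RoundRobinAlgorithm implements LoadBalancingAlgorithm {
        private final AtomicInteger next = new AtomicInteger(0);

        public String chooseServer(List<String> liveServers) {
            if (liveServers.isEmpty()) {
                return null; // nothing is up; the caller serves the request locally
            }
            // floorMod keeps the index valid even after the counter wraps around.
            int index = Math.floorMod(next.getAndIncrement(), liveServers.size());
            return liveServers.get(index);
        }
    }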
The load balancing module should allow pluggable load balancing
algorithms to work in conjunction with each other. The administrator
should be able to assign decision-making "weights" to each algorithm
so that the decision about where to send a request is made after
considering more than just one algorithm's suggestion. Some algorithms,
though, should never be used in combination with others (for example,
Sticky Sessions: if it were combined with two other algorithms that
outvoted it, the request could be sent to a server that does not have
the user's session replicated). So there should be two main types of
load balancing algorithms: Democratic algorithms and Dictator
algorithms. Dictator algorithms don't vote on where to send the
request; they just send it. Democratic algorithms work together,
voting on where to send a request, and each algorithm's vote is
weighted.
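Here is a sketch of how the Democratic votes might be combined. The class
and method names are made up, and the weights would come from the
administrator's per-context configuration; a Dictator algorithm would
simply bypass this combiner and pick the server itself:

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    /** Combines weighted suggestions from several Democratic algorithms. */
    class WeightedVoteCombiner {

        /** One algorithm's suggested server, tagged with that algorithm's weight. */
        static class Vote {
            final String server;
            final double weight;
            Vote(String server, double weight) {
                this.server = server;
                this.weight = weight;
            }
        }

        /** Return the server with the highest total weighted vote, or null if no votes. */
        String decide(List<Vote> votes) {
            Map<String, Double> totals = new HashMap<>();
            for (Vote v : votes) {
                totals.merge(v.server, v.weight, Double::sum);
            }
            String best = null;
            double bestScore = Double.NEGATIVE_INFINITY;
            for (Map.Entry<String, Double> e : totals.entrySet()) {
                if (e.getValue() > bestScore) {
                    best = e.getKey();
                    bestScore = e.getValue();
                }
            }
            return best;
        }
    }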
Regardless of the load balancing algorithm(s) used, the request will
never be redirected to a server that is down. For this to work, the
load balancer must be able to maintain contact with each content server.
This contact mechanism should be configurable as well. Possible choices
may include (a UDP ping sketch follows this list):
- TCP Socket Keepalive: each redirect server opens a single TCP socket
connection to each content server (to something like an echo service)
and keeps the connection alive by sending data to the content server
to be echoed back to the redirect server. If the TCP connection
breaks and cannot be reestablished, the redirect server considers
that content server to be unavailable and will not send any requests
to it again until the TCP connection can be reestablished.
- UDP Ping: each redirect server periodically sends a UDP message out
to a content server (the content server implements a UDP echo service)
to be echoed back to the redirect server. If packets are lost for
longer than a configurable timeout interval, then the redirect server
considers that content server to be down until it begins echoing back
the UDP ping messages.
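And here is a rough sketch of the UDP Ping idea, assuming the content
server runs a plain UDP echo service on a known port (the port number and
timeout values are left to configuration and only illustrative here):

    import java.net.DatagramPacket;
    import java.net.DatagramSocket;
    import java.net.InetAddress;
    import java.net.SocketTimeoutException;

    /** Invoked periodically by a redirect server to check one content server. */
    class UdpPinger {

        /** Returns true if the content server echoed our ping within timeoutMillis. */
        boolean ping(String host, int echoPort, int timeoutMillis) {
            byte[] payload = "ping".getBytes();
            try (DatagramSocket socket = new DatagramSocket()) {
                socket.setSoTimeout(timeoutMillis);
                InetAddress address = InetAddress.getByName(host);
                socket.send(new DatagramPacket(payload, payload.length, address, echoPort));

                byte[] buffer = new byte[payload.length];
                socket.receive(new DatagramPacket(buffer, buffer.length)); // waits for the echo
                return true;
            } catch (SocketTimeoutException timedOut) {
                return false; // no echo within the timeout: consider the server down
            } catch (Exception e) {
                return false; // unresolvable host, network error, etc.
            }
        }
    }

A redirect server would run something like this against each content
server on a configurable interval, marking a server down once it misses
pings for longer than the configured timeout interval, and marking it up
again once it starts echoing.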
Comments? Suggestions?
--
Jason Brittain
Software Engineer, Olliance Inc. http://www.Olliance.com
Current Maintainer, Locomotive Project http://www.Locomotive.org