I don't totally agree with you. There are software solutions which
work in the same way. We are using F5 BIG-IP (a load-balancing
solution), which uses the same concept as vqalive for cluster
redundancy (two servers share a common virtual IP address).

I have not reviewed the vqalive code yet, but I suppose it would be
helpful to add a list of processes and checks to the conditional
switching. And don't forget portability (it would be great if it could
run on more than just Intel/Linux ...)

For example :

pop daemon must be running
smtp daemon must be running
eth1 must be up and receive packets
eth0 must be up and receive packets

If, for example, one of those processes or states is not true, the
servers are switched and an alarm is sent.
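
To make the idea concrete, here is a minimal sketch in Python of such a
check list. It is not taken from vqalive (I have not read its code); the
names tcp_port_open, failed_checks and should_fail_over are hypothetical,
and the interface checks are left as placeholders because how to test
"up and receiving packets" depends on the platform.

```python
import socket
from typing import Callable, Dict, List

# A check is any zero-argument callable returning True when healthy.
Check = Callable[[], bool]


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat as "service down".
        return False


def failed_checks(checks: Dict[str, Check]) -> List[str]:
    """Run every check and return the names of those that failed."""
    failed = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            # A check that crashes counts as a failed check.
            ok = False
        if not ok:
            failed.append(name)
    return failed


def should_fail_over(checks: Dict[str, Check]) -> bool:
    """Switch servers as soon as any single condition is not true."""
    return len(failed_checks(checks)) > 0


if __name__ == "__main__":
    # Hypothetical configuration: the host and ports are assumptions.
    checks = {
        "pop": lambda: tcp_port_open("127.0.0.1", 110),
        "smtp": lambda: tcp_port_open("127.0.0.1", 25),
        # "eth0" / "eth1" checks would go here (platform specific).
    }
    bad = failed_checks(checks)
    if bad:
        print("ALARM, switching servers; failed checks:", ", ".join(bad))
```

Testing the daemons by actually connecting to their ports, rather than
by looking at the process table, would also catch a daemon that is
still running but no longer accepts connections.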

Benoit

----- Original Message -----
From: "Gabriel Ambuehl" <[EMAIL PROTECTED]>
To: "Ken Jones" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Saturday, November 04, 2000 10:39 AM
Subject: Re: vqalive - comments?


| Hello Ken,
|
| Friday, November 03, 2000, 1:30:41 AM, you wrote:
| > Two identical email servers can be set up to share a private NFS
| > partition off a RAID array. Each server provides the exact same
| > services as the other machine. Each machine has its own IP. All
| > clients are told to connect to a third IP for services. One of the
| > servers will start out with the shared IP. If this server ever
| > dies and goes offline, the second server, using vqalive, will
| > detect the failure and assign the IP to itself as a hot
| > swap-over. Clients can then continue to get services, and the IP
| > address is switched transparently.
|
| In general, those tools are quite handy; however, they can leave you
| with a false sense of security about your failover setup, so you
| should always keep one thing in mind when using this or similar IP
| failover tools: their whole concept is based on the assumption that a
| server which fails will do so in a way where it is either still able
| to release its IP for its twin (which normally gets triggered by some
| signal from the twin) OR it will crash so badly that it won't respond
| to any IP requests and basically frees the IP as well. The failover
| will therefore fail if the primary server crashes in a way where it no
| longer responds to all IP requests but still claims to hold its IP.
| It's hard to describe this admittedly strange, uncommon state, but we
| *had* machines which were answering pings while the daemons on them
| weren't accepting any connections.
|
| When we were trying to deploy a suitable failover system for our
| servers, we encountered such a situation several times (although we
| unfortunately weren't able to reproduce it and therefore believe it
| was some kind of race condition which never happened on any
| production machines, BTW) and we found only one real solution to it:
| you need to have some kind of NAT load balancer (this might be ipnat
| with some custom hacks or heavy metal stuff such as the Foundry
| Networks ServerIron), an intelligent switch or router, or some other
| device which is able to physically disconnect a server from the
| network so that its spare twin is able to get the IP without screwing
| up the network by having two identical IPs, which generally is a
| rather bad thing (TM).
|
| I'm not saying vqalive isn't working, as I haven't yet had the time to
| check it out (which I'll surely do this weekend), but I thought
| that it's fair to know where the limitations of this concept lie. Oh
| and Ken, if I'm wrong with my assumptions then I'd be very interested
| to know how you managed to solve the case I described above.
|
|
|
| Best regards,
|  Gabriel
|
|
|
