On Mon, 2011-04-04 at 11:44 -0500, Neil Aggarwal wrote:
> From what I can figure out from the ha.cf file, heartbeat
> uses ping to tell if the peer is up.

Not really. It uses special heartbeat packets to tell if the peer is up. Ping is used to tell the difference between a dead peer and a bad NIC or cable. If the NIC or cable is bad, the remote peer would not respond, and neither would any of the ping targets. The other node would see its remote node dead but the ping targets alive, so it would know to take over resources.

This is a crude method of avoiding split brain compared to a real STONITH device, but it works surprisingly well in a number of situations. We ran a number of critical services on heartbeat-v1 clusters for years until we switched over to Pacemaker last year, when it became obvious that no one was supporting heartbeat-v1 configurations any more (we were dragged kicking and screaming into the much more complicated but also much more flexible and reliable world of Pacemaker).

> I want to switch the virtual IP if the ldirectord process
> is not running or locked up. That may happen even if the
> network card is ok.
>
> Is there a way to do that?

You don't say whether or not you are using Pacemaker. If you are, then you can set up ldirectord as a Pacemaker resource and let Pacemaker handle the monitoring.

If you are not, then you will need something external to do the monitoring. That is basically a limitation of heartbeat-v1 resources in general: the individual resources are not monitored, so it is possible to get into a situation where one or more resources are hung or crashed, but heartbeat itself is still running, so no failover occurs. The only solutions to that involve some sort of monitor outside heartbeat (of which Pacemaker seems to be the recommended one).

--Greg
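
P.S. The ping-node reasoning above can be sketched in a few lines of Python. This is only an illustration of the decision as I described it, not how heartbeat is actually implemented; the function name `decide` and its return strings are made up for the example, and the "do not take over" branch is my reading of what a node with a bad NIC should conclude.

```python
def decide(peer_heartbeat_seen: bool, ping_targets_alive: bool) -> str:
    """Illustrative only: what a node might conclude from its two observations.

    Hypothetical helper, not part of heartbeat.
    """
    if peer_heartbeat_seen:
        # Heartbeat packets are arriving, so the peer is up.
        return "peer up: no action"
    if ping_targets_alive:
        # Our own network path works, yet the peer is silent:
        # treat the peer as dead and take over its resources.
        return "peer dead: take over resources"
    # Neither the peer nor any ping target answers, so the fault
    # is probably our own NIC or cable.
    return "local NIC or cable suspect: do not take over"


if __name__ == "__main__":
    # Print the conclusion for every combination of observations.
    for heartbeat in (True, False):
        for ping in (True, False):
            print(heartbeat, ping, "->", decide(heartbeat, ping))
```

Note that nothing here looks at whether ldirectord itself is healthy, which is exactly the gap in heartbeat-v1 mentioned above.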
