Re: [Linux-HA] Disconnected node will not release ipaddr resource

Alex Dean Wed, 09 Dec 2009 18:22:26 -0800


On Dec 9, 2009, at 5:34 PM, Mullis, Josh (CCI - Atlanta) wrote:

Shouldn't node1 release the resource if the ping node (1.1.1.1) is down?


That's not how ipfail works.

ipfail presumes that the two nodes are always in contact. Based on their ability to ping 1.1.1.1, they will decide which one should hold your resources. If the two nodes lose contact with each other, you have a split-brain and all bets are off.

"Note that ipfail needs redundant communications media to work correctly - because it won't cause a failover on its own unless it can contact the other cluster member. In other words, if you're pinging on the same media as the only heartbeat channel configured, you're destined to be disappointed in ipfail."

http://linux-ha.org/ipfail
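
To make that concrete, here is a toy model of the decision as I've described it. It is not ipfail's actual code; the function name, arguments, and return strings are all made up for illustration. The point is only that without a working path to the peer, no failover is negotiated at all.

```python
from typing import Optional


def ipfail_decision(peer_reachable: bool,
                    my_ping_nodes: int,
                    peer_ping_nodes: Optional[int]) -> str:
    """Simplified model of the behaviour described above (hypothetical API)."""
    if not peer_reachable or peer_ping_nodes is None:
        # No heartbeat path to the other node: nothing can be compared,
        # so this node never hands its resources over on its own.
        return "no-failover"
    if my_ping_nodes < peer_ping_nodes:
        # The peer sees more ping nodes than we do, so it should hold the resources.
        return "give-up-resources"
    return "keep-resources"
```

With a single shared ethernet link, losing the link means losing both the ping node and the peer, so the first branch is the one you hit.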

If your ethernet connection is the only medium your cluster nodes can use to communicate, ipfail really isn't much use. You could try adding something like mon. I've written a mon alert which causes heartbeat to go standby if it can't ping its gateway IP, and this has worked pretty well. Mon's really quite easy to learn, and I think it only took an afternoon of tinkering to get a 'go standby' action I was happy with.


http://linux-ha.org/mon
http://mon.wiki.kernel.org/index.php/Main_Page
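
A rough sketch of the 'go standby' idea in Python, to show the shape of it. This is not my actual alert. It folds the ping check and the standby action into one script, whereas with mon the monitor does the check and the alert only does the action. The gateway address and the hb_standby path are assumptions; hb_standby ships with heartbeat, but its location varies by distribution.

```python
import subprocess

# Assumed location of heartbeat's standby helper; check where your package installs it.
HB_STANDBY = "/usr/lib/heartbeat/hb_standby"
# Hypothetical gateway IP (documentation range); replace with your own.
GATEWAY = "192.0.2.1"


def gateway_reachable(gateway, run=subprocess.call):
    # Send one ICMP echo and wait up to 2 seconds; exit status 0 means a reply arrived.
    return run(["ping", "-c", "1", "-W", "2", gateway]) == 0


def main(run=subprocess.call):
    # 'run' is injectable so the logic can be exercised without touching the network.
    if gateway_reachable(GATEWAY, run):
        return "ok"
    # Gateway is unreachable from this node: ask heartbeat to give up its resources.
    run([HB_STANDBY])
    return "standby"
```

A real script would call main() at the bottom and run as root on the node. You would also want some damping, so a single dropped ping doesn't bounce your resources.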

You could also switch to a v2 heartbeat+pacemaker configuration, which will get you resource-level monitoring. In this case, the ability to ping 1.1.1.1 is your 'resource'. I believe you'd then use pingd rather than ipfail. I haven't done this personally, but I'm sure many/most on this list have.


alex


