On Thu, 14 Mar 2013 14:06:36 +0000 Owen Le Blanc <lebl...@man.ac.uk> wrote:
> I have a number of pacemaker managed clusters. We use an independent
> heartbeat network for corosync, and we use another network for the
> managed services. The heartbeat network is routed using different
> hardware from the service network. We have two machine rooms, and
> our normal pacemaker clusters have one node in each machine room.
>
> In the past I've used ocf:pacemaker:ping as part of our
> configurations, but we had problems, since our network is busy, and
> many of the routers (the most reliable things to ping) are configured
> to ignore pings when they have too much to do otherwise. In this way
> we often had false connectivity failures in the past, and services
> would flop from one side to the other.
>
> Recently we had a power failure which affected all of the switches on
> our service network in one machine room. This meant that all
> services in that machine room were unavailable. Our pacemaker
> clusters unfortunately saw this as no problem, since without a ping
> test, they couldn't tell that the network was down.
>
> Has anyone done any work to measure network connectivity in
> connection with pacemaker without using ping? I can see a couple of
> potential ways to avoid it, but I hate to reinvent wheels.

I have seen a commercial (but pacemaker-based) solution that seemed to
use link detection at the hardware level to suicide the local node when
both links (one to the outside and one to the peer) went down.

But I don't know whether this was done inside pacemaker, and I haven't
had time to think about something similar for our cluster. I just trust
that four links using two switches with independent power will be safe
enough...

Have fun,

Arnold
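
For illustration only, the link-detection idea mentioned above could be sketched roughly as follows. This is not how the commercial product works (that is unknown); it assumes a Linux node where the kernel exposes link state in /sys/class/net/<iface>/carrier, and the interface names and helper names are hypothetical.

```python
from pathlib import Path


def link_is_up(iface, sysfs_root="/sys/class/net"):
    """Return True if the kernel reports carrier on iface, else False."""
    carrier = Path(sysfs_root) / iface / "carrier"
    try:
        # The file contains "1" when a physical link is detected, "0" otherwise.
        return carrier.read_text().strip() == "1"
    except OSError:
        # Interface missing, or administratively down (the read fails).
        return False


def all_links_down(ifaces, sysfs_root="/sys/class/net"):
    """Return True only if at least one interface was named and none has carrier."""
    ifaces = list(ifaces)
    return bool(ifaces) and not any(link_is_up(i, sysfs_root) for i in ifaces)


if __name__ == "__main__":
    # Hypothetical names: eth0 = service network, eth1 = peer/heartbeat link.
    if all_links_down(["eth0", "eth1"]):
        print("both links down")
    else:
        print("at least one link up")
```

Carrier state only reflects the local link to the directly attached switch port, so it would catch a switch losing power but not a failure further upstream. Hooking such a check into pacemaker (for example from a custom resource agent) is left open here.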
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org