On 03/14/2013 03:36 PM, Arnold Krille wrote:
> On Thu, 14 Mar 2013 14:06:36 +0000 Owen Le Blanc <lebl...@man.ac.uk>
> wrote:
>> I have a number of pacemaker managed clusters. We use an independent
>> heartbeat network for corosync, and we use another network for the
>> managed services. The heartbeat network is routed using different
>> hardware from the service network. We have two machine rooms, and
>> our normal pacemaker clusters have one node in each machine room.
>>
>> In the past I've used ocf:pacemaker:ping as part of our
>> configurations, but we had problems, since our network is busy, and
>> many of the routers (the most reliable things to ping) are configured
>> to ignore pings when they have too much to do otherwise. In this way
>> we often had false connectivity failures in the past, and services
>> would flop from one side to the other.
>>
>> Recently we had a power failure which affected all of the switches on
>> our service network in one machine room. This meant that all
>> services in that machine room were unavailable. Our pacemaker
>> clusters unfortunately saw this as no problem, since without a ping
>> test, they couldn't tell that the network was down.
>>
>> Has anyone done any work to measure network connectivity in
>> connection with pacemaker without using ping? I can see a couple of
>> potential ways to avoid it, but I hate to reinvent wheels.
>
> I have seen a commercial (but pacemaker-based) solution that seemed to
> use link-detection on the hw-level to suicide the local node when both
> links (one to the outside and one to the peer) went down.
>
> But I don't even know if this was done inside pacemaker, nor did I have
> time to think about something similar for our cluster.
>
> I just trust that four links using two switches with independent power
> will be safe enough...

I've had a node commit suicide when the link goes away by looking at
/sys/class/net/<interface>/carrier;
for example, cat /sys/class/net/eth0/carrier and see what it looks
like. It's 1 when the link is up, and 0 when it's down.

You could presumably write a script that uses that to set node
attributes too...

-- 
Alan Robertson <al...@unix.sh> - @OSSAlanR

"Openness is the foundation and preservative of friendship... Let me
claim from you at all times your undisguised opinions."
  - William Wilberforce
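
A rough sketch of the kind of script suggested above, for illustration
only. It reads the carrier file and publishes the result as a node
attribute with attrd_updater. The attribute name "link_<iface>", the
function names, and the SYSFS_NET and ATTRD_CMD override variables are
all made up for this example, and the -n/-U options should be checked
against the attrd_updater shipped with your Pacemaker version.

```shell
#!/bin/sh
# Sketch: publish an interface's carrier state as a Pacemaker node attribute.
# SYSFS_NET and ATTRD_CMD are hypothetical overrides (handy for dry runs).
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"
ATTRD_CMD="${ATTRD_CMD:-attrd_updater}"

# Print 1 if the link is up, 0 otherwise. An unreadable carrier file
# (interface missing or administratively down) is treated as link down.
link_state() {
    state=$(cat "$SYSFS_NET/$1/carrier" 2>/dev/null) || state=0
    if [ "$state" = "1" ]; then
        echo 1
    else
        echo 0
    fi
}

# Set the node attribute "link_<iface>" (assumed name) to the carrier state.
publish_link_state() {
    $ATTRD_CMD -n "link_$1" -U "$(link_state "$1")"
}
```

Calling "publish_link_state eth0" from cron or a small loop would keep
the attribute current, and a location rule could then keep services off
a node whose attribute is 0. Carrier only tells you about the local
link, so a failure further upstream would still go unnoticed.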