Forwarding to the list for posterity (i.e. google) - I believe my reply did solve the problem, BTW.
The crm config in question is:

node scc-bak
node scc-pri
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="10.1.1.180" cidr_netmask="24" \
        op monitor interval="30s"
primitive drbd_r0 ocf:linbit:drbd \
        params drbd_resource="r0" \
        op monitor interval="15" role="Master" \
        op monitor interval="30" role="Slave"
primitive fs_r0 ocf:heartbeat:Filesystem \
        params device="/dev/drbd1" directory="/home/scc" fstype="ext3" \
        op monitor interval="10s"
primitive scc-stonith stonith:meatware \
        operations $id="scc-stonith-operations" \
        op monitor interval="3600" timeout="20" start-delay="15" \
        params hostlist="10.1.1.32 10.1.1.31"
group r0 fs_r0 ClusterIP
ms ms_drbd_r0 drbd_r0 \
        meta master-max="1" master-node-max="1" clone-max="2" \
        clone-node-max="1" notify="true"
colocation r0_on_drbd inf: r0 ms_drbd_r0:Master
order r0_after_drbd inf: ms_drbd_r0:promote r0:start
property $id="cib-bootstrap-options" \
        dc-version="1.1.6-b988976485d15cb702c9307df55512d323831a5e" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="200"

I probably should have noted that "scc-pri" and "scc-bak" aren't really the
best choice of names, because "pri" and "bak" are kind of meaningless
assuming identical nodes (and the nomenclature gets confusing when you start
talking about masters and slaves on top of that).

Anyway...

-------- Original Message --------
Subject: Re: How can I make the secondary machine elect itself owner of the floating IP address?
Date: Thu, 20 Sep 2012 12:36:03 +1000
From: Tim Serong
To: Epps, Josh

Hi Josh,

On 09/20/2012 10:47 AM, Epps, Josh wrote:
> Hi Tim,
>
> I saw one of your Gossamer threads and I really need some help.
>
> I have a two-node cluster running on SLES 11 SP2 with Pacemaker and DRBD.
> When I shut down the primary with "shutdown -h now", the
> ocf:heartbeat:IPaddr2 resource transfers nicely to the backup server.
> But when I simulate a failure on the primary node by killing the power,
> neither the floating IP address nor the mount transfers to the secondary
> machine.

What's probably happening is:

- When you do a clean shutdown of one node, the surviving node knows the
  first has gone away, and it can safely take over those resources.

- When you cut power, the surviving node doesn't know what state the first
  node is in, so it will do nothing until the first node is fenced.

- You're using the meatware STONITH plugin (which probably doesn't need a
  monitor op, BTW), which means you should see a CRIT message in syslog on
  the surviving node, telling you it expects the first node to be fenced.

> How can I make the secondary machine elect itself owner of the floating
> IP address?

Assuming the first machine is really down :) you should be able to tell the
cluster this is so by running "meatclient -c scc-pri" on the surviving node
(but do check syslog to see if you're really getting warnings about a node
needing to be fenced).

> SUSE support today said that it can't be done with just two nodes but we
> just require a one-way failover.

Two-node clusters should work fine; they're just more annoying than
three-node ones - see for example "STONITH Deathmatch Explained" at
http://ourobengr.com/ha/

If the above doesn't solve it for you, do you mind if we take this to the
linux-ha or pacemaker public mailing list? More eyes on a problem never
hurts, and then a solution becomes googlable :)

Regards,

Tim

--
Tim Serong
Senior Clustering Engineer
SUSE
tser...@suse.com

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
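P.S. for the archives: the reply above notes that the meatware STONITH
resource probably doesn't need a monitor op. A minimal sketch of that
primitive with the monitor op dropped (crm shell syntax; the resource name
and hostlist are copied from the config at the top - adjust for your own
nodes):

```
# Sketch only: meatware STONITH primitive without the monitor op.
# Hostlist values are the node IPs from the original config.
primitive scc-stonith stonith:meatware \
        params hostlist="10.1.1.32 10.1.1.31"
```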