On 10/23/2012 05:04 PM, Andrew Martin wrote:
> Hello,
>
> Under the Clusters from Scratch documentation, allow-two-primaries is
> set in the DRBD configuration for an active/passive cluster:
> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1-crmsh/html-single/Clusters_from_Scratch/index.html#_write_the_drbd_config
>
> "TODO: Explain the reason for the allow-two-primaries option"
>
> Is the reason for allow-two-primaries in this active/passive cluster
> (using ext4, a non-cluster filesystem) to allow for failover in the type
> of situation I have described (where the old primary/master is suddenly
> offline, as with a power supply failure)? Are split-brains prevented
> because Pacemaker ensures that only one node is promoted to Primary at
> any time?
No "allow-two-primaries" is needed in an active/passive setup. The fence
handler (executed on the Primary if the connection to the Secondary is
lost) inserts a location constraint into the Pacemaker configuration, so
the cluster does not even consider promoting an outdated Secondary (a
sketch of such a constraint follows after the quoted message below).

> Is it possible to recover from such a failure without allow-two-primaries?

Yes. If you only disconnect DRBD, as in the test you described below, and
cluster communication over the redundant network is still possible (and
Pacemaker is up and running), the Primary will insert that location
constraint and prevent a Secondary from becoming Primary, because the
constraint is already in place ... if Pacemaker is _not_ running during
your disconnection test, you will also receive an error, because then it
is obviously impossible to place that constraint.

Regards,
Andreas

--
Need help with Pacemaker? http://www.hastexo.com/now

> Thanks,
>
> Andrew
>
> ------------------------------------------------------------------------
> *From: *"Andrew Martin" <amar...@xes-inc.com>
> *To: *"The Pacemaker cluster resource manager"
> <pacemaker@oss.clusterlabs.org>
> *Sent: *Friday, October 19, 2012 10:45:04 AM
> *Subject: *[Pacemaker] Behavior of Corosync+Pacemaker with DRBD primary
> power loss
>
> Hello,
>
> I have a 3-node Pacemaker + Corosync cluster with 2 "real" nodes, node0
> and node1, running a DRBD resource (single-primary) and the 3rd node in
> standby acting as a quorum node. If node0 were running the DRBD
> resource, and thus is DRBD Primary, and its power supply fails, will the
> DRBD resource be promoted to Primary on node1?
>
> If I simply cut the DRBD replication link, node1 reports the following
> state:
>
> Role:
> Secondary/Unknown
>
> Disk State:
> UpToDate/DUnknown
>
> Connection State:
> WFConnection
>
> I cannot manually promote the DRBD resource because the peer is not
> outdated:
>
> 0: State change failed: (-7) Refusing to be Primary while peer is not
> outdated
> Command 'drbdsetup 0 primary' terminated with exit code 11
>
> I have configured the CIB-based crm-fence-peer.sh utility in my
> drbd.conf:
>
> fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>
> but I do not believe it would be applicable in this scenario.
>
> If node0 goes offline like this and doesn't come back (e.g. after a
> STONITH), does Pacemaker have a way to tell node1 that its peer is
> outdated and to proceed with promoting the resource to Primary?
>
> Thanks,
>
> Andrew
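For illustration, the constraint crm-fence-peer.sh places looks roughly
like the following in crm shell syntax. This is only a sketch: the
resource name ms_drbd_r0 and the node name node0 are placeholders for
whatever your configuration actually uses, and the constraint id varies
with the DRBD resource name:

    location drbd-fence-by-handler-ms_drbd_r0 ms_drbd_r0 \
            rule $role="Master" -inf: #uname ne node0

Once the peer has resynced, the companion unfence handler removes this
constraint again, so the Secondary becomes eligible for promotion.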
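For completeness, the fence-peer line quoted above normally sits inside a
fencing/handlers stanza along these lines (a sketch for a DRBD 8.3/8.4
style configuration; the resource name r0 is a placeholder and the rest
of the resource definition is omitted):

    resource r0 {
      disk {
        fencing resource-only;
      }
      handlers {
        fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
      }
      ...
    }

Without a fencing policy of at least resource-only, the fence-peer
handler is never invoked, so the constraint described above would not be
placed.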
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org