On Mon, 06 Oct 2014 10:27:49 -0400, Digimer <li...@alteeve.ca> wrote:
> On 06/10/14 02:11 AM, Andrei Borzenkov wrote:
> > On Mon, Oct 6, 2014 at 9:03 AM, Digimer <li...@alteeve.ca> wrote:
> >> If stonith was configured, after the time out, the first node would fence
> >> the second node ("unable to reach" != "off").
> >>
> >> Alternatively, you can set corosync to 'wait_for_all' and have the first
> >> node do nothing until it sees the peer.
> >>
> >
> > Am I right that wait_for_all is available only in corosync 2.x and not in
> > 1.x?
>
> You are correct, yes.
>
> >> To do otherwise would be to risk a split-brain. Each node needs to know the
> >> state of the peer in order to run services safely. By having both start at
> >> the same time, then they know what the other is doing. By disabling quorum,
> >> you allow one node to continue to operate when the other leaves, but it
> >> needs that initial connection to know for sure what it's doing.
> >>
> >
> > Does it apply to both corosync 1.x and 2.x or only to 2.x with
> > wait_for_all? Because I actually also was confused about precise
> > meaning of disabling quorum in pacemaker (setting no-quorum-policy:
> > ignore). So if I have two node cluster with pacemaker 1.x and corosync
> > 1.x with no-quorum-policy=ignore and no fencing - what happens when
> > one single node starts?
>
> Quorum tells the cluster that if a peer leaves (gracefully or was
> fenced), the remaining node is allowed to continue providing services.
>
> Stonith is needed to put a node that is in an unknown state into a known
> state; be it because it couldn't reach the node when starting or because
> the node stopped responding.
>
> So quorum and stonith play rather different roles.
>
> Without stonith, regardless of quorum, you risk split-brains and/or data
> corruption. Operating a cluster without stonith is to operate a cluster
> in an undetermined state and should never be done.
>

OK, let me try to rephrase.
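
For reference, the corosync 2.x behaviour I am asking about is, as far as I understand votequorum, enabled in the quorum section of corosync.conf. This is only a sketch for a two-node cluster; the values are illustrative and not taken from any real setup:

```
# corosync.conf fragment (corosync 2.x only; assumed two-node cluster)
quorum {
    provider: corosync_votequorum
    # two_node relaxes quorum for a two-node cluster; per votequorum(5)
    # it also turns wait_for_all on by default
    two_node: 1
    # written out explicitly: the cluster becomes quorate for the first
    # time only after all nodes have been seen at the same time
    wait_for_all: 1
}
```
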
Is it possible to achieve the same effect as wait_for_all in corosync 2.x with a combination of pacemaker 1.1.x and corosync 1.x? That is, to ensure that the cluster does not come up *on the first startup* until all nodes are present? In other words, to just make the cluster nodes wait for the others to join instead of trying to stonith them?

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org