Hi,

On Wed, Jun 09, 2010 at 12:11:09PM +0200, Torresani, Roberto wrote:
> Well... it seems to be SOLVED!!!
> Thank you Dejan.
> In the next few days I will load the cluster and then see how it behaves.
>
> I simply raised the token value to 10000 msec and left all the others at
> the defaults.
You should also raise the consensus value to 12000, since consensus
must be at least 1.2 * token. Otherwise corosync would even refuse to
start.

Thanks,

Dejan

> Thank you again.
> Regards,
> Roberto
>
> > -----Original Message-----
> > From: Dejan Muhamedagic [mailto:deja...@fastmail.fm]
> > Sent: Tuesday, June 08, 2010 6:42 PM
> > To: The Pacemaker cluster resource manager
> > Subject: Re: [Pacemaker] Cluster split brain on vmware VSphere
> >
> > Hi,
> >
> > On Mon, Jun 07, 2010 at 02:57:57PM +0200, Torresani, Roberto wrote:
> > > Sorry for having chosen the wrong ml...
> >
> > That's no problem. There's just a better chance of getting help on
> > the other list.
> >
> > > Here is the corosync.conf used by one cluster; the other one is
> > > just the same, as provided by the epel repository packages.
> > >
> > > I will try to raise the token value to 10000 as you suggest. Is
> > > there a theoretical basis or a best practice for setting this value?
> >
> > No, but 5000 should be OK for most. Ultimately, it depends on
> > your network. I forgot what exactly the case was here, but it
> > seems like you had some heavy processing (backup?) which used
> > most of the resources. That may be really hard to predict. You can
> > use sar or similar to monitor the load.
> >
> > Thanks,
> >
> > Dejan
> >
> > > I will keep you informed as it goes, and open a thread on the
> > > corosync ml if necessary.
> > >
> > > Thank you.
> > >
> > > # Please read the corosync.conf.5 manual page
> > > compatibility: whitetank
> > >
> > > totem {
> > >         version: 2
> > >         secauth: off
> > >         threads: 0
> > >         token: 1000
> > >         hold: 180
> > >         token_retransmits_before_loss_const: 20
> > >         join: 60
> > >         consensus: 4800
> > >         vsftype: none
> > >         max_messages: 20
> > >         interface {
> > >                 ringnumber: 0
> > >                 bindnetaddr: 192.168.206.0
> > >                 mcastaddr: 226.94.1.1
> > >                 mcastport: 5405
> > >         }
> > > }
> > >
> > > logging {
> > >         fileline: off
> > >         to_stderr: yes
> > >         to_logfile: yes
> > >         to_syslog: yes
> > >         logfile: /tmp/corosync.log
> > >         debug: off
> > >         timestamp: on
> > >         logger_subsys {
> > >                 subsys: AMF
> > >                 debug: off
> > >         }
> > > }
> > >
> > > amf {
> > >         mode: disabled
> > > }
> > >
> > > aisexec {
> > >         user: root
> > >         group: root
> > > }
> > >
> > > service {
> > >         name: pacemaker
> > >         ver: 0
> > > }

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
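
For anyone tuning these timeouts later: the token/consensus relationship discussed in this thread can be checked before restarting corosync. Below is a minimal sketch in Python, assuming the rule that consensus must be at least 1.2 * token (as I understand the corosync.conf(5) man page). The helper names `totem_timeouts` and `consensus_ok` are made up for illustration and are not part of corosync, and the regex only handles simple `key: value` lines like the ones in the config posted above.

```python
import re


def totem_timeouts(conf_text):
    """Return the integer 'token' and 'consensus' values found in conf_text."""
    values = {}
    for key in ("token", "consensus"):
        # Match e.g. "token: 1000" on its own line (indentation allowed).
        # "token_retransmits_before_loss_const" does not match, because
        # the key must be followed directly by optional spaces and a colon.
        m = re.search(rf"^\s*{key}\s*:\s*(\d+)\s*$", conf_text, re.MULTILINE)
        if m:
            values[key] = int(m.group(1))
    return values


def consensus_ok(token_ms, consensus_ms):
    """True if consensus meets the assumed minimum of 1.2 * token."""
    # Integer form of consensus >= 1.2 * token, to avoid float rounding.
    return 5 * consensus_ms >= 6 * token_ms


# Excerpt of the totem section posted in this thread.
posted = """
totem {
    token: 1000
    token_retransmits_before_loss_const: 20
    consensus: 4800
}
"""

t = totem_timeouts(posted)
print(consensus_ok(t["token"], t["consensus"]))  # True: values as posted
print(consensus_ok(10000, t["consensus"]))       # False: token raised, consensus left at 4800
print(consensus_ok(10000, 12000))                # True: both raised as suggested
```

The second case is the one from this thread: raising token to 10000 while an explicit consensus of 4800 stays in the file falls below the minimum, which is why 12000 was suggested.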