On 7 Mar 2014, at 2:39 am, Jay Janssen <jay.jans...@percona.com> wrote:
>
> primitive p_service ... \
> op monitor interval="2s" role="Master" \
> op monitor interval="5s" role="Slave" \
> op start timeout="10000s" interval="0"
> ms ms_service p_service \
> meta master-max="3" clone-max="3" target-role="Started"
> is-managed="true" ordered="false" interleave="true" notify="false"
>
> In my case I have all three nodes happily in the ‘master’ state and for a
> test I simultaneously cause the underlying service to fail on all of them.
> In all cases, the next monitor operation returns OCF_FAILED_MASTER.
> Subsequent monitor checks will then return OCF_ERR_NOT_RUNNING, since the
> node falls out of the Master state.
>
> I want all the resource clones to issue the start operation more or less at
> the same time (hence ordered="false") so I can use the start operation to
> coordinate amongst the nodes as they start (true start order is important
> and depends on node state, so I’m overloading the start action for this
> coordination in the all-down state). For the most part, this is working fine.
>
> However, I seem to be getting into a race condition where a node (seemingly
> the one that happens to detect OCF_FAILED_MASTER last) ends up NOT starting
> until AFTER the others start. Two nodes get ‘start’ issued at the same time
> like I want, but the 3rd gets stuck still monitoring (and returning errors)
> until after the others start. Once the others are up, then the 3rd finally
> unblocks and starts.

I think I'd want to see a crm_report for this; it will contain the logs and PE files we'd need to diagnose any issue.

>
> Is this expected behavior in Pacemaker? Is there any way I can suppress
> this behavior and get nodes to start irrespective of long start operations on
> other nodes?
>
>
> FWIW, if someone can explain to me how I can enforce start action order in a
> group like this in a more Pacemaker friendly way, I’m all ears.
>
>
> Jay Janssen
> http://about.me/jay.janssen
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org