primitive p_service ... \
        op monitor interval="2s" role="Master" \
        op monitor interval="5s" role="Slave" \
        op start timeout="10000s" interval="0"
ms ms_service p_service \
        meta master-max="3" clone-max="3" target-role="Started" \
        is-managed="true" ordered="false" interleave="true" notify="false"

In my case I have all three nodes happily in the 'master' state, and as a test
I simultaneously cause the underlying service to fail on all of them. In all
cases, the next monitor operation returns OCF_FAILED_MASTER. Subsequent
monitor checks then return OCF_NOT_RUNNING, since the node falls out of the
Master state.
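
To make that return-code sequence concrete, here is a minimal sketch of the monitor logic, not my actual agent. The numeric values are the standard OCF exit codes (7 is OCF_NOT_RUNNING, 9 is OCF_FAILED_MASTER); the helpers service_is_up and node_is_master and the variables SERVICE_UP and IS_MASTER are hypothetical stand-ins for whatever a real agent would probe.

```shell
#!/bin/sh
# Standard OCF exit codes used by the monitor action.
OCF_SUCCESS=0
OCF_NOT_RUNNING=7
OCF_RUNNING_MASTER=8
OCF_FAILED_MASTER=9

# Hypothetical helpers: a real agent would check the daemon and its own
# notion of the master role instead of reading these shell variables.
service_is_up()  { [ "$SERVICE_UP" = "1" ]; }
node_is_master() { [ "$IS_MASTER" = "1" ]; }

service_monitor() {
    if service_is_up; then
        if node_is_master; then
            return $OCF_RUNNING_MASTER
        fi
        return $OCF_SUCCESS
    fi
    # Service is down: a node still in the Master state reports a failed
    # master; once it has fallen out of that state it is simply not running.
    if node_is_master; then
        return $OCF_FAILED_MASTER
    fi
    return $OCF_NOT_RUNNING
}
```

With SERVICE_UP=0 and IS_MASTER=1 the function returns 9; after the node leaves the Master state (IS_MASTER=0) it returns 7, which matches what I see on each node.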

I want all the resource clones to issue the start operation more or less at the
same time (hence ordered="false") so I can use the start operation to
coordinate among the nodes as they start (the true start order is important and
depends on node state, so I'm overloading the start action for this
coordination in the all-down state). For the most part, this is working fine.

However, I seem to be hitting a race condition where one node (seemingly the
one that happens to detect OCF_FAILED_MASTER last) ends up NOT starting until
AFTER the others start. Two nodes get 'start' issued at the same time, as I
want, but the third stays stuck monitoring (and returning errors) until
after the others start. Once the others are up, the third finally unblocks
and starts.

Is this expected behavior in Pacemaker? Is there any way I can suppress this
behavior and get nodes to start irrespective of long-running start operations
on other nodes?


FWIW, if someone can explain how I can enforce start action order in a
group like this in a more Pacemaker-friendly way, I'm all ears.



Jay Janssen
http://about.me/jay.janssen


