On 3/1/2011 2:39 PM, Ron Kerry wrote:
On 2/28/2011 2:33 PM, Ron Kerry wrote:
Folks -

I have a configuration issue that I am unsure how to resolve. Consider the following set of resources:

clone rsc1-clone rsc1 \
meta clone-max="2" target-role="Started"
primitive rsc1 ...
primitive rsc2 ... meta resource-stickiness="1"
primitive rsc3 ... meta resource-stickiness="1"

Plus the following constraints

colocation rsc2-with-clone inf: rsc2 rsc1-clone
colocation rsc3-with-clone inf: rsc3 rsc1-clone
order clone-before-rsc2 inf: rsc1-clone rsc2
order clone-before-rsc3 inf: rsc1-clone rsc3


I am seeing the following undesirable behavior.

During normal operation, a copy of the rsc1 resource is running on each of my two systems, with rsc2 and rsc3 typically running split between the two systems. The rsc2 & rsc3 resources are operationally dependent on a copy of rsc1 being up and running first.

SystemA SystemB
======= =======
rsc1     rsc1
rsc2     rsc3

If SystemB goes down, then rsc3 moves over to SystemA as expected:

SystemA SystemB
======= =======
rsc1       X
rsc2       X
rsc3       X

When SystemB comes back into the cluster, crmd starts the rsc1 clone on SystemB but then also restarts both rsc2 & rsc3. This means both are stopped and then both started again. This is not what we want. We want these resources to remain running on SystemA until one of them is moved manually by an administrator to re-balance them across the systems.

How do we configure these resources/constraints to achieve that behavior? We are already using resource-stickiness, but that is meaningless if crmd is going to restart these resources anyway.
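
For reference, the kind of manual re-balancing I have in mind would look something like this, assuming the crm shell (migrate puts a temporary location constraint in place, which unmigrate removes afterwards):

crm resource migrate rsc3 SystemB   # move rsc3 back onto SystemB via a location constraint
crm resource unmigrate rsc3         # remove that constraint once the move has completed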


Using advisory (score="0") order constraints seems to achieve the correct behavior. I have not done extensive testing yet to see whether other failover behaviors are broken with this approach, but initial basic testing looks good. It is always nice to answer one's own questions :-)

colocation rsc2-with-clone inf: rsc2 rsc1-clone
colocation rsc3-with-clone inf: rsc3 rsc1-clone
order clone-before-rsc2 0: rsc1-clone rsc2
order clone-before-rsc3 0: rsc1-clone rsc3
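
For what it is worth, my understanding is that a score of 0 makes the ordering advisory: it only influences the order of actions when both resources are being started or stopped in the same transition, so a new rsc1 clone instance starting on the returning node no longer forces a restart of rsc2/rsc3. A quick way to sanity-check the result without actually failing anything over (a sketch; exact flags vary by Pacemaker version) is:

crm configure verify   # check the edited configuration for errors
crm_simulate -S -L     # simulate the next transition against the live CIB
crm_simulate -s -L     # show allocation scores (stickiness vs. constraint scores)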

Does anyone know of any specific problems with this approach?



I set up a greatly simplified generic resource configuration:

 Online: [ elvis queen ]
  Clone Set: A-clone [A]
      Started: [ elvis queen ]
  B-1    (ocf::rgk:typeB):       Started elvis
  B-2    (ocf::rgk:typeB):       Started queen
  Clone Set: stonith-l2network-set [stonith-l2network]
      Started: [ elvis queen ]

The A and B resources are just shell scripts running an infinite while loop whose body is a single sleep 5 command, so they run forever but do not consume machine resources.
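
In case it helps, the payload of these dummy scripts is nothing more than the following idle loop (a sketch of the loop only; the OCF start/stop/monitor handling around it is omitted):

#!/bin/sh
# Stay alive indefinitely without consuming CPU: wake every 5 seconds, then sleep again.
while true; do
    sleep 5
done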

If I kill the A-clone running on queen, it just gets restarted and nothing at all happens to B-2 (it stays on queen and never knows any different). This is not optimal behavior for our purposes.

However, on the good side, if the A-clone cannot (re)start on queen, then B-2 does fail over to elvis as we expect.

Does anybody have any ideas about how to get the proper behavior in all cases?


--

Ron Kerry         rke...@sgi.com
Global Product Support - SGI Federal

