I have narrowed this down to an issue that I feel is really a bug in the way pacemaker is dealing with constraints made between resource groups as opposed to resource primitives. The version of pacemaker involved here is:
  libpacemaker3-1.1.2-0.7.1
  pacemaker-1.1.2-0.7.1

A configuration which involves colocation/order constraints made between a clone and a simple primitive exhibits proper failover behavior.

  Online: [ elvis queen ]
   Clone Set: A-clone [A]
       Started: [ elvis queen ]
   B-1    (ocf::rgk:typeB):       Started elvis
   B-2    (ocf::rgk:typeB):       Started queen
   Clone Set: stonith-l2network-set [stonith-l2network]
       Started: [ elvis queen ]

    <constraints>
      <rsc_colocation id="A-with-B-1" rsc="B-1" score="INFINITY" 
with-rsc="A-clone"/>
      <rsc_colocation id="A-with-B-2" rsc="B-2" score="INFINITY" 
with-rsc="A-clone"/>
      <rsc_order first="A-clone" id="A-before-B-1" symmetrical="true" 
then="B-1"/>
      <rsc_order first="A-clone" id="A-before-B-2" symmetrical="true" 
then="B-2"/>
    </constraints>
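For reference, the same four constraints in crm shell notation would be roughly the following (an untested sketch; symmetrical="true" is the shell's default for order constraints, and the implicit INFINITY order score matches the XML above):

```
colocation A-with-B-1 inf: B-1 A-clone
colocation A-with-B-2 inf: B-2 A-clone
order A-before-B-1 inf: A-clone B-1
order A-before-B-2 inf: A-clone B-2
```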

This is from after queen came back into the cluster after being reset.

Mar  1 16:07:03 elvis pengine: [4218]: info: determine_online_status: Node 
elvis is online
Mar  1 16:07:03 elvis pengine: [4218]: info: determine_online_status: Node 
queen is online
Mar  1 16:07:03 elvis pengine: [4218]: notice: clone_print:  Clone Set: A-clone 
[A]
Mar  1 16:07:03 elvis pengine: [4218]: notice: short_print:      Started: [ 
elvis ]
Mar  1 16:07:03 elvis pengine: [4218]: notice: short_print:      Stopped: [ A:1 
]
Mar  1 16:07:03 elvis pengine: [4218]: notice: native_print: B-1 
(ocf::rgk:typeB): Started elvis
Mar  1 16:07:03 elvis pengine: [4218]: notice: native_print: B-2 
(ocf::rgk:typeB): Started elvis
Mar  1 16:07:03 elvis pengine: [4218]: notice: clone_print:  Clone Set: 
stonith-l2network-set [stonith-l2network]
Mar  1 16:07:03 elvis pengine: [4218]: notice: short_print:      Started: [ 
elvis ]
Mar  1 16:07:03 elvis pengine: [4218]: notice: short_print:      Stopped: [ 
stonith-l2network:1 ]
Mar  1 16:07:03 elvis pengine: [4218]: notice: LogActions: Leave resource A:0   
(Started elvis)
Mar  1 16:07:03 elvis pengine: [4218]: notice: LogActions: Start A:1    (queen)
Mar  1 16:07:03 elvis pengine: [4218]: notice: LogActions: Leave resource B-1   
(Started elvis)
Mar  1 16:07:03 elvis pengine: [4218]: notice: LogActions: Leave resource B-2   
(Started elvis)
Mar 1 16:07:03 elvis pengine: [4218]: notice: LogActions: Leave resource stonith-l2network:0 (Started elvis)
Mar  1 16:07:03 elvis pengine: [4218]: notice: LogActions: Start 
stonith-l2network:1    (queen)

Note that the dependent B-1 and B-2 resources are left running where they were. This is the proper and expected failover behavior, given that they have resource-stickiness set.
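For context, resource-stickiness on B-1 would appear in the CIB along these lines (a sketch only; the nvpair/meta_attributes ids are made up, and value="1" is assumed to match the stickiness used elsewhere in this thread):

```xml
<primitive id="B-1" class="ocf" provider="rgk" type="typeB">
  <meta_attributes id="B-1-meta_attributes">
    <nvpair id="B-1-meta_attributes-resource-stickiness"
            name="resource-stickiness" value="1"/>
  </meta_attributes>
</primitive>
```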


The same configuration, with the simple primitives replaced by groups of two primitives, exhibits incorrect failover behavior.

  Online: [ elvis queen ]
   Clone Set: AZ-clone [AZ-group]
       Started: [ elvis queen ]
   Resource Group: BC-group-1
       B-1      (ocf::rgk:typeB):       Started elvis
       C-1      (ocf::rgk:typeC):       Started elvis
   Clone Set: stonith-l2network-set [stonith-l2network]
       Started: [ elvis queen ]
   Resource Group: BC-group-2
       B-2      (ocf::rgk:typeB):       Started queen
       C-2      (ocf::rgk:typeC):       Started queen

    <constraints>
      <rsc_colocation id="AZ-with-BC-group-1" rsc="BC-group-1" score="INFINITY" 
with-rsc="AZ-clone"/>
      <rsc_colocation id="AZ-with-BC-group-2" rsc="BC-group-2" score="INFINITY" 
with-rsc="AZ-clone"/>
      <rsc_order first="AZ-clone" id="AZ-before-BC-group-1" symmetrical="true" 
then="BC-group-1"/>
      <rsc_order first="AZ-clone" id="AZ-before-BC-group-2" symmetrical="true" 
then="BC-group-2"/>
    </constraints>
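The group variant in crm shell notation would be roughly as follows (a sketch; placing resource-stickiness as group meta data is my assumption about where it is set in this configuration):

```
group BC-group-1 B-1 C-1 \
    meta resource-stickiness="1"
colocation AZ-with-BC-group-1 inf: BC-group-1 AZ-clone
order AZ-before-BC-group-1 inf: AZ-clone BC-group-1
```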

Mar  2 12:44:43 elvis pengine: [4218]: info: determine_online_status: Node 
elvis is online
Mar  2 12:44:43 elvis pengine: [4218]: info: determine_online_status: Node 
queen is online
Mar  2 12:44:43 elvis pengine: [4218]: notice: clone_print:  Clone Set: 
AZ-clone [AZ-group]
Mar  2 12:44:43 elvis pengine: [4218]: notice: short_print:      Started: [ 
elvis ]
Mar  2 12:44:43 elvis pengine: [4218]: notice: short_print:      Stopped: [ 
AZ-group:1 ]
Mar  2 12:44:43 elvis pengine: [4218]: notice: group_print:  Resource Group: 
BC-group-1
Mar  2 12:44:43 elvis pengine: [4218]: notice: native_print: B-1 
(ocf::rgk:typeB): Started elvis
Mar  2 12:44:43 elvis pengine: [4218]: notice: native_print: C-1 
(ocf::rgk:typeC): Started elvis
Mar  2 12:44:43 elvis pengine: [4218]: notice: clone_print:  Clone Set: 
stonith-l2network-set [stonith-l2network]
Mar  2 12:44:43 elvis pengine: [4218]: notice: short_print:      Started: [ 
elvis ]
Mar  2 12:44:43 elvis pengine: [4218]: notice: short_print:      Stopped: [ 
stonith-l2network:1 ]
Mar  2 12:44:43 elvis pengine: [4218]: notice: group_print:  Resource Group: 
BC-group-2
Mar  2 12:44:43 elvis pengine: [4218]: notice: native_print: B-2 
(ocf::rgk:typeB): Started elvis
Mar  2 12:44:43 elvis pengine: [4218]: notice: native_print: C-2 
(ocf::rgk:typeC): Started elvis
Mar  2 12:44:43 elvis pengine: [4218]: notice: LogActions: Leave resource A:0   
(Started elvis)
Mar  2 12:44:43 elvis pengine: [4218]: notice: LogActions: Leave resource Z:0   
(Started elvis)
Mar  2 12:44:43 elvis pengine: [4218]: notice: LogActions: Start A:1    (queen)
Mar  2 12:44:43 elvis pengine: [4218]: notice: LogActions: Start Z:1    (queen)
Mar  2 12:44:43 elvis pengine: [4218]: notice: LogActions: Restart resource B-1 
(Started elvis)
Mar  2 12:44:43 elvis pengine: [4218]: notice: LogActions: Restart resource C-1 
(Started elvis)
Mar 2 12:44:43 elvis pengine: [4218]: notice: LogActions: Leave resource stonith-l2network:0 (Started elvis)
Mar  2 12:44:43 elvis pengine: [4218]: notice: LogActions: Start 
stonith-l2network:1    (queen)
Mar  2 12:44:43 elvis pengine: [4218]: notice: LogActions: Restart resource B-2 
(Started elvis)
Mar  2 12:44:43 elvis pengine: [4218]: notice: LogActions: Restart resource C-2 
(Started elvis)

Note that the dependent B-1/C-1 and B-2/C-2 resources are restarted. This is incorrect failover behavior, given that they have resource-stickiness set.

This appears to me to be a clear bug. Pacemaker should not be handling groups any differently than it does primitives!



On 3/1/2011 5:48 PM, Ron Kerry wrote:
On 3/1/2011 2:39 PM, Ron Kerry wrote:
 > On 2/28/2011 2:33 PM, Ron Kerry wrote:
 >> Folks -
 >>
 >> I have a configuration issue that I am unsure how to resolve. Consider the 
following set of
 >> resources.
 >>
 >> clone rsc1-clone rsc1 \
 >> meta clone-max="2" target-role="Started"
 >> primitive rsc1 ...
 >> primitive rsc2 ... meta resource-stickiness="1"
 >> primitive rsc3 ... meta resource-stickiness="1"
 >>
 >> Plus the following constraints
 >>
 >> colocation rsc2-with-clone inf: rsc2 rsc1-clone
 >> colocation rsc3-with-clone inf: rsc3 rsc1-clone
 >> order clone-before-rsc2 : rsc1-clone rsc2
 >> order clone-before-rsc3 : rsc1-clone rsc3
 >>
 >>
 >> I am getting the following behavior that is undesirable.
 >>
 >> During normal operation, a copy of the rsc1 resource is running on my two 
systems with rsc2 and rsc3
 >> typically running split between the two systems. The rsc2 & rsc3 resources 
are operationally
 >> dependent on a copy of rsc1 being up and running first.
 >>
 >> SystemA   SystemB
 >> =======   =======
 >> rsc1      rsc1
 >> rsc2      rsc3
 >>
 >> If SystemB goes down, then rsc3 moves over to SystemA as expected
 >>
 >> SystemA   SystemB
 >> =======   =======
 >> rsc1      X X
 >> rsc2      X
 >> rsc3      X X
 >>
 >> When SystemB comes back into the cluster, crmd starts the rsc1 clone on 
SystemB but then also
 >> restarts both rsc2 & rsc3. This means both are stopped and then both 
started again. This is not what
 >> we want. We want these resources to remain running on SystemA until one of 
them is moved manually by
 >> an administrator to re-balance them across the systems.
 >>
 >> How do we configure these resources/constraints to achieve that behavior? 
We are already using
 >> resource-stickiness, but that is meaningless if crmd is going to be doing a 
restart of these
 >> resources.
 >>
 >
 > Using advisory (score="0") order constraints seems to achieve the correct 
behavior. I have not done
 > extensive testing yet to see if other failover behaviors are broken with 
this approach, but initial
 > basic testing looks good. It is always nice to answer one's own questions :-)
 >
 > colocation rsc2-with-clone inf: rsc2 rsc1-clone
 > colocation rsc3-with-clone inf: rsc3 rsc1-clone
 > order clone-before-rsc2 0: rsc1-clone rsc2
 > order clone-before-rsc3 0: rsc1-clone rsc3
 >
 > Does anyone know of any specific problems with this approach??
 >
 >

I set up a greatly simplified generic resource configuration:

  Online: [ elvis queen ]
   Clone Set: A-clone [A]
       Started: [ elvis queen ]
   B-1    (ocf::rgk:typeB):       Started elvis
   B-2    (ocf::rgk:typeB):       Started queen
   Clone Set: stonith-l2network-set [stonith-l2network]
       Started: [ elvis queen ]

The A and B resources are just shell scripts running an infinite while loop 
whose body is a sleep 5 command, so they run forever but do not consume 
machine resources.
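A minimal sketch of such a dummy script follows (the ITERATIONS bound and the shorter sleep are added here purely so the sketch terminates; the actual scripts loop forever with sleep 5):

```shell
#!/bin/sh
# Dummy resource workload: wake periodically, otherwise stay idle.
# The real scripts loop forever; ITERATIONS is a demo-only bound.
ITERATIONS=${ITERATIONS:-3}
i=0
while [ "$i" -lt "$ITERATIONS" ]; do
    sleep 1        # the scripts described above use 'sleep 5'
    i=$((i + 1))
done
echo "completed $i iterations"
```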

If I kill the A-clone running on queen, it just gets restarted and nothing at 
all happens to B-2 (it
stays on queen and never knows any different). This is not optimal behavior for 
our purposes.

However on the good side, if the A-clone cannot (re)start on queen, then B-2 
does fail over to elvis
as we expect.

Does anybody have any ideas about how to get the proper behavior in all cases?



--

Ron Kerry         rke...@sgi.com
Global Product Support - SGI Federal


_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
