I have narrowed this down to what I believe is really a bug in the way Pacemaker
deals with constraints made between resource groups as opposed to resource
primitives. The version of Pacemaker involved here is:
libpacemaker3-1.1.2-0.7.1
pacemaker-1.1.2-0.7.1
A configuration that uses colocation/order constraints between a clone and a
simple primitive exhibits the proper failover behavior:
Online: [ elvis queen ]
Clone Set: A-clone [A]
Started: [ elvis queen ]
B-1 (ocf::rgk:typeB): Started elvis
B-2 (ocf::rgk:typeB): Started queen
Clone Set: stonith-l2network-set [stonith-l2network]
Started: [ elvis queen ]
<constraints>
  <rsc_colocation id="A-with-B-1" rsc="B-1" score="INFINITY" with-rsc="A-clone"/>
  <rsc_colocation id="A-with-B-2" rsc="B-2" score="INFINITY" with-rsc="A-clone"/>
  <rsc_order first="A-clone" id="A-before-B-1" symmetrical="true" then="B-1"/>
  <rsc_order first="A-clone" id="A-before-B-2" symmetrical="true" then="B-2"/>
</constraints>
This is from after queen came back into the cluster after being reset:
Mar 1 16:07:03 elvis pengine: [4218]: info: determine_online_status: Node elvis is online
Mar 1 16:07:03 elvis pengine: [4218]: info: determine_online_status: Node queen is online
Mar 1 16:07:03 elvis pengine: [4218]: notice: clone_print: Clone Set: A-clone [A]
Mar 1 16:07:03 elvis pengine: [4218]: notice: short_print: Started: [ elvis ]
Mar 1 16:07:03 elvis pengine: [4218]: notice: short_print: Stopped: [ A:1 ]
Mar 1 16:07:03 elvis pengine: [4218]: notice: native_print: B-1 (ocf::rgk:typeB): Started elvis
Mar 1 16:07:03 elvis pengine: [4218]: notice: native_print: B-2 (ocf::rgk:typeB): Started elvis
Mar 1 16:07:03 elvis pengine: [4218]: notice: clone_print: Clone Set: stonith-l2network-set [stonith-l2network]
Mar 1 16:07:03 elvis pengine: [4218]: notice: short_print: Started: [ elvis ]
Mar 1 16:07:03 elvis pengine: [4218]: notice: short_print: Stopped: [ stonith-l2network:1 ]
Mar 1 16:07:03 elvis pengine: [4218]: notice: LogActions: Leave resource A:0 (Started elvis)
Mar 1 16:07:03 elvis pengine: [4218]: notice: LogActions: Start A:1 (queen)
Mar 1 16:07:03 elvis pengine: [4218]: notice: LogActions: Leave resource B-1 (Started elvis)
Mar 1 16:07:03 elvis pengine: [4218]: notice: LogActions: Leave resource B-2 (Started elvis)
Mar 1 16:07:03 elvis pengine: [4218]: notice: LogActions: Leave resource stonith-l2network:0 (Started elvis)
Mar 1 16:07:03 elvis pengine: [4218]: notice: LogActions: Start stonith-l2network:1 (queen)
Note that the dependent B-1 and B-2 resources are left running where they were.
This is the proper and expected failover behavior, given that they have
resource-stickiness set.
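For reference, the stickiness is set as a meta attribute on each primitive. A
minimal crm sketch (the agents' actual parameters are omitted, and the value
"1" is assumed here, mirroring the rsc2/rsc3 examples quoted further down):

primitive B-1 ocf:rgk:typeB \
        meta resource-stickiness="1"    # value assumed, as in rsc2/rsc3 below
primitive B-2 ocf:rgk:typeB \
        meta resource-stickiness="1"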
The same configuration, with each simple primitive replaced by a group of two
primitives, exhibits incorrect failover behavior.
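For clarity, the groups and clone in this second configuration would be
defined along these lines in crm syntax (a sketch: agent parameters are
omitted, and the clone options are assumed to match the rsc1-clone example
quoted below):

group AZ-group A Z
clone AZ-clone AZ-group \
        meta clone-max="2" target-role="Started"   # options assumed
group BC-group-1 B-1 C-1
group BC-group-2 B-2 C-2

The resulting status and constraints: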
Online: [ elvis queen ]
Clone Set: AZ-clone [AZ-group]
Started: [ elvis queen ]
Resource Group: BC-group-1
B-1 (ocf::rgk:typeB): Started elvis
C-1 (ocf::rgk:typeC): Started elvis
Clone Set: stonith-l2network-set [stonith-l2network]
Started: [ elvis queen ]
Resource Group: BC-group-2
B-2 (ocf::rgk:typeB): Started queen
C-2 (ocf::rgk:typeC): Started queen
<constraints>
  <rsc_colocation id="AZ-with-BC-group-1" rsc="BC-group-1" score="INFINITY" with-rsc="AZ-clone"/>
  <rsc_colocation id="AZ-with-BC-group-2" rsc="BC-group-2" score="INFINITY" with-rsc="AZ-clone"/>
  <rsc_order first="AZ-clone" id="AZ-before-BC-group-1" symmetrical="true" then="BC-group-1"/>
  <rsc_order first="AZ-clone" id="AZ-before-BC-group-2" symmetrical="true" then="BC-group-2"/>
</constraints>
Mar 2 12:44:43 elvis pengine: [4218]: info: determine_online_status: Node elvis is online
Mar 2 12:44:43 elvis pengine: [4218]: info: determine_online_status: Node queen is online
Mar 2 12:44:43 elvis pengine: [4218]: notice: clone_print: Clone Set: AZ-clone [AZ-group]
Mar 2 12:44:43 elvis pengine: [4218]: notice: short_print: Started: [ elvis ]
Mar 2 12:44:43 elvis pengine: [4218]: notice: short_print: Stopped: [ AZ-group:1 ]
Mar 2 12:44:43 elvis pengine: [4218]: notice: group_print: Resource Group: BC-group-1
Mar 2 12:44:43 elvis pengine: [4218]: notice: native_print: B-1 (ocf::rgk:typeB): Started elvis
Mar 2 12:44:43 elvis pengine: [4218]: notice: native_print: C-1 (ocf::rgk:typeC): Started elvis
Mar 2 12:44:43 elvis pengine: [4218]: notice: clone_print: Clone Set: stonith-l2network-set [stonith-l2network]
Mar 2 12:44:43 elvis pengine: [4218]: notice: short_print: Started: [ elvis ]
Mar 2 12:44:43 elvis pengine: [4218]: notice: short_print: Stopped: [ stonith-l2network:1 ]
Mar 2 12:44:43 elvis pengine: [4218]: notice: group_print: Resource Group: BC-group-2
Mar 2 12:44:43 elvis pengine: [4218]: notice: native_print: B-2 (ocf::rgk:typeB): Started elvis
Mar 2 12:44:43 elvis pengine: [4218]: notice: native_print: C-2 (ocf::rgk:typeC): Started elvis
Mar 2 12:44:43 elvis pengine: [4218]: notice: LogActions: Leave resource A:0 (Started elvis)
Mar 2 12:44:43 elvis pengine: [4218]: notice: LogActions: Leave resource Z:0 (Started elvis)
Mar 2 12:44:43 elvis pengine: [4218]: notice: LogActions: Start A:1 (queen)
Mar 2 12:44:43 elvis pengine: [4218]: notice: LogActions: Start Z:1 (queen)
Mar 2 12:44:43 elvis pengine: [4218]: notice: LogActions: Restart resource B-1 (Started elvis)
Mar 2 12:44:43 elvis pengine: [4218]: notice: LogActions: Restart resource C-1 (Started elvis)
Mar 2 12:44:43 elvis pengine: [4218]: notice: LogActions: Leave resource stonith-l2network:0 (Started elvis)
Mar 2 12:44:43 elvis pengine: [4218]: notice: LogActions: Start stonith-l2network:1 (queen)
Mar 2 12:44:43 elvis pengine: [4218]: notice: LogActions: Restart resource B-2 (Started elvis)
Mar 2 12:44:43 elvis pengine: [4218]: notice: LogActions: Restart resource C-2 (Started elvis)
Note that the dependent B-1/C-1 and B-2/C-2 resources are restarted. This is
incorrect failover behavior, given that they have resource-stickiness set.
This appears to me to be a clear bug. Pacemaker should not handle groups any
differently than it handles primitives!
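For what it's worth, the advisory-order workaround I describe below
(score="0" in place of the default mandatory ordering) can be written for the
group configuration as well. A sketch in XML, assuming the advisory semantics
carry over to groups the same way they do for primitives:

<constraints>
  <rsc_colocation id="AZ-with-BC-group-1" rsc="BC-group-1" score="INFINITY" with-rsc="AZ-clone"/>
  <rsc_colocation id="AZ-with-BC-group-2" rsc="BC-group-2" score="INFINITY" with-rsc="AZ-clone"/>
  <!-- score="0" makes the ordering advisory; behavior for groups is assumed -->
  <rsc_order first="AZ-clone" id="AZ-before-BC-group-1" score="0" then="BC-group-1"/>
  <rsc_order first="AZ-clone" id="AZ-before-BC-group-2" score="0" then="BC-group-2"/>
</constraints>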
On 3/1/2011 5:48 PM, Ron Kerry wrote:
On 3/1/2011 2:39 PM, Ron Kerry wrote:
> On 2/28/2011 2:33 PM, Ron Kerry wrote:
>> Folks -
>>
>> I have a configuration issue that I am unsure how to resolve. Consider the
>> following set of resources.
>>
>> clone rsc1-clone rsc1 \
>> meta clone-max="2" target-role="Started"
>> primitive rsc1 ...
>> primitive rsc2 ... meta resource-stickiness="1"
>> primitive rsc3 ... meta resource-stickiness="1"
>>
>> Plus the following constraints
>>
>> colocation rsc2-with-clone inf: rsc2 rsc1-clone
>> colocation rsc3-with-clone inf: rsc3 rsc1-clone
>> order clone-before-rsc2 : rsc1-clone rsc2
>> order clone-before-rsc3 : rsc1-clone rsc3
>>
>>
>> I am getting the following behavior that is undesirable.
>>
>> During normal operation, a copy of the rsc1 resource is running on each of
>> my two systems, with rsc2 and rsc3 typically split between the two systems.
>> The rsc2 & rsc3 resources are operationally dependent on a copy of rsc1
>> being up and running first.
>>
>> SystemA    SystemB
>> =======    =======
>> rsc1       rsc1
>> rsc2       rsc3
>>
>> If SystemB goes down, then rsc3 moves over to SystemA as expected.
>>
>> SystemA    SystemB
>> =======    =======
>> rsc1       X X
>> rsc2       X
>> rsc3       X X
>>
>> When SystemB comes back into the cluster, crmd starts the rsc1 clone on
>> SystemB but then also restarts both rsc2 & rsc3. This means both are
>> stopped and then both started again. This is not what we want. We want
>> these resources to remain running on SystemA until one of them is moved
>> manually by an administrator to re-balance them across the systems.
>>
>> How do we configure these resources/constraints to achieve that behavior?
>> We are already using resource-stickiness, but that is meaningless if crmd
>> is going to be doing a restart of these resources.
>>
>
> Using advisory (score="0") order constraints seems to achieve the correct
> behavior. I have not done extensive testing yet to see whether other
> failover behaviors are broken with this approach, but initial basic testing
> looks good. It is always nice to answer one's own questions :-)
>
> colocation rsc2-with-clone inf: rsc2 rsc1-clone
> colocation rsc3-with-clone inf: rsc3 rsc1-clone
> order clone-before-rsc2 0: rsc1-clone rsc2
> order clone-before-rsc3 0: rsc1-clone rsc3
>
> Does anyone know of any specific problems with this approach??
>
>
I set up a greatly simplified generic resource configuration:
Online: [ elvis queen ]
Clone Set: A-clone [A]
Started: [ elvis queen ]
B-1 (ocf::rgk:typeB): Started elvis
B-2 (ocf::rgk:typeB): Started queen
Clone Set: stonith-l2network-set [stonith-l2network]
Started: [ elvis queen ]
The A and B resources are just shell scripts running an infinite while loop
whose body is a sleep 5 command, so they run forever but do not consume
machine resources.
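Each script's start action boils down to something like this (a sketch, not
the actual agent):

#!/bin/sh
# Loop forever; the sleep keeps the process alive as a "running"
# resource without burning CPU.
while true; do
    sleep 5
done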
If I kill the A-clone running on queen, it just gets restarted and nothing at
all happens to B-2 (it
stays on queen and never knows any different). This is not optimal behavior for
our purposes.
However, on the good side, if the A-clone cannot (re)start on queen, then B-2
does fail over to elvis as we expect.
Does anybody have any ideas about how to get the proper behavior in all cases?
--
Ron Kerry rke...@sgi.com
Global Product Support - SGI Federal
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker