[Pacemaker] [Question] About the stop order at the time of the Probe error.

renayama19661014 Tue, 21 Aug 2012 23:49:42 -0700

Hi All,

We found a problem with the stop order when a Probe error occurs.


We use the following simple resource configuration.

============
Last updated: Wed Aug 22 15:19:50 2012
Stack: Heartbeat
Current DC: drbd1 (6081ac99-d941-40b9-a4a3-9f996ff291c0) - partition with quorum
Version: 1.0.12-c6770b8
1 Nodes configured, unknown expected votes
1 Resources configured.
============

Online: [ drbd1 ]

 Resource Group: grpTest
     resource1  (ocf::pacemaker:Dummy): Started drbd1
     resource2  (ocf::pacemaker:Dummy): Started drbd1
     resource3  (ocf::pacemaker:Dummy): Started drbd1
     resource4  (ocf::pacemaker:Dummy): Started drbd1

Node Attributes:
* Node drbd1:

Migration summary:
* Node drbd1: 


Depending on which resources the Probe error occurs on, the resources are not 
stopped in inverse order.

I confirmed it with the following procedure.

Step 1) Put resource2 and resource4 into a started state by creating their state files.

[root@drbd1 ~]# touch /var/run/Dummy-resource2.state
[root@drbd1 ~]# touch /var/run/Dummy-resource4.state

Step 2) Start the node and load the CIB.

Step 3) resource2 and resource4 are stopped, but not in inverse order.

(snip)
Aug 22 15:19:47 drbd1 pengine: [32722]: notice: group_print:  Resource Group: grpTest
Aug 22 15:19:47 drbd1 pengine: [32722]: notice: native_print:      resource1#011(ocf::pacemaker:Dummy):#011Stopped
Aug 22 15:19:47 drbd1 pengine: [32722]: notice: native_print:      resource2#011(ocf::pacemaker:Dummy):#011Started drbd1
Aug 22 15:19:47 drbd1 pengine: [32722]: notice: native_print:      resource3#011(ocf::pacemaker:Dummy):#011Stopped
Aug 22 15:19:47 drbd1 pengine: [32722]: notice: native_print:      resource4#011(ocf::pacemaker:Dummy):#011Started drbd1
(snip)
Aug 22 15:19:47 drbd1 crmd: [32719]: info: te_rsc_command: Initiating action 6: stop resource2_stop_0 on drbd1 (local)
Aug 22 15:19:47 drbd1 crmd: [32719]: info: do_lrm_rsc_op: Performing key=6:2:0:5c924067-0d20-48fd-9772-88e530661270 op=resource2_stop_0 )
Aug 22 15:19:47 drbd1 lrmd: [32716]: info: rsc:resource2 stop[6] (pid 32745)
Aug 22 15:19:47 drbd1 crmd: [32719]: info: te_rsc_command: Initiating action 11: stop resource4_stop_0 on drbd1 (local)
Aug 22 15:19:47 drbd1 crmd: [32719]: info: do_lrm_rsc_op: Performing key=11:2:0:5c924067-0d20-48fd-9772-88e530661270 op=resource4_stop_0 )
Aug 22 15:19:47 drbd1 lrmd: [32716]: info: rsc:resource4 stop[7] (pid 32746)
Aug 22 15:19:47 drbd1 lrmd: [32716]: info: operation stop[6] on resource2 for client 32719: pid 32745 exited with return code 0
(snip)
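
To read the stop order out of a log excerpt like this one, the following shell
sketch lists the resources in the order in which crmd initiated their stop
actions. It assumes each log entry sits on a single line (not wrapped by a mail
client) and is fed on standard input. The function name stop_order is
hypothetical and not part of Pacemaker.

```shell
# Print the resources whose stop action crmd initiated, in log order.
# Hypothetical helper: reads log entries from stdin, one entry per line.
stop_order() {
    sed -n 's/.*te_rsc_command: Initiating action [0-9]*: stop \([^ ]*\)_stop_0 on .*/\1/p'
}
```

For the excerpt above, feeding the two te_rsc_command lines into stop_order
lists resource2 before resource4. The lrmd lines additionally show that the
stop of resource4 was launched before the stop of resource2 had returned.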


I understand that this stop order is caused by the way ordering inside a group works.

Even in this case, our user needs the resources to always be stopped in inverse order:

 * resource4_stop -> resource2_stop

The stop order is important for our user's resources.


I have the following questions.

Question 1) Is there a correct setting in cib.xml to avoid this problem?

Question 2) Does this problem also occur in Pacemaker 1.1?

Question 3) I added the following order constraints.


        <rsc_order id="order-2" first="resource1" then="resource3" />
        <rsc_order id="order-3" first="resource1" then="resource4" />
        <rsc_order id="order-5" first="resource2" then="resource4" />

            Adding these constraints seems to solve the problem.
            Is adding such order constraints also a valid solution?
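
The three constraints in Question 3 cover exactly the pairs of group members
that are not next to each other (resource1/resource3, resource1/resource4,
resource2/resource4). As a sketch only, assuming the same pattern were wanted
for a longer group, a shell function could print them. The function name
gen_orders and the id numbering scheme are hypothetical; numbering all pairs
in sequence merely happens to reproduce the ids order-2, order-3 and order-5.
Whether such constraints are the right fix is the open question here.

```shell
# Print an rsc_order constraint for every non-adjacent pair of the
# resources given as arguments (in group order). Hypothetical helper.
# The counter n runs over all pairs, so adjacent pairs leave gaps in the ids.
gen_orders() {
    n=0
    i=1
    while [ "$i" -le "$#" ]; do
        j=$((i + 1))
        while [ "$j" -le "$#" ]; do
            n=$((n + 1))
            if [ $((j - i)) -gt 1 ]; then
                # Look up the i-th and j-th positional parameters.
                eval "first=\${$i} second=\${$j}"
                printf '<rsc_order id="order-%d" first="%s" then="%s" />\n' \
                    "$n" "$first" "$second"
            fi
            j=$((j + 1))
        done
        i=$((i + 1))
    done
}
```

Calling gen_orders resource1 resource2 resource3 resource4 prints the same
three rsc_order lines as in Question 3.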


Best Regards,
Hideo Yamauchi.


