Hi,

I applied the changeset for Bug lf#2433 (No services should be stopped until probes finish) to pacemaker 1.0.7-4.1. Either I misinterpreted the bugfix or it's not working the way I thought it would.

While both of my dummy resources are running, issuing a cleanup (crm_resource -C) on dummy0 stops dummy1 (dummy1 is ordered after dummy0). It looks like the stop is issued to dummy1 without waiting for the monitor of dummy0 to return.
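For anyone who wants to check the same ordering in their own logs, here is a rough sketch of a script that flags any stop operation started while a probe (a `*_monitor_0` op) is still outstanding. It is only a log-reading heuristic, not how pacemaker itself decides anything. The function name `stops_during_probes` is hypothetical, and the regular expressions assume the crmd message format seen in this post (`do_lrm_rsc_op: Performing ... op=...` and `process_lrm_event: LRM operation ...`). Since the timestamps only have one-second resolution, it relies on the order of the lines rather than on the times.

```python
import re

# Assumed message formats, taken from the crmd lines in this post.
STARTED = re.compile(r"do_lrm_rsc_op: Performing key=\S+ op=(\S+)")
FINISHED = re.compile(r"process_lrm_event: LRM operation (\S+) \(")


def stops_during_probes(lines):
    """Return (stop_op, pending_probes) for each stop started mid-probe."""
    pending = set()      # probes requested but not yet reported back
    violations = []
    for line in lines:
        m = STARTED.search(line)
        if m:
            op = m.group(1)
            if op.endswith("_monitor_0"):
                pending.add(op)
            elif op.endswith("_stop_0") and pending:
                violations.append((op, sorted(pending)))
            continue
        m = FINISHED.search(line)
        if m:
            # A result for an op we were not tracking is simply ignored.
            pending.discard(m.group(1))
    return violations


# Excerpt of the syslog from this post (07:49:31 on qpr1).
SAMPLE = [
    "Aug 10 07:49:31 qpr1 crmd: [27131]: info: do_lrm_rsc_op: Performing "
    "key=5:14:7:78c6bc14-cfa7-4516-b291-610bf2ee22eb op=dummy0_monitor_0 )",
    "Aug 10 07:49:31 qpr1 lrmd: [27128]: info: rsc:dummy0:20: monitor",
    "Aug 10 07:49:31 qpr1 crmd: [27131]: info: do_lrm_rsc_op: Performing "
    "key=9:14:0:78c6bc14-cfa7-4516-b291-610bf2ee22eb op=dummy1_stop_0 )",
    "Aug 10 07:49:31 qpr1 lrmd: [27128]: info: rsc:dummy1:21: stop",
    "Aug 10 07:49:31 qpr1 crmd: [27131]: info: process_lrm_event: LRM operation "
    "dummy1_monitor_60000 (call=19, status=1, cib-update=0, confirmed=true) Cancelled",
    "Aug 10 07:49:31 qpr1 crmd: [27131]: info: process_lrm_event: LRM operation "
    "dummy0_monitor_0 (call=20, rc=0, cib-update=44, confirmed=true) ok",
]

if __name__ == "__main__":
    for stop_op, probes in stops_during_probes(SAMPLE):
        print(f"{stop_op} started while probes pending: {', '.join(probes)}")
```

Run against the excerpt above, it reports dummy1_stop_0 as started while dummy0_monitor_0 was still pending, which matches what I am seeing. To use it on a full log, pass the lines of the syslog file instead of `SAMPLE`.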

CIB:

node $id="281aeabe-f895-4499-8e45-b380b3e82e0b" qpr1
node $id="c2493a06-ff09-40cb-b47d-04dae5a00802" qpr2
primitive dummy0 ocf:heartbeat:Dummy \
        op monitor interval="60s" timeout="120s"
primitive dummy1 ocf:heartbeat:Dummy \
        op monitor interval="60s" timeout="120s"
colocation col-dummy0_dummy1 inf: dummy0 dummy1
order order-dummy0_dummy1 inf: dummy0:start dummy1:start
property $id="cib-bootstrap-options" \
        dc-version="1.0.7-d3fa20fc76c7947d6de66db7e52526dc6bd7d782" \
        cluster-infrastructure="Heartbeat" \
        stonith-enabled="false" \

Syslog:

Aug 10 07:49:30 qpr1 crm_shadow: [27517]: info: Invoked: crm_shadow
Aug 10 07:49:30 qpr1 cibadmin: [27518]: info: Invoked: cibadmin -Ql -o nodes
Aug 10 07:49:30 qpr1 cibadmin: [27519]: info: Invoked: cibadmin -Ql -o resources
Aug 10 07:49:30 qpr1 crm_resource: [27520]: info: Invoked: crm_resource -C -r dummy0 -H qpr1
Aug 10 07:49:30 qpr1 crmd: [27131]: info: do_lrm_invoke: Removing resource dummy0 from the LRM
Aug 10 07:49:30 qpr1 crmd: [27131]: info: send_direct_ack: ACK'ing resource op dummy0_delete_60000 from 0:0:crm-resource-27520: lrm_invoke-lrmd-1281440970-14
Aug 10 07:49:30 qpr1 crmd: [27131]: info: lrm_remove_deleted_op: Removing op dummy0_monitor_60000:17 for deleted resource dummy0
Aug 10 07:49:31 qpr1 crmd: [27131]: info: do_lrm_rsc_op: Performing key=5:14:7:78c6bc14-cfa7-4516-b291-610bf2ee22eb op=dummy0_monitor_0 )
Aug 10 07:49:31 qpr1 lrmd: [27128]: info: rsc:dummy0:20: monitor
Aug 10 07:49:31 qpr1 crmd: [27131]: info: do_lrm_rsc_op: Performing key=9:14:0:78c6bc14-cfa7-4516-b291-610bf2ee22eb op=dummy1_stop_0 )
Aug 10 07:49:31 qpr1 lrmd: [27128]: info: rsc:dummy1:21: stop
Aug 10 07:49:31 qpr1 crmd: [27131]: info: process_lrm_event: LRM operation dummy1_monitor_60000 (call=19, status=1, cib-update=0, confirmed=true) Cancelled
Aug 10 07:49:31 qpr1 crmd: [27131]: info: process_lrm_event: LRM operation dummy0_monitor_0 (call=20, rc=0, cib-update=44, confirmed=true) ok
Aug 10 07:49:31 qpr1 crm_resource: [27526]: info: Invoked: crm_resource -C -r dummy0 -H qpr2
Aug 10 07:49:31 qpr1 crmd: [27131]: info: process_lrm_event: LRM operation dummy1_stop_0 (call=21, rc=0, cib-update=45, confirmed=true) ok
Aug 10 07:49:31 qpr1 cib: [27525]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-11.raw
Aug 10 07:49:31 qpr1 cib: [27525]: info: write_cib_contents: Wrote version 0.17.0 of the CIB to disk (digest: 4ab7e93936f66e0ac5bd95aeae3afcbe)
Aug 10 07:49:31 qpr1 cib: [27525]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.09ENqU (digest: /var/lib/heartbeat/crm/cib.dIFAa2)
Aug 10 07:49:32 qpr1 crmd: [27131]: info: do_lrm_rsc_op: Performing key=8:15:0:78c6bc14-cfa7-4516-b291-610bf2ee22eb op=dummy0_monitor_60000 )
Aug 10 07:49:32 qpr1 lrmd: [27128]: info: rsc:dummy0:22: monitor
Aug 10 07:49:32 qpr1 crmd: [27131]: info: do_lrm_rsc_op: Performing key=9:15:0:78c6bc14-cfa7-4516-b291-610bf2ee22eb op=dummy1_start_0 )
Aug 10 07:49:32 qpr1 lrmd: [27128]: info: rsc:dummy1:23: start
Aug 10 07:49:32 qpr1 crmd: [27131]: info: process_lrm_event: LRM operation dummy1_start_0 (call=23, rc=0, cib-update=46, confirmed=true) ok
Aug 10 07:49:32 qpr1 crmd: [27131]: info: process_lrm_event: LRM operation dummy0_monitor_60000 (call=22, rc=0, cib-update=47, confirmed=false) ok
Aug 10 07:49:32 qpr1 cib: [27536]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-12.raw
Aug 10 07:49:32 qpr1 cib: [27536]: info: write_cib_contents: Wrote version 0.18.0 of the CIB to disk (digest: 1da0b5df8907098f77d8819e16380257)
Aug 10 07:49:32 qpr1 cib: [27536]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.G4maW1 (digest: /var/lib/heartbeat/crm/cib.kfLsWb)
Aug 10 07:49:34 qpr1 crmd: [27131]: info: do_lrm_rsc_op: Performing key=10:15:0:78c6bc14-cfa7-4516-b291-610bf2ee22eb op=dummy1_monitor_60000 )
Aug 10 07:49:34 qpr1 lrmd: [27128]: info: rsc:dummy1:24: monitor
Aug 10 07:49:34 qpr1 crmd: [27131]: info: process_lrm_event: LRM operation dummy1_monitor_60000 (call=24, rc=0, cib-update=48, confirmed=false) ok

I also patched and tried pacemaker 1.0.6-1 as a sanity check (same result). The cib files were deleted, the systems were rebooted and resources were recreated when switching versions.

Regards,
Troy

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker