On Tue, Aug 10, 2010 at 6:57 PM, Stepan, Troy <troy.ste...@unisys.com> wrote:
> Hi,
>
> I applied the changeset for Bug lf#2433 (No services should be stopped
> until probes finish) to pacemaker 1.0.7-4.1.
The PE is sufficiently complex that it's quite normal for backports like
this not to have the intended result. It's quite possible that this fix
builds on another one from 1.0.8 or 1.0.9. If the problem persists with
1.0.9, please let me know.

> Either I misinterpreted the bugfix or it's not working the way I thought
> it would. While both of my dummy rscs are running, issuing a cleanup on
> dummy0 stops dummy1 (dummy1 is ordered after dummy0). It looks like the
> stop is issued to dummy1 without waiting for the monitor of dummy0 to
> return.
>
> CIB:
>
> node $id="281aeabe-f895-4499-8e45-b380b3e82e0b" qpr1
> node $id="c2493a06-ff09-40cb-b47d-04dae5a00802" qpr2
> primitive dummy0 ocf:heartbeat:Dummy \
>         op monitor interval="60s" timeout="120s"
> primitive dummy1 ocf:heartbeat:Dummy \
>         op monitor interval="60s" timeout="120s"
> colocation col-dummy0_dummy1 inf: dummy0 dummy1
> order order-dummy0_dummy1 inf: dummy0:start dummy1:start
> property $id="cib-bootstrap-options" \
>         dc-version="1.0.7-d3fa20fc76c7947d6de66db7e52526dc6bd7d782" \
>         cluster-infrastructure="Heartbeat" \
>         stonith-enabled="false" \
>
> Syslog:
>
> Aug 10 07:49:30 qpr1 crm_shadow: [27517]: info: Invoked: crm_shadow
> Aug 10 07:49:30 qpr1 cibadmin: [27518]: info: Invoked: cibadmin -Ql -o nodes
> Aug 10 07:49:30 qpr1 cibadmin: [27519]: info: Invoked: cibadmin -Ql -o resources
> Aug 10 07:49:30 qpr1 crm_resource: [27520]: info: Invoked: crm_resource -C -r dummy0 -H qpr1
> Aug 10 07:49:30 qpr1 crmd: [27131]: info: do_lrm_invoke: Removing resource dummy0 from the LRM
> Aug 10 07:49:30 qpr1 crmd: [27131]: info: send_direct_ack: ACK'ing resource op dummy0_delete_60000 from 0:0:crm-resource-27520: lrm_invoke-lrmd-1281440970-14
> Aug 10 07:49:30 qpr1 crmd: [27131]: info: lrm_remove_deleted_op: Removing op dummy0_monitor_60000:17 for deleted resource dummy0
> Aug 10 07:49:31 qpr1 crmd: [27131]: info: do_lrm_rsc_op: Performing key=5:14:7:78c6bc14-cfa7-4516-b291-610bf2ee22eb op=dummy0_monitor_0 )
> Aug 10 07:49:31 qpr1 lrmd: [27128]: info: rsc:dummy0:20: monitor
> Aug 10 07:49:31 qpr1 crmd: [27131]: info: do_lrm_rsc_op: Performing key=9:14:0:78c6bc14-cfa7-4516-b291-610bf2ee22eb op=dummy1_stop_0 )
> Aug 10 07:49:31 qpr1 lrmd: [27128]: info: rsc:dummy1:21: stop
> Aug 10 07:49:31 qpr1 crmd: [27131]: info: process_lrm_event: LRM operation dummy1_monitor_60000 (call=19, status=1, cib-update=0, confirmed=true) Cancelled
> Aug 10 07:49:31 qpr1 crmd: [27131]: info: process_lrm_event: LRM operation dummy0_monitor_0 (call=20, rc=0, cib-update=44, confirmed=true) ok
> Aug 10 07:49:31 qpr1 crm_resource: [27526]: info: Invoked: crm_resource -C -r dummy0 -H qpr2
> Aug 10 07:49:31 qpr1 crmd: [27131]: info: process_lrm_event: LRM operation dummy1_stop_0 (call=21, rc=0, cib-update=45, confirmed=true) ok
> Aug 10 07:49:31 qpr1 cib: [27525]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-11.raw
> Aug 10 07:49:31 qpr1 cib: [27525]: info: write_cib_contents: Wrote version 0.17.0 of the CIB to disk (digest: 4ab7e93936f66e0ac5bd95aeae3afcbe)
> Aug 10 07:49:31 qpr1 cib: [27525]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.09ENqU (digest: /var/lib/heartbeat/crm/cib.dIFAa2)
> Aug 10 07:49:32 qpr1 crmd: [27131]: info: do_lrm_rsc_op: Performing key=8:15:0:78c6bc14-cfa7-4516-b291-610bf2ee22eb op=dummy0_monitor_60000 )
> Aug 10 07:49:32 qpr1 lrmd: [27128]: info: rsc:dummy0:22: monitor
> Aug 10 07:49:32 qpr1 crmd: [27131]: info: do_lrm_rsc_op: Performing key=9:15:0:78c6bc14-cfa7-4516-b291-610bf2ee22eb op=dummy1_start_0 )
> Aug 10 07:49:32 qpr1 lrmd: [27128]: info: rsc:dummy1:23: start
> Aug 10 07:49:32 qpr1 crmd: [27131]: info: process_lrm_event: LRM operation dummy1_start_0 (call=23, rc=0, cib-update=46, confirmed=true) ok
> Aug 10 07:49:32 qpr1 crmd: [27131]: info: process_lrm_event: LRM operation dummy0_monitor_60000 (call=22, rc=0, cib-update=47, confirmed=false) ok
> Aug 10 07:49:32 qpr1 cib: [27536]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-12.raw
> Aug 10 07:49:32 qpr1 cib: [27536]: info: write_cib_contents: Wrote version 0.18.0 of the CIB to disk (digest: 1da0b5df8907098f77d8819e16380257)
> Aug 10 07:49:32 qpr1 cib: [27536]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.G4maW1 (digest: /var/lib/heartbeat/crm/cib.kfLsWb)
> Aug 10 07:49:34 qpr1 crmd: [27131]: info: do_lrm_rsc_op: Performing key=10:15:0:78c6bc14-cfa7-4516-b291-610bf2ee22eb op=dummy1_monitor_60000 )
> Aug 10 07:49:34 qpr1 lrmd: [27128]: info: rsc:dummy1:24: monitor
> Aug 10 07:49:34 qpr1 crmd: [27131]: info: process_lrm_event: LRM operation dummy1_monitor_60000 (call=24, rc=0, cib-update=48, confirmed=false) ok
>
> I also patched and tried pacemaker 1.0.6-1 as a sanity check (same result).
> The cib files were deleted, the systems were rebooted, and the resources
> were recreated when switching versions.
>
> Regards,
> Troy
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
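A note for anyone reproducing this thread: the mandatory (inf) score on
order-dummy0_dummy1 is what entitles the PE to schedule dummy1_stop as soon
as the cleanup erases dummy0's operation history, and the lf#2433 fix is
about making that stop wait for dummy0's re-probe. As an illustrative
comparison only, not a fix for the probe race itself, one would expect an
advisory ordering (score 0 instead of inf) to leave dummy1 running across
the re-probe, for example:

    order order-dummy0_dummy1 0: dummy0:start dummy1:start

With a score of 0 the PE still prefers to start dummy0 before dummy1 when
both are being started, but it is no longer obliged to stop dummy1 whenever
dummy0's state becomes unknown.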