On 14 Nov 2013, at 5:53 pm, Kazunori INOUE <kazunori.ino...@gmail.com> wrote:
> Hi, Andrew
>
> 2013/11/13 Kazunori INOUE <kazunori.ino...@gmail.com>:
>> 2013/11/13 Andrew Beekhof <and...@beekhof.net>:
>>>
>>> On 16 Oct 2013, at 8:51 am, Andrew Beekhof <and...@beekhof.net> wrote:
>>>
>>>> On 15/10/2013, at 8:24 PM, Kazunori INOUE <kazunori.ino...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm using pacemaker-1.1 (the latest devel).
>>>>> On vm1, I started the resources that fence vm3 (f1 and f2).
>>>>>
>>>>> $ crm_mon -1
>>>>> Last updated: Tue Oct 15 15:16:37 2013
>>>>> Last change: Tue Oct 15 15:16:21 2013 via crmd on vm1
>>>>> Stack: corosync
>>>>> Current DC: vm1 (3232261517) - partition with quorum
>>>>> Version: 1.1.11-0.284.6a5e863.git.el6-6a5e863
>>>>> 3 Nodes configured
>>>>> 3 Resources configured
>>>>>
>>>>> Online: [ vm1 vm2 vm3 ]
>>>>>
>>>>> pDummy (ocf::pacemaker:Dummy): Started vm3
>>>>> Resource Group: gStonith3
>>>>>     f1 (stonith:external/libvirt): Started vm1
>>>>>     f2 (stonith:external/ssh): Started vm1
>>>>>
>>>>> When vm3 was fenced, the "reset" was carried out by f1 on vm2, even
>>>>> though f1 had not been started on vm2.
>>>>>
>>>>> $ ssh vm3 'rm -f /var/run/Dummy-pDummy.state'
>>>>> $ for i in vm1 vm2; do ssh $i 'hostname; egrep " reset | off " /var/log/ha-log'; done
>>>>> vm1
>>>>> Oct 15 15:17:35 vm1 stonith-ng[14870]: warning: log_operation: f2:15076 [ Performing: stonith -t external/ssh -T reset vm3 ]
>>>>> Oct 15 15:18:06 vm1 stonith-ng[14870]: warning: log_operation: f2:15464 [ Performing: stonith -t external/ssh -T reset vm3 ]
>>>>> vm2
>>>>> Oct 15 15:17:16 vm2 stonith-ng[9160]: warning: log_operation: f1:9273 [ Performing: stonith -t external/libvirt -T reset vm3 ]
>>>>> Oct 15 15:17:46 vm2 stonith-ng[9160]: warning: log_operation: f1:9588 [ Performing: stonith -t external/libvirt -T reset vm3 ]
>>>>>
>>>>> Is this the intended behavior?
>>>>
>>>> Yes, although the host on which the device is started usually gets priority.
>>>> I will try to find some time to look through the report to see why this didn't happen.
>>>
>>> Reading through this again, it sounds like it should be fixed by your earlier pull request:
>>>
>>> https://github.com/beekhof/pacemaker/commit/6b4bfd6
>>>
>>> Yes?
>>
>> No.
>
> How is this change?

Thanks for this.
I tweaked it a bit further and pushed: https://github.com/beekhof/pacemaker/commit/4cbbeb0

> diff --git a/fencing/remote.c b/fencing/remote.c
> index 6c11ba9..68b31c5 100644
> --- a/fencing/remote.c
> +++ b/fencing/remote.c
> @@ -778,6 +778,7 @@ stonith_choose_peer(remote_fencing_op_t * op)
>  {
>      st_query_result_t *peer = NULL;
>      const char *device = NULL;
> +    uint32_t active = fencing_active_peers();
>
>      do {
>          if (op->devices) {
> @@ -790,7 +791,8 @@ stonith_choose_peer(remote_fencing_op_t * op)
>
>          if ((peer = find_best_peer(device, op, FIND_PEER_SKIP_TARGET | FIND_PEER_VERIFIED_ONLY))) {
>              return peer;
> -        } else if ((peer = find_best_peer(device, op, FIND_PEER_SKIP_TARGET))) {
> +        } else if ((op->query_timer == 0 || op->replies >= op->replies_expected || op->replies >= active)
> +                   && (peer = find_best_peer(device, op, FIND_PEER_SKIP_TARGET))) {
>              return peer;
>          } else if ((peer = find_best_peer(device, op, FIND_PEER_TARGET_ONLY))) {
>              return peer;
> @@ -801,8 +803,13 @@
>             && stonith_topology_next(op) == pcmk_ok);
>
>      if (op->devices) {
> -        crm_notice("Couldn't find anyone to fence %s with %s", op->target,
> -                   (char *)op->devices->data);
> +        if (op->query_timer == 0 || op->replies >= op->replies_expected || op->replies >= active) {
> +            crm_notice("Couldn't find anyone to fence %s with %s", op->target,
> +                       (char *)op->devices->data);
> +        } else {
> +            crm_debug("Couldn't find verified device to fence %s with %s", op->target,
> +                      (char *)op->devices->data);
> +        }
>      } else {
>          crm_debug("Couldn't find anyone to fence %s", op->target);
>      }

>>>> I'm kind of swamped at the moment though.
>>>>
>>>>> Best Regards,
>>>>> Kazunori INOUE
>>>>> <stopped_resource_performed_reset.tar.bz2>
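P.S. For anyone skimming the archive later: the guard the patch adds boils
down to "don't fall back to an unverified peer until no further query replies
can change the picture". Below is a minimal standalone C sketch of just that
rule. It is not the real Pacemaker code: the field names (query_timer,
replies, replies_expected) and the fencing_active_peers() idea mirror the
diff above, while the struct and the harness in main() are hypothetical.

/* Standalone sketch of the fallback guard -- not Pacemaker source. */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical stand-in for the op state used by stonith_choose_peer(). */
struct fake_op {
    unsigned int query_timer;   /* 0 means the query timeout already fired */
    uint32_t replies;           /* query replies received so far */
    uint32_t replies_expected;  /* replies we asked for */
};

/* Hypothetical stand-in for fencing_active_peers(): peers currently alive. */
static uint32_t active_peers(void)
{
    return 3;
}

/*
 * Verified peers (where the device is started) are always eligible; an
 * unverified peer only becomes eligible once waiting longer cannot change
 * the outcome: the query timer has fired, or every expected (or every
 * active) peer has already replied.
 */
static bool may_use_unverified_peer(const struct fake_op *op)
{
    return op->query_timer == 0
           || op->replies >= op->replies_expected
           || op->replies >= active_peers();
}

int main(void)
{
    struct fake_op early = { .query_timer = 42, .replies = 1, .replies_expected = 3 };
    struct fake_op done  = { .query_timer = 42, .replies = 3, .replies_expected = 3 };

    /* Only 1 of 3 replies in: keep waiting for a verified peer. */
    printf("early: %s\n", may_use_unverified_peer(&early) ? "fall back" : "wait");
    /* All replies in: an unverified peer is now acceptable. */
    printf("done:  %s\n", may_use_unverified_peer(&done) ? "fall back" : "wait");
    return 0;
}

In the scenario from the original report, that should keep vm2 from running
f1 (which it never started) while vm1, where f1 is actually started, has yet
to answer the query.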