On 4 Dec 2013, at 11:47 am, Brian J. Murrell <br...@interlinx.bc.ca> wrote:
>
> On Tue, 2013-12-03 at 18:26 -0500, David Vossel wrote:
>>
>> We did away with all of the policy engine logic involved with trying to move
>> fencing devices off of the target node before executing the fencing action.
>> Behind the scenes all fencing devices are now essentially clones. If the
>> target node to be fenced has a fencing device running on it, that device can
>> execute anywhere in the cluster to avoid the "suicide" situation.
>
> OK.
>
>> When you are looking at crm_mon output and see a fencing device is running
>> on a specific node, all that really means is that we are going to attempt to
>> execute fencing actions for that device from that node first. If that node
>> is unavailable,
>
> Would it be better to not even try to use a node and ask it to commit
> suicide but always try to use another node?

IIRC the only time we ask a node to fence itself is when it is (or thinks it is) the last node standing.

>
>> we'll try that same device anywhere in the cluster we can get it to work
>
> OK.
>
>> (unless you've specifically built some location constraint that prevents the
>> fencing device from ever running on a specific node)
>
> While I do have constraints on the more service-oriented resources to
> give them preferred nodes, I don't have any constraints on the fencing
> resources.
>
> So given all of the above, and given the log I supplied showing that the
> fencing was just not being attempted anywhere other than the node to be
> fenced (which was down during that log) any clues as to where to look
> for why?
>
>> Hope that helps.
>
> It explains the differences, but unfortunately I'm still not sure why it
> wouldn't get run somewhere else, eventually, rather than continually
> being attempted on the node to be killed (which as I mentioned, was shut
> down at the time the log was made).

Yes, this is surprising. Can you enable the blackbox for stonith-ng, reproduce and generate a crm_report for us please?
It will contain all the information we need.
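
For anyone following the thread, the ordering described above (try the node where crm_mon shows the device first, fall back to any other node, and only ask a node to fence itself when it is the last one standing) can be modelled roughly as below. This is a toy sketch only, not Pacemaker's or stonith-ng's actual code: the function name `fencing_candidates`, its parameters, and the node names are all made up for illustration, and the `banned` set is just a stand-in for location constraints.

```python
from typing import Iterable, List, Optional, Set


def fencing_candidates(
    target: str,
    preferred: str,
    online: Iterable[str],
    banned: Optional[Set[str]] = None,
) -> List[str]:
    """Return, in order, the nodes to try for fencing `target`.

    Hypothetical model of the behaviour described in the thread:
    - `preferred` is the node crm_mon shows the device "running" on.
    - `online` is the set of nodes currently available.
    - `banned` stands in for location constraints on the device.
    """
    banned = banned or set()
    usable = [node for node in online if node not in banned]

    # Every usable node other than the target can run the device.
    others = [node for node in usable if node != target]

    ordered: List[str] = []
    # Try the node the device is nominally running on first.
    if preferred in others:
        ordered.append(preferred)
    # Then fall back to anywhere else in the cluster.
    ordered.extend(node for node in others if node != preferred)

    # Self-fencing is the last resort: only when no other node is left.
    if not ordered and target in usable:
        ordered.append(target)
    return ordered


if __name__ == "__main__":
    # Device nominally on node1, which is also the node to be fenced.
    print(fencing_candidates("node1", "node1", ["node1", "node2", "node3"]))
```

In this model a device that is nominally on the target node is never attempted there while other nodes remain, which is the behaviour the reported log appears to contradict.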
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org