Stonith is never a SPOF. Something else needs to have failed before fencing even has a chance to fail.
Unless you put all the nodes on the same PDU... but that would be silly.

On Mon, Feb 6, 2012 at 3:29 PM, Vladislav Bogdanov <bub...@hoster-ok.com> wrote:
> 06.02.2012 01:55, Andrew Beekhof wrote:
>> On Sat, Feb 4, 2012 at 5:50 AM, Vladislav Bogdanov <bub...@hoster-ok.com> wrote:
>>> Hi Andrew, Dejan, all,
>>>
>>> 25.01.2012 03:24, Andrew Beekhof wrote:
>>> [snip]
>>>>>> If they're for the same host but different devices, then at most
>>>>>> you'll get the commands sent in parallel; guaranteeing simultaneity
>>>>>> is near impossible.
>>>>>
>>>>> Yes, what I meant is almost simultaneous, i.e. that both ports
>>>>> are turned "off" at the same time for a while. I'm not sure how
>>>>> it works in reality. For instance, how long does the reset
>>>>> command keep the power off on the outlet? So, it should be
>>>>> "simultaneous enough" :)
>>>>
>>>> I don't think 'reboot' is an option if you're using multiple devices.
>>>> You have to use 'off' (followed by a manual 'on') for any kind of
>>>> reliability.
>>>>
>>>
>>> Why not implement the subsequent 'on's after all the 'off's are confirmed?
>>
>> That could be possible in the future.
>> However, since none of this was possible in the old stonithd, it's not
>> something I plan for the initial implementation.
>>
>> Also, you're requiring an extra level of intelligence in stonith-ng:
>> it has to know that even though the admin asked for 'reboot' and the
>> devices support 'reboot', it should ignore that and do 'off' + 'on' in
>> some specific scenarios.
>>
>>> With some configurable delay, for example.
>>> That would be great for careful admins who keep their fencing device
>>> lists up to date.
>>> From the admin's PoV, reset and reset-like off-on operations should not
>>> differ in their result: the offending host should be restarted if the
>>> admin says 'restart' or 'reboot' in the fencing parameters for that host
>>> (sorry, I don't remember which one is used).
>>> The need for a manual 'on' looks like a limitation to me, so I wouldn't
>>> use such a fencing mechanism. I prefer to have everything automated and
>>> as predictable as possible.
>>
>> Then don't put a node under the control of two devices.
>> Have it be two ports on the same device and you won't hit this limitation.
>
> It's a SPOF in the case of PDUs.
>
> I do not use PDUs at all; I have everything ready to short the 'reset'
> lines on the servers instead of pulling power cords, and I am just waiting
> for linear fencing topology to be implemented in both stonith-ng and crmsh.
>
> So, I just care about the generic admin who wants to use PDUs for fencing.
>
>>
>>> If the 'on' is not done, then fencing is not doing what you've specified
>>> (for the 'reboot/reset' action).
>>>
>>> Even more, if we need to do a 'reset' of a host which has two PSUs
>>> connected to two different PDUs, then it should be translated to
>>> 'all-off' - 'delay' - 'all-on' automatically. I would like such a
>>> powerful fencing system very much (yes, I'm a careful admin).
>>>
>>> I understand that the implementation will require some effort (even for
>>> as great a programmer as you, Andrew), but it would be a really useful
>>> feature.
>>>
>>> Best,
>>> Vladislav
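
For anyone wanting to approximate the behaviour discussed above, here is a
rough crmsh-style sketch. It assumes two APC-style PDUs driven by the
fence_apc agent, a node "node1" with one PSU on each, and a Pacemaker/crmsh
build that supports both fencing topology and the pcmk_reboot_action
remapping property (neither of which is universally available yet); the
addresses, credentials and outlet numbers are placeholders, so treat it as
an illustration rather than a tested configuration:

  # One stonith resource per PDU. pcmk_reboot_action=off remaps a
  # 'reboot' request to 'off', since a plain reboot cannot be made
  # atomic across two independent power devices; power then has to
  # be restored manually (or by a separate, explicit 'on').
  primitive pdu-a stonith:fence_apc \
          params ipaddr=10.0.0.10 login=apc passwd=secret port=3 \
                 pcmk_host_list="node1" pcmk_reboot_action=off
  primitive pdu-b stonith:fence_apc \
          params ipaddr=10.0.0.11 login=apc passwd=secret port=3 \
                 pcmk_host_list="node1" pcmk_reboot_action=off
  # Both devices sit in the same topology level, so fencing node1 is
  # only considered successful once *both* outlets report 'off'.
  fencing_topology node1: pdu-a,pdu-b

That covers the "all-off" half of the all-off/delay/all-on sequence
Vladislav describes; the automatic 'on' after a delay is exactly the part
that would still need new logic in stonith-ng.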