07.02.2012 00:22, Andrew Beekhof wrote:
> Stonith is never a SPOF.

Sorry for being unclear. I meant that a redundant PSU connected to two
outlets of the same PDU (which is in turn fed from a single power source)
is a SPOF for that node, not for the cluster. So I assume everybody
connects each redundant PSU to two different PDUs fed from two different
power sources. In that case a reset-like operation (all outlets off, then
all outlets on) cannot be performed on the two power outlets from within a
single instance of a fencing agent, because each instance knows about only
one PDU. That logic should therefore be moved one layer up.
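To make that concrete, here is a rough sketch of the sequencing I have in
mind for that upper layer. Everything below (the FenceDevice interface, the
function name, the parameters) is invented for illustration only and is not
the stonith-ng API:

import time

class FenceDevice:
    """One fencing-agent instance, i.e. one PDU feeding one of the PSUs.
    A real implementation would wrap the concrete PDU fencing agent."""
    def power_off(self, outlet): raise NotImplementedError
    def power_on(self, outlet): raise NotImplementedError
    def is_off(self, outlet): raise NotImplementedError

def coordinated_reboot(outlets, delay_s=5, confirm_timeout_s=60):
    """Translate 'reboot' into all-off -> confirm -> delay -> all-on across
    several PDUs. 'outlets' is a list of (device, outlet) pairs, one per
    PSU of the host being fenced."""
    # 1. Cut power on every outlet, so the host cannot keep running on
    #    its second PSU.
    for dev, outlet in outlets:
        dev.power_off(outlet)

    # 2. Confirm every 'off' before anything is switched back on; if one
    #    PDU did not really cut power, the host was never fenced and the
    #    whole operation must fail.
    deadline = time.time() + confirm_timeout_s
    for dev, outlet in outlets:
        while not dev.is_off(outlet):
            if time.time() > deadline:
                raise RuntimeError("outlet still powered, fencing failed")
            time.sleep(1)

    # 3. Configurable settling delay between all-off and all-on.
    time.sleep(delay_s)

    # 4. Power everything back on, so no manual 'on' is needed afterwards.
    for dev, outlet in outlets:
        dev.power_on(outlet)

The important part is that the 'on' half runs only after every 'off' has
been confirmed, which is exactly what a single PDU agent cannot guarantee
on its own.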
>
> Something else needs to have failed before fencing has even a chance to do so.
>
> Unless you put all the nodes on the same PDU... but that would be silly.
>
> On Mon, Feb 6, 2012 at 3:29 PM, Vladislav Bogdanov <bub...@hoster-ok.com>
> wrote:
>> 06.02.2012 01:55, Andrew Beekhof wrote:
>>> On Sat, Feb 4, 2012 at 5:50 AM, Vladislav Bogdanov <bub...@hoster-ok.com>
>>> wrote:
>>>> Hi Andrew, Dejan, all,
>>>>
>>>> 25.01.2012 03:24, Andrew Beekhof wrote:
>>>> [snip]
>>>>>>> If they're for the same host but different devices, then at most
>>>>>>> you'll get the commands sent in parallel; guaranteeing simultaneity
>>>>>>> is near impossible.
>>>>>>
>>>>>> Yes, what I meant is almost simultaneous, i.e. that both ports are
>>>>>> turned "off" at the same time for a while. I'm not sure how it works
>>>>>> in reality. For instance, how long does the reset command keep the
>>>>>> power off on the outlet? So, it should be "simultaneous enough" :)
>>>>>
>>>>> I don't think 'reboot' is an option if you're using multiple devices.
>>>>> You have to use 'off' (followed by a manual 'on') for any kind of
>>>>> reliability.
>>>>>
>>>>
>>>> Why not implement subsequent 'ons' after all 'offs' are confirmed?
>>>
>>> That could be possible in the future.
>>> However, since none of this was possible in the old stonithd, it's not
>>> something I plan for the initial implementation.
>>>
>>> Also, you're requiring an extra level of intelligence in stonith-ng,
>>> to know that even though the admin asked for 'reboot' and the devices
>>> support 'reboot', we should ignore that and do 'off' + 'on' in some
>>> specific scenarios.
I just
>>>> With some configurable delay, for example.
>>>> That would be great for careful admins who keep their fencing device
>>>> lists up to date.
>>>> From the admin's PoV, a reset and a reset-like off-on sequence should
>>>> not differ in result: the offending host should be restarted if the
>>>> admin specifies 'restart' or 'reboot' in the fencing parameters for
>>>> that host (sorry, I do not remember which one is used).
>>>> The need for a manual 'on' looks like a limitation to me, so I wouldn't
>>>> use such a fencing mechanism. I prefer to have everything as automated
>>>> and predictable as possible.
>>>
>>> Then don't put a node under the control of two devices.
>>> Have it be two ports on the same host and you won't hit this limitation.
>>
>> It's a SPOF in the case of PDUs.
>>
>> I do not use PDUs at all; I have everything ready to short the 'reset'
>> lines on my servers instead of pulling power cords, and am just waiting
>> for linear fencing topology to be implemented in both stonith-ng and
>> crmsh.
>>
>> So, I just care about the generic admin who wants to use PDUs for fencing.
>>
>>>
>>>> If the 'on' is not done, then fencing is not doing what you specified
>>>> (for the 'reboot/reset' action).
>>>>
>>>> Even more, if we need to do a 'reset' of a host which has two PSUs
>>>> connected to two different PDUs, then it should be translated to
>>>> 'all-off' - 'delay' - 'all-on' automatically. I would like such a
>>>> powerful fencing system very much (yes, I'm a careful admin).
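Expanding on my own point just above: with the hypothetical sketch from the
top of this mail (device names and outlet numbers invented for
illustration), such a host would be handled roughly as

coordinated_reboot([(pdu_a, 3), (pdu_b, 3)], delay_s=10)

i.e. 'all-off' - confirm - delay - 'all-on', with no manual 'on' needed
afterwards.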
>>>>
>>>> I understand that the implementation will require some effort (even
>>>> for as great a programmer as you, Andrew), but it would be a really
>>>> useful feature.
>>>>
>>>> Best,
>>>> Vladislav

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org