Hi Dejan,

Dejan Muhamedagic wrote:
As usual, constructive criticism/suggestions/etc are welcome.

Thanks for sharing.
Allow me to bring up a topic that, from my point of view, is important.
You have written:

The lights-out devices (IBM RSA, HP iLO, Dell DRAC) are becoming increasingly
popular and in future they may even become standard equipment of off-the-shelf
computers. They are, however, inferior to UPS devices, because they share a
power supply with their host (a cluster node). If a node stays without power,
the device supposed to control it would be just as useless. Even though this
is obvious to us, the cluster manager is not in the know and will try to fence
the node in vain. This will continue forever because all other resource
operations would wait for the fencing/stonith operation to succeed.

PDUs have the same problem, since they share a power supply with the host
as well.  Is there any intention to deal with this issue?  I'm thinking of
a powerfail algorithm:

If the PDU becomes unavailable and shortly afterwards the host becomes
unavailable as well, then assume the host is down and fenced successfully.

This would be true if the PDU (and with it the host) loses power.
At the moment it looks as though stonith without such an algorithm is
a SPoF by design, because after a single failure (power loss) the
cluster is not able to bring up the resources again.
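
To make the idea a bit more concrete, here is a rough sketch in Python of
the decision rule I have in mind. It is only an illustration: the names
(LossTimes, assume_fenced) and the 10 second window are made up by me and
are not part of Pacemaker or of any existing stonith plugin. It assumes the
cluster can record when it first noticed that the PDU and the host were
unreachable.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical tuning value: how soon after the PDU the host must disappear
# for both losses to be attributed to one power failure.
POWERFAIL_WINDOW_S = 10.0


@dataclass
class LossTimes:
    """Times (in seconds) at which each device was first seen unreachable.

    None means the device is still reachable.
    """
    pdu_lost_at: Optional[float]
    host_lost_at: Optional[float]


def assume_fenced(times: LossTimes, window: float = POWERFAIL_WINDOW_S) -> bool:
    """Return True if the host may be treated as down and fenced.

    Rule: the PDU became unavailable first and the host became
    unavailable no more than `window` seconds later.
    """
    if times.pdu_lost_at is None or times.host_lost_at is None:
        # At least one of the two is still reachable: no power failure assumed.
        return False
    delay = times.host_lost_at - times.pdu_lost_at
    return 0.0 <= delay <= window
```

With this rule, a PDU lost at t=100 and a host lost at t=103 would count as
fenced, while a host that disappears minutes after the PDU, or before it,
would still need a real fencing operation.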

Looking forward to your comments,

  Peter


_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
