Hi,

I am using the IPMI plugin to configure STONITH on a Heartbeat cluster.
If a resource fails on one node, the other node STONITHs that node. But
when the fenced node comes back up after the reboot, the STONITH device
itself fails on the node that has just restarted. The logs indicate that
the IPMI start operation returned 1 (i.e. unknown error). I suspect this
is caused by an initialization delay at the network level, but I am not
sure. What is the best way to overcome this issue? I am considering
adding a start delay to the STONITH device, but I can't say whether that
is the right approach.

Moreover, how should one configure the handling of start/monitor
operation failures for a STONITH device? I have currently configured
Pacemaker to fence the node if a start or monitor operation fails for
the STONITH device. Is this the right configuration?
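
As a rough illustration, the setup described above might look like the
following in crm shell syntax. This is only a sketch: the resource name,
host name, address, credentials, timeouts, start delay, and monitor
interval are all placeholders assumed for illustration, not values from
the actual cluster.

```
# Hypothetical STONITH resource using the external/ipmi plugin.
# start-delay is the start delay under consideration; on-fail="fence"
# is the current "fence the node on start/monitor failure" policy.
# All values below are placeholders.
primitive st-ipmi-node1 stonith:external/ipmi \
    params hostname="node1" ipaddr="192.0.2.10" userid="admin" \
        passwd="secret" interface="lan" \
    op start interval="0" timeout="60s" start-delay="30s" on-fail="fence" \
    op monitor interval="3600s" timeout="60s" on-fail="fence"
```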

And what should the monitoring interval be for a STONITH device?

Regards