Hi Dejan,

I apologize for creating the hb_report with experimental timeouts in place and fail counts not reset.

I found that the issue was with the clustered file system. When node2 disappeared, OCFS2 I/O would hang while the file system recovered from the lost node. When the start timeouts were set higher, resources would start as soon as I/O resumed, which explains the delay in failover.

> You don't have stonith configured, which makes a two-node
> configuration impossible.

I'm interested to know what you mean by this. I've configured several two-node heartbeat clusters without stonith, since data divergence wasn't a huge worry. This is my first time working with pacemaker/openais. What difference does stonith make if the second node is not available to be shot, e.g. after a power failure?

Thanks again

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
