Hi Steven,

On 04.08.2011, at 18:27, Steven Dake wrote:
> redundant ring is only supported upstream in corosync 1.4.1 or later.

What does "supported" mean in this context, exactly?

I'm asking because we've been having serious issues with these systems since they went into production (the testing phase did not show any problems, but we also couldn't use real workloads then). Since the cluster went into production, we've been seeing seemingly random STONITH events that seem to be related to high I/O load on a DRBD-mirrored OCFS2 volume - but I don't see any pattern yet. We had these machines running for nearly two weeks without major problems, and suddenly they went back to killing each other :-(

> The retransmit list message issues you are having is fixed in corosync
> 1.3.3 and later. This is what is triggering the redundant ring faulty
> error.

Could it also cause the instability problems we're seeing?

Thanks again for helping!

-- 
Sebastian

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker