On 2009-06-04T09:23:04, Steven Dake <sd...@redhat.com> wrote:

> The problem with checking the link status with the current code is that
> the protocol blocks I/O waiting for a response from the failed ring.
> This could of course be modified to behave differently.

Right, so the rechecking could be done by a separate thread that sends an occasional liveness packet on the failed ring and triggers RRP recovery once it has heard from the other nodes on that ring? Some smarts would be needed, of course, to avoid constantly retriggering partially active rings (which would fail again immediately).

> So the act of failing a link is expensive and we don't want to retest
> that it is valid very often.

Does "expensive" mean that it will actually slow down the healthy ring(s)?

Regards,
    Lars

-- 
SuSE Labs, OPS Engineering, Novell, Inc.
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde

_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
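
For illustration only, the rechecking idea in the reply above could be sketched roughly as follows. This is not corosync code: the class `RingRechecker`, its method names, and the interval values are all hypothetical. It also assumes that "heard from other nodes" means every expected peer has answered the probe, and that the "smarts" against retriggering take the form of an exponential backoff. Sending the actual liveness packet and running the separate thread are left out, so only the decision logic is shown.

```python
class RingRechecker:
    """Decides when a ring marked faulty may be re-enabled (hypothetical sketch)."""

    def __init__(self, peers, base_interval=5.0, max_interval=300.0):
        self.peers = frozenset(peers)       # node ids expected to answer on the ring
        self.base_interval = base_interval  # seconds between probes (assumed value)
        self.max_interval = max_interval    # upper bound for the backoff (assumed value)
        self.interval = base_interval
        self.next_probe = 0.0
        self.heard = set()

    def probe_due(self, now):
        """True when it is time to send another liveness packet."""
        return now >= self.next_probe

    def start_probe(self, now):
        """Call when a liveness packet is sent on the failed ring."""
        self.heard.clear()
        self.next_probe = now + self.interval

    def on_reply(self, node):
        """Record a reply; return True once every expected peer has answered."""
        if node in self.peers:
            self.heard.add(node)
        return self.heard == self.peers

    def on_refail(self):
        """The ring failed again after recovery: probe less often from now on."""
        self.interval = min(self.interval * 2, self.max_interval)

    def on_stable(self):
        """The ring stayed healthy after recovery: reset the backoff."""
        self.interval = self.base_interval
```

A caller would trigger recovery only when `on_reply` returns True, and call `on_refail` if the ring is marked faulty again shortly afterwards. A partially active ring is then probed less and less often instead of being retriggered constantly.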