On Mon, 2009-05-25 at 18:32 +0300, Juha Heinanen wrote:
> Florian Haas writes:
>
> > Agree that they're hacks, but disagree with your alternative. Why should
> > Pacemaker be concerned with low-level OpenAIS recovery procedures?
>
> then have the variable in OpenAIS configuration.
>

Self-healing is not as obvious or easy as it sounds. Totem (the protocol)
has no way to determine when the admin has replaced the faulty switch in
the network.

I see two options. The first is to periodically probe the failed ring for
liveness. The problem with this approach is that it is hard to implement.

The second is to re-enable the ring internally after some period of time
and "hope for the best". The problem with this approach is that it causes
performance degradation every time the failed ring is re-enabled and
restarted.

I think the first option is the best, but at the moment nobody has written
patches, and most people are focused on the 1.0 release...

Regards
-steve

> -- juha
>
> _______________________________________________
> Pacemaker mailing list
> Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker