On Fri, Nov 19, 2010 at 10:11 AM, JiaQiang Xu <xjqkill...@gmail.com> wrote:
> Hi,
>
> I'm using pacemaker 1.0.9 and corosync 1.2.7.
> Recently I found a problem with CRMD restart.
>
> If CRMD crashes or is manually killed, for now corosync will try to restart it
> up to 100 times (done in lib/ais/plugin.c). But what if CRMD becomes so buggy
> (or due to some environmental factor) that it cannot be restarted successfully
> after 100 attempts?
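
To make the behaviour under discussion concrete, here is a minimal Python sketch of a bounded respawn loop. It is an illustration only and is not the code in lib/ais/plugin.c: the names `MAX_RESPAWN`, `supervise`, `start_child` and `notify_peers` are hypothetical, and the limit of 100 is simply the figure quoted above.

```python
from typing import Callable

MAX_RESPAWN = 100  # limit quoted in this thread; the constant name is hypothetical


def supervise(start_child: Callable[[], bool],
              notify_peers: Callable[[str], None],
              max_respawn: int = MAX_RESPAWN) -> int:
    """Try to start a child process up to max_respawn times.

    Returns the number of attempts made. If every attempt fails,
    notify_peers is called exactly once so that other nodes can
    update their view of this peer.
    """
    for attempt in range(1, max_respawn + 1):
        if start_child():
            return attempt  # child came up; stop retrying
    # Respawn budget exhausted: give up and tell the rest of the cluster.
    notify_peers("child failed to start; respawn limit reached")
    return max_respawn


if __name__ == "__main__":
    messages = []
    # A child that never starts, with a small limit to keep the demo short.
    attempts = supervise(lambda: False, messages.append, max_respawn=3)
    print(attempts, messages)
```

What the peers then do with that notification is decided elsewhere and is deliberately not modelled in this sketch.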

This has only ever happened during development when I broke something.
No user has ever hit this.

> I read through the code and found that in this situation the ais plugin
> will send out a notification message to other nodes in the cluster.
> But now the nodes won't do anything more than updating peer information
> upon receiving this notification.
>
> Is this a bug?

No, there is nothing else that needs to be done.
Other parts of pacemaker look at that peer data and will fence (shoot) the
node if necessary.

> Or we just don't plan to deal with it?
>
> Thanks.
> --Jiaqiang
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs:
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker