Re: [Pacemaker] Problem with CRMD restart

Andrew Beekhof Sat, 20 Nov 2010 00:59:33 -0800

On Fri, Nov 19, 2010 at 10:11 AM, JiaQiang Xu <xjqkill...@gmail.com> wrote:
> Hi,
>
> I'm using pacemaker 1.0.9 and corosync 1.2.7.
> Recently I found a problem with CRMD restart.
>
> If CRMD crashes or is manually killed, corosync currently tries to restart it
> up to 100 times (done in lib/ais/plugin.c). But what if CRMD becomes so buggy
> (or is affected by some environmental factor) that it still cannot be
> restarted successfully after 100 attempts?


This has only ever happened during development when I broke something.
No user has ever hit this.
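
For readers following along, the bounded-restart pattern under discussion
can be sketched roughly as below. This is not the code in lib/ais/plugin.c;
the function and parameter names are hypothetical, and only the limit of 100
comes from the message above. It also assumes that "up to 100 times" means
100 restarts after the initial start.

```python
MAX_RESPAWN = 100  # limit quoted in the thread; all other names are hypothetical


def supervise(start_child, notify_peers, max_respawn=MAX_RESPAWN):
    """Restart a child until it exits cleanly or the restart limit is hit.

    start_child() runs the child and returns True on a clean exit,
    False on a crash. notify_peers() is called once if the limit is
    exhausted. Returns the number of restarts that were performed.
    """
    respawns = 0
    while True:
        if start_child():
            return respawns          # clean exit, nothing more to do
        if respawns >= max_respawn:
            notify_peers()           # give up and tell the other nodes
            return respawns
        respawns += 1                # crash: count it and try again
```

With a child that always crashes, this starts it 101 times in total (one
initial start plus 100 restarts) and then sends a single notification.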

> I read through the code and found that in this situation the ais plugin will
> send out a notification message to other nodes in the cluster. But now the
> nodes won't do anything more than updating peer information upon receiving
> this notification.
>
> Is this a bug?

No, there is nothing else that needs to be done.
Other parts of pacemaker look at that peer data and will shoot the
node if necessary.
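
The split described here (the receiver only records the peer's state, and a
separate component reads that state later) might look something like the
sketch below. Every name in it is hypothetical and none of it is taken from
pacemaker's source. In particular, the thread does not say what makes fencing
"necessary", so the policy is passed in by the caller and example_policy is
only an assumption for illustration.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    member: bool        # still part of the cluster membership
    crmd_running: bool  # last state reported for this peer's crmd


def on_notification(peers, name, crmd_running):
    """Receiver side: record the new state and take no further action."""
    peers[name].crmd_running = crmd_running


def nodes_to_fence(peers, needs_fencing):
    """Policy side: a separate pass reads the table and picks nodes to fence."""
    return sorted(p.name for p in peers.values() if needs_fencing(p))


def example_policy(peer):
    # Assumption for illustration only: fence a member whose crmd is gone.
    return peer.member and not peer.crmd_running
```

The point of the separation is that handling the notification stays trivial,
while the decision to shoot a node lives in one place that sees all peer data.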

> Or do we just not plan to deal with it?
>
> Thanks.
> --Jiaqiang
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>

