ping.

What can I do to help move this bug forward to a fix?

On Wed, May 30, 2012 at 10:42 AM, Larry Brigman <larry.brig...@gmail.com> wrote:
> On Tue, May 29, 2012 at 3:08 PM, Larry Brigman <larry.brig...@gmail.com> wrote:
>> On Fri, May 25, 2012 at 3:40 PM, David Vossel <dvos...@redhat.com> wrote:
>>> ----- Original Message -----
>>>> From: "Larry Brigman" <larry.brig...@gmail.com>
>>>> To: "The Pacemaker cluster resource manager" <pacemaker@oss.clusterlabs.org>
>>>> Sent: Friday, May 25, 2012 5:27:21 PM
>>>> Subject: Re: [Pacemaker] Removed nodes showing back in status
>>>>
>>>> On Fri, May 25, 2012 at 9:59 AM, Larry Brigman <larry.brig...@gmail.com> wrote:
>>>> > On Wed, May 16, 2012 at 1:53 PM, David Vossel <dvos...@redhat.com> wrote:
>>>> >> ----- Original Message -----
>>>> >>> From: "Larry Brigman" <larry.brig...@gmail.com>
>>>> >>> To: "The Pacemaker cluster resource manager" <pacemaker@oss.clusterlabs.org>
>>>> >>> Sent: Monday, May 14, 2012 4:59:55 PM
>>>> >>> Subject: Re: [Pacemaker] Removed nodes showing back in status
>>>> >>>
>>>> >>> On Mon, May 14, 2012 at 2:13 PM, David Vossel <dvos...@redhat.com> wrote:
>>>> >>> > ----- Original Message -----
>>>> >>> >> From: "Larry Brigman" <larry.brig...@gmail.com>
>>>> >>> >> To: "The Pacemaker cluster resource manager" <pacemaker@oss.clusterlabs.org>
>>>> >>> >> Sent: Monday, May 14, 2012 1:30:22 PM
>>>> >>> >> Subject: Re: [Pacemaker] Removed nodes showing back in status
>>>> >>> >>
>>>> >>> >> On Mon, May 14, 2012 at 9:54 AM, Larry Brigman <larry.brig...@gmail.com> wrote:
>>>> >>> >> > I have a 5 node cluster (but it could be any number of nodes, 3 or larger).
>>>> >>> >> > I am testing some scripts for node removal.
>>>> >>> >> > I remove a node from the cluster and everything looks correct from a
>>>> >>> >> > crm status standpoint.
>>>> >>> >> > When I remove a second node, the first node that was removed now shows back
>>>> >>> >> > in the crm status as off-line. I'm following the guidelines provided
>>>> >>> >> > in the Pacemaker Explained docs.
>>>> >>> >> > http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-node-delete.html
>>>> >>> >> >
>>>> >>> >> > I believe this is a bug but want to put it out to the list to be sure.
>>>> >>> >> > Versions.
>>>> >>> >> > RHEL5.7 x86_64
>>>> >>> >> > corosync-1.4.2
>>>> >>> >> > openais-1.1.3
>>>> >>> >> > pacemaker-1.1.5
>>>> >>> >> >
>>>> >>> >> > Status after first node removed
>>>> >>> >> > [root@portland-3 ~]# crm status
>>>> >>> >> > ============
>>>> >>> >> > Last updated: Mon May 14 08:42:04 2012
>>>> >>> >> > Stack: openais
>>>> >>> >> > Current DC: portland-1 - partition with quorum
>>>> >>> >> > Version: 1.1.5-1.3.sme-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
>>>> >>> >> > 4 Nodes configured, 4 expected votes
>>>> >>> >> > 0 Resources configured.
>>>> >>> >> > ============
>>>> >>> >> >
>>>> >>> >> > Online: [ portland-1 portland-2 portland-3 portland-4 ]
>>>> >>> >> >
>>>> >>> >> > Status after second node removed.
>>>> >>> >> > [root@portland-3 ~]# crm status
>>>> >>> >> > ============
>>>> >>> >> > Last updated: Mon May 14 08:42:45 2012
>>>> >>> >> > Stack: openais
>>>> >>> >> > Current DC: portland-1 - partition with quorum
>>>> >>> >> > Version: 1.1.5-1.3.sme-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
>>>> >>> >> > 4 Nodes configured, 3 expected votes
>>>> >>> >> > 0 Resources configured.
>>>> >>> >> > ============
>>>> >>> >> >
>>>> >>> >> > Online: [ portland-1 portland-3 portland-4 ]
>>>> >>> >> > OFFLINE: [ portland-5 ]
>>>> >>> >> >
>>>> >>> >> > Both nodes were removed from the cluster from node 1.
>>>> >>> >>
>>>> >>> >> When I added a node back into the cluster the second node
>>>> >>> >> that was removed now shows as offline.
>>>> >>> >
>>>> >>> > The only time I've seen this sort of behavior is when I don't
>>>> >>> > completely shut down corosync and pacemaker on the node I'm
>>>> >>> > removing before I delete its configuration from the cib. Are you
>>>> >>> > sure corosync and pacemaker are gone before you delete the node
>>>> >>> > from the cluster config?
>>>> >>>
>>>> >>> Well, I run service pacemaker stop and service corosync stop prior to
>>>> >>> doing the remove. Since I am doing it all in a script it's possible that
>>>> >>> there is a race condition that I have just exposed or the services are
>>>> >>> not fully down when the service script exits.
>>>> >>
>>>> >> Yep, if you are waiting for the service scripts to return I would
>>>> >> expect it to be safe to remove the nodes at that point.
>>>> >>
>>>> >>> BTW, I'm running pacemaker as its own process instead of being a
>>>> >>> child of corosync (if that makes a difference).
>>>> >>>
>>>> >>
>>>> >> This shouldn't matter.
>>>> >>
>>>> >> An hb_report of this will help us distinguish if this is a bug or not.
>>>> > Bug opened with the hb and crm reports.
>>>> > https://developerbugs.linuxfoundation.org/show_bug.cgi?id=2648
>>>> >
>>>>
>>>> I just tried something that seems to indicate that things are still
>>>> around somewhere in the cib. I stopped and pacemaker. This causes both
>>>> removed nodes to show back in pacemaker as offline. Looks like the
>>>> Clusters from Scratch documentation to remove a node doesn't work correctly.
>>>
>>> Interesting, thanks for generating the logs. I'll look through them when I
>>> get a chance.
>>>
>>>> BTW which is the best place to file the bugs? ClusterLabs or
>>>> Linux Foundation?
>>>
>>> We are tracking pacemaker issues here, http://bugs.clusterlabs.org/.
>>> Please re-locate the issue.
>>
>> Done: http://bugs.clusterlabs.org/show_bug.cgi?id=5068
>
> Looks like any cib transition will cause the removed node to re-appear.
>
> What next steps can I take to assist?
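
For anyone trying to reproduce this while ruling out the race condition discussed in the thread (services not fully down when the init script returns), a removal script could wait for the daemons to actually exit before touching the cib. This is only a rough sketch: `wait_for_exit` is a hypothetical helper, not part of pacemaker or corosync, and the process names `pacemakerd` and `corosync` in the usage comments are assumptions about this setup. It relies on `pgrep -x` from procps.

```shell
#!/bin/sh
# Hypothetical helper (not part of pacemaker/corosync):
# wait until no process with the exact name $1 is left,
# giving up after $2 seconds (default 30).
wait_for_exit() {
    name=$1
    timeout=${2:-30}
    waited=0
    # pgrep -x matches the process name exactly and returns 0 while it exists
    while pgrep -x "$name" >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $name to exit" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Possible usage on the node being removed (process names are assumptions):
#   service pacemaker stop && wait_for_exit pacemakerd
#   service corosync stop  && wait_for_exit corosync
```

Only once both calls return 0 would the script go on to the node removal steps from the docs. A non-zero return means a daemon was still running at the timeout, which would point to the race rather than a cib problem.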
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org