On 21 May 2014, at 3:13 pm, K Mehta <kiranmehta1...@gmail.com> wrote:
> Andrew,
> 1. Is there a workaround for this issue?

For now it's basically just "retry the command".

> 2. Also, can you let me know if there are more issues with old versions
> in deleting a multistate resource, as mentioned in
> http://www.gossamer-threads.com/lists/linuxha/pacemaker/91230

That looks like an issue with pcs not removing constraints that reference
the resource you're trying to delete.
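For anyone searching the archives, here is a minimal sketch of the
"retry the command" workaround. It is illustrative only: the resource
name comes from the config quoted below, and the retry limit and the
sleep are arbitrary.

  #!/bin/sh
  # Unclone first (as in the reproduction below), then retry the delete:
  # the old CIB diff code can report a transient failure to pcs even
  # though the cluster recovers on its own.
  res="vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2"

  pcs resource unclone "$res"

  for attempt in 1 2 3 4 5; do
      if pcs resource delete "$res"; then
          echo "deleted on attempt $attempt"
          break
      fi
      sleep 2   # give the CIBs a moment to resync before retrying
  done

For the pcs problem from the linked thread, deleting the location
constraints by id before removing the resource should also help:
"pcs constraint --full" prints the constraint ids, and
"pcs constraint remove <id>" deletes one.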
> Regards,
> Kiran
>
> On Wed, May 21, 2014 at 9:44 AM, Andrew Beekhof <and...@beekhof.net> wrote:
> >
> > On 19 May 2014, at 5:43 pm, K Mehta <kiranmehta1...@gmail.com> wrote:
> >
> > > Please see my reply inline. Attached is the crm_report output.
> > >
> > > On Thu, May 8, 2014 at 5:45 AM, Andrew Beekhof <and...@beekhof.net> wrote:
> > > >
> > > > On 8 May 2014, at 12:38 am, K Mehta <kiranmehta1...@gmail.com> wrote:
> > > >
> > > > > I created a multi-state resource ms-2be6c088-a1fa-464a-b00d-f4bccb4f5af2
> > > > > (vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2).
> > > > >
> > > > > Here is the configuration:
> > > > > ==========================
> > > > > [root@vsanqa11 ~]# pcs config
> > > > > Cluster Name: vsanqa11_12
> > > > > Corosync Nodes:
> > > > >
> > > > > Pacemaker Nodes:
> > > > >  vsanqa11 vsanqa12
> > > > >
> > > > > Resources:
> > > > >  Master: ms-2be6c088-a1fa-464a-b00d-f4bccb4f5af2
> > > > >   Meta Attrs: clone-max=2 globally-unique=false target-role=started
> > > > >   Resource: vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2 (class=ocf provider=heartbeat type=vgc-cm-agent.ocf)
> > > > >    Attributes: cluster_uuid=2be6c088-a1fa-464a-b00d-f4bccb4f5af2
> > > > >    Operations: monitor interval=30s role=Master timeout=100s (vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-monitor-interval-30s)
> > > > >                monitor interval=31s role=Slave timeout=100s (vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-monitor-interval-31s)
> > > > >
> > > > > Stonith Devices:
> > > > > Fencing Levels:
> > > > >
> > > > > Location Constraints:
> > > > >   Resource: ms-2be6c088-a1fa-464a-b00d-f4bccb4f5af2
> > > > >     Enabled on: vsanqa11 (score:INFINITY) (id:location-ms-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-vsanqa11-INFINITY)
> > > > >     Enabled on: vsanqa12 (score:INFINITY) (id:location-ms-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-vsanqa12-INFINITY)
> > > > >   Resource: vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2
> > > > >     Enabled on: vsanqa11 (score:INFINITY) (id:location-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-vsanqa11-INFINITY)
> > > > >     Enabled on: vsanqa12 (score:INFINITY) (id:location-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-vsanqa12-INFINITY)
> > > > > Ordering Constraints:
> > > > > Colocation Constraints:
> > > > >
> > > > > Cluster Properties:
> > > > >  cluster-infrastructure: cman
> > > > >  dc-version: 1.1.10-14.el6_5.2-368c726
> > > > >  last-lrm-refresh: 1399466204
> > > > >  no-quorum-policy: ignore
> > > > >  stonith-enabled: false
> > > > > ==============================================
> > > > >
> > > > > When I try to create and delete this resource in a loop,
> > > >
> > > > Why would you do that? :-)
> > >
> > > Just to test that things are fine when the resource is created and
> > > deleted in quick succession. But the issue is also seen arbitrarily;
> > > sometimes it appears even in the first iteration of the loop.
> > >
> > > > > after a few iterations, the delete fails as shown below. This can
> > > > > be reproduced easily. I make sure to unclone the resource before
> > > > > deleting it. The unclone happens successfully.
> >
> > [snip]
> >
> > > > > May 7 07:20:13 vsanqa12 attrd[4317]: notice: attrd_trigger_update: Sending flush op to all hosts for: master-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2 (<null>)
> > > > > May 7 07:20:13 vsanqa12 attrd[4317]: notice: attrd_perform_update: Sent delete 4404: node=vsanqa12, attr=master-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2, id=<n/a>, set=(null), section=status
> > > > > May 7 07:20:13 vsanqa12 crmd[4319]: notice: process_lrm_event: LRM operation vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2_stop_0 (call=1379, rc=0, cib-update=1161, confirmed=true) ok
> > > > > May 7 07:20:13 vsanqa12 attrd[4317]: notice: attrd_perform_update: Sent delete 4406: node=vsanqa12, attr=master-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2, id=<n/a>, set=(null), section=status
> > > > > May 7 07:20:13 vsanqa12 attrd[4317]: warning: attrd_cib_callback: Update 4404 for master-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2=(null) failed: Application of an update diff failed
> > > > > May 7 07:20:13 vsanqa12 cib[4314]: warning: cib_process_diff: Diff 0.6804.2 -> 0.6804.3 from vsanqa11 not applied to 0.6804.2: Failed application of an update diff
> > > > > May 7 07:20:13 vsanqa12 cib[4314]: notice: cib_server_process_diff: Not applying diff 0.6804.3 -> 0.6804.4 (sync in progress)
> >
> > Ah. Now I recognise this :-(
> >
> > First the good news: this will be fixed when the new CIB code arrives in 6.6.
> >
> > The way the old cib works is that one node makes the change and sends
> > it out as a patch to the other nodes. Great in theory, except the old
> > patch format wasn't really great at preserving ordering changes - but
> > it can detect them, hence:
> >
> > > May 7 07:20:13 vsanqa12 cib[4314]: warning: cib_process_diff: Diff 0.6804.2 -> 0.6804.3 from vsanqa11 not applied to 0.6804.2: Failed application of an update diff
> >
> > The cib does recover, but the operation is reported to pcs as having failed.
> >
> > We are considering a couple of options that may make it into 6.5.
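In case it is useful while waiting for 6.6: the "0.6804.2 -> 0.6804.3"
numbers in the log are the CIB's version tuple
(admin_epoch.epoch.num_updates). Below is a rough sketch for watching
the recovery described above; run it on both nodes while reproducing,
and you should see the rejecting node stall briefly and then jump
forward once the full resync completes. It assumes the <cib ...>
element and its version attributes land on the first line of
"cibadmin --query" output, as they do on this 1.1.x build.

  while sleep 1; do
      # Pull the version attributes out of the <cib ...> element.
      cibadmin --query | head -n 1 \
        | grep -o 'admin_epoch="[0-9]*"\|epoch="[0-9]*"\|num_updates="[0-9]*"' \
        | paste -s -d ' ' -
  done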