On 19 May 2014, at 5:43 pm, K Mehta <kiranmehta1...@gmail.com> wrote:

> Please see my reply inline. Attached is the crm_report output.
> 
> 
> On Thu, May 8, 2014 at 5:45 AM, Andrew Beekhof <and...@beekhof.net> wrote:
> 
> On 8 May 2014, at 12:38 am, K Mehta <kiranmehta1...@gmail.com> wrote:
> 
> > I created a multi state resource ms-2be6c088-a1fa-464a-b00d-f4bccb4f5af2 
> > (vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2).
> >
> > Here is the configuration:
> > ==========================
> > [root@vsanqa11 ~]# pcs config
> > Cluster Name: vsanqa11_12
> > Corosync Nodes:
> >
> > Pacemaker Nodes:
> >  vsanqa11 vsanqa12
> >
> > Resources:
> >  Master: ms-2be6c088-a1fa-464a-b00d-f4bccb4f5af2
> >   Meta Attrs: clone-max=2 globally-unique=false target-role=started
> >   Resource: vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2 (class=ocf 
> > provider=heartbeat type=vgc-cm-agent.ocf)
> >    Attributes: cluster_uuid=2be6c088-a1fa-464a-b00d-f4bccb4f5af2
> >    Operations: monitor interval=30s role=Master timeout=100s 
> > (vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-monitor-interval-30s)
> >                monitor interval=31s role=Slave timeout=100s 
> > (vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-monitor-interval-31s)
> >
> > Stonith Devices:
> > Fencing Levels:
> >
> > Location Constraints:
> >   Resource: ms-2be6c088-a1fa-464a-b00d-f4bccb4f5af2
> >     Enabled on: vsanqa11 (score:INFINITY) 
> > (id:location-ms-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-vsanqa11-INFINITY)
> >     Enabled on: vsanqa12 (score:INFINITY) 
> > (id:location-ms-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-vsanqa12-INFINITY)
> >   Resource: vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2
> >     Enabled on: vsanqa11 (score:INFINITY) 
> > (id:location-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-vsanqa11-INFINITY)
> >     Enabled on: vsanqa12 (score:INFINITY) 
> > (id:location-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2-vsanqa12-INFINITY)
> > Ordering Constraints:
> > Colocation Constraints:
> >
> > Cluster Properties:
> >  cluster-infrastructure: cman
> >  dc-version: 1.1.10-14.el6_5.2-368c726
> >  last-lrm-refresh: 1399466204
> >  no-quorum-policy: ignore
> >  stonith-enabled: false
> >
> > ==============================================
> > When I try to create and delete this resource in a loop,
> 
> Why would you do that? :-)
> 
> Just to test that things behave correctly when the resource is created and
> deleted in quick succession. The issue also appears arbitrarily, though; it
> is sometimes seen even on the first iteration of the loop.
> 
> > after a few iterations, the delete fails as shown below. This can be
> > reproduced easily. I make sure to unclone the resource before deleting
> > it, and the unclone succeeds.
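
For reference, the loop being described is presumably something along these
lines. This is my reconstruction from the config above; the iteration count,
the sleep, and the exact pcs invocations are assumptions, not taken from the
report:

    #!/bin/bash
    # Create the resource, wrap it in a master/slave resource, then unclone
    # and delete it, in a tight loop. UUID and agent are from the pcs config
    # shown above.
    uuid=2be6c088-a1fa-464a-b00d-f4bccb4f5af2
    for i in $(seq 1 50); do
        pcs resource create vha-$uuid ocf:heartbeat:vgc-cm-agent.ocf \
            cluster_uuid=$uuid \
            op monitor interval=30s role=Master timeout=100s \
            op monitor interval=31s role=Slave timeout=100s
        pcs resource master ms-$uuid vha-$uuid \
            clone-max=2 globally-unique=false
        sleep 5                            # give the resource time to start
        pcs resource unclone vha-$uuid     # succeeds, per the report
        pcs resource delete vha-$uuid      # the step that sometimes fails
    done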

[snip]

> > May  7 07:20:13 vsanqa12 attrd[4317]:   notice: attrd_trigger_update: 
> > Sending flush op to all hosts for: 
> > master-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2 (<null>)
> > May  7 07:20:13 vsanqa12 attrd[4317]:   notice: attrd_perform_update: Sent 
> > delete 4404: node=vsanqa12, 
> > attr=master-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2, id=<n/a>, set=(null), 
> > section=status
> > May  7 07:20:13 vsanqa12 crmd[4319]:   notice: process_lrm_event: LRM 
> > operation vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2_stop_0 (call=1379, rc=0, 
> > cib-update=1161, confirmed=true) ok
> > May  7 07:20:13 vsanqa12 attrd[4317]:   notice: attrd_perform_update: Sent 
> > delete 4406: node=vsanqa12, 
> > attr=master-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2, id=<n/a>, set=(null), 
> > section=status
> > May  7 07:20:13 vsanqa12 attrd[4317]:  warning: attrd_cib_callback: Update 
> > 4404 for master-vha-2be6c088-a1fa-464a-b00d-f4bccb4f5af2=(null) failed: 
> > Application of an update diff failed
> > May  7 07:20:13 vsanqa12 cib[4314]:  warning: cib_process_diff: Diff 
> > 0.6804.2 -> 0.6804.3 from vsanqa11 not applied to 0.6804.2: Failed 
> > application of an update diff
> > May  7 07:20:13 vsanqa12 cib[4314]:   notice: cib_server_process_diff: Not 
> > applying diff 0.6804.3 -> 0.6804.4 (sync in progress)


Ah. Now I recognise this :-(

First, the good news: this will be fixed when the new CIB code arrives in 6.6.

The way the old CIB works is that one node makes the change and sends it out
as a patch to the other nodes.
Great in theory, except that the old patch format wasn't very good at
preserving ordering changes - though it can detect them, hence:

> May  7 07:20:13 vsanqa12 cib[4314]:  warning: cib_process_diff: Diff 0.6804.2 
> -> 0.6804.3 from vsanqa11 not applied to 0.6804.2: Failed application of an 
> update diff

The CIB does recover, but the operation is reported back to pcs as having failed.
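
You can confirm that the nodes have converged again by comparing the CIB
version tuple (the admin_epoch.epoch.num_updates numbers seen in the log,
e.g. 0.6804.2) on both nodes. Something like this, assuming ssh access and
cibadmin in the path:

    # Matching version tuples on all nodes mean the resync has completed.
    for node in vsanqa11 vsanqa12; do
        echo -n "$node: "
        ssh $node 'cibadmin --query --local' \
            | grep -oE '(admin_epoch|epoch|num_updates)="[0-9]+"' \
            | tr '\n' ' '
        echo
    done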

We are considering a couple of options that may make it into 6.5.
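
In the meantime, a script can treat the reported failure as advisory and
re-check whether the delete actually went through (my suggestion, not an
official workaround):

    # The delete has usually taken effect even when pcs reports a failure,
    # so verify before treating the error as fatal.
    if ! pcs resource delete vha-$uuid; then
        if pcs resource show vha-$uuid >/dev/null 2>&1; then
            echo "resource still present, retry the delete"
        else
            echo "pcs reported a failure, but the resource is gone"
        fi
    fi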


