On 11/04/2013, at 8:15 AM, pavan tc <pavan...@gmail.com> wrote:

> Hi,
>
> [I did go through the mail thread titled "RHEL6 and clones: CMAN needed
> anyway?", but was not sure about some of the answers there.]
>
> I recently moved from Pacemaker 1.1.7 to 1.1.8-7 on CentOS 6.2. I see the
> following in syslog:
>
> corosync[2966]: [pcmk ] ERROR: process_ais_conf: You have configured a
> cluster using the Pacemaker plugin for Corosync. The plugin is not supported
> in this environment and will be removed very soon.
> corosync[2966]: [pcmk ] ERROR: process_ais_conf: Please see Chapter 8 of
> 'Clusters from Scratch' (http://www.clusterlabs.org/doc) for details on using
> Pacemaker with CMAN
>
> Does this mean that my current configuration is incorrect and will not work
> as it did with Pacemaker 1.1.7/Corosync?
It will continue to work until the Pacemaker plugin is removed from RHEL.

> I looked at the "Clusters from Scratch" instructions, and they talk mostly
> about GFS2. I don't have any filesystem requirements. In that case, can I
> live with Pacemaker/Corosync?

Yes, but only until the Pacemaker plugin is removed from RHEL.

> I do understand that this config is not recommended, but the reason I ask is
> that I am hitting a weird problem with this setup, which I will explain
> below. I just want to make sure that I am not starting off with an erroneous
> setup.
>
> I have a two-node multi-state resource configured as follows:
>
> [root@vsanqa4 ~]# crm configure show
> node vsanqa3
> node vsanqa4
> primitive vha-6f92a1f6-969c-4c41-b9ca-7eb6f83ace2e ocf:heartbeat:vgc-cm-agent.ocf \
>         params cluster_uuid="6f92a1f6-969c-4c41-b9ca-7eb6f83ace2e" \
>         op monitor interval="30s" role="Master" timeout="100s" \
>         op monitor interval="31s" role="Slave" timeout="100s"
> ms ms-6f92a1f6-969c-4c41-b9ca-7eb6f83ace2e vha-6f92a1f6-969c-4c41-b9ca-7eb6f83ace2e \
>         meta clone-max="2" globally-unique="false" target-role="Started"
> location ms-6f92a1f6-969c-4c41-b9ca-7eb6f83ace2e-nodes ms-6f92a1f6-969c-4c41-b9ca-7eb6f83ace2e \
>         rule $id="ms-6f92a1f6-969c-4c41-b9ca-7eb6f83ace2e-nodes-rule" -inf: #uname ne vsanqa4 and #uname ne vsanqa3
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.8-7.el6-394e906" \
>         cluster-infrastructure="classic openais (with plugin)" \
>         expected-quorum-votes="2" \
>         stonith-enabled="false" \
>         no-quorum-policy="ignore"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="100"
>
> With this config, if I simulate a crash on the master with
> "echo c > /proc/sysrq-trigger", the slave does not get promoted for about
> 15 minutes. It does detect the peer going down, but does not seem to issue
> the promote immediately:
>
> Apr 10 14:12:32 vsanqa4 corosync[2966]: [TOTEM ] A processor failed,
> forming new configuration.
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [pcmk ] notice: pcmk_peer_update:
> Transitional membership event on ring 166060: memb=1, new=0, lost=1
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [pcmk ] info: pcmk_peer_update:
> memb: vsanqa4 1967394988
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [pcmk ] info: pcmk_peer_update:
> lost: vsanqa3 1950617772
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [pcmk ] notice: pcmk_peer_update:
> Stable membership event on ring 166060: memb=1, new=0, lost=0
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [pcmk ] info: pcmk_peer_update:
> MEMB: vsanqa4 1967394988
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [pcmk ] info:
> ais_mark_unseen_peer_dead: Node vsanqa3 was not seen in the previous
> transition
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [pcmk ] info: update_member: Node
> 1950617772/vsanqa3 is now: lost
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [pcmk ] info:
> send_member_notification: Sending membership update 166060 to 2 children
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [TOTEM ] A processor joined or left
> the membership and a new membership was formed.
> Apr 10 14:12:38 vsanqa4 cib[3386]: notice: ais_dispatch_message: Membership
> 166060: quorum lost
> Apr 10 14:12:38 vsanqa4 crmd[3391]: notice: ais_dispatch_message:
> Membership 166060: quorum lost
> Apr 10 14:12:38 vsanqa4 cib[3386]: notice: crm_update_peer_state:
> crm_update_ais_node: Node vsanqa3[1950617772] - state is now lost
> Apr 10 14:12:38 vsanqa4 crmd[3391]: notice: crm_update_peer_state:
> crm_update_ais_node: Node vsanqa3[1950617772] - state is now lost
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [CPG ] chosen downlist: sender
> r(0) ip(172.16.68.117) ; members(old:2 left:1)
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [MAIN ] Completed service
> synchronization, ready to provide service.
>
> Then (after about 15 minutes), I see the following:

There were no logs at all in between?
> Apr 10 14:26:46 vsanqa4 crmd[3391]: notice: do_state_transition: State
> transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED
> origin=crm_timer_popped ]
> Apr 10 14:26:46 vsanqa4 pengine[3390]: notice: unpack_config: On loss of
> CCM Quorum: Ignore
> Apr 10 14:26:46 vsanqa4 pengine[3390]: notice: LogActions: Promote
> vha-6f92a1f6-969c-4c41-b9ca-7eb6f83ace2e:0#011(Slave -> Master vsanqa4)
> Apr 10 14:26:46 vsanqa4 pengine[3390]: notice: process_pe_message:
> Calculated Transition 3: /var/lib/pacemaker/pengine/pe-input-392.bz2
>
> Thanks,
> Pavan

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
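One detail worth noting in the logs above: the transition that finally promoted the slave was triggered by cause=C_TIMER_POPPED (origin=crm_timer_popped), i.e. the policy engine woke up because the periodic cluster-recheck timer fired, not in response to the membership change itself. The roughly 15-minute gap matches Pacemaker's default cluster-recheck-interval of 15 minutes. As a diagnostic workaround only (the property name is the standard one; the value below is illustrative, and shortening the timer masks the missed event rather than fixing it):

```
# Workaround, not a fix: re-run the policy engine more often, so an
# event the cluster failed to react to is picked up well before the
# default 15-minute recheck would notice it.
crm configure property cluster-recheck-interval="1min"
```

If the promote then fires as soon as the shortened timer pops, that confirms the cluster simply missed reacting to the peer failure, and the pe-input file named in the log is the place to look for what the policy engine saw.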
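On the CMAN question: even with no GFS2 or other filesystem requirements, moving off the plugin per Chapter 8 of Clusters from Scratch mainly means giving CMAN a minimal /etc/cluster/cluster.conf and letting the fence_pcmk agent redirect fencing requests to Pacemaker. A rough sketch only, assuming the two node names from the configuration above (the cluster name "vsanqa" and config_version are placeholders):

```
<cluster config_version="1" name="vsanqa">
  <!-- two-node mode: allow the cluster to operate with a single vote -->
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="vsanqa3" nodeid="1">
      <fence>
        <!-- redirect CMAN fencing requests to Pacemaker's stonith -->
        <method name="pcmk-redirect">
          <device name="pcmk" port="vsanqa3"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="vsanqa4" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="vsanqa4"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="pcmk" agent="fence_pcmk"/>
  </fencedevices>
</cluster>
```

With this in place, corosync is started by cman ("service cman start") and pacemaker runs as its own service, instead of being loaded into corosync as a plugin.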