On 11/04/2013, at 8:15 AM, pavan tc <pavan...@gmail.com> wrote:

> Hi,
>
> [I did go through the mail thread titled "RHEL6 and clones: CMAN needed
> anyway?", but was not sure about some of the answers there.]
>
> I recently moved from Pacemaker 1.1.7 to 1.1.8-7 on CentOS 6.2. I see the
> following in syslog:
>
> corosync[2966]: [pcmk ] ERROR: process_ais_conf: You have configured a
> cluster using the Pacemaker plugin for Corosync. The plugin is not supported
> in this environment and will be removed very soon.
> corosync[2966]: [pcmk ] ERROR: process_ais_conf: Please see Chapter 8 of
> 'Clusters from Scratch' (http://www.clusterlabs.org/doc) for details on using
> Pacemaker with CMAN
>
> Does this mean that my current configuration is incorrect and will not work
> as it did with Pacemaker 1.1.7/Corosync?
It will continue to work until the Pacemaker plugin is removed from RHEL.

> I looked at the "Clusters from Scratch" instructions, and they talk mostly
> about GFS2. I don't have any filesystem requirements. In that case, can I
> live with Pacemaker/Corosync?

Yes, but only until the Pacemaker plugin is removed from RHEL.

> I do understand that this config is not recommended, but the reason I ask is
> that I am hitting a weird problem with this setup, which I will explain
> below. I just want to make sure that I am not starting off with an erroneous
> setup.
>
> I have a two-node multi-state resource configured as follows:
>
> [root@vsanqa4 ~]# crm configure show
> node vsanqa3
> node vsanqa4
> primitive vha-6f92a1f6-969c-4c41-b9ca-7eb6f83ace2e ocf:heartbeat:vgc-cm-agent.ocf \
>         params cluster_uuid="6f92a1f6-969c-4c41-b9ca-7eb6f83ace2e" \
>         op monitor interval="30s" role="Master" timeout="100s" \
>         op monitor interval="31s" role="Slave" timeout="100s"
> ms ms-6f92a1f6-969c-4c41-b9ca-7eb6f83ace2e vha-6f92a1f6-969c-4c41-b9ca-7eb6f83ace2e \
>         meta clone-max="2" globally-unique="false" target-role="Started"
> location ms-6f92a1f6-969c-4c41-b9ca-7eb6f83ace2e-nodes ms-6f92a1f6-969c-4c41-b9ca-7eb6f83ace2e \
>         rule $id="ms-6f92a1f6-969c-4c41-b9ca-7eb6f83ace2e-nodes-rule" -inf: #uname ne vsanqa4 and #uname ne vsanqa3
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.8-7.el6-394e906" \
>         cluster-infrastructure="classic openais (with plugin)" \
>         expected-quorum-votes="2" \
>         stonith-enabled="false" \
>         no-quorum-policy="ignore"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="100"
>
> With this config, if I simulate a crash on the master with
> "echo c > /proc/sysrq-trigger", the slave does not get promoted for about
> 15 minutes. It does detect the peer going down, but does not seem to issue
> the promote immediately:
>
> Apr 10 14:12:32 vsanqa4 corosync[2966]: [TOTEM ] A processor failed,
> forming new configuration.
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [pcmk ] notice: pcmk_peer_update:
> Transitional membership event on ring 166060: memb=1, new=0, lost=1
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [pcmk ] info: pcmk_peer_update:
> memb: vsanqa4 1967394988
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [pcmk ] info: pcmk_peer_update:
> lost: vsanqa3 1950617772
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [pcmk ] notice: pcmk_peer_update:
> Stable membership event on ring 166060: memb=1, new=0, lost=0
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [pcmk ] info: pcmk_peer_update:
> MEMB: vsanqa4 1967394988
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [pcmk ] info:
> ais_mark_unseen_peer_dead: Node vsanqa3 was not seen in the previous
> transition
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [pcmk ] info: update_member: Node
> 1950617772/vsanqa3 is now: lost
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [pcmk ] info:
> send_member_notification: Sending membership update 166060 to 2 children
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [TOTEM ] A processor joined or left
> the membership and a new membership was formed.
> Apr 10 14:12:38 vsanqa4 cib[3386]: notice: ais_dispatch_message: Membership
> 166060: quorum lost
> Apr 10 14:12:38 vsanqa4 crmd[3391]: notice: ais_dispatch_message:
> Membership 166060: quorum lost
> Apr 10 14:12:38 vsanqa4 cib[3386]: notice: crm_update_peer_state:
> crm_update_ais_node: Node vsanqa3[1950617772] - state is now lost
> Apr 10 14:12:38 vsanqa4 crmd[3391]: notice: crm_update_peer_state:
> crm_update_ais_node: Node vsanqa3[1950617772] - state is now lost
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [CPG ] chosen downlist: sender
> r(0) ip(172.16.68.117) ; members(old:2 left:1)
> Apr 10 14:12:38 vsanqa4 corosync[2966]: [MAIN ] Completed service
> synchronization, ready to provide service.
>
> Then (after about 15 minutes), I see the following:

There were no logs at all in between?
> Apr 10 14:26:46 vsanqa4 crmd[3391]: notice: do_state_transition: State
> transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED
> origin=crm_timer_popped ]
> Apr 10 14:26:46 vsanqa4 pengine[3390]: notice: unpack_config: On loss of
> CCM Quorum: Ignore
> Apr 10 14:26:46 vsanqa4 pengine[3390]: notice: LogActions: Promote
> vha-6f92a1f6-969c-4c41-b9ca-7eb6f83ace2e:0#011(Slave -> Master vsanqa4)
> Apr 10 14:26:46 vsanqa4 pengine[3390]: notice: process_pe_message:
> Calculated Transition 3: /var/lib/pacemaker/pengine/pe-input-392.bz2
>
> Thanks,
> Pavan

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
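One detail worth noting in the logs above: the transition that finally promoted the slave was triggered by cause=C_TIMER_POPPED (origin=crm_timer_popped), i.e. the policy engine woke up because the periodic cluster-recheck timer fired, not in response to the membership change itself. The roughly 15-minute gap matches Pacemaker's default cluster-recheck-interval of 15 minutes. As a diagnostic workaround only (the property name is the standard one; the value below is illustrative, and shortening the timer masks the missed event rather than fixing it):

```
# Workaround, not a fix: re-run the policy engine more often, so an
# event the cluster failed to react to is picked up well before the
# default 15-minute recheck would notice it.
crm configure property cluster-recheck-interval="1min"
```

If the promote then fires as soon as the shortened timer pops, that confirms the cluster simply missed reacting to the peer failure, and the pe-input file named in the log is the place to look for what the policy engine saw.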
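On the CMAN question: even with no GFS2 or other filesystem requirements, moving off the plugin per Chapter 8 of Clusters from Scratch mainly means giving CMAN a minimal /etc/cluster/cluster.conf and letting the fence_pcmk agent redirect fencing requests to Pacemaker. A rough sketch only, assuming the two node names from the configuration above (the cluster name "vsanqa" and config_version are placeholders):

```
<cluster config_version="1" name="vsanqa">
  <!-- two-node mode: allow the cluster to operate with a single vote -->
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="vsanqa3" nodeid="1">
      <fence>
        <!-- redirect CMAN fencing requests to Pacemaker's stonith -->
        <method name="pcmk-redirect">
          <device name="pcmk" port="vsanqa3"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="vsanqa4" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="vsanqa4"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="pcmk" agent="fence_pcmk"/>
  </fencedevices>
</cluster>
```

With this in place, corosync is started by cman ("service cman start") and pacemaker runs as its own service, instead of being loaded into corosync as a plugin.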