On Mon, 23 Sep 2013 20:48:10 +1000, Andrew Beekhof wrote: > On 23/09/2013, at 8:27 PM, Alessandro Bono <alessandro.b...@gmail.com> wrote: > >> On Mon, 23 Sep 2013 15:31:59 +1000, Andrew Beekhof wrote: >> >>> I see: >>> >>>> Sep 22 00:45:48 [4412] ga1-ext pacemakerd: error: pcmk_cpg_dispatch: >>>> Connection to the CPG API failed: Library error (2) >>>> Sep 22 00:45:48 [4419] ga1-ext stonith-ng: error: pcmk_cpg_dispatch: >>>> Connection to the CPG API failed: Library error (2) >>>> Sep 22 00:45:48 [4421] ga1-ext attrd: error: pcmk_cpg_dispatch: >>>> Connection to the CPG API failed: Library error (2) >>>> Sep 22 00:45:48 [4418] ga1-ext cib: error: pcmk_cpg_dispatch: >>>> Connection to the CPG API failed: Library error (2) >>>> Sep 22 00:45:48 [4423] ga1-ext crmd: error: pcmk_cpg_dispatch: >>>> Connection to the CPG API failed: Library error (2) >>> >>> which basically means Corosync died or was killed. If you configure >>> logging in cluster.conf, corosync might provide more information. >> >> Hi Andrew >> >> if I set in /etc/sysconfig/pacemaker >> >> PCMK_debug=yes >> PCMK_debugfile=/var/log/pacemaker.log >> PCMK_logpriority=debug >> >> Is it correct to debug this problem or there is a better configuration? > > That configures _pacemaker's_ logging. > You want to find out why _corosync_ is dying, hence: > >>> If you configure logging in cluster.conf
ok now I understand modified cluster.conf like this <cluster config_version="6" name="ga-ext_cluster"> <logging debug="on"/> <clusternodes> <clusternode name="ga1-ext" nodeid="1"> <fence> <method name="pcmk-redirect"> <device name="pcmk" port="ga1-ext"/> </method> </fence> </clusternode> <clusternode name="ga2-ext" nodeid="2"> <fence> <method name="pcmk-redirect"> <device name="pcmk" port="ga2-ext"/> </method> </fence> </clusternode> </clusternodes> <fencedevices> <fencedevice agent="fence_pcmk" name="pcmk"/> </fencedevices> </cluster> restarted pacemaker on both nodes thank you for your support > >> >> >>> >>> I also see: >>> >>>>> stonith-enabled="false" \ >>> >>> which is why the cluster couldn't return to a sane state and you got a >>> split-brain for drbd. >> >> ok, I'll try to configure stonith with libvirt >> >> thanks >> >>> >>> >>> On 23/09/2013, at 5:34 AM, Alessandro Bono <alessandro.b...@gmail.com> >>> wrote: >>> >>>> Ok Sunday morning it's not good to collect information >>>> logs are in /var/log/cluster directory on both nodes >>>> these are for secondary node ga2-ext >>>> >>>> Sep 22 00:45:39 corosync [TOTEM ] Process pause detected for 20442 ms, >>>> flushing membership messages. >>>> Sep 22 00:45:45 corosync [CMAN ] quorum lost, blocking activity >>>> Sep 22 00:45:45 corosync [QUORUM] This node is within the non-primary >>>> component and will NOT provide any services. >>>> Sep 22 00:45:45 corosync [QUORUM] Members[1]: 2 >>>> Sep 22 00:45:45 corosync [TOTEM ] A processor joined or left the >>>> membership and a new membership was formed. >>>> Sep 22 00:45:45 corosync [CMAN ] quorum regained, resuming activity >>>> Sep 22 00:45:45 corosync [QUORUM] This node is within the primary >>>> component and will provide service. >>>> Sep 22 00:45:45 corosync [QUORUM] Members[2]: 1 2 >>>> Sep 22 00:45:45 corosync [QUORUM] Members[2]: 1 2 >>>> Sep 22 00:45:46 corosync [CPG ] chosen downlist: sender r(0) >>>> ip(10.12.23.2) ; members(old:2 left:1) >>>> Sep 22 00:45:46 corosync [MAIN ] Completed service synchronization, ready >>>> to provide service. >>>> Sep 22 00:45:46 [11432] ga2-ext cib: info: pcmk_cpg_membership: >>>> Left[1.0] cib.1 >>>> Sep 22 00:45:46 [11433] ga2-ext stonith-ng: info: pcmk_cpg_membership: >>>> Left[1.0] stonith-ng.1 >>>> Sep 22 00:45:47 [11432] ga2-ext cib: info: >>>> crm_update_peer_proc: pcmk_cpg_membership: Node ga1-ext[1] - >>>> corosync-cpg is now offline >>>> Sep 22 00:45:47 [11432] ga2-ext cib: info: pcmk_cpg_membership: >>>> Member[1.0] cib.2 >>>> Sep 22 00:45:47 [11432] ga2-ext cib: info: pcmk_cpg_membership: >>>> Joined[2.0] cib.1 >>>> Sep 22 00:45:47 [11432] ga2-ext cib: info: pcmk_cpg_membership: >>>> Member[2.0] cib.1 >>>> Sep 22 00:45:47 [11432] ga2-ext cib: info: >>>> crm_update_peer_proc: pcmk_cpg_membership: Node ga1-ext[1] - >>>> corosync-cpg is now online >>>> Sep 22 00:45:47 [11432] ga2-ext cib: info: pcmk_cpg_membership: >>>> Member[2.1] cib.2 >>>> Sep 22 00:45:47 [11433] ga2-ext stonith-ng: info: >>>> crm_update_peer_proc: pcmk_cpg_membership: Node ga1-ext[1] - >>>> corosync-cpg is now offline >>>> Sep 22 00:45:48 [11432] ga2-ext cib: info: cib_process_diff: >>>> Diff 0.143.13 -> 0.143.14 from ga1-ext not applied to 0.143.9: >>>> current "num_updates" is less than required >>>> Sep 22 00:45:48 [11432] ga2-ext cib: info: >>>> cib_server_process_diff: Requesting re-sync from peer >>>> Sep 22 00:45:48 [11432] ga2-ext cib: notice: >>>> cib_server_process_diff: Not applying diff 0.143.14 -> 0.143.15 (sync in >>>> progress) >>>> Sep 22 00:45:48 [11432] ga2-ext cib: notice: >>>> cib_server_process_diff: Not applying diff 0.143.15 -> 0.143.16 (sync in >>>> progress) >>>> Sep 22 00:45:48 [11432] ga2-ext cib: notice: >>>> cib_server_process_diff: Not applying diff 0.143.16 -> 0.143.17 (sync in >>>> progress) >>>> Sep 22 00:45:48 [11433] ga2-ext stonith-ng: info: pcmk_cpg_membership: >>>> Member[1.0] stonith-ng.2 >>>> Sep 22 00:45:48 [11433] ga2-ext stonith-ng: info: pcmk_cpg_membership: >>>> Joined[2.0] stonith-ng.1 >>>> Sep 22 00:45:48 [11433] ga2-ext stonith-ng: info: pcmk_cpg_membership: >>>> Member[2.0] stonith-ng.1 >>>> Sep 22 00:45:48 [11433] ga2-ext stonith-ng: info: >>>> crm_update_peer_proc: pcmk_cpg_membership: Node ga1-ext[1] - >>>> corosync-cpg is now online >>>> Sep 22 00:45:48 [11433] ga2-ext stonith-ng: info: pcmk_cpg_membership: >>>> Member[2.1] stonith-ng.2 >>>> Sep 22 00:45:48 [11432] ga2-ext cib: info: cib_process_replace: >>>> Digest matched on replace from ga1-ext: >>>> 00be365e16e96092747ee3d8acc74e7b >>>> Sep 22 00:45:48 [11432] ga2-ext cib: info: cib_process_replace: >>>> Replaced 0.143.9 with 0.143.17 from ga1-ext >>>> Sep 22 00:45:49 [11432] ga2-ext cib: info: cib_replace_notify: >>>> Replaced: 0.143.9 -> 0.143.17 from ga1-ext >>>> Sep 22 00:45:49 [11432] ga2-ext cib: info: cib_process_request: >>>> Completed cib_replace operation for section 'all': OK (rc=0, >>>> origin=ga1-ext/ga2-ext/(null), version=0.143.17) >>>> Sep 22 00:45:49 [11432] ga2-ext cib: info: crm_client_new: >>>> Connecting 0x23c9cd0 for uid=0 gid=0 pid=24772 >>>> id=3e16081f-1b7b-4fb7-ab3a-e47dffb68615 >>>> Sep 22 00:45:48 [11437] ga2-ext crmd: notice: cman_event_callback: >>>> Membership 904: quorum lost >>>> Sep 22 00:45:49 [11437] ga2-ext crmd: notice: >>>> crm_update_peer_state: cman_event_callback: Node ga1-ext[1] - state is >>>> now lost (was member) >>>> Sep 22 00:45:49 [11432] ga2-ext cib: info: cib_process_request: >>>> Completed cib_query operation for section nodes: OK (rc=0, >>>> origin=local/crm_attribute/2, version=0.143.17) >>>> Sep 22 00:45:49 [11437] ga2-ext crmd: info: >>>> peer_update_callback: ga1-ext is now lost (was member) >>>> Sep 22 00:45:49 [11437] ga2-ext crmd: warning: reap_dead_nodes: >>>> Our DC node (ga1-ext) left the cluster >>>> Sep 22 00:45:49 [11437] ga2-ext crmd: notice: cman_event_callback: >>>> Membership 904: quorum acquired >>>> Sep 22 00:45:49 [11437] ga2-ext crmd: warning: reap_dead_nodes: >>>> Our DC node (ga1-ext) left the cluster >>>> Sep 22 00:45:49 [11437] ga2-ext crmd: info: cman_event_callback: >>>> Membership 904: quorum retained >>>> Sep 22 00:45:49 [11437] ga2-ext crmd: warning: reap_dead_nodes: >>>> Our DC node (ga1-ext) left the cluster >>>> Sep 22 00:45:49 [11432] ga2-ext cib: info: cib_process_request: >>>> Completed cib_query operation for section >>>> //cib/status//node_state[@id='ga2-ext']//transient_attributes//nvpair[@name='shutdown']: >>>> No such device or address (rc=-6, origin=local/attrd/18, version=0.143.17) >>>> Sep 22 00:45:49 [11432] ga2-ext cib: info: cib_process_request: >>>> Completed cib_query operation for section >>>> //cib/status//node_state[@id='ga2-ext']//transient_attributes//nvpair[@name='terminate']: >>>> No such device or address (rc=-6, origin=local/attrd/19, version=0.143.17) >>>> Sep 22 00:45:49 [11432] ga2-ext cib: info: cib_process_request: >>>> Completed cib_query operation for section >>>> //cib/status//node_state[@id='ga2-ext']//transient_attributes//nvpair[@name='probe_complete']: >>>> OK (rc=0, origin=local/attrd/20, version=0.143.17) >>>> Sep 22 00:45:49 [11432] ga2-ext cib: info: cib_process_request: >>>> Forwarding cib_modify operation for section status to master >>>> (origin=local/attrd/21) >>>> Sep 22 00:45:49 [11432] ga2-ext cib: info: cib_process_request: >>>> Completed cib_query operation for section >>>> //cib/status//node_state[@id='ga2-ext']//transient_attributes//nvpair[@name='master-drbd0']: >>>> OK (rc=0, origin=local/attrd/22, version=0.143.17) >>>> Sep 22 00:45:49 [11432] ga2-ext cib: info: cib_process_request: >>>> Forwarding cib_modify operation for section status to master >>>> (origin=local/attrd/23) >>>> Sep 22 00:45:49 [11432] ga2-ext cib: info: crm_client_destroy: >>>> Destroying 0 events >>>> Sep 22 00:45:49 [11437] ga2-ext crmd: notice: do_state_transition: >>>> State transition S_NOT_DC -> S_ELECTION [ input=I_ELECTION >>>> cause=C_FSA_INTERNAL origin=reap_dead_nodes ] >>>> Sep 22 00:45:49 [11437] ga2-ext crmd: info: update_dc: >>>> Unset DC. Was ga1-ext >>>> Sep 22 00:45:50 [11437] ga2-ext crmd: info: pcmk_cpg_membership: >>>> Left[1.0] crmd.1 >>>> Sep 22 00:45:50 [11437] ga2-ext crmd: info: >>>> crm_update_peer_proc: pcmk_cpg_membership: Node ga1-ext[1] - >>>> corosync-cpg is now offline >>>> Sep 22 00:45:50 [11437] ga2-ext crmd: info: >>>> peer_update_callback: Client ga1-ext/peer now has status [offline] >>>> (DC=<null>) >>>> Sep 22 00:45:50 [11437] ga2-ext crmd: info: pcmk_cpg_membership: >>>> Member[1.0] crmd.2 >>>> Sep 22 00:45:50 [11432] ga2-ext cib: info: write_cib_contents: >>>> Archived previous version as /var/lib/heartbeat/crm/cib-48.raw >>>> Sep 22 00:45:50 [11437] ga2-ext crmd: info: pcmk_cpg_membership: >>>> Joined[2.0] crmd.1 >>>> Sep 22 00:45:50 [11437] ga2-ext crmd: info: pcmk_cpg_membership: >>>> Member[2.0] crmd.1 >>>> Sep 22 00:45:50 [11437] ga2-ext crmd: info: >>>> crm_update_peer_proc: pcmk_cpg_membership: Node ga1-ext[1] - >>>> corosync-cpg is now online >>>> Sep 22 00:45:50 [11437] ga2-ext crmd: info: >>>> peer_update_callback: Client ga1-ext/peer now has status [online] >>>> (DC=<null>) >>>> Sep 22 00:45:50 [11437] ga2-ext crmd: info: pcmk_cpg_membership: >>>> Member[2.1] crmd.2 >>>> Sep 22 00:45:50 [11437] ga2-ext crmd: warning: do_log: FSA: Input >>>> I_JOIN_OFFER from route_message() received in state S_ELECTION >>>> Sep 22 00:45:50 [11432] ga2-ext cib: info: write_cib_contents: >>>> Wrote version 0.143.0 of the CIB to disk (digest: >>>> 1d6de77608199fb31f552e22d5d0f708) >>>> Sep 22 00:45:50 [11432] ga2-ext cib: info: retrieveCib: >>>> Reading cluster configuration from: /var/lib/heartbeat/crm/cib.EqQlyt >>>> (digest: /var/lib/heartbeat/crm/cib.kNjefQ) >>>> Sep 22 00:45:58 corosync [TOTEM ] A processor failed, forming new >>>> configuration. >>>> Sep 22 00:46:00 corosync [CMAN ] quorum lost, blocking activity >>>> Sep 22 00:46:00 corosync [QUORUM] This node is within the non-primary >>>> component and will NOT provide any services. >>>> Sep 22 00:46:00 corosync [QUORUM] Members[1]: 2 >>>> Sep 22 00:46:00 corosync [TOTEM ] A processor joined or left the >>>> membership and a new membership was formed. >>>> Sep 22 00:46:00 corosync [CPG ] chosen downlist: sender r(0) >>>> ip(10.12.23.2) ; members(old:2 left:1) >>>> Sep 22 00:46:00 corosync [MAIN ] Completed service synchronization, ready >>>> to provide service. >>>> >>>> /var/log/cluster/fenced.log on primary >>>> >>>> Sep 22 00:45:48 fenced cluster is down, exiting >>>> Sep 22 00:45:48 fenced daemon cpg_dispatch error 2 >>>> Sep 22 00:45:48 fenced cpg_dispatch error 2 >>>> >>>> /var/log/cluster/dlm_controld.log on primary >>>> Sep 22 00:45:48 dlm_controld cluster is down, exiting >>>> Sep 22 00:45:48 dlm_controld daemon cpg_dispatch error 2 >>>> >>>> >>>> On Sun, 22 Sep 2013 08:21:18 +0000, Alessandro Bono wrote: >>>> >>>>> Found logs in corosync(!?) log directory >>>>> >>>>> these are for primary node ga1-ext >>>>> >>>>> Sep 22 00:45:29 corosync [TOTEM ] A processor failed, forming new >>>>> configuration. >>>>> Sep 22 00:45:31 corosync [CMAN ] quorum lost, blocking activity >>>>> Sep 22 00:45:31 corosync [QUORUM] This node is within the non-primary >>>>> component and will NOT provide any services. >>>>> Sep 22 00:45:31 corosync [QUORUM] Members[1]: 1 >>>>> Sep 22 00:45:31 corosync [TOTEM ] A processor joined or left the >>>>> membership and a new membership was formed. >>>>> Sep 22 00:45:31 corosync [CPG ] chosen downlist: sender r(0) >>>>> ip(10.12.23.1) ; members(old:2 left:1) >>>>> Sep 22 00:45:31 corosync [MAIN ] Completed service synchronization, >>>>> ready to provide service. >>>>> Sep 22 00:45:31 [4418] ga1-ext cib: info: pcmk_cpg_membership: >>>>> Left[4.0] cib.2 >>>>> Sep 22 00:45:31 [4418] ga1-ext cib: info: >>>>> crm_update_peer_proc: pcmk_cpg_membership: Node ga2-ext[2] - >>>>> corosync-cpg is now offline >>>>> Sep 22 00:45:31 [4418] ga1-ext cib: info: pcmk_cpg_membership: >>>>> Member[4.0] cib.1 >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: pcmk_cpg_membership: >>>>> Left[4.0] crmd.2 >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: >>>>> crm_update_peer_proc: pcmk_cpg_membership: Node ga2-ext[2] - >>>>> corosync-cpg is now offline >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: >>>>> peer_update_callback: Client ga2-ext/peer now has status [offline] >>>>> (DC=true) >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: warning: match_down_event: >>>>> No match for shutdown action on ga2-ext >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: notice: >>>>> peer_update_callback: Stonith/shutdown of ga2-ext not matched >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: >>>>> crm_update_peer_join: peer_update_callback: Node ga2-ext[2] - join-2 >>>>> phase 4 -> 0 >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: >>>>> abort_transition_graph: peer_update_callback:214 - Triggered >>>>> transition abort (complete=1) : Node failure >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: pcmk_cpg_membership: >>>>> Member[4.0] crmd.1 >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: notice: cman_event_callback: >>>>> Membership 900: quorum lost >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: notice: >>>>> crm_update_peer_state: cman_event_callback: Node ga2-ext[2] - state >>>>> is now lost (was member) >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: >>>>> peer_update_callback: ga2-ext is now lost (was member) >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: warning: match_down_event: >>>>> No match for shutdown action on ga2-ext >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: notice: >>>>> peer_update_callback: Stonith/shutdown of ga2-ext not matched >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: >>>>> abort_transition_graph: peer_update_callback:214 - Triggered >>>>> transition abort (complete=1) : Node failure >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: notice: do_state_transition: >>>>> State transition S_IDLE -> S_INTEGRATION [ input=I_NODE_JOIN >>>>> cause=C_FSA_INTERNAL origin=check_join_state ] >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: >>>>> do_dc_join_offer_one: An unknown node joined - (re-)offer to any >>>>> unconfirmed nodes >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: join_make_offer: >>>>> Making join offers based on membership 900 >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: join_make_offer: >>>>> Skipping ga1-ext: already known 4 >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: >>>>> abort_transition_graph: do_te_invoke:158 - Triggered transition abort >>>>> (complete=1) : Peer Halt >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: do_state_transition: >>>>> State transition S_INTEGRATION -> S_FINALIZE_JOIN [ >>>>> input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ] >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: crmd_join_phase_log: >>>>> join-2: ga2-ext=none >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: crmd_join_phase_log: >>>>> join-2: ga1-ext=confirmed >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: do_state_transition: >>>>> State transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ >>>>> input=I_FINALIZED cause=C_FSA_INTERNAL origin=check_join_state ] >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: >>>>> abort_transition_graph: do_te_invoke:151 - Triggered transition abort >>>>> (complete=1) : Peer Cancelled >>>>> Sep 22 00:45:31 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_modify operation for section status: OK (rc=0, >>>>> origin=local/crmd/703, version=0.143.10) >>>>> Sep 22 00:45:31 [4419] ga1-ext stonith-ng: info: pcmk_cpg_membership: >>>>> Left[3.0] stonith-ng.2 >>>>> Sep 22 00:45:31 [4419] ga1-ext stonith-ng: info: >>>>> crm_update_peer_proc: pcmk_cpg_membership: Node ga2-ext[2] - >>>>> corosync-cpg is now offline >>>>> Sep 22 00:45:31 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_modify operation for section status: OK (rc=0, >>>>> origin=local/crmd/704, version=0.143.10) >>>>> Sep 22 00:45:31 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_modify operation for section cib: OK (rc=0, >>>>> origin=local/crmd/705, version=0.143.11) >>>>> Sep 22 00:45:31 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_modify operation for section nodes: OK (rc=0, >>>>> origin=local/crmd/706, version=0.143.11) >>>>> Sep 22 00:45:31 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_modify operation for section status: OK (rc=0, >>>>> origin=local/crmd/707, version=0.143.12) >>>>> Sep 22 00:45:31 [4421] ga1-ext attrd: notice: >>>>> attrd_local_callback: Sending full refresh (origin=crmd) >>>>> Sep 22 00:45:31 [4421] ga1-ext attrd: notice: >>>>> attrd_trigger_update: Sending flush op to all hosts for: >>>>> probe_complete (true) >>>>> Sep 22 00:45:31 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_modify operation for section nodes: OK (rc=0, >>>>> origin=local/crmd/708, version=0.143.12) >>>>> Sep 22 00:45:31 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_modify operation for section status: OK (rc=0, >>>>> origin=local/crmd/709, version=0.143.13) >>>>> Sep 22 00:45:31 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_modify operation for section cib: OK (rc=0, >>>>> origin=local/crmd/710, version=0.143.13) >>>>> Sep 22 00:45:31 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_query operation for section 'all': OK (rc=0, >>>>> origin=local/crmd/711, version=0.143.13) >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: notice: unpack_config: >>>>> On loss of CCM Quorum: Ignore >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: >>>>> determine_online_status: Node ga1-ext is online >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: unpack_rsc_op: >>>>> Operation monitor found resource dovecot active on ga1-ext >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: notice: unpack_rsc_op: >>>>> Operation monitor found resource drbd0:0 active in master mode on ga1-ext >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: unpack_rsc_op: >>>>> Operation monitor found resource ClusterIP active on ga1-ext >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: unpack_rsc_op: >>>>> Operation monitor found resource mail active on ga1-ext >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: unpack_rsc_op: >>>>> Operation monitor found resource mysql active on ga1-ext >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: unpack_rsc_op: >>>>> Operation monitor found resource drbdlinks active on ga1-ext >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: unpack_rsc_op: >>>>> Operation monitor found resource SharedFS active on ga1-ext >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: clone_print: >>>>> Master/Slave Set: ms_drbd0 [drbd0] >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: short_print: >>>>> Masters: [ ga1-ext ] >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: short_print: >>>>> Stopped: [ ga2-ext ] >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: group_print: >>>>> Resource Group: service_group >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: native_print: >>>>> SharedFS (ocf::heartbeat:Filesystem): Started ga1-ext >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: native_print: >>>>> drbdlinks (ocf::tummy:drbdlinks): Started ga1-ext >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: native_print: >>>>> ClusterIP (ocf::heartbeat:IPaddr): Started ga1-ext >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: native_print: >>>>> mail (ocf::heartbeat:MailTo): Started ga1-ext >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: native_print: >>>>> mysql (lsb:mysqld): Started ga1-ext >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: native_print: >>>>> dovecot (lsb:dovecot): Started ga1-ext >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: native_color: >>>>> Resource drbd0:1 cannot run anywhere >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: master_color: >>>>> Promoting drbd0:0 (Master ga1-ext) >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: master_color: >>>>> ms_drbd0: Promoted 1 instances of a possible 1 to master >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: LogActions: >>>>> Leave drbd0:0 (Master ga1-ext) >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: LogActions: >>>>> Leave drbd0:1 (Stopped) >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: LogActions: >>>>> Leave SharedFS (Started ga1-ext) >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: LogActions: >>>>> Leave drbdlinks (Started ga1-ext) >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: LogActions: >>>>> Leave ClusterIP (Started ga1-ext) >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: LogActions: >>>>> Leave mail (Started ga1-ext) >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: LogActions: >>>>> Leave mysql (Started ga1-ext) >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: info: LogActions: >>>>> Leave dovecot (Started ga1-ext) >>>>> Sep 22 00:45:31 [4419] ga1-ext stonith-ng: info: pcmk_cpg_membership: >>>>> Member[3.0] stonith-ng.1 >>>>> Sep 22 00:45:31 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_query operation for section >>>>> //cib/status//node_state[@id='ga1-ext']//transient_attributes//nvpair[@name='probe_complete']: >>>>> OK (rc=0, origin=local/attrd/51, version=0.143.13) >>>>> Sep 22 00:45:31 [4421] ga1-ext attrd: notice: >>>>> attrd_trigger_update: Sending flush op to all hosts for: >>>>> master-drbd0 (10000) >>>>> Sep 22 00:45:31 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_modify operation for section status: OK (rc=0, >>>>> origin=local/attrd/52, version=0.143.13) >>>>> Sep 22 00:45:31 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_query operation for section >>>>> //cib/status//node_state[@id='ga1-ext']//transient_attributes//nvpair[@name='master-drbd0']: >>>>> OK (rc=0, origin=local/attrd/53, version=0.143.13) >>>>> Sep 22 00:45:31 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_modify operation for section status: OK (rc=0, >>>>> origin=local/attrd/54, version=0.143.13) >>>>> Sep 22 00:45:31 [4422] ga1-ext pengine: notice: process_pe_message: >>>>> Calculated Transition 621: >>>>> /var/lib/pacemaker/pengine/pe-input-288.bz2 >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: do_state_transition: >>>>> State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ >>>>> input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ] >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: do_te_invoke: >>>>> Processing graph 621 (ref=pe_calc-dc-1379803531-659) derived from >>>>> /var/lib/pacemaker/pengine/pe-input-288.bz2 >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: notice: run_graph: >>>>> Transition 621 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, >>>>> Source=/var/lib/pacemaker/pengine/pe-input-288.bz2): Complete >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: info: do_log: FSA: >>>>> Input I_TE_SUCCESS from notify_crmd() received in state >>>>> S_TRANSITION_ENGINE >>>>> Sep 22 00:45:31 [4423] ga1-ext crmd: notice: do_state_transition: >>>>> State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS >>>>> cause=C_FSA_INTERNAL origin=notify_crmd ] >>>>> Sep 22 00:45:45 corosync [TOTEM ] A processor joined or left the >>>>> membership and a new membership was formed. >>>>> Sep 22 00:45:45 corosync [CMAN ] quorum regained, resuming activity >>>>> Sep 22 00:45:45 corosync [QUORUM] This node is within the primary >>>>> component and will provide service. >>>>> Sep 22 00:45:45 corosync [QUORUM] Members[2]: 1 2 >>>>> Sep 22 00:45:45 corosync [QUORUM] Members[2]: 1 2 >>>>> Sep 22 00:45:45 [4423] ga1-ext crmd: notice: cman_event_callback: >>>>> Membership 904: quorum acquired >>>>> Sep 22 00:45:45 [4423] ga1-ext crmd: notice: >>>>> crm_update_peer_state: cman_event_callback: Node ga2-ext[2] - state >>>>> is now member (was lost) >>>>> Sep 22 00:45:45 [4423] ga1-ext crmd: info: >>>>> peer_update_callback: ga2-ext is now member (was lost) >>>>> Sep 22 00:45:45 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_modify operation for section status: OK (rc=0, >>>>> origin=local/crmd/712, version=0.143.14) >>>>> Sep 22 00:45:45 [4423] ga1-ext crmd: info: crm_cs_flush: >>>>> Sent 0 CPG messages (1 remaining, last=37): Try again (6) >>>>> Sep 22 00:45:45 [4423] ga1-ext crmd: info: cman_event_callback: >>>>> Membership 904: quorum retained >>>>> Sep 22 00:45:45 [4418] ga1-ext cib: info: crm_cs_flush: >>>>> Sent 0 CPG messages (1 remaining, last=64): Try again (6) >>>>> Sep 22 00:45:45 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_modify operation for section cib: OK (rc=0, >>>>> origin=local/crmd/713, version=0.143.15) >>>>> Sep 22 00:45:45 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_modify operation for section nodes: OK (rc=0, >>>>> origin=local/crmd/714, version=0.143.15) >>>>> Sep 22 00:45:45 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_modify operation for section status: OK (rc=0, >>>>> origin=local/crmd/715, version=0.143.16) >>>>> Sep 22 00:45:45 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_modify operation for section nodes: OK (rc=0, >>>>> origin=local/crmd/716, version=0.143.16) >>>>> Sep 22 00:45:45 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_modify operation for section status: OK (rc=0, >>>>> origin=local/crmd/717, version=0.143.16) >>>>> Sep 22 00:45:45 [4423] ga1-ext crmd: info: crm_cs_flush: >>>>> Sent 0 CPG messages (2 remaining, last=37): Try again (6) >>>>> Sep 22 00:45:46 [4418] ga1-ext cib: info: crm_cs_flush: >>>>> Sent 0 CPG messages (3 remaining, last=64): Try again (6) >>>>> Sep 22 00:45:46 corosync [CPG ] chosen downlist: sender r(0) >>>>> ip(10.12.23.1) ; members(old:1 left:0) >>>>> Sep 22 00:45:46 [4423] ga1-ext crmd: info: crm_cs_flush: >>>>> Sent 0 CPG messages (2 remaining, last=37): Try again (6) >>>>> Sep 22 00:45:46 [4419] ga1-ext stonith-ng: info: pcmk_cpg_membership: >>>>> Joined[4.0] stonith-ng.2 >>>>> Sep 22 00:45:46 [4419] ga1-ext stonith-ng: info: pcmk_cpg_membership: >>>>> Member[4.0] stonith-ng.1 >>>>> Sep 22 00:45:46 [4419] ga1-ext stonith-ng: info: pcmk_cpg_membership: >>>>> Member[4.1] stonith-ng.2 >>>>> Sep 22 00:45:46 [4419] ga1-ext stonith-ng: info: >>>>> crm_update_peer_proc: pcmk_cpg_membership: Node ga2-ext[2] - >>>>> corosync-cpg is now online >>>>> Sep 22 00:45:46 [4423] ga1-ext crmd: info: pcmk_cpg_membership: >>>>> Joined[5.0] crmd.2 >>>>> Sep 22 00:45:46 [4423] ga1-ext crmd: info: pcmk_cpg_membership: >>>>> Member[5.0] crmd.1 >>>>> Sep 22 00:45:46 [4423] ga1-ext crmd: info: pcmk_cpg_membership: >>>>> Member[5.1] crmd.2 >>>>> Sep 22 00:45:46 [4423] ga1-ext crmd: info: >>>>> crm_update_peer_proc: pcmk_cpg_membership: Node ga2-ext[2] - >>>>> corosync-cpg is now online >>>>> Sep 22 00:45:46 [4423] ga1-ext crmd: info: >>>>> peer_update_callback: Client ga2-ext/peer now has status [online] >>>>> (DC=true) >>>>> Sep 22 00:45:46 [4418] ga1-ext cib: info: pcmk_cpg_membership: >>>>> Joined[5.0] cib.2 >>>>> Sep 22 00:45:46 [4418] ga1-ext cib: info: pcmk_cpg_membership: >>>>> Member[5.0] cib.1 >>>>> Sep 22 00:45:46 [4418] ga1-ext cib: info: pcmk_cpg_membership: >>>>> Member[5.1] cib.2 >>>>> Sep 22 00:45:46 [4418] ga1-ext cib: info: >>>>> crm_update_peer_proc: pcmk_cpg_membership: Node ga2-ext[2] - >>>>> corosync-cpg is now online >>>>> Sep 22 00:45:46 [4423] ga1-ext crmd: notice: do_state_transition: >>>>> State transition S_IDLE -> S_INTEGRATION [ input=I_NODE_JOIN >>>>> cause=C_FSA_INTERNAL origin=peer_update_callback ] >>>>> Sep 22 00:45:46 [4423] ga1-ext crmd: info: >>>>> do_dc_join_offer_one: An unknown node joined - (re-)offer to any >>>>> unconfirmed nodes >>>>> Sep 22 00:45:46 [4423] ga1-ext crmd: info: join_make_offer: >>>>> Making join offers based on membership 904 >>>>> Sep 22 00:45:46 [4423] ga1-ext crmd: info: join_make_offer: >>>>> join-2: Sending offer to ga2-ext >>>>> Sep 22 00:45:46 [4423] ga1-ext crmd: info: >>>>> crm_update_peer_join: join_make_offer: Node ga2-ext[2] - join-2 >>>>> phase 0 -> 1 >>>>> Sep 22 00:45:46 [4423] ga1-ext crmd: info: join_make_offer: >>>>> Skipping ga1-ext: already known 4 >>>>> Sep 22 00:45:46 [4423] ga1-ext crmd: info: >>>>> abort_transition_graph: do_te_invoke:158 - Triggered transition abort >>>>> (complete=1) : Peer Halt >>>>> Sep 22 00:45:46 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_modify operation for section status: OK (rc=0, >>>>> origin=local/crmd/718, version=0.143.17) >>>>> Sep 22 00:45:46 [4419] ga1-ext stonith-ng: info: crm_cs_flush: >>>>> Sent 0 CPG messages (1 remaining, last=5): Try again (6) >>>>> Sep 22 00:45:46 corosync [MAIN ] Completed service synchronization, >>>>> ready to provide service. >>>>> Sep 22 00:45:46 [4412] ga1-ext pacemakerd: info: crm_cs_flush: >>>>> Sent 0 CPG messages (1 remaining, last=9): Try again (6) >>>>> Sep 22 00:45:46 [4418] ga1-ext cib: info: crm_cs_flush: >>>>> Sent 4 CPG messages (0 remaining, last=68): OK (1) >>>>> Sep 22 00:45:46 [4423] ga1-ext crmd: info: crm_cs_flush: >>>>> Sent 3 CPG messages (0 remaining, last=40): OK (1) >>>>> Sep 22 00:45:48 [4418] ga1-ext cib: info: cib_process_request: >>>>> Completed cib_sync_one operation for section 'all': OK (rc=0, >>>>> origin=ga2-ext/ga2-ext/(null), version=0.143.17) >>>>> Sep 22 00:45:48 [4412] ga1-ext pacemakerd: error: pcmk_cpg_dispatch: >>>>> Connection to the CPG API failed: Library error (2) >>>>> Sep 22 00:45:48 [4419] ga1-ext stonith-ng: error: pcmk_cpg_dispatch: >>>>> Connection to the CPG API failed: Library error (2) >>>>> Sep 22 00:45:48 [4421] ga1-ext attrd: error: pcmk_cpg_dispatch: >>>>> Connection to the CPG API failed: Library error (2) >>>>> Sep 22 00:45:48 [4418] ga1-ext cib: error: pcmk_cpg_dispatch: >>>>> Connection to the CPG API failed: Library error (2) >>>>> Sep 22 00:45:48 [4418] ga1-ext cib: error: cib_cs_destroy: >>>>> Corosync connection lost! Exiting. >>>>> Sep 22 00:45:48 [4418] ga1-ext cib: info: terminate_cib: >>>>> cib_cs_destroy: Exiting fast... >>>>> Sep 22 00:45:48 [4423] ga1-ext crmd: error: pcmk_cpg_dispatch: >>>>> Connection to the CPG API failed: Library error (2) >>>>> Sep 22 00:45:48 [4418] ga1-ext cib: info: qb_ipcs_us_withdraw: >>>>> withdrawing server sockets >>>>> Sep 22 00:45:48 [4418] ga1-ext cib: info: crm_client_destroy: >>>>> Destroying 0 events >>>>> Sep 22 00:45:48 [4418] ga1-ext cib: info: crm_client_destroy: >>>>> Destroying 0 events >>>>> Sep 22 00:45:48 [4418] ga1-ext cib: info: qb_ipcs_us_withdraw: >>>>> withdrawing server sockets >>>>> Sep 22 00:45:48 [4418] ga1-ext cib: info: crm_client_destroy: >>>>> Destroying 0 events >>>>> Sep 22 00:45:48 [4418] ga1-ext cib: info: qb_ipcs_us_withdraw: >>>>> withdrawing server sockets >>>>> Sep 22 00:45:48 [4423] ga1-ext crmd: error: crmd_cs_destroy: >>>>> connection terminated >>>>> Sep 22 00:45:48 [4412] ga1-ext pacemakerd: error: mcp_cpg_destroy: >>>>> Connection destroyed >>>>> Sep 22 00:45:48 [4418] ga1-ext cib: info: crm_xml_cleanup: >>>>> Cleaning up memory from libxml2 >>>>> Sep 22 00:45:48 [4419] ga1-ext stonith-ng: error: >>>>> stonith_peer_cs_destroy: Corosync connection terminated >>>>> Sep 22 00:45:48 [4419] ga1-ext stonith-ng: info: stonith_shutdown: >>>>> Terminating with 1 clients >>>>> Sep 22 00:45:48 [4419] ga1-ext stonith-ng: info: >>>>> cib_connection_destroy: Connection to the CIB closed. >>>>> Sep 22 00:45:48 [4419] ga1-ext stonith-ng: info: crm_client_destroy: >>>>> Destroying 0 events >>>>> Sep 22 00:45:48 [4419] ga1-ext stonith-ng: info: qb_ipcs_us_withdraw: >>>>> withdrawing server sockets >>>>> Sep 22 00:45:48 [4419] ga1-ext stonith-ng: info: main: Done >>>>> Sep 22 00:45:48 [4419] ga1-ext stonith-ng: info: crm_xml_cleanup: >>>>> Cleaning up memory from libxml2 >>>>> Sep 22 00:45:48 [4412] ga1-ext pacemakerd: info: crm_xml_cleanup: >>>>> Cleaning up memory from libxml2 >>>>> Sep 22 00:45:48 [4421] ga1-ext attrd: crit: attrd_cs_destroy: >>>>> Lost connection to Corosync service! >>>>> Sep 22 00:45:48 [4421] ga1-ext attrd: notice: main: Exiting... >>>>> Sep 22 00:45:48 [4421] ga1-ext attrd: notice: main: >>>>> Disconnecting client 0x1987990, pid=4423... >>>>> Sep 22 00:45:48 [4421] ga1-ext attrd: error: >>>>> attrd_cib_connection_destroy: Connection to the CIB terminated... >>>>> Sep 22 00:45:48 [4423] ga1-ext crmd: info: qb_ipcs_us_withdraw: >>>>> withdrawing server sockets >>>>> Sep 22 00:45:48 [4423] ga1-ext crmd: info: >>>>> tengine_stonith_connection_destroy: Fencing daemon disconnected >>>>> Sep 22 00:45:48 [4423] ga1-ext crmd: notice: crmd_exit: >>>>> Forcing immediate exit: Link has been severed (67) >>>>> Sep 22 00:45:48 [4423] ga1-ext crmd: info: crm_xml_cleanup: >>>>> Cleaning up memory from libxml2 >>>>> Sep 22 00:45:48 [4420] ga1-ext lrmd: info: >>>>> cancel_recurring_action: Cancelling operation ClusterIP_monitor_30000 >>>>> Sep 22 00:45:48 [4420] ga1-ext lrmd: warning: qb_ipcs_event_sendv: >>>>> new_event_notification (4420-4423-6): Bad file descriptor (9) >>>>> Sep 22 00:45:48 [4420] ga1-ext lrmd: warning: send_client_notify: >>>>> Notification of client crmd/84c7e6b7-398c-40da-bec9-48b5e36dce2b >>>>> failed >>>>> Sep 22 00:45:48 [4420] ga1-ext lrmd: info: crm_client_destroy: >>>>> Destroying 1 events >>>>> Sep 22 00:45:48 [4422] ga1-ext pengine: info: crm_client_destroy: >>>>> Destroying 0 events >>>>> >>>>> >>>>> no logs on ga2-ext, seems strange >>>>> no corosync configuration on cluster nodes >>>>> >>>>> [root@ga2-ext ~]# find /etc/corosync/ >>>>> /etc/corosync/ >>>>> /etc/corosync/uidgid.d >>>>> /etc/corosync/amf.conf.example >>>>> /etc/corosync/corosync.conf.old >>>>> /etc/corosync/corosync.conf.example >>>>> /etc/corosync/corosync.conf.example.udpu >>>>> /etc/corosync/service.d >>>>> >>>>> [root@ga1-ext ~]# find /etc/corosync/ >>>>> /etc/corosync/ >>>>> /etc/corosync/corosync.conf.example >>>>> /etc/corosync/service.d >>>>> /etc/corosync/corosync.conf.example.udpu >>>>> /etc/corosync/uidgid.d >>>>> /etc/corosync/corosync.conf.old >>>>> /etc/corosync/amf.conf.example >>>>> >>>>> same packages on both nodes >>>>> >>>>> corosync-1.4.1-15.el6_4.1.x86_64 >>>>> corosynclib-1.4.1-15.el6_4.1.x86_64 >>>>> drbd-bash-completion-8.3.15-1.el6.x86_64 >>>>> drbdlinks-1.23-1.el6.noarch >>>>> drbd-pacemaker-8.3.15-1.el6.x86_64 >>>>> drbd-udev-8.3.15-1.el6.x86_64 >>>>> drbd-utils-8.3.15-1.el6.x86_64 >>>>> pacemaker-1.1.10-1.el6.x86_64 >>>>> pacemaker-cli-1.1.10-1.el6.x86_64 >>>>> pacemaker-cluster-libs-1.1.10-1.el6.x86_64 >>>>> pacemaker-debuginfo-1.1.10-1.el6.x86_64 >>>>> pacemaker-libs-1.1.10-1.el6.x86_64 >>>>> >>>>> >>>>> On Sun, 22 Sep 2013 07:14:27 +0000, Alessandro Bono wrote: >>>>> >>>>>> Hi >>>>>> >>>>>> I have a problem with a cluster where pacemaker dies without logs or >>>>>> something >>>>>> Problem started when I switched to centos 6.4 and converted cluster from >>>>>> corosync to cman >>>>>> this happen typically when system is under high load >>>>>> tonight I received notification of drbd split brian and found on primary >>>>>> machine only these programs running >>>>>> >>>>>> 4420 ? Ss 1:29 /usr/libexec/pacemaker/lrmd >>>>>> 4422 ? Ss 0:42 /usr/libexec/pacemaker/pengine >>>>>> >>>>>> on secondary machine pacemaker is ok >>>>>> on logs only drbd disconnect and split brain notification >>>>>> I tried pacemaker 1.1.8 from centos and 1.1.9 and 1.1.10 from >>>>>> clusterlabs with same result >>>>>> >>>>>> howto debug this problem? >>>>>> /etc/sysconfig/pacemaker has lots configuration but not sure which one >>>>>> to use >>>>>> >>>>>> >>>>>> pacemaker configuration is: >>>>>> >>>>>> node ga1-ext \ >>>>>> attributes standby="off" >>>>>> node ga2-ext \ >>>>>> attributes standby="off" >>>>>> primitive ClusterIP ocf:heartbeat:IPaddr \ >>>>>> params ip="10.12.23.3" cidr_netmask="24" \ >>>>>> op monitor interval="30s" >>>>>> primitive SharedFS ocf:heartbeat:Filesystem \ >>>>>> params device="/dev/drbd/by-res/r0" directory="/shared" >>>>>> fstype="ext4" options="noatime,nobarrier" >>>>>> primitive dovecot lsb:dovecot >>>>>> primitive drbd0 ocf:linbit:drbd \ >>>>>> params drbd_resource="r0" \ >>>>>> op monitor interval="15s" >>>>>> primitive drbdlinks ocf:tummy:drbdlinks >>>>>> primitive mail ocf:heartbeat:MailTo \ >>>>>> params email="r...@company.com" subject="ga-ext cluster - " >>>>>> primitive mysql lsb:mysqld >>>>>> group service_group SharedFS drbdlinks ClusterIP mail mysql dovecot \ >>>>>> meta target-role="Started" >>>>>> ms ms_drbd0 drbd0 \ >>>>>> meta master-max="1" master-node-max="1" clone-max="2" >>>>>> clone-node-max="1" notify="true" >>>>>> colocation service_on_drbd inf: service_group ms_drbd0:Master >>>>>> order service_after_drbd inf: ms_drbd0:promote service_group:start >>>>>> property $id="cib-bootstrap-options" \ >>>>>> dc-version="1.1.10-1.el6-368c726" \ >>>>>> cluster-infrastructure="cman" \ >>>>>> expected-quorum-votes="2" \ >>>>>> stonith-enabled="false" \ >>>>>> no-quorum-policy="ignore" \ >>>>>> last-lrm-refresh="1379831462" \ >>>>>> maintenance-mode="false" >>>>>> rsc_defaults $id="rsc-options" \ >>>>>> resource-stickiness="100" >>>>>> >>>>>> >>>>>> cman configuration >>>>>> >>>>>> cat /etc/cluster/cluster.conf >>>>>> >>>>>> <cluster config_version="6" name="ga-ext_cluster"> >>>>>> <logging debug="off"/> >>>>>> <clusternodes> >>>>>> <clusternode name="ga1-ext" nodeid="1"> >>>>>> <fence> >>>>>> <method name="pcmk-redirect"> >>>>>> <device name="pcmk" port="ga1-ext"/> >>>>>> </method> >>>>>> </fence> >>>>>> </clusternode> >>>>>> <clusternode name="ga2-ext" nodeid="2"> >>>>>> <fence> >>>>>> <method name="pcmk-redirect"> >>>>>> <device name="pcmk" port="ga2-ext"/> >>>>>> </method> >>>>>> </fence> >>>>>> </clusternode> >>>>>> </clusternodes> >>>>>> <fencedevices> >>>>>> <fencedevice agent="fence_pcmk" name="pcmk"/> >>>>>> </fencedevices> >>>>>> </cluster> >>>>>> >>>>>> tell me you need other information >>>>>> >>>>>> thank you >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org >>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker >>>>>> >>>>>> Project Home: http://www.clusterlabs.org >>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf >>>>>> Bugs: http://bugs.clusterlabs.org >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org >>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker >>>>> >>>>> Project Home: http://www.clusterlabs.org >>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf >>>>> Bugs: http://bugs.clusterlabs.org >>>> >>>> >>>> >>>> _______________________________________________ >>>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org >>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker >>>> >>>> Project Home: http://www.clusterlabs.org >>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf >>>> Bugs: http://bugs.clusterlabs.org >>> >>> _______________________________________________ >>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org >>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker >>> >>> Project Home: http://www.clusterlabs.org >>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf >>> Bugs: http://bugs.clusterlabs.org >> >> >> >> _______________________________________________ >> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org >> http://oss.clusterlabs.org/mailman/listinfo/pacemaker >> >> Project Home: http://www.clusterlabs.org >> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf >> Bugs: http://bugs.clusterlabs.org > > _______________________________________________ > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org _______________________________________________ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org