Hello, I have two active/passive failover systems built with Corosync and DRBD. One system consists of two Debian servers, the other of two Ubuntu servers. The Debian pair provides web server failover and the Ubuntu pair provides database server failover.
I applied the same Pacemaker configuration to both. Everything works fine: failover works nicely, and so does the filesystem synchronization. On the Ubuntu pair, however, an error always appears after a couple of weeks or months. Pacemaker on ubuntu1 reports a different status than on ubuntu2: ubuntu1 assumes that ubuntu2 is down, while ubuntu2 assumes that something has happened to ubuntu1 but stays alive and takes over the resources. As a result the DRBD resource cannot be taken over, so no failover happens, and we have to restart the server manually because restarting Pacemaker and Corosync does not help. I have changed the Pacemaker configuration a couple of times, but the problem still exists. Has anyone experienced this? I use Ubuntu 14.04.1 LTS.

I got this error in apport.log:

ERROR: apport (pid 20361) Fri Dec 19 02:43:52 2014: executable: /usr/lib/pacemaker/lrmd (command line "/usr/lib/pacemaker/lrmd")
ERROR: apport (pid 20361) Fri Dec 19 02:43:52 2014: is_closing_session(): no DBUS_SESSION_BUS_ADDRESS in environment
ERROR: apport (pid 20361) Fri Dec 19 02:43:52 2014: wrote report /var/crash/_usr_lib_pacemaker_lrmd.0.crash

my pacemaker configuration:

node $id="1" db \
        attributes standby="off"
node $id="2" db2 \
        attributes standby="off"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.0.100" cidr_netmask="24" \
        op monitor interval="30s"
primitive DBase ocf:heartbeat:mysql \
        meta target-role="Started" \
        op start timeout="120s" interval="0" \
        op stop timeout="120s" interval="0" \
        op monitor interval="20s" timeout="30s"
primitive DbFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd0" directory="/sync" fstype="ext4" \
        op start timeout="60s" interval="0" \
        op stop timeout="180s" interval="0" \
        op monitor interval="60s" timeout="60s"
primitive Links lsb:drbdlinks
primitive r0 ocf:linbit:drbd \
        params drbd_resource="r0" \
        op monitor interval="29s" role="Master" \
        op start timeout="240s" interval="0" \
        op stop timeout="180s" interval="0" \
        op promote timeout="180s" interval="0" \
        op demote timeout="180s" interval="0" \
        op monitor interval="30s" role="Slave"
group DbServer ClusterIP DbFS Links DBase
ms ms_r0 r0 \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Master"
location prefer-db DbServer 50: db
colocation DbServer-with-ms_ro inf: DbServer ms_r0:Master
order DbServer-after-ms_ro inf: ms_r0:promote DbServer:start
property $id="cib-bootstrap-options" \
        dc-version="1.1.10-42f2063" \
        cluster-infrastructure="corosync" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1363370585"

my corosync config:

totem {
        version: 2
        token: 3000
        token_retransmits_before_loss_const: 10
        join: 60
        consensus: 3600
        vsftype: none
        max_messages: 20
        clear_node_high_bit: yes
        secauth: off
        threads: 0
        rrp_mode: none
        transport: udpu
        cluster_name: Dbcluster
}

nodelist {
        node {
                ring0_addr: db
                nodeid: 1
        }
        node {
                ring0_addr: db2
                nodeid: 2
        }
}

quorum {
        provider: corosync_votequorum
}

amf {
        mode: disabled
}

service {
        ver: 0
        name: pacemaker
}

aisexec {
        user: root
        group: root
}

logging {
        fileline: off
        to_stderr: yes
        to_logfile: yes
        logfile: /var/log/corosync/corosync.log
        to_syslog: no
        syslog_facility: daemon
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
                tags: enter|leave|trace1|trace2|trace3|trace4|trace6
        }
}
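One thing I noticed in the db2 log below is the warning "Corosync main process was not scheduled for 2856.1863 ms (threshold is 2400.0000 ms). Consider token timeout increase." The 2400 ms threshold appears to be 80% of my 3000 ms token. Would raising the token timeout along these lines be reasonable? (Just a sketch; 10000 ms is a value I picked, not something from the docs.)

totem {
        version: 2
        token: 10000
        token_retransmits_before_loss_const: 10
        ...
}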
reboot -f"; pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f"; local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f"; } startup { degr-wfc-timeout 120; } disk { on-io-error detach; } syncer { rate 100M; al-extents 257; } } resource r0 { protocol C; flexible-meta-disk internal; on db2 { address 192.168.0.10:7801; device /dev/drbd0 minor 0; disk /dev/sdb1; } on db { device /dev/drbd0 minor 0; disk /dev/db/sync; address 192.168.0.20:7801; } handlers { split-brain "/usr/lib/drbd/notify-split-brain.sh root"; } net { after-sb-0pri discard-younger-primary; #discard-zero-changes; after-sb-1pri discard-secondary; after-sb-2pri call-pri-lost-after-sb; } } I have no idea, how to solve this problem. Maybe someone can help me. best regards, ariee
Log from node db:

Dec 18 23:19:31 [1167] db corosync notice [TOTEM ] A processor failed, forming new configuration.
Dec 18 23:19:32 [1167] db corosync notice [TOTEM ] A new membership (ipubuntu1:1040) was formed. Members
Dec 18 23:19:35 [1167] db corosync notice [TOTEM ] A processor failed, forming new configuration.
Dec 18 23:19:39 [1167] db corosync notice [TOTEM ] A new membership (ipubuntu1:1044) was formed. Members left: 2
Dec 18 23:19:39 [1289] db cib: info: pcmk_cpg_membership: Left[1.0] cib.2
Dec 18 23:19:39 [1289] db cib: info: crm_update_peer_proc: pcmk_cpg_membership: Node db2[2] - corosync-cpg is now offline
Dec 18 23:19:39 [1294] db crmd: info: pcmk_cpg_membership: Left[1.0] crmd.2
Dec 18 23:19:39 [1289] db cib: info: pcmk_cpg_membership: Member[1.0] cib.1
Dec 18 23:19:39 [1294] db crmd: info: crm_update_peer_proc: pcmk_cpg_membership: Node db2[2] - corosync-cpg is now offline
Dec 18 23:19:39 [1294] db crmd: info: peer_update_callback: Client db2/peer now has status [offline] (DC=db2)
Dec 18 23:19:39 [1287] db pacemakerd: info: crm_cs_flush: Sent 0 CPG messages (1 remaining, last=7): Try again (6)
Dec 18 23:19:39 [1294] db crmd: notice: peer_update_callback: Our peer on the DC is dead
Dec 18 23:19:39 [1290] db stonith-ng: info: pcmk_cpg_membership: Left[1.0] stonith-ng.2
Dec 18 23:19:39 [1294] db crmd: info: pcmk_cpg_membership: Member[1.0] crmd.1
Dec 18 23:19:39 [1290] db stonith-ng: info: crm_update_peer_proc: pcmk_cpg_membership: Node db2[2] - corosync-cpg is now offline
Dec 18 23:19:39 [1294] db crmd: notice: do_state_transition: State transition S_NOT_DC -> S_ELECTION [ input=I_ELECTION cause=C_CRMD_STATUS_CALLBACK origin=peer_update_callback ]
Dec 18 23:19:39 [1294] db crmd: info: update_dc: Unset DC. Was db2
Dec 18 23:19:39 [1290] db stonith-ng: info: crm_cs_flush: Sent 0 CPG messages (1 remaining, last=2): Try again (6)
Dec 18 23:19:39 [1290] db stonith-ng: info: pcmk_cpg_membership: Member[1.0] stonith-ng.1
Dec 18 23:19:39 [1294] db crmd: info: crm_cs_flush: Sent 0 CPG messages (1 remaining, last=14): Try again (6)
Dec 18 23:19:39 [1167] db corosync notice [QUORUM] This node is within the non-primary component and will NOT provide any services.
Dec 18 23:19:39 [1287] db pacemakerd: notice: pcmk_quorum_notification: Membership 1044: quorum lost (1)
Dec 18 23:19:39 [1294] db crmd: notice: pcmk_quorum_notification: Membership 1044: quorum lost (1)
Dec 18 23:19:39 [1167] db corosync notice [QUORUM] Members[1]: 1
Dec 18 23:19:39 [1167] db corosync notice [MAIN ] Completed service synchronization, ready to provide service.
Dec 18 23:19:39 [1294] db crmd: notice: crm_update_peer_state: pcmk_quorum_notification: Node db2[2] - state is now lost (was member)
Dec 18 23:19:39 [1287] db pacemakerd: notice: crm_update_peer_state: pcmk_quorum_notification: Node db2[2] - state is now lost (was member)
Dec 18 23:19:39 [1294] db crmd: info: peer_update_callback: db2 is now lost (was member)
Dec 18 23:19:39 [1294] db crmd: info: crm_cs_flush: Sent 2 CPG messages (0 remaining, last=16): OK (1)
Dec 18 23:19:39 [1294] db crmd: info: do_log: FSA: Input I_ELECTION_DC from do_election_check() received in state S_ELECTION
Dec 18 23:19:39 [1294] db crmd: notice: do_state_transition: State transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=do_election_check ]
Dec 18 23:19:39 [1294] db crmd: info: do_te_control: Registering TE UUID: 3f9f2c50-dc16-44bf-94cd-30e05a0e1df6
Dec 18 23:19:39 [1294] db crmd: info: set_graph_functions: Setting custom graph functions
Dec 18 23:19:39 [1293] db pengine: info: crm_client_new: Connecting 0x7f53a8043710 for uid=109 gid=118 pid=1294 id=09b157e6-3fdd-4a6d-8b51-0449cb81d6d6
Dec 18 23:19:39 [1294] db crmd: info: do_dc_takeover: Taking over DC status for this partition
Dec 18 23:19:39 [1289] db cib: info: cib_process_readwrite: We are now in R/W mode
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_master operation for section 'all': OK (rc=0, origin=local/crmd/28, version=0.106.18)
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_modify operation for section cib: OK (rc=0, origin=local/crmd/29, version=0.106.18)
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_query operation for section //cib/configuration/crm_config//cluster_property_set//nvpair[@name='dc-version']: OK (rc=0, origin=local/crmd/30, version=0.106.18)
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_modify operation for section crm_config: OK (rc=0, origin=local/crmd/31, version=0.106.18)
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_query operation for section //cib/configuration/crm_config//cluster_property_set//nvpair[@name='cluster-infrastructure']: OK (rc=0, origin=local/crmd/32, version=0.106.18)
Dec 18 23:19:39 [1294] db crmd: info: join_make_offer: Making join offers based on membership 1044
Dec 18 23:19:39 [1294] db crmd: info: join_make_offer: join-1: Sending offer to db
Dec 18 23:19:39 [1294] db crmd: info: crm_update_peer_join: join_make_offer: Node db[1] - join-1 phase 0 -> 1
Dec 18 23:19:39 [1294] db crmd: info: do_dc_join_offer_all: join-1: Waiting on 1 outstanding join acks
Dec 18 23:19:39 [1294] db crmd: info: update_dc: Set DC to db (3.0.7)
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_modify operation for section crm_config: OK (rc=0, origin=local/crmd/33, version=0.106.18)
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_query operation for section 'all': OK (rc=0, origin=local/crmd/35, version=0.106.18)
Dec 18 23:19:39 [1294] db crmd: info: crm_update_peer_join: do_dc_join_filter_offer: Node db[1] - join-1 phase 1 -> 2
Dec 18 23:19:39 [1294] db crmd: info: crm_update_peer_expected: do_dc_join_filter_offer: Node db[1] - expected state is now member
Dec 18 23:19:39 [1294] db crmd: info: do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ]
Dec 18 23:19:39 [1294] db crmd: info: crmd_join_phase_log: join-1: db2=none
Dec 18 23:19:39 [1294] db crmd: info: crmd_join_phase_log: join-1: db=integrated
Dec 18 23:19:39 [1294] db crmd: info: do_dc_join_finalize: join-1: Syncing our CIB to the rest of the cluster
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_sync operation for section 'all': OK (rc=0, origin=local/crmd/36, version=0.106.18)
Dec 18 23:19:39 [1294] db crmd: info: crm_update_peer_join: finalize_join_for: Node db[1] - join-1 phase 2 -> 3
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_modify operation for section nodes: OK (rc=0, origin=local/crmd/37, version=0.106.18)
Dec 18 23:19:39 [1294] db crmd: info: services_os_action_execute: Managed Filesystem_meta-data_0 process 8975 exited with rc=0
Dec 18 23:19:39 [1294] db crmd: info: services_os_action_execute: Managed IPaddr2_meta-data_0 process 8980 exited with rc=0
Dec 18 23:19:39 [1294] db crmd: info: services_os_action_execute: Managed mysql_meta-data_0 process 8984 exited with rc=0
Dec 18 23:19:39 [1294] db crmd: info: crm_update_peer_join: do_dc_join_ack: Node db[1] - join-1 phase 3 -> 4
Dec 18 23:19:39 [1294] db crmd: info: do_dc_join_ack: join-1: Updating node state to member for db
Dec 18 23:19:39 [1294] db crmd: info: erase_status_tag: Deleting xpath: //node_state[@uname='db']/lrm
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_delete operation for section //node_state[@uname='db']/lrm: OK (rc=0, origin=local/crmd/38, version=0.106.19)
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=local/crmd/39, version=0.106.20)
Dec 18 23:19:39 [1294] db crmd: info: do_state_transition: State transition S_FINALIZE_JOIN -> S_POLICY_ENGINE [ input=I_FINALIZED cause=C_FSA_INTERNAL origin=check_join_state ]
Dec 18 23:19:39 [1294] db crmd: info: abort_transition_graph: do_te_invoke:151 - Triggered transition abort (complete=1) : Peer Cancelled
Dec 18 23:19:39 [1292] db attrd: notice: attrd_local_callback: Sending full refresh (origin=crmd)
Dec 18 23:19:39 [1292] db attrd: notice: attrd_trigger_update: Sending flush op to all hosts for: master-r0 (10000)
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_modify operation for section nodes: OK (rc=0, origin=local/crmd/40, version=0.106.20)
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=local/crmd/41, version=0.106.21)
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_modify operation for section cib: OK (rc=0, origin=local/crmd/42, version=0.106.22)
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_query operation for section 'all': OK (rc=0, origin=local/crmd/43, version=0.106.22)
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_query operation for section //cib/status//node_state[@id='1']//transient_attributes//nvpair[@name='master-r0']: OK (rc=0, origin=local/attrd/12, version=0.106.22)
Dec 18 23:19:39 [1292] db attrd: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=local/attrd/13, version=0.106.22)
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_query operation for section //cib/status//node_state[@id='1']//transient_attributes//nvpair[@name='probe_complete']: OK (rc=0, origin=local/attrd/14, version=0.106.22)
Dec 18 23:19:39 [1289] db cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=local/attrd/15, version=0.106.22)
Dec 18 23:19:40 [1289] db cib: info: cib_process_request: Completed cib_query operation for section 'all': OK (rc=0, origin=local/crmd/44, version=0.106.22)
Dec 18 23:19:40 [1293] db pengine: notice: unpack_config: On loss of CCM Quorum: Ignore
Dec 18 23:19:40 [1293] db pengine: info: determine_online_status: Node db is online
Dec 18 23:19:40 [1293] db pengine: info: clone_print: Master/Slave Set: ms_r0 [r0]
Dec 18 23:19:40 [1293] db pengine: info: short_print: Masters: [ db ]
Dec 18 23:19:40 [1293] db pengine: info: short_print: Stopped: [ db2 ]
Dec 18 23:19:40 [1293] db pengine: info: group_print: Resource Group: DbServer
Dec 18 23:19:40 [1293] db pengine: info: native_print: ClusterIP (ocf::heartbeat:IPaddr2): Started db
Dec 18 23:19:40 [1293] db pengine: info: native_print: DbFS (ocf::heartbeat:Filesystem): Started db
Dec 18 23:19:40 [1293] db pengine: info: native_print: Links (lsb:drbdlinks): Started db
Dec 18 23:19:40 [1293] db pengine: info: native_print: DBase (ocf::heartbeat:mysql): Started db
Dec 18 23:19:40 [1293] db pengine: info: native_color: Resource r0:1 cannot run anywhere
Dec 18 23:19:40 [1293] db pengine: info: master_color: Promoting r0:0 (Master db)
Dec 18 23:19:40 [1293] db pengine: info: master_color: ms_r0: Promoted 1 instances of a possible 1 to master
Dec 18 23:19:40 [1293] db pengine: info: LogActions: Leave r0:0 (Master db)
Dec 18 23:19:40 [1293] db pengine: info: LogActions: Leave r0:1 (Stopped)
Dec 18 23:19:40 [1293] db pengine: info: LogActions: Leave ClusterIP (Started db)
Dec 18 23:19:40 [1293] db pengine: info: LogActions: Leave DbFS (Started db)
Dec 18 23:19:40 [1293] db pengine: info: LogActions: Leave Links (Started db)
Dec 18 23:19:40 [1293] db pengine: info: LogActions: Leave DBase (Started db)
Dec 18 23:19:40 [1293] db pengine: notice: process_pe_message: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-1202.bz2
Dec 18 23:19:40 [1294] db crmd: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Dec 18 23:19:40 [1294] db crmd: info: do_te_invoke: Processing graph 0 (ref=pe_calc-dc-1418941180-20) derived from /var/lib/pacemaker/pengine/pe-input-1202.bz2
Dec 18 23:19:40 [1294] db crmd: notice: run_graph: Transition 0 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-1202.bz2): Complete
Dec 18 23:19:40 [1294] db crmd: info: do_log: FSA: Input I_TE_SUCCESS from notify_crmd() received in state S_TRANSITION_ENGINE
Dec 18 23:19:40 [1294] db crmd: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Dec 18 23:19:57 [1289] db cib: info: crm_client_new: Connecting 0x7ffbf6cf2f40 for uid=0 gid=0 pid=9069 id=c56388b9-8031-4304-be22-0b1462c1c814
Dec 18 23:19:57 [1289] db cib: info: cib_process_request: Completed cib_query operation for section nodes: OK (rc=0, origin=local/crm_attribute/2, version=0.106.22)
Dec 18 23:19:57 [1289] db cib: info: crm_client_destroy: Destroying 0 events
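As the log above shows, db sees db2 leave the membership, loses quorum, and takes over the DC role, while db2 (log below) still considers both nodes online. When the cluster is in this state, the commands I would run on each node to compare the two views are the standard tools, nothing specific to my setup:

# run on both nodes and compare the output
corosync-quorumtool -s      # membership and quorum as corosync sees it
crm_mon -1                  # node and resource status as pacemaker sees it
cat /proc/drbd              # connection state as DRBD sees it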
Log from node db2:

Dec 18 23:19:29 [1011] db2 corosync warning [MAIN ] Corosync main process was not scheduled for 2856.1863 ms (threshold is 2400.0000 ms). Consider token timeout increase.
Dec 18 23:19:31 [1011] db2 corosync notice [TOTEM ] A processor failed, forming new configuration.
Dec 18 23:19:31 [1011] db2 corosync notice [TOTEM ] A new membership (ipubuntu1:1040) was formed. Members
Dec 18 23:19:53 [1106] db2 lrmd: warning: child_timeout_callback: r0_monitor_30000 process (PID 23890) timed out
Dec 18 23:19:53 [1106] db2 lrmd: warning: operation_finished: r0_monitor_30000:23890 - timed out after 20000ms
Dec 18 23:19:53 [1109] db2 crmd: error: process_lrm_event: LRM operation r0_monitor_30000 (212) Timed Out (timeout=20000ms)
Dec 18 23:19:53 [1104] db2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=local/crmd/3152, version=0.106.19)
Dec 18 23:19:53 [1104] db2 cib: info: crm_cs_flush: Sent 0 CPG messages (1 remaining, last=134): Try again (6)
Dec 18 23:19:53 [1109] db2 crmd: info: abort_transition_graph: process_graph_event:520 - Triggered transition abort (complete=1, node=db2, tag=lrm_rsc_op, id=r0_last_failure_0, magic=2:1;10:13:0:c702936a-46bc-42de-8439-3694fb05b2b1, cib=0.106.19) : Old event
Dec 18 23:19:53 [1109] db2 crmd: warning: update_failcount: Updating failcount for r0 on db2 after failed monitor: rc=1 (update=value++, time=1418941193)
Dec 18 23:19:53 [1109] db2 crmd: info: process_graph_event: Detected action (13.10) r0_monitor_30000.212=unknown error: failed
Dec 18 23:19:53 [1109] db2 crmd: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Dec 18 23:19:53 [1107] db2 attrd: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-r0 (1)
Dec 18 23:19:53 [1104] db2 cib: info: cib_process_request: Completed cib_query operation for section 'all': OK (rc=0, origin=local/crmd/3153, version=0.106.19)
Dec 18 23:19:53 [1108] db2 pengine: notice: unpack_config: On loss of CCM Quorum: Ignore
Dec 18 23:19:53 [1108] db2 pengine: info: determine_online_status: Node db2 is online
Dec 18 23:19:53 [1108] db2 pengine: info: determine_online_status: Node db is online
Dec 18 23:19:53 [1108] db2 pengine: warning: unpack_rsc_op: Processing failed op monitor for r0:0 on db2: unknown error (1)
Dec 18 23:19:53 [1108] db2 pengine: info: clone_print: Master/Slave Set: ms_r0 [r0]
Dec 18 23:19:53 [1108] db2 pengine: info: native_print: r0 (ocf::linbit:drbd): Slave db2 FAILED
Dec 18 23:19:53 [1108] db2 pengine: info: short_print: Masters: [ db ]
Dec 18 23:19:53 [1108] db2 pengine: info: group_print: Resource Group: DbServer
Dec 18 23:19:53 [1108] db2 pengine: info: native_print: ClusterIP (ocf::heartbeat:IPaddr2): Started db
Dec 18 23:19:53 [1108] db2 pengine: info: native_print: DbFS (ocf::heartbeat:Filesystem): Started db
Dec 18 23:19:53 [1108] db2 pengine: info: native_print: Links (lsb:drbdlinks): Started db
Dec 18 23:19:53 [1108] db2 pengine: info: native_print: DBase (ocf::heartbeat:mysql): Started db
Dec 18 23:19:53 [1108] db2 pengine: info: master_color: Promoting r0:1 (Master db)
Dec 18 23:19:53 [1108] db2 pengine: info: master_color: ms_r0: Promoted 1 instances of a possible 1 to master
Dec 18 23:19:53 [1108] db2 pengine: info: RecurringOp: Start recurring monitor (30s) for r0:0 on db2
Dec 18 23:19:53 [1108] db2 pengine: info: RecurringOp: Start recurring monitor (30s) for r0:0 on db2
Dec 18 23:19:53 [1108] db2 pengine: notice: LogActions: Recover r0:0 (Slave db2)
Dec 18 23:19:53 [1108] db2 pengine: info: LogActions: Leave r0:1 (Master db)
Dec 18 23:19:53 [1108] db2 pengine: info: LogActions: Leave ClusterIP (Started db)
Dec 18 23:19:53 [1108] db2 pengine: info: LogActions: Leave DbFS (Started db)
Dec 18 23:19:53 [1108] db2 pengine: info: LogActions: Leave Links (Started db)
Dec 18 23:19:53 [1108] db2 pengine: info: LogActions: Leave DBase (Started db)
Dec 18 23:19:53 [1109] db2 crmd: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Dec 18 23:19:53 [1109] db2 crmd: info: do_te_invoke: Processing graph 3033 (ref=pe_calc-dc-1418941193-3212) derived from /var/lib/pacemaker/pengine/pe-input-643.bz2
Dec 18 23:19:53 [1108] db2 pengine: notice: process_pe_message: Calculated Transition 3033: /var/lib/pacemaker/pengine/pe-input-643.bz2
Dec 18 23:19:53 [1109] db2 crmd: notice: te_rsc_command: Initiating action 58: notify r0_pre_notify_stop_0 on db2 (local)
Dec 18 23:19:53 [1109] db2 crmd: info: do_lrm_rsc_op: Performing key=58:3033:0:c702936a-46bc-42de-8439-3694fb05b2b1 op=r0_notify_0
Dec 18 23:19:53 [1106] db2 lrmd: info: log_execute: executing - rsc:r0 action:notify call_id:216
Dec 18 23:19:53 [1104] db2 cib: info: cib_process_request: Completed cib_query operation for section //cib/status//node_state[@id='2']//transient_attributes//nvpair[@name='fail-count-r0']: No such device or address (rc=-6, origin=local/attrd/45, version=0.106.19)
Dec 18 23:19:53 [1104] db2 cib: info: cib_process_request: Completed cib_query operation for section /cib: OK (rc=0, origin=local/attrd/46, version=0.106.19)
Dec 18 23:19:53 [1107] db2 attrd: notice: attrd_perform_update: Sent update 47: fail-count-r0=1
Dec 18 23:19:53 [1107] db2 attrd: notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-r0 (1418941193)
Dec 18 23:19:53 [1104] db2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=local/attrd/47, version=0.106.20)
Dec 18 23:19:53 [1104] db2 cib: info: cib_process_request: Completed cib_query operation for section //cib/status//node_state[@id='2']//transient_attributes//nvpair[@name='last-failure-r0']: No such device or address (rc=-6, origin=local/attrd/48, version=0.106.20)
Dec 18 23:19:53 [1109] db2 crmd: notice: te_rsc_command: Initiating action 59: notify r0_pre_notify_stop_0 on db
Dec 18 23:19:53 [1109] db2 crmd: info: crm_cs_flush: Sent 0 CPG messages (1 remaining, last=99): Try again (6)
Dec 18 23:19:53 [1109] db2 crmd: info: abort_transition_graph: te_update_diff:172 - Triggered transition abort (complete=0, node=db2, tag=nvpair, id=status-2-fail-count-r0, name=fail-count-r0, value=1, magic=NA, cib=0.106.20) : Transient attribute: update
Dec 18 23:19:53 [1104] db2 cib: info: cib_process_request: Completed cib_query operation for section /cib: OK (rc=0, origin=local/attrd/49, version=0.106.20)
Dec 18 23:19:53 [1107] db2 attrd: notice: attrd_perform_update: Sent update 50: last-failure-r0=1418941193
Dec 18 23:19:53 [1109] db2 crmd: info: abort_transition_graph: te_update_diff:172 - Triggered transition abort (complete=0, node=db2, tag=nvpair, id=status-2-last-failure-r0, name=last-failure-r0, value=1418941193, magic=NA, cib=0.106.21) : Transient attribute: update
Dec 18 23:19:53 [1104] db2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=local/attrd/50, version=0.106.21)
Dec 18 23:19:53 [1106] db2 lrmd: info: log_finished: finished - rsc:r0 action:notify call_id:216 pid:23916 exit-code:0 exec-time:98ms queue-time:0ms
Dec 18 23:19:53 [1109] db2 crmd: info: match_graph_event: Action r0_notify_0 (58) confirmed on db2 (rc=0)
Dec 18 23:19:53 [1109] db2 crmd: notice: process_lrm_event: LRM operation r0_notify_0 (call=216, rc=0, cib-update=0, confirmed=true) ok
Dec 18 23:19:53 [1104] db2 cib: info: crm_cs_flush: Sent 0 CPG messages (3 remaining, last=134): Try again (6)
Dec 18 23:19:53 [1109] db2 crmd: info: crm_cs_flush: Sent 0 CPG messages (1 remaining, last=99): Try again (6)
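Since restarting pacemaker and corosync does not help and we currently end up rebooting, I also looked at the manual split-brain recovery described in the DRBD documentation. My understanding is the sequence below (a sketch only; which node is the "victim" whose changes get discarded has to be decided case by case):

# on the node whose changes should be discarded (the split-brain victim):
drbdadm secondary r0
drbdadm connect --discard-my-data r0

# on the surviving node, if its resource is in StandAlone state:
drbdadm connect r0

Is this the right way to recover without a reboot, or is the real fix to stop the cluster from splitting in the first place (fencing / token timeout, as asked above)?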