Thanks for the reply! Yes, I have checked that my LSB scripts are
compliant. In case it provides any insight, here is a clip from
/var/log/messages on genome-ldap2 when genome-ldap1 goes down (with the
order constraint in place):
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info:
handle_shutdown_request: Creating shutdown request for genome-ldap1
(state=S_IDLE)
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: abort_transition_graph:
te_update_diff:146 - Triggered transition abort (complete=1,
tag=transient_attributes, id=genome-ldap1, magic=NA, cib=0.117.3) :
Transient attribute: update
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_state_transition:
State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC
cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_state_transition:
All 2 cluster nodes are eligible to run resources.
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_pe_invoke: Query
126: Requesting the current CIB: S_POLICY_ENGINE
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_pe_invoke_callback:
Invoking the PE: query=126, ref=pe_calc-dc-1268146762-101, seq=304,
quorate=1
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice: unpack_config: On
loss of CCM Quorum: Ignore
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: info: unpack_config: Node
scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: info:
determine_online_status: Node genome-ldap2 is online
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: info:
determine_online_status: Node genome-ldap1 is shutting down
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice: native_print:
LDAP-IP (ocf::heartbeat:IPaddr2): Started genome-ldap1
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice: clone_print:
Clone Set: LDAP-clone
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice: short_print:
Started: [ genome-ldap2 genome-ldap1 ]
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: info: get_failcount:
LDAP-clone has failed 1 times on genome-ldap2
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice:
common_apply_stickiness: LDAP-clone can fail 999999 more times on
genome-ldap2 before being forced off
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: WARN: native_color:
Resource LDAP:1 cannot run anywhere
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice: RecurringOp:
Start recurring monitor (30s) for LDAP-IP on genome-ldap2
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: info: stage6: Scheduling
Node genome-ldap1 for shutdown
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice: LogActions: Move
resource LDAP-IP (Started genome-ldap1 -> genome-ldap2)
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice: LogActions:
Restart resource LDAP:0 (Started genome-ldap2)
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: notice: LogActions: Stop
resource LDAP:1 (genome-ldap1)
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_state_transition:
State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [
input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: unpack_graph: Unpacked
transition 65: 12 actions in 12 synapses
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_te_invoke:
Processing graph 65 (ref=pe_calc-dc-1268146762-101) derived from
/var/lib/pengine/pe-warn-98.bz2
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_pseudo_action:
Pseudo action 16 fired and confirmed
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_rsc_command:
Initiating action 13: stop LDAP:1_stop_0 on genome-ldap1
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: WARN: process_pe_message:
Transition 65: WARNINGs found during PE processing. PEngine Input stored
in: /var/lib/pengine/pe-warn-98.bz2
Mar 9 06:59:22 genome-ldap2 pengine: [2037]: info: process_pe_message:
Configuration WARNINGs found during PE processing. Please run
"crm_verify -L" to identify issues.
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: match_graph_event:
Action LDAP:1_stop_0 (13) confirmed on genome-ldap1 (rc=0)
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_pseudo_action:
Pseudo action 17 fired and confirmed
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_rsc_command:
Initiating action 8: stop LDAP-IP_stop_0 on genome-ldap1
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: match_graph_event:
Action LDAP-IP_stop_0 (8) confirmed on genome-ldap1 (rc=0)
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_rsc_command:
Initiating action 9: start LDAP-IP_start_0 on genome-ldap2 (local)
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_lrm_rsc_op:
Performing key=9:65:0:0cc80735-d478-48c0-8260-02b627bed719
op=LDAP-IP_start_0 )
Mar 9 06:59:22 genome-ldap2 lrmd: [2035]: info: rsc:LDAP-IP:12: start
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_pseudo_action:
Pseudo action 4 fired and confirmed
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_crm_command:
Executing crm-event (20): do_shutdown on genome-ldap1
Mar 9 06:59:22 genome-ldap2 IPaddr2[4277]: INFO: ip -f inet addr add
10.1.1.83/16 brd 10.1.255.255 dev eth0
Mar 9 06:59:22 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP-IP:start:stderr) 2010/03/09_06:59:22 INFO: ip -f inet addr add
10.1.1.83/16 brd 10.1.255.255 dev eth0
Mar 9 06:59:22 genome-ldap2 IPaddr2[4277]: INFO: ip link set eth0 up
Mar 9 06:59:22 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP-IP:start:stderr) 2010/03/09_06:59:22 INFO: ip link set eth0 up
Mar 9 06:59:22 genome-ldap2 IPaddr2[4277]: INFO:
/usr/lib64/heartbeat/send_arp -i 200 -r 5 -p
/var/run/heartbeat/rsctmp/send_arp/send_arp-10.1.1.83 eth0 10.1.1.83
auto not_used not_used
Mar 9 06:59:22 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP-IP:start:stderr) 2010/03/09_06:59:22 INFO:
/usr/lib64/heartbeat/send_arp -i 200 -r 5 -p
/var/run/heartbeat/rsctmp/send_arp/send_arp-10.1.1.83 eth0 10.1.1.83
auto not_used not_used
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM
operation LDAP-IP_start_0 (call=12, rc=0, cib-update=127, confirmed=true) ok
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: match_graph_event:
Action LDAP-IP_start_0 (9) confirmed on genome-ldap2 (rc=0)
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_rsc_command:
Initiating action 10: monitor LDAP-IP_monitor_30000 on genome-ldap2 (local)
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_lrm_rsc_op:
Performing key=10:65:0:0cc80735-d478-48c0-8260-02b627bed719
op=LDAP-IP_monitor_30000 )
Mar 9 06:59:22 genome-ldap2 lrmd: [2035]: info: rsc:LDAP-IP:13: monitor
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_pseudo_action:
Pseudo action 14 fired and confirmed
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: te_rsc_command:
Initiating action 12: start LDAP:0_start_0 on genome-ldap2 (local)
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: do_lrm_rsc_op:
Performing key=12:65:0:0cc80735-d478-48c0-8260-02b627bed719
op=LDAP:0_start_0 )
Mar 9 06:59:22 genome-ldap2 lrmd: [2035]: info: rsc:LDAP:0:14: start
Mar 9 06:59:22 genome-ldap2 lrmd: [4333]: WARN: For LSB init script, no
additional parameters are needed.
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM
operation LDAP-IP_monitor_30000 (call=13, rc=0, cib-update=128,
confirmed=false) ok
Mar 9 06:59:22 genome-ldap2 crmd: [2038]: info: match_graph_event:
Action LDAP-IP_monitor_30000 (10) confirmed on genome-ldap2 (rc=0)
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout) Checking configuration files for slapd:
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stderr) config file testing succeeded
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout) [
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout) OK ]
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout)
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout)
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout) Starting slapd:
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout) [
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout) FAILED
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout) ]
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout)
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:start:stdout)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM
operation LDAP:0_start_0 (call=14, rc=1, cib-update=129, confirmed=true)
unknown error
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: WARN: status_from_rc: Action
12 (LDAP:0_start_0) on genome-ldap2 failed (target: 0 vs. rc: 1): Error
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: WARN: update_failcount:
Updating failcount for LDAP:0 on genome-ldap2 after failed start: rc=1
(update=INFINITY, time=1268146763)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: abort_transition_graph:
match_graph_event:272 - Triggered transition abort (complete=0,
tag=lrm_rsc_op, id=LDAP:0_start_0,
magic=0:1;12:65:0:0cc80735-d478-48c0-8260-02b627bed719, cib=0.117.8) :
Event failed
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: update_abort_priority:
Abort priority upgraded from 0 to 1
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: update_abort_priority:
Abort action done superceeded by restart
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: match_graph_event:
Action LDAP:0_start_0 (12) confirmed on genome-ldap2 (rc=4)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_pseudo_action:
Pseudo action 15 fired and confirmed
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: run_graph:
====================================================
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: notice: run_graph: Transition
65 (Complete=11, Pending=0, Fired=0, Skipped=1, Incomplete=0,
Source=/var/lib/pengine/pe-warn-98.bz2): Stopped
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_graph_trigger:
Transition 65 is now complete
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition:
State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [
input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ]
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition:
All 2 cluster nodes are eligible to run resources.
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke: Query
130: Requesting the current CIB: S_POLICY_ENGINE
Mar 9 06:59:23 genome-ldap2 attrd: [2036]: info: attrd_trigger_update:
Sending flush op to all hosts for: fail-count-LDAP:0 (INFINITY)
Mar 9 06:59:23 genome-ldap2 cib: [2034]: info:
cib_process_shutdown_req: Shutdown REQ from genome-ldap1
Mar 9 06:59:23 genome-ldap2 cib: [2034]: info: cib_process_request:
Operation complete: op cib_shutdown_req for section 'all'
(origin=genome-ldap1/genome-ldap1/(null), version=0.117.8): ok (rc=0)
Mar 9 06:59:23 genome-ldap2 attrd: [2036]: info: attrd_perform_update:
Sent update 31: fail-count-LDAP:0=INFINITY
Mar 9 06:59:23 genome-ldap2 attrd: [2036]: info: attrd_trigger_update:
Sending flush op to all hosts for: last-failure-LDAP:0 (1268146763)
Mar 9 06:59:23 genome-ldap2 attrd: [2036]: info: attrd_perform_update:
Sent update 33: last-failure-LDAP:0=1268146763
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke_callback:
Invoking the PE: query=130, ref=pe_calc-dc-1268146763-108, seq=304,
quorate=1
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: unpack_config: On
loss of CCM Quorum: Ignore
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: unpack_config: Node
scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: abort_transition_graph:
te_update_diff:146 - Triggered transition abort (complete=1,
tag=transient_attributes, id=genome-ldap2, magic=NA, cib=0.117.9) :
Transient attribute: update
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info:
determine_online_status: Node genome-ldap2 is online
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: abort_transition_graph:
te_update_diff:146 - Triggered transition abort (complete=1,
tag=transient_attributes, id=genome-ldap2, magic=NA, cib=0.117.10) :
Transient attribute: update
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: unpack_rsc_op:
Processing failed op LDAP:0_start_0 on genome-ldap2: unknown error (1)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke: Query
131: Requesting the current CIB: S_POLICY_ENGINE
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info:
determine_online_status: Node genome-ldap1 is shutting down
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke: Query
132: Requesting the current CIB: S_POLICY_ENGINE
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: native_print:
LDAP-IP (ocf::heartbeat:IPaddr2): Started genome-ldap2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: clone_print:
Clone Set: LDAP-clone
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: native_print:
LDAP:0 (lsb:ldap): Started genome-ldap2 FAILED
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: short_print:
Stopped: [ LDAP:1 ]
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: get_failcount:
LDAP-clone has failed 1 times on genome-ldap2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice:
common_apply_stickiness: LDAP-clone can fail 999999 more times on
genome-ldap2 before being forced off
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color:
Resource LDAP:1 cannot run anywhere
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: RecurringOp:
Start recurring monitor (10s) for LDAP:0 on genome-ldap2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: stage6: Scheduling
Node genome-ldap1 for shutdown
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Leave
resource LDAP-IP (Started genome-ldap2)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions:
Recover resource LDAP:0 (Started genome-ldap2)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Leave
resource LDAP:1 (Stopped)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke_callback:
Invoking the PE: query=132, ref=pe_calc-dc-1268146763-109, seq=304,
quorate=1
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: handle_response:
pe_calc calculation pe_calc-dc-1268146763-108 is obsolete
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: process_pe_message:
Transition 66: WARNINGs found during PE processing. PEngine Input stored
in: /var/lib/pengine/pe-warn-99.bz2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: process_pe_message:
Configuration WARNINGs found during PE processing. Please run
"crm_verify -L" to identify issues.
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: unpack_config: On
loss of CCM Quorum: Ignore
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: unpack_config: Node
scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info:
determine_online_status: Node genome-ldap2 is online
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: unpack_rsc_op:
Processing failed op LDAP:0_start_0 on genome-ldap2: unknown error (1)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info:
determine_online_status: Node genome-ldap1 is shutting down
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: native_print:
LDAP-IP (ocf::heartbeat:IPaddr2): Started genome-ldap2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: clone_print:
Clone Set: LDAP-clone
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: native_print:
LDAP:0 (lsb:ldap): Started genome-ldap2 FAILED
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: short_print:
Stopped: [ LDAP:1 ]
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: get_failcount:
LDAP-clone has failed 1000000 times on genome-ldap2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN:
common_apply_stickiness: Forcing LDAP-clone away from genome-ldap2 after
1000000 failures (max=1000000)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info:
native_merge_weights: LDAP-clone: Rolling back scores from LDAP-IP
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color:
Resource LDAP:1 cannot run anywhere
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color:
Resource LDAP:0 cannot run anywhere
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color:
Resource LDAP-IP cannot run anywhere
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: stage6: Scheduling
Node genome-ldap1 for shutdown
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Stop
resource LDAP-IP (genome-ldap2)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Stop
resource LDAP:0 (genome-ldap2)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Leave
resource LDAP:1 (Stopped)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition:
State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [
input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: unpack_graph: Unpacked
transition 67: 6 actions in 6 synapses
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_te_invoke:
Processing graph 67 (ref=pe_calc-dc-1268146763-109) derived from
/var/lib/pengine/pe-warn-100.bz2
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_pseudo_action:
Pseudo action 10 fired and confirmed
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_crm_command:
Executing crm-event (14): do_shutdown on genome-ldap1
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_rsc_command:
Initiating action 2: stop LDAP:0_stop_0 on genome-ldap2 (local)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_lrm_rsc_op:
Performing key=2:67:0:0cc80735-d478-48c0-8260-02b627bed719
op=LDAP:0_stop_0 )
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: rsc:LDAP:0:15: stop
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM
operation LDAP:0_monitor_10000 (call=11, status=1, cib-update=0,
confirmed=true) Cancelled
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] CLM
CONFIGURATION CHANGE
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] New Configuration:
Mar 9 06:59:23 genome-ldap2 cib: [2034]: notice: ais_dispatch:
Membership 308: quorum lost
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: notice: ais_dispatch:
Membership 308: quorum lost
Mar 9 06:59:23 genome-ldap2 lrmd: [4390]: WARN: For LSB init script, no
additional parameters are needed.
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] r(0)
ip(10.1.1.85)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: process_pe_message:
Transition 67: WARNINGs found during PE processing. PEngine Input stored
in: /var/lib/pengine/pe-warn-100.bz2
Mar 9 06:59:23 genome-ldap2 cib: [2034]: info: crm_update_peer: Node
genome-ldap1: id=1409351946 state=lost (new) addr=r(0) ip(10.1.1.84)
votes=1 born=304 seen=304 proc=00000000000000000000000000013312
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: ais_status_callback:
status: genome-ldap1 is now lost (was member)
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] Members Left:
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: process_pe_message:
Configuration WARNINGs found during PE processing. Please run
"crm_verify -L" to identify issues.
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: crm_update_peer: Node
genome-ldap1: id=1409351946 state=lost (new) addr=r(0) ip(10.1.1.84)
votes=1 born=304 seen=304 proc=00000000000000000000000000013312
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] r(0)
ip(10.1.1.84)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: erase_node_from_join:
Removed node genome-ldap1 from join calculations: welcomed=0 itegrated=0
finalized=0 confirmed=1
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] Members Joined:
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: crm_update_quorum:
Updating quorum status to false (call=135)
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:stop:stdout) Stopping slapd:
Mar 9 06:59:23 genome-ldap2 cib: [2034]: info: cib_process_request:
Operation complete: op cib_modify for section nodes
(origin=local/crmd/133, version=0.117.10): ok (rc=0)
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [pcmk ] notice:
pcmk_peer_update: Transitional membership event on ring 308: memb=1,
new=0, lost=1
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [pcmk ] info:
pcmk_peer_update: memb: genome-ldap2 1426129162
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [pcmk ] info:
pcmk_peer_update: lost: genome-ldap1 1409351946
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] CLM
CONFIGURATION CHANGE
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] New Configuration:
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] r(0)
ip(10.1.1.85)
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] Members Left:
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [CLM ] Members Joined:
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [pcmk ] notice:
pcmk_peer_update: Stable membership event on ring 308: memb=1, new=0, lost=0
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [pcmk ] info:
pcmk_peer_update: MEMB: genome-ldap2 1426129162
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [pcmk ] info:
ais_mark_unseen_peer_dead: Node genome-ldap1 was not seen in the
previous transition
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [pcmk ] info:
update_member: Node 1409351946/genome-ldap1 is now: lost
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [pcmk ] info:
send_member_notification: Sending membership update 308 to 2 children
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [TOTEM ] A processor
joined or left the membership and a new membership was formed.
Mar 9 06:59:23 genome-ldap2 corosync[1989]: [MAIN ] Completed
service synchronization, ready to provide service.
Mar 9 06:59:23 genome-ldap2 cib: [2034]: info: log_data_element:
cib:diff: - <cib have-quorum="1" admin_epoch="0" epoch="117"
num_updates="11" />
Mar 9 06:59:23 genome-ldap2 cib: [2034]: info: log_data_element:
cib:diff: + <cib have-quorum="0" admin_epoch="0" epoch="118"
num_updates="1" />
Mar 9 06:59:23 genome-ldap2 cib: [2034]: info: cib_process_request:
Operation complete: op cib_modify for section cib
(origin=local/crmd/135, version=0.118.1): ok (rc=0)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: abort_transition_graph:
need_abort:59 - Triggered transition abort (complete=0) : Non-status change
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: update_abort_priority:
Abort priority upgraded from 0 to 1000000
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: update_abort_priority:
Abort action done superceeded by restart
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: need_abort: Aborting on
change to have-quorum
Mar 9 06:59:23 genome-ldap2 cib: [4397]: info: write_cib_contents:
Archived previous version as /var/lib/heartbeat/crm/cib-42.raw
Mar 9 06:59:23 genome-ldap2 cib: [4397]: info: write_cib_contents:
Wrote version 0.118.0 of the CIB to disk (digest:
5ab39d8c6134011247378d7b7a8e8cb9)
Mar 9 06:59:23 genome-ldap2 cib: [4397]: info: retrieveCib: Reading
cluster configuration from: /var/lib/heartbeat/crm/cib.3Fcmhq (digest:
/var/lib/heartbeat/crm/cib.sbJeBo)
Mar 9 06:59:23 genome-ldap2 cib: [2034]: info: cib_process_request:
Operation complete: op cib_modify for section crm_config
(origin=local/crmd/137, version=0.118.1): ok (rc=0)
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:stop:stdout) [
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:stop:stdout) OK
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:stop:stdout) ]
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:stop:stdout)
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP:0:stop:stdout)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM
operation LDAP:0_stop_0 (call=15, rc=0, cib-update=138, confirmed=true) ok
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: match_graph_event:
Action LDAP:0_stop_0 (2) confirmed on genome-ldap2 (rc=0)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_pseudo_action:
Pseudo action 11 fired and confirmed
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: run_graph:
====================================================
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: notice: run_graph: Transition
67 (Complete=4, Pending=0, Fired=0, Skipped=2, Incomplete=0,
Source=/var/lib/pengine/pe-warn-100.bz2): Stopped
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_graph_trigger:
Transition 67 is now complete
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition:
State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [
input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ]
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition:
All 1 cluster nodes are eligible to run resources.
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke: Query
139: Requesting the current CIB: S_POLICY_ENGINE
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke_callback:
Invoking the PE: query=139, ref=pe_calc-dc-1268146763-113, seq=308,
quorate=0
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: unpack_config: On
loss of CCM Quorum: Ignore
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: unpack_config: Node
scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info:
determine_online_status: Node genome-ldap2 is online
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: unpack_rsc_op:
Processing failed op LDAP:0_start_0 on genome-ldap2: unknown error (1)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: native_print:
LDAP-IP (ocf::heartbeat:IPaddr2): Started genome-ldap2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: clone_print:
Clone Set: LDAP-clone
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: short_print:
Stopped: [ LDAP:0 LDAP:1 ]
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: get_failcount:
LDAP-clone has failed 1000000 times on genome-ldap2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN:
common_apply_stickiness: Forcing LDAP-clone away from genome-ldap2 after
1000000 failures (max=1000000)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info:
native_merge_weights: LDAP-clone: Rolling back scores from LDAP-IP
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color:
Resource LDAP:0 cannot run anywhere
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color:
Resource LDAP:1 cannot run anywhere
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color:
Resource LDAP-IP cannot run anywhere
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Stop
resource LDAP-IP (genome-ldap2)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Leave
resource LDAP:0 (Stopped)
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Leave
resource LDAP:1 (Stopped)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition:
State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [
input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: unpack_graph: Unpacked
transition 68: 2 actions in 2 synapses
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_te_invoke:
Processing graph 68 (ref=pe_calc-dc-1268146763-113) derived from
/var/lib/pengine/pe-warn-101.bz2
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_rsc_command:
Initiating action 5: stop LDAP-IP_stop_0 on genome-ldap2 (local)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_lrm_rsc_op:
Performing key=5:68:0:0cc80735-d478-48c0-8260-02b627bed719
op=LDAP-IP_stop_0 )
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: rsc:LDAP-IP:16: stop
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM
operation LDAP-IP_monitor_30000 (call=13, status=1, cib-update=0,
confirmed=true) Cancelled
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: WARN: process_pe_message:
Transition 68: WARNINGs found during PE processing. PEngine Input stored
in: /var/lib/pengine/pe-warn-101.bz2
Mar 9 06:59:23 genome-ldap2 pengine: [2037]: info: process_pe_message:
Configuration WARNINGs found during PE processing. Please run
"crm_verify -L" to identify issues.
Mar 9 06:59:23 genome-ldap2 IPaddr2[4401]: INFO: ip -f inet addr delete
10.1.1.83/16 dev eth0
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP-IP:stop:stderr) 2010/03/09_06:59:23 INFO: ip -f inet addr delete
10.1.1.83/16 dev eth0
Mar 9 06:59:23 genome-ldap2 IPaddr2[4401]: INFO: ip -o -f inet addr
show eth0
Mar 9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP-IP:stop:stderr) 2010/03/09_06:59:23 INFO: ip -o -f inet addr show
eth0
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM
operation LDAP-IP_stop_0 (call=16, rc=0, cib-update=140, confirmed=true) ok
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: match_graph_event:
Action LDAP-IP_stop_0 (5) confirmed on genome-ldap2 (rc=0)
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_pseudo_action:
Pseudo action 2 fired and confirmed
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: run_graph:
====================================================
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: notice: run_graph: Transition
68 (Complete=2, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pengine/pe-warn-101.bz2): Complete
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: te_graph_trigger:
Transition 68 is now complete
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: notify_crmd: Transition
68 status: done - <null>
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition:
State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]
Mar 9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition:
Starting PEngine Recheck Timer
Mar 9 06:59:26 genome-ldap2 lrmd: [2035]: info: RA output:
(LDAP-IP:start:stderr) ARPING 10.1.1.83 from 10.1.1.83 eth0 Sent 5
probes (5 broadcast(s)) Received 0 response(s)
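The part that jumps out at me is the LDAP:0_start_0 failure ("Starting
slapd: FAILED") on genome-ldap2, where slapd was presumably already
running. In case I'm testing compliance the wrong way, here is roughly
the check I'd expect lsb:ldap to pass (just a sketch based on my reading
of the LSB requirements; the expected exit codes in the comments are my
assumptions):

# run as root on either node
/etc/init.d/ldap start  ; echo "start  (stopped -> running): rc=$?"  # expect 0
/etc/init.d/ldap start  ; echo "start  (already running):    rc=$?"  # expect 0
/etc/init.d/ldap status ; echo "status (running):            rc=$?"  # expect 0
/etc/init.d/ldap stop   ; echo "stop   (running -> stopped): rc=$?"  # expect 0
/etc/init.d/ldap stop   ; echo "stop   (already stopped):    rc=$?"  # expect 0
/etc/init.d/ldap status ; echo "status (stopped):            rc=$?"  # expect 3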
I also thought the expected behavior was to stop LDAP first, start the
IP, and then restart LDAP; that is how I'd rather have it behave. Any
clue as to what may be causing the problem?
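For what it's worth, my reading of the order line in the config below
(and I may well be reading it wrong) is:

order LDAP-after-IP inf: LDAP-IP LDAP-clone
# start order: LDAP-IP first, then LDAP-clone
# stop order:  LDAP-clone first, then LDAP-IP (the reverse, since order
#              constraints are symmetrical by default)

So on a failover the surviving clone instance should be stopped and then
started again after the IP arrives, yet in the log above I don't see an
LDAP:0_stop_0 on genome-ldap2 before the failed LDAP:0_start_0.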
Here again is my config:
node genome-ldap1
node genome-ldap2
primitive LDAP lsb:ldap \
op monitor interval="10s" timeout="15s" \
meta target-role="Started"
primitive LDAP-IP ocf:heartbeat:IPaddr2 \
params ip="10.1.1.83" nic="eth0" cidr_netmask="16" \
op monitor interval="30s" timeout="20s" \
meta target-role="Started"
clone LDAP-clone LDAP \
meta clone-max="2" clone-node-max="1" globally-unique="false"
location LDAP-IP-placement-1 LDAP-IP 100: genome-ldap1
location LDAP-IP-placement-2 LDAP-IP 50: genome-ldap2
location LDAP-placement-1 LDAP-clone 100: genome-ldap1
location LDAP-placement-2 LDAP-clone 100: genome-ldap2
colocation LDAP-with-IP inf: LDAP-IP LDAP-clone
order LDAP-after-IP inf: LDAP-IP LDAP-clone
property $id="cib-bootstrap-options" \
dc-version="1.0.7-d3fa20fc76c7947d6de66db7e52526dc6bd7d782" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
symmetric-cluster="false" \
no-quorum-policy="ignore" \
last-lrm-refresh="1268087338"
[r...@genome-ldap2 ~]#
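One more data point from the log: once the start fails, fail-count-LDAP:0
jumps to INFINITY and both LDAP-clone and LDAP-IP are forced away from
genome-ldap2 entirely, which would explain why everything ends up stopped.
I assume the failcount has to be cleared before the clone will run there
again between tests; if I have the syntax right, something like:

crm resource cleanup LDAP-clone genome-ldap2
# or roughly equivalent, I believe:
# crm_resource -C -r LDAP-clone -H genome-ldap2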
Thanks!
-erich
Andrew Beekhof wrote:
On Tue, Mar 9, 2010 at 12:27 AM, Erich Weiler <wei...@soe.ucsc.edu> wrote:
I think I may have found an answer. I had this in my config:
order LDAP-after-IP inf: LDAP-IP LDAP-clone
And, according to the logs, it *looks* like what happens is: when genome-ldap1
goes down, the IP goes over to genome-ldap2, AND THEN the cluster tries to
start LDAP there, even though LDAP is already started there because it is an
anonymous clone. LDAP cannot start (because it is already running) and throws
an error exit code, and presumably pacemaker freaks out because of that and
shuts down LDAP on all nodes. Then the floating IP disappears because of
the line:
colocation LDAP-with-IP inf: LDAP-IP LDAP-clone
which is expected at that point. It seems that when I tested this with
older versions of pacemaker, this didn't happen. Should 'order' statements
be avoided entirely when dealing with anonymous clones? Is that behavior
expected?
The ordering constraint should have caused the cluster to stop LDAP first.
Have you checked both scripts are fully LSB compliant?
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ap-lsb.html
_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker