On 25/06/2013, at 5:37 PM, Francesco Namuri <f.nam...@credires.it> wrote:
> Hi,
> after an update to the new debian stable, from pacemaker 1.0.9.1 to
> 1.1.7 I'm getting some strange errors on syslog:

That's a hell of a jump there. Can you attach /var/lib/pengine/pe-input-64.bz2 from SERVERNAME1 please? I'll be able to see if it's something we've already fixed.

>
> Jun 25 09:20:01 SERVERNAME1 cib: [4585]: info: cib_stats: Processed 29 operations (344.00us average, 0% utilization) in the last 10min
> Jun 25 09:20:22 SERVERNAME1 lrmd: [4587]: info: operation monitor[8] on resDRBD:1 for client 4590: pid 19371 exited with return code 8
> Jun 25 09:20:51 SERVERNAME1 crmd: [4590]: info: crm_timer_popped: PEngine Recheck Timer (I_PE_CALC) just popped (900000ms)
> Jun 25 09:20:51 SERVERNAME1 crmd: [4590]: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Jun 25 09:20:51 SERVERNAME1 crmd: [4590]: info: do_state_transition: Progressed to state S_POLICY_ENGINE after C_TIMER_POPPED
> Jun 25 09:20:51 SERVERNAME1 pengine: [4589]: notice: unpack_config: On loss of CCM Quorum: Ignore
> Jun 25 09:20:51 SERVERNAME1 pengine: [4589]: notice: unpack_rsc_op: Operation monitor found resource resDRBD:1 active in master mode on SERVERNAME1
> Jun 25 09:20:51 SERVERNAME1 pengine: [4589]: WARN: unpack_rsc_op: Processing failed op resSNORT:1_last_failure_0 on SERVERNAME1: not running (7)
> Jun 25 09:20:51 SERVERNAME1 pengine: [4589]: notice: unpack_rsc_op: Operation monitor found resource resDRBD:0 active in master mode on SERVERNAME2
> Jun 25 09:20:51 SERVERNAME1 pengine: [4589]: WARN: unpack_rsc_op: Processing failed op resSNORT:0_last_failure_0 on SERVERNAME2: not running (7)
> Jun 25 09:20:51 SERVERNAME1 pengine: [4589]: notice: common_apply_stickiness: cloneSNORT can fail 999998 more times on SERVERNAME2 before being forced off
> Jun 25 09:20:51 SERVERNAME1 pengine: [4589]: notice: common_apply_stickiness: cloneSNORT can fail 999998 more times on SERVERNAME2 before being forced off
> Jun 25 09:20:51 SERVERNAME1 pengine: [4589]: notice: common_apply_stickiness: cloneSNORT can fail 999998 more times on SERVERNAME1 before being forced off
> Jun 25 09:20:51 SERVERNAME1 pengine: [4589]: notice: common_apply_stickiness: cloneSNORT can fail 999998 more times on SERVERNAME1 before being forced off
> Jun 25 09:20:51 SERVERNAME1 pengine: [4589]: ERROR: rsc_expand_action: Couldn't expand cloneDLM_demote_0
> Jun 25 09:20:51 SERVERNAME1 pengine: [4589]: ERROR: crm_abort: clone_update_actions_interleave: Triggered assert at clone.c:1245 : first_action != NULL || is_set(first_child->flags, pe_rsc_orphan)
> Jun 25 09:20:51 SERVERNAME1 pengine: [4589]: ERROR: clone_update_actions_interleave: No action found for demote in resDLM:1 (first)
> Jun 25 09:20:51 SERVERNAME1 pengine: [4589]: ERROR: crm_abort: clone_update_actions_interleave: Triggered assert at clone.c:1245 : first_action != NULL || is_set(first_child->flags, pe_rsc_orphan)
> Jun 25 09:20:51 SERVERNAME1 pengine: [4589]: ERROR: clone_update_actions_interleave: No action found for demote in resDLM:0 (first)
> Jun 25 09:20:51 SERVERNAME1 crmd: [4590]: notice: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
> Jun 25 09:20:51 SERVERNAME1 crmd: [4590]: info: do_te_invoke: Processing graph 2004 (ref=pe_calc-dc-1372144851-2079) derived from /var/lib/pengine/pe-input-64.bz2
> Jun 25 09:20:51 SERVERNAME1 pengine: [4589]: notice: process_pe_message: Transition 2004: PEngine Input stored in: /var/lib/pengine/pe-input-64.bz2
> Jun 25 09:20:51 SERVERNAME1 crmd: [4590]: notice: run_graph: ==== Transition 2004 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pengine/pe-input-64.bz2): Complete
> Jun 25 09:20:51 SERVERNAME1 crmd: [4590]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
> Jun 25 09:23:26 SERVERNAME1 lrmd: [4587]: info: rsc:resSNORTSAM:1 monitor[9] (pid 19862)
> Jun 25 09:23:27 SERVERNAME1 lrmd: [4587]: info: operation monitor[9] on resSNORTSAM:1 for client 4590: pid 19862 exited with return code 0
> Jun 25 09:25:20 SERVERNAME1 lrmd: [4587]: info: rsc:resDLM:0 monitor[11] (pid 20080)
> Jun 25 09:25:20 SERVERNAME1 lrmd: [4587]: info: operation monitor[11] on resDLM:0 for client 4590: pid 20080 exited with return code 0
> Jun 25 09:30:01 SERVERNAME1 cib: [4585]: info: cib_stats: Processed 31 operations (322.00us average, 0% utilization) in the last 10min
>
> my config is:
>
> node SERVERNAME2
> node SERVERNAME1
> primitive resDLM ocf:pacemaker:controld \
>         op monitor interval="120s" \
>         op start interval="0" timeout="90s" \
>         op stop interval="0" timeout="100s"
> primitive resDRBD ocf:linbit:drbd \
>         params drbd_resource="SERVERNAME2CL" \
>         operations $id="resDRBD-operation" \
>         op monitor interval="20" role="Master" timeout="20" \
>         op monitor interval="30" role="Slave" timeout="20" \
>         op start interval="0" timeout="240s" \
>         op stop interval="0" timeout="100s"
> primitive resFS ocf:heartbeat:Filesystem \
>         params device="/dev/drbd0" directory="/srv" fstype="ocfs2" \
>         op monitor interval="120s" timeout="40s" \
>         op start interval="0" timeout="60s" \
>         op stop interval="0" timeout="60s"
> primitive resO2CB ocf:pacemaker:o2cb \
>         op monitor interval="120s" \
>         op start interval="0" timeout="90s" \
>         op stop interval="0" timeout="100s"
> primitive resSNORT lsb:snort \
>         op monitor interval="150s" timeout="40s" \
>         op start interval="0" timeout="60s" \
>         op stop interval="0" timeout="60s"
> primitive resSNORTSAM lsb:snortsam \
>         op monitor interval="180s" timeout="40s" \
>         op start interval="0" timeout="60s" \
>         op stop interval="0" timeout="60s"
> ms msDRBD resDRBD \
>         meta resource-stickines="100" notify="true" master-max="2" interleave="true"
> clone cloneDLM resDLM \
>         meta globally-unique="false" interleave="true" target-role="Started"
> clone cloneFS resFS \
>         meta interleave="true" ordered="true" target-role="Started"
> clone cloneO2CB resO2CB \
>         meta globally-unique="false" interleave="true" target-role="Started"
> clone cloneSNORT resSNORT \
>         meta interleave="true" target-role="Started"
> clone cloneSNORTSAM resSNORTSAM \
>         meta interleave="true" target-role="Started"
> colocation colDLMDRBD inf: cloneDLM msDRBD:Master
> colocation colFSO2CB inf: cloneFS cloneO2CB
> colocation colO2CBDLM inf: cloneO2CB cloneDLM
> order ordDLMO2CB 0: cloneDLM cloneO2CB
> order ordDRBDDLM 0: msDRBD:promote cloneDLM
> order ordO2CBFS 0: cloneO2CB cloneFS
> order ordSNORT inf: cloneFS cloneSNORT
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2" \
>         stonith-enabled="false" \
>         no-quorum-policy="ignore" \
>         last-lrm-refresh="1371563001"
>
> Thanks in advance for any suggestion.
>
> Ciao,
> francesco
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org