On Thu, Dec 13, 2012 at 6:31 AM, Latrous, Youssef <ylatr...@broadviewnet.com> wrote:
> Hi,
>
> I run into the following issue and I couldn’t find what it really means:
>
> Detected action msgbroker_monitor_10000 from a different transition: 16048 vs. 18014

18014 is the transition we're up to now; 16048 is the (old) transition that scheduled the recurring monitor operation. I suspect you'll find the action failed earlier in the logs, and that's why it needed to be restarted. Not the best log message though :(

> I can see that its impact is to stop/start a service but I’d like to understand it a bit more.
>
> Thank you in advance for any information.
>
> Logs about this issue:
>
> …
> Dec 6 22:55:05 Node1 crmd: [5235]: info: process_graph_event: Detected action msgbroker_monitor_10000 from a different transition: 16048 vs. 18014
> Dec 6 22:55:05 Node1 crmd: [5235]: info: abort_transition_graph: process_graph_event:477 - Triggered transition abort (complete=1, tag=lrm_rsc_op, id=msgbroker_monitor_10000, magic=0:7;104:16048:0:5fb57f01-3397-45a8-905f-c48cecdc8692, cib=0.971.5) : Old event
> Dec 6 22:55:05 Node1 crmd: [5235]: WARN: update_failcount: Updating failcount for msgbroker on Node0 after failed monitor: rc=7 (update=value++, time=1354852505)
> Dec 6 22:55:05 Node1 crmd: [5235]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
> Dec 6 22:55:05 Node1 crmd: [5235]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
> Dec 6 22:55:05 Node1 crmd: [5235]: info: do_pe_invoke: Query 28069: Requesting the current CIB: S_POLICY_ENGINE
> Dec 6 22:55:05 Node1 crmd: [5235]: info: abort_transition_graph: te_update_diff:142 - Triggered transition abort (complete=1, tag=nvpair, id=status-Node0-fail-count-msgbroker, magic=NA, cib=0.971.6) : Transient attribute: update
> Dec 6 22:55:05 Node1 crmd: [5235]: info: do_pe_invoke: Query 28070: Requesting the current CIB: S_POLICY_ENGINE
> Dec 6 22:55:05 Node1 crmd: [5235]: info: abort_transition_graph: te_update_diff:142 - Triggered transition abort (complete=1, tag=nvpair, id=status-Node0-last-failure-msgbroker, magic=NA, cib=0.971.7) : Transient attribute: update
> Dec 6 22:55:05 Node1 crmd: [5235]: info: do_pe_invoke: Query 28071: Requesting the current CIB: S_POLICY_ENGINE
> Dec 6 22:55:05 Node1 attrd: [5232]: info: find_hash_entry: Creating hash entry for last-failure-msgbroker
> Dec 6 22:55:05 Node1 crmd: [5235]: info: do_pe_invoke_callback: Invoking the PE: query=28071, ref=pe_calc-dc-1354852505-39407, seq=12, quorate=1
> Dec 6 22:55:05 Node1 pengine: [5233]: notice: unpack_config: On loss of CCM Quorum: Ignore
> Dec 6 22:55:05 Node1 pengine: [5233]: notice: unpack_rsc_op: Operation txpublisher_monitor_0 found resource txpublisher active on Node1
> Dec 6 22:55:05 Node1 pengine: [5233]: WARN: unpack_rsc_op: Processing failed op msgbroker_monitor_10000 on Node0: not running (7)
> …
> Dec 6 22:55:05 Node1 pengine: [5233]: notice: common_apply_stickiness: msgbroker can fail 999999 more times on Node0 before being forced off
> …
> Dec 6 22:55:05 Node1 pengine: [5233]: notice: RecurringOp: Start recurring monitor (10s) for msgbroker on Node0
> …
> Dec 6 22:55:05 Node1 pengine: [5233]: notice: LogActions: Recover msgbroker (Started Node0)
> …
> Dec 6 22:55:05 Node1 crmd: [5235]: info: te_rsc_command: Initiating action 37: stop msgbroker_stop_0 on Node0
>
> Transition 18014 details:
>
> Dec 6 22:52:18 Node1 pengine: [5233]: notice: process_pe_message: Transition 18014: PEngine Input stored in: /var/lib/pengine/pe-input-3270.bz2
> Dec 6 22:52:18 Node1 crmd: [5235]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
> Dec 6 22:52:18 Node1 crmd: [5235]: info: unpack_graph: Unpacked transition 18014: 0 actions in 0 synapses
> Dec 6 22:52:18 Node1 crmd: [5235]: info: do_te_invoke: Processing graph 18014 (ref=pe_calc-dc-1354852338-39406) derived from /var/lib/pengine/pe-input-3270.bz2
> Dec 6 22:52:18 Node1 crmd: [5235]: info: run_graph: ====================================================
> Dec 6 22:52:18 Node1 crmd: [5235]: notice: run_graph: Transition 18014 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pengine/pe-input-3270.bz2): Complete
> Dec 6 22:52:18 Node1 crmd: [5235]: info: te_graph_trigger: Transition 18014 is now complete
> Dec 6 22:52:18 Node1 crmd: [5235]: info: notify_crmd: Transition 18014 status: done - <null>
> Dec 6 22:52:18 Node1 crmd: [5235]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
> Dec 6 22:52:18 Node1 crmd: [5235]: info: do_state_transition: Starting PEngine Recheck Timer
>
> Youssef
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
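
For what it's worth, the old transition number in "16048 vs. 18014" can also be read out of the magic= field on the abort_transition_graph line. Below is a rough Python sketch, assuming the field is laid out as op_status:op_rc;action_id:transition_id:target_rc:uuid. That layout is my reading of the crmd format of that era, not something the log itself documents. The names parse_magic and is_old_event are hypothetical helpers, not Pacemaker APIs, and the real check in crmd may compare more than the transition number.

```python
from typing import NamedTuple


class TransitionMagic(NamedTuple):
    """Fields of a magic= string (layout is an assumption, see above)."""
    op_status: int      # LRM operation status (0 is taken to mean "done")
    op_rc: int          # return code of the operation (7 = not running)
    action_id: int      # action number within the transition graph
    transition_id: int  # transition that scheduled the operation
    target_rc: int      # return code the transition expected
    uuid: str           # identifier of the crmd instance that built the graph


def parse_magic(magic: str) -> TransitionMagic:
    """Split 'status:rc;action:transition:target_rc:uuid' into its fields."""
    result, key = magic.split(";", 1)
    op_status, op_rc = (int(part) for part in result.split(":"))
    action_id, transition_id, target_rc, uuid = key.split(":", 3)
    return TransitionMagic(op_status, op_rc, int(action_id),
                           int(transition_id), int(target_rc), uuid)


def is_old_event(magic: str, current_transition: int) -> bool:
    """True when the result belongs to a transition other than the current one."""
    return parse_magic(magic).transition_id != current_transition


# The magic string taken from the quoted log line.
magic = "0:7;104:16048:0:5fb57f01-3397-45a8-905f-c48cecdc8692"
fields = parse_magic(magic)
print(fields.transition_id, fields.op_rc, is_old_event(magic, 18014))
```

Under that assumption, the result carries transition 16048 and rc=7, which lines up with the "failed monitor: rc=7" warning in the quoted log and explains why it is reported as an "Old event" while the cluster is at transition 18014.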