On 25 Feb 2014, at 7:23 pm, Kazunori INOUE <kazunori.ino...@gmail.com> wrote:
> 2014-02-24 11:09 GMT+09:00 Andrew Beekhof <and...@beekhof.net>: >> >> On 24 Feb 2014, at 12:59 pm, Andrew Beekhof <and...@beekhof.net> wrote: >> >>> >>> On 21 Feb 2014, at 9:36 pm, Kazunori INOUE <kazunori.ino...@gmail.com> >>> wrote: >>> >>>> Hi, >>>> >>>> WARNING of the following is outputted after pacemaker stopped in >>>> Pacemaker-1.1.11. >>>> >>>> Feb 21 18:22:57 bl460g1n6 ping(prmPing)[9195]: WARNING: Could not >>>> update default_ping_set = 100: rc=141 >>>> >>>> >>>> This is because pacemaker does not wait for completion of 'monitor' of >>>> the ping resource. >>>> >>>> Feb 21 18:22:52 bl460g1n6 lrmd[9100]: debug: >>>> recurring_action_timer: Scheduling another invokation of >>>> prmPing_monitor_30000 >>>> Feb 21 18:22:52 bl460g1n6 pacemakerd[9094]: info: >>>> crm_signal_dispatch: Invoking handler for signal 15: Terminated >>>> (snip) >>>> Feb 21 18:22:54 bl460g1n6 crmd[9103]: notice: process_lrm_event: LRM >>>> operation prmPing_stop_0 (call=9, rc=0, cib-update=29, confirmed=true) >>>> ok >>>> (snip) >>>> Feb 21 18:22:54 bl460g1n6 lrmd[9100]: info: >>>> services_action_cancel: Cancelling op: prmPing_monitor_30000 will >>>> occur once operation completes >>>> (snip) >>>> Feb 21 18:22:54 bl460g1n6 pacemakerd[9094]: info: main: Exiting >>>> pacemakerd >>>> (snip) >>>> Feb 21 18:22:55 bl460g1n6 corosync[9083]: [MAIN ] Corosync Cluster >>>> Engine exiting normally >>>> Feb 21 18:22:57 bl460g1n6 ping(prmPing)[9195]: WARNING: Could not >>>> update default_ping_set = 100: rc=141 >>>> >>>> >>>> As with pacemaker-1.0, I think that it's better to perform a 'stop' >>>> after completion of 'monitor'. >>> >>> Absolutely. If it is not then we have a problem. >> >> Feb 21 18:22:20 bl460g1n6 crmd[9103]: notice: te_rsc_command: Initiating >> action 6: monitor prmPing_monitor_30000 on bl460g1n6 (local) >> Feb 21 18:22:20 bl460g1n6 crmd[9103]: info: do_lrm_rsc_op: Performing >> key=6:1:0:82c5fc6f-611d-4ba9-8a4f-13155b12536e op=prmPing_monitor_30000 >> Feb 21 18:22:22 bl460g1n6 crmd[9103]: notice: process_lrm_event: LRM >> operation prmPing_monitor_30000 (call=7, rc=0, cib-update=27, >> confirmed=false) ok >> >> so far so good >> >> Feb 21 18:22:54 bl460g1n6 crmd[9103]: notice: te_rsc_command: Initiating >> action 5: stop prmPing_stop_0 on bl460g1n6 (local) >> Feb 21 18:22:54 bl460g1n6 lrmd[9100]: info: services_action_cancel: >> Cancelling op: prmPing_monitor_30000 will occur once operation completes >> Feb 21 18:22:54 bl460g1n6 crmd[9103]: info: do_lrm_rsc_op: Performing >> key=5:2:0:82c5fc6f-611d-4ba9-8a4f-13155b12536e op=prmPing_stop_0 >> Feb 21 18:22:54 bl460g1n6 lrmd[9100]: info: log_execute: executing - >> rsc:prmPing action:stop call_id:9 >> Feb 21 18:22:54 bl460g1n6 lrmd[9100]: info: log_finished: finished - >> rsc:prmPing action:stop call_id:9 pid:9207 exit-code:0 exec-time:36ms >> queue-time:0ms >> Feb 21 18:22:54 bl460g1n6 lrmd[9100]: info: services_action_cancel: >> Cancelling op: prmPing_monitor_30000 will occur once operation completes >> Feb 21 18:22:54 bl460g1n6 crmd[9103]: notice: process_lrm_event: LRM >> operation prmPing_stop_0 (call=9, rc=0, cib-update=29, confirmed=true) ok >> Feb 21 18:22:54 bl460g1n6 crmd[9103]: info: match_graph_event: Action >> prmPing_stop_0 (5) confirmed on bl460g1n6 (rc=0) >> Feb 21 18:22:54 bl460g1n6 crmd[9103]: info: stop_recurring_actions: >> Cancelling op 7 for prmPing (prmPing:7) >> >> David: I think we have the order wrong here... this should happen prior to >> the stop not after. >> Also, does this mean we ran the stop action while the monitor was running? >> >> Feb 21 18:22:54 bl460g1n6 lrmd[9100]: info: services_action_cancel: >> Cancelling op: prmPing_monitor_30000 will occur once operation completes >> Feb 21 18:22:57 bl460g1n6 ping(prmPing)[9195]: WARNING: Could not update >> default_ping_set = 100: rc=141 >> >> >> Inoue-san: could we get a copy of this please? >> Feb 21 18:22:54 bl460g1n6 pengine[9102]: notice: process_pe_message: >> Calculated Transition 2: /var/lib/pacemaker/pengine/pe-input-13.bz2 >> > > Since the file was lost, I reproduced the same event. Thanks. I'm reasonably certain this is an lrmd bug, I've passed it on to David. > >> I'd like to see if we knew the recurring operation was running >> >> >>> >>>> (Who manages 'monitor' after pacemaker stopped?) >>>> >>>> Regards, >>>> Kazunori INOUE >>>> <ha-log>_______________________________________________ >>>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org >>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker >>>> >>>> Project Home: http://www.clusterlabs.org >>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf >>>> Bugs: http://bugs.clusterlabs.org >> >> >> _______________________________________________ >> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org >> http://oss.clusterlabs.org/mailman/listinfo/pacemaker >> >> Project Home: http://www.clusterlabs.org >> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf >> Bugs: http://bugs.clusterlabs.org > <pcmk-Tue-25-Feb-2014.tar.bz2>_______________________________________________ > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org
signature.asc
Description: Message signed with OpenPGP using GPGMail
_______________________________________________ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org