On 26 Feb 2014, at 5:25 pm, yusuke iida <yusk.i...@gmail.com> wrote:
> Hi, Andrew
>
> 2014-02-21 10:47 GMT+09:00 Andrew Beekhof <and...@beekhof.net>:
>>
>> On 20 Feb 2014, at 8:39 pm, yusuke iida <yusk.i...@gmail.com> wrote:
>>
>>> Hi, Andrew
>>>
>>> 2014-02-20 17:28 GMT+09:00 Andrew Beekhof <and...@beekhof.net>:
>>>> Who was pid 16243?
>>>> Doesn't look like a pacemaker daemon.
>>> pid 16243 is crm_mon.
>>
>> That means that the state displayed by crm_mon was > 500 updates behind.
>> At that point, what its displaying is horribly out of date and evicting it
>> seems like a pretty good idea.
>>
>>> In vm01, crm_mon was started and the state was checked.
>>>
>>> If there is information required for analysis to other, I get it.
>>
>> Some idea of what crm_mon is doing would be a good start.
>> Adding a few -V options in addition to --disable-ncurses might be the best
>> approach.
> Run the following command, I get a log of crm_mon.
> crm_mon -VVVV --disable-ncurses >crm_mon.log 2>&1
> I attach it.
>
> BTW,
> I checked operation with the application of the following patches you made.
> https://github.com/beekhof/pacemaker/commit/4002e4ab6a50ceb44e484613f2abd33e490492a7
>
> The load of stonithd fell and queue stopped generating overflow.
> This patch looks very effective.
>
> Is it possible to implement the crm_mon a process similar to this?

I don't understand... crm_mon doesn't look for changes to resources or
constraints and it should already be using the new faster diff format.

[/me reads attachment]

Ah, but perhaps I do understand after all :-)

This is repeated over and over:

  notice: crm_diff_update: [cib_diff_notify] Patch aborted: Application of an update diff failed (-206)
  notice: xml_patch_version_check: Current num_updates is too high (885 > 67)

That would certainly drive up CPU usage and cause crm_mon to get left behind.
Happily the fix for that should be:

   https://github.com/beekhof/pacemaker/commit/6c33820

>
> Regards,
> Yusuke
>>
>>>
>>> Regards,
>>> Yusuke
>>>>
>>>>>
>>>>> Overflow of queue of vm09 has taken place between cib and stonithd.
>>>>> Feb 20 14:20:22 [15519] vm09 cib: ( ipc.c:506 )
>>>>> trace: crm_ipcs_flush_events: Sent 36 events (530 remaining) for
>>>>> 0x105ec10[15520]: Resource temporarily unavailable (-11)
>>>>> Feb 20 14:20:22 [15519] vm09 cib: ( ipc.c:515 )
>>>>> error: crm_ipcs_flush_events: Evicting slow client 0x105ec10[15520]:
>>>>> event queue reached 530 entries
>>>>>
>>>>> Although I checked the code of the problem part, it was not understood
>>>>> by which it would be solved.
>>>>>
>>>>> Is it less likelihood of sending a message of 100 at a time?
>>>>> Does calculation of the waiting time after message transmission have a
>>>>> problem?
>>>>> Threshold of 500 may be too low?
>>>>
>>>> being 500 behind is really quite a long way.
>>>
>>> --
>>> ----------------------------------------
>>> METRO SYSTEMS CO., LTD
>>>
>>> Yusuke Iida
>>> Mail: yusk.i...@gmail.com
>>> ----------------------------------------
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
>>
>
> --
> ----------------------------------------
> METRO SYSTEMS CO., LTD
>
> Yusuke Iida
> Mail: yusk.i...@gmail.com
> ----------------------------------------
> <crm_mon.log>