Hi, Andrew

I understand.
On the other hand, with a lower batch-limit there is a possibility that
cluster operations will become too slow. I will look into avoiding that by
adjusting the parameters or by changing how the cluster is operated.
Thank you for the various adjustments.

Yusuke

2013/11/19 Andrew Beekhof <and...@beekhof.net>:
>
> On 16 Nov 2013, at 12:22 am, yusuke iida <yusk.i...@gmail.com> wrote:
>
>> Hi, Andrew
>>
>> Thanks for the various suggestions.
>>
>> To confirm which batch-limit is suitable, I fixed the value of
>> batch-limit to 1, 2, 3, and 4 and tested each from the beginning.
>>
>> In my environment the results were as follows.
>> With batch-limit=1 and 2, no timeouts occurred.
>> With batch-limit=3 there was 1 timeout.
>> With batch-limit=4 there were 5 timeouts.
>>
>> From the above results, I think the limit of "limit = QB_MAX(1, peers / 4)"
>> is still too high.
>
> Remember these results are specific to your (virtual) hardware and configured
> timeouts.
> I would argue that 5 timeouts out of 2853 actions is actually quite
> impressive for a default value in this sort of situation.[1]
>
> Some tuning in a cluster of this kind is to be expected.
>
> [1] It took crm_simulate 4 minutes to even pretend to perform all those
> operations.
>
>>
>> So I have created a fix that pins batch-limit to 2 when the cluster enters
>> the extreme state.
>> https://github.com/yuusuke/pacemaker/commit/efe2d6ebc55be39b8be43de38e7662f039b61dec
>>
>> I have tested it several times and it seems to work without problems.
>>
>> The reports from the runs with a fixed batch-limit are below.
>> batch-limit=1
>> https://drive.google.com/file/d/0BwMFJItoO-fVNk8wTGlYNjNnSHc/edit?usp=sharing
>> batch-limit=2
>> https://drive.google.com/file/d/0BwMFJItoO-fVTnc4bXY2YXF2M2M/edit?usp=sharing
>> batch-limit=3
>> https://drive.google.com/file/d/0BwMFJItoO-fVYl9Gbks2VlJMR0k/edit?usp=sharing
>> batch-limit=4
>> https://drive.google.com/file/d/0BwMFJItoO-fVZnJIazd5MFQ1aGs/edit?usp=sharing
>>
>> The report from the run using my modified code is the following.
>> https://drive.google.com/file/d/0BwMFJItoO-fVbzB0NjFLeVY3Zmc/edit?usp=sharing
>>
>> Regards,
>> Yusuke
>>
>> 2013/11/13 Andrew Beekhof <and...@beekhof.net>:
>>> Did you look at the load numbers in the logs?
>>> The CPUs are being slammed for over 20 minutes.
>>>
>>> The automatic tuning can only help so much, you're simply asking the
>>> cluster to do more work than it is capable of.
>>> Giving more priority to cib operations that come via IPC is one option, but
>>> as I explained earlier, it comes at the cost of correctness.
>>>
>>> Given the huge mismatch between the nodes' capacity and the tasks you're
>>> asking them to achieve, your best path forward is probably setting a
>>> load-threshold < 40% or a batch-limit <= 8.
>>> Or we could try a patch like the one below if we think that the defaults
>>> are not aggressive enough.
>>>
>>> diff --git a/crmd/throttle.c b/crmd/throttle.c
>>> index d77195a..7636d4a 100644
>>> --- a/crmd/throttle.c
>>> +++ b/crmd/throttle.c
>>> @@ -611,14 +611,14 @@ throttle_get_total_job_limit(int l)
>>>          switch(r->mode) {
>>>
>>>              case throttle_extreme:
>>> -                if(limit == 0 || limit > peers/2) {
>>> -                    limit = peers/2;
>>> +                if(limit == 0 || limit > peers/4) {
>>> +                    limit = QB_MAX(1, peers/4);
>>>                  }
>>>                  break;
>>>
>>>              case throttle_high:
>>> -                if(limit == 0 || limit > peers) {
>>> -                    limit = peers;
>>> +                if(limit == 0 || limit > peers/2) {
>>> +                    limit = QB_MAX(1, peers/2);
>>>                  }
>>>                  break;
>>>              default:
>>>
>>> This may also be worthwhile:
>>>
>>> diff --git a/crmd/throttle.c b/crmd/throttle.c
>>> index d77195a..586513a 100644
>>> --- a/crmd/throttle.c
>>> +++ b/crmd/throttle.c
>>> @@ -387,22 +387,36 @@ static bool throttle_io_load(float *load, unsigned int *blocked)
>>>  }
>>>
>>>  static enum throttle_state_e
>>> -throttle_handle_load(float load, const char *desc)
>>> +throttle_handle_load(float load, const char *desc, int cores)
>>>  {
>>> -    if(load > THROTTLE_FACTOR_HIGH * throttle_load_target) {
>>> +    float adjusted_load = load;
>>> +
>>> +    if(cores <= 0) {
>>> +        /* No adjusting of the supplied load value */
>>> +
>>> +    } else if(cores == 1) {
>>> +        /* On a single core machine, a load of 1.0 is already too high */
>>> +        adjusted_load = load * THROTTLE_FACTOR_MEDIUM;
>>> +
>>> +    } else {
>>> +        /* Normalize the load to be per-core */
>>> +        adjusted_load = load / cores;
>>> +    }
>>> +
>>> +    if(adjusted_load > THROTTLE_FACTOR_HIGH * throttle_load_target) {
>>>          crm_notice("High %s detected: %f", desc, load);
>>>          return throttle_high;
>>>
>>> -    } else if(load > THROTTLE_FACTOR_MEDIUM * throttle_load_target) {
>>> +    } else if(adjusted_load > THROTTLE_FACTOR_MEDIUM * throttle_load_target) {
>>>          crm_info("Moderate %s detected: %f", desc, load);
>>>          return throttle_med;
>>>
>>> -    } else if(load > THROTTLE_FACTOR_LOW * throttle_load_target) {
>>> +    } else if(adjusted_load > THROTTLE_FACTOR_LOW * throttle_load_target) {
>>>          crm_debug("Noticable %s detected: %f", desc, load);
>>>          return throttle_low;
>>>      }
>>>
>>> -    crm_trace("Negligable %s detected: %f", desc, load);
>>> +    crm_trace("Negligable %s detected: %f", desc, adjusted_load);
>>>      return throttle_none;
>>>  }
>>>
>>> @@ -464,22 +478,12 @@ throttle_mode(void)
>>>      }
>>>
>>>      if(throttle_load_avg(&load)) {
>>> -        float simple = load / cores;
>>> -        mode |= throttle_handle_load(simple, "CPU load");
>>> +        mode |= throttle_handle_load(load, "CPU load", cores);
>>>      }
>>>
>>>      if(throttle_io_load(&load, &blocked)) {
>>> -        float blocked_ratio = 0.0;
>>> -
>>> -        mode |= throttle_handle_load(load, "IO load");
>>> -
>>> -        if(cores) {
>>> -            blocked_ratio = blocked / cores;
>>> -        } else {
>>> -            blocked_ratio = blocked;
>>> -        }
>>> -
>>> -        mode |= throttle_handle_load(blocked_ratio, "blocked IO ratio");
>>> +        mode |= throttle_handle_load(load, "IO load", 0);
>>> +        mode |= throttle_handle_load(blocked, "blocked IO ratio", cores);
>>>      }
>>>
>>>      if(mode & throttle_extreme) {
>>>
>>>
>>> On 12 Nov 2013, at 3:25 pm, yusuke iida <yusk.i...@gmail.com> wrote:
>>>
>>>> Hi, Andrew
>>>>
>>>> I'm sorry.
>>>> The earlier report was taken when two cores were assigned to the virtual
>>>> machine.
>>>> https://drive.google.com/file/d/0BwMFJItoO-fVdlIwTVdFOGRkQ0U/edit?usp=sharing
>>>>
>>>> I'm sorry to be misleading.
>>>>
>>>> This is the report acquired with one core.
>>>> https://drive.google.com/file/d/0BwMFJItoO-fVSlo0dE0xMzNORGc/edit?usp=sharing
>>>>
>>>> LRMD_MAX_CHILDREN is not defined on any node.
>>>> load-threshold is still the default.
>>>> cib_max_cpu is set to 0.4 by the following code.
>>>>
>>>>     if(cores == 1) {
>>>>         cib_max_cpu = 0.4;
>>>>     }
>>>>
>>>> So the node goes into the Extreme state once the CIB load exceeds 60%
>>>> (1.5 * 0.4):
>>>> Nov 08 11:08:31 [2390] vm01 crmd: ( throttle.c:441 ) notice: throttle_mode: Extreme CIB load detected: 0.670000
>>>>
>>>> Shortly afterwards, the DC detects that vm01 is in the Extreme state.
>>>> Nov 08 11:08:32 [2387] vm13 crmd: ( throttle.c:701 ) debug: throttle_update: Host vm01 supports a maximum of 2 jobs and throttle mode 1000. New job limit is 1
>>>>
>>>> The following log shows that the dynamic change of batch-limit is also
>>>> being handled as expected.
>>>> # grep "throttle_get_total_job_limit" pacemaker.log
>>>> (snip)
>>>> Nov 08 11:08:31 [2387] vm13 crmd: ( throttle.c:629 ) trace: throttle_get_total_job_limit: No change to batch-limit=0
>>>> Nov 08 11:08:32 [2387] vm13 crmd: ( throttle.c:632 ) trace: throttle_get_total_job_limit: Using batch-limit=8
>>>> (snip)
>>>> Nov 08 11:10:32 [2387] vm13 crmd: ( throttle.c:632 ) trace: throttle_get_total_job_limit: Using batch-limit=16
>>>>
>>>> The above shows that the problem is not solved even when the total number
>>>> of jobs is restricted by batch-limit.
>>>> Are there any other ways to reduce the synchronization messages?
>>>>
>>>> There are not that many internal IPC messages.
>>>> Couldn't they be handled, even a little, while the synchronization
>>>> messages are being processed?
>>>>
>>>> Regards,
>>>> Yusuke
>>>>
>>>> 2013/11/12 Andrew Beekhof <and...@beekhof.net>:
>>>>>
>>>>> On 11 Nov 2013, at 11:48 pm, yusuke iida <yusk.i...@gmail.com> wrote:
>>>>>
>>>>>> I also checked the execution of the graph.
>>>>>> Since the number of pending actions is restricted to 16 partway
>>>>>> through, I judge that batch-limit is taking effect.
>>>>>> Observing this, even when jobs are restricted by batch-limit, two or
>>>>>> more jobs are always fired within one second.
>>>>>> Each completed job returns a result, which generates CIB
>>>>>> synchronization messages.
>>>>>> A node that keeps receiving synchronization messages processes them
>>>>>> preferentially and postpones its internal IPC messages.
>>>>>> I think that is what caused the timeouts.
>>>>>
>>>>> What load-threshold were you running this with?
>>>>>
>>>>> I see this in the logs:
>>>>> "Host vm10 supports a maximum of 4 jobs and throttle mode 0100. New job
>>>>> limit is 1"
>>>>>
>>>>> Have you set LRMD_MAX_CHILDREN=4 on these nodes?
>>>>> I wouldn't recommend that for a single core VM. I'd let the default of
>>>>> 2*cores be used.
>>>>>
>>>>> Also, I'm not seeing "Extreme CIB load detected". Are these still single
>>>>> core machines?
>>>>> If so it would suggest that something about:
>>>>>
>>>>>     if(cores == 1) {
>>>>>         cib_max_cpu = 0.4;
>>>>>     }
>>>>>     if(throttle_load_target > 0.0 && throttle_load_target < cib_max_cpu) {
>>>>>         cib_max_cpu = throttle_load_target;
>>>>>     }
>>>>>
>>>>>     if(load > 1.5 * cib_max_cpu) {
>>>>>         /* Can only happen on machines with a low number of cores */
>>>>>         crm_notice("Extreme %s detected: %f", desc, load);
>>>>>         mode |= throttle_extreme;
>>>>>
>>>>> is wrong.
>>>>>
>>>>> What was load-threshold configured as?
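
For reference, below is a minimal standalone sketch of the single-core check
quoted above: with cib_max_cpu clamped to 0.4 on one core, any CIB load above
1.5 * 0.4 = 0.6 is classified as Extreme, which is why the 0.670000 reading on
vm01 trips that state. This is not the actual crmd/throttle.c code; the
function name, the main() driver, and the 0.8 budget assumed for the
multi-core case are illustrative only.

#include <stdbool.h>
#include <stdio.h>

/* Illustrative helper mirroring the quoted check: on one core the CIB is
 * budgeted 40% of a CPU (cib_max_cpu = 0.4), a configured load-threshold can
 * only tighten that budget, and anything above 1.5x the budget is "Extreme".
 * The 0.8 multi-core budget below is an assumption made for this sketch. */
static bool
cib_load_is_extreme(float load, int cores, float load_threshold)
{
    float cib_max_cpu = (cores == 1) ? 0.4f : 0.8f;

    if (load_threshold > 0.0f && load_threshold < cib_max_cpu) {
        cib_max_cpu = load_threshold;
    }
    return load > 1.5f * cib_max_cpu;   /* 0.6 on a single core with defaults */
}

int main(void)
{
    float vm01_cib_load = 0.67f;        /* the reading reported for vm01 above */

    printf("CIB load %.2f on 1 core:  %s\n", vm01_cib_load,
           cib_load_is_extreme(vm01_cib_load, 1, 0.0f) ? "Extreme" : "not extreme");
    printf("CIB load %.2f on 2 cores: %s\n", vm01_cib_load,
           cib_load_is_extreme(vm01_cib_load, 2, 0.0f) ? "Extreme" : "not extreme");
    return 0;
}

Compiled and run, the sketch reports "Extreme" for the 0.67 reading on one
core and "not extreme" for the two-core case under the assumed 0.8 budget.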
--
----------------------------------------
METRO SYSTEMS CO., LTD

Yusuke Iida
Mail: yusk.i...@gmail.com
----------------------------------------

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org