Re: [Pacemaker] (LRMD|PCMK)_MAX_CHILDREN?

Andrew Beekhof Wed, 11 Sep 2013 21:42:16 -0700

On 11/09/2013, at 9:33 PM, Lars Marowsky-Bree <l...@suse.com> wrote:


> On 2013-09-11T19:55:38, Andrew Beekhof <and...@beekhof.net> wrote:
> 
>>> sorry for being thick, but I can't find this in the code now. Did this
>>> slip through again in April?
>> Apparently. But before we add it, I'd like to see if we can do something 
>> coherent.
>> Having 3 (or more) different variables (batch-limit, migration-limit and 
>> this) for controlling these things doesn't seem optimal or user friendly.
> 
> Well, they're all doing something completely different.

No, they're all crude approximations designed to stop the cluster as a whole 
from using up so much cpu/network/etc that recovery introduces more failures 
than it resolves.

> 
> A cluster-wide limit on operations (batch-limit) limits the total
> cluster and network/storage load.
> 
> The max_children prevent a given node from being overloaded by
> concurrent operations.

At the expense of introducing other failures... such as "I fired off an action 
N seconds ago with a timeout < N and still haven't heard back", which was 
possible if batch-limit and max_children were too far out of balance.
Which is why any limiting needs to happen centrally, on the DC.
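
To put rough numbers on that failure mode, here is a back-of-the-envelope
sketch. It assumes every action takes the same time, that the node runs at
most max_children of them at once, and that the timeout clock starts when the
DC dispatches the action. The function name and the figures are invented for
illustration; this is not code from the crmd or lrmd.

```python
import math


def worst_case_completion(dispatched, max_children, action_secs):
    """Seconds until the last of `dispatched` actions finishes on one node.

    Assumes the node runs at most `max_children` actions concurrently and
    that every action takes exactly `action_secs` (a simplification).
    """
    if dispatched <= 0:
        return 0.0
    waves = math.ceil(dispatched / max_children)  # rounds of execution
    return waves * action_secs


# Hypothetical numbers: the DC fires 30 actions at a node that only runs
# 4 at a time, each taking 20s, with a 60s timeout per action.
timeout = 60
finish = worst_case_completion(dispatched=30, max_children=4, action_secs=20)
print(finish, finish > timeout)
```

With these made-up values the last action completes well after its timeout
has expired, even though nothing on the node actually failed; the action was
merely queued behind the local limit.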

> (Reducing batch-limit to emulate this kills
> cluster-wide parallelism and is not optimal.) Clearly, it's not perfect
> either (since it assumes all rsc ops on a node are identical in
> weight; whereas in reality we may want to limit VM start-up to 4, but
> would happily see 32 IP addresses go up at once, or 48 monitors ...),
> but it is an appropriate simplification.
> 
> migration-limit is indeed a special case (needed to limit nodes from
> being overloaded by migrate, which were at the time the only ops that
> affect two nodes at once - batch-limit="4" was too coarse a hammer). I
> do recall that we discussed making it more generic - so that one could
> configure cluster-/node-wide limits for certain operations of specific
> resource types, but that was (rightly) judged to be a rather complex can
> of worms by you.
> 
>> If anything, we should likely be putting work into auto-tuning this
>> stuff instead.  Somehow.
> 
> I'm not sure about how batch-limit can be auto-tuned.

If the CIB's CPU usage starts going too high, it's time to lower the limit.
Should be possible on Linux.
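
A minimal sketch of what such a feedback rule might look like, purely as an
illustration and not existing Pacemaker code: back off quickly when CPU usage
is above a high-water mark, recover slowly when it is below a low one. The
function name, the thresholds and the bounds are all assumptions; on Linux the
usage figure could be derived from /proc or the load average.

```python
def tune_batch_limit(current, cpu_usage, low=0.5, high=0.8,
                     floor=1, ceiling=30):
    """Return a new batch limit given CPU usage as a fraction (0.0-1.0).

    Hypothetical policy: back off quickly when busy, recover slowly
    when idle. All thresholds are illustrative assumptions.
    """
    if cpu_usage > high:
        return max(floor, current // 2)      # too busy: halve the limit
    if cpu_usage < low:
        return min(ceiling, current + 1)     # idle: allow one more action
    return current                           # in between: leave it alone


limit = 16
for usage in (0.95, 0.90, 0.60, 0.30, 0.30):
    limit = tune_batch_limit(limit, usage)
    print(f"cpu={usage:.2f} -> batch-limit={limit}")
```

Halving on overload and adding one on idle is only one possible policy; the
point is that the limit follows measured load instead of a fixed setting.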

> migration-limit is mostly a function of the network bandwidth, too.
> 
> MAX_CHILDREN did, sort of, auto-tune (by defaulting to number of cores,
> or something similar, which was appropriate enough[1]).
> 
> It can all be made into a generic, powerful, flexible mechanism that
> describes them all. But I'm afraid that it'd also be quite complex. I'm
> happy to think about it, but the three limits we have/had seemed
> sufficient for the real-world.
> 
> 
> Regards,
>    Lars
> 
> [1] the main complaint was that it was configured via sysconfig, and not
> dynamic via a node attribute as it should be. When we reintroduce it, we
> may want to make nodes default to PCMK/LRMD_MAX_CHILDREN if unset in
> the CIB, and otherwise have that value override the environment
> variable?  That'd be a benefit now that pcmk and lrmd are more closely
> married.

As above, the rate limiting needs to happen on the DC, which lends itself to 
being a property of the CIB and/or transition graph rather than something 
defined in sysconfig.
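
For reference, the precedence proposed in [1] (a value in the CIB first, then
the environment variable, then a core-count default) might be sketched as
below. The attribute name "max-children", the function, and the order in which
the two environment variables are checked are hypothetical; the variable names
themselves are the ones from the subject line.

```python
import os


def resolve_max_children(node_attrs, environ=None):
    """Pick the per-node child limit using the precedence proposed in [1].

    `node_attrs` stands in for the node's attributes from the CIB; the
    "max-children" key is a made-up name for illustration.
    """
    environ = os.environ if environ is None else environ
    # 1. A value set in the CIB wins.
    if "max-children" in node_attrs:
        return int(node_attrs["max-children"])
    # 2. Otherwise fall back to the sysconfig/environment variables.
    for name in ("PCMK_MAX_CHILDREN", "LRMD_MAX_CHILDREN"):
        if environ.get(name):
            return int(environ[name])
    # 3. Otherwise default to the number of cores (at least 1).
    return os.cpu_count() or 1


print(resolve_max_children({"max-children": "8"}, {"LRMD_MAX_CHILDREN": "4"}))
print(resolve_max_children({}, {"LRMD_MAX_CHILDREN": "4"}))
```

This only resolves the value for a single node; it does not address where the
limit is enforced, which is the point about the DC above.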

> 
> 
> -- 
> Architect Storage/HA
> SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, 
> HRB 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
> 
> 
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
