Hi Vincent,

About time I had a look at this one...
On 14/12/2018 16:01, Vincent Guittot wrote:
> In case of active balance, we increase the balance interval to cover
> pinned tasks cases not covered by all_pinned logic.

AFAIUI the balance interval increase is there to have plenty of time to
stop the task before going through another load_balance().

Seeing as there is a cpus_allowed check that leads to

	env.flags |= LBF_ALL_PINNED;
	goto out_one_pinned;

in the active balancing part of load_balance(), the condition you're
changing should never be hit when we have pinned tasks. So you may want
to rephrase that bit.

> Nevertheless, the active
> migration triggered by asym packing should be treated as the normal
> unbalanced case and reset the interval to the default value, otherwise
> active migration for asym_packing can easily be delayed for hundreds of
> ms because of this all_pinned detection mechanism.

Mmm, so it's not exactly clear why we need this change. If we double the
interval because of a pinned task we wanted to active balance, it's just
a regular pinned-task issue and we can't do much about it.

The only scenario I can think of is a workload where you want to do an
active balance in several successive load_balance() calls, in which case
you would keep increasing the interval even though you do migrate a task
every time (which would harm every subsequent active balance).

In that case every active_balance "user" (pressured CPU, misfit) is
affected, so maybe what we want is something like this:

-----8<-----
@@ -9136,13 +9149,11 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 		sd->balance_interval = sd->min_interval;
 	} else {
 		/*
-		 * If we've begun active balancing, start to back off. This
-		 * case may not be covered by the all_pinned logic if there
-		 * is only 1 task on the busy runqueue (because we don't call
-		 * detach_tasks).
+		 * If we've begun active balancing, start to back off.
+		 * Don't increase too much in case we have more active balances
+		 * coming up.
 		 */
-		if (sd->balance_interval < sd->max_interval)
-			sd->balance_interval *= 2;
+		sd->balance_interval = 2 * sd->min_interval;
 	}

 	goto out;
----->8-----

Maybe we want a larger threshold - truth be told, it all depends on how
long the cpu stopper can take and whether that delay increase is still
relevant nowadays.

> 
> Signed-off-by: Vincent Guittot <vincent.guit...@linaro.org>
> ---
>  kernel/sched/fair.c | 27 +++++++++++++++------------
>  1 file changed, 15 insertions(+), 12 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 9591e7a..487287e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8857,21 +8857,24 @@ static struct rq *find_busiest_queue(struct lb_env *env,
>   */
>  #define MAX_PINNED_INTERVAL	512
>  
> +static inline bool
> +asym_active_balance(struct lb_env *env)
> +{
> +	/*
> +	 * ASYM_PACKING needs to force migrate tasks from busy but
> +	 * lower priority CPUs in order to pack all tasks in the
> +	 * highest priority CPUs.
> +	 */
> +	return env->idle != CPU_NOT_IDLE && (env->sd->flags & SD_ASYM_PACKING) &&
> +	       sched_asym_prefer(env->dst_cpu, env->src_cpu);
> +}
> +
>  static int need_active_balance(struct lb_env *env)
>  {
>  	struct sched_domain *sd = env->sd;
>  
> -	if (env->idle != CPU_NOT_IDLE) {
> -
> -		/*
> -		 * ASYM_PACKING needs to force migrate tasks from busy but
> -		 * lower priority CPUs in order to pack all tasks in the
> -		 * highest priority CPUs.
> -		 */
> -		if ((sd->flags & SD_ASYM_PACKING) &&
> -		    sched_asym_prefer(env->dst_cpu, env->src_cpu))
> -			return 1;
> -	}
> +	if (asym_active_balance(env))
> +		return 1;
> 
>  	/*
>  	 * The dst_cpu is idle and the src_cpu CPU has only 1 CFS task.
> @@ -9150,7 +9153,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>  	} else
>  		sd->nr_balance_failed = 0;
>  
> -	if (likely(!active_balance)) {
> +	if (likely(!active_balance) || asym_active_balance(&env)) {

AFAICT this makes us reset the interval whenever we do an asym packing
active balance (if the task is pinned we would goto out_one_pinned).
This goes against the logic I had understood so far and that I explained
higher up.

>  		/* We were unbalanced, so reset the balancing interval */
>  		sd->balance_interval = sd->min_interval;
>  	} else {
> 