Hi Peter,

On Thu, May 14, 2020 at 9:02 AM Peter Zijlstra <[email protected]> wrote:
>
> A little something like so, this syncs min_vruntime when we switch to
> single queue mode. This is very much SMT2 only, I got my head in a twist
> when thinking about more siblings, I'll have to try again later.
>
Thanks for the quick patch! :-)

For SMT-n, would it work to sync vruntime if at least one sibling is
forced idle? Since force_idle applies to all the rqs, I think it would
be correct to sync the vruntime if at least one CPU is forced idle.

> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> -               if (is_idle_task(rq_i->core_pick) && rq_i->nr_running)
> -                       rq_i->core_forceidle = true;
> +               if (is_idle_task(rq_i->core_pick)) {
> +                       if (rq_i->nr_running)
> +                               rq_i->core_forceidle = true;
> +               } else {
> +                       new_active++;
I think we need to reset new_active on restarting the selection.
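
To illustrate the concern, here is a rough standalone sketch (a toy model,
not kernel code). The names count_active(), pick[][] and restart_after are
made up for the example; the only point is that a counter which survives a
"goto again" keeps the increments from the aborted pass.

```c
#include <stdbool.h>

#define NR_SIBLINGS 2

/*
 * Toy model of the selection loop (hypothetical, not the real code).
 * pick[pass][i] is true if sibling i picked a non-idle task on that pass.
 * Pass 0 restarts right after sibling 'restart_after' (-1: never restart).
 */
static int count_active(bool pick[2][NR_SIBLINGS], int restart_after,
			bool reset_on_restart)
{
	int new_active = 0;
	int pass = 0;
	int i;

again:
	/* The suggested fix: start counting from scratch on every pass. */
	if (reset_on_restart)
		new_active = 0;

	for (i = 0; i < NR_SIBLINGS; i++) {
		if (pick[pass][i])
			new_active++;

		if (pass == 0 && i == restart_after) {
			pass = 1;
			goto again;
		}
	}

	return new_active;
}
```

Without the reset, the count can exceed the number of siblings, so a
check like "new_active == 1" would be missed after a restart.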

> +               }
>
>                 if (i == cpu)
>                         continue;
> @@ -4476,6 +4473,16 @@ next_class:;
>                 WARN_ON_ONCE(!cookie_match(next, rq_i->core_pick));
>         }
>
> +       /* XXX SMT2 only */
> +       if (new_active == 1 && old_active > 1) {
As I mentioned above, would it be correct to check whether at least one
sibling is forced idle? Something like:
if (cpumask_weight(cpu_smt_mask(cpu)) == old_active && new_active < old_active)
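
Spelled out as a standalone sketch (not kernel code): smt_width stands in
for cpumask_weight(cpu_smt_mask(cpu)), and core_needs_vruntime_sync() is
a hypothetical helper name used only for this example.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: true when every sibling was active before this
 * pick and at least one of them is no longer active afterwards.
 * smt_width is an assumed stand-in for cpumask_weight(cpu_smt_mask(cpu)).
 */
static bool core_needs_vruntime_sync(int smt_width, int old_active,
				     int new_active)
{
	return old_active == smt_width && new_active < old_active;
}
```

For SMT2 this covers the 2 -> 1 case from your patch, though it would also
trigger on 2 -> 0. For SMT4 it triggers on 4 -> 3 but not on 3 -> 2, on
the assumption that the sync already happened at the earlier transition.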

> +               /*
> +                * We just dropped into single-rq mode, increment the sequence
> +                * count to trigger the vruntime sync.
> +                */
> +               rq->core->core_sync_seq++;
> +       }
> +       rq->core->core_active = new_active;
core_active seems to be unused.

> +bool cfs_prio_less(struct task_struct *a, struct task_struct *b)
> +{
> +       struct sched_entity *se_a = &a->se, *se_b = &b->se;
> +       struct cfs_rq *cfs_rq_a, *cfs_rq_b;
> +       u64 vruntime_a, vruntime_b;
> +
> +       while (!is_same_tg(se_a, se_b)) {
> +               int se_a_depth = se_a->depth;
> +               int se_b_depth = se_b->depth;
> +
> +               if (se_a_depth <= se_b_depth)
> +                       se_b = parent_entity(se_b);
> +               if (se_a_depth >= se_b_depth)
> +                       se_a = parent_entity(se_a);
> +       }
> +
> +       cfs_rq_a = cfs_rq_of(se_a);
> +       cfs_rq_b = cfs_rq_of(se_b);
> +
> +       vruntime_a = se_a->vruntime - cfs_rq_a->core_vruntime;
> +       vruntime_b = se_b->vruntime - cfs_rq_b->core_vruntime;
Should we be using core_vruntime conditionally? Should it be min_vruntime for
default comparisons and core_vruntime during force_idle?
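
Roughly what I have in mind, as a standalone sketch (not kernel code).
struct toy_cfs_rq, baseline() and toy_prio_less() are hypothetical names,
and the return direction (true when a's normalized vruntime is larger) is
an assumption on my part, since it is not visible in the hunk above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the two cfs_rq fields involved in the comparison. */
struct toy_cfs_rq {
	uint64_t min_vruntime;
	uint64_t core_vruntime;
};

/* Pick the baseline conditionally: core_vruntime only during force idle. */
static uint64_t baseline(const struct toy_cfs_rq *cfs_rq, bool force_idle)
{
	return force_idle ? cfs_rq->core_vruntime : cfs_rq->min_vruntime;
}

/*
 * Assumed semantics: true when a's normalized vruntime is larger than b's.
 * The signed cast keeps the comparison correct across u64 wraparound.
 */
static bool toy_prio_less(uint64_t vruntime_a, const struct toy_cfs_rq *rq_a,
			  uint64_t vruntime_b, const struct toy_cfs_rq *rq_b,
			  bool force_idle)
{
	uint64_t norm_a = vruntime_a - baseline(rq_a, force_idle);
	uint64_t norm_b = vruntime_b - baseline(rq_b, force_idle);

	return (int64_t)(norm_a - norm_b) > 0;
}
```

With diverging min_vruntime values on the two rqs, the result can differ
between the two modes, which is why I am asking which baseline is intended
for the default case.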

Thanks,
Vineeth
