Hi Steve,

On 09/11/2018 12:50, Steve Sistare wrote:
> From: Steve Sistare <steve.sist...@oracle.com>
>
> Define and initialize a sparse bitmap of overloaded CPUs, per
> last-level-cache scheduling domain, for use by the CFS scheduling class.
> Save a pointer to cfs_overload_cpus in the rq for efficient access.
>
> Signed-off-by: Steve Sistare <steven.sist...@oracle.com>
> ---
>  include/linux/sched/topology.h |  1 +
>  kernel/sched/sched.h           |  2 ++
>  kernel/sched/topology.c        | 21 +++++++++++++++++++--
>  3 files changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
> index 6b99761..b173a77 100644
> --- a/include/linux/sched/topology.h
> +++ b/include/linux/sched/topology.h
> @@ -72,6 +72,7 @@ struct sched_domain_shared {
>  	atomic_t	ref;
>  	atomic_t	nr_busy_cpus;
>  	int		has_idle_cores;
> +	struct sparsemask *cfs_overload_cpus;
Thinking about misfit stealing, we can't use the sd_llc_shared's because
on big.LITTLE misfit migrations happen across LLC domains.

I was thinking of adding a misfit sparsemask to the root_domain, but then
I thought we could do the same thing for cfs_overload_cpus.

By doing so we'd have a single source of information for overloaded CPUs,
and we could filter that down during idle balance - you mentioned earlier
wanting to try stealing at each SD level. This would also let you get rid
of [PATCH 02].

The main part of try_steal() could then be written as something like this:

----->8-----

for_each_domain(this_cpu, sd) {
	span = sched_domain_span(sd);

	for_each_sparse_wrap(src_cpu, overload_cpus) {
		if (cpumask_test_cpu(src_cpu, span) &&
		    steal_from(dst_rq, dst_rf, &locked, src_cpu)) {
			stolen = 1;
			goto out;
		}
	}
}

------8<-----

We could limit the stealing to stop at the highest SD_SHARE_PKG_RESOURCES
domain for now so there would be no behavioural change - but we'd
factorize the #ifdef CONFIG_SCHED_SMT bit. Furthermore, the door would be
open to further stealing.
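To make that concrete, here's a rough (untested) variant of the above
sketch with that stop condition folded in - steal_from(),
for_each_sparse_wrap() and the dst_rq/dst_rf/locked locals are just
reused from the snippet above, so treat the names as placeholders:

----->8-----

for_each_domain(this_cpu, sd) {
	/* Stop above the highest SD_SHARE_PKG_RESOURCES (i.e. LLC) level */
	if (!(sd->flags & SD_SHARE_PKG_RESOURCES))
		break;

	span = sched_domain_span(sd);

	for_each_sparse_wrap(src_cpu, overload_cpus) {
		/* Only consider overloaded CPUs within this domain's span */
		if (cpumask_test_cpu(src_cpu, span) &&
		    steal_from(dst_rq, dst_rf, &locked, src_cpu)) {
			stolen = 1;
			goto out;
		}
	}
}

------8<-----

Relaxing or dropping that flags check later on would be all it takes to
steal from further away.

What do you think?

[...]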