smp: Generalize 2nd sched domain

Srikar Dronamraju Sun, 19 Jul 2020 23:22:02 -0700

* Gautham R Shenoy <e...@linux.vnet.ibm.com> [2020-07-17 12:07:55]:

> On Tue, Jul 14, 2020 at 10:06:19AM +0530, Srikar Dronamraju wrote:
> > Currently "CACHE" domain happens to be the 2nd sched domain as per
> > powerpc_topology. This domain will collapse if cpumask of l2-cache is
> > same as SMT domain. However we could generalize this domain such that it
> > could mean either a "CACHE" domain or a "BIGCORE" domain.
> > 
> > While setting up the "CACHE" domain, check if shared_cache is already
> > set.
> > 
> > Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
> > Cc: Michael Ellerman <micha...@au1.ibm.com>
> > Cc: Nick Piggin <npig...@au1.ibm.com>
> > Cc: Oliver OHalloran <olive...@au1.ibm.com>
> > Cc: Nathan Lynch <nath...@linux.ibm.com>
> > Cc: Michael Neuling <mi...@linux.ibm.com>
> > Cc: Anton Blanchard <an...@au1.ibm.com>
> > Cc: Gautham R Shenoy <e...@linux.vnet.ibm.com>
> > Cc: Vaidyanathan Srinivasan <sva...@linux.ibm.com>
> > Signed-off-by: Srikar Dronamraju <sri...@linux.vnet.ibm.com>
> > ---
> > @@ -867,11 +869,16 @@ static const struct cpumask *smallcore_smt_mask(int cpu)
> >  }
> >  #endif
> > 
> > +static const struct cpumask *cpu_bigcore_mask(int cpu)
> > +{
> > +   return cpu_core_mask(cpu);
> 
> It should be cpu_smt_mask() if we want the redundant big-core to be
> degenerated in favour of the SMT level on P8, no? Because
> cpu_core_mask refers to all the CPUs that are in the same chip.
>


Right, but it can't be cpu_smt_mask, since cpu_smt_mask is only defined under
CONFIG_SCHED_SMT. I was looking at using sibling_map, but we have to be careful
for power9 / PowerNV mode. I guess that should be fine.
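
To illustrate why the choice of mask matters for degeneration, here is a
simplified user-space model, not kernel code. It assumes a level is redundant
when its span equals the span of the level directly below it; the real
degeneration check also looks at flags. mask_t and count_effective_levels()
are hypothetical names, and the spans in the note below are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU. */
typedef uint64_t mask_t;

/*
 * Count the levels that survive when every level whose span equals the
 * span of the level below it is dropped. spans[] is ordered from the
 * lowest level (e.g. SMT) to the highest (e.g. DIE).
 */
static size_t count_effective_levels(const mask_t *spans, size_t n)
{
	size_t kept = 0;

	assert(spans != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* Same span as the level below: adds nothing, skip it. */
		if (i > 0 && spans[i] == spans[i - 1])
			continue;
		kept++;
	}
	return kept;
}
```

With spans { 0xff, 0xff, 0xffff } the middle level matches the one below it
and drops out, leaving two levels. With { 0x0f, 0xff, 0xffff } all three
levels are kept.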

> > +}
> > +
> >  static struct sched_domain_topology_level powerpc_topology[] = {
> >  #ifdef CONFIG_SCHED_SMT
> >     { cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
> >  #endif
> > -   { shared_cache_mask, powerpc_shared_cache_flags, SD_INIT_NAME(CACHE) },
> > +   { cpu_bigcore_mask, SD_INIT_NAME(BIGCORE) },
> >     { cpu_cpu_mask, SD_INIT_NAME(DIE) },
> >     { NULL, },
> >  };
> > @@ -1319,7 +1326,6 @@ static void add_cpu_to_masks(int cpu)
> >  void start_secondary(void *unused)
> >  {
> >     unsigned int cpu = smp_processor_id();
> > -   struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
> > 
> >     mmgrab(&init_mm);
> >     current->active_mm = &init_mm;
> > @@ -1345,14 +1351,20 @@ void start_secondary(void *unused)
> >     /* Update topology CPU masks */
> >     add_cpu_to_masks(cpu);
> > 
> > -   if (has_big_cores)
> > -           sibling_mask = cpu_smallcore_mask;
> >     /*
> >      * Check for any shared caches. Note that this must be done on a
> >      * per-core basis because one core in the pair might be disabled.
> >      */
> > -   if (!cpumask_equal(cpu_l2_cache_mask(cpu), sibling_mask(cpu)))
> > -           shared_caches = true;
> > +   if (!shared_caches) {
> > +           struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
> > +           struct cpumask *mask = cpu_l2_cache_mask(cpu);
> > +
> > +           if (has_big_cores)
> > +                   sibling_mask = cpu_smallcore_mask;
> > +
> > +           if (cpumask_weight(mask) > cpumask_weight(sibling_mask(cpu)))
> > +                   shared_caches = true;
> 
> Shouldn't we use cpumask_subset() here ?

Wouldn't cpumask_subset() return 1 if both masks are the same?
We don't want shared_caches set if both masks are equal.
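
A small user-space sketch of the difference, using a 64-bit word as a
hypothetical stand-in for struct cpumask. The helper names below are made up
for illustration and only model the behaviour of cpumask_subset(),
cpumask_equal() and cpumask_weight(); they are not the kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU. */
typedef uint64_t mask_t;

/* Models cpumask_subset(a, b): true when every CPU in a is also in b. */
static bool mask_subset(mask_t a, mask_t b)
{
	return (a & ~b) == 0;
}

/* Models cpumask_equal(). */
static bool mask_equal(mask_t a, mask_t b)
{
	return a == b;
}

/* Models cpumask_weight(): number of CPUs set in the mask. */
static unsigned int mask_weight(mask_t m)
{
	unsigned int n = 0;

	while (m) {
		m &= m - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Check as written in the patch: L2 mask covers more CPUs than siblings. */
static bool shared_by_weight(mask_t l2, mask_t sibling)
{
	return mask_weight(l2) > mask_weight(sibling);
}

/* Subset-based alternative: siblings are a proper subset of the L2 mask. */
static bool shared_by_proper_subset(mask_t l2, mask_t sibling)
{
	return mask_subset(sibling, l2) && !mask_equal(sibling, l2);
}

/* Equal masks pass a plain subset test, but neither "shared" check. */
static void self_check(void)
{
	assert(mask_subset(0x0f, 0x0f));
	assert(!shared_by_weight(0x0f, 0x0f));
	assert(!shared_by_proper_subset(0x0f, 0x0f));
	assert(shared_by_weight(0xff, 0x0f));
	assert(shared_by_proper_subset(0xff, 0x0f));
}
```

So a bare subset test would need an extra inequality check to avoid setting
shared_caches when the two masks are identical.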

>                       
> > +   }
> > 
> >     set_numa_node(numa_cpu_lookup_table[cpu]);
> >     set_numa_mem(local_memory_node(numa_cpu_lookup_table[cpu]));
> > @@ -1390,6 +1402,14 @@ void __init smp_cpus_done(unsigned int max_cpus)
> >             smp_ops->bringup_done();
> > 
> >     dump_numa_cpu_topology();
> > +   if (shared_caches) {
> > +           pr_info("Using shared cache scheduler topology\n");
> > +           powerpc_topology[bigcore_idx].mask = shared_cache_mask;
> > +#ifdef CONFIG_SCHED_DEBUG
> > +           powerpc_topology[bigcore_idx].name = "CACHE";
> > +#endif
> > +           powerpc_topology[bigcore_idx].sd_flags = powerpc_shared_cache_flags;
> > +   }
> 
> 
> I would much rather that we have all the topology-fixups done in one
> function.
> 
> fixup_topology(void) {
>      if (has_big_core)
>         powerpc_topology[smt_idx].mask = smallcore_smt_mask;
> 
>     if (shared_caches) {
>        const char *name = "CACHE";
>        powerpc_topology[bigcore_idx].mask = shared_cache_mask;
>        strlcpy(powerpc_topology[bigcore_idx].name, name,
>                                       strlen(name));
>        powerpc_topology[bigcore_idx].sd_flags = powerpc_shared_cache_flags;
>     }
> 
>     /* Any other changes to the topology structure here */

We could do this.
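
Roughly along these lines, as a user-space sketch only. struct topo_level is a
hypothetical, simplified stand-in for struct sched_domain_topology_level, and
the mask and flags fields hold labels rather than real function pointers. One
assumption worth flagging: the name is treated as a pointer to a string
literal, as in the hunk above, so it is reassigned rather than copied into
with strlcpy().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified topology level; fields are labels only. */
struct topo_level {
	const char *mask;	/* label for the mask function */
	const char *flags;	/* label for the sd_flags function, or NULL */
	const char *name;	/* domain name */
};

/* Hypothetical indices into the table below. */
enum { SMT_IDX, BIGCORE_IDX, DIE_IDX };

static struct topo_level topology[] = {
	[SMT_IDX]     = { "cpu_smt_mask", "powerpc_smt_flags", "SMT" },
	[BIGCORE_IDX] = { "cpu_bigcore_mask", NULL, "BIGCORE" },
	[DIE_IDX]     = { "cpu_cpu_mask", NULL, "DIE" },
};

/* Apply every topology adjustment in one place. */
static void fixup_topology(bool has_big_cores, bool shared_caches)
{
	if (has_big_cores)
		topology[SMT_IDX].mask = "smallcore_smt_mask";

	if (shared_caches) {
		topology[BIGCORE_IDX].mask = "shared_cache_mask";
		topology[BIGCORE_IDX].flags = "powerpc_shared_cache_flags";
		/* Reassign the pointer; do not write into a string literal. */
		topology[BIGCORE_IDX].name = "CACHE";
	}
}

/* Example use: a big-core system with shared caches. */
static void demo(void)
{
	fixup_topology(true, true);
	assert(strcmp(topology[SMT_IDX].mask, "smallcore_smt_mask") == 0);
	assert(strcmp(topology[BIGCORE_IDX].name, "CACHE") == 0);
}
```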

> 
> And also as an optimization, get rid of degenerate structures here
> itself so that we don't pay additional penalty while building the
> sched-domains each time.
> 

Yes, this is definitely planned, but for slightly later.

Thanks for the review and comments.

-- 
Thanks and Regards
Srikar Dronamraju

Re: [PATCH 06/11] powerpc/smp: Generalize 2nd sched domain