On Thu, Oct 19, 2023 at 03:38:40PM +1100, Michael Ellerman wrote:
> Srikar Dronamraju <sri...@linux.vnet.ibm.com> writes:
> > If there are shared processor LPARs, underlying Hypervisor can have more
> > virtual cores to handle than actual physical cores.
> >
> > Starting with Power 9, a core has 2 nearly independent thread groups.
>
> You need to be clearer here that you're talking about "big cores", not
> SMT4 cores as seen on bare metal systems.

What is a 'big core'? I'm thinking big.LITTLE, but I didn't think Power
went that route (yet?).. help?

> > On a shared processors LPARs, it helps to pack threads to lesser number
> > of cores so that the overall system performance and utilization
> > improves. PowerVM schedules at a core level. Hence packing to fewer
> > cores helps.
> >
> > For example: Lets says there are two 8-core Shared LPARs that are
> > actually sharing a 8 Core shared physical pool, each running 8 threads
> > each. Then Consolidating 8 threads to 4 cores on each LPAR would help
> > them to perform better. This is because each of the LPAR will get
> > 100% time to run applications and there will no switching required by
> > the Hypervisor.
> >
> > To achieve this, enable SD_ASYM_PACKING flag at CACHE, MC and DIE level.
>
> .. when the system is running in shared processor mode and has big cores.
>
> cheers
>
> > diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> > index 37c41297c9ce..498c2d51fc20 100644
> > --- a/arch/powerpc/kernel/smp.c
> > +++ b/arch/powerpc/kernel/smp.c
> > @@ -1009,9 +1009,20 @@ static int powerpc_smt_flags(void)
> >   */
> >  static int powerpc_shared_cache_flags(void)
> >  {
> > +        if (static_branch_unlikely(&powerpc_asym_packing))
> > +                return SD_SHARE_PKG_RESOURCES | SD_ASYM_PACKING;
> > +
> >          return SD_SHARE_PKG_RESOURCES;
> >  }
> >
> > +static int powerpc_shared_proc_flags(void)
> > +{
> > +        if (static_branch_unlikely(&powerpc_asym_packing))
> > +                return SD_ASYM_PACKING;
> > +
> > +        return 0;
> > +}

Can you leave the future reader a clue in the form of a comment around
here perhaps? Explaining *why* things are as they are etc..

> > +
> >  /*
> >   * We can't just pass cpu_l2_cache_mask() directly because
> >   * returns a non-const pointer and the compiler barfs on that.
> > @@ -1048,8 +1059,8 @@ static struct sched_domain_topology_level powerpc_topology[] = {
> >          { cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
> >  #endif
> >          { shared_cache_mask, powerpc_shared_cache_flags, SD_INIT_NAME(CACHE) },
> > -        { cpu_mc_mask, SD_INIT_NAME(MC) },
> > -        { cpu_cpu_mask, SD_INIT_NAME(DIE) },
> > +        { cpu_mc_mask, powerpc_shared_proc_flags, SD_INIT_NAME(MC) },
> > +        { cpu_cpu_mask, powerpc_shared_proc_flags, SD_INIT_NAME(DIE) },
> >          { NULL, },
> >  };
> >
> > @@ -1687,6 +1698,8 @@ static void __init fixup_topology(void)
> >          if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
> >                  pr_info_once("Enabling Asymmetric SMT scheduling\n");
> >                  static_branch_enable(&powerpc_asym_packing);
> > +        } else if (is_shared_processor() && has_big_cores) {
> > +                static_branch_enable(&powerpc_asym_packing);
> >          }
> >
> >  #ifdef CONFIG_SCHED_SMT
> > --
> > 2.31.1
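
To illustrate the kind of comment asked for on powerpc_shared_proc_flags(), here is a rough stand-alone sketch. It is not the patch itself: the flag value and the asym_packing_enabled variable are placeholders so it compiles in user space (the patch uses the powerpc_asym_packing static key and the kernel's own SD_ASYM_PACKING definition), and the comment wording only restates the rationale given in the changelog above.

```c
#include <stdbool.h>

/* Placeholder bit for illustration only, NOT the kernel's definition. */
#define SD_ASYM_PACKING 0x1

/* Hypothetical stand-in for the powerpc_asym_packing static key. */
static bool asym_packing_enabled;

/*
 * On shared processor LPARs with big cores, the hypervisor can have more
 * virtual cores to handle than physical cores, and PowerVM schedules at
 * core level. Packing runnable threads onto fewer cores leaves the
 * hypervisor fewer virtual cores to switch between, so request
 * asymmetric packing at the MC and DIE levels as well.
 */
static int powerpc_shared_proc_flags(void)
{
	if (asym_packing_enabled)
		return SD_ASYM_PACKING;

	return 0;
}
```

Whether the comment belongs on this helper or next to the static_branch_enable() call in fixup_topology() is a judgment call; the point is only that the "why" should be findable from the code.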