On Thu, Dec 05, 2019 at 02:02:17PM +0530, Srikar Dronamraju wrote:
> With commit 247f2f6f3c70 ("sched/core: Don't schedule threads on pre-empted
> vCPUs"), the scheduler avoids scheduling tasks on preempted vCPUs at wakeup.
> This leads to a wrong choice of CPU, which in turn leads to larger wakeup
> latencies. Eventually, it leads to performance regressions in latency-sensitive
> benchmarks like soltp, schbench etc.
>
> On powerpc, vcpu_is_preempted() only looks at yield_count. If the
> yield_count is odd, the vCPU is assumed to be preempted. However,
> yield_count is incremented whenever the LPAR enters CEDE state, so any
> vCPU that has entered CEDE state is assumed to be preempted.
>
> Even if a vCPU of a dedicated LPAR is preempted/donated, it should get the
> right of first use, since dedicated LPARs are supposed to own their vCPUs.
>
> On a Power9 system with 32 cores:
>
> # lscpu
> Architecture:        ppc64le
> Byte Order:          Little Endian
> CPU(s):              128
> On-line CPU(s) list: 0-127
> Thread(s) per core:  8
> Core(s) per socket:  1
> Socket(s):           16
> NUMA node(s):        2
> Model:               2.2 (pvr 004e 0202)
> Model name:          POWER9 (architected), altivec supported
> Hypervisor vendor:   pHyp
> Virtualization type: para
> L1d cache:           32K
> L1i cache:           32K
> L2 cache:            512K
> L3 cache:            10240K
> NUMA node0 CPU(s):   0-63
> NUMA node1 CPU(s):   64-127
>
> # perf stat -a -r 5 ./schbench
>
>           v5.4                          v5.4 + patch
> Latency percentiles (usec)      Latency percentiles (usec)
>       50.0000th: 45                   50.0000th: 39
>       75.0000th: 62                   75.0000th: 53
>       90.0000th: 71                   90.0000th: 67
>       95.0000th: 77                   95.0000th: 76
>      *99.0000th: 91                  *99.0000th: 89
>       99.5000th: 707                  99.5000th: 93
>       99.9000th: 6920                 99.9000th: 118
>       min=0, max=10048                min=0, max=211
>
> Latency percentiles (usec)      Latency percentiles (usec)
>       50.0000th: 45                   50.0000th: 34
>       75.0000th: 61                   75.0000th: 45
>       90.0000th: 72                   90.0000th: 53
>       95.0000th: 79                   95.0000th: 56
>      *99.0000th: 691                 *99.0000th: 61
>       99.5000th: 3972                 99.5000th: 63
>       99.9000th: 8368                 99.9000th: 78
>       min=0, max=16606                min=0, max=228
>
> Latency percentiles (usec)      Latency percentiles (usec)
>       50.0000th: 45                   50.0000th: 34
>       75.0000th: 61                   75.0000th: 45
>       90.0000th: 71                   90.0000th: 53
>       95.0000th: 77                   95.0000th: 57
>      *99.0000th: 106                 *99.0000th: 63
>       99.5000th: 2364                 99.5000th: 68
>       99.9000th: 7480                 99.9000th: 100
>       min=0, max=10001                min=0, max=134
>
> Latency percentiles (usec)      Latency percentiles (usec)
>       50.0000th: 45                   50.0000th: 34
>       75.0000th: 62                   75.0000th: 46
>       90.0000th: 72                   90.0000th: 53
>       95.0000th: 78                   95.0000th: 56
>      *99.0000th: 93                  *99.0000th: 61
>       99.5000th: 108                  99.5000th: 64
>       99.9000th: 6792                 99.9000th: 85
>       min=0, max=17681                min=0, max=121
>
> Latency percentiles (usec)      Latency percentiles (usec)
>       50.0000th: 46                   50.0000th: 33
>       75.0000th: 62                   75.0000th: 44
>       90.0000th: 73                   90.0000th: 51
>       95.0000th: 79                   95.0000th: 54
>      *99.0000th: 113                 *99.0000th: 61
>       99.5000th: 2724                 99.5000th: 64
>       99.9000th: 6184                 99.9000th: 82
>       min=0, max=9887                 min=0, max=121
>
> Performance counter stats for 'system wide' (5 runs):
>
>                            v5.4                      v5.4 + patch
> context-switches        43,373 ( +- 0.40% )       44,597 ( +- 0.55% )
> cpu-migrations           1,211 ( +- 5.04% )          220 ( +- 6.23% )
> page-faults             15,983 ( +- 5.21% )       15,360 ( +- 3.38% )
>
> Waiman Long suggested using static_keys.
>
> Reported-by: Parth Shah <pa...@linux.ibm.com>
> Reported-by: Ihor Pasichnyk <ihor.pasich...@ibm.com>
> Cc: Parth Shah <pa...@linux.ibm.com>
> Cc: Ihor Pasichnyk <ihor.pasich...@ibm.com>
> Cc: Juri Lelli <juri.le...@redhat.com>
> Cc: Phil Auld <pa...@redhat.com>
> Cc: Waiman Long <long...@redhat.com>
> Cc: Gautham R. Shenoy <e...@linux.vnet.ibm.com>
> Tested-by: Juri Lelli <juri.le...@redhat.com>
> Acked-by: Waiman Long <long...@redhat.com>
> Reviewed-by: Gautham R. Shenoy <e...@linux.vnet.ibm.com>
> Signed-off-by: Srikar Dronamraju <sri...@linux.vnet.ibm.com>
> ---
> Changelog v1 (https://patchwork.ozlabs.org/patch/1204190/) -> v3:
> Code is now under CONFIG_PPC_SPLPAR as it depends on CONFIG_PPC_PSERIES.
> This was suggested by Waiman Long.
>
>  arch/powerpc/include/asm/spinlock.h | 5 +++--
>  arch/powerpc/mm/numa.c              | 4 ++++
>  2 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h
> index e9a960e28f3c..de817c25deff 100644
> --- a/arch/powerpc/include/asm/spinlock.h
> +++ b/arch/powerpc/include/asm/spinlock.h
> @@ -35,11 +35,12 @@
>  #define LOCK_TOKEN	1
>  #endif
>
> -#ifdef CONFIG_PPC_PSERIES
> +#ifdef CONFIG_PPC_SPLPAR
> +DECLARE_STATIC_KEY_FALSE(shared_processor);
>  #define vcpu_is_preempted vcpu_is_preempted
>  static inline bool vcpu_is_preempted(int cpu)
>  {
> -	if (!firmware_has_feature(FW_FEATURE_SPLPAR))
> +	if (!static_branch_unlikely(&shared_processor))
>  		return false;
>  	return !!(be32_to_cpu(lppaca_of(cpu).yield_count) & 1);
>  }
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index 50d68d21ddcc..ffb971f3a63c 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -1568,9 +1568,13 @@ int prrn_is_enabled(void)
>  	return prrn_enabled;
>  }
>
> +DEFINE_STATIC_KEY_FALSE(shared_processor);
> +EXPORT_SYMBOL_GPL(shared_processor);
> +
>  void __init shared_proc_topology_init(void)
>  {
>  	if (lppaca_shared_proc(get_lppaca())) {
> +		static_branch_enable(&shared_processor);
>  		bitmap_fill(cpumask_bits(&cpu_associativity_changes_mask),
>  			    nr_cpumask_bits);
>  		numa_update_cpu_topology(false);
> --
> 2.18.1
This looks good to me, thanks Srikar.

Acked-by: Phil Auld <pa...@redhat.com>

--