Hi Barry,

On 2021/3/21 6:14, Barry Song wrote:
> update_idle_core() is only done for the case of sched_smt_present.
> but test_idle_cores() is done for all machines even those without
> smt.

The patch looks good to me.
May I know for what case we need to keep CONFIG_SCHED_SMT for non-smt
machines?

Thanks,
-Aubrey

> this could contribute to up 8%+ hackbench performance loss on a
> machine like kunpeng 920 which has no smt. this patch removes the
> redundant test_idle_cores() for non-smt machines.
>
> we run the below hackbench with different -g parameter from 2 to
> 14, for each different g, we run the command 10 times and get the
> average time:
> $ numactl -N 0 hackbench -p -T -l 20000 -g $1
>
> hackbench will report the time which is needed to complete a certain
> number of messages transmissions between a certain number of tasks,
> for example:
> $ numactl -N 0 hackbench -p -T -l 20000 -g 10
> Running in threaded mode with 10 groups using 40 file descriptors each
> (== 400 tasks)
> Each sender will pass 20000 messages of 100 bytes
>
> The below is the result of hackbench w/ and w/o this patch:
> g=   2       4       6       8       10      12      14
> w/o: 1.8151  3.8499  5.5142  7.2491  9.0340  10.7345 12.0929
> w/ : 1.8428  3.7436  5.4501  6.9522  8.2882  9.9535  11.3367
>                              +4.1%   +8.3%   +7.3%   +6.3%
>
> Signed-off-by: Barry Song <song.bao....@hisilicon.com>
> ---
>  kernel/sched/fair.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2e2ab1e..de42a32 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6038,9 +6038,11 @@ static inline bool test_idle_cores(int cpu, bool def)
>  {
>  	struct sched_domain_shared *sds;
>
> -	sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
> -	if (sds)
> -		return READ_ONCE(sds->has_idle_cores);
> +	if (static_branch_likely(&sched_smt_present)) {
> +		sds = rcu_dereference(per_cpu(sd_llc_shared, cpu));
> +		if (sds)
> +			return READ_ONCE(sds->has_idle_cores);
> +	}
>
>  	return def;
>  }
>