On Sun, Mar 30, 2025 at 01:17:34PM +0530, Madadi Vineeth Reddy wrote:
> Commit 030bdc3fd080 ("powerpc/defconfigs: Set HZ=100 on pseries and ppc64
> defconfigs") lowered CONFIG_HZ from 250 to 100, citing reduced need for a
> higher tick rate due to high-resolution timers and concerns about timer
> interrupt overhead and cascading effects in the timer wheel.
>
> However, improvements have been made to the timer wheel algorithm since
> then, particularly in eliminating cascading effects at the cost of minor
> timekeeping inaccuracies. More details are available here
> https://lwn.net/Articles/646950/. This removes the original concern about
> cascading, and the reliance on high-resolution timers is not applicable
> to the scheduler, which still depends on periodic ticks set by CONFIG_HZ.
>
> With the introduction of the EEVDF scheduler, users can specify custom
> slices for workloads. The default base_slice is 3ms, but with CONFIG_HZ=100
> (10ms tick interval), base_slice is ineffective. Workloads like stress-ng
> that do not voluntarily yield the CPU run for ~10ms before switching out.
> Additionally, setting a custom slice below 3ms (e.g., 2ms) should lower
> task latency, but this effect is lost due to the coarse 10ms tick.
>
> By increasing CONFIG_HZ to 1000 (1ms tick), base_slice is properly honored,
> and user-defined slices work as expected. Benchmark results support this
> change:
>
> Latency improvements in schbench with EEVDF under stress-ng-induced noise:
>
> Scheduler        CONFIG_HZ  Custom Slice  99th Percentile Latency (µs)
> --------------------------------------------------------------------
> EEVDF            1000       No            0.30x
> EEVDF            1000       2 ms          0.29x
> EEVDF (default)  100        No            1.00x
>

NIT: default value on top would be a little less confusing.

> Switching to HZ=1000 reduces the 99th percentile latency in schbench by
> ~70%. This improvement occurs because, with HZ=1000, stress-ng tasks run
> for ~3ms before yielding, compared to ~10ms with HZ=100.
> As a result, schbench gets CPU time sooner, reducing its latency.
>
> Daytrader Performance:
>
> Daytrader results show minor variation within standard deviation,
> indicating no significant regression.
>
> Workload (Users/Instances)   Throughput 1000HZ vs 100HZ (Std Dev%)
> --------------------------------------------------------------------------
> 30 u, 1 i                    +3.01% (1.62%)
> 60 u, 1 i                    +1.46% (2.69%)
> 90 u, 1 i                    -1.33% (3.09%)
> 30 u, 2 i                    -1.20% (1.71%)
> 30 u, 3 i                    -0.07% (1.33%)
>
> Avg. Response Time: No Change (=)
>
> pgbench select queries:
>
> Metric                       1000HZ vs 100HZ (Std Dev%)
> ------------------------------------------------------------------
> Average TPS Change           +2.16% (1.27%)
> Average Latency Change       -2.21% (1.21%)
>
> Average TPS: Higher the better
> Average Latency: Lower the better
>
> pgbench shows both throughput and latency improvements beyond standard
> deviation.
>
> Given these results and the improvements in timer wheel implementation,
> increasing CONFIG_HZ to 1000 ensures that powerpc benefits from EEVDF’s
> base_slice and allows fine-tuned scheduling for latency-sensitive
> workloads.
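
For anyone following along, the tick arithmetic behind the base_slice
argument can be sketched as below. This is only a back-of-the-envelope
model in Python, not kernel code: it assumes a CPU-bound task is preempted
only on a periodic tick (no HRTICK, no wakeup preemption), and the helper
names tick_us() and effective_slice_us() are made up for illustration.

```python
# Simplified model (assumption): a CPU-bound task that never yields is only
# switched out on a periodic tick, so its requested slice is effectively
# rounded up to the next multiple of the tick interval.

def tick_us(hz: int) -> int:
    """Tick interval in microseconds for a given CONFIG_HZ value."""
    return 1_000_000 // hz


def effective_slice_us(slice_us: int, hz: int) -> int:
    """Round the requested slice up to a whole number of ticks."""
    tick = tick_us(hz)
    ticks_needed = -(-slice_us // tick)  # integer ceiling division
    return ticks_needed * tick


if __name__ == "__main__":
    # 3000 us is the default base_slice cited in the commit message;
    # 2000 us is the custom slice used in the schbench run.
    for hz in (100, 250, 1000):
        for slice_us in (3000, 2000):
            print(
                f"HZ={hz:4d} tick={tick_us(hz):5d}us "
                f"slice={slice_us}us -> runs ~{effective_slice_us(slice_us, hz)}us"
            )
```

Under that assumption a 3ms slice rounds up to 10ms at HZ=100 and stays at
3ms at HZ=1000, which lines up with the ~10ms vs ~3ms run times quoted
above. A 2ms custom slice is only distinguishable from the default once the
tick is 1ms.
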
>
> Signed-off-by: Madadi Vineeth Reddy <vinee...@linux.ibm.com>
> ---
>  arch/powerpc/configs/powernv_defconfig | 2 +-
>  arch/powerpc/configs/ppc64_defconfig   | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/configs/powernv_defconfig b/arch/powerpc/configs/powernv_defconfig
> index 6b6d7467fecf..8abf17d26b3a 100644
> --- a/arch/powerpc/configs/powernv_defconfig
> +++ b/arch/powerpc/configs/powernv_defconfig
> @@ -46,7 +46,7 @@ CONFIG_CPU_FREQ_GOV_POWERSAVE=y
>  CONFIG_CPU_FREQ_GOV_USERSPACE=y
>  CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
>  CONFIG_CPU_IDLE=y
> -CONFIG_HZ_100=y
> +CONFIG_HZ_1000=y
>  CONFIG_BINFMT_MISC=m
>  CONFIG_PPC_TRANSACTIONAL_MEM=y
>  CONFIG_PPC_UV=y
> diff --git a/arch/powerpc/configs/ppc64_defconfig b/arch/powerpc/configs/ppc64_defconfig
> index 5fa154185efa..45d437e4c62e 100644
> --- a/arch/powerpc/configs/ppc64_defconfig
> +++ b/arch/powerpc/configs/ppc64_defconfig
> @@ -57,7 +57,7 @@ CONFIG_CPU_FREQ_GOV_POWERSAVE=y
>  CONFIG_CPU_FREQ_GOV_USERSPACE=y
>  CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
>  CONFIG_CPU_FREQ_PMAC64=y
> -CONFIG_HZ_100=y
> +CONFIG_HZ_1000=y
>  CONFIG_PPC_TRANSACTIONAL_MEM=y
>  CONFIG_KEXEC=y
>  CONFIG_KEXEC_FILE=y
> --
> 2.47.0
>

LGTM

Reviewed-by: Mukesh Kumar Chaurasiya <mchau...@linux.ibm.com>