Hi Thomas,

Thank you for the review. Here are numbers from a 400-CPU system that I
had from a while back.

Baseline: Linux 6.19.0-rc4-00310-g755bc1335e3b
On a PPC64 system with 400 CPUs:

SMT8 to SMT1:
    baseline:       real 1m14.792s
    baseline+patch: real 0m03.205s   # ~23x improvement

SMT1 to SMT8:
    baseline:       real 2m27.695s
    baseline+patch: real 0m02.510s   # ~58x improvement

Note: We observe huge improvements on a max-config system: switching
SMT states originally took approximately 1 hour; with grace periods
expedited it takes 5 to 6 minutes.

Analysis: why expediting GPs reduces the time to complete

By expediting the grace period, we force immediate IPI-driven quiescent
state detection across all CPUs rather than waiting lazily, which
dramatically reduces the time the calling thread remains blocked in
synchronize_rcu().

Why will holding cpus_write_lock() for the duration of the SMT switch
not work? [1] It causes hung-task timeout splats [2] because threads
are blocked on cpus_read_lock(). Expediting grace periods shrinks the
window but does not eliminate it.

I plan to drop this patch; the next version will carry only the
expedited RCU grace period change. I will also incorporate all your
other suggestions in the next version.

[1] https://lore.kernel.org/all/[email protected]/
[2] https://lore.kernel.org/all/[email protected]/

On Wed, Mar 25, 2026 at 08:09:17PM +0100, Thomas Gleixner wrote:
> On Wed, Feb 18 2026 at 14:09, Vishal Chourasia wrote:
> > From: Joel Fernandes <[email protected]>
> >
> > Bulk CPU hotplug operations, such as an SMT switch operation, requires
> > hotplugging multiple CPUs. The current implementation takes
> > cpus_write_lock() for each individual CPU, causing multiple slow grace
> > period requests.
> >
> > Introduce cpu_up_locked() and cpu_down_locked() that assume the caller
> > already holds cpus_write_lock(). The cpuhp_smt_enable() and
> > cpuhp_smt_disable() functions are updated to hold the lock once around
> > the entire loop, rather than for each individual CPU.
> >
> > Link:
> > https://lore.kernel.org/all/[email protected]/
> > Suggested-by: Peter Zijlstra <[email protected]>
> > Signed-off-by: Vishal Chourasia <[email protected]>
>
> You dropped Joel's Signed-off-by ....

Sorry for messing up the changelog w.r.t. the Signed-off-by tag. Will
take care of it in the future.

> > -/* Requires cpu_add_remove_lock to be held */
> > -static int __ref _cpu_down(unsigned int cpu, int tasks_frozen,
> > +/* Requires cpu_add_remove_lock and cpus_write_lock to be held */
> > +static int __ref cpu_down_locked(unsigned int cpu, int tasks_frozen,
> >                            enum cpuhp_state target)
>
> No line break required. You have 100 chars. If you still need one:
>
>   https://www.kernel.org/doc/html/latest/process/maintainer-tip.html

Ack.

> >          */
> >         if (cpumask_any_and(cpu_online_mask,
> >                             housekeeping_cpumask(HK_TYPE_DOMAIN)) >=
> >             nr_cpu_ids) {
> > -               ret = -EBUSY;
> > -               goto out;
> > +               return -EBUSY;
> >         }
>
> Please remove the brackets. They are no longer required. All over the
> place.

Ack.

> > +static int __ref _cpu_down(unsigned int cpu, int tasks_frozen,
> > +                          enum cpuhp_state target)
> > +{
> > +
> > +       int ret;
> > +       cpus_write_lock();
>
> Coding style...

Ack.

> > +       ret = cpu_down_locked(cpu, tasks_frozen, target);
> >         cpus_write_unlock();
> >         arch_smt_update();
> >         return ret;
> > @@ -2659,6 +2674,16 @@ int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
> >         int cpu, ret = 0;
> >
> >         cpu_maps_update_begin();
> > +       if (cpu_hotplug_offline_disabled) {
> > +               ret = -EOPNOTSUPP;
> > +               goto out;
> > +       }
> > +       if (cpu_hotplug_disabled) {
> > +               ret = -EBUSY;
> > +               goto out;
> > +       }
> > +       /* Hold cpus_write_lock() for entire batch operation. */
> > +       cpus_write_lock();
>
> .... for the entire ...
>
> And please visually separate things. Newlines exist for a reason.

Sure.

> Thanks,
>
>         tglx

Thanks and Regards!
Vishalc

