On Wed, May 07, 2025 at 07:26:05PM +0800, Zqiang wrote: > In the preparation stage of CPU online, if the corresponding > the rdp's->nocb_cb_kthread does not exist, will be created, > there is a situation where the rdp's rcuop kthreads creation fails, > and then de-offload this CPU's rdp, does not assign this CPU's > rdp->nocb_cb_kthread pointer, but this rdp's->nocb_gp_rdp and > rdp's->rdp_gp->nocb_gp_kthread is still valid. > > This will cause the subsequent re-offload operation of this offline > CPU, which will pass the conditional check and the kthread_unpark() > will access invalid rdp's->nocb_cb_kthread pointer. > > This commit therefore use rdp's->nocb_gp_kthread instead of > rdp_gp's->nocb_gp_kthread for safety check.
Let's see... The rcu_nocb_cpu_offload() and rcu_nocb_cpu_deoffload() functions invoke cpus_read_lock(), and thus exclude all the CPU-hotplug notifiers, including the one that invokes rcutree_prepare_cpu(). There is also rcu_spawn_gp_kthread(), but that is an early_initcall() that happens before CPU hotplug can happen, at least for non-boot CPUs. So rcu_spawn_cpu_nocb_kthread() cannot run concurrently with either rcu_nocb_cpu_offload() or rcu_nocb_cpu_deoffload(), correct? It appears that all CPUs (try to) create their rcuoc and rcuog kthreads when they come online, regardless of the nohz_full and rcu_nocbs kernel boot parameters, some old tree_nocb.h comments notwithstanding. ;-) The rcu_organize_nocb_kthreads() function looks to cover all CPUs as well, so ->nocb_gp_rdp will always be set after very early boot (give or take alloc_bootmem_cpumask_var() failure in rcu_nocb_setup() and checked for by the cpumask_available() in rcu_organize_nocb_kthreads()). The rcu_spawn_cpu_nocb_kthread() can fail to spawn the GP kthread, in which case both ->nocb_cb_kthread and ->nocb_gp_kthread remain NULL. If so, LGTM. Thanx, Paul > Signed-off-by: Zqiang <qiang.zhang1...@gmail.com> > --- > kernel/rcu/tree_nocb.h | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h > index 1596812f7f12..6679140bb0b5 100644 > --- a/kernel/rcu/tree_nocb.h > +++ b/kernel/rcu/tree_nocb.h > @@ -1146,7 +1146,6 @@ static bool rcu_nocb_rdp_offload_wait_cond(struct > rcu_data *rdp) > static int rcu_nocb_rdp_offload(struct rcu_data *rdp) > { > int wake_gp; > - struct rcu_data *rdp_gp = rdp->nocb_gp_rdp; > > WARN_ON_ONCE(cpu_online(rdp->cpu)); > /* > @@ -1156,7 +1155,7 @@ static int rcu_nocb_rdp_offload(struct rcu_data *rdp) > if (!rdp->nocb_gp_rdp) > return -EINVAL; > > - if (WARN_ON_ONCE(!rdp_gp->nocb_gp_kthread)) > + if (WARN_ON_ONCE(!rdp->nocb_gp_kthread)) > return -EINVAL; > > pr_info("Offloading %d\n", rdp->cpu); > @@ -1166,7 +1165,7 @@ static int rcu_nocb_rdp_offload(struct rcu_data *rdp) > > wake_gp = rcu_nocb_queue_toggle_rdp(rdp); > if (wake_gp) > - wake_up_process(rdp_gp->nocb_gp_kthread); > + wake_up_process(rdp->nocb_gp_kthread); > > swait_event_exclusive(rdp->nocb_state_wq, > rcu_nocb_rdp_offload_wait_cond(rdp)); > -- > 2.17.1 >