On Fri, May 09, 2025 at 11:32:13AM +0800, Z qiang wrote: > > > > On Wed, May 07, 2025 at 07:26:05PM +0800, Zqiang wrote: > > > In the preparation stage of CPU online, if the corresponding > > > the rdp's->nocb_cb_kthread does not exist, will be created, > > > there is a situation where the rdp's rcuop kthreads creation fails, > > > and then de-offload this CPU's rdp, does not assign this CPU's > > > rdp->nocb_cb_kthread pointer, but this rdp's->nocb_gp_rdp and > > > rdp's->rdp_gp->nocb_gp_kthread is still valid. > > > > > > This will cause the subsequent re-offload operation of this offline > > > CPU, which will pass the conditional check and the kthread_unpark() > > > will access invalid rdp's->nocb_cb_kthread pointer. > > > > > > This commit therefore use rdp's->nocb_gp_kthread instead of > > > rdp_gp's->nocb_gp_kthread for safety check. > > > > Let's see... > > > > The rcu_nocb_cpu_offload() and rcu_nocb_cpu_deoffload() functions invoke > > cpus_read_lock(), and thus exclude all the CPU-hotplug notifiers, > > including the one that invokes rcutree_prepare_cpu(). There is also > > rcu_spawn_gp_kthread(), but that is an early_initcall() that happens > > before CPU hotplug can happen, at least for non-boot CPUs. > > > > So rcu_spawn_cpu_nocb_kthread() cannot run concurrently with either > > rcu_nocb_cpu_offload() or rcu_nocb_cpu_deoffload(), correct? > > Yes, the rcutree_prepare_cpu() is invoked under the cpus_write_lock() > protection.
Very good! > > It appears that all CPUs (try to) create their rcuoc and rcuog kthreads > > when they come online, regardless of the nohz_full and rcu_nocbs kernel > > boot parameters, some old tree_nocb.h comments notwithstanding. ;-) The > > rcu_organize_nocb_kthreads() function looks to cover all CPUs as well, > > so ->nocb_gp_rdp will always be set after very early boot (give or take > > alloc_bootmem_cpumask_var() failure in rcu_nocb_setup() and checked for > > by the cpumask_available() in rcu_organize_nocb_kthreads()). > > > > The rcu_spawn_cpu_nocb_kthread() can fail to spawn the GP kthread, > > in which case both ->nocb_cb_kthread and ->nocb_gp_kthread remain > > NULL. > > This is a low probability event, but it is possible, if this happens, > and we test it with rcutorture configured with parameters > nocbs_toggle and onoff_interval, it will trigger a null ptr access. Understood, and that might be another patch if you would like to persue it. Thanx, Paul > Thanks > Zqiang > > > > > > If so, LGTM. > > > > Thanx, Paul > > > > > Signed-off-by: Zqiang <qiang.zhang1...@gmail.com> > > > --- > > > kernel/rcu/tree_nocb.h | 5 ++--- > > > 1 file changed, 2 insertions(+), 3 deletions(-) > > > > > > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h > > > index 1596812f7f12..6679140bb0b5 100644 > > > --- a/kernel/rcu/tree_nocb.h > > > +++ b/kernel/rcu/tree_nocb.h > > > @@ -1146,7 +1146,6 @@ static bool rcu_nocb_rdp_offload_wait_cond(struct > > > rcu_data *rdp) > > > static int rcu_nocb_rdp_offload(struct rcu_data *rdp) > > > { > > > int wake_gp; > > > - struct rcu_data *rdp_gp = rdp->nocb_gp_rdp; > > > > > > WARN_ON_ONCE(cpu_online(rdp->cpu)); > > > /* > > > @@ -1156,7 +1155,7 @@ static int rcu_nocb_rdp_offload(struct rcu_data > > > *rdp) > > > if (!rdp->nocb_gp_rdp) > > > return -EINVAL; > > > > > > - if (WARN_ON_ONCE(!rdp_gp->nocb_gp_kthread)) > > > + if (WARN_ON_ONCE(!rdp->nocb_gp_kthread)) > > > return -EINVAL; > > > > > > pr_info("Offloading %d\n", rdp->cpu); > > > @@ -1166,7 +1165,7 @@ static int rcu_nocb_rdp_offload(struct rcu_data > > > *rdp) > > > > > > wake_gp = rcu_nocb_queue_toggle_rdp(rdp); > > > if (wake_gp) > > > - wake_up_process(rdp_gp->nocb_gp_kthread); > > > + wake_up_process(rdp->nocb_gp_kthread); > > > > > > swait_event_exclusive(rdp->nocb_state_wq, > > > rcu_nocb_rdp_offload_wait_cond(rdp)); > > > -- > > > 2.17.1 > > >