On Wed, May 07, 2025 at 07:26:05PM +0800, Zqiang wrote:
> In the preparation stage of CPU online, if the corresponding
> the rdp's->nocb_cb_kthread does not exist, will be created,
> there is a situation where the rdp's rcuop kthreads creation fails,
> and then de-offload this CPU's rdp, does not assign this CPU's
> rdp->nocb_cb_kthread pointer, but this rdp's->nocb_gp_rdp and
> rdp's->rdp_gp->nocb_gp_kthread is still valid.
> 
> This will cause the subsequent re-offload operation of this offline
> CPU, which will pass the conditional check and the kthread_unpark()
> will access invalid rdp's->nocb_cb_kthread pointer.
> 
> This commit therefore use rdp's->nocb_gp_kthread instead of
> rdp_gp's->nocb_gp_kthread for safety check.

Let's see...

The rcu_nocb_cpu_offload() and rcu_nocb_cpu_deoffload() functions invoke
cpus_read_lock(), and thus exclude all the CPU-hotplug notifiers,
including the one that invokes rcutree_prepare_cpu().  There is also
rcu_spawn_gp_kthread(), but that is an early_initcall() that happens
before CPU hotplug can happen, at least for non-boot CPUs.

So rcu_spawn_cpu_nocb_kthread() cannot run concurrently with either
rcu_nocb_cpu_offload() or rcu_nocb_cpu_deoffload(), correct?

It appears that all CPUs (try to) create their rcuoc and rcuog kthreads
when they come online, regardless of the nohz_full and rcu_nocbs kernel
boot parameters, some old tree_nocb.h comments notwithstanding.  ;-) The
rcu_organize_nocb_kthreads() function looks to cover all CPUs as well,
so ->nocb_gp_rdp will always be set after very early boot (give or take
alloc_bootmem_cpumask_var() failure in rcu_nocb_setup() and checked for
by the cpumask_available() in rcu_organize_nocb_kthreads()).

The rcu_spawn_cpu_nocb_kthread() can fail to spawn the GP kthread,
in which case both ->nocb_cb_kthread and ->nocb_gp_kthread remain
NULL.

If so, LGTM.

                                                        Thanx, Paul

> Signed-off-by: Zqiang <qiang.zhang1...@gmail.com>
> ---
>  kernel/rcu/tree_nocb.h | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index 1596812f7f12..6679140bb0b5 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -1146,7 +1146,6 @@ static bool rcu_nocb_rdp_offload_wait_cond(struct 
> rcu_data *rdp)
>  static int rcu_nocb_rdp_offload(struct rcu_data *rdp)
>  {
>       int wake_gp;
> -     struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
>  
>       WARN_ON_ONCE(cpu_online(rdp->cpu));
>       /*
> @@ -1156,7 +1155,7 @@ static int rcu_nocb_rdp_offload(struct rcu_data *rdp)
>       if (!rdp->nocb_gp_rdp)
>               return -EINVAL;
>  
> -     if (WARN_ON_ONCE(!rdp_gp->nocb_gp_kthread))
> +     if (WARN_ON_ONCE(!rdp->nocb_gp_kthread))
>               return -EINVAL;
>  
>       pr_info("Offloading %d\n", rdp->cpu);
> @@ -1166,7 +1165,7 @@ static int rcu_nocb_rdp_offload(struct rcu_data *rdp)
>  
>       wake_gp = rcu_nocb_queue_toggle_rdp(rdp);
>       if (wake_gp)
> -             wake_up_process(rdp_gp->nocb_gp_kthread);
> +             wake_up_process(rdp->nocb_gp_kthread);
>  
>       swait_event_exclusive(rdp->nocb_state_wq,
>                             rcu_nocb_rdp_offload_wait_cond(rdp));
> -- 
> 2.17.1
> 

Reply via email to