Srikar Dronamraju <sri...@linux.vnet.ibm.com> writes:
> Currently the kernel detects if its running on a shared lpar platform
> and requests home node associativity before the scheduler sched_domains
> are setup. However between the time NUMA setup is initialized and the
> request for home node associativity, workqueue initializes its per node
> cpumask. The per node workqueue possible cpumask may turn invalid
> after home node associativity resulting in weird situations like
> workqueue possible cpumask being a subset of workqueue online cpumask.
>
> This can be fixed by requesting home node associativity earlier just
> before NUMA setup. However at the NUMA setup time, kernel may not be in
> a position to detect if its running on a shared lpar platform. So
> request for home node associativity and if the request fails, fallback
> on the device tree property.
>
> Signed-off-by: Srikar Dronamraju <sri...@linux.vnet.ibm.com>
> Cc: Michael Ellerman <m...@ellerman.id.au>
> Cc: Nicholas Piggin <npig...@gmail.com>
> Cc: Nathan Lynch <nath...@linux.ibm.com>
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: Abdul Haleem <abdha...@linux.vnet.ibm.com>
> Cc: Satheesh Rajendran <sathn...@linux.vnet.ibm.com>
> Reported-by: Abdul Haleem <abdha...@linux.vnet.ibm.com>
> Reviewed-by: Nathan Lynch <nath...@linux.ibm.com>
> ---
> Changelog (v2->v3):
> - Handled comments from Nathan Lynch
>   * Use first thread of the core for cpu-to-node map.
>   * get hardware-id in numa_setup_cpu
>
> Changelog (v1->v2):
> - Handled comments from Nathan Lynch
>   * Dont depend on pacas to be setup for the hwid
>
>
>  arch/powerpc/mm/numa.c | 45 +++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 40 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index 63ec0c3c817f..f837a0e725bc 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -461,13 +461,27 @@ static int of_drconf_to_nid_single(struct drmem_lmb *lmb)
>       return nid;
>  }
>  
> +static int vphn_get_nid(long hwid)
> +{
> +     __be32 associativity[VPHN_ASSOC_BUFSIZE] = {0};
> +     long rc;
> +
> +     rc = hcall_vphn(hwid, VPHN_FLAG_VCPU, associativity);

This breaks the build for some defconfigs.

e.g. ppc64_book3e_allmodconfig:

  arch/powerpc/mm/numa.c: In function ‘vphn_get_nid’:
  arch/powerpc/mm/numa.c:469:7: error: implicit declaration of function ‘hcall_vphn’ [-Werror=implicit-function-declaration]
    469 |  rc = hcall_vphn(hwid, VPHN_FLAG_VCPU, associativity);
        |       ^~~~~~~~~~

It needs to be inside #ifdef CONFIG_PPC_SPLPAR.
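
One possible shape for that guard, as a minimal user-space sketch rather than the actual kernel change: provide the real function under CONFIG_PPC_SPLPAR and a stub that returns NUMA_NO_NODE otherwise, so callers fall through to the device tree path. The constant values and mock_hcall_vphn() below are stand-ins invented for illustration; they are not the kernel's definitions.

```c
/* Stand-ins for kernel definitions; values here are illustrative only. */
#define NUMA_NO_NODE (-1)
#define H_SUCCESS 0

#ifdef CONFIG_PPC_SPLPAR
/*
 * Hypothetical mock of the hypervisor call. It pretends that every
 * hardware id belongs to node hwid / 8 and always succeeds.
 */
static long mock_hcall_vphn(long hwid, int *nid_out)
{
	*nid_out = (int)(hwid / 8);
	return H_SUCCESS;
}

static int vphn_get_nid(long hwid)
{
	int nid;

	if (mock_hcall_vphn(hwid, &nid) == H_SUCCESS)
		return nid;

	return NUMA_NO_NODE;
}
#else
/*
 * Without SPLPAR support there is no hcall to make, so report
 * "no node" and let the caller use the device tree property.
 */
static int vphn_get_nid(long hwid)
{
	(void)hwid;
	return NUMA_NO_NODE;
}
#endif
```

With a stub like this, the call site would not need its own #ifdef.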

> +     if (rc == H_SUCCESS)
> +             return associativity_to_nid(associativity);
> +
> +     return NUMA_NO_NODE;
> +}
> +
>  /*
>   * Figure out to which domain a cpu belongs and stick it there.
> + * cpu_to_phys_id is only valid between smp_setup_cpu_maps() and
> + * smp_setup_pacas(). If called outside this window, set get_hwid to true.
>   * Return the id of the domain used.
>   */
> -static int numa_setup_cpu(unsigned long lcpu)
> +static int numa_setup_cpu(unsigned long lcpu, bool get_hwid)

I really dislike this bool.

> @@ -485,6 +499,27 @@ static int numa_setup_cpu(unsigned long lcpu)
>               return nid;
>       }
>  
> +     /*
> +      * On a shared lpar, device tree will not have node associativity.
> +      * At this time lppaca, or its __old_status field may not be
> +      * updated. Hence kernel cannot detect if its on a shared lpar. So
> +      * request an explicit associativity irrespective of whether the
> +      * lpar is shared or dedicated. Use the device tree property as a
> +      * fallback.
> +      */
> +     if (firmware_has_feature(FW_FEATURE_VPHN)) {
> +             long hwid;
> +
> +             if (get_hwid)
> +                     hwid = get_hard_smp_processor_id(lcpu);
> +             else
> +                     hwid = cpu_to_phys_id[lcpu];

This should move inside vphn_get_nid(), and just do:

        if (cpu_to_phys_id)
                hwid = cpu_to_phys_id[lcpu];
        else
                hwid = get_hard_smp_processor_id(lcpu);
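
To illustrate why the NULL check can replace the bool, here is a small user-space sketch. It assumes cpu_to_phys_id is NULL whenever it is outside its validity window; lookup_hwid() and mock_get_hard_smp_processor_id() are hypothetical names used only for this example.

```c
#include <stddef.h>

/* Stand-in for the early-boot array; assumed NULL when not valid. */
static unsigned int *cpu_to_phys_id;

/* Hypothetical mock of the paca-based lookup: returns 100 + lcpu. */
static long mock_get_hard_smp_processor_id(long lcpu)
{
	return 100 + lcpu;
}

/*
 * The pointer itself says which source is valid, so the caller
 * does not have to pass a flag.
 */
static long lookup_hwid(long lcpu)
{
	if (cpu_to_phys_id)
		return cpu_to_phys_id[lcpu];

	return mock_get_hard_smp_processor_id(lcpu);
}
```

Callers then pass only lcpu and the helper picks the right source by itself.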


> +             nid = vphn_get_nid(hwid);
> +     }
> +
> +     if (nid != NUMA_NO_NODE)
> +             goto out_present;
> +
>       cpu = of_get_cpu_node(lcpu, NULL);


cheers
