Srikar Dronamraju <sri...@linux.vnet.ibm.com> writes:
> Currently the kernel detects if its running on a shared lpar platform
> and requests home node associativity before the scheduler sched_domains
> are setup. However between the time NUMA setup is initialized and the
> request for home node associativity, workqueue initializes its per node
> cpumask. The per node workqueue possible cpumask may turn invalid
> after home node associativity resulting in weird situations like
> workqueue possible cpumask being a subset of workqueue online cpumask.
>
> This can be fixed by requesting home node associativity earlier just
> before NUMA setup. However at the NUMA setup time, kernel may not be in
> a position to detect if its running on a shared lpar platform. So
> request for home node associativity and if the request fails, fallback
> on the device tree property.
>
> Signed-off-by: Srikar Dronamraju <sri...@linux.vnet.ibm.com>
> Cc: Michael Ellerman <m...@ellerman.id.au>
> Cc: Nicholas Piggin <npig...@gmail.com>
> Cc: Nathan Lynch <nath...@linux.ibm.com>
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: Abdul Haleem <abdha...@linux.vnet.ibm.com>
> Cc: Satheesh Rajendran <sathn...@linux.vnet.ibm.com>
> Reported-by: Abdul Haleem <abdha...@linux.vnet.ibm.com>
> Reviewed-by: Nathan Lynch <nath...@linux.ibm.com>
> ---
> Changelog (v2->v3):
> - Handled comments from Nathan Lynch
>   * Use first thread of the core for cpu-to-node map.
>   * get hardware-id in numa_setup_cpu
>
> Changelog (v1->v2):
> - Handled comments from Nathan Lynch
>   * Dont depend on pacas to be setup for the hwid
>
>
>  arch/powerpc/mm/numa.c | 45 +++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 40 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index 63ec0c3c817f..f837a0e725bc 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -461,13 +461,27 @@ static int of_drconf_to_nid_single(struct drmem_lmb *lmb)
>  	return nid;
>  }
>
> +static int vphn_get_nid(long hwid)
> +{
> +	__be32 associativity[VPHN_ASSOC_BUFSIZE] = {0};
> +	long rc;
> +
> +	rc = hcall_vphn(hwid, VPHN_FLAG_VCPU, associativity);

This breaks the build for some defconfigs, e.g. ppc64_book3e_allmodconfig:

  arch/powerpc/mm/numa.c: In function ‘vphn_get_nid’:
  arch/powerpc/mm/numa.c:469:7: error: implicit declaration of function ‘hcall_vphn’ [-Werror=implicit-function-declaration]
    469 |  rc = hcall_vphn(hwid, VPHN_FLAG_VCPU, associativity);
        |       ^~~~~~~~~~

It needs to be inside #ifdef CONFIG_PPC_SPLPAR.

> +	if (rc == H_SUCCESS)
> +		return associativity_to_nid(associativity);
> +
> +	return NUMA_NO_NODE;
> +}
> +
>  /*
>   * Figure out to which domain a cpu belongs and stick it there.
> + * cpu_to_phys_id is only valid between smp_setup_cpu_maps() and
> + * smp_setup_pacas(). If called outside this window, set get_hwid to true.
>   * Return the id of the domain used.
>   */
> -static int numa_setup_cpu(unsigned long lcpu)
> +static int numa_setup_cpu(unsigned long lcpu, bool get_hwid)

I really dislike this bool.

> @@ -485,6 +499,27 @@ static int numa_setup_cpu(unsigned long lcpu)
>  		return nid;
>  	}
>
> +	/*
> +	 * On a shared lpar, device tree will not have node associativity.
> +	 * At this time lppaca, or its __old_status field may not be
> +	 * updated. Hence kernel cannot detect if its on a shared lpar. So
> +	 * request an explicit associativity irrespective of whether the
> +	 * lpar is shared or dedicated. Use the device tree property as a
> +	 * fallback.
> +	 */
> +	if (firmware_has_feature(FW_FEATURE_VPHN)) {
> +		long hwid;
> +
> +		if (get_hwid)
> +			hwid = get_hard_smp_processor_id(lcpu);
> +		else
> +			hwid = cpu_to_phys_id[lcpu];

This should move inside vphn_get_nid(), and just do:

	if (cpu_to_phys_id)
		hwid = cpu_to_phys_id[lcpu];
	else
		hwid = get_hard_smp_processor_id(lcpu);

> +		nid = vphn_get_nid(hwid);
> +	}
> +
> +	if (nid != NUMA_NO_NODE)
> +		goto out_present;
> +
>  	cpu = of_get_cpu_node(lcpu, NULL);

cheers
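
P.S. For illustration, the two suggestions above (the #ifdef CONFIG_PPC_SPLPAR guard and choosing the hwid inside vphn_get_nid()) might combine roughly as in the user-space mock below. This is only an untested-in-kernel sketch: the stubbed hcall_vphn(), associativity_to_nid() and get_hard_smp_processor_id(), the constant values, mock_early_table and the "node = hwid / 8" mapping are all invented so it compiles outside the kernel. The non-SPLPAR stub that returns NUMA_NO_NODE is also an assumption about how the guard could be arranged. Only the shape of vphn_get_nid() is the point.

```c
/* User-space mock, NOT kernel code: every stub and constant is a placeholder. */
#include <stdint.h>

#define CONFIG_PPC_SPLPAR 1	/* pretend Kconfig enabled it */
#define NUMA_NO_NODE (-1)
#define H_SUCCESS 0
#define VPHN_FLAG_VCPU 1	/* placeholder value */
#define VPHN_ASSOC_BUFSIZE 8	/* placeholder size */

/* Early-boot table; a NULL pointer models "not set up / already freed". */
static int *cpu_to_phys_id;
static int mock_early_table[4] = { 0, 8, 16, -1 };

/* Stub for the paca-based lookup that is usable later in boot. */
static int get_hard_smp_processor_id(int lcpu)
{
	return 100 + lcpu;
}

#ifdef CONFIG_PPC_SPLPAR
/* Stub hypervisor call: fails for negative ids, else node = hwid / 8. */
static long hcall_vphn(long hwid, int flags, uint32_t *associativity)
{
	(void)flags;
	if (hwid < 0)
		return -1;
	associativity[0] = (uint32_t)(hwid / 8);
	return H_SUCCESS;
}

/* Stub: the mock keeps the node id in the first array element. */
static int associativity_to_nid(const uint32_t *associativity)
{
	return (int)associativity[0];
}

/* Takes the logical cpu and picks the hwid itself, so no bool is needed. */
static int vphn_get_nid(long lcpu)
{
	uint32_t associativity[VPHN_ASSOC_BUFSIZE] = {0};
	long hwid, rc;

	if (cpu_to_phys_id)
		hwid = cpu_to_phys_id[lcpu];
	else
		hwid = get_hard_smp_processor_id((int)lcpu);

	rc = hcall_vphn(hwid, VPHN_FLAG_VCPU, associativity);
	if (rc == H_SUCCESS)
		return associativity_to_nid(associativity);

	return NUMA_NO_NODE;
}
#else
/* Without SPLPAR there is no VPHN hcall; callers fall back to the device tree. */
static int vphn_get_nid(long lcpu)
{
	(void)lcpu;
	return NUMA_NO_NODE;
}
#endif
```

With cpu_to_phys_id left unset, vphn_get_nid() goes through get_hard_smp_processor_id(); once the pointer refers to a table, the table entry is used instead, and a failed hcall yields NUMA_NO_NODE so the caller can fall back to the device tree property.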