Hi Srikar,

Srikar Dronamraju <sri...@linux.vnet.ibm.com> writes:
> @@ -467,15 +467,20 @@ static int of_drconf_to_nid_single(struct drmem_lmb *lmb)
>   */
>  static int numa_setup_cpu(unsigned long lcpu)
>  {
> -     int nid = NUMA_NO_NODE;
>       struct device_node *cpu;
> +     int fcpu = cpu_first_thread_sibling(lcpu);
> +     int nid = NUMA_NO_NODE;
>  
>       /*
>        * If a valid cpu-to-node mapping is already available, use it
>        * directly instead of querying the firmware, since it represents
>        * the most recent mapping notified to us by the platform (eg: VPHN).
> +      * Since cpu_to_node binding remains the same for all threads in the
> +      * core. If a valid cpu-to-node mapping is already available, for
> +      * the first thread in the core, use it.
>        */
> -     if ((nid = numa_cpu_lookup_table[lcpu]) >= 0) {
> +     nid = numa_cpu_lookup_table[fcpu];
> +     if (nid >= 0) {
>               map_cpu_to_node(lcpu, nid);
>               return nid;
>       }

Yes, we need to do something like this to prevent a VPHN change that occurs
concurrently with onlining a core's threads from messing us up.

Is it a good assumption that the first thread of a sibling group will
have its mapping initialized first? I think the answer is yes for boot,
but hotplug... not so sure.


> @@ -496,6 +501,16 @@ static int numa_setup_cpu(unsigned long lcpu)
>       if (nid < 0 || !node_possible(nid))
>               nid = first_online_node;
>  
> +     /*
> +      * Update for the first thread of the core. All threads of a core
> +      * have to be part of the same node. This not only avoids querying
> +      * for every other thread in the core, but always avoids a case
> +      * where virtual node associativity change causes subsequent threads
> +      * of a core to be associated with different nid.
> +      */
> +     if (fcpu != lcpu)
> +             map_cpu_to_node(fcpu, nid);
> +

OK, I see that this somewhat addresses my concern above. But changing
this mapping for a remote cpu is unsafe except under specific
circumstances. I think this should first assert:

* numa_cpu_lookup_table[fcpu] == NUMA_NO_NODE
* cpu_online(fcpu) == false

to document and enforce the conditions that must hold for this to be OK.
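
To make the suggestion concrete, here is a minimal userspace model of the
guarded update, not kernel code. The table, the online map, the
THREADS_PER_CORE value and the helper names (model_reset,
first_thread_sibling, seed_first_thread) are all hypothetical stand-ins
chosen for illustration. In the actual patch the checks would presumably be
a WARN_ON() around the existing map_cpu_to_node(fcpu, nid) call, but that
is an assumption about how it would be written.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS          16
#define THREADS_PER_CORE 8      /* assumption made for this model only */
#define NUMA_NO_NODE     (-1)

/* Stand-ins for the kernel's lookup table and online mask. */
static int  numa_cpu_lookup_table[NR_CPUS];
static bool cpu_online_map[NR_CPUS];

/* Put every cpu in the "no mapping, offline" state. */
static void model_reset(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		numa_cpu_lookup_table[cpu] = NUMA_NO_NODE;
		cpu_online_map[cpu] = false;
	}
}

/* First thread of the core that 'cpu' belongs to. */
static int first_thread_sibling(int cpu)
{
	return cpu - (cpu % THREADS_PER_CORE);
}

/*
 * Record 'nid' for the first thread of lcpu's core, but only when that
 * is known to be safe: the first thread must have no mapping yet and
 * must be offline. Returns 0 if the table was updated or there was
 * nothing to do, -1 if a precondition was violated (table untouched).
 */
static int seed_first_thread(int lcpu, int nid)
{
	int fcpu = first_thread_sibling(lcpu);

	if (fcpu == lcpu)
		return 0;       /* caller maps lcpu itself */

	if (numa_cpu_lookup_table[fcpu] != NUMA_NO_NODE)
		return -1;      /* would overwrite an existing mapping */
	if (cpu_online_map[fcpu])
		return -1;      /* remote cpu is live; update is unsafe */

	numa_cpu_lookup_table[fcpu] = nid;
	return 0;
}
```

The point of returning an error instead of silently overwriting is that a
violated precondition leaves the existing mapping intact and becomes
visible to the caller.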
