On 23.05.2023 08:31, Henry Wang wrote:
>> -----Original Message-----
>> Subject: Re: [PATCH v4 09/17] xen/arm: introduce a helper to parse device
>> tree NUMA distance map
>>
>>>>>>> +        /* The default value in node_distance_map is
>>>> NUMA_NO_DISTANCE
>>>>>> */
>>>>>>> +        if ( opposite == NUMA_NO_DISTANCE )
>>>>>>
>>>>>> And the matrix you're reading from can't hold NUMA_NO_DISTANCE
>>>> entries?
>>>>>> I ask because you don't check this above; you only check against
>>>>>> NUMA_LOCAL_DISTANCE.
>>>>>
>>>>> My understanding for the purpose of this part of code is to check if the
>>>> opposite
>>>>> way distance has already been set, so we need to compare the opposite
>>>> way
>>>>> distance with the default value NUMA_NO_DISTANCE here.
>>>>>
>>>>> Back to your question, I can see your point of the question. However I
>> don't
>>>> think
>>>>> NUMA_NO_DISTANCE is a valid value to describe the node distance in the
>>>> device
>>>>> tree. This is because I hunted down the previous discussions and found
>> [2]
>>>> about
>>>>> we should try to keep consistent between the value used in device tree
>> and
>>>> ACPI
>>>>> tables. From the ACPI spec, 0xFF, i.e. NUMA_NO_DISTANCE means
>>>> unreachable.
>>>>> I think this is also the reason why NUMA_NO_DISTANCE can be used as
>> the
>>>> default
>>>>> value of the distance map, otherwise we won't have any value to use.
>>>>
>>>> The [2] link you provided discusses NUMA_LOCAL_DISTANCE.
>>>
>>> I inferred the discussion as "we should try to keep consistent between the
>> value
>>> used in device tree and ACPI tables". Maybe my inference is wrong.
>>>
>>>> Looking at
>>>> Linux'es Documentation/devicetree/numa.txt, there's no mention of an
>>>> upper bound on the distance values. It only says that on the diagonal
>>>> entries should be 10 (i.e. matching ACPI, without really saying so).
>>>
>>> I agree that the NUMA device tree binding is a little bit vague. So I cannot
>>> say the case that you provided is not valid. I would like to ask Arm
>> maintainers
>>> (putting them into To:) opinion on this as I think I am not the one to 
>>> decide
>> the
>>> expected behavior on Arm.
>>>
>>> Bertrand/Julien/Stefano: Would you please kindly share your opinion on
>> which
>>> value should be used as the default value of the node distance map? Do
>> you
>>> think reusing the "unreachable" distance, i.e. 0xFF, as the default node
>> distance
>>> is acceptable here? Thanks!
>>
>> My suggestion would be to, rather than rejecting values >= 0xff, to saturate
>> at 0xfe, while keeping 0xff for NUMA_NO_DISTANCE (and overall keeping
>> things
>> consistent with ACPI).
> 
> Since it has been a while and there were no feedback from Arm maintainers,
> I would like to follow your suggestions for v5. However while I was writing 
> the
> code for the "saturation", i.e., adding below snippet in numa_set_distance():
> ```
> if ( distance > NUMA_NO_DISTANCE )
>         distance = NUMA_MAX_DISTANCE;
> ```
> I noticed an issue which needs your clarification:
> Currently, the default value of the distance map is NUMA_NO_DISTANCE,
> which indicates the nodes are not reachable. IMHO, if in device tree, we do
> saturations like above for ACPI invalid distances (distances >= 0xff), by 
> saturating
> the distance to 0xfe, we are making the unreachable nodes to reachable. I am
> not sure if this will lead to problems. Do you have any more thoughts? Thanks!

I can only answer this with a question back: How is "unreachable" represented
in DT? Or is "unreachable" simply expressed by the absence of any data?

Jan

Reply via email to