On 2/7/2025 4:56 AM, Kim, Jonathan wrote:
> [Public]
>
>> -----Original Message-----
>> From: Lazar, Lijo <lijo.la...@amd.com>
>> Sent: Thursday, February 6, 2025 8:13 AM
>> To: amd-gfx@lists.freedesktop.org; Lazar, Lijo <lijo.la...@amd.com>
>> Cc: Zhang, Hawking <hawking.zh...@amd.com>; Kim, Jonathan
>> <jonathan....@amd.com>
>> Subject: [PATCH 2/4] drm/amdgpu: Add xgmi speed/width related info
>>
>> Add APIs to initialize XGMI speed, width details and get to max
>> bandwidth supported. It is assumed that a device only supports same
>> generation of XGMI links with uniform width.
>>
>> Signed-off-by: Lijo Lazar <lijo.la...@amd.com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 41 ++++++++++++++++++++++++
>> drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 11 +++++++
>> 2 files changed, 52 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> index 03d909c7b14b..edef5763e2b8 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> @@ -1679,3 +1679,44 @@ bool amdgpu_xgmi_same_hive(struct amdgpu_device
>> *adev,
>> adev->gmc.xgmi.hive_id &&
>> adev->gmc.xgmi.hive_id == bo_adev->gmc.xgmi.hive_id);
>> }
>> +
>> +static inline uint32_t
>> +__amdgpu_xgmi_get_max_bandwidth(uint8_t link_width,
>> + enum amdgpu_xgmi_link_speed max_speed)
>> +{
>> + /* Returns unidirectional bandwidth in Mbps */
>> + return max_speed * 1000 * link_width;
>
> Instead of wrapping this in a new static, would it work to declare a new
> member adev->gmc.xgmi.max_bandwidth?
> Then in amdgpu_xgmi_init_info, you can define max_bandwidth directly after
> the switch.
>
Link Width/Link speed is generally the standard way to represent link
info. Presently, tools like SMI show these for PCIe. Followed the same
here and kept a max bandwidth value calculation.
>> +}
>> +
>> +/**
>> + * amdgpu_xgmi_get_max_bandwidth - Get max possible bandwidth of a single
>> XGMI
>> + * link of the device in Mbps
>> + * @adev: Target device.
>> + *
>> + * Gets the max possible bandwidth of a single XGMI link of the device in
>> Mbps.
>> + * Assumption is all links of the device have the same width and are of the
>> + * same XGMI generation.
>> + */
>> +uint32_t amdgpu_xgmi_get_max_bandwidth(struct amdgpu_device *adev)
>> +{
>> + if (!adev->gmc.xgmi.supported)
>> + return 0;
>> +
>> + return __amdgpu_xgmi_get_max_bandwidth(adev->gmc.xgmi.max_width,
>> adev->gmc.xgmi.max_speed);
>
> If you do the suggested above, both tests can be simplified into 1 line as
> -> return = xgmi_supported ? max_bandwidth : 0;
>
Yes, primarily width/speed is chosen to represent link info.
>> +}
>> +
>> +void amdgpu_xgmi_init_info(struct amdgpu_device *adev)
>> +{
>> + switch (amdgpu_ip_version(adev, XGMI_HWIP, 0)) {
>> + case IP_VERSION(6, 1, 0):
>> + adev->gmc.xgmi.max_speed = XGMI2_5_SPEED_GT;
>> + adev->gmc.xgmi.max_width = 16;
>> + break;
>> + case IP_VERSION(6, 4, 0):
>> + adev->gmc.xgmi.max_speed = XGMI3_SPEED_GT;
>> + adev->gmc.xgmi.max_width = 16;
>> + break;
>> + default:
>> + break;
>> + }
>> +}
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>> index 044c4f6be44a..384c4ddef78a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>> @@ -25,6 +25,12 @@
>> #include <drm/task_barrier.h>
>> #include "amdgpu_ras.h"
>>
>> +enum amdgpu_xgmi_link_speed {
>> + XGMI2_SPEED_GT = 16,
>> + XGMI2_5_SPEED_GT = 25,
>> + XGMI3_SPEED_GT = 32
>
> Why not declare and define them as units of Mbps to begin with so you don't
> have to do the calculation later?
Actually, this is also something which is followed from PCIe to
represent as GT/s. If bandwidth field alone is kept, then this also
won't be required.
I'm not so sure if that is a good way to keep things.
Thanks,
Lijo
>
> Jon
>
>> +};
>> +
>> struct amdgpu_hive_info {
>> struct kobject kobj;
>> uint64_t hive_id;
>> @@ -75,6 +81,8 @@ struct amdgpu_xgmi {
>> struct ras_common_if *ras_if;
>> bool connected_to_cpu;
>> struct amdgpu_xgmi_ras *ras;
>> + enum amdgpu_xgmi_link_speed max_speed;
>> + uint8_t max_width;
>> };
>>
>> struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev);
>> @@ -102,4 +110,7 @@ int amdgpu_xgmi_request_nps_change(struct
>> amdgpu_device *adev,
>> int amdgpu_get_xgmi_link_status(struct amdgpu_device *adev,
>> int global_link_num);
>>
>> +void amdgpu_xgmi_init_info(struct amdgpu_device *adev);
>> +uint32_t amdgpu_xgmi_get_max_bandwidth(struct amdgpu_device *adev);
>> +
>> #endif
>> --
>> 2.25.1
>