Thanks
------------------------------------------
Monk Liu | Cloud-GPU Core team
------------------------------------------
-----Original Message-----
From: Christian König <ckoenig.leichtzumer...@gmail.com>
Sent: Wednesday, June 30, 2021 7:15 PM
To: Wang, YuBiao <yubiao.w...@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Grodzovsky, Andrey <andrey.grodzov...@amd.com>; Xiao, Jack <jack.x...@amd.com>; Xu, Feifei <feifei...@amd.com>; Chen,
Horace <horace.c...@amd.com>; Wang, Kevin(Yang) <kevin1.w...@amd.com>; Tuikov, Luben <luben.tui...@amd.com>; Deucher, Alexander
<alexander.deuc...@amd.com>; Quan, Evan <evan.q...@amd.com>; Koenig, Christian <christian.koe...@amd.com>; Liu, Monk
<monk....@amd.com>; Zhang, Hawking <hawking.zh...@amd.com>
Subject: Re: [PATCH 1/1] drm/amdgpu: Read clock counter via MMIO to reduce delay (v4)
On 30.06.21 at 12:10, YuBiao Wang wrote:
[Why]
GPU timing counters are read via KIQ under SR-IOV, which introduces a delay.
[How]
Read them directly via MMIO instead.
v2: Add an additional check to prevent a carryover issue.
v3: Only check for carryover once to avoid a performance issue.
v4: Add a comment on the rough frequency at which carryover happens.
Signed-off-by: YuBiao Wang <yubiao.w...@amd.com>
Acked-by: Horace Chen <horace.c...@amd.com>
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index ff7e9f49040e..9355494002a1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7609,7 +7609,7 @@ static int gfx_v10_0_soft_reset(void *handle)
static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
{
-	uint64_t clock;
+	uint64_t clock, clock_lo, clock_hi, hi_check;
amdgpu_gfx_off_ctrl(adev, false);
mutex_lock(&adev->gfx.gpu_clock_mutex);
@@ -7620,8 +7620,15 @@ static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
 			((uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER_Vangogh) << 32ULL);
 		break;
 	default:
-		clock = (uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER) |
-			((uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER) << 32ULL);
If you want to be extra sure you could add a preempt_disable(); here.
+		clock_hi = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER);
+		clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER);
+		hi_check = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER);
+		/* Carryover happens every 4 giga cycles, which is roughly every 42 secs */
+		if (hi_check != clock_hi) {
+			clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER);
+			clock_hi = hi_check;
+		}
And a preempt_enable(); here. This way the critical section is also not
interrupted by a task switch.
But either way the patch is Reviewed-by: Christian König
<christian.koe...@amd.com>
Regards,
Christian.
+ clock = (uint64_t)clock_lo | ((uint64_t)clock_hi << 32ULL);
break;
}
mutex_unlock(&adev->gfx.gpu_clock_mutex);
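For reference, the hi/lo/hi sequence in the hunk above can be sketched as plain C against a simulated 32-bit register pair. The mock register functions and names below are illustrative only, not the amdgpu MMIO API; each simulated read advances the counter to mimic time passing between accesses:

```c
#include <stdint.h>

/* Illustrative mock of a free-running 64-bit counter exposed as two
 * 32-bit registers. Every "register read" advances it, mimicking the
 * time that elapses between MMIO accesses. Names are hypothetical. */
static uint64_t sim_counter;
static uint64_t sim_step = 1;

static uint32_t read_tsc_lower(void)
{
	sim_counter += sim_step;
	return (uint32_t)sim_counter;
}

static uint32_t read_tsc_upper(void)
{
	sim_counter += sim_step;
	return (uint32_t)(sim_counter >> 32);
}

/* The hi/lo/hi pattern from the patch: if the upper word changed while
 * the lower word was being read, a carryover slipped in between, so the
 * lower word is re-read and paired with the newer upper word. */
static uint64_t read_clock64(void)
{
	uint32_t clock_hi = read_tsc_upper();
	uint32_t clock_lo = read_tsc_lower();
	uint32_t hi_check = read_tsc_upper();

	if (hi_check != clock_hi) {
		clock_lo = read_tsc_lower();
		clock_hi = hi_check;
	}
	return ((uint64_t)clock_hi << 32) | clock_lo;
}
```

Starting sim_counter just below 0xFFFFFFFF forces the 32-bit wrap to land between the two upper-word reads; read_clock64() then returns a value consistent with the post-wrap state instead of pairing a stale upper word with a wrapped lower word.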