[AMD Official Use Only - General]

I can see the soft reset can be called from amdgpu_device_gpu_recover and pci 
error handler, they have called amdgpu_device_lock_reset_domain to acquire a 
reset lock before soft reset.

Regards,
Jack

-----Original Message-----
From: Christian König <ckoenig.leichtzumer...@gmail.com>
Sent: Wednesday, December 20, 2023 10:06 PM
To: Xiao, Jack <jack.x...@amd.com>; Alex Deucher <alexdeuc...@gmail.com>
Cc: Deucher, Alexander <alexander.deuc...@amd.com>; 
amd-gfx@lists.freedesktop.org; Zhang, Hawking <hawking.zh...@amd.com>
Subject: Re: [PATCH] drm/amdgpu/gfx11: need acquire mutex before access 
CP_VMID_RESET

Well not the reset lock, but there should only be a single reset queue which 
this runs on.

Regards,
Christian.

Am 20.12.23 um 10:49 schrieb Xiao, Jack:
> [AMD Official Use Only - General]
>
> It's already protected by the reset lock. In my understanding, soft reset 
> should not run in parallel.
>
> Regards,
> Jack
>
> -----Original Message-----
> From: Alex Deucher <alexdeuc...@gmail.com>
> Sent: Wednesday, December 20, 2023 1:04 AM
> To: Xiao, Jack <jack.x...@amd.com>
> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> <alexander.deuc...@amd.com>; Zhang, Hawking <hawking.zh...@amd.com>
> Subject: Re: [PATCH] drm/amdgpu/gfx11: need acquire mutex before
> access CP_VMID_RESET
>
> On Tue, Dec 19, 2023 at 4:30 AM Jack Xiao <jack.x...@amd.com> wrote:
>> It's required to take the gfx mutex before access to CP_VMID_RESET,
>> for there is a race condition with CP firmware to write the register.
>>
>> Signed-off-by: Jack Xiao <jack.x...@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 20 ++++++++++++++++++++
>>   1 file changed, 20 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> index bdcf96df69e6..ae3370d34d11 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
>> @@ -4518,6 +4518,22 @@ static int gfx_v11_0_soft_reset(void *handle)
>>                  }
>>          }
>>
> We probably want a CPU mutex or spinlock to protect this as well unless this 
> is already protected by the reset lock.
>
> Alex
>
>> +       /* Try to require the gfx mutex before access to CP_VMID_RESET */
>> +       for (i = 0; i < adev->usec_timeout; i++) {
>> +               /* Request with MeId=2, PipeId=0 */
>> +               tmp = REG_SET_FIELD(0, CP_GFX_INDEX_MUTEX, REQUEST, 1);
>> +               tmp = REG_SET_FIELD(tmp, CP_GFX_INDEX_MUTEX, CLIENTID, 4);
>> +               WREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX, tmp);
>> +               if (RREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX) == tmp)
>> +                       break;
>> +               udelay(1);
>> +       }
>> +
>> +       if (i >= adev->usec_timeout) {
>> +               printk("Failed to require the gfx mutex during soft 
>> reset\n");
>> +               return -EINVAL;
>> +       }
>> +
>>          WREG32_SOC15(GC, 0, regCP_VMID_RESET, 0xfffffffe);
>>
>>          // Read CP_VMID_RESET register three times.
>> @@ -4526,6 +4542,10 @@ static int gfx_v11_0_soft_reset(void *handle)
>>          RREG32_SOC15(GC, 0, regCP_VMID_RESET);
>>          RREG32_SOC15(GC, 0, regCP_VMID_RESET);
>>
>> +       /* release the gfx mutex */
>> +       tmp = REG_SET_FIELD(tmp, CP_GFX_INDEX_MUTEX, REQUEST, 0);
>> +       WREG32_SOC15(GC, 0, regCP_GFX_INDEX_MUTEX, tmp);
>> +
>>          for (i = 0; i < adev->usec_timeout; i++) {
>>                  if (!RREG32_SOC15(GC, 0, regCP_HQD_ACTIVE) &&
>>                      !RREG32_SOC15(GC, 0, regCP_GFX_HQD_ACTIVE))
>> --
>> 2.41.0
>>

Reply via email to