On 5/27/25 13:27, Khatri, Sunil wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
> 
> -----Original Message-----
> From: Koenig, Christian <christian.koe...@amd.com>
> Sent: Tuesday, May 27, 2025 2:32 PM
> To: Khatri, Sunil <sunil.kha...@amd.com>; amd-gfx@lists.freedesktop.org; 
> Deucher, Alexander <alexander.deuc...@amd.com>
> Subject: Re: [PATCH] Revert "drm/amdgpu: promote the implicit sync to the 
> dependent read fences"
> 
> On 5/27/25 10:58, Sunil Khatri wrote:
>> This reverts commit 714bbbf20a7266e48632fab466563e695af9acb5.
>> bisected to this change which is causing the flikering issue in the UI
>> for various apps like glxgears and unigen heaven.
> 
> Is that flickering also there when using kernel queues?
> 
> If not then without an explanation where that flickering is coming from for 
> user queues I have to reject that.
> 
> There is more to just flickering here. We are planning to use 
> amdgpu.user_queue=1 in our CI testing.  That means both kernel and user 
> submission are enabled. When I ran the testing amdgpu_basic tests which runs 
> both kernel and user submissions, all the user queue tests are consistently 
> failing. With this revert change in place all those issues are fixed, Both 
> user and kernel queue tests pass consistently without any artifacts.

That sounds like we are missing some dependency and that works only by 
coincident.

> The reason of why this is helping is I am not sure of but it’s the 
> observation which was observed earlier by Arvind too. We could investigate 
> the probable reason of it infact Arvind is checking that but to enable CI we 
> could revert this if you agree.

I think figuring out what is missing here is more important than enabling CI.

We most likely incorrectly sync to TLB flush fences on the kernel queues now 
and that could not only have negative performance side effects but also hide 
real bugs we need to fix ASAP.

Regards,
Christian.

> 
> Regards
> Sunil Khatri
> 
> It seems to be the right thing TODO and we are still in the process of 
> hammering out all the bugs for user queues.
> 
> Regards,
> Christian.
> 
>> Also when we set user_queue=1, which enable both user and kernel
>> submissions the userqueue tests in IGT also fail.
>>
>> Signed-off-by: Sunil Khatri <sunil.kha...@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
>> index d6ae9974c952..5576ed0b508f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
>> @@ -249,8 +249,9 @@ int amdgpu_sync_resv(struct amdgpu_device *adev,
>> struct amdgpu_sync *sync,
>>
>>       if (resv == NULL)
>>               return -EINVAL;
>> -     /* Implicitly sync only to KERNEL, WRITE and READ */
>> -     dma_resv_for_each_fence(&cursor, resv, DMA_RESV_USAGE_READ, f) {
>> +
>> +     /* TODO: Use DMA_RESV_USAGE_READ here */
>> +     dma_resv_for_each_fence(&cursor, resv, DMA_RESV_USAGE_BOOKKEEP, f) {
>>               dma_fence_chain_for_each(f, f) {
>>                       struct dma_fence *tmp = dma_fence_chain_contained(f);
>>
> 

Reply via email to