On 4/16/2025 5:37 PM, Pierre-Eric Pelloux-Prayer wrote:
Hi,
Le 16/04/2025 à 12:01, Khatri, Sunil a écrit :
On 4/16/2025 12:56 PM, Tvrtko Ursulin wrote:
On 15/04/2025 19:43, Sunil Khatri wrote:
add process and pid information in the userqueue error
logging to make it more useful in resolving the error
by logs.
Sample log:
[ 42.444297] [drm:amdgpu_userqueue_wait_for_signal [amdgpu]]
*ERROR* Timed out waiting for fence f=000000001c74d978 for
comm:Xwayland pid:3427
[ 42.444669] [drm:amdgpu_userqueue_suspend [amdgpu]] *ERROR* Not
suspending userqueue, timeout waiting for comm:Xwayland pid:3427
[ 42.824729] [drm:amdgpu_userqueue_wait_for_signal [amdgpu]]
*ERROR* Timed out waiting for fence f=0000000074407d3e for
comm:systemd-logind pid:1058
[ 42.825082] [drm:amdgpu_userqueue_suspend [amdgpu]] *ERROR* Not
suspending userqueue, timeout waiting for comm:systemd-logind pid:1058
Signed-off-by: Sunil Khatri <sunil.kha...@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
b/drivers/gpu/drm/amd/amdgpu/ amdgpu_userqueue.c
index 1867520ba258..05c1ee27a319 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
@@ -43,7 +43,7 @@ amdgpu_userqueue_cleanup(struct amdgpu_userq_mgr
*uq_mgr,
if (f && !dma_fence_is_signaled(f)) {
ret = dma_fence_wait_timeout(f, true,
msecs_to_jiffies(100));
if (ret <= 0) {
- DRM_ERROR("Timed out waiting for fence f=%p\n", f);
+ drm_file_err(uq_mgr->file, "Timed out waiting for
fence f=%p\n", f);
You decided to leave %p after all?
Yes we are printing the fence ptr here to see which fence is timing
out. Anyways right now intention of this patch is to add additional
process information along with existing information like fence here.
I agree with Tvrtko, "fence=%llu:%llu" would be better to identify
"which fence is timing out".
I agree to it for sure, just that there are other places also where we
are printing fence ptr and will take that up in another patch.
Regards
Sunil Khatri
Pierre-Eric
regards
Sunil
return;
}
}
@@ -440,7 +440,8 @@ amdgpu_userqueue_resume_all(struct
amdgpu_userq_mgr *uq_mgr)
}
if (ret)
- DRM_ERROR("Failed to map all the queues\n");
+ drm_file_err(uq_mgr->file, "Failed to map all the queue\n");
You lost the plural by accident.
Yes i will add 's'. Noted.
I am also not sure "all the queues" makes sense in this context
versus "all queues" but it's inconsequential really.
Regards
Sunil
Yes it all queues from a uq_mgr.
+
return ret;
}
@@ -598,7 +599,8 @@ amdgpu_userqueue_suspend_all(struct
amdgpu_userq_mgr *uq_mgr)
}
if (ret)
- DRM_ERROR("Couldn't unmap all the queues\n");
+ drm_file_err(uq_mgr->file, "Couldn't unmap all the
queues\n");
+
return ret;
}
@@ -615,7 +617,7 @@ amdgpu_userqueue_wait_for_signal(struct
amdgpu_userq_mgr *uq_mgr)
continue;
ret = dma_fence_wait_timeout(f, true,
msecs_to_jiffies(100));
if (ret <= 0) {
- DRM_ERROR("Timed out waiting for fence f=%p\n", f);
+ drm_file_err(uq_mgr->file, "Timed out waiting for
fence f=%p\n", f);
return -ETIMEDOUT;
}
}
@@ -634,13 +636,13 @@ amdgpu_userqueue_suspend(struct
amdgpu_userq_mgr *uq_mgr,
/* Wait for any pending userqueue fence work to finish */
ret = amdgpu_userqueue_wait_for_signal(uq_mgr);
if (ret) {
- DRM_ERROR("Not suspending userqueue, timeout waiting for
work\n");
+ drm_file_err(uq_mgr->file, "Not suspending userqueue,
timeout waiting\n");
return;
}
ret = amdgpu_userqueue_suspend_all(uq_mgr);
if (ret) {
- DRM_ERROR("Failed to evict userqueue\n");
+ drm_file_err(uq_mgr->file, "Failed to evict userqueue\n");
return;
It is pre-existing but strikes me as odd that failure to
amdgpu_userqueue_suspend_all() logs a failure to *evict* instead of
suspend (as the previous log does). Anyway, I did not look at the
surrounding code so just thinking out loud.
Yes suspend failed as all the fences were not evicted and thats why
suspend failed. Anyways there are already alex patches which will
change this to unmap as a code reorganisation for suspend/resume is
in pipeline.
regards
Sunil
Regards,
Tvrtko
}