On 07/01/2026 14:46, Tvrtko Ursulin wrote:
On 07/01/2026 09:01, Christian König wrote:
On 12/19/25 14:41, Tvrtko Ursulin wrote:
Struct amdgpu_ctx contains two copies of the pointer to the context
manager. Remove one.
Signed-off-by: Tvrtko Ursulin <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 3 +--
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 1 -
2 files changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index afedea02188d..41c05358d86d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -232,7 +232,7 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
} else {
struct amdgpu_fpriv *fpriv;
- fpriv = container_of(ctx->ctx_mgr, struct amdgpu_fpriv, ctx_mgr);
+ fpriv = container_of(ctx->mgr, struct amdgpu_fpriv, ctx_mgr);
r = amdgpu_xcp_select_scheds(adev, hw_ip, hw_prio, fpriv,
&num_scheds, &scheds);
Well, that code is utter nonsense to begin with.
amdgpu_xcp_select_scheds() needs the xcp id to select from and not fpriv.
Can you look into re-structuring this so that we don't need that cast?
I had a look and so far have only cleaned it up visually a bit, so there
are fewer long array subscript dereferences and less confusion between
sel_xcp_id and fpriv->xcp_id.
But on a more fundamental level, since it needs to write to fpriv, the
caller will need to have it one way or the other, no?
And then I noticed that not only is the atomic_read/inc usage dodgy, but
the fpriv->xcp_id assignment itself is racy. It appears that two threads
submitting to the same new entity can end up with a refcount imbalance
and probably worse.
Shall I replace the ref_cnt atomic with a mutex and protect the whole
selection?
Or maybe there is no race?
fpriv->xcp_id is first assigned in amdgpu_driver_open_kms(), and there it
looks like it can mostly either fail or succeed. I say mostly because the
one silent failure path I see (one that does not fail the device open) is
when xcp->ddev is NULL. I am not sure if or when that can happen. If it
can happen, is that the reason ctx init needs to retry the xcp_id
selection? In which case it is racy.
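For illustration, here is a minimal userspace sketch of the mutex option,
using pthreads rather than kernel primitives. It is not the amdgpu code:
struct partition_mgr, struct client, select_partition(), NUM_PARTITIONS
and NO_PARTITION are all hypothetical stand-ins. The point is only that
the unassigned check, the least-referenced selection, the id assignment
and the reference increment all happen under one lock, so two threads
cannot both see an unassigned id and pick independently.

```c
#include <assert.h>
#include <limits.h>
#include <pthread.h>

#define NUM_PARTITIONS 4     /* hypothetical partition count */
#define NO_PARTITION   (-1)  /* hypothetical "not yet assigned" marker */

struct partition_mgr {
	pthread_mutex_t lock;                 /* protects ref_cnt[] and client ids */
	unsigned int ref_cnt[NUM_PARTITIONS]; /* plain counters, no atomics needed */
};

struct client {
	int partition_id; /* NO_PARTITION until the first selection */
};

/*
 * Assign the least-referenced partition to the client if it has none yet,
 * then take one reference on the client's partition. The whole
 * check/select/assign/increment sequence runs under mgr->lock.
 */
static int select_partition(struct partition_mgr *mgr, struct client *c)
{
	int i, id;

	pthread_mutex_lock(&mgr->lock);

	if (c->partition_id == NO_PARTITION) {
		unsigned int least = UINT_MAX;

		for (i = 0; i < NUM_PARTITIONS; i++) {
			if (mgr->ref_cnt[i] < least) {
				least = mgr->ref_cnt[i];
				c->partition_id = i;
			}
		}
	}

	id = c->partition_id;
	assert(id >= 0 && id < NUM_PARTITIONS);
	mgr->ref_cnt[id]++;

	pthread_mutex_unlock(&mgr->lock);

	return id;
}
```

With the lock held across the whole sequence the counters no longer need
to be atomic. Whether a sleeping lock is acceptable on the real path is a
separate question this sketch does not answer.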
Regards,
Tvrtko
Regards,
Tvrtko
if (r)
@@ -349,7 +349,6 @@ static int amdgpu_ctx_init(struct amdgpu_ctx_mgr *mgr, int32_t priority,
else
ctx->stable_pstate = current_stable_pstate;
- ctx->ctx_mgr = &(fpriv->ctx_mgr);
return 0;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
index aed758d0acaa..cf8d700a22fe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
@@ -56,7 +56,6 @@ struct amdgpu_ctx {
unsigned long ras_counter_ce;
unsigned long ras_counter_ue;
struct amdgpu_ctx_mgr *mgr;
- struct amdgpu_ctx_mgr *ctx_mgr;
struct amdgpu_ctx_entity *entities[AMDGPU_HW_IP_NUM][AMDGPU_MAX_ENTITY_NUM];
};