On 3/3/26 14:49, Khatri, Sunil wrote:
>>> @@ -1000,11 +1018,14 @@ int amdgpu_userq_ioctl(struct drm_device *dev, void
>>> *data,
>>> drm_file_err(filp, "Failed to create usermode queue\n");
>>> break;
>>> - case AMDGPU_USERQ_OP_FREE:
>>> - r = amdgpu_userq_destroy(filp, args->in.queue_id);
>>> - if (r)
>>> - drm_file_err(filp, "Failed to destroy usermode queue\n");
>>> + case AMDGPU_USERQ_OP_FREE: {
>>> + queue = xa_erase(&fpriv->userq_mgr.userq_xa, args->in.queue_id);
>> You need to have xa_lock around that or otherwise xa_load/kref_get can race.
> __xa_erase is unlocked version, but xa_erase internally does have xa_lock,
> since there is nothing else but just erase here to it made more sense to have
> clear xa_erase.
No, that doesn't work like that.
The problem is that xa_erase() takes the lock only optionally when it can't
exchange the entry with NULL atomically (e.g. because a full page needs to be
freed up or similar).
But in our use case taking the lock is mandatory because kref_get otherwise
doesn't work.
Regards,
Christian.