On 12.03.25 at 16:10, SRINIVASAN SHANMUGAM wrote:
> On 3/7/2025 7:18 PM, Christian König wrote:
>> That was quite troublesome for gang submit. Completely drop this
>> approach and enforce the isolation separately.
>>
>> Signed-off-by: Christian König <christian.koe...@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  2 +-
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c |  9 +--------
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 11 +++--------
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ids.h |  3 +--
>>  4 files changed, 6 insertions(+), 19 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> index 2ce0c6a152a6..4375e5019418 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>> @@ -1111,7 +1111,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
>>  			struct drm_gpu_scheduler *sched = entity->rq->sched;
>>  			struct amdgpu_ring *ring = to_amdgpu_ring(sched);
>>
>> -			if (amdgpu_vmid_uses_reserved(adev, vm, ring->vm_hub))
>> +			if (amdgpu_vmid_uses_reserved(vm, ring->vm_hub))
>>  				return -EINVAL;
>>  		}
>>  	}
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> index a194bf3347cb..9e5f6b11d923 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
>> @@ -1665,15 +1665,8 @@ static ssize_t amdgpu_gfx_set_enforce_isolation(struct device *dev,
>>  	}
>>
>>  	mutex_lock(&adev->enforce_isolation_mutex);
>> -	for (i = 0; i < num_partitions; i++) {
>> -		if (adev->enforce_isolation[i] && !partition_values[i])
>> -			/* Going from enabled to disabled */
>> -			amdgpu_vmid_free_reserved(adev, AMDGPU_GFXHUB(i));
>> -		else if (!adev->enforce_isolation[i] && partition_values[i])
>> -			/* Going from disabled to enabled */
>> -			amdgpu_vmid_alloc_reserved(adev, AMDGPU_GFXHUB(i));
>> +	for (i = 0; i < num_partitions; i++)
>>  		adev->enforce_isolation[i] = partition_values[i];
>> -	}
>>  	mutex_unlock(&adev->enforce_isolation_mutex);
>>
>>  	amdgpu_mes_update_enforce_isolation(adev);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
>> index 92ab821afc06..4c4e087230ac 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
>> @@ -411,7 +411,7 @@ int amdgpu_vmid_grab(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
>>  	if (r || !idle)
>>  		goto error;
>>
>> -	if (amdgpu_vmid_uses_reserved(adev, vm, vmhub)) {
>> +	if (amdgpu_vmid_uses_reserved(vm, vmhub)) {
>>  		r = amdgpu_vmid_grab_reserved(vm, ring, job, &id, fence);
>>  		if (r || !id)
>>  			goto error;
>> @@ -464,19 +464,14 @@ int amdgpu_vmid_grab(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
>>
>>  /*
>>   * amdgpu_vmid_uses_reserved - check if a VM will use a reserved VMID
>> - * @adev: amdgpu_device pointer
>>   * @vm: the VM to check
>>   * @vmhub: the VMHUB which will be used
>>   *
>>   * Returns: True if the VM will use a reserved VMID.
>>   */
>> -bool amdgpu_vmid_uses_reserved(struct amdgpu_device *adev,
>> -			       struct amdgpu_vm *vm, unsigned int vmhub)
>> +bool amdgpu_vmid_uses_reserved(struct amdgpu_vm *vm, unsigned int vmhub)
>>  {
>> -	return vm->reserved_vmid[vmhub] ||
>> -		(adev->enforce_isolation[(vm->root.bo->xcp_id != AMDGPU_XCP_NO_PARTITION) ?
>> -		 vm->root.bo->xcp_id : 0] &&
>> -		 AMDGPU_IS_GFXHUB(vmhub));
>> +	return vm->reserved_vmid[vmhub];
>>  }
>>
>>  int amdgpu_vmid_alloc_reserved(struct amdgpu_device *adev,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.h
>> index 4012fb2dd08a..240fa6751260 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.h
>> @@ -78,8 +78,7 @@ void amdgpu_pasid_free_delayed(struct dma_resv *resv,
>>
>>  bool amdgpu_vmid_had_gpu_reset(struct amdgpu_device *adev,
>>  			       struct amdgpu_vmid *id);
>> -bool amdgpu_vmid_uses_reserved(struct amdgpu_device *adev,
>> -			       struct amdgpu_vm *vm, unsigned int vmhub);
>> +bool amdgpu_vmid_uses_reserved(struct amdgpu_vm *vm, unsigned int vmhub);
>>  int amdgpu_vmid_alloc_reserved(struct amdgpu_device *adev,
>>  			       unsigned vmhub);
>>  void amdgpu_vmid_free_reserved(struct amdgpu_device *adev,

> Here we are trying to remove process isolation based on VMIDs. This is
> because of the constraint that we have a limited number of VMIDs that can
> be assigned to a limited number of clients. Now we have moved one level
> above: rather than enforcing isolation on the clients' VMIDs, we apply
> enforce_isolation directly to the clients, not to their VMIDs?
Exactly that, yes. We basically separate the two features.

> So, based on this understanding: we have removed the VMID-based
> enforce_isolation in this patch 6, and we now do process isolation
> directly on the client, based on
> https://patchwork.freedesktop.org/patch/638081/?series=145031&rev=1,
> i.e. patch 4 "[4/8] drm/amdgpu: rework how isolation is enforced v2".
> There, process isolation is applied directly per client: based on who
> the first client or owner is, and only after all jobs of that first
> client or owner have completed do we switch to the second client or
> next owner? Is my understanding right, please? I.e. based on the
> amdgpu_device_enforce_isolation() function.

Yes, exactly that.

Regards,
Christian.
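
For readers following the thread, below is a minimal sketch of the per-owner
gating idea described above, under the assumption of a single tracked owner
per isolation domain. All names here (struct isolation_state,
isolation_try_start(), isolation_job_done()) are hypothetical and not taken
from the amdgpu sources; the real amdgpu_device_enforce_isolation() drives the
owner switch with fences and the GPU scheduler rather than a plain counter.

/*
 * Hypothetical, simplified model of per-client isolation gating.
 *
 * Idea: jobs from different owners must not run concurrently on the
 * same engine. A job from a new owner may only start once all
 * outstanding jobs of the previous owner have completed.
 */
#include <stdbool.h>
#include <stdio.h>

struct isolation_state {
	const void *owner;	/* client currently allowed to run */
	unsigned int active;	/* jobs of that owner still in flight */
};

/* Returns true if the job may start now, false if it must wait. */
static bool isolation_try_start(struct isolation_state *iso, const void *owner)
{
	if (iso->owner && iso->owner != owner && iso->active > 0)
		return false;	/* previous owner still has work in flight */

	iso->owner = owner;	/* switch to (or keep) this owner */
	iso->active++;
	return true;
}

static void isolation_job_done(struct isolation_state *iso)
{
	if (iso->active > 0)
		iso->active--;
	/* once active reaches 0, the next owner may take over */
}

int main(void)
{
	struct isolation_state iso = { 0 };
	int client_a, client_b;

	printf("A starts: %d\n", isolation_try_start(&iso, &client_a)); /* 1 */
	printf("B starts: %d\n", isolation_try_start(&iso, &client_b)); /* 0: A busy */
	isolation_job_done(&iso);
	printf("B starts: %d\n", isolation_try_start(&iso, &client_b)); /* 1 */
	return 0;
}

The gating decision has the same shape as in the discussion: the first owner
runs to completion before the next owner is admitted, independent of how many
VMIDs are available.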