On 10.09.19 15:30, Zeng, Oak wrote:
>
> Regards,
> Oak
>
> -----Original Message-----
> From: Koenig, Christian <christian.koe...@amd.com>
> Sent: Monday, September 9, 2019 1:14 PM
> To: Zeng, Oak <oak.z...@amd.com>; Kuehling, Felix <felix.kuehl...@amd.com>;
> amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
>
>> Well first of all we are not in interrupt context here, this is handled by a
>> work item or otherwise we couldn't do all the locking.
> This is called from amdgpu_irq_handler. I think this is interrupt context.

No, that's incorrect. Only the primary interrupt ring is handled in interrupt
context. Rings 1 and 2 are handled by a work item.

> This is also the reason why we use spin lock instead of other sleepable
> lock like a semaphore.

Well, we do have a sleepable lock here to make it possible to update page
tables and directories.

That's also the reason why we protect the call with !in_interrupt().

Regards,
Christian.
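To illustrate the split described above, here is a minimal sketch of the
pattern, using generic names (gpu_irq_handler, process_ih_ring and the work
item are hypothetical stand-ins, not the actual amdgpu functions): the hard
IRQ handler only drains the primary ring, while rings 1 and 2 are drained
from a work item that runs in process context, where reserving BOs and other
sleepable locking is allowed.

#include <linux/interrupt.h>
#include <linux/workqueue.h>

/* Hypothetical stand-in for walking one interrupt ring and dispatching
 * its entries; the real driver has its own IH processing code.
 */
static void process_ih_ring(unsigned int ring)
{
}

/* Work item handler: runs in process context, so in_interrupt() is false
 * here and fault handling that reserves BOs and updates page tables may
 * sleep.
 */
static void ih_soft_work_handler(struct work_struct *work)
{
	process_ih_ring(1);
	process_ih_ring(2);
}

static DECLARE_WORK(ih_soft_work, ih_soft_work_handler);

/* Hard IRQ handler: only the primary ring is processed directly; faults
 * rerouted to rings 1 and 2 are deferred to the work item above.
 */
static irqreturn_t gpu_irq_handler(int irq, void *arg)
{
	process_ih_ring(0);
	schedule_work(&ih_soft_work);
	return IRQ_HANDLED;
}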
>
>> But even in interrupt context another CPU can easily destroy the VM when we
>> just handle a stale fault or the process was killed.
> Agree with this point.
>
> So this extra double checking is strictly necessary.
>
> Regards,
> Christian.
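To spell out the race that makes the double check necessary, here is an
annotated restatement of the first lookup from the patch quoted below; the
timeline in the comment is illustrative only.

	struct amdgpu_bo *root;
	struct amdgpu_vm *vm;

	/* Possible interleaving without the extra reference:
	 *
	 *   fault handler                     process teardown (other CPU)
	 *   -------------                     ----------------------------
	 *   spin_lock(pasid_lock)
	 *   vm = idr_find(pasid)              blocks on pasid_lock
	 *   spin_unlock(pasid_lock)           removes the PASID, frees the VM
	 *   use vm                            -> use-after-free
	 *
	 * Taking a reference on the root BO while still holding pasid_lock
	 * keeps that object alive, so it can be reserved and re-validated
	 * after the lock is dropped.
	 */
	spin_lock(&adev->vm_manager.pasid_lock);
	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
	root = vm ? amdgpu_bo_ref(vm->root.base.bo) : NULL;
	spin_unlock(&adev->vm_manager.pasid_lock);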
>
> On 09.09.19 16:08, Zeng, Oak wrote:
>> Is looking up vm twice necessary? I think we are in interrupt context; is it
>> possible that the user space application can be switched in in between? My
>> understanding is, if the user space application can't kick in during
>> interrupt handling, the application shouldn't have a chance to exit (and
>> then have its vm destroyed).
>>
>> Regards,
>> Oak
>>
>> -----Original Message-----
>> From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of
>> Christian König
>> Sent: Monday, September 9, 2019 8:08 AM
>> To: Kuehling, Felix <felix.kuehl...@amd.com>;
>> amd-gfx@lists.freedesktop.org
>> Subject: Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
>>
>> On 05.09.19 00:47, Kuehling, Felix wrote:
>>> On 2019-09-04 11:02 a.m., Christian König wrote:
>>>> Next step towards HMM support. For now just silence the retry fault
>>>> and optionally redirect the request to the dummy page.
>>>>
>>>> v2: make sure the VM is not destroyed while we handle the fault.
>>>>
>>>> Signed-off-by: Christian König <christian.koe...@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 74 ++++++++++++++++++++++++++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +
>>>>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  4 ++
>>>>   3 files changed, 80 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> index 951608fc1925..410d89966a66 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> @@ -3142,3 +3142,77 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm)
>>>>   		}
>>>>   	}
>>>>   }
>>>> +
>>>> +/**
>>>> + * amdgpu_vm_handle_fault - graceful handling of VM faults.
>>>> + * @adev: amdgpu device pointer
>>>> + * @pasid: PASID of the VM
>>>> + * @addr: Address of the fault
>>>> + *
>>>> + * Try to gracefully handle a VM fault. Return true if the fault was
>>>> + * handled and shouldn't be reported any more.
>>>> + */
>>>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>>>> +			    uint64_t addr)
>>>> +{
>>>> +	struct amdgpu_ring *ring = &adev->sdma.instance[0].page;
>>>> +	struct amdgpu_bo *root;
>>>> +	uint64_t value, flags;
>>>> +	struct amdgpu_vm *vm;
>>>> +	long r;
>>>> +
>>>> +	if (!ring->sched.ready)
>>>> +		return false;
>>>> +
>>>> +	spin_lock(&adev->vm_manager.pasid_lock);
>>>> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>>>> +	if (vm)
>>>> +		root = amdgpu_bo_ref(vm->root.base.bo);
>>>> +	else
>>>> +		root = NULL;
>>>> +	spin_unlock(&adev->vm_manager.pasid_lock);
>>>> +
>>>> +	if (!root)
>>>> +		return false;
>>>> +
>>>> +	r = amdgpu_bo_reserve(root, true);
>>>> +	if (r)
>>>> +		goto error_unref;
>>>> +
>>>> +	spin_lock(&adev->vm_manager.pasid_lock);
>>>> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>>>> +	spin_unlock(&adev->vm_manager.pasid_lock);
>>> I think this deserves a comment. If I understand it correctly, you're
>>> looking up the vm twice so that you have the VM root reservation to
>>> protect against use-after-free. Otherwise the vm pointer is only
>>> valid as long as you're holding the spin-lock.
>>>
>>>
>>>> +
>>>> +	if (!vm || vm->root.base.bo != root)
>>> The check of vm->root.base.bo should probably still be under the
>>> spin_lock. Because you're not sure yet it's the right VM, you can't
>>> rely on the reservation here to prevent use-after-free.
>> Good point, going to fix that.
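For reference, a hedged sketch of what that fix might look like, doing the
root BO comparison while pasid_lock is still held (illustrative only; the
actual follow-up revision of the patch may differ):

	r = amdgpu_bo_reserve(root, true);
	if (r)
		goto error_unref;

	/* Re-check the PASID -> VM mapping now that the root reservation is
	 * held; do the comparison under pasid_lock so the vm pointer cannot
	 * be freed while it is inspected.
	 */
	spin_lock(&adev->vm_manager.pasid_lock);
	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
	if (vm && vm->root.base.bo != root)
		vm = NULL;
	spin_unlock(&adev->vm_manager.pasid_lock);

	if (!vm)
		goto error_unlock;

The vm pointer is only trusted once both the reservation is held and the
PASID mapping has been re-confirmed under the lock.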
>>>> +		goto error_unlock;
>>>> +
>>>> +	addr /= AMDGPU_GPU_PAGE_SIZE;
>>>> +	flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
>>>> +		AMDGPU_PTE_SYSTEM;
>>>> +
>>>> +	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
>>>> +		/* Redirect the access to the dummy page */
>>>> +		value = adev->dummy_page_addr;
>>>> +		flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
>>>> +			 AMDGPU_PTE_WRITEABLE;
>>>> +	} else {
>>>> +		value = 0;
>>>> +	}
>>>> +
>>>> +	r = amdgpu_vm_bo_update_mapping(adev, vm, true, NULL, addr, addr + 1,
>>>> +					flags, value, NULL, NULL);
>>>> +	if (r)
>>>> +		goto error_unlock;
>>>> +
>>>> +	r = amdgpu_vm_update_pdes(adev, vm, true);
>>>> +
>>>> +error_unlock:
>>>> +	amdgpu_bo_unreserve(root);
>>>> +	if (r < 0)
>>>> +		DRM_ERROR("Can't handle page fault (%ld)\n", r);
>>>> +
>>>> +error_unref:
>>>> +	amdgpu_bo_unref(&root);
>>>> +
>>>> +	return false;
>>>> +}
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> index 0a97dc839f3b..4dbbe1b6b413 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> @@ -413,6 +413,8 @@ void amdgpu_vm_check_compute_bug(struct
>>>> amdgpu_device *adev);
>>>>   
>>>>   void amdgpu_vm_get_task_info(struct amdgpu_device *adev, unsigned int pasid,
>>>>   			     struct amdgpu_task_info *task_info);
>>>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>>>> +			    uint64_t addr);
>>>>   
>>>>   void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
>>>>   
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> index 9d15679df6e0..15a1ce51befa 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> @@ -353,6 +353,10 @@ static int gmc_v9_0_process_interrupt(struct
>>>> amdgpu_device *adev,
>>>>   	}
>>>>   
>>>>   	/* If it's the first fault for this address, process it normally */
>>>> +	if (retry_fault && !in_interrupt() &&
>>>> +	    amdgpu_vm_handle_fault(adev, entry->pasid, addr))
>>>> +		return 1; /* This also prevents sending it to KFD */
>>> The !in_interrupt() is meant to only do this on the rerouted
>>> interrupt ring that's handled by a worker function?
>> Yes, exactly. But I plan to add a workaround where the CPU redirects the
>> fault to the other ring buffer for firmware versions which don't have that.
>>
>> Adds quite a bit of overhead on Vega10 because of the fault storm, but it
>> should work fine on Vega20.
>>
>> Key point is that we already released firmware without the redirection, but
>> it's still better to have that than to run into the storm.
>>
>>> Looks like amdgpu_vm_handle_fault never returns true for now. So
>>> we'll never get to the "return 1" here.
>> Oh, yes, that actually belongs in a follow-up patch.
>>
>> Thanks,
>> Christian.
>>
>>> Regards,
>>>   Felix
>>>
>>>
>>>> +
>>>>   	if (!amdgpu_sriov_vf(adev)) {
>>>>   		/*
>>>>   		 * Issue a dummy read to wait for the status register to
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx