On 10.09.19 15:30, Zeng, Oak wrote:
>
> Regards,
> Oak
>
> -----Original Message-----
> From: Koenig, Christian <christian.koe...@amd.com>
> Sent: Monday, September 9, 2019 1:14 PM
> To: Zeng, Oak <oak.z...@amd.com>; Kuehling, Felix <felix.kuehl...@amd.com>;
> amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
>
>> Well first of all we are not in interrupt context here, this is handled by a
>> work item or otherwise we couldn't do all the locking.
> This is called from amdgpu_irq_handler. I think this is interrupt context.

No, that's incorrect. Only the primary interrupt ring is handled in interrupt
context. Rings 1 and 2 are handled by a work item.

> This is also the reason why we use spin lock instead of other sleepable
> lock like a semaphore.

Well, we do have a sleepable lock here to make it possible to update page
tables and directories.

That's also the reason why we protect the call with !in_interrupt().

Regards,
Christian.
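To illustrate the split described above, here is a minimal sketch of the
pattern, using generic names (gpu_irq_handler, process_ih_ring and the work
item are hypothetical stand-ins, not the actual amdgpu functions): the hard
IRQ handler only drains the primary ring, while rings 1 and 2 are drained
from a work item that runs in process context, where reserving BOs and other
sleepable locking is allowed.

#include <linux/interrupt.h>
#include <linux/workqueue.h>

/* Hypothetical stand-in for walking one interrupt ring and dispatching
 * its entries; the real driver has its own IH processing code.
 */
static void process_ih_ring(unsigned int ring)
{
}

/* Work item handler: runs in process context, so in_interrupt() is false
 * here and fault handling that reserves BOs and updates page tables may
 * sleep.
 */
static void ih_soft_work_handler(struct work_struct *work)
{
	process_ih_ring(1);
	process_ih_ring(2);
}

static DECLARE_WORK(ih_soft_work, ih_soft_work_handler);

/* Hard IRQ handler: only the primary ring is processed directly; faults
 * rerouted to rings 1 and 2 are deferred to the work item above.
 */
static irqreturn_t gpu_irq_handler(int irq, void *arg)
{
	process_ih_ring(0);
	schedule_work(&ih_soft_work);
	return IRQ_HANDLED;
}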
>
>> But even in interrupt context another CPU can easily destroy the VM when we
>> just handle a stale fault or the process was killed.
> Agree with this point.
>
> So this extra double checking is strictly necessary.
>
> Regards,
> Christian.
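To spell out the race that makes the double check necessary, here is an
annotated restatement of the first lookup from the patch quoted below; the
timeline in the comment is illustrative only.

	struct amdgpu_bo *root;
	struct amdgpu_vm *vm;

	/* Possible interleaving without the extra reference:
	 *
	 *   fault handler                     process teardown (other CPU)
	 *   -------------                     ----------------------------
	 *   spin_lock(pasid_lock)
	 *   vm = idr_find(pasid)              blocks on pasid_lock
	 *   spin_unlock(pasid_lock)           removes the PASID, frees the VM
	 *   use vm                            -> use-after-free
	 *
	 * Taking a reference on the root BO while still holding pasid_lock
	 * keeps that object alive, so it can be reserved and re-validated
	 * after the lock is dropped.
	 */
	spin_lock(&adev->vm_manager.pasid_lock);
	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
	root = vm ? amdgpu_bo_ref(vm->root.base.bo) : NULL;
	spin_unlock(&adev->vm_manager.pasid_lock);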
>
> On 09.09.19 16:08, Zeng, Oak wrote:
>> Is looking up vm twice necessary? I think we are in interrupt context; is it
>> possible that the user space application can be switched in in between? My
>> understanding is, if the user space application can't kick in during
>> interrupt handling, the application shouldn't have a chance to exit (and
>> then have its vm destroyed).
>>
>> Regards,
>> Oak
>>
>> -----Original Message-----
>> From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of
>> Christian König
>> Sent: Monday, September 9, 2019 8:08 AM
>> To: Kuehling, Felix <felix.kuehl...@amd.com>;
>> amd-gfx@lists.freedesktop.org
>> Subject: Re: [PATCH 9/9] drm/amdgpu: add graceful VM fault handling v2
>>
>> On 05.09.19 00:47, Kuehling, Felix wrote:
>>> On 2019-09-04 11:02 a.m., Christian König wrote:
>>>> Next step towards HMM support. For now just silence the retry fault
>>>> and optionally redirect the request to the dummy page.
>>>>
>>>> v2: make sure the VM is not destroyed while we handle the fault.
>>>>
>>>> Signed-off-by: Christian König <christian.koe...@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 74 ++++++++++++++++++++++++++
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  2 +
>>>>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  4 ++
>>>>   3 files changed, 80 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> index 951608fc1925..410d89966a66 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>> @@ -3142,3 +3142,77 @@ void amdgpu_vm_set_task_info(struct amdgpu_vm *vm)
>>>>   		}
>>>>   	}
>>>>   }
>>>> +
>>>> +/**
>>>> + * amdgpu_vm_handle_fault - graceful handling of VM faults.
>>>> + * @adev: amdgpu device pointer
>>>> + * @pasid: PASID of the VM
>>>> + * @addr: Address of the fault
>>>> + *
>>>> + * Try to gracefully handle a VM fault. Return true if the fault was
>>>> + * handled and shouldn't be reported any more.
>>>> + */
>>>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>>>> +			    uint64_t addr)
>>>> +{
>>>> +	struct amdgpu_ring *ring = &adev->sdma.instance[0].page;
>>>> +	struct amdgpu_bo *root;
>>>> +	uint64_t value, flags;
>>>> +	struct amdgpu_vm *vm;
>>>> +	long r;
>>>> +
>>>> +	if (!ring->sched.ready)
>>>> +		return false;
>>>> +
>>>> +	spin_lock(&adev->vm_manager.pasid_lock);
>>>> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>>>> +	if (vm)
>>>> +		root = amdgpu_bo_ref(vm->root.base.bo);
>>>> +	else
>>>> +		root = NULL;
>>>> +	spin_unlock(&adev->vm_manager.pasid_lock);
>>>> +
>>>> +	if (!root)
>>>> +		return false;
>>>> +
>>>> +	r = amdgpu_bo_reserve(root, true);
>>>> +	if (r)
>>>> +		goto error_unref;
>>>> +
>>>> +	spin_lock(&adev->vm_manager.pasid_lock);
>>>> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
>>>> +	spin_unlock(&adev->vm_manager.pasid_lock);
>>> I think this deserves a comment. If I understand it correctly, you're
>>> looking up the vm twice so that you have the VM root reservation to
>>> protect against use-after-free. Otherwise the vm pointer is only
>>> valid as long as you're holding the spin-lock.
>>>
>>>
>>>> +
>>>> +	if (!vm || vm->root.base.bo != root)
>>> The check of vm->root.base.bo should probably still be under the
>>> spin_lock. Because you're not sure yet it's the right VM, you can't
>>> rely on the reservation here to prevent use-after-free.
>> Good point, going to fix that.
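For reference, a hedged sketch of what that fix might look like, doing the
root BO comparison while pasid_lock is still held (illustrative only; the
actual follow-up revision of the patch may differ):

	r = amdgpu_bo_reserve(root, true);
	if (r)
		goto error_unref;

	/* Re-check the PASID -> VM mapping now that the root reservation is
	 * held; do the comparison under pasid_lock so the vm pointer cannot
	 * be freed while it is inspected.
	 */
	spin_lock(&adev->vm_manager.pasid_lock);
	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
	if (vm && vm->root.base.bo != root)
		vm = NULL;
	spin_unlock(&adev->vm_manager.pasid_lock);

	if (!vm)
		goto error_unlock;

The vm pointer is only trusted once both the reservation is held and the
PASID mapping has been re-confirmed under the lock.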
>>>> +		goto error_unlock;
>>>> +
>>>> +	addr /= AMDGPU_GPU_PAGE_SIZE;
>>>> +	flags = AMDGPU_PTE_VALID | AMDGPU_PTE_SNOOPED |
>>>> +		AMDGPU_PTE_SYSTEM;
>>>> +
>>>> +	if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_NEVER) {
>>>> +		/* Redirect the access to the dummy page */
>>>> +		value = adev->dummy_page_addr;
>>>> +		flags |= AMDGPU_PTE_EXECUTABLE | AMDGPU_PTE_READABLE |
>>>> +			 AMDGPU_PTE_WRITEABLE;
>>>> +	} else {
>>>> +		value = 0;
>>>> +	}
>>>> +
>>>> +	r = amdgpu_vm_bo_update_mapping(adev, vm, true, NULL, addr, addr + 1,
>>>> +					flags, value, NULL, NULL);
>>>> +	if (r)
>>>> +		goto error_unlock;
>>>> +
>>>> +	r = amdgpu_vm_update_pdes(adev, vm, true);
>>>> +
>>>> +error_unlock:
>>>> +	amdgpu_bo_unreserve(root);
>>>> +	if (r < 0)
>>>> +		DRM_ERROR("Can't handle page fault (%ld)\n", r);
>>>> +
>>>> +error_unref:
>>>> +	amdgpu_bo_unref(&root);
>>>> +
>>>> +	return false;
>>>> +}
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> index 0a97dc839f3b..4dbbe1b6b413 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>> @@ -413,6 +413,8 @@ void amdgpu_vm_check_compute_bug(struct
>>>> amdgpu_device *adev);
>>>>   
>>>>   void amdgpu_vm_get_task_info(struct amdgpu_device *adev, unsigned int pasid,
>>>>   			     struct amdgpu_task_info *task_info);
>>>> +bool amdgpu_vm_handle_fault(struct amdgpu_device *adev, unsigned int pasid,
>>>> +			    uint64_t addr);
>>>>   
>>>>   void amdgpu_vm_set_task_info(struct amdgpu_vm *vm);
>>>>   
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> index 9d15679df6e0..15a1ce51befa 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> @@ -353,6 +353,10 @@ static int gmc_v9_0_process_interrupt(struct
>>>> amdgpu_device *adev,
>>>>   	}
>>>>   
>>>>   	/* If it's the first fault for this address, process it normally */
>>>> +	if (retry_fault && !in_interrupt() &&
>>>> +	    amdgpu_vm_handle_fault(adev, entry->pasid, addr))
>>>> +		return 1; /* This also prevents sending it to KFD */
>>> The !in_interrupt() is meant to only do this on the rerouted
>>> interrupt ring that's handled by a worker function?
>> Yes, exactly. But I plan to add a workaround where the CPU redirects the
>> fault to the other ring buffer for firmware versions which don't have that.
>>
>> Adds quite a bit of overhead on Vega10 because of the fault storm, but it
>> should work fine on Vega20.
>>
>> Key point is that we already released firmware without the redirection, but
>> it's still better to have that than to run into the storm.
>>
>>> Looks like amdgpu_vm_handle_fault never returns true for now. So
>>> we'll never get to the "return 1" here.
>> Oh, yes, that actually belongs in a follow-up patch.
>>
>> Thanks,
>> Christian.
>>
>>> Regards,
>>>   Felix
>>>
>>>
>>>> +
>>>>   	if (!amdgpu_sriov_vf(adev)) {
>>>>   		/*
>>>>   		 * Issue a dummy read to wait for the status register to
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx