Hi Pixel,

yeah agree with David here, but even busy waiting for the KIQ register read is not really an option in the interrupt handler.

Additional to that when we have a VM fault we usually see a mass storm of them. So allocating and scheduling a work item for each fault like you do here will certainly only result in a whole system crash.

For now I would say just use DRM_ERROR() to print a warning that a VM fault happen and we can't decode it because we can't access the register under SRIOV.

Regards,
Christian.

Am 06.02.2017 um 09:36 schrieb zhoucm1:
Hi Pixel,
I got your mean just now, since your VF must use KIQ to read/write registers, which use fence_wait to wait reading register completed. The alternative way is implementing a new kiq reading/writing register way by using udelay instead of fence wait when reading/writing register in interrupt context.

Regards,
David Zhou

On 2017年02月06日 15:55, Ding, Pixel wrote:
Thanks you for your comments, David. I totally agree on your point.

However, The VM fault status registers record the latest VM fault info no matter when they’re changed, that’s what we care about since we don’t handle too much for VM fault even in bare metal system.

On the other hand, what do you think if we insist to sleep in the interrupt context and let the driver runs into a software bug?

The VM registers are shared among all VFs and we don’t have a copy for each in hardware, I think there's no other way to access them in this case. Do you have any suggestion?

—
Sincerely Yours,
Pixel








On 06/02/2017, 3:33 PM, "Zhou, David(ChunMing)" <david1.z...@amd.com> wrote:

INIT_WORK(&work->base, gmc_v8_0_vm_fault_sched);
However VF is used or not, schedule work shouldn't handle registers reading for interrupt, especially for status register, which could have been changed when you handle it in schedule work after interrupt.

Regards,
David Zhou

-----Original Message-----
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Pixel Ding
Sent: Monday, February 06, 2017 3:00 PM
To: amd-gfx@lists.freedesktop.org
Cc: Ding, Pixel <pixel.d...@amd.com>
Subject: [PATCH 2/2] drm/amdgpu/virt: schedule work to handle vm fault for VF

VF uses KIQ to access registers that invoking fence_wait to get the accessing completed. When VM fault occurs, the driver can't sleep in interrupt context.

For some test cases, VM fault is 'legal' and shouldn't cause driver soft lockup.

Signed-off-by: Pixel Ding <pixel.d...@amd.com>
---
drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 46 ++++++++++++++++++++++++++++++++---
1 file changed, 43 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 7669b32..75c913f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -1231,9 +1231,9 @@ static int gmc_v8_0_vm_fault_interrupt_state(struct amdgpu_device *adev,
    return 0;
}

-static int gmc_v8_0_process_interrupt(struct amdgpu_device *adev,
-                      struct amdgpu_irq_src *source,
-                      struct amdgpu_iv_entry *entry)
+static int gmc_v8_0_process_vm_fault(struct amdgpu_device *adev,
+                     struct amdgpu_irq_src *source,
+                     struct amdgpu_iv_entry *entry)
{
    u32 addr, status, mc_client;

@@ -1262,6 +1262,46 @@ static int gmc_v8_0_process_interrupt(struct amdgpu_device *adev,
    return 0;
}

+struct gmc_vm_fault_work {
+    struct work_struct      base;
+    struct amdgpu_device    *adev;
+    struct amdgpu_irq_src   *source;
+    struct amdgpu_iv_entry  *entry;
+};
+
+static void gmc_v8_0_vm_fault_sched(struct work_struct *work) {
+    struct gmc_vm_fault_work *vm_work =
+        container_of(work, struct gmc_vm_fault_work, base);
+    struct amdgpu_device *adev = vm_work->adev;
+    struct amdgpu_irq_src *source = vm_work->source;
+    struct amdgpu_iv_entry *entry = vm_work->entry;
+
+    gmc_v8_0_process_vm_fault(adev, source, entry);
+
+    kfree(vm_work);
+}
+
+static int gmc_v8_0_process_interrupt(struct amdgpu_device *adev,
+                      struct amdgpu_irq_src *source,
+                      struct amdgpu_iv_entry *entry) {
+    struct gmc_vm_fault_work *work = NULL;
+
+    if (amdgpu_sriov_vf(adev)) {
+        work = kmalloc(sizeof(struct gmc_vm_fault_work), GFP_ATOMIC);
+        if (!work)
+            return -ENOMEM;
+        INIT_WORK(&work->base, gmc_v8_0_vm_fault_sched);
+        work->adev = adev;
+        work->source = source;
+        work->entry = entry;
+        return schedule_work(&work->base);
+    }
+
+    return gmc_v8_0_process_vm_fault(adev, source, entry); }
+
static void fiji_update_mc_medium_grain_clock_gating(struct amdgpu_device *adev,
                             bool enable)
{
--
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Reply via email to