Since commit 821ed7df6e2a ("drm/i915: Update reset path to fix
incomplete requests"), setting the device as wedged is permanent as we
cannot recover the engine->submit_request. Stop clearing the I915_WEDGED
status to prevent userspace from getting itself in a muddle.

To fix this correctly, we need to stop overriding engine->submit_request
for the inflight requests and instead need to track the errors in
flight. In the meantime, let's start with the correctness fix.

Fixes: 821ed7df6e2a ("drm/i915: Update reset path to fix incomplete requests")
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursu...@intel.com>
Cc: Mika Kuoppala <mika.kuopp...@intel.com>
Cc: <sta...@vger.kernel.org> # v4.9+
---
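Note: the control flow this patch aims for can be modelled in plain
userspace C. This is only a sketch under stated assumptions: the
model_* names, the flag values, and the struct layout are hypothetical
stand-ins, not the i915 definitions, and the real code uses atomic
bitops and waitqueues rather than plain integer masks.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's error flags (not the kernel values). */
enum {
	MODEL_RESET_IN_PROGRESS = 1u << 0,
	MODEL_WEDGED            = 1u << 1,
};

struct model_gpu_error {
	unsigned int flags;
	unsigned int reset_count;
};

/* Returns true only if a reset was actually attempted. */
static bool model_reset(struct model_gpu_error *error)
{
	/* Roughly mirrors test_and_clear_bit(): proceed only if a reset was pending. */
	if (!(error->flags & MODEL_RESET_IN_PROGRESS))
		return false;
	error->flags &= ~MODEL_RESET_IN_PROGRESS;

	/* Once wedged, stay wedged: bail out instead of clearing the bit. */
	if (error->flags & MODEL_WEDGED)
		return false;

	error->reset_count++;
	return true;
}
```

In this model, a call made while MODEL_WEDGED is set leaves both the
wedged bit and reset_count untouched, which is the behaviour the hunk
below is intended to give I915_WEDGED.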
 drivers/gpu/drm/i915/i915_drv.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index b1e9027a4f80..1c4f0a21eb22 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1824,8 +1824,11 @@ void i915_reset(struct drm_i915_private *dev_priv)
        if (!test_and_clear_bit(I915_RESET_IN_PROGRESS, &error->flags))
                return;
 
-       /* Clear any previous failed attempts at recovery. Time to try again. */
-       __clear_bit(I915_WEDGED, &error->flags);
+       if (test_bit(I915_WEDGED, &error->flags)) {
+               wake_up_bit(&error->flags, I915_RESET_IN_PROGRESS);
+               goto out;
+       }
+
        error->reset_count++;
 
        pr_notice("drm/i915: Resetting chip after gpu hang\n");
@@ -1874,6 +1877,7 @@ void i915_reset(struct drm_i915_private *dev_priv)
 wakeup:
        i915_gem_reset_finish(dev_priv);
        enable_irq(dev_priv->drm.irq);
+out:
        wake_up_bit(&error->flags, I915_RESET_IN_PROGRESS);
        return;
 
-- 
2.11.0