The reset clobbers resources protected by our SRCU, so we must flush
reset_backoff_srcu before resetting. Do the flush inside the timer
failsafe but outside of error->wedge_mutex, so that the failsafe can
still run if synchronize_srcu() takes too long (hits a shrinker
deadlock?).

Fixes: 72eb16df010a ("drm/i915: Serialise resets with wedging")
References: https://bugs.freedesktop.org/show_bug.cgi?id=109605
Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuopp...@intel.com>
---
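Note on the ordering: a minimal userspace sketch of why the flush has to
sit outside the mutex. It assumes, as the message above implies, that
the failsafe must take wedge_mutex before it can wedge the device. All
names here are hypothetical stand-ins built on pthreads; none of this is
i915 or kernel code, and the stalled flush is modelled by simply asking
whether the failsafe could take the mutex at that moment.

```c
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for error->wedge_mutex (hypothetical, userspace only). */
static pthread_mutex_t wedge_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Model of the timer failsafe: it is assumed to need wedge_mutex, so it
 * can only make progress if the mutex is free when it fires.
 */
static bool failsafe_can_wedge(void)
{
	if (pthread_mutex_trylock(&wedge_mutex) != 0)
		return false;	/* mutex is held, failsafe would block */
	pthread_mutex_unlock(&wedge_mutex);
	return true;
}

/*
 * Model of a flush that stalls: while it is stuck, the failsafe fires.
 * Returns whether the failsafe was able to run at that point.
 */
static bool stalled_flush(void)
{
	return failsafe_can_wedge();
}

/* Ordering before this patch: the flush runs with the mutex held. */
static bool flush_inside_mutex(void)
{
	bool ok;

	pthread_mutex_lock(&wedge_mutex);
	ok = stalled_flush();
	/* the reset itself would run here */
	pthread_mutex_unlock(&wedge_mutex);
	return ok;
}

/* Ordering after this patch: flush first, then take the mutex. */
static bool flush_outside_mutex(void)
{
	bool ok = stalled_flush();

	pthread_mutex_lock(&wedge_mutex);
	/* the reset itself would run here */
	pthread_mutex_unlock(&wedge_mutex);
	return ok;
}
```

In this model flush_inside_mutex() reports that the failsafe is locked
out during the stall, while flush_outside_mutex() reports that it can
still run, which is the property the patch is after.
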
 drivers/gpu/drm/i915/i915_reset.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reset.c b/drivers/gpu/drm/i915/i915_reset.c
index 9494b015185a..c2b7570730c2 100644
--- a/drivers/gpu/drm/i915/i915_reset.c
+++ b/drivers/gpu/drm/i915/i915_reset.c
@@ -941,9 +941,6 @@ static int do_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
 {
        int err, i;
 
-       /* Flush everyone currently using a resource about to be clobbered */
-       synchronize_srcu(&i915->gpu_error.reset_backoff_srcu);
-
        err = intel_gpu_reset(i915, ALL_ENGINES);
        for (i = 0; err && i < RESET_MAX_RETRIES; i++) {
                msleep(10 * (i + 1));
@@ -1140,6 +1137,9 @@ static void i915_reset_device(struct drm_i915_private *i915,
        i915_wedge_on_timeout(&w, i915, 5 * HZ) {
                intel_prepare_reset(i915);
 
+               /* Flush everyone using a resource about to be clobbered */
+               synchronize_srcu(&error->reset_backoff_srcu);
+
                mutex_lock(&error->wedge_mutex);
                i915_reset(i915, engine_mask, reason);
                mutex_unlock(&error->wedge_mutex);
-- 
2.20.1
