On Fri, Sep 13, 2019 at 08:47:20AM +0100, Chris Wilson wrote:
> Unwedging the GPU requires a successful GPU reset before we restore the
> default submission, or else we may see residual context switch events
> that we were not expecting.
> 
> v2: Pull in the special-case reset_clobbers_display, and explain why it
> should be safe in the context of unwedging.
> 
> Reported-by: Janusz Krzysztofik <janusz.krzyszto...@linux.intel.com>
> Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
> Cc: Janusz Krzysztofik <janusz.krzyszto...@linux.intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospu...@intel.com>
> Cc: Ville Syrjälä <ville.syrj...@linux.intel.com>
> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospu...@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_reset.c | 30 ++++++++++++++++++++++++++-
>  1 file changed, 29 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
> b/drivers/gpu/drm/i915/gt/intel_reset.c
> index ee52947eb31d..d3b1cdafd4c2 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -7,6 +7,7 @@
>  #include <linux/sched/mm.h>
>  #include <linux/stop_machine.h>
>  
> +#include "display/intel_display.h"
>  #include "display/intel_display_types.h"
>  #include "display/intel_overlay.h"
>  
> @@ -729,6 +730,28 @@ static void nop_submit_request(struct i915_request 
> *request)
>       intel_engine_queue_breadcrumbs(engine);
>  }
>  
> +static bool reset_clobbers_display(struct drm_i915_private *i915)
> +{
> +     struct intel_crtc *crtc;
> +
> +     if (!INTEL_INFO(i915)->gpu_reset_clobbers_display)
> +             return false;
> +
> +     /*
> +      * While this appears racy, we should only be inspecting the display
> +      * state at runtime from inside a GPU reset, which will be serialized
> +      * with modesets on affected machines. For a full device reset,
> +      * we should already have cleared the active CRTC state in
> +      * intel_prepare_reset().
> +      */
> +     for_each_intel_crtc(&i915->drm, crtc) {
> +             if (crtc->active)
> +                     return true;
> +     }
> +
> +     return false;
> +}
> +
>  static void __intel_gt_set_wedged(struct intel_gt *gt)
>  {
>       struct intel_engine_cs *engine;
> @@ -793,6 +816,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
>       struct intel_gt_timelines *timelines = &gt->timelines;
>       struct intel_timeline *tl;
>       unsigned long flags;
> +     bool ok;
>  
>       if (!test_bit(I915_WEDGED, &gt->reset.flags))
>               return true;
> @@ -838,7 +862,11 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
>       }
>       spin_unlock_irqrestore(&timelines->lock, flags);
>  
> -     intel_gt_sanitize(gt, false);
> +     ok = false;
> +     if (!reset_clobbers_display(gt->i915))
> +             ok = __intel_gt_reset(gt, ALL_ENGINES) == 0;

/me re-reading a lot of the reset code..

I'm rather confused about the whole reset flow. We now do a reset here,
but then we still do another one later on? Except for i915_gem_sanitize(),
which gets called during probe and resume so only does the single reset
I guess.

Hopefully we can't be marked as wedged during probe because I think
this gets called before crtc->active is populated so we'd just do the
reset anyway.

As for the resume cases, I think the display should be off already
when this gets called.

So I guess I'm not really sure what this check is meant to do for us.

> +     if (!ok)
> +             return false;
>  
>       /*
>        * Undo nop_submit_request. We prevent all new i915 requests from
> -- 
> 2.23.0

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to