Re: [Intel-gfx] [PATCH v2] drm/i915: Prevent TLB error on first execution on SNB

Daniel Vetter Tue, 10 Mar 2015 03:29:57 -0700

On Fri, Feb 13, 2015 at 02:35:59PM +0000, Chris Wilson wrote:
> Long ago I found that I was getting sporadic errors when booting SNB,
> with the symptom being that the first batch died with IPEHR != *ACTHD,
> typically caused by the TLB being invalid. These magically disappeared
> if I held the forcewake during the entire ring initialisation sequence.
> (It can probably be shortened to a short critical section, but the whole
> initialisation is full of register writes and so we would be taking and
> releasing forcewake almost continually, and so holding it over the
> entire sequence will probably be a net win!)
> 
> Note that some of the kernels on which I encountered the issue already
> had the deferred forcewake release, so it is still relevant.
> 
> I know that there have been a few other reports with similar failure
> conditions on SNB; I think this is one of them:
> References: https://bugs.freedesktop.org/show_bug.cgi?id=80913
> 
> v2: Wrap i915_gem_init_hw() with its own security blanket as we take
> that path following resume and reset.
> 
> Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 8d15c8110962..08450922f373 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4783,6 +4783,9 @@ i915_gem_init_hw(struct drm_device *dev)
>       if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
>               return -EIO;
>  
> +     /* Double layer security blanket, see i915_gem_init() */
> +     intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> +
>       if (dev_priv->ellc_size)
>               I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
>  
> @@ -4815,7 +4818,7 @@ i915_gem_init_hw(struct drm_device *dev)
>       for_each_ring(ring, dev_priv, i) {
>               ret = ring->init_hw(ring);
>               if (ret)
> -                     return ret;
> +                     goto out;
>       }
>  
>       for (i = 0; i < NUM_L3_SLICES(dev); i++)
> @@ -4832,9 +4835,11 @@ i915_gem_init_hw(struct drm_device *dev)
>               DRM_ERROR("Context enable failed %d\n", ret);
>               i915_gem_cleanup_ringbuffer(dev);
>  
> -             return ret;
> +             goto out;
>       }
>  
> +out:
> +     intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>       return ret;
>  }
>  
> @@ -4868,6 +4873,14 @@ int i915_gem_init(struct drm_device *dev)
>               dev_priv->gt.stop_ring = intel_logical_ring_stop;
>       }
>  
> +     /* This is just a security blanket to placate dragons.
> +      * On some systems, we very sporadically observe that the first TLBs
> +      * used by the CS may be stale, despite us poking the TLB reset. If
> +      * we hold the forcewake during initialisation these problems
> +      * just magically go away.
> +      */
> +     intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);


gem_init shouldn't ever touch the hw except through gem_init_hw. Do we
really need the double-layer here? Also the forcewake hack in the ring
init code should now be redundant, too.
-Daniel
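
For illustration only, a minimal userspace sketch of the nesting question
above. It assumes forcewake get/put is reference counted, so that an outer
get in gem_init plus an inner get in gem_init_hw wakes the hardware only
once and the inner pair is then merely a counter bump. Every fake_* name
below is hypothetical and is not an i915 function; the single "out:" exit
mirrors the shape the patch gives i915_gem_init_hw().

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the driver's forcewake state (assumption). */
struct fake_uncore {
	int wake_count;	/* current nesting depth of get/put */
	int hw_awake;	/* 1 while the "hardware" is held awake */
	int hw_wakes;	/* how many times the hardware was actually woken */
};

static void fake_forcewake_get(struct fake_uncore *u)
{
	/* Only the outermost get touches the "hardware". */
	if (u->wake_count++ == 0) {
		u->hw_awake = 1;
		u->hw_wakes++;
	}
}

static void fake_forcewake_put(struct fake_uncore *u)
{
	assert(u->wake_count > 0);	/* catch unbalanced put */
	if (--u->wake_count == 0)
		u->hw_awake = 0;
}

/* Inner layer: every exit path goes through "out" so the put is never skipped. */
static int fake_init_hw(struct fake_uncore *u, int fail_ring)
{
	int ret = 0;

	fake_forcewake_get(u);

	if (fail_ring) {	/* simulated ring->init_hw() failure */
		ret = -EIO;
		goto out;
	}

	/* ... further register writes would go here ... */

out:
	fake_forcewake_put(u);
	return ret;
}

/* Outer layer: holds its own reference around the whole init sequence. */
static int fake_init(struct fake_uncore *u, int fail_ring)
{
	int ret;

	fake_forcewake_get(u);
	ret = fake_init_hw(u, fail_ring);
	fake_forcewake_put(u);
	return ret;
}
```

Under that assumption the outer layer costs nothing extra on the init path,
but it also adds nothing if gem_init never touches the hardware outside
gem_init_hw; resume and reset call the inner function directly and are
covered by the inner get/put alone.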

> +
>       ret = i915_gem_init_userptr(dev);
>       if (ret)
>               goto out_unlock;
> @@ -4894,6 +4907,7 @@ int i915_gem_init(struct drm_device *dev)
>       }
>  
>  out_unlock:
> +     intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>       mutex_unlock(&dev->struct_mutex);
>  
>       return ret;
> -- 
> 2.1.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch