Chris Wilson <ch...@chris-wilson.co.uk> writes:

> When capturing the bo, we allocate an array for min(vma->size,
> vma->node.size) pages, plus a bit for compression overhead. Through my
> and CI testing, this was sufficient for the mostly empty NULL context as
> it compressed well (or the out-of-bounds access simply didn't cause an
> issue). However, in real workloads on Cannonlake, we were overflowing
> that array and causing havoc with the random memory corruption.
>

When capturing the error object, we allocate a struct for bookkeeping
plus an array for min(vma->size, vma->node.size) pages and a bit for
compression overhead. We use this mechanism when capturing the state
object by constructing a fake vma for it. We forgot to set the vma size,
so the allocation catered only for the bookkeeping struct; writing the
pages then overflowed it, causing havoc with random memory corruption.
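
To illustrate the sizing problem, here is a minimal standalone C sketch.
It is not the driver code: struct model_vma, capture_slots(),
MODEL_PAGE_SIZE and the 10/8 overhead factor are assumptions made up for
illustration. It only models how a zero-initialised size field makes
min(vma->size, vma->node.size) collapse to 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical page size for this model; not taken from the driver. */
#define MODEL_PAGE_SIZE 4096ull

/* Simplified stand-ins for the vma and its node; illustrative only. */
struct model_node {
	uint64_t start;
	uint64_t size;
};

struct model_vma {
	struct model_node node;
	uint64_t size;	/* left as 0 when the initializer omits it */
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Number of page slots to allocate for a capture:
 * min(size, node.size) in whole pages, scaled by an assumed
 * 10/8 factor (rounded up) for compression overhead.
 */
static uint64_t capture_slots(const struct model_vma *vma)
{
	uint64_t pages;

	assert(vma != NULL);

	pages = min_u64(vma->size, vma->node.size) / MODEL_PAGE_SIZE;
	return (10 * pages + 7) / 8;
}
```

With only .node.size set, as in the fake vma before the patch,
capture_slots() returns 0, so nothing beyond the bookkeeping struct would
be allocated. With .size set as well, an 8-page object yields 10 slots
under these assumptions.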

This is how I see it, so with the above and any needed language fixes,

Reviewed-by: Mika Kuoppala <mika.kuopp...@linux.intel.com>

> Reported-by: Rafael Antognolli <rafael.antogno...@intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103964
> Fixes: 4e90a6e22272 ("drm/i915: Record default HW state in the GPU error state")
> Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
> Cc: Chris Wilson <ch...@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuopp...@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahti...@linux.intel.com>
> Tested-by: Rodrigo Vivi <rodrigo.v...@gmail.com>
> ---
>  drivers/gpu/drm/i915/i915_gpu_error.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 876be8f1d930..48418fb81066 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -1424,6 +1424,7 @@ capture_object(struct drm_i915_private *dev_priv,
>       if (obj && i915_gem_object_has_pages(obj)) {
>               struct i915_vma fake = {
>                       .node = { .start = U64_MAX, .size = obj->base.size },
> +                     .size = obj->base.size,
>                       .pages = obj->mm.pages,
>                       .obj = obj,
>               };
> -- 
> 2.15.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
