On Tue, Jul 13, 2021 at 11:45:54AM +0100, Matthew Auld wrote:
> EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
> possible for userspace to bypass the GTT caching bits set by the kernel,
> as per the given object cache_level. This is troublesome since the heavy
> flush we apply when first acquiring the pages is skipped if the kernel
> thinks the object is coherent with the GPU. As a result it might be
> possible to bypass the cache and read the contents of the page directly,
> which could be stale data. If it's just a case of userspace shooting
> themselves in the foot then so be it, but since i915 takes the stance of
> always zeroing memory before handing it to userspace, we need to prevent
> this.
> 
> v2: this time actually set cache_dirty in put_pages()
> v3: move to get_pages() which looks simpler
> 
> BSpec: 34007
> References: 046091758b50 ("Revert "drm/i915/ehl: Update MOCS table for EHL"")
> Signed-off-by: Matthew Auld <matthew.a...@intel.com>
> Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadh...@intel.com>
> Cc: Francisco Jerez <francisco.jerez.pl...@intel.com>
> Cc: Lucas De Marchi <lucas.demar...@intel.com>
> Cc: Jon Bloomfield <jon.bloomfi...@intel.com>
> Cc: Chris Wilson <chris.p.wil...@intel.com>
> Cc: Matt Roper <matthew.d.ro...@intel.com>
> Cc: Daniel Vetter <dan...@ffwll.ch>

Reviewed-by: Daniel Vetter <daniel.vet...@ffwll.ch>

I was pondering whether we can have a solid testcase for this, but:
- igt lacks the visibility, since we can't check easily whether stuff
  leaks.
- selftests don't have rendercopy, where we could select the nasty
  mocs entry

So it's a bit awkward. Is there something, or is this pure hw workaround
stuff on theoretical grounds?
-Daniel
> ---
>  .../gpu/drm/i915/gem/i915_gem_object_types.h   |  6 ++++++
>  drivers/gpu/drm/i915/gem/i915_gem_shmem.c      | 18 ++++++++++++++++++
>  2 files changed, 24 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
> b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index da2194290436..7089d1b222c5 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -522,6 +522,12 @@ struct drm_i915_gem_object {
>        * I915_BO_CACHE_COHERENT_FOR_WRITE, i.e that the GPU will be coherent
>        * for both reads and writes though the CPU cache. So pretty much this
>        * should only be needed for I915_CACHE_NONE objects.
> +      *
> +      * Update: Some bonkers hardware decided to add the 'Bypass LLC' MOCS
> +      * entry, which defeats our @cache_coherent tracking, since userspace
> +      * can freely bypass the CPU cache when touching the pages with the GPU,
> +      * where the kernel is completely unaware. On such platform we need
> +      * apply the sledgehammer-on-acquire regardless of the @cache_coherent.
>        */
>       unsigned int cache_dirty:1;
>  
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> index 6a04cce188fc..11f072193f3b 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> @@ -182,6 +182,24 @@ static int shmem_get_pages(struct drm_i915_gem_object 
> *obj)
>       if (i915_gem_object_needs_bit17_swizzle(obj))
>               i915_gem_object_do_bit_17_swizzle(obj, st);
>  
> +     /*
> +      * EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
> +      * possible for userspace to bypass the GTT caching bits set by the
> +      * kernel, as per the given object cache_level. This is troublesome
> +      * since the heavy flush we apply when first gathering the pages is
> +      * skipped if the kernel thinks the object is coherent with the GPU. As
> +      * a result it might be possible to bypass the cache and read the
> +      * contents of the page directly, which could be stale data. If it's
> +      * just a case of userspace shooting themselves in the foot then so be
> +      * it, but since i915 takes the stance of always zeroing memory before
> +      * handing it to userspace, we need to prevent this.
> +      *
> +      * By setting cache_dirty here we make the clflush in set_pages
> +      * unconditional on such platforms.
> +      */
> +     if (IS_JSL_EHL(i915) && obj->flags & I915_BO_ALLOC_USER)
> +             obj->cache_dirty = true;
> +
>       __i915_gem_object_set_pages(obj, st, sg_page_sizes);
>  
>       return 0;
> -- 
> 2.26.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Reply via email to