On Fri, Jul 7, 2017 at 10:17 AM, Chris Wilson <ch...@chris-wilson.co.uk> wrote:
> The goal here was to minimise doing any thing or any check inside the > kernel that was not strictly required. For a userspace that assumes > complete control over the cache domains, the kernel is usually using > outdated information and may trigger clflushes where none were > required. > > However, swapping is a situation where userspace has no knowledge of the > domain transfer, and will leave the object in the CPU cache. The kernel > must flush this out to the backing storage prior to use with the GPU. As > we use an asynchronous task tracked by an implicit fence for this, we > also need to cancel the ASYNC flag on the object so that the object will > wait for the clflush to complete before being executed. This also absolves > userspace of the responsibility imposed by commit 77ae9957897d ("drm/i915: > Enable userspace to opt-out of implicit fencing") that its needed to ensure > that the object was out of the CPU cache prior to use on the GPU. > Given that domain tracking is global, we can also run into interesting issues if process A does a CPU map, writes to it, and then hands it to process B which uses it from the GPU. Without being very aggressive about set_domain, we have no knowledge that this is a problem. > Fixes: 77ae9957897d ("drm/i915: Enable userspace to opt-out of implicit > fencing") > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101571 > Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> > Cc: Joonas Lahtinen <joonas.lahti...@linux.intel.com> > Cc: Jason Ekstrand <ja...@jlekstrand.net> > Chris and I discussed this for a while on Friday and I think I understand the problem and I think this is the correct fix. Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net> That said, I don't think I'm really qualified to review as I don't understand all of the details. --Jason > --- > drivers/gpu/drm/i915/i915_gem_clflush.c | 7 ++++--- > drivers/gpu/drm/i915/i915_gem_clflush.h | 2 +- > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 10 ++++++---- > 3 files changed, 11 insertions(+), 8 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_clflush.c > b/drivers/gpu/drm/i915/i915_gem_clflush.c > index 152f16c11878..348b29a845c9 100644 > --- a/drivers/gpu/drm/i915/i915_gem_clflush.c > +++ b/drivers/gpu/drm/i915/i915_gem_clflush.c > @@ -114,7 +114,7 @@ i915_clflush_notify(struct i915_sw_fence *fence, > return NOTIFY_DONE; > } > > -void i915_gem_clflush_object(struct drm_i915_gem_object *obj, > +bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, > unsigned int flags) > { > struct clflush *clflush; > @@ -128,7 +128,7 @@ void i915_gem_clflush_object(struct > drm_i915_gem_object *obj, > */ > if (!i915_gem_object_has_struct_page(obj)) { > obj->cache_dirty = false; > - return; > + return false; > } > > /* If the GPU is snooping the contents of the CPU cache, > @@ -140,7 +140,7 @@ void i915_gem_clflush_object(struct > drm_i915_gem_object *obj, > * tracking. > */ > if (!(flags & I915_CLFLUSH_FORCE) && obj->cache_coherent) > - return; > + return false; > > trace_i915_gem_object_clflush(obj); > > @@ -179,4 +179,5 @@ void i915_gem_clflush_object(struct > drm_i915_gem_object *obj, > } > > obj->cache_dirty = false; > + return true; > } > diff --git a/drivers/gpu/drm/i915/i915_gem_clflush.h > b/drivers/gpu/drm/i915/i915_gem_clflush.h > index 2455a7820937..f390247561b3 100644 > --- a/drivers/gpu/drm/i915/i915_gem_clflush.h > +++ b/drivers/gpu/drm/i915/i915_gem_clflush.h > @@ -28,7 +28,7 @@ > struct drm_i915_private; > struct drm_i915_gem_object; > > -void i915_gem_clflush_object(struct drm_i915_gem_object *obj, > +bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, > unsigned int flags); > #define I915_CLFLUSH_FORCE BIT(0) > #define I915_CLFLUSH_SYNC BIT(1) > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > index 2c76d6980855..aeacfe9a0878 100644 > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > @@ -1826,7 +1826,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb) > int err; > > for (i = 0; i < count; i++) { > - const struct drm_i915_gem_exec_object2 *entry = > &eb->exec[i]; > + struct drm_i915_gem_exec_object2 *entry = &eb->exec[i]; > struct i915_vma *vma = exec_to_vma(entry); > struct drm_i915_gem_object *obj = vma->obj; > > @@ -1842,12 +1842,14 @@ static int eb_move_to_gpu(struct i915_execbuffer > *eb) > eb->request->capture_list = capture; > } > > + if (unlikely(obj->cache_dirty && !obj->cache_coherent)) { > + if (i915_gem_clflush_object(obj, 0)) > + entry->flags &= ~EXEC_OBJECT_ASYNC; > + } > + > if (entry->flags & EXEC_OBJECT_ASYNC) > goto skip_flushes; > > - if (unlikely(obj->cache_dirty && !obj->cache_coherent)) > - i915_gem_clflush_object(obj, 0); > - > err = i915_gem_request_await_object > (eb->request, obj, entry->flags & > EXEC_OBJECT_WRITE); > if (err) > -- > 2.13.2 > >
_______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx