i915: Defer transfer onto execution timeline to actual hw submission

Chris Wilson Thu, 10 Nov 2016 03:12:07 -0800

On Thu, Nov 10, 2016 at 10:43:29AM +0000, Tvrtko Ursulin wrote:
> 
> On 07/11/2016 13:59, Chris Wilson wrote:
> >Defer the transfer from the client's timeline onto the execution
> >timeline from the point of readiness to the point of actual submission.
> >For example, in execlists, a request is finally submitted to hardware
> >when the hardware is ready, and only put onto the hardware queue when
> >the request is ready. By deferring the transfer, we ensure that the
> >timeline is maintained in retirement order if we decide to queue the
> >requests onto the hardware in a different order than fifo.
> >
> >v2: Rebased onto distinct global/user timeline lock classes.
> >
> >Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
> >---
> > drivers/gpu/drm/i915/i915_gem_request.c    | 31 
> > +++++++++++++++++-------------
> > drivers/gpu/drm/i915/i915_gem_request.h    |  2 ++
> > drivers/gpu/drm/i915/i915_guc_submission.c | 14 +++++++++++++-
> > drivers/gpu/drm/i915/intel_lrc.c           | 23 +++++++++++++---------
> > drivers/gpu/drm/i915/intel_ringbuffer.c    |  2 ++
> > 5 files changed, 49 insertions(+), 23 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
> >b/drivers/gpu/drm/i915/i915_gem_request.c
> >index e41d51a68ed8..19c29fafb07a 100644
> >--- a/drivers/gpu/drm/i915/i915_gem_request.c
> >+++ b/drivers/gpu/drm/i915/i915_gem_request.c
> >@@ -307,25 +307,16 @@ static u32 timeline_get_seqno(struct i915_gem_timeline 
> >*tl)
> >     return atomic_inc_return(&tl->next_seqno);
> > }
> >
> >-static int __i915_sw_fence_call
> >-submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
> >+void __i915_gem_request_submit(struct drm_i915_gem_request *request)
> > {
> >-    struct drm_i915_gem_request *request =
> >-            container_of(fence, typeof(*request), submit);
> >     struct intel_engine_cs *engine = request->engine;
> >     struct intel_timeline *timeline;
> >-    unsigned long flags;
> >     u32 seqno;
> >
> >-    if (state != FENCE_COMPLETE)
> >-            return NOTIFY_DONE;
> >-
> >     /* Transfer from per-context onto the global per-engine timeline */
> >     timeline = engine->timeline;
> >     GEM_BUG_ON(timeline == request->timeline);
> >-
> >-    /* Will be called from irq-context when using foreign DMA fences */
> >-    spin_lock_irqsave(&timeline->lock, flags);
> >+    assert_spin_locked(&timeline->lock);
> >
> >     seqno = timeline_get_seqno(timeline->common);
> >     GEM_BUG_ON(!seqno);
> >@@ -345,15 +336,29 @@ submit_notify(struct i915_sw_fence *fence, enum 
> >i915_sw_fence_notify state)
> >     GEM_BUG_ON(!request->global_seqno);
> >     engine->emit_breadcrumb(request,
> >                             request->ring->vaddr + request->postfix);
> >-    engine->submit_request(request);
> >
> >     spin_lock(&request->timeline->lock);
> >     list_move_tail(&request->link, &timeline->requests);
> >     spin_unlock(&request->timeline->lock);
> >
> >     i915_sw_fence_commit(&request->execute);
> >+}
> >
> >-    spin_unlock_irqrestore(&timeline->lock, flags);
> >+static int __i915_sw_fence_call
> >+submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
> >+{
> >+    if (state == FENCE_COMPLETE) {
> >+            struct drm_i915_gem_request *request =
> >+                    container_of(fence, typeof(*request), submit);
> >+            struct intel_engine_cs *engine = request->engine;
> >+            unsigned long flags;
> >+
> >+            /* Will be called from irq-context when using foreign fences. */
> >+            spin_lock_irqsave_nested(&engine->timeline->lock, flags,
> >+                                     SINGLE_DEPTH_NESTING);
> >+            engine->submit_request(request);
> >+            spin_unlock_irqrestore(&engine->timeline->lock, flags);
 
> Would it be cleaner to move the lock taking to engine->submit_request?


Perhaps. Certainly pushes the ugliness down a layer!
 
> And is _nested still required? I thought you said it is not. I can't
> find signalling under the timeline lock either.

It is still required to sort out the ordering between external
lockclasses (vgem/nouveau/etc)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH v2 03/11] drm/i915: Defer transfer onto execution timeline to actual hw submission

Reply via email to