On Mon, Jun 06, 2016 at 04:34:27PM +0100, Tvrtko Ursulin wrote:
> 
> On 03/06/16 17:08, Chris Wilson wrote:
> >If we flag the seqno as potentially stale upon receiving an interrupt,
> >we can use that information to reduce the frequency that we apply the
> >heavyweight coherent seqno read (i.e. if we wake up a chain of waiters).
> >
> >Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
> >---
> >  drivers/gpu/drm/i915/i915_drv.h          | 15 ++++++++++++++-
> >  drivers/gpu/drm/i915/i915_irq.c          |  1 +
> >  drivers/gpu/drm/i915/intel_breadcrumbs.c | 16 ++++++++++------
> >  drivers/gpu/drm/i915/intel_ringbuffer.h  |  1 +
> >  4 files changed, 26 insertions(+), 7 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_drv.h 
> >b/drivers/gpu/drm/i915/i915_drv.h
> >index 4ddb9ff319cb..a71d08199d57 100644
> >--- a/drivers/gpu/drm/i915/i915_drv.h
> >+++ b/drivers/gpu/drm/i915/i915_drv.h
> >@@ -3935,7 +3935,20 @@ static inline bool __i915_request_irq_complete(struct 
> >drm_i915_gem_request *req)
> >      * but it is easier and safer to do it every time the waiter
> >      * is woken.
> >      */
> >-    if (engine->irq_seqno_barrier) {
> >+    if (engine->irq_seqno_barrier && READ_ONCE(engine->irq_posted)) {
> >+            /* The ordering of irq_posted versus applying the barrier
> >+             * is crucial. The clearing of the current irq_posted must
> >+             * be visible before we perform the barrier operation,
> >+             * such that if a subsequent interrupt arrives, irq_posted
> >+             * is reasserted and our task rewoken (which causes us to
> >+             * do another __i915_request_irq_complete() immediately
> >+             * and reapply the barrier). Conversely, if the clear
> >+             * occurs after the barrier, then an interrupt that arrived
> >+             * whilst we waited on the barrier would not trigger a
> >+             * barrier on the next pass, and the read may not see the
> >+             * seqno update.
> >+             */
> >+            WRITE_ONCE(engine->irq_posted, false);
> 
> Why is this not smp_store_mb ?

We only require the ordering wrt to irq_seqno_barrier().

How about:

if (engine->irq_seqno_barrier &&
    cmpxchg_relaxed(&engine->irq_post, 1, 0)) {

Less shouty?

> >diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c 
> >b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >index 44346de39794..0f5fe114c204 100644
> >--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> >@@ -43,12 +43,18 @@ static void intel_breadcrumbs_fake_irq(unsigned long 
> >data)
> >
> >  static void irq_enable(struct intel_engine_cs *engine)
> >  {
> >+    /* Enabling the IRQ may miss the generation of the interrupt, but
> >+     * we still need to force the barrier before reading the seqno,
> >+     * just in case.
> >+     */
> >+    engine->irq_posted = true;
> 
> Should it be smp_store_mb here as well?

No, this is written/read on the same callchain.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to