On Wed, Apr 23, 2014 at 10:52:10AM +0300, Imre Deak wrote:
> On Wed, 2014-04-23 at 09:07 +0200, Daniel Vetter wrote:
> > On Wed, Apr 23, 2014 at 01:13:40AM +0300, Imre Deak wrote:
> > > Atm we can end up in the GPU reset deferred work in D3 state if the last
> > > runtime PM reference is dropped between detecting a hang/scheduling the
> > > work and executing the work. At least one such case I could trigger is
> > > the simulated reset via the i915_wedged debugfs entry. Fix this by
> > > getting an RPM reference around accessing the HW in the reset work.
> > > 
> > > v2:
> > > - Instead of getting/putting the RPM reference in the reset work itself,
> > >   get it already before scheduling the work. By this we also prevent
> > >   going to D3 before the work gets to run, in addition to making sure
> > >   that we run the work itself in D0. (Ville, Daniel)
> > > v3:
> > > - fix inverted logic fail when putting the RPM ref on behalf of a
> > >   cancelled GPU reset work (Ville)
> > > v4:
> > > - Taking the RPM ref in the interrupt handler isn't really needed b/c
> > >   it's already guaranteed that we hold an RPM ref until the end of the
> > >   reset work in all cases we care about. So take the ref in the reset
> > >   work (for cases like i915_wedged_set). (Daniel)
> > > 
> > > Signed-off-by: Imre Deak <imre.d...@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_irq.c | 10 ++++++++++
> > >  1 file changed, 10 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_irq.c 
> > > b/drivers/gpu/drm/i915/i915_irq.c
> > > index a651d0d..0e47111 100644
> > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > @@ -2175,6 +2175,14 @@ static void i915_error_work_func(struct 
> > > work_struct *work)
> > >                              reset_event);
> > >  
> > >           /*
> > > +          * In most cases it's guaranteed that we get here with an RPM
> > > +          * reference held, for example because there is a pending GPU
> > > +          * request that won't finish until the reset is done. This
> > > +          * isn't the case at least when we get here by doing a
> > > +          * simulated reset via debugs, so get an RPM reference.
> > 
> >                                       ^debugfs
> > 
> > Also maybe "..., so get a RPM reference just to quiet the warnings."
> 
> Do you mean it's not a problem to perform the reset in D3 state other
> than getting warnings?

Ah right, we actually touch the hardware ;-) So your comment as-is is good
enough.
-Daniel

> 
> --Imre
> 
> > -Daniel
> > 
> > > +          */
> > > +         intel_runtime_pm_get(dev_priv);
> > > +         /*
> > >            * All state reset _must_ be completed before we update the
> > >            * reset counter, for otherwise waiters might miss the reset
> > >            * pending state and not properly drop locks, resulting in
> > > @@ -2184,6 +2192,8 @@ static void i915_error_work_func(struct work_struct 
> > > *work)
> > >  
> > >           intel_display_handle_reset(dev);
> > >  
> > > +         intel_runtime_pm_put(dev_priv);
> > > +
> > >           if (ret == 0) {
> > >                   /*
> > >                    * After all the gem state is reset, increment the reset
> > > -- 
> > > 1.8.4
> > > 
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > 
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to