On Tue, 2016-06-28 at 15:38 +0100, Tvrtko Ursulin wrote: > On 28/06/16 14:53, Imre Deak wrote: > > On ti, 2016-06-28 at 14:29 +0100, Tvrtko Ursulin wrote: > > > On 28/06/16 13:19, Imre Deak wrote: > > > > On ti, 2016-06-28 at 12:51 +0100, Tvrtko Ursulin wrote: > > > > > From: Tvrtko Ursulin <tvrtko.ursu...@intel.com> > > > > > > > > > > usleep_range is not recommended for waits shorten than 10us. > > > > > > > > > > Make the wait_for_us use the atomic variant for such waits. > > > > > > > > > > To do so we need to disable the !in_atomic warning for such uses > > > > > and also disable preemption since the macro is written in a way > > > > > to only be safe to be used in atomic context (local_clock() and > > > > > no second COND check after the timeout). > > > > > > > > > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursu...@intel.com> > > > > > Cc: Chris Wilson <ch...@chris-wilson.co.uk> > > > > > Cc: Imre Deak <imre.d...@intel.com> > > > > > Cc: Mika Kuoppala <mika.kuopp...@intel.com> > > > > > --- > > > > > drivers/gpu/drm/i915/intel_drv.h | 29 +++++++++++++++++++++-------- > > > > > 1 file changed, 21 insertions(+), 8 deletions(-) > > > > > > > > > > diff --git a/drivers/gpu/drm/i915/intel_drv.h > > > > > b/drivers/gpu/drm/i915/intel_drv.h > > > > > index 3156d8df7921..e21bf6e6f119 100644 > > > > > --- a/drivers/gpu/drm/i915/intel_drv.h > > > > > +++ b/drivers/gpu/drm/i915/intel_drv.h > > > > > @@ -69,20 +69,21 @@ > > > > > }) > > > > > > > > > > #define wait_for(COND, MS) _wait_for((COND), (MS) * 1000, > > > > > 1000) > > > > > -#define wait_for_us(COND, US) _wait_for((COND), (US), > > > > > 1) > > > > > > > > > > /* If CONFIG_PREEMPT_COUNT is disabled, in_atomic() always reports > > > > > false. */ > > > > > #if defined(CONFIG_DRM_I915_DEBUG) && defined(CONFIG_PREEMPT_COUNT) > > > > > -# define _WAIT_FOR_ATOMIC_CHECK WARN_ON_ONCE(!in_atomic()) > > > > > +# define _WAIT_FOR_ATOMIC_CHECK(ATOMIC) WARN_ON_ONCE((ATOMIC) && > > > > > !in_atomic()) > > > > > #else > > > > > -# define _WAIT_FOR_ATOMIC_CHECK do { } while (0) > > > > > +# define _WAIT_FOR_ATOMIC_CHECK(ATOMIC) do { } while (0) > > > > > #endif > > > > > > > > > > -#define _wait_for_atomic(COND, US) ({ \ > > > > > +#define _wait_for_atomic(COND, US, ATOMIC) ({ \ > > > > > unsigned long end__; \ > > > > > int ret__ = 0; \ > > > > > - _WAIT_FOR_ATOMIC_CHECK; \ > > > > > - BUILD_BUG_ON((US) > 50000); \ > > > > > + _WAIT_FOR_ATOMIC_CHECK(ATOMIC); \ > > > > > + BUILD_BUG_ON((ATOMIC) && (US) > 50000); \ > > > > > + if (!(ATOMIC)) \ > > > > > + preempt_disable(); \ > > > > > > > > Disabling preemption for this purpose (scheduling a timeout) could be > > > > frowned upon, although for 10us may be not an issue. Another > > > > > > Possibly, but I don't see how to otherwise do it. > > > > > > And about the number itself - I chose 10us just because usleep_range is > > > not recommended for <10us due setup overhead. > > > > > > > possibility would be to use cpu_clock() instead which would have some > > > > overhead in case of scheduling away from the initial CPU, but we'd only > > > > incur it for the non-atomic <10us case, so would be negligible imo. > > > > You'd also have to re-check the condition with that solution. > > > > > > How would you implement it with cpu_clock? What would you do when > > > re-scheduled? > > > > By calculating the expiry in the beginning with cpu_clock() > > using raw_smp_processor_id() and then calling cpu_clock() in > > time_after() with the same CPU id. cpu_clock() would then internally > > handle the scheduling away scenario. > > Right, but that is also not ideal since if the two cpu_clocks differ the > running time domain is not identical to the timeout one.
Hm, this is what I meant: int cpu = raw_smp_processor_id(); end = cpu_clock(cpu) + timeout; while (!time_after(cpu_clock(cpu), end)) check condition... So cpu_clock() would be always called with the same CPU id and hence it would return timestamps from the same time domain. In case of scheduling away at any point cpu_clock() would internally detect this and return the timestamp from the original CPU time domain. This also has the advantage that in case of a stable, synchronized TSC - which should be the generic case - cpu_clock() simply translates to reading the TSC, without disabling pre-emption. I don't see any problem with Chris' approach either though. > Probably would not matter but feels hacky. > > > > > Also could you explain how can we ignore hard IRQs as hinted by the > > > > comment in _wait_for_atomic()? > > > > > > Hm, in retrospect it does not look safe. Upside that after your fixes > > > from today it will be, since all remaining callers are with interrupts > > > disabled. > > > > Well, except for the GuC path, but that's for a 10ms timeout, so > > probably doesn't matter (or else we have a bigger problem). > > I've just sent a patch for that. > > > > And downside that the patch from this thread is not safe then > > > and would need the condition put back in. Possibly only in the !ATOMIC > > > case but that might be too fragile for the future. > > > > I'd say we'd need the extra check at least whenever hard IRQs are not > > disabled. Even then there could be NMIs or some other background stuff > > (ME) that could be a problem. OTOH we'd incur the overhead from the > > extra check only in the exceptional timeout case, so I think doing it > > in all cases wouldn't be a big problem. > > Yeah I'll put it in. > > Regards, > > Tvrtko > > > _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx