Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-07 Thread Will Deacon
On Fri, Apr 07, 2017 at 01:30:11AM +1000, Nicholas Piggin wrote: > On Thu, 6 Apr 2017 15:13:53 +0100 > Will Deacon wrote: > > On Thu, Apr 06, 2017 at 10:59:58AM +1000, Nicholas Piggin wrote: > > > Thanks for taking a look. The default spin primitives should just > > > continue to do the right thin

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-07 Thread Nicholas Piggin
On Fri, 7 Apr 2017 11:43:49 +0200 Peter Zijlstra wrote: > On Thu, Apr 06, 2017 at 10:31:46AM -0700, Linus Torvalds wrote: > > But maybe "monitor" is really cheap. I suspect it's microcoded, > > though, which implies "no". > > On my IVB-EP (will also try on something newer): > > MONITOR

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-07 Thread Peter Zijlstra
On Thu, Apr 06, 2017 at 10:31:46AM -0700, Linus Torvalds wrote: > But maybe "monitor" is really cheap. I suspect it's microcoded, > though, which implies "no". On my IVB-EP (will also try on something newer): MONITOR ~332 cycles MWAIT ~224 cycles (C0, explicitly invalidated MONITOR) So yes, ex

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-06 Thread Nicholas Piggin
On Thu, 6 Apr 2017 12:41:52 -0700 Linus Torvalds wrote: > On Thu, Apr 6, 2017 at 12:23 PM, Peter Zijlstra wrote: > > > > Something like so then. According to the SDM mwait is a no-op if we do > > not execute monitor first. So this variant should get the first > > iteration without expensive inst

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-06 Thread Linus Torvalds
On Thu, Apr 6, 2017 at 12:23 PM, Peter Zijlstra wrote: > > Something like so then. According to the SDM mwait is a no-op if we do > not execute monitor first. So this variant should get the first > iteration without expensive instructions. No, the problem is that we *would* have executed a prior

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-06 Thread Peter Zijlstra
On Thu, Apr 06, 2017 at 10:31:46AM -0700, Linus Torvalds wrote: > And we'd probably want to make it even more strict, in that some mwait > implementations might simply not be very good for short waits. Yeah, we need to find something that works; assuming it's beneficial at all on modern chips. > >

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-06 Thread Linus Torvalds
On Thu, Apr 6, 2017 at 9:36 AM, Peter Zijlstra wrote: > > Something like the below, which is ugly (because I couldn't be bothered > to resolve the header recursion and thus duplicates the monitor/mwait > functions) and broken (because it hard assumes the hardware can do > monitor/mwait). Yeah, I

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-06 Thread Peter Zijlstra
On Thu, Apr 06, 2017 at 08:16:19AM -0700, Linus Torvalds wrote: > In theory x86 could use monitor/mwait for it too, in practice I think > it tends to still be too high latency (because it was originally just > designed for the idle loop). mwait got extended to actually be useful, > but I'm not sur

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-06 Thread Nicholas Piggin
On Thu, 6 Apr 2017 15:13:53 +0100 Will Deacon wrote: > Hi Nick, > > On Thu, Apr 06, 2017 at 10:59:58AM +1000, Nicholas Piggin wrote: > > On Wed, 05 Apr 2017 07:01:57 -0700 (PDT) > > David Miller wrote: > > > > > From: Nicholas Piggin > > > Date: Tue, 4 Apr 2017 13:02:33 +1000 > > > > > >

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-06 Thread Linus Torvalds
On Thu, Apr 6, 2017 at 7:13 AM, Will Deacon wrote: > > We've wrapped this up in the arm64 code as __cmpwait, and we use that > to build smp_cond_load_acquire. It would be nice to use the same machinery > for the conditional spinning here, unless you anticipate that we're only > going to be spinnin

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-06 Thread Will Deacon
Hi Nick, On Thu, Apr 06, 2017 at 10:59:58AM +1000, Nicholas Piggin wrote: > On Wed, 05 Apr 2017 07:01:57 -0700 (PDT) > David Miller wrote: > > > From: Nicholas Piggin > > Date: Tue, 4 Apr 2017 13:02:33 +1000 > > > > > On Mon, 3 Apr 2017 17:43:05 -0700 > > > Linus Torvalds wrote: > > > > >

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-05 Thread Nicholas Piggin
On Wed, 05 Apr 2017 07:01:57 -0700 (PDT) David Miller wrote: > From: Nicholas Piggin > Date: Tue, 4 Apr 2017 13:02:33 +1000 > > > On Mon, 3 Apr 2017 17:43:05 -0700 > > Linus Torvalds wrote: > > > >> But that depends on architectures having some pattern that we *can* > >> abstract. Would som

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-05 Thread David Miller
From: Nicholas Piggin Date: Tue, 4 Apr 2017 13:02:33 +1000 > On Mon, 3 Apr 2017 17:43:05 -0700 > Linus Torvalds wrote: > >> But that depends on architectures having some pattern that we *can* >> abstract. Would some "begin/in-loop/end" pattern like the above be >> sufficient? > > Yes. begin/in

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-03 Thread Nicholas Piggin
On Tue, 4 Apr 2017 13:02:33 +1000 Nicholas Piggin wrote: > On Mon, 3 Apr 2017 17:43:05 -0700 > Linus Torvalds wrote: > > > But that depends on architectures having some pattern that we *can* > > abstract. Would some "begin/in-loop/end" pattern like the above be > > sufficient? > > Yes. begi

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-03 Thread Nicholas Piggin
On Mon, 3 Apr 2017 17:43:05 -0700 Linus Torvalds wrote: > On Mon, Apr 3, 2017 at 4:50 PM, Nicholas Piggin wrote: > > If you have any ideas, I'd be open to them. > > So the idea would be that maybe we can just make those things > explicit. IOW, instead of having that magical looping construct

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-03 Thread Linus Torvalds
On Mon, Apr 3, 2017 at 4:50 PM, Nicholas Piggin wrote: > > POWER does not have an instruction like pause. We can only set current > thread priority, and current implementations do something like allocate > issue cycles to threads based on relative priorities. So there should > be at least one or t

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-03 Thread Nicholas Piggin
On Mon, 3 Apr 2017 08:31:30 -0700 Linus Torvalds wrote: > On Mon, Apr 3, 2017 at 1:13 AM, Nicholas Piggin wrote: > > > > The loops have some restrictions on what can be used, but they are > > intended to be small and simple so it's not generally a problem: > > - Don't use cpu_relax. > > - Don'