Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-03-01 Thread Linus Torvalds
On Fri, Mar 1, 2013 at 10:18 AM, Davidlohr Bueso wrote: > On Fri, 2013-03-01 at 01:42 -0500, Rik van Riel wrote: >> >> Checking try_atomic_semop and do_smart_update, it looks like neither >> is using atomic operations. That part of the semaphore code would >> still benefit from spinlocks. > > Agre

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-03-01 Thread Rik van Riel
On 03/01/2013 01:18 PM, Davidlohr Bueso wrote: On Fri, 2013-03-01 at 01:42 -0500, Rik van Riel wrote: On 02/28/2013 06:09 PM, Linus Torvalds wrote: So I almost think that *everything* there in the semaphore code could be done under RCU. The actual spinlock doesn't seem to much matter, at least

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-03-01 Thread Davidlohr Bueso
On Fri, 2013-03-01 at 01:42 -0500, Rik van Riel wrote: > On 02/28/2013 06:09 PM, Linus Torvalds wrote: > > > So I almost think that *everything* there in the semaphore code could > > be done under RCU. The actual spinlock doesn't seem to much matter, at > > least for semaphores. The semaphore valu

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-28 Thread Rik van Riel
On 02/28/2013 06:09 PM, Linus Torvalds wrote: So I almost think that *everything* there in the semaphore code could be done under RCU. The actual spinlock doesn't seem to much matter, at least for semaphores. The semaphore values themselves seem to be protected by the atomic operations, but I mi

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-28 Thread Linus Torvalds
On Thu, Feb 28, 2013 at 2:38 PM, Rik van Riel wrote: > > I could see doing the permission checks under a seq lock. > > If the permissions, or any other aspect of the semaphore > array changed while we were doing our permission check, > we can simply jump back to the top of the function and > try a

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-28 Thread Rik van Riel
On 02/28/2013 04:58 PM, Linus Torvalds wrote: I'm not seeing any real reason the permission checking couldn't be done just under the RCU lock, before we get the spinlock. Except for the fact that the "helper" routines in ipc/util.c are written the way they are, so it's a layering violation. But

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-28 Thread Linus Torvalds
On Thu, Feb 28, 2013 at 1:14 PM, Rik van Riel wrote: > > I have modified one of the semop tests to use multiple semaphores. Ooh yeah. This shows contention quite nicely. And it's all from ipc_lock, and looking at the top-10 offenders of the profile: 43.01% semop-multi [kernel.kallsyms] [k]

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-28 Thread Rik van Riel
On 02/28/2013 03:26 PM, Linus Torvalds wrote: On Thu, Feb 28, 2013 at 10:22 AM, Linus Torvalds wrote: I'm sure there are other things we could do to improve ipc lock times even if we don't actually split the lock, but the security one might be a good first step. Btw, if somebody has a benchm

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-28 Thread Linus Torvalds
On Thu, Feb 28, 2013 at 10:22 AM, Linus Torvalds wrote: > > I'm sure there are other things we could do to improve ipc lock times > even if we don't actually split the lock, but the security one might > be a good first step. Btw, if somebody has a benchmark for threads using multiple ipc semaphor

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-28 Thread Linus Torvalds
On Thu, Feb 28, 2013 at 7:13 AM, Rik van Riel wrote: > > Btw, the IPC lock is already fairly fine grained. One ipc > lock is allocated for each set of semaphores allocated through > sys_semget. Looking up those semaphores in the namespace, when > they are used later, is done under RCU. Bullshit.

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-28 Thread Rik van Riel
On 02/27/2013 11:49 PM, Linus Torvalds wrote: On Wed, Feb 27, 2013 at 8:06 PM, Davidlohr Bueso wrote: The attached file shows how the amount of sys time used by the ipc lock for a 4 and 8 socket box. I have to say, even with the improvements, that looks pretty disgusting. It really makes me

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-27 Thread Linus Torvalds
On Wed, Feb 27, 2013 at 8:06 PM, Davidlohr Bueso wrote: > > The attached file shows how the amount of sys time used by the ipc lock > for a 4 and 8 socket box. I have to say, even with the improvements, that looks pretty disgusting. It really makes me wonder if that thing couldn't be done better

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-27 Thread Davidlohr Bueso
On Wed, 2013-02-27 at 21:58 -0500, Rik van Riel wrote: > On 02/27/2013 05:13 PM, Linus Torvalds wrote: > > > > On Feb 27, 2013 1:56 PM, "Rik van Riel" > > wrote: > >> > >> No argument there, but that does in no way negate the need for some > >> performance robustness. > > >

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-27 Thread Linus Torvalds
On Wed, Feb 27, 2013 at 6:58 PM, Rik van Riel wrote: > > On the other hand, both MCS and the fast queue locks > implemented by Michel showed low variability and high > performance. On microbenchmarks, and when implemented for only one single subsystem, yes. > The numbers for Michel's MCS and fas

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-27 Thread Rik van Riel
On 02/27/2013 05:13 PM, Linus Torvalds wrote: On Feb 27, 2013 1:56 PM, "Rik van Riel" wrote: No argument there, but that does in no way negate the need for some performance robustness. The very numbers you posted showed that the backoff was *not* more robust. Quite t

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-27 Thread Rik van Riel
On 02/27/2013 03:18 PM, Linus Torvalds wrote: On Wed, Feb 27, 2013 at 11:53 AM, Rik van Riel wrote: If we have two classes of spinlocks, I suspect we would be better off making those high-demand spinlocks MCS or LCH locks, which have the property that having N+1 CPUs contend on the lock will n

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-27 Thread Linus Torvalds
On Wed, Feb 27, 2013 at 11:53 AM, Rik van Riel wrote: > > If we have two classes of spinlocks, I suspect we would be better > off making those high-demand spinlocks MCS or LCH locks, which have > the property that having N+1 CPUs contend on the lock will never > result in slower aggregate throughp

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-27 Thread Rik van Riel
On 02/27/2013 12:10 PM, Linus Torvalds wrote: Ugh. That really is rather random. "short" and fserver seems to improve a lot (including the "new" version), the others look like they are either unchanged or huge regressions. Is there any way to get profiles for the improved versions vs the regres

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-27 Thread Linus Torvalds
On Wed, Feb 27, 2013 at 8:42 AM, Rik van Riel wrote: > > To keep the results readable and relevant, I am reporting the > plateau performance numbers. Comments are given where required. > > 3.7.6 vanilla 3.7.6 w/ backoff > > all_utime 333000 333000 > alltest

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-27 Thread Rik van Riel
On 02/13/2013 08:21 PM, Linus Torvalds wrote: On Wed, Feb 13, 2013 at 3:41 PM, Rik van Riel wrote: I have an example of the second case. It is a test case from a customer issue, where an application is contending on semaphores, doing semaphore lock and unlock operations. The test case simply h

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-15 Thread Ingo Molnar
* Linus Torvalds wrote: > Btw, it may end up that almost nobody cares. Modern CPU's are > really good at handling the straightforward "save/restore to > stack" instructions. One of the reasons I care is not > performance per se, but the fact that I still look at asm > code every time I do a

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-14 Thread Benjamin Herrenschmidt
On Wed, 2013-02-13 at 10:30 -0800, Linus Torvalds wrote: > On Wed, Feb 13, 2013 at 8:20 AM, Linus Torvalds > wrote: > > > > Adding an external function call is *horrible*, and you might almost > > as well just uninline the spinlock entirely if you do this. It means > > that all the small callers n

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-14 Thread Linus Torvalds
On Thu, Feb 14, 2013 at 2:50 AM, Ingo Molnar wrote: > > At least on x86, how about saving *all* volatile registers in > the slow out of line code path (to stack)? Sure. The reason I suggested perhaps not saving %rax/%rdx is simply that if it's a function that returns a value, %rax obviously canno

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-14 Thread Ingo Molnar
* Linus Torvalds wrote: > On Wed, Feb 13, 2013 at 8:20 AM, Linus Torvalds > wrote: > > > > Adding an external function call is *horrible*, and you > > might almost as well just uninline the spinlock entirely if > > you do this. It means that all the small callers now have > > their registers

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-14 Thread Ingo Molnar
* Linus Torvalds wrote: > > Eric got a 45% increase in network throughput, and I saw a > > factor 4x or so improvement with the semaphore test. I > > realize these are not "real workloads", and I will give you > > numbers with those once I have gathered some, on different > > systems. > > G

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-13 Thread H. Peter Anvin
On 02/13/2013 05:31 PM, Linus Torvalds wrote: > On Wed, Feb 13, 2013 at 4:54 PM, H. Peter Anvin wrote: >> >> It does for the callee, but only on a whole-file basis. It would be a >> lot nicer if we could do it with function attributes. > > A way to just set the callee-clobbered list on a per-fun

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-13 Thread Linus Torvalds
On Wed, Feb 13, 2013 at 5:21 PM, Linus Torvalds wrote: > > Now, on other machines you get the call chain even with pebs because > you can get the whole Oops, that got cut short early, because I started looking up when PEBS and the last-branch-buffer work together, and couldn't find it, and then c

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-13 Thread Linus Torvalds
On Wed, Feb 13, 2013 at 4:54 PM, H. Peter Anvin wrote: > > It does for the callee, but only on a whole-file basis. It would be a > lot nicer if we could do it with function attributes. A way to just set the callee-clobbered list on a per-function basis would be lovely. Gcc has limited support fo

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-13 Thread Linus Torvalds
On Wed, Feb 13, 2013 at 3:41 PM, Rik van Riel wrote: > > I have an example of the second case. It is a test case > from a customer issue, where an application is contending on > semaphores, doing semaphore lock and unlock operations. The > test case simply has N threads, trying to lock and unlock

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-13 Thread H. Peter Anvin
On 02/13/2013 10:30 AM, Linus Torvalds wrote: > > Sadly, gcc doesn't seem to allow specifying which registers are > clobbered any easy way, which means that both the caller and the > callee *both* tend to need to have some asm interface. So we bothered > to do this for __read_lock_failed, but we h

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-13 Thread Rik van Riel
On 02/13/2013 05:40 PM, Linus Torvalds wrote: On Wed, Feb 13, 2013 at 2:21 PM, Rik van Riel wrote: What kind of numbers would you like? Numbers showing that the common case is not affected by this code? Or numbers showing that performance of something is improved with this code? Of course,

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-13 Thread Linus Torvalds
On Wed, Feb 13, 2013 at 2:21 PM, Rik van Riel wrote: > > What kind of numbers would you like? > > Numbers showing that the common case is not affected by this > code? > > Or numbers showing that performance of something is improved > with this code? > > Of course, the latter would point out a scal

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-13 Thread Rik van Riel
On 02/13/2013 02:36 PM, Linus Torvalds wrote: On Wed, Feb 13, 2013 at 11:08 AM, Rik van Riel wrote: The spinlock backoff code prevents these last cases from experiencing large performance regressions when the hardware is upgraded. I still want *numbers*. There are real cases where backoff do

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-13 Thread Linus Torvalds
On Wed, Feb 13, 2013 at 11:08 AM, Rik van Riel wrote: > > The spinlock backoff code prevents these last cases from > experiencing large performance regressions when the hardware > is upgraded. I still want *numbers*. There are real cases where backoff does exactly the reverse, and makes things m

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-13 Thread Rik van Riel
On 02/13/2013 11:20 AM, Linus Torvalds wrote: On Wed, Feb 13, 2013 at 4:06 AM, tip-bot for Rik van Riel wrote: x86/smp: Move waiting on contended ticket lock out of line Moving the wait loop for congested loops to its own function allows us to add things to that wait loop, without growing the

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-13 Thread Linus Torvalds
On Wed, Feb 13, 2013 at 8:20 AM, Linus Torvalds wrote: > > Adding an external function call is *horrible*, and you might almost > as well just uninline the spinlock entirely if you do this. It means > that all the small callers now have their registers trashed, whether > the unlikely function call

[tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-13 Thread tip-bot for Rik van Riel
Commit-ID: 4aef331850b637169ff036ed231f0d236874f310 Gitweb: http://git.kernel.org/tip/4aef331850b637169ff036ed231f0d236874f310 Author: Rik van Riel AuthorDate: Wed, 6 Feb 2013 15:04:03 -0500 Committer: Ingo Molnar CommitDate: Wed, 13 Feb 2013 09:06:28 +0100 x86/smp: Move waiting on con