Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-21 Thread Peter Zijlstra
On Wed, Oct 21, 2015 at 01:28:04PM +0800, Ling Ma wrote:
> Ok, we will put the spinlock test into the perf bench.
The attached is what I used back when we were doing the initial qspinlock stuff. I've not looked at it in quite some time, so it might be out of sync with the kernel sources. spin

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-20 Thread Ling Ma
> I did see some performance improvement when I used your test program on a
> Haswell-EX system. It seems like the use of cmpxchg has forced the changed
> memory values to be visible to other processors earlier. I also ran your
> test on an older machine with Westmere-EX processors. This time, I

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-20 Thread Ling Ma
2015-10-20 17:16 GMT+08:00 Peter Zijlstra : > On Tue, Oct 20, 2015 at 11:24:02AM +0800, Ling Ma wrote: >> 2015-10-19 17:46 GMT+08:00 Peter Zijlstra : >> > On Mon, Oct 19, 2015 at 10:27:22AM +0800, ling.ma.prog...@gmail.com wrote: >> >> From: Ma Ling >> >> >> >> All load instructions can run specul

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-20 Thread Ling Ma
Ok, we will put the spinlock test into the perf bench.
Thanks
Ling
2015-10-20 16:48 GMT+08:00 Ingo Molnar:
> * Ling Ma wrote:
>> > So it would be nice to create a new user-space spinlock testing facility, via
>> > a new 'perf bench spinlock' feature or so. That way others can test and

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-20 Thread Waiman Long
On 10/19/2015 11:12 PM, Ling Ma wrote: 2015-10-20 1:18 GMT+08:00 Waiman Long: On 10/18/2015 10:27 PM, ling.ma.prog...@gmail.com wrote: From: Ma Ling All load instructions can run speculatively but they have to follow memory order rule in multiple cores as below: _x = _y = 0 Processor 0

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-20 Thread Peter Zijlstra
On Tue, Oct 20, 2015 at 11:24:02AM +0800, Ling Ma wrote: > 2015-10-19 17:46 GMT+08:00 Peter Zijlstra : > > On Mon, Oct 19, 2015 at 10:27:22AM +0800, ling.ma.prog...@gmail.com wrote: > >> From: Ma Ling > >> > >> All load instructions can run speculatively but they have to follow > >> memory order r

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-20 Thread Peter Zijlstra
On Tue, Oct 20, 2015 at 10:57:53AM +0800, Ling Ma wrote:
> > So it would be nice to create a new user-space spinlock testing facility, via a
> > new 'perf bench spinlock' feature or so. That way others can test and validate
> > your results on different hardware as well.
>
> Attache

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-20 Thread Ingo Molnar
* Ling Ma wrote:
> > So it would be nice to create a new user-space spinlock testing facility, via
> > a new 'perf bench spinlock' feature or so. That way others can test and
> > validate your results on different hardware as well.
>
> Attached the spinlock test module. Queued spinlock w

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-19 Thread Ling Ma
2015-10-19 17:46 GMT+08:00 Peter Zijlstra : > On Mon, Oct 19, 2015 at 10:27:22AM +0800, ling.ma.prog...@gmail.com wrote: >> From: Ma Ling >> >> All load instructions can run speculatively but they have to follow >> memory order rule in multiple cores as below: >> _x = _y = 0 >> >> Processor 0

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-19 Thread Ling Ma
2015-10-20 1:18 GMT+08:00 Waiman Long : > On 10/18/2015 10:27 PM, ling.ma.prog...@gmail.com wrote: >> >> From: Ma Ling >> >> All load instructions can run speculatively but they have to follow >> memory order rule in multiple cores as below: >> _x = _y = 0 >> >> Processor 0

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-19 Thread Ling Ma
2015-10-19 17:46 GMT+08:00 Peter Zijlstra : > On Mon, Oct 19, 2015 at 10:27:22AM +0800, ling.ma.prog...@gmail.com wrote: >> From: Ma Ling >> >> All load instructions can run speculatively but they have to follow >> memory order rule in multiple cores as below: >> _x = _y = 0 >> >> Processor 0

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-19 Thread Ling Ma
2015-10-19 17:33 GMT+08:00 Peter Zijlstra : > On Mon, Oct 19, 2015 at 10:27:22AM +0800, ling.ma.prog...@gmail.com wrote: >> From: Ma Ling >> >> All load instructions can run speculatively but they have to follow >> memory order rule in multiple cores as below: >> _x = _y = 0 >> >> Processor 0

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-19 Thread Ling Ma
> So it would be nice to create a new user-space spinlock testing facility, via a
> new 'perf bench spinlock' feature or so. That way others can test and validate
> your results on different hardware as well.
Attached the spinlock test module. Queued spinlock will run very slowly in user sp

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-19 Thread Waiman Long
On 10/19/2015 07:24 AM, Ingo Molnar wrote: * Peter Zijlstra wrote: On Mon, Oct 19, 2015 at 09:58:23AM +0200, Ingo Molnar wrote: * ling.ma.prog...@gmail.com wrote: From: Ma Ling All load instructions can run speculatively but they have to follow memory order rule in multiple cores as below

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-19 Thread Waiman Long
On 10/19/2015 05:33 AM, Peter Zijlstra wrote: On Mon, Oct 19, 2015 at 10:27:22AM +0800, ling.ma.prog...@gmail.com wrote: From: Ma Ling All load instructions can run speculatively but they have to follow memory order rule in multiple cores as below: _x = _y = 0 Processor 0

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-19 Thread Waiman Long
On 10/18/2015 10:27 PM, ling.ma.prog...@gmail.com wrote: From: Ma Ling All load instructions can run speculatively but they have to follow memory order rule in multiple cores as below: _x = _y = 0 Processor 0 Processor 1 mov r1, [ _y] //M1 mov [

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-19 Thread Ingo Molnar
* Peter Zijlstra wrote: > On Mon, Oct 19, 2015 at 09:58:23AM +0200, Ingo Molnar wrote: > > > > * ling.ma.prog...@gmail.com wrote: > > > > > From: Ma Ling > > > > > > All load instructions can run speculatively but they have to follow > > > memory order rule in multiple cores as below: > > >

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-19 Thread Peter Zijlstra
On Mon, Oct 19, 2015 at 10:27:22AM +0800, ling.ma.prog...@gmail.com wrote: > From: Ma Ling > > All load instructions can run speculatively but they have to follow > memory order rule in multiple cores as below: > _x = _y = 0 > > Processor 0 Processor 1 > > mov r1, [ _y

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-19 Thread Peter Zijlstra
On Mon, Oct 19, 2015 at 09:58:23AM +0200, Ingo Molnar wrote: > > * ling.ma.prog...@gmail.com wrote: > > > From: Ma Ling > > > > All load instructions can run speculatively but they have to follow > > memory order rule in multiple cores as below: > > _x = _y = 0 > > > > Processor 0

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-19 Thread Peter Zijlstra
On Mon, Oct 19, 2015 at 10:27:22AM +0800, ling.ma.prog...@gmail.com wrote: > From: Ma Ling > > All load instructions can run speculatively but they have to follow > memory order rule in multiple cores as below: > _x = _y = 0 > > Processor 0 Processor 1 > > mov r1, [ _y

Re: [RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-19 Thread Ingo Molnar
* ling.ma.prog...@gmail.com wrote: > From: Ma Ling > > All load instructions can run speculatively but they have to follow > memory order rule in multiple cores as below: > _x = _y = 0 > > Processor 0 Processor 1 > > mov r1, [ _y] //M1 mov [ _x],

[RFC PATCH] qspinlock: Improve performance by reducing load instruction rollback

2015-10-18 Thread ling . ma . program
From: Ma Ling All load instructions can run speculatively but they have to follow memory order rule in multiple cores as below: _x = _y = 0 Processor 0 Processor 1 mov r1, [ _y] //M1 mov [ _x], 1 //M3 mov r2, [ _x] //M2 mov
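
The preview above is cut off, so the following is only a rough user-space sketch of the kind of ordering test the patch description appears to be setting up. It assumes the truncated example is the usual two-processor pattern in which Processor 1 stores to _x and then to _y (the second store is not visible in the preview and is an assumption), and that the outcome r1 == 1 with r2 == 0 is the one the memory order rule forbids. It uses C11 atomics and pthreads rather than the x86 instructions of the original, and the helper name count_forbidden is made up for illustration; it is not code from the patch or from any attachment in this thread.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Shared variables, named after _x and _y in the quoted patch text. */
static atomic_int x, y;

/* Results observed by the reader; only inspected after pthread_join(). */
static int r1, r2;

/* "Processor 1": store to x, then to y (the store to y is an assumption,
 * since the preview is truncated before the second instruction). */
static void *writer(void *arg)
{
    (void)arg;
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    atomic_store_explicit(&y, 1, memory_order_release);
    return NULL;
}

/* "Processor 0": load y (M1), then load x (M2). */
static void *reader(void *arg)
{
    (void)arg;
    r1 = atomic_load_explicit(&y, memory_order_acquire);
    r2 = atomic_load_explicit(&x, memory_order_relaxed);
    return NULL;
}

/*
 * Hypothetical helper: run the two threads `iterations` times and return
 * how many runs observed r1 == 1 && r2 == 0. Returns -1 if a thread
 * could not be created.
 */
int count_forbidden(int iterations)
{
    int forbidden = 0;

    for (int i = 0; i < iterations; i++) {
        pthread_t t0, t1;

        atomic_store(&x, 0);
        atomic_store(&y, 0);

        if (pthread_create(&t0, NULL, reader, NULL) != 0)
            return -1;
        if (pthread_create(&t1, NULL, writer, NULL) != 0) {
            pthread_join(t0, NULL);
            return -1;
        }
        pthread_join(t0, NULL);
        pthread_join(t1, NULL);

        if (r1 == 1 && r2 == 0)
            forbidden++;
    }
    return forbidden;
}
```

With the release store and acquire load shown here, the C11 memory model rules out the r1 == 1, r2 == 0 outcome, so count_forbidden() should return 0 on any conforming implementation. On x86 these atomics are expected to compile to plain mov instructions, which is why the hardware may have to discard a speculatively executed load when the ordering would otherwise be violated. Build with -pthread on toolchains that need it.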