Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-17 Thread Waiman Long
On 7/17/19 1:44 PM, Alex Kogan wrote: >> On Jul 16, 2019, at 10:50 AM, Waiman Long wrote: >> >> On 7/16/19 10:29 AM, Alex Kogan wrote: On Jul 15, 2019, at 7:22 PM, Waiman Long >>> > wrote: On 7/15/19 5:30 PM, Waiman Long wrote: >> -#ifndef _GEN_PV_LOCK

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-17 Thread Alex Kogan
> On Jul 16, 2019, at 10:50 AM, Waiman Long wrote: > > On 7/16/19 10:29 AM, Alex Kogan wrote: >> >>> On Jul 15, 2019, at 7:22 PM, Waiman Long >> > wrote: >>> >>> On 7/15/19 5:30 PM, Waiman Long wrote: > -#ifndef _GEN_PV_LOCK_SLOWPATH > +#if !defined(_GEN_PV_

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-17 Thread Peter Zijlstra
On Wed, Jul 17, 2019 at 10:27:30AM -0400, Alex Kogan wrote: > > On Jul 17, 2019, at 4:39 AM, Peter Zijlstra wrote: > > static void cna_splice_tail(struct cna_node *cn, struct cna_node *head, > > struct cna_node *tail) > > { > > struct cna_node *list; > > > > /* remove [head,tail] */ > >

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-17 Thread Alex Kogan
> On Jul 17, 2019, at 4:59 AM, Peter Zijlstra wrote: > > On Wed, Jul 17, 2019 at 10:39:44AM +0200, Peter Zijlstra wrote: >> On Tue, Jul 16, 2019 at 08:47:24PM +0200, Peter Zijlstra wrote: > >>> My primary concern was readability; I find the above suggestion much >>> more readable. Maybe it can

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-17 Thread Alex Kogan
>> *mcs_node >> * ++ ++ ++ >> * | next | ---> |next| -> ... |next| -> NULL [Main queue] >> * | locked | -+ ++ ++ >> * ++ | >> * | +-+ ++ >> * +-> |mcs::next| -> ... |n

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-17 Thread Waiman Long
On 7/17/19 3:44 AM, Peter Zijlstra wrote: > On Tue, Jul 16, 2019 at 10:16:29PM -0400, Waiman Long wrote: >> A simple graphic to illustrate those queues will help too, for example > Very much yes! > >> /* >>  * MCS lock holder >>  * === >>  *    mcs_node >>  *   ++  ++  

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-17 Thread Peter Zijlstra
On Wed, Jul 17, 2019 at 10:39:44AM +0200, Peter Zijlstra wrote: > On Tue, Jul 16, 2019 at 08:47:24PM +0200, Peter Zijlstra wrote: > > My primary concern was readability; I find the above suggestion much > > more readable. Maybe it can be written differently; you'll have to play > > around a bit. >

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-17 Thread Peter Zijlstra
On Tue, Jul 16, 2019 at 08:47:24PM +0200, Peter Zijlstra wrote: > On Tue, Jul 16, 2019 at 01:19:16PM -0400, Alex Kogan wrote: > > > On Jul 16, 2019, at 11:50 AM, Peter Zijlstra wrote: > > > > static void cna_move(struct cna_node *cn, struct cna_node *cni) > > > { > > > struct cna_node *head, *t

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-17 Thread Peter Zijlstra
On Tue, Jul 16, 2019 at 10:16:29PM -0400, Waiman Long wrote: > A simple graphic to illustrate those queues will help too, for example Very much yes! > /* >  * MCS lock holder >  * === >  *    mcs_node >  *   ++  ++ ++ >  *   | next   | ---> |next| -> ... 

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-16 Thread Waiman Long
On 7/15/19 3:25 PM, Alex Kogan wrote: > +/* > + * Implement a NUMA-aware version of MCS (aka CNA, or compact NUMA-aware > lock). > + * > + * In CNA, spinning threads are organized in two queues, a main queue for > + * threads running on the same node as the current lock holder, and a > + * seconda

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-16 Thread Peter Zijlstra
On Tue, Jul 16, 2019 at 01:19:16PM -0400, Alex Kogan wrote: > > On Jul 16, 2019, at 11:50 AM, Peter Zijlstra wrote: > > static void cna_move(struct cna_node *cn, struct cna_node *cni) > > { > > struct cna_node *head, *tail; > > > > /* remove @cni */ > > WRITE_ONCE(cn->mcs.next, cni->

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-16 Thread Alex Kogan
Hi, Peter. Thanks for the review and all the suggestions! A couple of comments are inlined below. > On Jul 16, 2019, at 11:50 AM, Peter Zijlstra wrote: > > On Mon, Jul 15, 2019 at 03:25:34PM -0400, Alex Kogan wrote: >> +static struct cna_node *find_successor(struct mcs_spinlock *me) >> +{ >> +

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-16 Thread Peter Zijlstra
On Mon, Jul 15, 2019 at 03:25:34PM -0400, Alex Kogan wrote: > +static struct cna_node *find_successor(struct mcs_spinlock *me) > +{ > + struct cna_node *me_cna = CNA_NODE(me); > + struct cna_node *head_other, *tail_other, *cur; > + struct cna_node *next = CNA_NODE(READ_ONCE(me->next));

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-16 Thread Waiman Long
On 7/16/19 10:29 AM, Alex Kogan wrote: > >> On Jul 15, 2019, at 7:22 PM, Waiman Long > > wrote: >> >> On 7/15/19 5:30 PM, Waiman Long wrote: -#ifndef _GEN_PV_LOCK_SLOWPATH +#if !defined(_GEN_PV_LOCK_SLOWPATH) && !defined(_GEN_CNA_LOCK_SLOWPATH) #inc

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-16 Thread Waiman Long
On 7/16/19 10:26 AM, Alex Kogan wrote: >> On Jul 15, 2019, at 5:30 PM, Waiman Long wrote: >> >> On 7/15/19 3:25 PM, Alex Kogan wrote: >> >>> /* >>> - * On 64-bit architectures, the mcs_spinlock structure will be 16 bytes in >>> - * size and four of them will fit nicely in one 64-byte cacheline. Fo

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-16 Thread Alex Kogan
> On Jul 16, 2019, at 7:05 AM, Peter Zijlstra wrote: > > On Mon, Jul 15, 2019 at 03:25:34PM -0400, Alex Kogan wrote: >> +/** >> + * find_successor - Scan the main waiting queue looking for the first >> + * thread running on the same node as the lock holder. If found (call it >> + * thread T), m

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-16 Thread Alex Kogan
> On Jul 15, 2019, at 5:30 PM, Waiman Long wrote: > > On 7/15/19 3:25 PM, Alex Kogan wrote: >> In CNA, spinning threads are organized in two queues, a main queue for >> threads running on the same node as the current lock holder, and a >> secondary queue for threads running on other nodes. At t

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-16 Thread Peter Zijlstra
On Mon, Jul 15, 2019 at 03:25:34PM -0400, Alex Kogan wrote: > +/** > + * find_successor - Scan the main waiting queue looking for the first > + * thread running on the same node as the lock holder. If found (call it > + * thread T), move all threads in the main queue between the lock holder > + * a

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-16 Thread Peter Zijlstra
On Mon, Jul 15, 2019 at 05:30:01PM -0400, Waiman Long wrote: > On 7/15/19 3:25 PM, Alex Kogan wrote: > > /* > > - * On 64-bit architectures, the mcs_spinlock structure will be 16 bytes in > > - * size and four of them will fit nicely in one 64-byte cacheline. For > > - * pvqspinlock, however, we n

Re: [PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-15 Thread Waiman Long
On 7/15/19 3:25 PM, Alex Kogan wrote: > In CNA, spinning threads are organized in two queues, a main queue for > threads running on the same node as the current lock holder, and a > secondary queue for threads running on other nodes. At the unlock time, > the lock holder scans the main queue lookin

[PATCH v3 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

2019-07-15 Thread Alex Kogan
In CNA, spinning threads are organized in two queues, a main queue for threads running on the same node as the current lock holder, and a secondary queue for threads running on other nodes. At the unlock time, the lock holder scans the main queue looking for a thread running on the same node. If fo