On Tue, Mar 29, 2016 at 7:04 PM, Bruce Richardson <bruce.richardson at intel.com> wrote: > > On Tue, Mar 29, 2016 at 05:29:12PM +0200, Olivier MATZ wrote: > > Hi, > > > > > > On 03/29/2016 10:54 AM, Bruce Richardson wrote: > > >On Mon, Mar 28, 2016 at 06:48:07PM +0300, Lazaros Koromilas wrote: > > >>Hi Olivier, > > >> > > >>We could have two threads (running on different cores in the general > > >>case) that both succeed the cmpset operation. In the dequeue path, > > >>when n == 0, then cons_next == cons_head, and cmpset will always > > >>succeed. Now, if they both see an old r->cons.tail value from a > > >>previous dequeue, they can get stuck in the while > > > > > >Hi, > > > > > >I don't see how threads reading an "old r->cons.tail" value is even > > >possible. > > >The head and tail pointers on the ring are marked in the code as volatile, > > >so > > >all reads and writes to those values are always done from memory and not > > >cached > > >in registers. No deadlock should be possible on that while loop, unless a > > >process crashes in the middle of a ring operation. Each thread which > > >updates > > >the head pointer from x to y, is responsible for updating the tail pointer > > >in > > >a similar manner. The loop ensures the tail updates are in the same order > > >as the > > >head updates. > > > > > >If you believe deadlock is possible, can you outline the sequence of > > >operations > > >which would lead to such a state, because I cannot see how it could occur > > >without > > >a crash inside one of the threads. > > > > I think the deadlock Lazaros describes could occur in the following > > condition: > > > > current ring state > > r->prod.head = 0 > > r->prod.tail = 0 > > > > core 0 core 1 > > ==================================================================== > > enqueue 0 object > > cmpset(&r->prod.head, 0, 0) > > core 0 is interrupted here > > enqueue 1 object > > cmpset(&r->prod.head, 0, 1) > > copy the objects in box 0 > > while (r->prod.tail != prod_head)) > > r->prod.tail = prod_next > > copy 0 object (-> nothing to do) > > while (r->prod.tail != prod_head)) > > <loop forever> > > > > > > I think this issue is indeed fixed by Lazaros' patch (I missed it > > in previous review). However, I don't think this deadlock could > > happen once we avoided the (n == 0) case. > > > Yes, I agree with your scenario. However, I thought the issue was occuring > with > non-zero updates, but I may well be mistaken. > If it's fixed now, all good... :-) > > /Bruce
Hi all, Bruce, I thought it could be still possible because of my threads being stuck inside two dequeue(32) calls. But haven't been able to reproduce it with non-zero dequeues. And by trying to find a scenario of my own, it seems that at least one dequeue(0) is needed, similarly to Olivier's example on the enqueue path. Thanks, Lazaros.