Hi Konstantin,

> Hi Juhamatti,
>
> > > > > Hi Konstantin,
> > > > >
> > > > > > > It is quite safe to move the barrier before DEQUEUE because
> > > > > > > after the DEQUEUE there is nothing really that we would want
> > > > > > > to protect with a read barrier.
> > > > > >
> > > > > > I don't think so.
> > > > > > If you remove the barrier after DEQUEUE(), that means that on
> > > > > > systems with relaxed memory ordering cons.tail could be updated
> > > > > > before DEQUEUE() is finished, and the producer can overwrite
> > > > > > queue entries that were not yet dequeued.
> > > > > > So if the cpu can really do such speculative out-of-order loads,
> > > > > > then we do need something like this for __rte_ring_sc_do_dequeue():
> > > > > >
> > > > > > rte_smp_rmb();
> > > > > > DEQUEUE_PTRS();
> > > > > > rte_smp_rmb();
> > > >
> > > > You have a valid point here: there needs to be a guarantee that
> > > > cons.tail cannot be updated before DEQUEUE is completed.
> > > > Nevertheless, my point was that it is not guaranteed with a read
> > > > barrier anyway. The implementation has the following sequence:
> > > >
> > > > DEQUEUE_PTRS(); (i.e. READ/LOAD)
> > > > rte_smp_rmb();
> > > > ..
> > > > r->cons.tail = cons_next; (i.e. WRITE/STORE)
> > > >
> > > > The read barrier above does not guarantee any ordering for the
> > > > following writes/stores. As a guarantee is needed, I think we in
> > > > fact need to change the read barrier on the dequeue to a full
> > > > barrier, which guarantees the read+write order, as follows:
> > > >
> > > > DEQUEUE_PTRS();
> > > > rte_smp_mb();
> > > > ..
> > > > r->cons.tail = cons_next;
> > > >
> > > > If you agree, I can for sure prepare another patch for this issue.
> > >
> > > Hmm, I think for __rte_ring_mc_do_dequeue() we are ok with
> > > smp_rmb(), as we have to read cons.tail anyway.
> >
> > Are you certain that this read creates a strong enough dependency
> > between the read of cons.tail and the write of it in mc_do_dequeue()?
>
> Yes, I believe so.
>
> > I think it does not really create any control dependency there, as
> > the next write is not dependent on the result of the read.
>
> I think it is dependent: cons.tail can be updated only if its current
> value is equal to the precomputed cons_head.
> So the cpu has to read the cons.tail value first.
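Just to make sure we are looking at the same sequence, below is a toy model
of the tail end of the mc dequeue path. This is not the actual rte_ring.h
code: the names toy_ring, slots and toy_pause() are made up, the plain copy
loop stands in for DEQUEUE_PTRS(), and the DPDK headers are assumed only for
rte_smp_rmb().

/*
 * Toy model of the tail end of the multi-consumer dequeue, only to make
 * the ordering question concrete.  NOT the rte_ring code: toy_ring, slots
 * and toy_pause() are made-up names, and stats/error paths are omitted.
 */
#include <stdint.h>
#include <rte_atomic.h>		/* rte_smp_rmb(); assumes DPDK headers on the include path */

struct toy_ring {
	volatile uint32_t cons_tail;	/* stands for r->cons.tail */
	uint32_t mask;
	void *slots[1024];		/* stands for the ring objects */
};

static inline void toy_pause(void) { }	/* stands for rte_pause() */

static inline void
toy_mc_dequeue_finish(struct toy_ring *r, void **obj_table,
		      uint32_t cons_head, uint32_t cons_next, unsigned int n)
{
	unsigned int i;

	/* LOADs from the ring (DEQUEUE_PTRS() in the real code) */
	for (i = 0; i < n; i++)
		obj_table[i] = r->slots[(cons_head + i) & r->mask];

	rte_smp_rmb();		/* read barrier of the current implementation */

	/*
	 * Wait for any preceding consumers: cons_tail is READ here and
	 * compared against the precomputed cons_head.
	 */
	while (r->cons_tail != cons_head)
		toy_pause();

	/*
	 * STORE: once this becomes visible, the producer may overwrite
	 * the slots that were read above.  The question is whether the
	 * read in the loop above is enough to keep this store from
	 * becoming visible before the copy loop's loads have completed.
	 */
	r->cons_tail = cons_next;
}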
I was thinking that it might match this processing pattern:

S1. if (a == b)
S2.     a = a + b
S3. b = a + b

Above, S3 has no control dependency on S1 and can in theory be executed
in parallel. In the (simplified) implementation we have:

X1. while (a != b)
X2. { }
X3. a = c

If we consider S3 and X3 equivalent, there might be a coherence problem
without a write barrier after the while(). It may of course be purely
theoretical, depending on the HW. This is anyway the reason for my
suggestion to have a write barrier also in mc_do_dequeue(), just before
the tail update (a sketch of the placement I have in mind is at the end
of this mail).

--
Juhamatti

> > The CPU also knows already the value that will be written to
> > cons.tail, and that value does not depend on the previous read
> > either. The CPU does not know we are planning to do a spinlock
> > there, so it might do things out-of-order without proper
> > dependencies.
> >
> > > For __rte_ring_sc_do_dequeue(), I think you are right, we might
> > > need something stronger.
> > > I don't want to put rte_smp_mb() here as it would cause a full HW
> > > barrier even on machines with strong memory order (IA).
> > > I think that rte_smp_wmb() might be enough here:
> > > it would force the cpu to wait till the writes in DEQUEUE_PTRS()
> > > become visible, which means the reads have to be completed too.
> >
> > In practice I think that rte_smp_wmb() would work fine, even though
> > it is not strictly according to the book. The solution below would
> > be my proposal as a fix to the issue of sc dequeueing (and also to
> > mc dequeueing, if we have the problem of the CPU completely ignoring
> > the spinlock in reality there):
> >
> > DEQUEUE_PTRS();
> > ..
> > rte_smp_wmb();
> > r->cons.tail = cons_next;
>
> As I said in the previous email, it looks good to me for
> __rte_ring_sc_do_dequeue(), but I am interested to hear what the ARM
> and PPC maintainers think about it.
> Jan, Jerin, do you have any comments on it?
> Chao, sorry, but I am still not sure why PPC is considered an
> architecture with strong memory ordering?
> Maybe I am missing something obvious here.
> Thanks
> Konstantin
>
> > --
> > Juhamatti
> >
> > > Another option would be to define a new macro, rte_weak_mb() or so,
> > > that would be expanded into a CB on boxes with a strong memory
> > > model, and into a full MB on machines with relaxed ones.
> > > Interested to hear what the ARM and PPC guys think.
> > > Konstantin
> > >
> > > P.S. Another thing, a bit off-topic, for the PPC guys:
> > > As I can see, smp_rmb/smp_wmb are just compiler barriers:
> > >
> > > find lib/librte_eal/common/include/arch/ppc_64/ -type f | xargs grep smp_
> > > lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h:#define rte_smp_mb() rte_mb()
> > > lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h:#define rte_smp_wmb() rte_compiler_barrier()
> > > lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()
> > >
> > > My knowledge about PPC architecture is rudimentary, but is that
> > > really enough?
> > >
> > > > Thanks,
> > > > --
> > > > Juhamatti
> > > >
> > > > > Konstantin
> > > > >
> > > > > > > The read barrier is mapped to a compiler barrier on strong
> > > > > > > memory model systems and this works fine too, as the order
> > > > > > > of the head/tail updates is still guaranteed in the new
> > > > > > > location. Even if the problem would be theoretical on most
> > > > > > > systems, it is worth fixing, as the risk for problems is
> > > > > > > very low.
> > > > > > >
> > > > > > > --
> > > > > > > Juhamatti
> > > > > > >
> > > > > > > > Konstantin
> > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Juhamatti Kuusisaari
> > > > > > > > > > > <juhamatti.kuusisaari at coriant.com>
> > > > > > > > > > > ---
> > > > > > > > > > >  lib/librte_ring/rte_ring.h | 6 ++++--
> > > > > > > > > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> > > > > > > > > > > index eb45e41..a923e49 100644
> > > > > > > > > > > --- a/lib/librte_ring/rte_ring.h
> > > > > > > > > > > +++ b/lib/librte_ring/rte_ring.h
> > > > > > > > > > > @@ -662,9 +662,10 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
> > > > > > > > > > >  			cons_next);
> > > > > > > > > > >  	} while (unlikely(success == 0));
> > > > > > > > > > >
> > > > > > > > > > > +	rte_smp_rmb();
> > > > > > > > > > > +
> > > > > > > > > > >  	/* copy in table */
> > > > > > > > > > >  	DEQUEUE_PTRS();
> > > > > > > > > > > -	rte_smp_rmb();
> > > > > > > > > > >
> > > > > > > > > > >  	/*
> > > > > > > > > > >  	 * If there are other dequeues in progress that preceded us,
> > > > > > > > > > > @@ -746,9 +747,10 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
> > > > > > > > > > >  	cons_next = cons_head + n;
> > > > > > > > > > >  	r->cons.head = cons_next;
> > > > > > > > > > >
> > > > > > > > > > > +	rte_smp_rmb();
> > > > > > > > > > > +
> > > > > > > > > > >  	/* copy in table */
> > > > > > > > > > >  	DEQUEUE_PTRS();
> > > > > > > > > > > -	rte_smp_rmb();
> > > > > > > > > > >
> > > > > > > > > > >  	__RING_STAT_ADD(r, deq_success, n);
> > > > > > > > > > >  	r->cons.tail = cons_next;
> > > > > > > > > > > --
> > > > > > > > > > > 2.9.0
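P.S. To make the write-barrier suggestion above concrete: in terms of the
toy model earlier in this mail (so again only a sketch, not a real patch,
and assuming the DPDK headers for rte_smp_wmb()), the tail publication
would end up looking like this.

#include <stdint.h>
#include <rte_atomic.h>	/* rte_smp_wmb(); assumes DPDK headers on the include path */

/*
 * Sketch of the proposed tail publication.  'tail' stands for
 * &r->cons.tail; this is only meant to show the barrier placement.
 */
static inline void
toy_publish_tail(volatile uint32_t *tail, uint32_t cons_next)
{
	/*
	 * Order the copy-out stores (and thus the loads they depend on)
	 * before the tail store becomes visible, so the producer cannot
	 * reuse the slots too early.
	 */
	rte_smp_wmb();

	*tail = cons_next;
}

In the real code this would simply mean rte_smp_wmb() right before
"r->cons.tail = cons_next" in both __rte_ring_sc_do_dequeue() and
__rte_ring_mc_do_dequeue().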