On Wed, Aug 23, 2017 at 01:39:08PM +0200, Nélio Laranjeiro wrote:
> On Mon, Aug 21, 2017 at 10:47:01AM +0300, Sagi Grimberg wrote:
> > From: Shahaf Shuler <shah...@mellanox.com>
> > 
> > The reason for the requirement of a barrier between the txq writes
> > and the doorbell record writes is to avoid a case where the device
> > reads the doorbell record's new value before the txq writes are flushed
> > to memory.
> > 
> > The current use of rte_wmb is not necessary, and can be replaced by
> > rte_compiler_barrier as it acts as a write memory barrier.
> > More details on this type of barrier can be found on [1]
> > 
> > Replacing the rte_wmb is also expected to improve the throughput.
> > 
> > [1] https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
> > 
> > Signed-off-by: Shahaf Shuler <shah...@mellanox.com>
> > Signed-off-by: Yongseok Koh <ys...@mellanox.com>
> > Signed-off-by: Alexander Solganik <solga...@gmail.com>
> > Signed-off-by: Sagi Grimberg <s...@grimberg.me>
> > ---
> >  drivers/net/mlx5/mlx5_rxtx.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
> > index 7de1d10863e5..59b9ff24fb82 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx.h
> > @@ -602,7 +602,7 @@ mlx5_tx_dbrec(struct txq *txq, volatile struct mlx5_wqe 
> > *wqe)
> >     uint64_t *dst = (uint64_t *)((uintptr_t)txq->bf_reg);
> >     volatile uint64_t *src = ((volatile uint64_t *)wqe);
> >  
> > -   rte_wmb();
> > +   rte_compiler_barrier();
> >     *txq->qp_db = htonl(txq->wqe_ci);
> >     /* Ensure ordering between DB record and BF copy. */
> >     rte_wmb();
> > -- 
> > 2.7.4
>  
> Acked-by: Nelio Laranjeiro <nelio.laranje...@6wind.com>
> 
While a compiler barrier may do on platforms with strong ordering, I'm
wondering if the rte_smp_wmb() macro may be needed here to give compiler
barrier or actual memory barrier depending on platform?

/Bruce

Reply via email to