Hi Joyce,

On 9/6/19 1:34 PM, Joyce Kong wrote:
> In case VIRTIO_F_ORDER_PLATFORM(36) is not negotiated, then the frontend
> and backend are assumed to be implemented in software, that is they can
> run on identical CPUs in an SMP configuration. Thus a weak form of memory
> barriers like rte_smp_r/wmb, other than rte_cio_r/wmb, is sufficient for
> this case(vq->hw->weak_barriers == 1) and yields better performance.
> For the above case, this patch helps yielding even better performance
> by replacing the two-way barriers with C11 one-way barriers.
>
> Meanwhile, a read barrier is required to ensure ordering between
> descriptor's flags and content reads[1]. With C11, load-acquire can
> enforce the ordering instead of rmb barrier.
>
> [1]https://patchwork.dpdk.org/patch/49109/
>
> Signed-off-by: Joyce Kong <joyce.k...@arm.com>
> Reviewed-by: Gavin Hu <gavin...@arm.com>
> Reviewed-by: Phil Yang <phil.y...@arm.com>
> ---
>  drivers/net/virtio/virtio_rxtx.c                 | 26 ++++++++++++++++++------
>  drivers/net/virtio/virtio_user/virtio_user_dev.c |  6 +++++-
>  lib/librte_vhost/vhost.h                         |  2 +-
>  lib/librte_vhost/virtio_net.c                    | 11 +++++-----
>  4 files changed, 31 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
> index 27ead19..2a2153c 100644
> --- a/drivers/net/virtio/virtio_rxtx.c
> +++ b/drivers/net/virtio/virtio_rxtx.c
> @@ -456,8 +456,14 @@ virtqueue_enqueue_recv_refill_packed(struct virtqueue *vq,
>  	vq->vq_desc_head_idx = dxp->next;
>  	if (vq->vq_desc_head_idx == VQ_RING_DESC_CHAIN_END)
>  		vq->vq_desc_tail_idx = vq->vq_desc_head_idx;
> -	virtio_wmb(hw->weak_barriers);
> -	start_dp[idx].flags = flags;
> +
> +	if (hw->weak_barriers)
> +		__atomic_store_n(&start_dp[idx].flags, flags,
> +				 __ATOMIC_RELEASE);
> +	else {
> +		rte_cio_wmb();
> +		start_dp[idx].flags = flags;
> +	}

It looks good to me.

I just wonder whether it would be cleaner to put that in an inline
function:

static inline void virtqueue_store_flags_packed()

Same for the fetch.
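
To make the suggestion concrete, here is a rough sketch of what I have in mind, not a definitive implementation. It is written to compile outside DPDK, so several names are stand-ins: struct sketch_packed_desc only mirrors the descriptor fields, sketch_cio_wmb()/sketch_cio_rmb() are placeholders for rte_cio_wmb()/rte_cio_rmb(), and the fetch helper's name, signature and non-weak path are my assumptions since that hunk is not quoted above.

```c
#include <stdint.h>

/* Stand-in for the packed ring descriptor; only 'flags' matters here. */
struct sketch_packed_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t id;
	uint16_t flags;
};

/*
 * Placeholders for rte_cio_wmb()/rte_cio_rmb(). Assumption: a conservative
 * full fence is used so the sketch builds without DPDK headers; the real
 * macros are architecture specific.
 */
static inline void sketch_cio_wmb(void)
{
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
}

static inline void sketch_cio_rmb(void)
{
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
}

/*
 * Publish the descriptor flags. With weak barriers, a store-release orders
 * all earlier descriptor writes before the flags update; otherwise fall
 * back to an explicit write barrier followed by a plain store.
 */
static inline void
virtqueue_store_flags_packed(struct sketch_packed_desc *dp,
			     uint16_t flags, uint8_t weak_barriers)
{
	if (weak_barriers) {
		__atomic_store_n(&dp->flags, flags, __ATOMIC_RELEASE);
	} else {
		sketch_cio_wmb();
		dp->flags = flags;
	}
}

/*
 * Read the descriptor flags (hypothetical counterpart). With weak barriers,
 * a load-acquire orders the flags read before later descriptor reads;
 * otherwise do a plain load followed by a read barrier.
 */
static inline uint16_t
virtqueue_fetch_flags_packed(struct sketch_packed_desc *dp,
			     uint8_t weak_barriers)
{
	uint16_t flags;

	if (weak_barriers) {
		flags = __atomic_load_n(&dp->flags, __ATOMIC_ACQUIRE);
	} else {
		flags = dp->flags;
		sketch_cio_rmb();
	}

	return flags;
}
```

With helpers like these, the hunk above would reduce to a single call such as virtqueue_store_flags_packed(&start_dp[idx], flags, hw->weak_barriers), and the weak_barriers branching would live in one place.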