This patch set replaces the two-way barriers with C11 one-way barriers for packed vring flags, when the frontend and backend are implemented in software.
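
To illustrate the idea, here is a minimal sketch of one-way (acquire/release) accesses on a packed descriptor's flags field. It is not the code from the patches: the struct and the demo_* helper names are hypothetical, the weak_barriers parameter is assumed to mean "both ends are software", and the full fence in the other branch is only a stand-in for the DMA barriers (rte_cio_*), which are not reproduced here. It relies on the GCC/Clang __atomic builtins.

```c
#include <stdint.h>

/* Hypothetical, simplified packed descriptor (illustration only). */
struct demo_packed_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t id;
	uint16_t flags;
};

/*
 * Publish the descriptor by writing its flags last.
 * weak_barriers != 0: a store-release keeps earlier writes to the
 * descriptor body from being reordered after the flags store.
 * weak_barriers == 0: a full fence stands in for a DMA write barrier
 * (an assumption made for this sketch).
 */
static inline void
demo_store_flags(struct demo_packed_desc *dp, uint16_t flags,
		 uint8_t weak_barriers)
{
	if (weak_barriers) {
		__atomic_store_n(&dp->flags, flags, __ATOMIC_RELEASE);
	} else {
		__atomic_thread_fence(__ATOMIC_SEQ_CST);
		dp->flags = flags;
	}
}

/*
 * Read the flags before touching the rest of the descriptor.
 * weak_barriers != 0: a load-acquire keeps later reads of the
 * descriptor body from being reordered before the flags load.
 * weak_barriers == 0: a full fence stands in for a DMA read barrier
 * (an assumption made for this sketch).
 */
static inline uint16_t
demo_fetch_flags(struct demo_packed_desc *dp, uint8_t weak_barriers)
{
	uint16_t flags;

	if (weak_barriers) {
		flags = __atomic_load_n(&dp->flags, __ATOMIC_ACQUIRE);
	} else {
		flags = dp->flags;
		__atomic_thread_fence(__ATOMIC_SEQ_CST);
	}
	return flags;
}
```

A producer would fill addr, len and id and then call demo_store_flags(); a consumer would call demo_fetch_flags() and only read the other fields once the flags show the descriptor as available or used. Each side orders memory accesses in one direction only, which is what makes the barrier cheaper than a two-way barrier on weakly ordered CPUs.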

In vhost-user + virtio-user benchmarking, a 9% performance gain in the
RFC2544 test was measured on the ThunderX2 platform [1]. In VM2VM
benchmarking, an 11% performance gain was measured on the Ampere platform.

[1] https://doc.dpdk.org/dts/test_plans/pvp_multi_paths_performance_test_plan.html
    PVP test with virtio 1.1 mergeable path

v4:
  Use rte_smp_rmb/wmb instead of __atomic_load/store_n on x86, as it
  reports better performance (~1.5%), which comes from the branch saved
  by the compiler. The if and else branches are identical, with the smp
  and cio barriers both defined as compiler barriers on x86.
  http://inbox.dpdk.org/dev/e0cba5a1980f1f408e1f28f9991b5b1d50eff...@shsmsx104.ccr.corp.intel.com/

v3:
  Wrap C11 one-way barriers and DMA barriers (rte_cio_*) together with
  an inline function.

v2:
  Convert RFC to patch.

Joyce Kong (2):
  virtio: one way barrier for packed vring desc avail flags
  virtio: one way barrier for packed vring desc used flags

 drivers/net/virtio/virtio_rxtx.c                 | 25 +++++++-----
 drivers/net/virtio/virtio_user/virtio_user_dev.c | 10 +++--
 drivers/net/virtio/virtqueue.h                   | 49 +++++++++++++++++++++++-
 lib/librte_vhost/vhost.h                         |  2 +-
 lib/librte_vhost/virtio_net.c                    | 16 ++++----
 5 files changed, 79 insertions(+), 23 deletions(-)

--
2.7.4