From: Hiroshi Shimamoto <h-shimam...@ct.jp.nec.com>

x86 preserves store ordering for standard (non-streaming) store operations.
A full memory barrier is expensive in the main packet processing loop,
and on x86 a compiler barrier is enough to keep the status update after
the packet data accesses. Replacing rte_mb() with rte_compiler_barrier()
improves xmit/recv packet performance.

We can see the performance improvement with memnic-tester.
Using Xeon E5-2697 v2 @ 2.70GHz, 4 vCPU.

 size |  before  |  after
   64 | 4.18Mpps | 4.59Mpps
  128 | 3.85Mpps | 4.87Mpps
  256 | 4.01Mpps | 4.72Mpps
  512 | 3.52Mpps | 4.41Mpps
 1024 | 3.18Mpps | 3.64Mpps
 1280 | 2.86Mpps | 3.15Mpps
 1518 | 2.59Mpps | 2.87Mpps

Note: we have to take care if we ever use non-temporal (streaming)
stores, because they are weakly ordered and a compiler barrier alone
does not order them.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto at ct.jp.nec.com>
Reviewed-by: Hayato Momma <h-momma at ce.jp.nec.com>
---
 pmd/pmd_memnic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/pmd/pmd_memnic.c b/pmd/pmd_memnic.c
index 872f3c4..0783440 100644
--- a/pmd/pmd_memnic.c
+++ b/pmd/pmd_memnic.c
@@ -316,7 +316,7 @@ static uint16_t memnic_recv_pkts(void *rx_queue,
 		bytes += p->len;
 
 drop:
-		rte_mb();
+		rte_compiler_barrier();
 		p->status = MEMNIC_PKT_ST_FREE;
 
 		if (++idx >= MEMNIC_NR_PACKET)
@@ -403,7 +403,7 @@ retry:
 		pkts++;
 		bytes += pkt_len;
 
-		rte_mb();
+		rte_compiler_barrier();
 		p->status = MEMNIC_PKT_ST_FILLED;
 
 		rte_pktmbuf_free(tx_pkts[nr]);
-- 
1.8.3.1
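
For readers who want to see the "write the data, then flip the status" handoff in isolation, here is a minimal sketch. It is not the memnic code: `struct slot`, `slot_publish()`, `slot_consume()`, the `SLOT_*` constants and the payload size are all hypothetical names made up for illustration, and it uses C11 release/acquire atomics instead of DPDK's rte_compiler_barrier(). On x86 a release store and an acquire load compile to ordinary moves with no fence instruction, which is the same property the patch relies on, while the C11 form stays correct on weakly ordered CPUs as well.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status values and payload capacity, for illustration only. */
enum { SLOT_FREE = 0, SLOT_FILLED = 1 };
#define SLOT_DATA_MAX 64

struct slot {
	atomic_int status;                  /* handoff flag between the two sides */
	size_t len;                         /* number of valid bytes in data[] */
	unsigned char data[SLOT_DATA_MAX];
};

/*
 * Producer side: fill the payload first, then publish it by storing
 * SLOT_FILLED with release ordering, so the payload writes cannot be
 * moved after the status store.
 * Returns 1 if the payload was published, 0 if the slot was still busy.
 */
static int slot_publish(struct slot *s, const void *buf, size_t len)
{
	assert(len <= SLOT_DATA_MAX);

	if (atomic_load_explicit(&s->status, memory_order_acquire) != SLOT_FREE)
		return 0;

	memcpy(s->data, buf, len);
	s->len = len;
	atomic_store_explicit(&s->status, SLOT_FILLED, memory_order_release);
	return 1;
}

/*
 * Consumer side: read the payload only after observing SLOT_FILLED, and
 * hand the slot back with a release store once the copy is finished.
 * Returns the number of bytes copied (at most cap), or -1 if the slot is empty.
 */
static long slot_consume(struct slot *s, void *out, size_t cap)
{
	size_t len;

	if (atomic_load_explicit(&s->status, memory_order_acquire) != SLOT_FILLED)
		return -1;

	len = s->len < cap ? s->len : cap;
	memcpy(out, s->data, len);
	atomic_store_explicit(&s->status, SLOT_FREE, memory_order_release);
	return (long)len;
}
```

In a real setup one thread (or the host side) would call slot_publish() and another (the guest side) would poll slot_consume(); the single-slot layout here is only meant to show where the ordering point sits relative to the data accesses.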