As the packet length extraction code was simplified, the ordering is no
longer necessary, so the compiler barrier is removed. [1]
A 2% performance gain was measured on Marvell ThunderX2.
A 4.3% performance gain was measured on Ampere eMAG80.

[1] http://mails.dpdk.org/archives/dev/2016-April/037529.html

Fixes: ae0eb310f253 ("net/i40e: implement vector PMD for ARM")
Cc: sta...@dpdk.org

Signed-off-by: Gavin Hu <gavin...@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com>
Reviewed-by: Steve Capper <steve.cap...@arm.com>
---
 drivers/net/i40e/i40e_rxtx_vec_neon.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c b/drivers/net/i40e/i40e_rxtx_vec_neon.c
index 5555e9b..864eb9a 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
@@ -307,9 +307,6 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
 			rte_mbuf_prefetch_part2(rx_pkts[pos + 3]);
 		}
 
-		/* avoid compiler reorder optimization */
-		rte_compiler_barrier();
-
 		/* pkt 3,4 shift the pktlen field to be 16-bit aligned*/
 		uint32x4_t len3 = vshlq_u32(vreinterpretq_u32_u64(descs[3]),
 					    len_shl);
-- 
2.7.4