Regards, /Ruifeng
> -----Original Message-----
> From: Jerin Jacob Kollanukkaran <jer...@marvell.com>
> Sent: March 11, 2019 22:17
> To: Ruifeng Wang (Arm Technology China) <ruifeng.w...@arm.com>;
> jingjing...@intel.com; bernard.iremon...@intel.com;
> wenzhuo...@intel.com
> Cc: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; nd
> <n...@arm.com>; hemant.agra...@nxp.com; dev@dpdk.org
> Subject: Re: [PATCH v1] app/testpmd: optimized MAC swap by using neon
> intrinsics
>
> On Mon, 2019-03-11 at 16:14 +0800, Ruifeng Wang wrote:
> > -------------------------------------------------------------------
> > ---
> > Improved MAC swap performance for ARM platform.
> > The improvement was achieved by using neon intrinsics to save CPU
> > cycles and doing swap for four packets at a time.
> > The optimization had 15% - 20% throughput boost in testpmd MAC swap
> > mode.
> >
> > Signed-off-by: Ruifeng Wang <ruifeng.w...@arm.com>
> > Reviewed-by: Gavin Hu <gavin...@arm.com>
> > Reviewed-by: Phil Yang <phil.y...@arm.com>
> > ---
> >  app/test-pmd/macswap.c      |  4 +-
> >  app/test-pmd/macswap_neon.h | 93
> > +++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 96 insertions(+), 1 deletion(-)
> >  create mode 100644 app/test-pmd/macswap_neon.h
> >
> > diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
> >
> > +static inline void
> > +do_macswap(struct rte_mbuf *pkts[], uint16_t nb,
> > +           struct rte_port *txp)
> > +{
> > +    struct ether_hdr *eth_hdr[4];
> > +    struct rte_mbuf *mb[4];
> > +    uint64_t ol_flags;
> > +    int i;
> > +    int r;
> > +    uint8x16_t v0, v1, v2, v3;
> > +    /**
> > +     * Index map be used to shuffle the 16 bytes.
> > +     * byte 0-5 will be swapped with byte 6-11.
> > +     * byte 12-15 will keep unchanged.
> > +     */
> > +    uint8x16_t idx_map = {6, 7, 8, 9, 10, 11, 0, 1, 2, 3, 4, 5,
> > +                          12, 13, 14, 15};
>
> Nit: I think, we can make it as "const uint8x16_t idx_map".
>
> Other than that it looks good to me.
> Regarding the performance, I have tested with two SoCs.
>
> octeontx: +13% improvement
> octeontx2: +46% improvement
>
>
> Acked-by: Jerin Jacob <jer...@marvell.com>
>

Thanks Jerin for your test and data. The code change will be included in v2.
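
For anyone reading the thread without the full patch, the effect of the idx_map quoted above can be checked with a scalar stand-in. This is only a sketch in portable C, not the code from macswap_neon.h: it assumes the map is consumed by a 16-byte table lookup such as vqtbl1q_u8 (the quoted hunk stops before that point), and the names shuffle16 and macswap_first16 are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Scalar model of a 16-byte table lookup: out[i] = in[idx[i]].
 * Indices outside 0..15 produce 0, which is how the AArch64 TBL
 * instruction behind vqtbl1q_u8 treats out-of-range indices.
 */
static void
shuffle16(uint8_t out[16], const uint8_t in[16], const uint8_t idx[16])
{
	for (int i = 0; i < 16; i++)
		out[i] = idx[i] < 16 ? in[idx[i]] : 0;
}

/*
 * Hypothetical helper (not from the patch): swap the destination and
 * source MAC addresses held in the first 16 bytes of an Ethernet frame.
 * Bytes 12-15 (EtherType and the first two payload bytes) are copied
 * back unchanged.
 */
static void
macswap_first16(uint8_t *frame)
{
	/* Same values as the idx_map in the quoted patch, declared const. */
	static const uint8_t idx_map[16] = {6, 7, 8, 9, 10, 11, 0, 1, 2, 3, 4, 5,
					    12, 13, 14, 15};
	uint8_t tmp[16];

	assert(frame != NULL);
	shuffle16(tmp, frame, idx_map);
	memcpy(frame, tmp, sizeof(tmp));
}
```

Calling macswap_first16() on a buffer that starts with a destination MAC followed by a source MAC leaves the two addresses exchanged and the next four bytes as they were. Because the map never changes, it can be declared const, which is the nit raised in the review.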