> -----Original Message-----
> From: Jakub Grajciar -X (jgrajcia - PANTHEON TECHNOLOGIES at Cisco)
> <jgraj...@cisco.com>
> Sent: Tuesday, October 8, 2019 7:05 PM
> To: Phil Yang (Arm Technology China) <phil.y...@arm.com>; dev@dpdk.org
> Cc: tho...@monjalon.net; jer...@marvell.com; Honnappa Nagarahalli
> <honnappa.nagaraha...@arm.com>; Damjan Marion (damarion)
> <damar...@cisco.com>; nd <n...@arm.com>; Gavin Hu (Arm Technology
> China) <gavin...@arm.com>; nd <n...@arm.com>
> Subject: RE: [dpdk-dev] [PATCH v1] net/memif: optimized with one-way
> barrier
>
> > > -----Original Message-----
> > > From: dev <dev-boun...@dpdk.org> On Behalf Of Phil Yang
> > > Sent: Monday, August 26, 2019 7:00 PM
> > > To: jgraj...@cisco.com; dev@dpdk.org
> > > Cc: tho...@monjalon.net; jer...@marvell.com; Honnappa Nagarahalli
> > > <honnappa.nagaraha...@arm.com>; damar...@cisco.com; nd <n...@arm.com>
> > > Subject: [dpdk-dev] [PATCH v1] net/memif: optimized with one-way
> > > barrier
> > >
> > > Using 'rte_mb' to synchronize the shared ring head/tail between
> > > producer and consumer stalls the pipeline and hurts performance
> > > on weak memory model platforms such as aarch64. Meanwhile, updates
> > > to the shared ring head and tail are observable and ordered between
> > > CPUs on IA.
> > >
> > > Replacing this full barrier with a one-way barrier can improve
> > > throughput. On an aarch64 n1sdp server this patch boosts testpmd
> > > throughput by 2.1%. On Intel E5-2640, testpmd got a 3.98%
> > > performance gain.
> > >
> > > Signed-off-by: Phil Yang <phil.y...@arm.com>
> > > Reviewed-by: Gavin Hu <gavin...@arm.com>
> >
> > The patch is looking good, but 'MEMIF_VERSION_MAJOR' in memif.h needs
> > to be set to 3 as ring pointers are no longer volatile.

Updated in v2. Thanks for your comments.

Thanks,
Phil
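
For readers following the thread, the idea in the quoted commit message (one-way barriers on the ring head/tail instead of a full 'rte_mb') can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the patch: it uses standard C11 <stdatomic.h> rather than any DPDK API, and 'struct spsc_ring', 'RING_SIZE', 'ring_enqueue' and 'ring_dequeue' are hypothetical names that do not come from memif.h.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u               /* hypothetical size; must be a power of two */
#define RING_MASK (RING_SIZE - 1u)

/* Hypothetical single-producer/single-consumer ring, not memif's layout. */
struct spsc_ring {
	_Atomic uint16_t head;     /* written by the producer only */
	_Atomic uint16_t tail;     /* written by the consumer only */
	uint32_t slots[RING_SIZE];
};

/* Producer side: returns false when the ring is full. */
static bool ring_enqueue(struct spsc_ring *r, uint32_t value)
{
	/* Only the producer writes head, so a relaxed load is enough. */
	uint16_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	/* Acquire pairs with the consumer's release store of tail, so the
	 * slot is known to be consumed before it is overwritten. */
	uint16_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if ((uint16_t)(head - tail) == RING_SIZE)
		return false;

	r->slots[head & RING_MASK] = value;
	/* Release: the slot write above becomes visible before the new
	 * head does. No full barrier is needed here. */
	atomic_store_explicit(&r->head, (uint16_t)(head + 1),
			      memory_order_release);
	return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool ring_dequeue(struct spsc_ring *r, uint32_t *value)
{
	/* Only the consumer writes tail, so a relaxed load is enough. */
	uint16_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	/* Acquire pairs with the producer's release store of head, so the
	 * slot contents are visible before they are read. */
	uint16_t head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (head == tail)
		return false;

	*value = r->slots[tail & RING_MASK];
	/* Release: the slot read above completes before the slot is
	 * handed back to the producer. */
	atomic_store_explicit(&r->tail, (uint16_t)(tail + 1),
			      memory_order_release);
	return true;
}
```

Each index has exactly one writer, so the acquire/release pairs are sufficient to order the slot accesses against the index updates. On aarch64 these typically compile to load-acquire/store-release instructions rather than a full barrier, which is the kind of saving the commit message describes.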