Adding the other platform maintainers, since this change affects all platforms.

> -----Original Message-----
> From: Gavin Hu (Arm Technology China) <gavin...@arm.com>
> Sent: Thursday, December 13, 2018 7:30 PM
> To: Stephen Hemminger <step...@networkplumber.org>; Joyce Kong (Arm
> Technology China) <joyce.k...@arm.com>
> Cc: dev@dpdk.org; nd <n...@arm.com>; tho...@monjalon.net;
> jerin.ja...@caviumnetworks.com; hemant.agra...@nxp.com; Honnappa
> Nagarahalli <honnappa.nagaraha...@arm.com>
> Subject: RE: [dpdk-dev] [PATCH v1 0/2] reimplement rwlock and add relevant
> perf test case
>
> Hi Stephen,
>
> Thanks for your comment and sharing the link!
> We are looking into it and it may take more time for performance profiling.
>
> Best Regards,
> Gavin
>
> > -----Original Message-----
> > From: Stephen Hemminger <step...@networkplumber.org>
> > Sent: Thursday, December 13, 2018 1:27 PM
> > To: Joyce Kong (Arm Technology China) <joyce.k...@arm.com>
> > Cc: dev@dpdk.org; nd <n...@arm.com>; tho...@monjalon.net;
> > jerin.ja...@caviumnetworks.com; hemant.agra...@nxp.com; Honnappa
> > Nagarahalli <honnappa.nagaraha...@arm.com>; Gavin Hu (Arm Technology
> > China) <gavin...@arm.com>
> > Subject: Re: [dpdk-dev] [PATCH v1 0/2] reimplement rwlock and add
> > relevant perf test case
> >
> > On Thu, 13 Dec 2018 11:37:43 +0800
> > Joyce Kong <joyce.k...@arm.com> wrote:
> >
> > > v1: reimplement rwlock with __atomic builtins, and add a rwlock perf test
> > > on all available cores to benchmark the improvement.
> > >
> > > We tested the patches on three arm64 platforms, ThunderX2 gained 20%
> > > performance, Qualcomm gained 36% and the 4-Cortex-A72 Marvell
> > > MACCHIATObin gained 19.6%.
> > > Below is the detailed test result on ThunderX2:
> > >
> > > *** rwlock_autotest without __atomic builtins ***
> > > Rwlock Perf Test on 128 cores...
> > > Core [0] count = 281
> > > Core [1] count = 252
> > > Core [2] count = 290
> > > Core [3] count = 259
> > > Core [4] count = 287
> > > ...
> > > Core [209] count = 3
> > > Core [210] count = 31
> > > Core [211] count = 120
> > > Total count = 18537
> > >
> > > *** rwlock_autotest with __atomic builtins ***
> > > Rwlock Perf Test on 128 cores...
> > > Core [0] count = 346
> > > Core [1] count = 355
> > > Core [2] count = 259
> > > Core [3] count = 285
> > > Core [4] count = 320
> > > ...
> > > Core [209] count = 2
> > > Core [210] count = 23
> > > Core [211] count = 63
> > > Total count = 22194
> > >
> > > Gavin Hu (1):
> > >   rwlock: reimplement with __atomic builtins
> > >
> > > Joyce Kong (1):
> > >   test/rwlock: add perf test case
> > >
> > >  lib/librte_eal/common/include/generic/rte_rwlock.h | 16 ++---
> > >  test/test/test_rwlock.c                            | 71 ++++++++++++++++++++++
> > >  2 files changed, 79 insertions(+), 8 deletions(-)
> > >
> >
> > Did you consider using a better algorithm, not just better primitives?
> > See https://locklessinc.com/articles/locks/ for a more complete
> > discussion of alternatives like ticket locks.
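
For anyone joining the thread, here is a rough sketch of the ticket-lock idea Stephen refers to, written with the same GCC __atomic builtins the patch uses. It is not the rwlock from the patch and not DPDK code: it is a plain mutual-exclusion lock (no reader/writer distinction), and the names ticket_lock_t, TICKET_LOCK_INIT, ticket_lock(), ticket_trylock() and ticket_unlock() are made up for illustration. The memory orderings are one reasonable choice, not a tuned one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type for illustration only; not a DPDK API. */
typedef struct {
	uint16_t next;    /* next ticket to hand out */
	uint16_t serving; /* ticket currently allowed to hold the lock */
} ticket_lock_t;

#define TICKET_LOCK_INIT { 0, 0 }

/* Take a ticket, then spin until it is being served (FIFO order). */
static inline void
ticket_lock(ticket_lock_t *tl)
{
	uint16_t me = __atomic_fetch_add(&tl->next, 1, __ATOMIC_RELAXED);

	/* The acquire load pairs with the release store in ticket_unlock(). */
	while (__atomic_load_n(&tl->serving, __ATOMIC_ACQUIRE) != me)
		; /* busy-wait; a real implementation would add a pause hint */
}

/* Return 1 if the lock was taken without waiting, 0 otherwise. */
static inline int
ticket_trylock(ticket_lock_t *tl)
{
	uint16_t s = __atomic_load_n(&tl->serving, __ATOMIC_ACQUIRE);
	uint16_t expected = s;

	/* Succeeds only when next == serving, i.e. no holder and no waiters. */
	return __atomic_compare_exchange_n(&tl->next, &expected,
			(uint16_t)(s + 1), 0 /* strong */,
			__ATOMIC_RELAXED, __ATOMIC_RELAXED) ? 1 : 0;
}

/* Hand the lock to the next ticket holder. */
static inline void
ticket_unlock(ticket_lock_t *tl)
{
	uint16_t s = __atomic_load_n(&tl->serving, __ATOMIC_RELAXED);

	/* Unlocking a lock that nobody holds is a caller bug. */
	assert(s != __atomic_load_n(&tl->next, __ATOMIC_RELAXED));

	/* Only the holder writes 'serving', so a plain increment is enough. */
	__atomic_store_n(&tl->serving, (uint16_t)(s + 1), __ATOMIC_RELEASE);
}
```

The point of the design is fairness: waiters are served strictly in arrival order. That could even out per-core counts in a test like the one quoted above, though whether it would help here can only be settled by profiling.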