v5: Minor format change.
v4: Change _try_ functions to use __atomic too, for consistency
    (suggested by Ananyev, Konstantin).
v3: Fix headline format error.
v2: Rebase and modify the rwlock test case to address the comments in v1.
v1: Reimplement rwlock with atomic builtins, and add a rwlock perf test on
    all available cores to benchmark the improvement.

We tested the patches on three arm64 platforms. ThunderX2 gained 20%
performance, Qualcomm gained 36% and the 4-Cortex-A72 Marvell MACCHIATObin
gained 19.6%.

Below is the detailed test result on ThunderX2:

*** rwlock_autotest without atomic builtins ***
Rwlock Perf Test on 128 cores...
Core [0] count = 281
Core [1] count = 252
Core [2] count = 290
Core [3] count = 259
Core [4] count = 287
...
Core [209] count = 3
Core [210] count = 31
Core [211] count = 120
Total count = 18537

*** rwlock_autotest with atomic builtins ***
Rwlock Perf Test on 128 cores...
Core [0] count = 346
Core [1] count = 355
Core [2] count = 259
Core [3] count = 285
Core [4] count = 320
...
Core [209] count = 2
Core [210] count = 23
Core [211] count = 63
Total count = 22194

Gavin Hu (1):
  rwlock: reimplement with atomic builtins

Joyce Kong (2):
  test/rwlock: add perf test case on all available cores
  test/rwlock: amortize the cost of getting time

 app/test/test_rwlock.c                             | 77 ++++++++++++++++++++++
 lib/librte_eal/common/include/generic/rte_rwlock.h | 29 ++++----
 2 files changed, 92 insertions(+), 14 deletions(-)

-- 
2.7.4
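
For readers unfamiliar with the approach, here is a minimal sketch of a reader-writer lock built on the GCC/Clang __atomic builtins. It is not the code from this series: every name prefixed with sketch_ is hypothetical, and the counter convention (0 = free, positive = number of readers, -1 = writer held) and the acquire/release orderings are assumptions chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical lock type: 0 = free, >0 = reader count, -1 = writer held. */
typedef struct {
	int32_t cnt;
} sketch_rwlock_t;

/* Take a read lock: spin until no writer holds the lock, then bump the count. */
static inline void sketch_read_lock(sketch_rwlock_t *rwl)
{
	int32_t x;
	int success = 0;

	while (success == 0) {
		x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED);
		if (x < 0)
			continue; /* writer active, retry */
		/* Weak CAS is fine here because we loop on failure anyway. */
		success = __atomic_compare_exchange_n(&rwl->cnt, &x, x + 1, 1,
				__ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
	}
}

/* Release a read lock: drop the reader count with release semantics. */
static inline void sketch_read_unlock(sketch_rwlock_t *rwl)
{
	__atomic_fetch_sub(&rwl->cnt, 1, __ATOMIC_RELEASE);
}

/* Try to take the write lock once; returns 0 on success, -1 if busy. */
static inline int sketch_write_trylock(sketch_rwlock_t *rwl)
{
	int32_t x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED);

	/* Strong CAS: a single attempt must not fail spuriously. */
	if (x != 0 || !__atomic_compare_exchange_n(&rwl->cnt, &x, -1, 0,
			__ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
		return -1;
	return 0;
}

/* Take the write lock: spin on the trylock until it succeeds. */
static inline void sketch_write_lock(sketch_rwlock_t *rwl)
{
	while (sketch_write_trylock(rwl) != 0)
		;
}

/* Release the write lock: publish the protected writes, mark the lock free. */
static inline void sketch_write_unlock(sketch_rwlock_t *rwl)
{
	__atomic_store_n(&rwl->cnt, 0, __ATOMIC_RELEASE);
}
```

The idea behind this style is that the lock and unlock paths request only acquire or release ordering, not a full barrier in each operation. Whether that accounts for the gains reported above depends on the platform and on the actual implementation in the patch.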