In the ticket lock's acquire path, waiting cores repeatedly poll the lock's
current-ticket variable in an open-coded loop. Replace this polling loop
with the rte_wait_until_equal API.
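For context, the wait-until-equal helper wraps the same "spin until my ticket
comes up" loop that this patch removes from the lock body, so architectures can
substitute a cheaper wait (e.g. WFE-based waiting on aarch64) without touching
the lock code. Below is a minimal sketch of what a generic fallback looks like;
it is illustrative only, not the exact DPDK implementation, and the helper name
and the extern declaration of rte_pause() are stand-ins so the snippet is
self-contained:

    #include <stdint.h>

    /* DPDK's CPU-relax hint (PAUSE/YIELD); normally provided by
     * <rte_pause.h>. Declared here only so the sketch stands alone. */
    extern void rte_pause(void);

    /* Generic wait-until-equal: poll *addr until it holds the expected
     * value, loading with the caller-supplied memory order. The ACQUIRE
     * order used by rte_ticketlock_lock() pairs with the RELEASE store in
     * rte_ticketlock_unlock(), keeping the critical section ordered. */
    static inline void
    wait_until_equal16_sketch(volatile uint16_t *addr, uint16_t expected,
            int memorder)
    {
        while (__atomic_load_n(addr, memorder) != expected)
            rte_pause();
    }

On targets that provide an architecture-specific implementation, the same call
can park the core instead of busy-polling, which is one reason a wrapped wait
primitive can outperform an open-coded rte_pause() loop.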
Running ticketlock_autotest on ThunderX2, with different numbers of cores
and depths of rings, 3%~8% performance gains were measured.

Signed-off-by: Gavin Hu <gavin...@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
Tested-by: Pavan Nikhilesh <pbhagavat...@marvell.com>
---
 lib/librte_eal/common/include/generic/rte_ticketlock.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/generic/rte_ticketlock.h b/lib/librte_eal/common/include/generic/rte_ticketlock.h
index d9bec87..f0821f2 100644
--- a/lib/librte_eal/common/include/generic/rte_ticketlock.h
+++ b/lib/librte_eal/common/include/generic/rte_ticketlock.h
@@ -66,8 +66,7 @@ static inline void
 rte_ticketlock_lock(rte_ticketlock_t *tl)
 {
 	uint16_t me = __atomic_fetch_add(&tl->s.next, 1, __ATOMIC_RELAXED);
-	while (__atomic_load_n(&tl->s.current, __ATOMIC_ACQUIRE) != me)
-		rte_pause();
+	rte_wait_until_equal16(&tl->s.current, me, __ATOMIC_ACQUIRE);
 }
 
 /**
-- 
2.7.4