The rte_atomic ops and rte_smp barriers expand to full DMB barriers on aarch64. Using C11 atomics with explicit memory ordering instead of the rte_atomic ops and rte_smp barriers for inter-thread synchronization can improve performance on aarch64, with no performance loss on x86.
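For illustration only, a minimal sketch of the kind of change involved (the names data, guard, publish_legacy, publish_c11 and consume_c11 are made up for this example and do not appear in the patches):

#include <stdint.h>
#include <rte_atomic.h>
#include <rte_pause.h>

/* Hypothetical producer/consumer pair, for illustration only. */
static uint32_t data;
static uint32_t guard;

/* Old style: rte_smp_wmb() expands to a full DMB (ISHST) on aarch64. */
static void
publish_legacy(uint32_t v)
{
	data = v;
	rte_smp_wmb();
	guard = 1;
}

/* New style: one-way barriers via C11/GCC built-ins; on aarch64 these
 * compile to store-release/load-acquire instead of full DMB barriers. */
static void
publish_c11(uint32_t v)
{
	data = v;
	__atomic_store_n(&guard, 1, __ATOMIC_RELEASE);
}

static uint32_t
consume_c11(void)
{
	while (__atomic_load_n(&guard, __ATOMIC_ACQUIRE) == 0)
		rte_pause();
	return data;
}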
This patchset contains:
1) fix the race condition for MT unsafe services.
2) clean up redundant code.
3) use C11 atomics in the service core library to avoid unnecessary barriers.

v2:
Still waiting on Harry for the final solution on the MT unsafe race
condition issue, but I have incorporated the comments so far.
1. add 'Fixes' tag for bug-fix patches.
2. remove 'Fixes' tag for code cleanup patches.
3. remove the unused parameter of the service_dump_one function.
4. replace the execute_lock atomic CAS operation with a spinlock try-lock
   (see the sketch at the end of this letter).
5. use C11 atomics with RELAXED memory ordering for num_mapped_cores.
6. relax barriers for the guard variables runstate, comp_runstate and
   app_runstate with C11 one-way barriers.

v3:
Sending this version since Phil is on holiday.
1. update the API documentation to indicate how the locking can be avoided.

v4:
1. fix the nits in the 2/6 commit message and in code comments.

Honnappa Nagarahalli (2):
  service: fix race condition for MT unsafe service
  service: fix identification of service running on other lcore

Phil Yang (4):
  service: remove rte prefix from static functions
  service: remove redundant code
  service: optimize with c11 atomics
  service: relax barriers with C11 atomics

 lib/librte_eal/common/rte_service.c   | 234 ++++++++++--------
 lib/librte_eal/include/rte_service.h  |   8 +-
 .../include/rte_service_component.h   |   6 +-
 lib/librte_eal/meson.build            |   4 +
 4 files changed, 141 insertions(+), 111 deletions(-)

--
2.17.1
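Rough illustration of the execute_lock change mentioned in the v2 notes above. The struct and function names below (service_state, service_run_once) are hypothetical and do not match the actual code in rte_service.c:

#include <errno.h>
#include <rte_spinlock.h>

/* Hypothetical per-service state, for illustration only. */
struct service_state {
	/* Replaces the old hand-rolled CAS-guarded flag; must have been
	 * initialized with rte_spinlock_init(). */
	rte_spinlock_t execute_lock;
};

static int
service_run_once(struct service_state *s)
{
	/* Try-lock gives the same mutual exclusion as the previous
	 * atomic CAS on a 32-bit flag, but is clearer and lets the
	 * spinlock implementation pick the right memory ordering. */
	if (!rte_spinlock_trylock(&s->execute_lock))
		return -EBUSY; /* another lcore is running the service */

	/* ... run the service callback here ... */

	rte_spinlock_unlock(&s->execute_lock);
	return 0;
}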