On Fri, Sep 25, 2020 at 7:44 PM Steven Lariau <steven.lar...@arm.com> wrote:
>
> One implementation of the DPDK stack library is lockfree,
> based on C11 memory model for atomics.
> Some of these atomic operations use stronger memory orders than
> necessary and can be relaxed.
> This patch series relaxes some of these operations in order to improve
> the performance of the stack library.
>
> The patch was tested on several architectures, to ensure that
> the implementation is correct, and to measure performance.
> Below are the results for a few architectures on the multithreaded
> lock-free stack test.
> The cycles count is the average number of cycles per item to perform
> a bulk push / pop.
>
> $sudo ./builddir/app/dpdk-test
> RTE>>stack_lf_perf_autotest
>                               difference compared to main
> Cycles count on ThunderX2
>  2 cores, bulk size =  8:           -15.85%
>  2 cores, bulk size = 32:           -04.56%
>  4 cores, bulk size =  8:           -05.00%
>  4 cores, bulk size = 32:           -04.35%
> 16 cores, bulk size =  8:           -02.38%
> 16 cores, bulk size = 32:           -01.88%
>
>                               difference compared to main
> Cycles count on N1SDP
>  2 cores, bulk size =  8:           +00.77%
>  2 cores, bulk size = 32:           -16.00%
>
>                               difference compared to main
> Cycles count on Skylake
>  2 cores, bulk size =  8:           -00.18%
>  2 cores, bulk size = 32:           -00.95%
>  4 cores, bulk size =  8:           -01.19%
>  4 cores, bulk size = 32:           +00.64%
> 16 cores, bulk size =  8:           +01.20%
> 16 cores, bulk size = 32:           +00.48%
>
> v2: add comment to explain why pop head CAS relaxed is valid
>     added Fixes information
>
> Steven Lariau (5):
>   lib/stack: fix inconsistent weak / strong cas
>   lib/stack: remove push acquire fence
>   lib/stack: remove redundant orderings for list->len
>   lib/stack: reload head when pop fails
>   lib/stack: remove pop cas release ordering
>
>  lib/librte_stack/rte_stack_lf_c11.h | 32 +++++++++++++++++++----------
>  1 file changed, 21 insertions(+), 11 deletions(-)

Series applied, thanks for those optimisations.
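
For readers who have not worked with C11 memory orders, here is a minimal
sketch of the kind of relaxation the cover letter describes. It is not the
code from rte_stack_lf_c11.h: the names struct node, struct lf_list,
lf_push and lf_pop_all are hypothetical, and the sketch drains the whole
list with a single exchange instead of popping items one by one, which
sidesteps the ABA handling a real lock-free pop needs. It only shows a
weak CAS in a retry loop with a relaxed failure order, and a length
counter updated with relaxed ordering.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the DPDK definitions. */
struct node {
	struct node *next;
	int value;
};

struct lf_list {
	_Atomic(struct node *) head;
	atomic_long len; /* statistics hint only, may be briefly inaccurate */
};

static void
lf_push(struct lf_list *l, struct node *n)
{
	struct node *old;

	assert(n != NULL);

	/* A relaxed load is enough: the CAS below validates the value. */
	old = atomic_load_explicit(&l->head, memory_order_relaxed);
	do {
		n->next = old;
		/*
		 * Weak CAS may fail spuriously, which is fine inside a loop.
		 * On failure it stores the current head into 'old', so no
		 * explicit reload is needed. Release on success publishes
		 * n->next and the node contents; failure publishes nothing,
		 * so its order is relaxed.
		 */
	} while (!atomic_compare_exchange_weak_explicit(&l->head, &old, n,
			memory_order_release, memory_order_relaxed));

	/* No other data is published through len, so relaxed suffices. */
	atomic_fetch_add_explicit(&l->len, 1, memory_order_relaxed);
}

static struct node *
lf_pop_all(struct lf_list *l)
{
	struct node *first, *it;
	long n = 0;

	/* Acquire pairs with the release CAS in lf_push(). */
	first = atomic_exchange_explicit(&l->head, NULL,
			memory_order_acquire);

	for (it = first; it != NULL; it = it->next)
		n++;
	atomic_fetch_sub_explicit(&l->len, n, memory_order_relaxed);

	return first;
}
```

Whether a given ordering can be relaxed depends on which accesses have to
be visible to other threads at that point, so each change in a series like
this one needs its own justification.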


-- 
David Marchand
