On 8/21/2024 3:38 PM, Vipin Varghese wrote:
> The goal of this patch series is to improve SSE macswap on x86_64 by
> reducing stalls in the CPU backend. The original SSE macswap
> implementation issues multiple loads, shuffles and stores per loop iteration.
> 
> By interleaving the SIMD instructions, using register variables and
> reducing L1 & L2 cache evictions, we can reduce the stalls caused by
>  - SSE load token exhaustion
>  - shuffle and load dependency
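
For readers following along, the interleaving idea described above can be sketched roughly as below. This is a minimal illustration, not the code from the patches: the function names (macswap_scalar, macswap_sse_x4), the batch size of four and the target attribute are assumptions made for the example. It assumes GCC or clang on x86_64 with SSSE3, and that each header buffer has at least 16 readable and writable bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <tmmintrin.h> /* SSSE3: _mm_shuffle_epi8 */

/* Scalar reference: swap the 6-byte destination and source MAC addresses. */
static void macswap_scalar(uint8_t *hdr)
{
    uint8_t tmp[6];

    memcpy(tmp, hdr, 6);
    memcpy(hdr, hdr + 6, 6);
    memcpy(hdr + 6, tmp, 6);
}

/*
 * Hypothetical 4-packet variant: all loads are issued first, then all
 * shuffles, then all stores, so that no shuffle has to wait directly
 * on the load that immediately precedes it.
 */
static void __attribute__((target("ssse3")))
macswap_sse_x4(uint8_t *hdr[4])
{
    /* Output byte i takes input byte mask[i]:
     * bytes 0-5 <- 6-11, bytes 6-11 <- 0-5, bytes 12-15 unchanged. */
    const __m128i mask = _mm_setr_epi8(6, 7, 8, 9, 10, 11,
                                       0, 1, 2, 3, 4, 5,
                                       12, 13, 14, 15);

    /* Loads: four independent 16-byte unaligned loads. */
    __m128i a0 = _mm_loadu_si128((const __m128i *)hdr[0]);
    __m128i a1 = _mm_loadu_si128((const __m128i *)hdr[1]);
    __m128i a2 = _mm_loadu_si128((const __m128i *)hdr[2]);
    __m128i a3 = _mm_loadu_si128((const __m128i *)hdr[3]);

    /* Shuffles: swap the two MAC addresses inside each register. */
    a0 = _mm_shuffle_epi8(a0, mask);
    a1 = _mm_shuffle_epi8(a1, mask);
    a2 = _mm_shuffle_epi8(a2, mask);
    a3 = _mm_shuffle_epi8(a3, mask);

    /* Stores: write the swapped headers back. */
    _mm_storeu_si128((__m128i *)hdr[0], a0);
    _mm_storeu_si128((__m128i *)hdr[1], a1);
    _mm_storeu_si128((__m128i *)hdr[2], a2);
    _mm_storeu_si128((__m128i *)hdr[3], a3);
}
```

Whether grouping the instructions this way helps on a given core depends on the microarchitecture and on what the compiler already schedules, so it needs to be measured, as the numbers below do.
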
> 
> Build test using meson script:
> ``````````````````````````````
> build-gcc-static
> buildtools
> build-gcc-shared
> build-mini
> build-clang-static
> build-clang-shared
> build-x86-generic
> 
> Test Results:
> `````````````
> 
> Platform-1: AMD EPYC SIENA 8594P @2.3GHz, no boost
> Platform-2: AMD EPYC 9554 @3.1GHz, no boost
> 
> NIC:
>  1) Mellanox CX-7 1*200Gbps
>  2) Intel E810 1*100Gbps
>  3) Intel E810 2*200Gbps (2CQ-DA2) - loopback
>  4) Broadcom P2100 2*100Gbps - loopback
> 
> ------------------------------------------------
> TEST IO 64B: baseline <NIC : Mpps>
>  - NIC-1: 42.0
>  - NIC-2: 82.0
>  - NIC-3: 82.45
>  - NIC-4: 47.03
> ------------------------------------------------
> TEST MACSWAP 64B: <NIC : Before : After>
>  - NIC-1: 31.533 : 31.90
>  - NIC-2: 48.0   : 48.9 
>  - NIC-3: 48.840 : 49.827
>  - NIC-4: 44.3   : 45.5
> ------------------------------------------------
> TEST MACSWAP 128B: <NIC : Before : After>
>  - NIC-1: 30.946 : 31.770
>  - NIC-2: 47.4   : 48.3
>  - NIC-3: 47.979 : 48.503
>  - NIC-4: 41.53  : 44.59
> ------------------------------------------------
> TEST MACSWAP 256B: <NIC : Before : After>
>  - NIC-1: 32.480 : 33.150
>  - NIC-2: 45.29  : 45.571
>  - NIC-3: 45.033 : 45.117
>  - NIC-4: 36.49  : 37.5
> ------------------------------------------------
> 
> 
> ------------------------------------------------
> TEST IO 64B: baseline <NIC : Mpps>
>  - intel E810 2*200Gbps (2CQ-DA2): 82.49
> ------------------------------------------------
> <NIC intel E810 2*200Gbps (2CQ-DA2): Before : After>
> TEST MACSWAP: 1Q 1C1T
>  64B: 45.0  : 45.54
> 128B: 44.48 : 44.43
> 256B: 42.0  : 41.99
> +++++++++++++++++++++++++
> TEST MACSWAP: 2Q 2C2T
>  64B: 59.5  : 60.55
> 128B: 56.78 : 58.1
> 256B: 41.85 : 41.99
> ------------------------------------------------
> 
> Signed-off-by: Vipin Varghese <vipin.vargh...@amd.com>
> 
> Vipin Varghese (3):
>   app/testpmd: add register keyword
>   app/testpmd: move offload update
>   app/testpmd: interleave SSE SIMD
>

For series,
Acked-by: Ferruh Yigit <ferruh.yi...@amd.com>


Bruce, if your test results do not match the ones presented here, please
let us know; otherwise I am planning to get the patch in for -rc1.
