On 10/4/2024 6:08 AM, Ferruh Yigit wrote:
> On 8/21/2024 3:38 PM, Vipin Varghese wrote:
>> Goal of the patch series is to improve SSE macswap on x86_64 by
>> reducing the stalls in backend engine. Original implementation of
>> the SSE-mac-swap makes loop call to multiple load, shuffle & store.
>>
>> Using SIMD ISA interleaving, register variable and reducing L1 & L2
>> cache eviction, we can reduce the stalls for
>>  - load SSE token exhaustion
>>  - Shuffle and Load dependency
>>
>> Build test using meson script:
>> ``````````````````````````````
>> build-gcc-static
>> buildtools
>> build-gcc-shared
>> build-mini
>> build-clang-static
>> build-clang-shared
>> build-x86-generic
>>
>> Test Results:
>> `````````````
>>
>> Platform-1: AMD EPYC SIENA 8594P @2.3GHz, no boost
>> Platform-2: AMD EPYC 9554 @3.1GHz, no boost
>>
>> NIC:
>>  1) Mellanox CX-7 1*200Gbps
>>  2) Intel E810 1*100Gbps
>>  3) Intel E810 2*200Gbps (2CQ-DA2) - loopback
>>  4) Broadcom P2100 2*100Gbps - loopback
>>
>> ------------------------------------------------
>> TEST IO 64B: baseline <NIC : Mpps>
>>  - NIC-1: 42.0
>>  - NIC-2: 82.0
>>  - NIC-3: 82.45
>>  - NIC-3: 47.03
>> ------------------------------------------------
>> TEST MACSWAP 64B: <NIC : Before : After>
>>  - NIC-1: 31.533 : 31.90
>>  - NIC-2: 48.0   : 48.9
>>  - NIC-3: 48.840 : 49.827
>>  - NIC-4: 44.3   : 45.5
>> ------------------------------------------------
>> TEST MACSWAP 128B: <NIC : Before : After>
>>  - NIC-1: 30.946 : 31.770
>>  - NIC-2: 47.4   : 48.3
>>  - NIC-3: 47.979 : 48.503
>>  - NIC-4: 41.53  : 44.59
>> ------------------------------------------------
>> TEST MACSWAP 256B: <NIC : Before : After>
>>  - NIC-1: 32.480 : 33.150
>>  - NIC-2: 45.29  : 45.571
>>  - NIC-3: 45.033 : 45.117
>>  - NIC-4: 36.49  : 37.5
>> ------------------------------------------------
>>
>>
>> ------------------------------------------------
>> TEST IO 64B: baseline <NIC : Mpps>
>>  - Intel E810 2*200Gbps (2CQ-DA2): 82.49
>> ------------------------------------------------
>> <NIC Intel E810 2*200Gbps (2CQ-DA2): Before : After>
>> TEST MACSWAP: 1Q 1C1T
>>  64B:  : 45.0  : 45.54
>>  128B: : 44.48 : 44.43
>>  256B: : 42.0  : 41.99
>> +++++++++++++++++++++++++
>> TEST MACSWAP: 2Q 2C2T
>>  64B:  : 59.5  : 60.55
>>  128B: : 56.78 : 58.1
>>  256B: : 41.85 : 41.99
>> ------------------------------------------------
>>
>> Signed-off-by: Vipin Varghese <vipin.vargh...@amd.com>
>>
>> Vipin Varghese (3):
>>   app/testpmd: add register keyword
>>   app/testpmd: move offload update
>>   app/testpmd: interleave SSE SIMD
>>
>
> For series,
> Acked-by: Ferruh Yigit <ferruh.yi...@amd.com>
>
>
> Bruce, if your testing is not aligned with the presented results please
> let us know, otherwise I am planning to get the patch for -rc1.
>
Series applied to dpdk-next-net/main, thanks.
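
For readers who have not looked at the series itself, the "interleave SSE SIMD" idea from the cover letter can be sketched roughly as below. This is only an illustration, not the code that was applied: macswap_burst() and its frames[] argument are hypothetical names, the group size of four is an assumption, and the sketch assumes every frame has at least 16 accessible bytes. It issues the loads for a group of frames first, then the shuffles, then the stores, so that a shuffle does not have to wait directly on the load in front of it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <tmmintrin.h> /* SSSE3: _mm_shuffle_epi8 */

/*
 * Illustrative only: macswap_burst() is a hypothetical helper, not the
 * testpmd function. Each frames[i] must point to at least 16 accessible
 * bytes (dst MAC, src MAC, EtherType and the first 2 payload bytes).
 */
__attribute__((target("ssse3")))
void macswap_burst(uint8_t *frames[], size_t n)
{
    /* Result byte i is taken from input byte mask[i]:
     * bytes 0-5 and 6-11 trade places, bytes 12-15 stay put. */
    const __m128i mask = _mm_setr_epi8(6, 7, 8, 9, 10, 11,
                                       0, 1, 2, 3, 4, 5,
                                       12, 13, 14, 15);
    size_t i = 0;

    assert(frames != NULL || n == 0);

    for (; i + 4 <= n; i += 4) {
        /* Stage 1: issue all four loads back to back. */
        __m128i a0 = _mm_loadu_si128((const __m128i *)frames[i + 0]);
        __m128i a1 = _mm_loadu_si128((const __m128i *)frames[i + 1]);
        __m128i a2 = _mm_loadu_si128((const __m128i *)frames[i + 2]);
        __m128i a3 = _mm_loadu_si128((const __m128i *)frames[i + 3]);

        /* Stage 2: four independent shuffles. */
        a0 = _mm_shuffle_epi8(a0, mask);
        a1 = _mm_shuffle_epi8(a1, mask);
        a2 = _mm_shuffle_epi8(a2, mask);
        a3 = _mm_shuffle_epi8(a3, mask);

        /* Stage 3: write the swapped headers back. */
        _mm_storeu_si128((__m128i *)frames[i + 0], a0);
        _mm_storeu_si128((__m128i *)frames[i + 1], a1);
        _mm_storeu_si128((__m128i *)frames[i + 2], a2);
        _mm_storeu_si128((__m128i *)frames[i + 3], a3);
    }

    /* Leftover frames, one at a time. */
    for (; i < n; i++) {
        __m128i a = _mm_loadu_si128((const __m128i *)frames[i]);
        a = _mm_shuffle_epi8(a, mask);
        _mm_storeu_si128((__m128i *)frames[i], a);
    }
}
```

The compiler is free to reschedule these instructions, so whether the source-level grouping helps on a given CPU has to be measured, as was done in the numbers quoted above.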