Currently the SSE SIMD variables in do_macswap() are declared as
ordinary stack variables. Declaring the shuffle mask and the address
variables with the register keyword improves MAC-SWAP egress throughput
by about 1 Mpps for a single queue; ingress is unchanged.

Test Result:
 * Platform: AMD EPYC 9554 @3.1GHz, no boost
 * Test scenarios: TEST-PMD 64B IO vs MAC-SWAP
 * NIC: Broadcom P2100, loopback 2 x 100Gbps

<mode : Mpps Ingress : Mpps Egress>
------------------------------------------------
 - IO: 47.23 : 46.0
 - MAC-SWAP original: 45.75 : 43.8
 - MAC-SWAP register mod: 45.73 : 44.83

Signed-off-by: Vipin Varghese <vipin.vargh...@amd.com>
---
 app/test-pmd/macswap_sse.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
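
Note for reviewers: the sketch below shows, outside of testpmd, what the
shuffle mask does to the first 16 bytes of an Ethernet frame. It is only
an illustration under stated assumptions: swap_mac_16() is a hypothetical
helper and not part of DPDK, it handles one packet instead of the four
handled per iteration in do_macswap(), and it assumes GCC or clang on
x86-64 (the target attribute enables SSSE3 for _mm_shuffle_epi8 without
extra compiler flags). Keep in mind that register is only a hint in C; the
compiler may ignore it, and it forbids taking the variable's address.

```c
#include <stdint.h>
#include <tmmintrin.h> /* _mm_shuffle_epi8 (SSSE3); pulls in SSE2 intrinsics */

/*
 * Hypothetical helper (not part of DPDK): swap the destination and
 * source MAC addresses held in the first 16 bytes at hdr.
 */
__attribute__((target("ssse3")))
static void
swap_mac_16(uint8_t *hdr)
{
	/*
	 * _mm_set_epi8() takes bytes from most to least significant, so
	 * mask byte 0 is 6 and mask byte 6 is 0: bytes 0-5 are swapped
	 * with bytes 6-11, bytes 12-15 stay in place.
	 */
	register const __m128i shfl_msk = _mm_set_epi8(15, 14, 13, 12,
					5, 4, 3, 2,
					1, 0, 11, 10,
					9, 8, 7, 6);
	register __m128i addr0;

	addr0 = _mm_loadu_si128((const __m128i *)hdr);  /* unaligned load */
	addr0 = _mm_shuffle_epi8(addr0, shfl_msk);      /* swap dst/src MAC */
	_mm_storeu_si128((__m128i *)hdr, addr0);        /* unaligned store */
}
```

Calling swap_mac_16() on a buffer holding the bytes 0..15 should leave
bytes 6..11 in positions 0..5, bytes 0..5 in positions 6..11, and the last
four bytes (EtherType plus the first two payload bytes) untouched.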

diff --git a/app/test-pmd/macswap_sse.h b/app/test-pmd/macswap_sse.h
index 223f87a539..29088843b7 100644
--- a/app/test-pmd/macswap_sse.h
+++ b/app/test-pmd/macswap_sse.h
@@ -16,13 +16,13 @@ do_macswap(struct rte_mbuf *pkts[], uint16_t nb,
        uint64_t ol_flags;
        int i;
        int r;
-       __m128i addr0, addr1, addr2, addr3;
+       register __m128i addr0, addr1, addr2, addr3;
        /**
         * shuffle mask be used to shuffle the 16 bytes.
         * byte 0-5 wills be swapped with byte 6-11.
         * byte 12-15 will keep unchanged.
         */
-       __m128i shfl_msk = _mm_set_epi8(15, 14, 13, 12,
+       register const __m128i shfl_msk = _mm_set_epi8(15, 14, 13, 12,
                                        5, 4, 3, 2,
                                        1, 0, 11, 10,
                                        9, 8, 7, 6);
-- 
2.34.1
