On Tue, 27 Aug 2024 21:02:00 +0530 "Varghese, Vipin" <vipin.vargh...@amd.com> wrote:
> On 8/21/2024 8:25 PM, Stephen Hemminger wrote:
> > On Wed, 21 Aug 2024 20:08:55 +0530
> > Vipin Varghese <vipin.vargh...@amd.com> wrote:
> >
> >> diff --git a/app/test-pmd/macswap_sse.h b/app/test-pmd/macswap_sse.h
> >> index 223f87a539..29088843b7 100644
> >> --- a/app/test-pmd/macswap_sse.h
> >> +++ b/app/test-pmd/macswap_sse.h
> >> @@ -16,13 +16,13 @@ do_macswap(struct rte_mbuf *pkts[], uint16_t nb,
> >>  	uint64_t ol_flags;
> >>  	int i;
> >>  	int r;
> >> -	__m128i addr0, addr1, addr2, addr3;
> >> +	register __m128i addr0, addr1, addr2, addr3;
> >
> > Some compilers treat register as a no-op. Are you sure? Did you check
> > with godbolt?
>
> Thank you Stephen. I have tested the code changes on Linux using the GCC
> and Clang compilers.
>
> In both cases, in the Linux environment, we have seen the values loaded
> into an `xmm` register.
>
> ```
> register const __m128i shfl_msk = _mm_set_epi8(15, 14, 13, 12, 5, 4, 3, 2,
> 		1, 0, 11, 10, 9, 8, 7, 6);
> vmovdqa xmm0, xmmword ptr [rip + .LCPI0_0]
> ```
>
> In both cases we see a performance improvement.
>
> Can you please help us understand if we have missed something?

OK, but I am not sure why the compiler would not already decide to use a
register here.
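
One way to check this is to compile a reduced case with and without the
keyword and compare the generated assembly (for example with
`gcc -O2 -S` or on godbolt). The following is only a rough sketch, not
the test-pmd code: the function names `swap_plain`, `swap_register` and
`variants_agree` are made up for illustration, and the `target("ssse3")`
attribute is a GCC/Clang extension used here so the file builds without
`-mssse3`. Only the shuffle mask is taken from the patch.

```c
#include <stdint.h>
#include <string.h>
#include <tmmintrin.h>	/* _mm_shuffle_epi8 (SSSE3) */

/* Assumption: GCC/Clang attribute enabling SSSE3 per function. */
#define SSSE3_FN __attribute__((target("ssse3")))

/* Mask from the patch: swaps bytes 0..5 with bytes 6..11. */
#define SHFL_MSK \
	_mm_set_epi8(15, 14, 13, 12, 5, 4, 3, 2, 1, 0, 11, 10, 9, 8, 7, 6)

/* Variant without the register keyword. */
SSSE3_FN void swap_plain(uint8_t *hdr)
{
	const __m128i shfl_msk = SHFL_MSK;
	__m128i addr = _mm_loadu_si128((const __m128i *)hdr);

	addr = _mm_shuffle_epi8(addr, shfl_msk);
	_mm_storeu_si128((__m128i *)hdr, addr);
}

/* Same body, with the register keyword as in the patch. */
SSSE3_FN void swap_register(uint8_t *hdr)
{
	register const __m128i shfl_msk = SHFL_MSK;
	register __m128i addr = _mm_loadu_si128((const __m128i *)hdr);

	addr = _mm_shuffle_epi8(addr, shfl_msk);
	_mm_storeu_si128((__m128i *)hdr, addr);
}

/* Returns 1 if both variants produce the same, correctly swapped header. */
int variants_agree(void)
{
	uint8_t a[16], b[16], expect[16];
	int i;

	for (i = 0; i < 16; i++)
		a[i] = b[i] = expect[i] = (uint8_t)i;
	for (i = 0; i < 6; i++) {
		expect[i] = (uint8_t)(i + 6);	/* first 6 bytes <- next 6 */
		expect[i + 6] = (uint8_t)i;	/* next 6 bytes <- first 6 */
	}

	swap_plain(a);
	swap_register(b);

	return memcmp(a, b, 16) == 0 && memcmp(a, expect, 16) == 0;
}
```

If the assembly for the two functions is identical at the optimization
level test-pmd is built with, the keyword is not what changes the
result in this reduced case, and the measured difference would need
another explanation.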