On Tue, 27 Aug 2024 21:02:00 +0530
"Varghese, Vipin" <vipin.vargh...@amd.com> wrote:

> On 8/21/2024 8:25 PM, Stephen Hemminger wrote:
> > On Wed, 21 Aug 2024 20:08:55 +0530
> > Vipin Varghese<vipin.vargh...@amd.com>  wrote:
> >  
> >> diff --git a/app/test-pmd/macswap_sse.h b/app/test-pmd/macswap_sse.h
> >> index 223f87a539..29088843b7 100644
> >> --- a/app/test-pmd/macswap_sse.h
> >> +++ b/app/test-pmd/macswap_sse.h
> >> @@ -16,13 +16,13 @@ do_macswap(struct rte_mbuf *pkts[], uint16_t nb,
> >>        uint64_t ol_flags;
> >>        int i;
> >>        int r;
> >> -     __m128i addr0, addr1, addr2, addr3;
> >> +     register __m128i addr0, addr1, addr2, addr3;  
> > Some compilers treat register as a no-op. Are you sure? Did you check with
> > godbolt?
> 
> Thank you Stephen. I have tested the code changes on Linux with both the
> GCC and Clang compilers.
> 
> In both cases, we see the values loaded
> into an `xmm` register.
> 
> ```
> register const __m128i shfl_msk = _mm_set_epi8(15, 14, 13, 12, 5, 4, 3, 2,
> 1, 0, 11, 10, 9, 8, 7, 6);
> vmovdqa xmm0, xmmword ptr [rip + .LCPI0_0]
> 
> ```
> 
> In both cases we see a performance improvement.
> 
> 
> Can you please help us understand if we have missed something?

OK, but I am not sure why the compiler would not already decide to use a register here.
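
One way to check would be to compile two otherwise identical functions, one with
`register` and one without, and compare the generated assembly (for example with
`gcc -O2 -S`, or on godbolt). The following is only a minimal, portable sketch:
the helpers `load48`, `store48`, `macswap_plain` and `macswap_register` are
hypothetical names, not test-pmd code, and scalar 48-bit copies stand in for the
SSE shuffle. The permutation is the one `shfl_msk` encodes: bytes 0-5 and
bytes 6-11 of the Ethernet header trade places.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 6-byte MAC address into the first
 * six bytes of a 64-bit scalar. */
static uint64_t load48(const uint8_t *p)
{
	uint64_t v = 0;

	memcpy(&v, p, 6);
	return v;
}

/* Hypothetical helper: write those same six bytes back out. */
static void store48(uint8_t *p, uint64_t v)
{
	memcpy(p, &v, 6);
}

/* Swap destination and source MAC, locals declared without 'register'. */
static void macswap_plain(uint8_t *hdr)
{
	uint64_t dst = load48(hdr);
	uint64_t src = load48(hdr + 6);

	store48(hdr, src);
	store48(hdr + 6, dst);
}

/* Same logic with 'register'. In C the only guaranteed effect is that
 * taking the address of dst or src becomes a compile error; whether it
 * changes register allocation is up to the compiler. */
static void macswap_register(uint8_t *hdr)
{
	register uint64_t dst = load48(hdr);
	register uint64_t src = load48(hdr + 6);

	store48(hdr, src);
	store48(hdr + 6, dst);
}
```

If the two functions produce the same instructions with optimization enabled,
the keyword is not what is making the difference, and the measured improvement
would need another explanation.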
