On Wed, Apr 12, 2023 at 10:05:57AM +0100, Bruce Richardson wrote:
> On Tue, Apr 11, 2023 at 02:12:20PM -0700, Tyler Retzlaff wrote:
> > Inline assembly is not supported for MSVC x64 instead use _mm_prefetch
> > and _mm_cldemote intrinsics.
> > 
> > Signed-off-by: Tyler Retzlaff <roret...@linux.microsoft.com>
> > ---
> 
> Acked-by: Bruce Richardson <bruce.richard...@intel.com>
> 
> One comment inline below for future consideration.
> 
> >  lib/eal/x86/include/rte_prefetch.h | 29 +++++++++++++++++++++++++++++
> >  1 file changed, 29 insertions(+)
> > 
> > diff --git a/lib/eal/x86/include/rte_prefetch.h 
> > b/lib/eal/x86/include/rte_prefetch.h
> > index 7fd01c4..1391af0 100644
> > --- a/lib/eal/x86/include/rte_prefetch.h
> > +++ b/lib/eal/x86/include/rte_prefetch.h
> > @@ -13,6 +13,7 @@
> >  #include <rte_common.h>
> >  #include "generic/rte_prefetch.h"
> >  
> > +#ifndef RTE_TOOLCHAIN_MSVC
> >  static inline void rte_prefetch0(const volatile void *p)
> >  {
> >     asm volatile ("prefetcht0 %[p]" : : [p] "m" (*(const volatile char 
> > *)p));
> > @@ -43,6 +44,34 @@ static inline void rte_prefetch_non_temporal(const 
> > volatile void *p)
> >  {
> >     asm volatile(".byte 0x0f, 0x1c, 0x06" :: "S" (p));
> >  }
> > +#else
> > +static inline void rte_prefetch0(const volatile void *p)
> > +{
> > +   _mm_prefetch(p, 1);
> > +}
> > +
> > +static inline void rte_prefetch1(const volatile void *p)
> > +{
> > +   _mm_prefetch(p, 2);
> > +}
> > +
> > +static inline void rte_prefetch2(const volatile void *p)
> > +{
> > +   _mm_prefetch(p, 3);
> > +}
> > +
> > +static inline void rte_prefetch_non_temporal(const volatile void *p)
> > +{
> > +   _mm_prefetch(p, 0);
> > +}
> 
> For these prefetch instructions, I'm not sure there is any reason why we
> can't drop the inline assembly versions. The instructions are very old at
> this point and should be widely supported by all compilers we use.
> 
> Rather than using hard-coded 1, 2, 3 values in the prefetch calls, I
> believe there should be defines for the levels: "_MM_HINT_T0",
> "_MM_HINT_T1" etc.

hm, i did not know about these and i bet they fix the problem i had.
i.e. if i use e.g. bare '1' i would not get the same prefetch codegen on
gcc/msvc but these defines probably resolve that problem.

let me take another look at this one.

> 
> > +__rte_experimental
> > +static inline void
> > +rte_cldemote(const volatile void *p)
> > +{
> > +   _mm_cldemote(p);
> > +}
> > +#endif
> > +
> >  
> >  #ifdef __cplusplus
> >  }
> > -- 
> > 1.8.3.1
> > 

Reply via email to