Hi,

13/10/2017 11:01, Xiaoyun Li:
> This patch dynamically selects the memcpy implementation at run-time based
> on the CPU flags that the current machine supports. This patch uses function
> pointers which are bound to the relevant functions at constructor time.
> In addition, the AVX512 instruction set is compiled in only if the user
> enables it in the config and the compiler supports it.
> 
> Signed-off-by: Xiaoyun Li <xiaoyun...@intel.com>
> ---
Keeping only the major changes of the patch for later discussions:
[...]
>  static inline void *
>  rte_memcpy(void *dst, const void *src, size_t n)
>  {
> -     if (!(((uintptr_t)dst | (uintptr_t)src) & ALIGNMENT_MASK))
> -             return rte_memcpy_aligned(dst, src, n);
> +     if (n <= RTE_X86_MEMCPY_THRESH)
> +             return rte_memcpy_internal(dst, src, n);
>       else
> -             return rte_memcpy_generic(dst, src, n);
> +             return (*rte_memcpy_ptr)(dst, src, n);
>  }
[...]
> +static inline void *
> +rte_memcpy_internal(void *dst, const void *src, size_t n)
> +{
> +     if (!(((uintptr_t)dst | (uintptr_t)src) & ALIGNMENT_MASK))
> +             return rte_memcpy_aligned(dst, src, n);
> +     else
> +             return rte_memcpy_generic(dst, src, n);
> +}

The significant change in this patch is that copies larger than
128 bytes (RTE_X86_MEMCPY_THRESH) now go through a function pointer.
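
To make the discussion concrete, here is a minimal sketch of this kind of
dispatch. It is not the patch's code: COPY_THRESH, copy_baseline, copy_avx2,
copy_ptr, copy_init and copy_dispatch are hypothetical names, both variants
simply wrap memcpy, and the CPU check assumes the GCC/Clang builtin
__builtin_cpu_supports on x86. The point it illustrates is that the small-copy
path stays inlinable, while larger copies pay for an indirect call that the
compiler generally cannot inline.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for RTE_X86_MEMCPY_THRESH (128 in the patch). */
#define COPY_THRESH 128

typedef void *(*copy_fn_t)(void *dst, const void *src, size_t n);

/* Hypothetical variants; real ones would use different vector widths. */
static void *copy_baseline(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

static void *copy_avx2(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Defaults to the baseline so the pointer is never NULL. */
static copy_fn_t copy_ptr = copy_baseline;

/* Runs before main() and binds the pointer once, based on CPU flags. */
__attribute__((constructor))
static void copy_init(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
	__builtin_cpu_init();
	if (__builtin_cpu_supports("avx2"))
		copy_ptr = copy_avx2;
#endif
}

static inline void *copy_dispatch(void *dst, const void *src, size_t n)
{
	/* Small copies: direct call, can be inlined at the call site. */
	if (n <= COPY_THRESH)
		return memcpy(dst, src, n);
	/* Larger copies: indirect call through the run-time pointer. */
	return (*copy_ptr)(dst, src, n);
}
```

With this shape, any per-call cost of the indirection shows up only for
copies above the threshold, which is why numbers for several copy sizes
would be useful.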

Could you please provide some benchmark numbers?

From a test done at Mellanox, there might be a performance degradation
of about 15% in testpmd txonly with AVX2.
Is anyone else seeing a performance degradation?
