On Fri, May 20, 2022 at 1:31 AM Timothy McDaniel
<timothy.mcdan...@intel.com> wrote:
>
> On Xeon, where 512b accesses are available, the movdir64 instruction can
> perform a 512b read and write to the DLB producer port. For movdir64 to
> pull its data from the store buffers (store-buffer forwarding) before the
> actual write, the data must be written as a single 512b store.
> When the code is built for Xeon with 512b AVX support, this commit makes
> a single 512b write of all 4 QEs instead of 4x64b writes.
>
> Signed-off-by: Timothy McDaniel <timothy.mcdan...@intel.com>
> Acked-by: Kent Wires <kent.wi...@intel.com>
> ===
>
> Changes since V1:
> 1) Split out dlb2_event_build_hcws into two implementations, one
> that uses AVX512 instructions, and one that does not. Each implementation
> is in its own source file in order to avoid build errors if the compiler
> does not support the newer AVX512 instructions.
> 2) Update meson.build to pull in the appropriate source file based on
> whether the compiler supports AVX512VL.
> 3) Check if the target supports AVX512VL, and use the appropriate
> implementation based on this runtime check.
> ---
>  drivers/event/dlb2/dlb2.c          | 206 +---------------------
>  drivers/event/dlb2/dlb2_avx512.c   | 267 +++++++++++++++++++++++++++++
>  drivers/event/dlb2/dlb2_noavx512.c | 219 +++++++++++++++++++++++

Could you change the file name to dlb2_sse.c? As it stands, noavx512
could be read as covering NEON too.
Rest looks good to me. Will merge the next version.
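
For reference, the runtime selection described in item 3 of the changelog
could be sketched roughly as below. This is only an illustration, not the
code from the patch: all function names are hypothetical, the 16-byte QE
size is an assumption (512 bits divided across 4 QEs), the "wide" variant
is a plain 64-byte memcpy standing in for the real AVX512 store, and the
CPU check uses the GCC/Clang builtin rather than whatever the driver
actually calls.

```c
#include <stdint.h>
#include <string.h>

#define QE_SIZE 16       /* assumed: one QE is 16 bytes (512 bits / 4 QEs) */
#define QES_PER_LINE 4   /* four QEs fill one 64-byte (512-bit) line */

/* Hypothetical signature; the real dlb2_event_build_hcws takes more arguments. */
typedef void (*build_hcws_fn)(uint8_t *dst, const uint8_t *src);

/* Fallback path: one separate store per QE. */
static void build_hcws_narrow(uint8_t *dst, const uint8_t *src)
{
    for (int i = 0; i < QES_PER_LINE; i++)
        memcpy(dst + i * QE_SIZE, src + i * QE_SIZE, QE_SIZE);
}

/*
 * Stand-in for the AVX512 path: all four QEs are copied in one 64-byte
 * operation. The real implementation would use AVX512 intrinsics and live
 * in its own source file so that older compilers never see them.
 */
static void build_hcws_wide(uint8_t *dst, const uint8_t *src)
{
    memcpy(dst, src, QE_SIZE * QES_PER_LINE);
}

/* Returns 1 if the running CPU reports AVX512VL, 0 otherwise. */
static int cpu_has_avx512vl(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx512vl") ? 1 : 0;
#else
    return 0; /* non-x86 targets always take the fallback path */
#endif
}

/* Pick the implementation once, e.g. at port setup time. */
static build_hcws_fn select_build_hcws(void)
{
    return cpu_has_avx512vl() ? build_hcws_wide : build_hcws_narrow;
}
```

Selecting the function pointer once at setup keeps the feature check out
of the enqueue fast path.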
