On Mon, May 23, 2022 at 11:09:55AM -0500, Timothy McDaniel wrote:
> On Xeon, as 512b accesses are available, movdir64 instruction is able to
> perform 512b read and write to DLB producer port. In order for movdir64
> to be able to pull its data from store buffers (store-buffer-forwarding)
> (before actual write), data should be in single 512b write format.
> This commit add change when code is built for Xeon with 512b AVX support
> to make single 512b write of all 4 QEs instead of 4x64b writes.
>
> Signed-off-by: Timothy McDaniel <timothy.mcdan...@intel.com>
> Acked-by: Kent Wires <kent.wi...@intel.com>
> ===
>
> Changes since V3:
> 1) Renamed dlb2_noavx512.c to dlb2_sve.c, and fixed up meson.build
> for new file name.
>
> Changes since V1:
> 1) Split out dlb2_event_build_hcws into two implementations, one
> that uses AVX512 instructions, and one that does not. Each implementation
> is in its own source file in order to avoid build errors if the compiler
> does not support the newer AVX512 instructions.
> 2) Update meson.build to and pull in appropriate source file based on
> whether the compiler supports AVX512VL
> 3) Check if target supports AVX512VL, and use appropriate implementation
> based on this runtime check.
> ---
>  drivers/event/dlb2/dlb2.c        | 206 +-----------------------
>  drivers/event/dlb2/dlb2_avx512.c | 267 +++++++++++++++++++++++++++++++
>  drivers/event/dlb2/dlb2_priv.h   |   8 +
>  drivers/event/dlb2/dlb2_sve.c    | 219 +++++++++++++++++++++++++
>  drivers/event/dlb2/meson.build   |  14 ++
>  5 files changed, 513 insertions(+), 201 deletions(-)
>  create mode 100644 drivers/event/dlb2/dlb2_avx512.c
>  create mode 100644 drivers/event/dlb2/dlb2_sve.c
>
<snip>
> diff --git a/drivers/event/dlb2/meson.build b/drivers/event/dlb2/meson.build
> index f963589fd3..0ad4d31785 100644
> --- a/drivers/event/dlb2/meson.build
> +++ b/drivers/event/dlb2/meson.build
> @@ -19,6 +19,20 @@ sources = files(
>          'dlb2_selftest.c',
>  )
>
> +dlb2_avx512_support = false
> +
> +if dpdk_conf.has('RTE_ARCH_X86_64')
> +    dlb2_avx512_support = (
> +        cc.get_define('__AVX512VL__', args: machine_args) != ''
> +    )
> +endif
> +
> +if dlb2_avx512_support == true
> +    sources += files('dlb2_avx512.c')
> +else
> +    sources += files('dlb2_sve.c')
> +endif
> +
>  headers = files('rte_pmd_dlb2.h')
>
>  deps += ['mbuf', 'mempool', 'ring', 'pci', 'bus_pci']

I believe this can be improved upon further, since it still does not
allow a generic build to opportunistically use the AVX-512 code path.
It also makes the runtime check largely pointless as the whole build
will have been done with global AVX-512 support, meaning that the
binary likely will fail to run if AVX-512 is not available.

Instead, I'd recommend doing as other places in DPDK - such as in ACL
library, or i40e or ice net drivers - where we not only check the
current build support, but also check the compiler support. That way,
even if we are building for e.g. a target of AVX2, we can still build
the AVX-512 parts using the appropriate compiler flags, and choose them
opportunistically at runtime.

See the meson.build files in any of the above component directories for
examples.

Regards,
/Bruce
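
For illustration, here is a rough sketch of how that pattern might look if
adapted to dlb2. It is modelled loosely on the i40e driver's meson.build and
is not a tested or proposed patch. The dlb2_avx512_* variable names, the
CC_AVX512_SUPPORT define and the dependency list are assumptions. The sketch
also assumes that the two implementations of dlb2_event_build_hcws are given
distinct symbol names, so that both can be linked and one chosen at runtime.

```meson
# Sketch only, under the assumptions stated above.
if arch_subdir == 'x86'
    # (a) AVX512VL is already part of the build's instruction-set baseline
    dlb2_avx512_cpu_support = (
        cc.get_define('__AVX512VL__', args: machine_args) != ''
    )

    # (b) not in the baseline, but the compiler is able to emit it
    dlb2_avx512_cc_support = (
        not machine_args.contains('-mno-avx512f') and
        cc.has_argument('-mavx512f') and
        cc.has_argument('-mavx512vl')
    )

    if dlb2_avx512_cpu_support or dlb2_avx512_cc_support
        # Lets the C code know that the AVX-512 path was compiled in
        cflags += ['-DCC_AVX512_SUPPORT']

        # Only this one file is built with AVX-512 flags; the rest of the
        # driver keeps the generic machine_args, so the binary still runs
        # on CPUs without AVX-512.
        dlb2_avx512_lib = static_library('dlb2_avx512_lib',
                'dlb2_avx512.c',
                dependencies: [static_rte_eal, static_rte_eventdev],
                include_directories: includes,
                c_args: [cflags, '-mavx512f', '-mavx512vl'])
        objs += dlb2_avx512_lib.extract_objects('dlb2_avx512.c')
    endif
endif

# The non-AVX-512 implementation is always built as the fallback.
sources += files('dlb2_sve.c')
```

With a layout like this, the runtime check from point 3 of the changelog
becomes meaningful: a generic build carries both code paths, and the driver
picks the AVX-512 one only when the CPU it is running on reports support.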