> -----Original Message-----
> From: Richardson, Bruce <bruce.richard...@intel.com>
> Sent: Monday, May 23, 2022 11:34 AM
> To: McDaniel, Timothy <timothy.mcdan...@intel.com>
> Cc: jer...@marvell.com; dev@dpdk.org; Wires, Kent <kent.wi...@intel.com>
> Subject: Re: [PATCH v4] event/dlb2: add support for single 512B write of 4 QEs
>
> On Mon, May 23, 2022 at 11:09:55AM -0500, Timothy McDaniel wrote:
> > On Xeon, as 512b accesses are available, movdir64 instruction is able to
> > perform 512b read and write to DLB producer port. In order for movdir64
> > to be able to pull its data from store buffers (store-buffer-forwarding)
> > (before actual write), data should be in single 512b write format.
> > This commit adds a change so that, when the code is built for Xeon with
> > 512b AVX support, a single 512b write of all 4 QEs is made instead of
> > 4x64b writes.
> >
> > Signed-off-by: Timothy McDaniel <timothy.mcdan...@intel.com>
> > Acked-by: Kent Wires <kent.wi...@intel.com>
> > ===
> >
> > Changes since V3:
> > 1) Renamed dlb2_noavx512.c to dlb2_sve.c, and fixed up meson.build
> > for new file name.
> >
> > Changes since V1:
> > 1) Split out dlb2_event_build_hcws into two implementations, one
> > that uses AVX512 instructions, and one that does not. Each implementation
> > is in its own source file in order to avoid build errors if the compiler
> > does not support the newer AVX512 instructions.
> > 2) Update meson.build to pull in the appropriate source file based on
> > whether the compiler supports AVX512VL
> > 3) Check if target supports AVX512VL, and use appropriate implementation
> > based on this runtime check.
> > ---
> > drivers/event/dlb2/dlb2.c | 206 +-----------------------
> > drivers/event/dlb2/dlb2_avx512.c | 267 +++++++++++++++++++++++++++++++
> > drivers/event/dlb2/dlb2_priv.h | 8 +
> > drivers/event/dlb2/dlb2_sve.c | 219 +++++++++++++++++++++++++
> > drivers/event/dlb2/meson.build | 14 ++
> > 5 files changed, 513 insertions(+), 201 deletions(-)
> > create mode 100644 drivers/event/dlb2/dlb2_avx512.c
> > create mode 100644 drivers/event/dlb2/dlb2_sve.c
> >
> <snip>
> > diff --git a/drivers/event/dlb2/meson.build b/drivers/event/dlb2/meson.build
> > index f963589fd3..0ad4d31785 100644
> > --- a/drivers/event/dlb2/meson.build
> > +++ b/drivers/event/dlb2/meson.build
> > @@ -19,6 +19,20 @@ sources = files(
> > 'dlb2_selftest.c',
> > )
> >
> > +dlb2_avx512_support = false
> > +
> > +if dpdk_conf.has('RTE_ARCH_X86_64')
> > + dlb2_avx512_support = (
> > + cc.get_define('__AVX512VL__', args: machine_args) != ''
> > + )
> > +endif
> > +
> > +if dlb2_avx512_support == true
> > + sources += files('dlb2_avx512.c')
> > +else
> > + sources += files('dlb2_sve.c')
> > +endif
> > +
> > headers = files('rte_pmd_dlb2.h')
> >
> > deps += ['mbuf', 'mempool', 'ring', 'pci', 'bus_pci']
>
> I believe this can be improved upon further, since it still does not allow
> a generic build to opportunistically use the AVX-512 code path.
What does this mean: "generic build to opportunistically use the AVX-512
code path"?
> It also
> makes the runtime check largely pointless as the whole build will have been
> done with global AVX-512 support, meaning that the binary likely will fail
> to run if AVX-512 is not available.
If the driver is built for AVX-512, then that build supports running either
with AVX-512 or without it.
>
> Instead, I'd recommend doing as other places in DPDK - such as in ACL
> library, or i40e or ice net drivers - where we not only check the current
> build support, but also check the compiler support. That way, even if we
> are building for e.g. a target of AVX2, we can still build the AVX-512
> parts using the appropriate compiler flags, and choose them
> opportunistically at runtime.
I do not understand what you are getting at here.
> See the meson.build files in any of the above
> component directories for examples.
>
> Regards,
>
> /Bruce
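
For reference, the runtime selection discussed above can be sketched roughly as
follows. This is a minimal illustration, not the dlb2 driver code: the names
build_hcws_generic, build_hcws_wide, cpu_has_avx512vl and select_build_hcws are
hypothetical, the 16-byte QE size is an assumption, and both paths are plain
memcpy stand-ins. It uses the GCC/Clang builtin __builtin_cpu_supports() in
place of whatever CPU-flag check the driver would actually use. The idea is
that the AVX-512 variant is compiled separately with its own flags, and the
binary picks it only if the running CPU reports support.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QE_SIZE 16 /* assumed size of one queue entry, in bytes */
#define NUM_QES 4  /* entries written per producer-port write */

typedef void (*build_hcws_fn)(uint8_t *dst, const uint8_t *src);

/* Stand-in for the generic path: one separate copy per QE. */
static void build_hcws_generic(uint8_t *dst, const uint8_t *src)
{
	for (size_t i = 0; i < NUM_QES; i++)
		memcpy(dst + i * QE_SIZE, src + i * QE_SIZE, QE_SIZE);
}

/*
 * Stand-in for the AVX-512 path: a single copy of all four QEs.
 * In a real build this function would live in its own source file,
 * compiled with AVX-512 flags even when the rest of the build is not.
 */
static void build_hcws_wide(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, QE_SIZE * NUM_QES);
}

/* Returns 1 if the running CPU reports AVX512VL, 0 otherwise. */
static int cpu_has_avx512vl(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
	__builtin_cpu_init();
	return __builtin_cpu_supports("avx512vl") ? 1 : 0;
#else
	return 0; /* non-x86 or unknown compiler: always use the generic path */
#endif
}

/* Choose the implementation once, at init time. */
static build_hcws_fn select_build_hcws(void)
{
	return cpu_has_avx512vl() ? build_hcws_wide : build_hcws_generic;
}
```

With this split, a binary built for a generic target still runs on CPUs
without AVX-512, because the wide path is only ever called after the runtime
check succeeds.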