Sorry Bruce, but I don't have a clue what you are talking about here.

> -----Original Message-----
> From: Richardson, Bruce <bruce.richard...@intel.com>
> Sent: Monday, May 23, 2022 11:37 AM
> To: McDaniel, Timothy <timothy.mcdan...@intel.com>
> Cc: jer...@marvell.com; dev@dpdk.org; Wires, Kent <kent.wi...@intel.com>
> Subject: Re: [PATCH v4] event/dlb2: add support for single 512B write of 4 QEs
> 
> On Mon, May 23, 2022 at 11:09:55AM -0500, Timothy McDaniel wrote:
> > On Xeon, as 512b accesses are available, movdir64 instruction is able to
> > perform 512b read and write to DLB producer port. In order for movdir64
> > to be able to pull its data from store buffers (store-buffer-forwarding)
> > (before actual write), data should be in single 512b write format.
> > This commit adds a change so that, when the code is built for Xeon with
> > 512b AVX support, all 4 QEs are written with a single 512b write instead
> > of 4x64b writes.
> >
> > Signed-off-by: Timothy McDaniel <timothy.mcdan...@intel.com>
> > Acked-by: Kent Wires <kent.wi...@intel.com>
> > ===
> >
> > Changes since V3:
> > 1) Renamed dlb2_noavx512.c to dlb2_sve.c, and fixed up meson.build
> > for new file name.
> >
> > Changes since V1:
> > 1) Split out dlb2_event_build_hcws into two implementations, one
> > that uses AVX512 instructions, and one that does not. Each implementation
> > is in its own source file in order to avoid build errors if the compiler
> > does not support the newer AVX512 instructions.
> > 2) Update meson.build to pull in the appropriate source file based on
> > whether the compiler supports AVX512VL
> > 3) Check if target supports AVX512VL, and use appropriate implementation
> > based on this runtime check.
> > ---
> >  drivers/event/dlb2/dlb2.c        | 206 +-----------------------
> >  drivers/event/dlb2/dlb2_avx512.c | 267 +++++++++++++++++++++++++++++++
> >  drivers/event/dlb2/dlb2_priv.h   |   8 +
> >  drivers/event/dlb2/dlb2_sve.c    | 219 +++++++++++++++++++++++++
> >  drivers/event/dlb2/meson.build   |  14 ++
> >  5 files changed, 513 insertions(+), 201 deletions(-)
> >  create mode 100644 drivers/event/dlb2/dlb2_avx512.c
> >  create mode 100644 drivers/event/dlb2/dlb2_sve.c
> >
> > diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
> > index 36f07d0061..ac7572a28d 100644
> > --- a/drivers/event/dlb2/dlb2.c
> > +++ b/drivers/event/dlb2/dlb2.c
> > @@ -1834,6 +1834,11 @@ dlb2_eventdev_port_setup(struct rte_eventdev *dev,
> >
> >     dev->data->ports[ev_port_id] = &dlb2->ev_ports[ev_port_id];
> >
> > +   if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL))
> > +           ev_port->qm_port.use_avx512 = true;
> > +   else
> > +           ev_port->qm_port.use_avx512 = false;
> > +
> >     return 0;
> >  }
> >
> 
> Additional comment for this runtime check. You also should check the
> max_simd_bitwidth in DPDK i.e. the value specified with
> --force-max-simd-bitwidth EAL argument, or set programmatically by the app.
> This is to allow the user runtime control over when the various instruction
> sets get used, and it's also very useful for testing and debugging various
> code paths.
> 
> /Bruce
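
For reference, the combined check suggested in the comment above could be
sketched roughly as follows. This is only an illustration under stated
assumptions, not the driver's code: sketch_use_avx512() and SKETCH_SIMD_512
are hypothetical names so that the snippet builds without DPDK. In the driver
the two inputs would presumably come from
rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL) and
rte_vect_get_max_simd_bitwidth(), with the limit compared against
RTE_VECT_SIMD_512.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for DPDK's RTE_VECT_SIMD_512 (512 bits). */
#define SKETCH_SIMD_512 512

/*
 * Hypothetical helper: decide whether the AVX512 code path may be used.
 *
 * cpu_has_avx512vl:  result of the CPU flag check (in DPDK, presumably
 *                    rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL)).
 * max_simd_bitwidth: limit set by --force-max-simd-bitwidth or by the
 *                    application (in DPDK, presumably the value returned
 *                    by rte_vect_get_max_simd_bitwidth()).
 */
static bool
sketch_use_avx512(bool cpu_has_avx512vl, uint16_t max_simd_bitwidth)
{
	/* Both conditions must hold: the CPU supports it and the user allows it. */
	return cpu_has_avx512vl && max_simd_bitwidth >= SKETCH_SIMD_512;
}

/* Usage example: a 256-bit limit disables the AVX512 path even on a capable CPU. */
static void
sketch_example(void)
{
	assert(sketch_use_avx512(true, 512));
	assert(!sketch_use_avx512(true, 256));
	assert(!sketch_use_avx512(false, 512));
}
```

With a check of this shape, the result would be stored in
ev_port->qm_port.use_avx512 in place of the CPU-flag-only test in the patch,
so that running with --force-max-simd-bitwidth=256 would exercise the
non-AVX512 path on an AVX512-capable machine.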
