On Mon, Jun 09, 2025 at 01:23:18PM +0100, Bruce Richardson wrote:
> On Mon, Jun 09, 2025 at 11:59:15AM +0000, Varghese, Vipin wrote:
> > [AMD Official Use Only - AMD Internal Distribution Only]
> > 
> > > -----Original Message-----
> > > From: Bruce Richardson <bruce.richard...@intel.com>
> > > Sent: Monday, June 9, 2025 1:28 PM
> > > To: Varghese, Vipin <vipin.vargh...@amd.com>
> > > Cc: dev@dpdk.org; Song, Keesang <keesang.s...@amd.com>
> > > Subject: Re: [PATCH v4] build: reduce use of AVX compiler flags
> > >
> > > Caution: This message originated from an External Source. Use proper 
> > > caution
> > > when opening attachments, clicking links, or responding.
> > >
> > >
> > > On Mon, Jun 09, 2025 at 06:02:02AM +0000, Varghese, Vipin wrote:
> > > > [Public]
> > > >
> > > > Snipped
> > > >
> > > > >
> > > > >
> > > > > When doing a build for a target that already has the instruction
> > > > > sets for
> > > > > AVX2/AVX512 enabled, skip emitting the AVX compiler flags, or the
> > > > > skylake-avx512 '-march' flags, as they are unnecessary. Instead,
> > > > > when the default flags produce the desired output, just use them
> > > > > unmodified, and don't bother adding in extra enabling flags for AVX2 
> > > > > or AVX-512.
> > > > >
> > > > > Depends-on: series-35006 ("doc/linux_gsg: update recommended
> > > > > compiler
> > > > > versions")
> > > > >
> > > > > Signed-off-by: Bruce Richardson <bruce.richard...@intel.com>
> > > > > ---
> > > > >
> > > > > V4: Fix error flagged by CI with clang builds without AVX512 - change
> > > > >     "cc_avx512_args" to correct "cc_avx512_flags"
> > > > >
> > > > > V3: put in version check to work around an issues with some meson
> > > > >     versions, (hopefully) allowing builds to pass in all CIs. The
> > > > >     printout of the extra flags now only happens with meson >=
> > > > > 0.60.2
> > > > >
> > > > > V2: dropped the doc update for the minimum compiler version.  Based on
> > > > >     discussion, that version bump is larger than proposed in RFC and 
> > > > > is
> > > > >     now a separate patch/series [series 35006 referenced above]
> > > > >
> > > > > ---
> > > > >  config/x86/meson.build | 31 ++++++++++++++++++++-----------
> > > > >  drivers/meson.build    |  9 +--------
> > > > >  lib/meson.build        |  9 +--------
> > > > >  3 files changed, 22 insertions(+), 27 deletions(-)
> > > > >
> > > > > diff --git a/config/x86/meson.build b/config/x86/meson.build index
> > > > > c3564b0011..e6612dbd80 100644
> > > > > --- a/config/x86/meson.build
> > > > > +++ b/config/x86/meson.build
> > > > > @@ -4,11 +4,13 @@
> > > > >  if is_ms_compiler
> > > > >      cc_avx2_flags = ['/arch:AVX2']
> > > > >  else
> > > > > -    cc_avx2_flags = ['-mavx2']
> > > > > +    cc_avx2_flags = []
> > > > > +    if cc.get_define('__AVX2__', args: machine_args) == ''
> > > > > +        cc_avx2_flags = ['-mavx2']
> > > > > +    endif
> > > > >  endif
> > > > >
> > > > >  cc_has_avx512 = false
> > > > > -target_has_avx512 = false
> > > > >
> > > > >  dpdk_conf.set('RTE_ARCH_X86', 1)
> > > > >  if dpdk_conf.get('RTE_ARCH_64')
> > > > > @@ -65,26 +67,33 @@ if is_linux or cc.get_id() == 'gcc'
> > > > >      endif
> > > > >  endif
> > > > >
> > > > > -cc_avx512_flags = ['-mavx512f', '-mavx512vl', '-mavx512dq',
> > > > > '-mavx512bw', '- mavx512cd'] -if (binutils_ok and
> > > > > cc.has_multi_arguments(cc_avx512_flags)
> > > > > +avx512_march_flag = '-march=skylake-avx512'
> > > > > +cc_avx512_flags = []
> > > > > +if (binutils_ok and cc.has_argument(avx512_march_flag)
> > > > >          and '-mno-avx512f' not in get_option('c_args'))
> > > > >      # check if compiler is working with _mm512_extracti64x4_epi64
> > > > >      # Ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82887
> > > > >      code = '''#include <immintrin.h>
> > > > >      void test(__m512i zmm){
> > > > >          __m256i ymm = _mm512_extracti64x4_epi64(zmm, 0);}'''
> > > > > -    result = cc.compiles(code, args : cc_avx512_flags, name : 'AVX512
> > > checking')
> > > > > +    result = cc.compiles(code, args : [avx512_march_flag], name :
> > > > > + 'AVX512 checking')
> > > > >      if result == false
> > > > >          machine_args += '-mno-avx512f'
> > > > >          warning('Broken _mm512_extracti64x4_epi64, disabling AVX512 
> > > > > support')
> > > > >      else
> > > > >          cc_has_avx512 = true
> > > > > -        target_has_avx512 = (
> > > > > -                cc.get_define('__AVX512F__', args: machine_args) != 
> > > > > '' and
> > > > > -                cc.get_define('__AVX512BW__', args: machine_args) != 
> > > > > '' and
> > > > > -                cc.get_define('__AVX512DQ__', args: machine_args) != 
> > > > > '' and
> > > > > -                cc.get_define('__AVX512VL__', args: machine_args) != 
> > > > > ''
> > > > > -            )
> > > > > +        if cc.get_define('__AVX512F__', args: machine_args) == ''
> > > > > +            cc_avx512_flags = [avx512_march_flag]
> > > >
> > > > Hi Bruce, we have reviewed this internally and tested the same. We 
> > > > would like
> > > your thought for the following.
> > > >
> > > > - Before patch: we were directly setting AVX512 falgs for F, BW, DQ,
> > > > VL
> > > > - new patch: we are setting the flags for `skylake-server` as bare 
> > > > minimal.
> > > > - AMD supports AVX512 from `znver4 and higher`.
> > > >
> > > > As per GCC `https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html`, the 
> > > > extra ISA
> > > supported between skylake-server (super set) and znver4 and znver5 are 
> > > `SAHF,
> > > FXSR, XSAVE, RDRND, LZCNT, HLE, PREFETCHW, SGX`.
> > > > Currently for DPDK microbenchmarks and examples runs safe as it is not 
> > > > using
> > > the `SAHF, FXSR, XSAVE, RDRND, LZCNT, HLE, PREFETCHW, SGX`
> > > instructions.
> > > >
> > > > Question: should we check if target is `AMD EPYC` then apply bare 
> > > > minimum as
> > > `-march=znver4`, thus avoid possible unsupported instruction generation 
> > > when non
> > > `c_args for march` is passed?
> > > >
> > >
> > > Can you clarify why you mean by the "target" here? Is there a specific 
> > > value you
> > > are thinking of for the "cpu_instruction_set" option?
> > 
> > `Target` is target CPU, when generated without any arguments we get code 
> > for `native build`.
> > 
> > On AMD target cpu zen4 or zen5; Before patch as per the code ` AVX512 flags 
> > for F, BW, DQ` are used in ` cc_avx512_flags`.
> > 
> > With the patch, the cc_avx512_flags is set to `-march=skylake-avx512` 
> > (where compiler optimizations `can add HLE, PREFETCHW, SGX`).
> > 
> 
> With this patch for zen4 or zen5 the AVX512 code paths should be compiled
> with no additional flags, since the -march=zen* flag should include
> everything necessary. Can you confirm that you see the extra -march flag in
> those cases?
>

Ran a quick test myself, this is what I see, doing a zen4 build:

   $ meson setup -Dcpu_instruction_set=znver4 build-zen4
   The Meson build system
   Version: 1.7.0
   Source dir: /home/bruce/dpdk-github
   Build dir: /home/bruce/dpdk-github/build-zen4
   ...
   Fetching value of define "__AVX512F__" : 1 
   Message: Extra C flags needed for AVX2 output: []
   Message: Extra C flags needed for AVX512 output: []
   ...

Checking the build.ninja file in the build-zen4 directory, there is no use
of march=skylake. Here is the compilation recipe for i40e AVX-512 code, for
example:

build 
drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o: 
c_COMPILER ../drivers/net/intel/i40e/i40e_rxtx_vec_avx512.c
 DEPFILE = 
drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o.d
 DEPFILE_UNQUOTED = 
drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o.d
 ARGS = -Idrivers/librte_net_i40e_avx512_lib.a.p -Idrivers -I../drivers 
-Idrivers/net/intel/i40e -I../drivers/net/intel/i40e 
-Idrivers/net/intel/i40e/base -I../drivers/net/intel/i40e/base -Ilib/ethdev 
-I../lib/ethdev -Ilib/eal/common -I../lib/eal/common -I. -I.. -Iconfig 
-I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include 
-I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include 
-I../kernel/linux -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log 
-I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry 
-Ilib/net -I../lib/net -Ilib/mbuf -I../lib/mbuf -Ilib/mempool -I../lib/mempool 
-Ilib/ring -I../lib/ring -Ilib/meter -I../lib/meter -Idrivers/bus/pci 
-I../drivers/bus/pci -I../drivers/bus/pci/linux -Ilib/pci -I../lib/pci 
-Idrivers/bus/vdev -I../drivers/bus/vdev -Ilib/hash -I../lib/hash -Ilib/rcu 
-I../lib/rcu -I/usr/include/x86_64-linux-gnu -fdiagnostics-color=always 
-D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -std=c11 -O3 -include 
rte_config.h -Wvla -Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral 
-Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wnested-externs 
-Wold-style-definition -Wpointer-arith -Wsign-compare -Wstrict-prototypes 
-Wundef -Wwrite-strings -Wno-packed-not-aligned -Wno-missing-field-initializers 
-D_GNU_SOURCE -fPIC -march=znver4 -mrtm -DALLOW_EXPERIMENTAL_API 
-DALLOW_INTERNAL_API -Wno-format-truncation -Wno-address-of-packed-member 
-DRTE_LOG_DEFAULT_LOGTYPE=pmd.net.i40e -DCC_AVX512_SUPPORT

Regards,
/Bruce

Reply via email to