Hi Neil, > -----Original Message----- > From: Neil Horman [mailto:nhorman at tuxdriver.com] > Sent: Monday, December 15, 2014 4:00 PM > To: Ananyev, Konstantin > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH 10/17] librte_acl: add AVX2 as new > rte_acl_classify() method > > On Sun, Dec 14, 2014 at 06:10:52PM +0000, Konstantin Ananyev wrote: > > Introduce new classify() method that uses AVX2 instructions. > > From my measurements: > > On HSW boards when processing >= 16 packets per call, > > AVX2 method outperforms it's SSE counterpart by 10-25%, > > (depending on the ruleset). > > At runtime, this method is selected as default one on HW that supports AVX2. > > > > Signed-off-by: Konstantin Ananyev <konstantin.ananyev at intel.com> > > --- > > lib/librte_acl/Makefile | 9 + > > lib/librte_acl/acl.h | 4 + > > lib/librte_acl/acl_run.h | 2 +- > > lib/librte_acl/acl_run_avx2.c | 58 +++++ > > lib/librte_acl/acl_run_avx2.h | 305 ++++++++++++++++++++++++ > > lib/librte_acl/acl_run_sse.c | 537 > > +----------------------------------------- > > lib/librte_acl/acl_run_sse.h | 533 > > +++++++++++++++++++++++++++++++++++++++++ > > lib/librte_acl/rte_acl.c | 5 +- > > lib/librte_acl/rte_acl.h | 2 + > > 9 files changed, 917 insertions(+), 538 deletions(-) > > create mode 100644 lib/librte_acl/acl_run_avx2.c > > create mode 100644 lib/librte_acl/acl_run_avx2.h > > create mode 100644 lib/librte_acl/acl_run_sse.h > > > > diff --git a/lib/librte_acl/Makefile b/lib/librte_acl/Makefile > > index 65e566d..223ec31 100644 > > --- a/lib/librte_acl/Makefile > > +++ b/lib/librte_acl/Makefile > > @@ -45,8 +45,17 @@ SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_bld.c > > SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_gen.c > > SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_scalar.c > > SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c > > +SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_avx2.c > > > > CFLAGS_acl_run_sse.o += -msse4.1 > > +ifeq ($(CC), icc) > > +CFLAGS_acl_run_avx2.o += -march=core-avx2 > > +else ifneq ($(shell \ > > +test $(GCC_MAJOR_VERSION) -le 4 -a $(GCC_MINOR_VERSION) -le 6 && echo 1), > > 1) > > +CFLAGS_acl_run_avx2.o += -mavx2 > > +else > > +CFLAGS_acl_run_avx2.o += -msse4.1 > > +endif > > > This seems broken. You've unilaterally included acl_run_avx2.c in the build > list above, but only enable -mavx2 if the compiler is at least gcc 4.6.
Actually 4.7 (before that version, as I know, gcc doesn't support avx2) > Unless > you want to make gcc 4.6 a requirement for building, I believe DPDK is required to be buildable by gcc 4.6 As I remember, we have to support it all way down to gcc 4.3. > you need to also exclude > the file above from the build list. That means that for gcc 4.6 and below rte_acl_classify_avx2() would not be defined. And then at runtime, I have to check for that somehow and (re)populate classify_fns[]. Doesn't seems like a good way to me. Instead, I prefer to always build acl_run_avx2.c, but for old compilers that don't support AVX2 - rte_acl_classify_avx2() would simply be identical to rte_acl_classify_sse(). > That in turn I think allows you to remove a > bunch of the ifdeffing that you've done in some of the avx2 specific files. Actually there are not many of them. One in acl_run_avx2.h and another in acl_run_avx2.c. Konstantin > > Neil