On Mon, Jan 09, 2017 at 07:50:44AM +0000, David Hunt wrote:
> Signed-off-by: David Hunt <david.h...@intel.com>
> ---
>  lib/librte_distributor/Makefile                    |   4 +
>  lib/librte_distributor/rte_distributor_burst.c     |  11 +-
>  lib/librte_distributor/rte_distributor_match_sse.c | 113 +++++++++++++++++++++
>  lib/librte_distributor/rte_distributor_priv.h      |   6 ++
>  4 files changed, 133 insertions(+), 1 deletion(-)
>  create mode 100644 lib/librte_distributor/rte_distributor_match_sse.c
>
> diff --git a/lib/librte_distributor/Makefile b/lib/librte_distributor/Makefile
> index 2acc54d..a725aaf 100644
> --- a/lib/librte_distributor/Makefile
> +++ b/lib/librte_distributor/Makefile
> @@ -44,6 +44,10 @@ LIBABIVER := 1
>  # all source are stored in SRCS-y
>  SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) := rte_distributor.c
>  SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += rte_distributor_burst.c
> +ifeq ($(CONFIG_RTE_ARCH_X86),y)
> +SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += rte_distributor_match_sse.c
> +endif
> +
>

I believe some of the intrinsics used in the vector code are SSE4.2
instructions, so you need to pass that flag for the compilation for e.g. the
"default" target for packaging into distros.

>  # install this header file
>  SYMLINK-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)-include := rte_distributor.h
> diff --git a/lib/librte_distributor/rte_distributor_burst.c b/lib/librte_distributor/rte_distributor_burst.c
> index ae7cf9d..35044c4 100644
> --- a/lib/librte_distributor/rte_distributor_burst.c
> +++ b/lib/librte_distributor/rte_distributor_burst.c
> @@ -352,6 +352,9 @@ rte_distributor_process_burst(struct rte_distributor_burst *d,
>  		}
>
>  		switch (d->dist_match_fn) {
> +		case RTE_DIST_MATCH_VECTOR:
> +			find_match_vec(d, &flows[0], &matches[0]);
> +			break;
>  		default:
>  			find_match_scalar(d, &flows[0], &matches[0]);
>  		}
> @@ -538,7 +541,13 @@ rte_distributor_create_burst(const char *name,
>  	snprintf(d->name, sizeof(d->name), "%s", name);
>  	d->num_workers = num_workers;
>
> -	d->dist_match_fn = RTE_DIST_MATCH_SCALAR;
> +#if defined(RTE_ARCH_X86)
> +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE2)) {
> +		d->dist_match_fn = RTE_DIST_MATCH_VECTOR;
> +	} else {
> +#endif
> +		d->dist_match_fn = RTE_DIST_MATCH_SCALAR;
> +	}
>

Two issues here:
1) the check needs to be for SSE4.2, not SSE2 [minimum for DPDK on x86 is SSE3
   anyway, so no need for any checks for SSE2]
2) The closing brace should be ifdefed out to fix compilation on non-x86
   platforms. A simpler/better solution might actually be to remove the braces
   since only a single line is involved in each branch.

Regards,
/Bruce
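
For illustration, the brace-free form suggested in point 2 could look roughly
like the standalone sketch below. It is not the DPDK code: `struct dist`,
`set_match_fn()`, the `DIST_MATCH_*` values and `BUILD_IS_X86` are hypothetical
stand-ins, the compiler's `__x86_64__`/`__i386__` macros stand in for
`RTE_ARCH_X86`, and the `has_sse4_2` argument stands in for the result of
`rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_2)`.

```c
#include <stdbool.h>

/* Hypothetical stand-in for RTE_ARCH_X86, derived from compiler macros. */
#if defined(__x86_64__) || defined(__i386__)
#define BUILD_IS_X86 1
#else
#define BUILD_IS_X86 0
#endif

/* Hypothetical stand-ins for the RTE_DIST_MATCH_* values in the patch. */
enum dist_match_fn {
	DIST_MATCH_SCALAR = 0,
	DIST_MATCH_VECTOR,
};

/* Hypothetical, cut-down stand-in for struct rte_distributor_burst. */
struct dist {
	enum dist_match_fn dist_match_fn;
};

/*
 * Pick the match function. has_sse4_2 is assumed to carry the result of
 * the runtime CPU flag check (SSE4.2, not SSE2).
 */
static void
set_match_fn(struct dist *d, bool has_sse4_2)
{
	(void)has_sse4_2; /* unused on non-x86 builds */
#if BUILD_IS_X86
	if (has_sse4_2)
		d->dist_match_fn = DIST_MATCH_VECTOR;
	else
#endif
		d->dist_match_fn = DIST_MATCH_SCALAR;
}
```

With no braces, nothing is left dangling when the x86 block is compiled out:
on other architectures the function reduces to the single scalar assignment.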