On Tue, Oct 02, 2018 at 05:08:21PM +0100, Ferruh Yigit wrote: > On 9/28/2018 8:25 PM, John Daley wrote: > > From: Hyong Youb Kim <hyon...@cisco.com> > > > > Add the vectorized version of the no-scatter Rx handler. It aims to > > process 8 descriptors per loop using AVX2 SIMD instructions. This > > handler is in its own file enic_rxtx_vec_avx2.c, and makefile and > > meson.build are modified to compile it when the compiler supports > > AVX2. Under ideal conditions, the vectorized handler reduces > > cycles/packet by more than 30%, when compared against the no-scatter > > Rx handler. Most implementation ideas come from i40e's AVX2 based > > handler, so credit goes to its authors. > > > > At this point, the new handler is meant for field trials, and is not > > selected by default. So add a new devarg enable-avx2-rx to allow the > > user to request the use of the new handler. When enable-avx2-rx=1, the > > driver will consider using the new handler. > > > > Also update the guide doc and introduce the vectorized handler. > > > > Signed-off-by: Hyong Youb Kim <hyon...@cisco.com> > > Reviewed-by: John Daley <johnd...@cisco.com> > > <...> > [...] > > +- Set ``devargs`` parameter ``enable-avx2-rx=1`` to explicitly request that > > + PMD consider the vectorized handler when selecting the receive handler. > > Can be good to show a usage sample, as done in disable-overlay=1, like > -w 12:00.0,enable-avx2-rx=1 >
Fixed. > > + > > + As the current implementation is intended for field trials, by default, > > the > > + vectorized handler is not considerd (``enable-avx2-rx=0``). > > + > > +- Run on a UCS M4 or later server with CPUs that support AVX2. > > + > > +PMD selects the vectorized handler when the handler is compiled into > > +the driver, the user requests its use via ``enable-avx2-rx=1``, CPU > > +supports AVX2, and scatter Rx is not used. To verify that the > > Code doesn't check if user requested DEV_RX_OFFLOAD_SCATTER, if so perhaps > shouldn't enable vector path. > The code below actually checks if scatter is used and selects the avx2 handler only if scatter is not used. bool enic_use_vector_rx_handler(struct enic *enic) { [...] /* Do not support scatter Rx */ if (!(enic->rq_count > 0 && enic->rq[0].data_queue_enable == 0)) return false; For enic, data_queue_enable == 0 implies that scatter Rx is not used. It covers two cases. First, scatter Rx offload (DEV_RX_OFFLOAD_SCATTER) is not requested. Second, scatter offload is requested (flag is set), but the scatter feature on the NIC is not needed, hence not enabled because maximum receive packet (max_rx_pkt_len) fits in one mbuf. > > --- a/drivers/net/enic/Makefile > > +++ b/drivers/net/enic/Makefile > > @@ -39,4 +39,11 @@ SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += base/vnic_intr.c > > SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += base/vnic_rq.c > > SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += base/vnic_rss.c > > > > +ifeq ($(findstring > > RTE_MACHINE_CPUFLAG_AVX2,$(CFLAGS)),RTE_MACHINE_CPUFLAG_AVX2) > > +# The current implementation assumes 64-bit pointers > > +ifeq ($(CONFIG_RTE_ARCH_X86_64),y) > > + SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic_rxtx_vec_avx2.c > > +endif > > +endif > > This is not exactly enough. > > CFLAGS has MACHINE_CPUFLAG based on the parameters pass the compiler. This > flag > is what compiler supports. > > For example if DPDK build for "default" architecture AVX2 won't be supported > and > your vector path won't be build at all. Please check i40e Makefile. > Ah, thanks for the explanation. Fixed. > > + > > include $(RTE_SDK)/mk/rte.lib.mk > > diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h > > index 775cd5d55..665f5668a 100644 > > --- a/drivers/net/enic/enic.h > > +++ b/drivers/net/enic/enic.h > > @@ -106,6 +106,7 @@ struct enic { > > struct vnic_dev_bar bar0; > > struct vnic_dev *vdev; > > > > + uint64_t mbuf_initializer; > > A comment can be good. > Added a comment. > > + /* > > + * mbuf_initializer contains const-after-init fields of > > + * receive mbufs (i.e. 64 bits of fields from rearm_data). > > + * It is currently used by the vectorized handler. > > + */ > > + mb_def.nb_segs = 1; > > + mb_def.data_off = RTE_PKTMBUF_HEADROOM; > > + mb_def.port = enic->port_id; > > + rte_mbuf_refcnt_set(&mb_def, 1); > > + rte_compiler_barrier(); > > + p = (uintptr_t)&mb_def.rearm_data; > > + enic->mbuf_initializer = *(uint64_t *)p; > > What do you think wrapping this block with "enic->enable_avx2_rx" check, > mostly > to show the usage intention? > Makes sense. Moved the block inside if. > > + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2)) { > > + PMD_INIT_LOG(DEBUG, " use the non-scatter avx2" > > + " Rx handler"); > > No need to split the log line. Fixed. Thanks for the review. John will review the new patch and submit v4. -Hyong