On Mon, Sep 28, 2015 at 01:05:24AM +0800, Zhe Tao wrote:
> The vPMD RX function uses the multi-buffer and SSE instructions to
> accelerate the RX speed, but now the pktype cannot be supported by the vPMD RX,
> because it will decrease the performance heavily.
> 
> Signed-off-by: Zhe Tao <zhe.tao at intel.com>
> ---
>  config/common_bsdapp              |   2 +
>  config/common_linuxapp            |   2 +
>  drivers/net/i40e/Makefile         |   1 +
>  drivers/net/i40e/base/i40e_type.h |   3 +
>  drivers/net/i40e/i40e_rxtx.c      |  28 ++-
>  drivers/net/i40e/i40e_rxtx.h      |  20 +-
>  drivers/net/i40e/i40e_rxtx_vec.c  | 484 ++++++++++++++++++++++++++++++++++++++
>  7 files changed, 535 insertions(+), 5 deletions(-)
>  create mode 100644 drivers/net/i40e/i40e_rxtx_vec.c
> 
<snip>
> +
> + /* vPMD receive routine, now only accept (nb_pkts == RTE_I40E_VPMD_RX_BURST)
> + * in one loop
> + *
> + * Notice:
> + * - nb_pkts < RTE_I40E_VPMD_RX_BURST, just return no packet
I don't think this comment matches the implementation below. I think you are
allowed to request bursts as small as RTE_I40E_DESCS_PER_LOOP.

> + * - nb_pkts > RTE_I40E_VPMD_RX_BURST, only scan RTE_I40E_VPMD_RX_BURST
> + *   numbers of DD bits
> +
> + */
> +static inline uint16_t
> +_recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
> +		   uint16_t nb_pkts, uint8_t *split_packet)
> +{
> +	volatile union i40e_rx_desc *rxdp;
> +	struct i40e_rx_entry *sw_ring;
> +	uint16_t nb_pkts_recd;
> +	int pos;
> +	uint64_t var;
> +	__m128i shuf_msk;
> +
> +	__m128i crc_adjust = _mm_set_epi16(
> +				0, 0, 0,       /* ignore non-length fields */
> +				-rxq->crc_len, /* sub crc on data_len */
> +				0,             /* ignore high-16bits of pkt_len */
> +				-rxq->crc_len, /* sub crc on pkt_len */
> +				0, 0           /* ignore pkt_type field */
> +			);
> +	__m128i dd_check, eop_check;
> +
> +	/* nb_pkts shall be less equal than RTE_I40E_MAX_RX_BURST */
> +	nb_pkts = RTE_MIN(nb_pkts, RTE_I40E_MAX_RX_BURST);
> +
> +	/* nb_pkts has to be floor-aligned to RTE_I40E_DESCS_PER_LOOP */
> +	nb_pkts = RTE_ALIGN_FLOOR(nb_pkts, RTE_I40E_DESCS_PER_LOOP);

/Bruce
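For readers following along, the quoted RTE_MIN()/RTE_ALIGN_FLOOR() pair is what
makes small bursts legal: the requested count is first clamped to the maximum and
then rounded down to a multiple of the per-loop descriptor count, so any request
from RTE_I40E_DESCS_PER_LOOP upwards is processed rather than rejected. The
standalone sketch below illustrates that clamping; the macro values used here
(4 descriptors per loop, 32 maximum burst) are assumptions for illustration only
and are not taken from the patch.

/*
 * Minimal, self-contained sketch (not the driver code) of the burst-size
 * handling quoted above. DESCS_PER_LOOP and MAX_RX_BURST stand in for
 * RTE_I40E_DESCS_PER_LOOP and RTE_I40E_MAX_RX_BURST; their values here
 * are assumptions for illustration.
 */
#include <stdint.h>
#include <stdio.h>

#define DESCS_PER_LOOP 4U   /* assumed value of RTE_I40E_DESCS_PER_LOOP */
#define MAX_RX_BURST   32U  /* assumed value of RTE_I40E_MAX_RX_BURST   */

/* Mirrors the RTE_MIN() + RTE_ALIGN_FLOOR() pair in the quoted function. */
static uint16_t
clamp_burst(uint16_t nb_pkts)
{
	if (nb_pkts > MAX_RX_BURST)
		nb_pkts = MAX_RX_BURST;             /* RTE_MIN() step         */
	return nb_pkts & ~(DESCS_PER_LOOP - 1);     /* RTE_ALIGN_FLOOR() step */
}

int
main(void)
{
	/* A burst of DESCS_PER_LOOP packets is processed, not refused,
	 * which is why the "just return no packet" wording in the quoted
	 * comment does not match the code. */
	printf("%u\n", (unsigned)clamp_burst(4));   /* -> 4                  */
	printf("%u\n", (unsigned)clamp_burst(7));   /* -> 4  (floor-aligned) */
	printf("%u\n", (unsigned)clamp_burst(100)); /* -> 32 (clamped)       */
	return 0;
}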