Hi,

[...]

> > > uint16_t
> > > ixgbe_recv_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> > > 		uint16_t nb_pkts)
> > > {
> > > 	struct ixgbe_rx_queue *rxq = rx_queue;
> > > -	uint8_t split_flags[RTE_IXGBE_VPMD_RX_BURST] = {0};
> > > +	uint8_t split_flags[nb_pkts];
> > > +
> > > +	memset(split_flags, 0, nb_pkts);
> > >
> > > 	/* get some new buffers */
> > > 	uint16_t nb_bufs = _recv_raw_pkts_vec(rxq, rx_pkts, nb_pkts,
> >
> > After this _recv_raw_pkts_vec it checks 32 bytes in split_flags (4x8
> > bytes), that can overrun or miss some flags.
> > Btw. Bruce just fixed that part in "ixgbe: fix check for split packets"
>
> Ah yes, missed that when reviewing, that code would be broken if nb_bufs > 32:
>
> const uint64_t *split_fl64 = (uint64_t *)split_flags;
> if (rxq->pkt_first_seg == NULL &&
> 		split_fl64[0] == 0 && split_fl64[1] == 0 &&
> 		split_fl64[2] == 0 && split_fl64[3] == 0)
> 	return nb_bufs;
>
> right?
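Just to spell the problem out first, here is a rough standalone sketch (my own toy code, not the driver): once split_flags is a VLA of nb_pkts bytes, a check hard-wired to 4x8 bytes either reads past the array (nb_pkts < 32) or never looks at the flags beyond index 31 (nb_pkts > 32). Any correct check has to be bounded by the number of buffers actually returned, e.g.:

#include <stdbool.h>
#include <stdint.h>

/* Toy, length-aware version of the "no split packets" test.
 * The old code always read 4 x uint64_t (32 bytes) from split_flags,
 * which is only valid when the array is exactly 32 bytes long. */
static bool
no_split_flags(const uint8_t *split_flags, uint16_t nb_bufs)
{
	uint16_t i;

	for (i = 0; i < nb_bufs; i++)
		if (split_flags[i] != 0)
			return false;
	return true;
}

(The diff below does the same scan, just 8 bytes at a time.)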
We can either roll back and only allow 'nb_pkts <= 32', or fix it roughly as in the diff below.
According to a performance test (4*10GE, 64B packets, burst_size(32), iofwd with scattered_pkts_vec), there's no performance drop.
But I'm not sure whether it is important to allow a burst size larger than 32. Your comments are welcome.

diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
index e94c68b..8f34236 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
@@ -537,26 +537,35 @@ uint16_t
 ixgbe_recv_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
 		uint16_t nb_pkts)
 {
+#define NB_SPLIT_ELEM (8)
 	struct ixgbe_rx_queue *rxq = rx_queue;
 	uint8_t split_flags[nb_pkts];
+	uint32_t i, nb_scan;
+	uint16_t nb_bufs;
+	uint64_t *split_fl64 = (uint64_t *)split_flags;

 	memset(split_flags, 0, nb_pkts);

 	/* get some new buffers */
-	uint16_t nb_bufs = _recv_raw_pkts_vec(rxq, rx_pkts, nb_pkts,
-			split_flags);
+	nb_bufs = _recv_raw_pkts_vec(rxq, rx_pkts, nb_pkts,
+			split_flags);
 	if (nb_bufs == 0)
 		return 0;

 	/* happy day case, full burst + no packets to be joined */
-	const uint64_t *split_fl64 = (uint64_t *)split_flags;
-	if (rxq->pkt_first_seg == NULL &&
-			split_fl64[0] == 0 && split_fl64[1] == 0 &&
-			split_fl64[2] == 0 && split_fl64[3] == 0)
+	nb_scan = RTE_ALIGN(nb_bufs, NB_SPLIT_ELEM);
+	if (rxq->pkt_first_seg == NULL) {
+		for (i = 0; i < nb_scan;
+			i += NB_SPLIT_ELEM, split_fl64++) {
+			if (*split_fl64 != 0)
+				goto reassemble;
+		}
 		return nb_bufs;
+	}

+reassemble:
 	/* reassemble any packets that need reassembly*/
-	unsigned i = 0;
+	i = 0;
 	if (rxq->pkt_first_seg == NULL) {
 		/* find the first split flag, and only reassemble then*/
 		while (i < nb_bufs && !split_flags[i])

/Steve

>
> Another thing, that I just thought about:
> Right now we invoke ixgbe_rxq_rearm() only at the start of
> _recv_raw_pkts_vec().
> Before it was ok, as _recv_raw_pkts_vec() would never try to read more than 32
> RXDs.
> But what would happen if nb_pkts > rxq->nb_desc and rxq->rxrearm_nb == 0?
> I suppose, _recv_raw_pkts_vec() can wrap around the RXD ring and 'receive' the
> same packet twice?
> So we probably better still limit nb_pkts <= 32 at _recv_raw_pkts_vec().

_recv_raw_pkts_vec() won't wrap around the RXD ring. When it reaches the last
descriptor, the DD bit of the padding descriptor is always 0. So in the case
nb_pkts > rxq->nb_desc, _recv_raw_pkts_vec() can return no more than
rxq->nb_desc packets (see the small standalone sketch at the end of this mail).

>
> Konstantin
>
>
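P.S. The standalone sketch mentioned above, only an illustration of the termination behaviour and not the driver code: the scan advances only while a descriptor's DD bit is set, and the padding descriptors at the end of the ring never have DD set, so asking for more packets than rxq->nb_desc stops at the ring end instead of wrapping.

#include <stdint.h>
#include <stdio.h>

#define DD_BIT 0x1u

/* toy descriptor: only the status word matters for this illustration,
 * it is not the real ixgbe_adv_rx_desc layout */
struct toy_rxd {
	uint32_t status;
};

/* Count ready descriptors starting at 'from', stopping at the first
 * one whose DD bit is clear (e.g. a padding descriptor at ring end). */
static uint16_t
count_ready(const struct toy_rxd *ring, uint16_t from, uint16_t total,
		uint16_t nb_pkts)
{
	uint16_t nb = 0;

	while (nb < nb_pkts && from + nb < total &&
			(ring[from + nb].status & DD_BIT))
		nb++;
	return nb;
}

int main(void)
{
	/* 8 "real" descriptors filled by hardware + 4 padding ones */
	struct toy_rxd ring[12] = { {0} };
	int i;

	for (i = 0; i < 8; i++)
		ring[i].status = DD_BIT;

	/* asking for 32 packets still returns only 8, no wrap-around */
	printf("%u\n", (unsigned)count_ready(ring, 0, 12, 32));
	return 0;
}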