Hi Wenzhuo > -----Original Message----- > From: dev <dev-boun...@dpdk.org> On Behalf Of Wenzhuo Lu > Sent: Friday, March 12, 2021 1:27 AM > To: dev@dpdk.org > Cc: Lu, Wenzhuo <wenzhuo...@intel.com>; sta...@dpdk.org > Subject: [dpdk-dev] [PATCH 1/3] net/iavf: fix segment fault in AVX512 > > Fix segment fault when failing to get the memory from the pool. > > Fixes: 31737f2b66fb ("net/iavf: enable AVX512 for legacy Rx") > Cc: sta...@dpdk.org > > Reported-by: David Coyle <david.co...@intel.com> > Signed-off-by: Wenzhuo Lu <wenzhuo...@intel.com> > --- > drivers/net/iavf/iavf_rxtx_vec_avx512.c | 130 > ++++++++++++++++++++++++++++++++ > 1 file changed, 130 insertions(+) > > diff --git a/drivers/net/iavf/iavf_rxtx_vec_avx512.c > b/drivers/net/iavf/iavf_rxtx_vec_avx512.c > index 5cb4c7c..6134520 100644 > --- a/drivers/net/iavf/iavf_rxtx_vec_avx512.c > +++ b/drivers/net/iavf/iavf_rxtx_vec_avx512.c > @@ -25,6 +25,9 @@ > > rxdp = rxq->rx_ring + rxq->rxrearm_start; > > + if (!cache) > + goto normal;
[DC] In the Tx path, in iavf_tx_free_bufs_avx512(), it also checks for cache->len == 0 Not sure if the extra check is necessary though - I don't know if 'cache' can be valid pointer but have a length of 0 if (!cache || cache->len == 0) goto normal; > + > /* We need to pull 'n' more MBUFs into the software ring from > mempool > * We inline the mempool function here, so we can vectorize the > copy > * from the cache into the shadow ring. > @@ -127,6 +130,133 @@ > cache->len -= IAVF_DESCS_PER_LOOP_AVX; > } > > + goto done; > + > +normal: > + /* Pull 'n' more MBUFs into the software ring */ > + if (rte_mempool_get_bulk(rxq->mp, > + (void *)rxp, > + IAVF_RXQ_REARM_THRESH) < 0) { > + if (rxq->rxrearm_nb + IAVF_RXQ_REARM_THRESH >= > + rxq->nb_rx_desc) { > + __m128i dma_addr0; > + > + dma_addr0 = _mm_setzero_si128(); > + for (i = 0; i < IAVF_DESCS_PER_LOOP_AVX; i++) { > + rxp[i] = &rxq->fake_mbuf; > + _mm_store_si128((__m128i *)&rxdp[i].read, > + dma_addr0); > + } > + } > + rte_eth_devices[rxq->port_id].data->rx_mbuf_alloc_failed > += > + IAVF_RXQ_REARM_THRESH; > + return; > + } > + > +#ifndef RTE_LIBRTE_IAVF_16BYTE_RX_DESC > + struct rte_mbuf *mb0, *mb1; > + __m128i dma_addr0, dma_addr1; > + __m128i hdr_room = _mm_set_epi64x(RTE_PKTMBUF_HEADROOM, > + RTE_PKTMBUF_HEADROOM); > + /* Initialize the mbufs in vector, process 4 mbufs in one loop */ [DC] Comment above should say 2 mbufs I think > + for (i = 0; i < IAVF_RXQ_REARM_THRESH; i += 2, rxp += 2) { > + __m128i vaddr0, vaddr1; > + > + mb0 = rxp[0]; > + mb1 = rxp[1]; > + > + /* load buf_addr(lo 64bit) and buf_physaddr(hi 64bit) */ > + RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, buf_iova) != > + offsetof(struct rte_mbuf, buf_addr) + 8); > + vaddr0 = _mm_loadu_si128((__m128i *)&mb0->buf_addr); > + vaddr1 = _mm_loadu_si128((__m128i *)&mb1->buf_addr); > + > + /* convert pa to dma_addr hdr/data */ > + dma_addr0 = _mm_unpackhi_epi64(vaddr0, vaddr0); > + dma_addr1 = _mm_unpackhi_epi64(vaddr1, vaddr1); > + > + /* add headroom to pa values */ > + dma_addr0 = _mm_add_epi64(dma_addr0, hdr_room); > + dma_addr1 = _mm_add_epi64(dma_addr1, hdr_room); > + > + /* flush desc with pa dma_addr */ > + _mm_store_si128((__m128i *)&rxdp++->read, dma_addr0); > + _mm_store_si128((__m128i *)&rxdp++->read, dma_addr1); > + } [DC] Large blocks of the code above is the same as in avx2 file... any possibility to have a common function or functions? > +#else > + struct rte_mbuf *mb0, *mb1, *mb2, *mb3; > + struct rte_mbuf *mb4, *mb5, *mb6, *mb7; > + __m512i dma_addr0_3, dma_addr4_7; > + __m512i hdr_room = > _mm512_set1_epi64(RTE_PKTMBUF_HEADROOM); > + /* Initialize the mbufs in vector, process 4 mbufs in one loop */ [DC] Comment above should say 8 mbufs > + for (i = 0; i < IAVF_RXQ_REARM_THRESH; > + i += 8, rxp += 8, rxdp += 8) { <snip> > + > + /** > + * merge 0 & 1, by casting 0 to 256-bit and inserting 1 > + * into the high lanes. Similarly for 2 & 3 > + */ [DC] Comment above should say "Similarly for 2 & 3, 4 & 5, 6 & 7" > + vaddr0_1 = > + > _mm256_inserti128_si256(_mm256_castsi128_si256(vaddr0), > + vaddr1, 1); > + vaddr2_3 = > + The patch fixes the seg fault, but note I have only tested the default '#ifndef RTE_LIBRTE_IAVF_16BYTE_RX_DESC' path Tested-by: David Coyle <david.co...@intel.com>