On 1/30/2024 1:13 AM, lon...@linuxonhyperv.com wrote:
> From: Long Li <lon...@microsoft.com>
>
> Instead of allocating mbufs one by one during RX, use rte_pktmbuf_alloc_bulk()
> to allocate them in a batch.
>
> Signed-off-by: Long Li <lon...@microsoft.com>
>

Can you please quantify the performance improvement (as a percentage)? This clarifies the impact of the modification.

<...>

> @@ -121,19 +115,32 @@ mana_alloc_and_post_rx_wqe(struct mana_rxq *rxq)
>   * Post work requests for a Rx queue.
>   */
>  static int
> -mana_alloc_and_post_rx_wqes(struct mana_rxq *rxq)
> +mana_alloc_and_post_rx_wqes(struct mana_rxq *rxq, uint32_t count)
>  {
>  	int ret;
>  	uint32_t i;
> +	struct rte_mbuf **mbufs;
> +
> +	mbufs = rte_calloc_socket("mana_rx_mbufs", count, sizeof(struct rte_mbuf *),
> +				  0, rxq->mp->socket_id);
> +	if (!mbufs)
> +		return -ENOMEM;
>

'mbufs' is temporary storage for the allocated mbuf pointers. Why not allocate it on the stack instead? That can be faster and easier to manage:
"struct rte_mbuf *mbufs[count]"

> +
> +	ret = rte_pktmbuf_alloc_bulk(rxq->mp, mbufs, count);
> +	if (ret) {
> +		DP_LOG(ERR, "failed to allocate mbufs for RX");
> +		rxq->stats.nombuf += count;
> +		goto fail;
> +	}
>
>  #ifdef RTE_ARCH_32
>  	rxq->wqe_cnt_to_short_db = 0;
>  #endif
> -	for (i = 0; i < rxq->num_desc; i++) {
> -		ret = mana_alloc_and_post_rx_wqe(rxq);
> +	for (i = 0; i < count; i++) {
> +		ret = mana_post_rx_wqe(rxq, mbufs[i]);
>  		if (ret) {
>  			DP_LOG(ERR, "failed to post RX ret = %d", ret);
> -			return ret;
> +			goto fail;
>

This may leak memory. The mbufs have already been allocated at this point. If we exit the loop here and free only the 'mbufs' array, how will the remaining mbufs (those not yet posted) be freed?
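
To illustrate both comments, here is a rough sketch of the pattern I have in mind. It is not the driver code. 'struct buf', alloc_bulk(), post_one(), MAX_BURST and the counters are hypothetical stand-ins so that the example builds without DPDK. It uses a fixed-size stack array rather than a variable-length one (an assumption on my side, to keep stack usage bounded), and on a post failure it frees every buffer that was allocated but never posted.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins so the sketch builds without DPDK. */
struct buf { int unused; };

#define MAX_BURST 64                  /* assumed upper bound for 'count' */

static int unowned_bufs;              /* allocated, neither posted nor freed */
static int posted_bufs;               /* successfully handed to the queue */
static uint32_t fail_at = UINT32_MAX; /* index at which post_one() fails */

/* Stand-in for rte_pktmbuf_alloc_bulk(): all-or-nothing allocation. */
static int alloc_bulk(struct buf **bufs, uint32_t count)
{
	uint32_t i;

	for (i = 0; i < count; i++) {
		bufs[i] = malloc(sizeof(**bufs));
		if (bufs[i] == NULL) {
			/* Undo the partial allocation before failing. */
			while (i > 0) {
				free(bufs[--i]);
				unowned_bufs--;
			}
			return -ENOMEM;
		}
		unowned_bufs++;
	}
	return 0;
}

/* Stand-in for mana_post_rx_wqe(): on success the queue owns the buffer. */
static int post_one(struct buf *b, uint32_t idx)
{
	assert(b != NULL);
	if (idx == fail_at)
		return -EIO;
	free(b);                      /* the mock queue consumes it at once */
	unowned_bufs--;
	posted_bufs++;
	return 0;
}

static int alloc_and_post(uint32_t count)
{
	struct buf *bufs[MAX_BURST];  /* on the stack, no heap allocation */
	uint32_t i;
	int ret;

	if (count > MAX_BURST)
		return -EINVAL;

	ret = alloc_bulk(bufs, count);
	if (ret)
		return ret;

	for (i = 0; i < count; i++) {
		ret = post_one(bufs[i], i);
		if (ret)
			break;
	}

	/* Free bufs[i..count-1], which were allocated but never posted. */
	for (; i < count; i++) {
		free(bufs[i]);
		unowned_bufs--;
	}
	return ret;
}
```

In the driver, the final loop would presumably call rte_pktmbuf_free() on each remaining mbuf. Whether a fixed bound such as MAX_BURST fits the callers of this function is for the author to judge.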