On 09/11, Zhang, Qi Z wrote:
>
>
>> -----Original Message-----
>> From: Ye, Xiaolong
>> Sent: Tuesday, September 10, 2019 11:09 PM
>> To: Zhang, Qi Z <qi.z.zh...@intel.com>
>> Cc: Yigit, Ferruh <ferruh.yi...@intel.com>; Loftus, Ciara
>> <ciara.lof...@intel.com>; dev@dpdk.org; sta...@dpdk.org; Karlsson,
>> Magnus <magnus.karls...@intel.com>
>> Subject: Re: [PATCH] net/af_xdp: fix Tx halt when no recv packets
>>
>> On 09/10, Zhang, Qi Z wrote:
>> >
>> >
>> >> -----Original Message-----
>> >> From: Ye, Xiaolong
>> >> Sent: Tuesday, September 10, 2019 9:54 PM
>> >> To: Zhang, Qi Z <qi.z.zh...@intel.com>
>> >> Cc: Yigit, Ferruh <ferruh.yi...@intel.com>; Loftus, Ciara
>> >> <ciara.lof...@intel.com>; dev@dpdk.org; sta...@dpdk.org; Karlsson,
>> >> Magnus <magnus.karls...@intel.com>
>> >> Subject: Re: [PATCH] net/af_xdp: fix Tx halt when no recv packets
>> >>
>> >> On 09/10, Zhang, Qi Z wrote:
>> >> >
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Ye, Xiaolong
>> >> >> Sent: Tuesday, September 10, 2019 12:13 AM
>> >> >> To: Yigit, Ferruh <ferruh.yi...@intel.com>; Loftus, Ciara
>> >> >> <ciara.lof...@intel.com>; Ye, Xiaolong <xiaolong...@intel.com>;
>> >> >> Zhang, Qi Z <qi.z.zh...@intel.com>
>> >> >> Cc: dev@dpdk.org; sta...@dpdk.org
>> >> >> Subject: [PATCH] net/af_xdp: fix Tx halt when no recv packets
>> >> >>
>> >> >> The kernel only consumes Tx packets if we have some Rx traffic on
>> >> >> specified queue or we have called send(). So we need to issue a
>> >> >> send() even when the allocation fails so that kernel will start to
>> >> >> consume packets again.
>> >> >
>> >> >So "allocation fails" means " xsk_ring_prod__reserve" fail right?
>> >>
>> >> Yes.
>> >>
>> >> >I don't understand when xsk_ring_prod__needs_wakeup is true why
>> >> >kernel will stop Tx packet at this situation would you share more
>> >> >insight?
>> >>
>> >> Actually, the fail case is xsk_ring_prod__needs_wakeup is false, then
>> >> we can't issue a send() when xsk_ring_prod__reserve fails.
>> >
>> >Sorry, I think my question should be for the case when
>> >xsk_ring_prod__needs_wakeup is false, I don't understand why we need to
>> >handle different at below two situations 1. when xsk_ring_prod__reserve
>> >fails 2. normal tx scenario.
>> >My understanding is when xsk_ring_prod__needs_wakeup(tx) is false,
>> >which means Tx is ongoing, we don't need to wake up kernel to continue.
>> >
>>
>> The problem is that kernel does not guarantee that all entries are sent for
>> Tx.
>> There are a number of reasons that this might not happen, but usually some
>> Rx packet will at some point in time in the very short future trigger
>> further Tx processing and the packets will be sent. But if you only have Tx
>> processing and no Rx at all, you have to trigger a sento() again.
>
>Ok , so the question is why we have below code.
>#if defined(XDP_USE_NEED_WAKEUP)
>if (xsk_ring_prod__needs_wakeup(&txq->tx))
>#endif
>	kick_tx(txq);
>
>Here, when xsk_ring_prod__needs_wakeup is false, we can skip kick_tx (send),
>but why same "if check" can't be applied to the case when
>xsk_ring_prod__reserve failed?

When the system is running out of Tx entries, it needs some explicit action to
trigger the kernel to consume the Tx buffers.

>
>Btw, think about below case
>when xsk_ring_prod_reserve failed, if we don't kick_tx, and no following rx
>happens,
>does that mean the remain packets in tx queue will never get chance be
>transmitted?, what happen if the last tx_burst is never be called?

This is exactly the issue this patch tries to fix. In this case, an
xsk_ring_prod__reserve failure means there are no more available entries in
the Tx queue; if we don't call send/sendto and there is no Rx traffic, Tx just
halts.

Thanks,
Xiaolong

>
>>
>> Thanks,
>> Xiaolong
>>
>> >>
>> >> Thanks,
>> >> Xiaolong
>> >>
>> >> >
>> >> >Thanks
>> >> >Qi
>> >> >
>> >> >>
>> >> >> Commit 45bba02c95b0 ("net/af_xdp: support need wakeup feature")
>> >> >> breaks above rule by adding some condition to send, this patch
>> >> >> fixes it while still keeps the need_wakeup feature for Tx.
>> >> >>
>> >> >> Fixes: 45bba02c95b0 ("net/af_xdp: support need wakeup feature")
>> >> >> Cc: sta...@dpdk.org
>> >> >>
>> >> >> Signed-off-by: Xiaolong Ye <xiaolong...@intel.com>
>> >> >> ---
>> >> >>  drivers/net/af_xdp/rte_eth_af_xdp.c | 28 ++++++++++++++--------------
>> >> >>  1 file changed, 14 insertions(+), 14 deletions(-)
>> >> >>
>> >> >> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c
>> >> >> b/drivers/net/af_xdp/rte_eth_af_xdp.c
>> >> >> index 41ed5b2af..e496e9aaa 100644
>> >> >> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
>> >> >> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
>> >> >> @@ -286,19 +286,16 @@ kick_tx(struct pkt_tx_queue *txq)
>> >> >>  {
>> >> >>  	struct xsk_umem_info *umem = txq->pair->umem;
>> >> >>
>> >> >> -#if defined(XDP_USE_NEED_WAKEUP)
>> >> >> -	if (xsk_ring_prod__needs_wakeup(&txq->tx))
>> >> >> -#endif
>> >> >> -		while (send(xsk_socket__fd(txq->pair->xsk), NULL,
>> >> >> -			    0, MSG_DONTWAIT) < 0) {
>> >> >> -			/* some thing unexpected */
>> >> >> -			if (errno != EBUSY && errno != EAGAIN && errno != EINTR)
>> >> >> -				break;
>> >> >> -
>> >> >> -			/* pull from completion queue to leave more space */
>> >> >> -			if (errno == EAGAIN)
>> >> >> -				pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
>> >> >> -		}
>> >> >> +	while (send(xsk_socket__fd(txq->pair->xsk), NULL,
>> >> >> +		    0, MSG_DONTWAIT) < 0) {
>> >> >> +		/* some thing unexpected */
>> >> >> +		if (errno != EBUSY && errno != EAGAIN && errno != EINTR)
>> >> >> +			break;
>> >> >> +
>> >> >> +		/* pull from completion queue to leave more space */
>> >> >> +		if (errno == EAGAIN)
>> >> >> +			pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
>> >> >> +	}
>> >> >>  	pull_umem_cq(umem, ETH_AF_XDP_TX_BATCH_SIZE);
>> >> >>  }
>> >> >>
>> >> >> @@ -367,7 +364,10 @@ eth_af_xdp_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
>> >> >>
>> >> >>  	xsk_ring_prod__submit(&txq->tx, nb_pkts);
>> >> >>
>> >> >> -	kick_tx(txq);
>> >> >> +#if defined(XDP_USE_NEED_WAKEUP)
>> >> >> +	if (xsk_ring_prod__needs_wakeup(&txq->tx))
>> >> >> +#endif
>> >> >> +		kick_tx(txq);
>> >> >>
>> >> >>  	txq->stats.tx_pkts += nb_pkts;
>> >> >>  	txq->stats.tx_bytes += tx_bytes;
>> >> >> --
>> >> >> 2.17.1
>> >> >
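
To make the halt discussed in this thread concrete, here is a minimal toy model in C. It is only a sketch and does not use the real libbpf or DPDK APIs: `toy_tx_ring`, `toy_reserve`, `toy_kick`, `toy_send_all` and `RING_SIZE` are hypothetical names invented for illustration. It assumes the worst case described above: there is no Rx traffic, the need-wakeup flag stays false, and the "kernel" drains the Tx ring only when it is kicked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 4 /* hypothetical Tx ring capacity */

/* Toy stand-in for the Tx ring; not the libbpf xsk_ring_prod. */
struct toy_tx_ring {
	size_t used;       /* entries submitted but not yet consumed */
	bool needs_wakeup; /* models xsk_ring_prod__needs_wakeup() */
};

/* Models xsk_ring_prod__reserve() for one entry: fails when the ring is full. */
static bool toy_reserve(struct toy_tx_ring *r)
{
	assert(r->used <= RING_SIZE);
	if (r->used == RING_SIZE)
		return false;
	r->used++;
	return true;
}

/* Models kick_tx(): in this toy, the kernel drains the ring only when kicked. */
static void toy_kick(struct toy_tx_ring *r)
{
	r->used = 0;
}

/*
 * Try to queue `total` packets with no Rx traffic at all.
 * Returns how many packets were queued before the sender gave up
 * after 8 consecutive failed reservations.
 */
size_t toy_send_all(size_t total, bool kick_on_reserve_failure)
{
	struct toy_tx_ring ring = { .used = 0, .needs_wakeup = false };
	size_t queued = 0;
	int failed_polls = 0;

	while (queued < total && failed_polls < 8) {
		if (!toy_reserve(&ring)) {
			/* Ring is full: kick regardless of needs_wakeup, or stall. */
			if (kick_on_reserve_failure)
				toy_kick(&ring);
			failed_polls++;
			continue;
		}
		failed_polls = 0;
		queued++;
		/* Normal path: the kick stays gated by needs_wakeup. */
		if (ring.needs_wakeup)
			toy_kick(&ring);
	}
	return queued;
}
```

In this model, `toy_send_all(100, false)` stops once the ring fills and never recovers, while `toy_send_all(100, true)` queues all 100 packets because every failed reservation triggers a kick. The real kernel behaviour is less absolute than the toy (per the thread, it merely does not guarantee that all entries are sent), so this illustrates only the stall scenario.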