On Fri, 17 Dec 2021 17:00:32 +0200 Tudor Cornea <tudor.cor...@gmail.com> wrote:
> This patch attempts to avoid dropping packets that are to be
> transmitted, in case there is no space in the KNI ring.
>
> We have a use case in which we leverage the Linux TCP / IP stack for
> control plane, and some protocols might be sensitive to packet drops.
>
> This might mean that the sender (Kernel) might be moving at a faster pace
> than the receiver end (DPDK application), or it might have some brief
> moments of bursty traffic patterns.
>
> Requeuing the packets could add a kind of backpressure until a transmit
> window is available to us.
>
> The burden of retransmitting is shifted to the caller of ndo_start_xmit,
> which in our case is the configured queuing discipline. This way, the
> user should be able to influence the behavior w.r.t dropping packets,
> by picking the desired queuing discipline.
>
> Although it should technically be a good approach, from what
> I have tested, stopping the queue prior to returning NETDEV_TX_BUSY seems
> to add some extra overhead, and degrade the control-plane performance
> a bit.
>
> Signed-off-by: Tudor Cornea <tudor.cor...@gmail.com>

NAK

Doing this risks having a CPU lockup if userspace does not keep up
or the DPDK application gets stuck.

There are better ways to solve the TCP stack queue overrun issue:
1. Use a better queueing discipline on the kni device. The Linux default
   of pfifo_fast has bufferbloat issues. Use fq_codel, fq, codel or pie?
2. KNI should implement BQL so that TCP stack can see lock backpressure
   about possible queue depth.

As a simple workaround, increase the KNI ring size. It won't solve the whole
problem, but it can help.
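
To illustrate the kind of byte-based backpressure that point 2 refers to,
here is a rough user-space toy model. It is only a sketch: the names
(toy_bql, toy_bql_sent, toy_bql_completed) and the fixed byte limit are
made up for illustration and are not part of KNI or the kernel. In a real
driver the accounting would go through the kernel's BQL helpers
(netdev_sent_queue() on transmit, netdev_completed_queue() on completion),
and the limit is adjusted dynamically rather than fixed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model of byte-based queue limiting. NOT the kernel API:
 * every name here is hypothetical and exists only for illustration. */
struct toy_bql {
    size_t limit;    /* max bytes allowed in flight (assumed fixed here) */
    size_t inflight; /* bytes placed on the ring, not yet consumed */
    bool stopped;    /* models a stopped TX queue */
};

static void toy_bql_init(struct toy_bql *q, size_t limit)
{
    q->limit = limit;
    q->inflight = 0;
    q->stopped = false;
}

/* Producer side: account for a packet placed on the ring.
 * Returns false if the queue is stopped, in which case nothing is
 * accounted and the caller keeps the packet queued upstream. */
static bool toy_bql_sent(struct toy_bql *q, size_t bytes)
{
    if (q->stopped)
        return false;

    q->inflight += bytes;
    if (q->inflight >= q->limit)
        q->stopped = true; /* stop before the ring can overflow */
    return true;
}

/* Consumer side: the application has drained 'bytes' from the ring. */
static void toy_bql_completed(struct toy_bql *q, size_t bytes)
{
    assert(bytes <= q->inflight);

    q->inflight -= bytes;
    if (q->stopped && q->inflight < q->limit)
        q->stopped = false; /* wake the queue once there is room again */
}
```

The point of the model is that the sender is told to stop before the ring
is full, so packets wait in the qdisc (where fq_codel or similar can manage
them) instead of being dropped or busy-retried in the driver.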