On Fri, 17 Dec 2021 17:00:32 +0200
Tudor Cornea <tudor.cor...@gmail.com> wrote:

> This patch attempts to avoid dropping packets that are to be
> transmitted, in case there is no space in the KNI ring.
> 
> We have a use case in which we leverage the Linux TCP / IP stack for
> control plane, and some protocols might be sensitive to packet drops.
> 
> This might mean that the sender (Kernel) might be moving at a faster pace
> than the receiver end (DPDK application), or it might have some brief
> moments of bursty traffic patterns.
> 
> Requeuing the packets could add a kind of backpressure until a transmit
> window is available to us.
> 
> The burden of retransmitting is shifted to the caller of ndo_start_xmit,
> which in our case is the configured queuing discipline. This way, the
> user should be able to influence the behavior w.r.t dropping packets,
> by picking the desired queuing discipline.
> 
> Although it should technically be a good approach, from what
> I have tested, stopping the queue prior to returning NETDEV_TX_BUSY seems
> to add some extra overhead, and degrade the control-plane performance
> a bit.
> 
> Signed-off-by: Tudor Cornea <tudor.cor...@gmail.com>

NAK
Doing this risks having a CPU lockup if userspace does not keep up
or the DPDK application gets stuck.

There are better ways to solve the TCP stack queue overrun issue:
1. Use a better queueing discipline on the kni device. The Linux default
   of pfifo_fast has bufferbloat issues. Use fq_codel, fq, codel or pie?
2. KNI should implement BQL (Byte Queue Limits) so that the TCP stack can
   see backpressure about possible queue depth.
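
To make point 2 concrete, here is a rough user-space model of the kind of
byte accounting BQL does. It is only a sketch: the toy_txq names are made up
for illustration, the limit is fixed (real BQL adapts its limit dynamically),
and it is not KNI code. In a driver, the two hooks would sit roughly where
netdev_sent_queue() and netdev_completed_queue() are called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of byte-based queue limiting. All names are hypothetical. */
struct toy_txq {
    size_t limit;    /* max bytes allowed in flight (fixed here, an assumption) */
    size_t inflight; /* bytes placed on the ring but not yet consumed */
    bool stopped;    /* mirrors a stopped netdev TX queue */
};

void toy_txq_init(struct toy_txq *q, size_t limit)
{
    q->limit = limit;
    q->inflight = 0;
    q->stopped = false;
}

/* Producer side: call after a packet of `bytes` was placed on the ring.
 * Once the in-flight byte count reaches the limit, the queue is marked
 * stopped so the stack holds further packets in the qdisc instead. */
void toy_txq_sent(struct toy_txq *q, size_t bytes)
{
    q->inflight += bytes;
    if (q->inflight >= q->limit)
        q->stopped = true;
}

/* Consumer side: call after the receiver has drained `bytes` from the ring.
 * Returns true if this completion dropped below the limit and woke the queue. */
bool toy_txq_completed(struct toy_txq *q, size_t bytes)
{
    assert(bytes <= q->inflight);
    q->inflight -= bytes;
    if (q->stopped && q->inflight < q->limit) {
        q->stopped = false;
        return true;
    }
    return false;
}
```

The point is that the queue is stopped before the ring overflows and is woken
by the consumer making progress, so the sender never has to spin on
NETDEV_TX_BUSY.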

As a simple workaround, increase the KNI ring size. It won't solve the whole
problem, but it can help.
