From: Neil Horman <nhor...@tuxdriver.com>
Date: Tue, 25 Jun 2019 17:57:49 -0400
> When an application is run that:
> a) Sets its scheduler to be SCHED_FIFO
> and
> b) Opens a memory mapped AF_PACKET socket, and sends frames with the
> MSG_DONTWAIT flag cleared, it's possible for the application to hang
> forever in the kernel. This occurs because when waiting, the code in
> tpacket_snd calls schedule, which under normal circumstances allows
> other tasks to run, including ksoftirqd, which in some cases is
> responsible for freeing the transmitted skb (which in AF_PACKET calls a
> destructor that flips the status bit of the transmitted frame back to
> available, allowing the transmitting task to complete).
>
> However, when the calling application is SCHED_FIFO, its priority is
> such that the schedule call immediately places the task back on the cpu,
> preventing ksoftirqd from freeing the skb, which in turn prevents the
> transmitting task from detecting that the transmission is complete.
>
> We can fix this by converting the schedule call to a completion
> mechanism. By using a completion queue, we force the calling task, when
> it detects there are no more frames to send, to schedule itself off the
> cpu until such time as the last transmitted skb is freed, allowing
> forward progress to be made.
>
> Tested by myself and the reporter, with good results
>
> Applies to the net tree
>
> Signed-off-by: Neil Horman <nhor...@tuxdriver.com>
> Reported-by: Matteo Croce <mcr...@redhat.com>
> CC: "David S. Miller" <da...@davemloft.net>
> CC: Willem de Bruijn <willemdebruijn.ker...@gmail.com>
 ...

Applied and queued up for -stable.
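
For anyone reading the thread later, the wait-on-completion idea in the quoted description can be sketched in userspace C. This is only an analogy and not the patch itself: `struct completion` here is a hypothetical pthread-based stand-in for the kernel primitive, and `struct frame`, `free_frame` and `send_and_wait` are invented names for illustration. The point it tries to show is that the sender blocks until the side that frees the frame signals it, rather than repeatedly yielding and re-checking the status bit.

```c
/* Userspace analogy of the completion pattern; NOT the kernel patch. */
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct completion. */
struct completion {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            done;
};

static void completion_init(struct completion *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = false;
}

/* Called by the side that frees the frame (the skb destructor's role). */
static void completion_signal(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = true;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* Called by the sender: blocks until signalled, then re-arms. */
static void completion_wait(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->done)
        pthread_cond_wait(&c->cond, &c->lock);
    c->done = false;
    pthread_mutex_unlock(&c->lock);
}

/* Hypothetical ring frame: status 1 = in flight, 0 = available. */
struct frame {
    int                status;
    struct completion *sent;
};

/* Plays the part of ksoftirqd running the destructor. */
static void *free_frame(void *arg)
{
    struct frame *f = arg;

    f->status = 0;               /* flip the frame back to available */
    completion_signal(f->sent);  /* wake the blocked sender */
    return NULL;
}

/* Sender: marks the frame in flight, then sleeps until it is freed.
 * Returns the final frame status, or -1 if the helper thread failed. */
static int send_and_wait(struct frame *f)
{
    pthread_t softirq;

    f->status = 1;
    if (pthread_create(&softirq, NULL, free_frame, f) != 0)
        return -1;
    completion_wait(f->sent);    /* blocks instead of spinning */
    pthread_join(softirq, NULL);
    return f->status;
}
```

Because the waiter is actually blocked rather than runnable, a scheduler is free to run the thread that performs the free, which is the forward-progress property the description is after. Build with `-pthread`.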