Hi,

On Tue, 2018-06-05 at 08:35 -0700, Tom Herbert wrote:
> On Tue, Jun 5, 2018 at 7:53 AM, David Miller <da...@davemloft.net> wrote:
> > From: Paolo Abeni <pab...@redhat.com>
> > Date: Tue, 5 Jun 2018 12:32:33 +0200
> >
> >> @@ -1157,7 +1158,9 @@ static int kcm_recvmsg(struct socket *sock, struct
> >> msghdr *msg,
> >>  /* Finished with message */
> >>  msg->msg_flags |= MSG_EOR;
> >>  KCM_STATS_INCR(kcm->stats.rx_msgs);
> >> + spin_lock_bh(&kcm->mux->rx_lock);
> >>  skb_unlink(skb, &sk->sk_receive_queue);
> >> + spin_unlock_bh(&kcm->mux->rx_lock);
> >
> > Hmmm, maybe I don't understand the corruption.
> >
> > But, skb_unlink() takes the sk->sk_receive_queue.lock which should
> > prevent SKB list corruption.
>
> It looks like there is a case where the list is being manipulated
> without the queue lock. That is in requeue_rx_msgs where
> __skb_dequeue is being called instead of skb_dequeue which is in
> requeue_rx_msgs. requeue_rx_msgs holds the mux rx_lock which would
> explain why the suggested patch avoids the issue.

Yep, I believe this is the correct explanation. Sorry for the noise with
the previous patch; I overlooked the skb_queue lock already in place.

> Paolo, thanks for looking into this! Can you try replacing
> __skb_dequeue in requeue_rx_msgs with skb_dequeue to see if that is
> the fix.

Sure, I'll retrigger the test and report the result here (or directly
send a new patch, should the test be successful).

Thanks,

Paolo