On 25 January 2016 at 17:11, Joe Stringer <j...@ovn.org> wrote:
> On 22 January 2016 at 17:22, Eric Dumazet <eric.duma...@gmail.com> wrote:
>> On Fri, 2016-01-22 at 15:49 -0800, Joe Stringer wrote:
>>> Later parts of the stack (including fragmentation) expect that there is
>>> never a socket attached to frag in a frag_list, however this invariant
>>> was not enforced on all defrag paths. This could lead to the
>>> BUG_ON(skb->sk) during ip_do_fragment(), as per the call stack at the
>>> end of this commit message.
>>>
>>> While the call could be added to openvswitch to fix this particular
>>> error, the head and tail of the frags list are already orphaned
>>> indirectly inside ip_defrag(), so it seems like the remaining fragments
>>> should all be orphaned in all circumstances.
>>
>>
>> Yes, it looks we have a problem, and even IP early demux apparently does
>> not check if incoming packet is a fragment.
>>
>> Your patch could also remove some socket leaks in this respect.
>>
>> I guess we also could add a safety check (ipv4 only, but ipv6 needs care
>> as well)
>>
>> diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
>> index b1209b63381f..99513c829213 100644
>> --- a/net/ipv4/ip_input.c
>> +++ b/net/ipv4/ip_input.c
>> @@ -316,7 +316,9 @@ static int ip_rcv_finish(struct net *net, struct sock
>> *sk, struct sk_buff *skb)
>>         const struct iphdr *iph = ip_hdr(skb);
>>         struct rtable *rt;
>>
>> -       if (sysctl_ip_early_demux && !skb_dst(skb) && !skb->sk) {
>> +       if (sysctl_ip_early_demux &&
>> +           !skb_dst(skb) && !skb->sk &&
>> +           !ip_is_fragment(iph)) {
>>                 const struct net_protocol *ipprot;
>>                 int protocol = iph->protocol;
>
> Thanks, I can roll this into a v2 (or keep as a separate patch?). I
> got sidetracked on the IPv6 side, some other issues are blocking me on
> that but I intend to continue following up there as well.

FWIW I confirmed that all frags in the frag list coming back from
nf_ct_frag6_gather() have skb->sk == NULL, so this bug is not present
on that path.
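
To make the invariant under discussion concrete, here is a rough userspace
sketch of "no fragment in a frag_list may have a socket attached". It is not
kernel code: struct toy_sock, struct toy_skb and the toy_* helpers are
hypothetical stand-ins invented for illustration, and toy_skb_orphan() only
loosely mirrors skb_orphan() by dropping a reference count rather than calling
a destructor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for struct sock and struct sk_buff. */
struct toy_sock {
    int refcnt;                 /* references held by attached buffers */
};

struct toy_skb {
    struct toy_sock *sk;        /* owning socket, or NULL once orphaned */
    struct toy_skb *next;       /* next fragment in the chain */
    struct toy_skb *frag_list;  /* head only: remaining fragments */
};

/* Detach the buffer from its socket, if any (loosely like skb_orphan()). */
static void toy_skb_orphan(struct toy_skb *skb)
{
    if (skb->sk) {
        skb->sk->refcnt--;
        skb->sk = NULL;
    }
}

/* Orphan the head and every fragment chained off its frag_list. */
static void toy_orphan_frags(struct toy_skb *head)
{
    assert(head != NULL);
    toy_skb_orphan(head);
    for (struct toy_skb *f = head->frag_list; f; f = f->next)
        toy_skb_orphan(f);
}

/* Check the invariant: no fragment in the frag_list has a socket. */
static bool toy_frags_orphaned(const struct toy_skb *head)
{
    assert(head != NULL);
    for (const struct toy_skb *f = head->frag_list; f; f = f->next)
        if (f->sk)
            return false;
    return true;
}
```

A check along the lines of toy_frags_orphaned() is what I mean by
"all frags have skb->sk == NULL" above; toy_orphan_frags() shows the
"orphan everything, not just head and tail" idea from the original patch
description.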