On Fri, 2015-06-05 at 17:46 -0700, Martin KaFai Lau wrote:
> The problem is caught by this WARN_ON(len > skb->len) in tcp_fragment():
>
> [<ffffffff810510ca>] warn_slowpath_null+0x1a/0x20
> [<ffffffff8160ec90>] tcp_fragment+0x2a0/0x2b0
> [<ffffffff81604e06>] tcp_mark_head_lost+0x196/0x230
> [<ffffffff8160585d>] tcp_update_scoreboard+0x4d/0x80
> [<ffffffff8160a9ac>] tcp_fastretrans_alert+0x6ac/0xa90
> [<ffffffff8160b834>] tcp_ack+0x9d4/0x10e0
> [<ffffffff8160c699>] tcp_rcv_established+0x309/0x7e0
>
> The WARN_ON pointed out that tcp_skb_pcount (i.e.
> TCP_SKB_CB(skb)->tcp_gso_segs) and skb->len is inconsistent.
>
> The WARN_ON stack goes away after setting net.ipv4.tcp_mtu_probing to 0.
>
> v2
> - Replace the skb slicing codes by the existing tcp_trim_head(),
>   suggested by Eric Dumazet.
>
> v1
> - Call tcp_set_skb_tso_segs() for all slicing cases.
>
> Signed-off-by: Martin KaFai Lau <ka...@fb.com>
> Reported-by: Grant Zhang <gzh...@fastly.com>
> Cc: Grant Zhang <gzh...@fastly.com>
> Cc: Eric Dumazet <eduma...@google.com>
> Cc: Neal Cardwell <ncardw...@google.com>
> Cc: Yuchung Cheng <ych...@google.com>
> ---
>  net/ipv4/tcp_output.c | 12 ++----------
>  1 file changed, 2 insertions(+), 10 deletions(-)
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index a369e8a..4ae4f0c 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1977,16 +1977,8 @@ static int tcp_mtu_probe(struct sock *sk)
>  		} else {
>  			TCP_SKB_CB(nskb)->tcp_flags |=
>  				TCP_SKB_CB(skb)->tcp_flags & ~(TCPHDR_FIN|TCPHDR_PSH);
> -			if (!skb_shinfo(skb)->nr_frags) {
> -				skb_pull(skb, copy);
> -				if (skb->ip_summed != CHECKSUM_PARTIAL)
> -					skb->csum = csum_partial(skb->data,
> -								 skb->len, 0);
> -			} else {
> -				__pskb_trim_head(skb, copy);
> -				tcp_set_skb_tso_segs(sk, skb, mss_now);
> -			}
> -			TCP_SKB_CB(skb)->seq += copy;
> +			tcp_skb_pcount_set(skb, 0);
> +			tcp_trim_head(sk, skb, copy);
>  		}
>
>  		len += copy;

I think the invariant should be that if a packet has never been sent,
its pcount is already 0.

(It is cleared in do_tcp_sendpages() and tcp_sendmsg(): it seems we
already hacked these functions in the past :( )

So we might need to track down the places where we violate this rule,
and then get rid of the tcp_skb_pcount_set(skb, 0) calls in
do_tcp_sendpages() and tcp_sendmsg().

Here, trimming a packet that was never sent (by definition) should not
need to force pcount to 0; it should already be 0.
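
To make the rule concrete, here is a minimal userspace sketch of the
invariant. It is a toy model, not kernel code: struct toy_skb and every
toy_* helper are hypothetical names invented for illustration, and the
model assumes pcount is only computed when a segment is first
transmitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an sk_buff: only the fields the invariant mentions. */
struct toy_skb {
	unsigned int len;	/* payload bytes still queued */
	unsigned int pcount;	/* models tcp_skb_pcount(); 0 = not computed */
	bool sent;		/* has this segment ever been transmitted? */
};

/* The proposed rule: a never-sent packet must have pcount == 0. */
static bool toy_invariant_holds(const struct toy_skb *skb)
{
	return skb->sent || skb->pcount == 0;
}

/* Number of MSS-sized segments needed for len bytes, rounded up. */
static unsigned int toy_segs(unsigned int len, unsigned int mss)
{
	return (len + mss - 1) / mss;
}

/*
 * Trim copy bytes from the head of a never-sent packet. If the rule
 * holds, there is no pcount to reset here: it is checked, not forced.
 */
static void toy_trim_unsent_head(struct toy_skb *skb, unsigned int copy)
{
	assert(!skb->sent);
	assert(toy_invariant_holds(skb));
	assert(copy <= skb->len);
	skb->len -= copy;
}

/* Assumption of this model: pcount is first computed at transmit time. */
static void toy_transmit(struct toy_skb *skb, unsigned int mss)
{
	skb->pcount = toy_segs(skb->len, mss);
	skb->sent = true;
}
```

In this model a caller trims first and transmits later, so len and
pcount cannot disagree for an unsent packet. Any path that leaves a
non-zero pcount on an unsent packet would trip the assert in
toy_trim_unsent_head(), which is the kind of place we would need to
track down.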