A few comments on this one:

- obviously this is fairly serious, as it can let corrupted data all the way up to the application
- I am not nuts about the patch itself, the code feels a bit cluttered, but it's the least invasive way I could think of. Probably some refactoring is needed at some point.
- here's how to easily reproduce it:
On the sender, set up artificial packet corruption:

#!/bin/sh
tc qdisc add dev eth0 root handle 1: prio
tc qdisc add dev eth0 parent 1:3 netem corrupt 50.0%
tc filter add dev eth0 protocol ip parent 1:0 prio 3 u32 match ip dst $DSTADDR flowid 10:3

Then, run the following on the sender:

while :; do
        dd if=/dev/zero of=/dev/stdout bs=4096 count=4 | nc $DSTADDR 10000
        sleep 1
done

..and this on the receiver:

uname -r
grep ^Tcp /proc/net/snmp
ttl=$((${SECONDS} + 300))
while [[ ${SECONDS} -lt ${ttl} ]]; do
        out="foo.$(date +%s)"
        nc -l 10000 > "${out}"
        md5=$(md5sum "${out}" | cut -d ' ' -f 1)
        echo -n "${md5}"
        if [[ "${md5}" != "ce338fe6899778aacfc28414f2d9498b" ]]; then
                echo " corrupted"
        else
                echo
        fi
done
grep ^Tcp /proc/net/snmp

[obviously, if you change the size / content, the md5 sum has to be changed here - see the note at the very end for a quick way to regenerate it]

This reproduces it fairly quickly (within 20 seconds) for us, on 4.14.x kernels. 4.17 kernels were verified to still have the issue.

Frank

On 6/11/18, 4:15 PM, "Frank van der Linden" <fllin...@amazon.com> wrote:

commit 079096f103fa ("tcp/dccp: install syn_recv requests into ehash
table") introduced an optimization for the handling of child sockets
created for a new TCP connection.

But this optimization passes any data associated with the last ACK of
the connection handshake up the stack without verifying its checksum,
because it calls tcp_child_process(), which in turn calls
tcp_rcv_state_process() directly. These lower-level processing
functions do not do any checksum verification.

Insert a tcp_checksum_complete call in the TCP_NEW_SYN_RECEIVE path to
fix this.

Signed-off-by: Frank van der Linden <fllin...@amazon.com>
---
 net/ipv4/tcp_ipv4.c | 8 +++++++-
 net/ipv6/tcp_ipv6.c | 8 +++++++-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index f70586b50838..1ec4c0d4aba5 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1703,7 +1703,13 @@ int tcp_v4_rcv(struct sk_buff *skb)
                         th = (const struct tcphdr *)skb->data;
                         iph = ip_hdr(skb);
                         tcp_v4_fill_cb(skb, iph, th);
-                        nsk = tcp_check_req(sk, skb, req, false, &req_stolen);
+
+                        if (tcp_checksum_complete(skb)) {
+                                __TCP_INC_STATS(net, TCP_MIB_CSUMERRORS);
+                        } else {
+                                nsk = tcp_check_req(sk, skb, req, false,
+                                                    &req_stolen);
+                        }
                 }
                 if (!nsk) {
                         reqsk_put(req);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 6d664d83cd16..a12b694d3d1e 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1486,7 +1486,13 @@ static int tcp_v6_rcv(struct sk_buff *skb)
                         th = (const struct tcphdr *)skb->data;
                         hdr = ipv6_hdr(skb);
                         tcp_v6_fill_cb(skb, hdr, th);
-                        nsk = tcp_check_req(sk, skb, req, false, &req_stolen);
+
+                        if (tcp_checksum_complete(skb)) {
+                                __TCP_INC_STATS(net, TCP_MIB_CSUMERRORS);
+                        } else {
+                                nsk = tcp_check_req(sk, skb, req, false,
+                                                    &req_stolen);
+                        }
                 }
                 if (!nsk) {
                         reqsk_put(req);
--
2.14.4
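
A note on the reproducer, in case it is useful: the expected md5 sum is just the checksum of the transferred payload, so if the size or content of the dd transfer is changed, the reference value can be regenerated locally with something along these lines (this assumes the 4 x 4096 bytes of zeroes used above):

# regenerate the expected sum for the unmodified reproducer payload
dd if=/dev/zero bs=4096 count=4 2>/dev/null | md5sum | cut -d ' ' -f 1

With the patch applied, a corrupted data segment riding on the final ACK should be dropped and counted as a checksum error, so it should show up in the InCsumErrors field of the Tcp line in /proc/net/snmp (presumably what the grep before and after the receive loop is there to compare) instead of ending up in the received file.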