When the receiver does not accept TCP Fast Open, it ACKs only the SYN and not the data. We detect this and immediately queue the data for (re)transmission in tcp_rcv_fastopen_synack().

In DC networks with very low RTT and without RFS, the SYN-ACK may arrive
before the NIC driver has reported Tx completion on the original SYN. In
that case skb_still_in_host_queue() returns true and the sender has to
wait for the retransmission timer to fire milliseconds later.

Revert to non-fast clone skbs so that skb_still_in_host_queue() won't
prevent the recovery flow from completing.

Suggested-by: Eric Dumazet <eduma...@google.com>
Fixes: 355a901e6cf1 ("tcp: make connect() mem charging friendly")
Signed-off-by: Neil Spring <ntspr...@fb.com>
Signed-off-by: Jakub Kicinski <k...@kernel.org>
---
 net/ipv4/tcp_output.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index fbf140a770d8..cd9461588539 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3759,9 +3759,16 @@ static int tcp_send_syn_data(struct sock *sk, struct sk_buff *syn)
 	/* limit to order-0 allocations */
 	space = min_t(size_t, space, SKB_MAX_HEAD(MAX_TCP_HEADER));
 
-	syn_data = sk_stream_alloc_skb(sk, space, sk->sk_allocation, false);
+	syn_data = alloc_skb(MAX_TCP_HEADER + space, sk->sk_allocation);
 	if (!syn_data)
 		goto fallback;
+	if (!sk_wmem_schedule(sk, syn_data->truesize)) {
+		__kfree_skb(syn_data);
+		goto fallback;
+	}
+	skb_reserve(syn_data, MAX_TCP_HEADER);
+	INIT_LIST_HEAD(&syn_data->tcp_tsorted_anchor);
+	syn_data->ip_summed = CHECKSUM_PARTIAL;
 	memcpy(syn_data->cb, syn->cb, sizeof(syn->cb));
 	if (space) {
-- 
2.26.2
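
For readers unfamiliar with why a fast-clone skb can stall the recovery, here is a rough user-space sketch of the idea. It is not kernel code: struct toy_skb and every toy_* function are hypothetical stand-ins. The sketch assumes only that the host-queue check fires when an skb was allocated as a fast-clone original and its companion clone is still referenced by the qdisc or NIC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb; not the kernel's struct sk_buff. */
struct toy_skb {
	bool is_fclone_orig;	/* allocated with a companion fast clone? */
	int  clone_refs;	/* 1: only the original, 2: clone still in flight */
};

/* Create a toy skb, optionally as a fast-clone original. */
static struct toy_skb toy_skb_init(bool fclone)
{
	struct toy_skb skb = { .is_fclone_orig = fclone, .clone_refs = 1 };
	return skb;
}

/* Hand a clone to the (imaginary) qdisc/NIC. Only the fast-clone
 * variant tracks the clone through a shared reference count. */
static void toy_transmit(struct toy_skb *skb)
{
	assert(skb != NULL);
	if (skb->is_fclone_orig)
		skb->clone_refs = 2;
}

/* The driver reports Tx completion and the clone is released. */
static void toy_tx_completion(struct toy_skb *skb)
{
	assert(skb != NULL);
	if (skb->clone_refs > 1)
		skb->clone_refs--;
}

/* Simplified model of the host-queue check described above. */
static bool toy_still_in_host_queue(const struct toy_skb *skb)
{
	return skb->is_fclone_orig && skb->clone_refs > 1;
}

/* An immediate retransmit is possible only if the check does not fire. */
static bool toy_can_retransmit_now(const struct toy_skb *skb)
{
	return !toy_still_in_host_queue(skb);
}
```

In this model, a fast-clone skb that has been transmitted but not yet completed cannot be retransmitted immediately, while an skb allocated without a fast clone never trips the check. That is the behavior the patch relies on when it switches the SYN-data allocation to alloc_skb().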