When the receiver does not accept TCP Fast Open it will only ack
the SYN, and not the data. We detect this and immediately queue
the data for (re)transmission in tcp_rcv_fastopen_synack().

In DC networks with very low RTT and without RFS the SYN-ACK
may arrive before the NIC driver has reported Tx completion on
the original SYN. In that case skb_still_in_host_queue()
returns true and the sender has to wait for the retransmission
timer to fire milliseconds later.

Revert to non-fast clone skbs so that skb_still_in_host_queue()
won't prevent the recovery flow from completing.
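
For illustration only, a simplified user-space model of why the
allocation type matters. It assumes the host-queue check is driven by
fast-clone state (in the kernel this is done via skb_fclone_busy());
all model_* names are hypothetical and this is not kernel code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the skb allocation type. */
enum model_fclone {
	MODEL_FCLONE_UNAVAILABLE,	/* plain alloc_skb()-style allocation */
	MODEL_FCLONE_ORIG,		/* original skb of a fast-clone pair */
};

/* Hypothetical, heavily reduced stand-in for struct sk_buff. */
struct model_skb {
	enum model_fclone fclone;	/* how the skb was allocated */
	int fclone_ref;			/* > 1 while the companion clone is still held */
	const void *clone_owner;	/* socket owning the companion clone */
};

/*
 * Only a fast-clone original whose clone is still referenced on behalf
 * of the same socket is treated as "still in the host queue".
 */
static bool model_still_in_host_queue(const void *sk,
				      const struct model_skb *skb)
{
	return skb->fclone == MODEL_FCLONE_ORIG &&
	       skb->fclone_ref > 1 &&
	       skb->clone_owner == sk;
}

/* True if the data can be retransmitted as soon as the SYN-ACK arrives. */
static bool model_can_retransmit_now(const void *sk,
				     const struct model_skb *skb)
{
	return !model_still_in_host_queue(sk, skb);
}
```

In this model a non-fast clone skb never reports as busy, so the
retransmission is not deferred to the timer even if the Tx completion
for the original SYN is still pending.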

Suggested-by: Eric Dumazet <eduma...@google.com>
Fixes: 355a901e6cf1 ("tcp: make connect() mem charging friendly")
Signed-off-by: Neil Spring <ntspr...@fb.com>
Signed-off-by: Jakub Kicinski <k...@kernel.org>
---
 net/ipv4/tcp_output.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index fbf140a770d8..cd9461588539 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3759,9 +3759,16 @@ static int tcp_send_syn_data(struct sock *sk, struct sk_buff *syn)
        /* limit to order-0 allocations */
        space = min_t(size_t, space, SKB_MAX_HEAD(MAX_TCP_HEADER));
 
-       syn_data = sk_stream_alloc_skb(sk, space, sk->sk_allocation, false);
+       syn_data = alloc_skb(MAX_TCP_HEADER + space, sk->sk_allocation);
        if (!syn_data)
                goto fallback;
+       if (!sk_wmem_schedule(sk, syn_data->truesize)) {
+               __kfree_skb(syn_data);
+               goto fallback;
+       }
+       skb_reserve(syn_data, MAX_TCP_HEADER);
+       INIT_LIST_HEAD(&syn_data->tcp_tsorted_anchor);
+
        syn_data->ip_summed = CHECKSUM_PARTIAL;
        memcpy(syn_data->cb, syn->cb, sizeof(syn->cb));
        if (space) {
-- 
2.26.2
