My recent patch had at least two problems:

1) TX zerocopy wants notification when skb is acknowledged,
   thus we need to call skb_zcopy_clear() if the skb is
   cached into sk->sk_tx_skb_cache

2) Some applications might expect precise EPOLLOUT notifications,
   so we need to update sk->sk_wmem_queued and call
   sk_mem_uncharge() from sk_wmem_free_skb() in all cases.
   The SOCK_QUEUE_SHRUNK flag must also be set.

Fixes: 472c2e07eef0 ("tcp: add one skb cache for tx")
Signed-off-by: Eric Dumazet <eduma...@google.com>
Cc: Willem de Bruijn <will...@google.com>
Cc: Soheil Hassas Yeganeh <soh...@google.com>
---
 include/net/sock.h | 9 +++++----
 net/ipv4/tcp.c     | 2 --
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 577d91fb56267371c6bc5ae65f7454deba726bd6..7fa2232785226bcafd46b230559964fd16f3c4f4 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1465,13 +1465,14 @@ static inline void sk_mem_uncharge(struct sock *sk, int size)
 
 static inline void sk_wmem_free_skb(struct sock *sk, struct sk_buff *skb)
 {
-	if (!sk->sk_tx_skb_cache) {
-		sk->sk_tx_skb_cache = skb;
-		return;
-	}
 	sock_set_flag(sk, SOCK_QUEUE_SHRUNK);
 	sk->sk_wmem_queued -= skb->truesize;
 	sk_mem_uncharge(sk, skb->truesize);
+	if (!sk->sk_tx_skb_cache) {
+		skb_zcopy_clear(skb, true);
+		sk->sk_tx_skb_cache = skb;
+		return;
+	}
 	__kfree_skb(skb);
 }
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 29b94edf05f9357d3a33744d677827ce624738ae..77fa3d26fb88c42a0b3152d4408081083c893b2a 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -871,8 +871,6 @@ struct sk_buff *sk_stream_alloc_skb(struct sock *sk, int size, gfp_t gfp,
 
 		fclones = container_of(skb, struct sk_buff_fclones, skb1);
 		if (refcount_read(&fclones->fclone_ref) == 1) {
-			sk->sk_wmem_queued -= skb->truesize;
-			sk_mem_uncharge(sk, skb->truesize);
 			skb->truesize -= skb->data_len;
 			sk->sk_tx_skb_cache = NULL;
 			pskb_trim(skb, 0);
-- 
2.21.0.392.gf8f6787159e-goog