On Wed, 21 Aug 2019 11:03:46 -0700, Jakub Kicinski wrote:
> On Tue, 20 Aug 2019 23:51:12 -0700, Jakub Kicinski wrote:
> > > If you have more details I can also spend some cycles looking into it.
> >
> > Awesome, I'll let you know what the details are as soon as I get them.
>
> Just a quick update on that.
>
> The test case is nginx running with ktls offload.
>
> The client (hurl or openssl client) requests a file of ~2M, but only
> 44K ever gets across (not even sure which side sees an error at this
> point, outputs are pretty quiet).

I had a look, it's this:

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 6848a8196711..8a05e4bf1c58 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -370,7 +370,8 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 
 	lock_sock(sk);
 
-	if (tls_complete_pending_work(sk, tls_ctx, msg->msg_flags, &timeo))
+	ret = tls_complete_pending_work(sk, tls_ctx, msg->msg_flags, &timeo);
+	if (ret)
 		goto send_end;
 
 	if (unlikely(msg->msg_controllen)) {

Which is commit 150085791afb ("net/tls: Fixed return value when
tls_complete_pending_work() fails").

I also tried to test what we described previously for sk_write_space
and it seems to work okay (although TBH I'm not sure my testing is 100%
here, I can't reliably trigger that race in the first place).