On Thu, 17 Oct 2019 16:48:25 -0700, Jakub Kicinski wrote:
> > The only patch that we have been able to make consistently work
> > without crashing and also without compromising performance, is the
> > previously submitted one where later thread bails out of
> > tls_tx_records. And as mentioned, it can perhaps be made more
> > efficient by rescheduling delayed work in the case where work handler
> > thread turns out to be the later thread that has to bail.
>
> Let me try to find a way to repro this reliably without any funky
> accelerators. The sleep in do_tcp_sendpages() should affect all cases.
> I should have some time today and tomorrow to look into this, bear with
> me..

Could you please try this?

---->8-----

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index c2b5e0d2ba1a..ab7b0af162a7 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1204,12 +1204,10 @@ static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
 			goto alloc_payload;
 	}
 
-	if (num_async) {
-		/* Transmit if any encryptions have completed */
-		if (test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask)) {
-			cancel_delayed_work(&ctx->tx_work.work);
-			tls_tx_records(sk, flags);
-		}
+	/* Transmit if any encryptions have completed */
+	if (test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask)) {
+		cancel_delayed_work(&ctx->tx_work.work);
+		tls_tx_records(sk, flags);
 	}
 sendpage_end:
 	ret = sk_stream_error(sk, flags, ret);
@@ -2171,7 +2169,8 @@ static void tx_work_handler(struct work_struct *work)
 
 	if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
 		return;
 	lock_sock(sk);
-	tls_tx_records(sk, -1);
+	if (!sk->sk_write_pending)
+		tls_tx_records(sk, -1);
 	release_sock(sk);
 }
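
For anyone who wants to reason about the two hunks without a kernel tree at hand, here is a rough userspace model of the intended control flow. It is only a sketch: struct tx_model and the model_* functions are hypothetical names made up for illustration, a C11 atomic_bool stands in for BIT_TX_SCHEDULED, a plain int stands in for sk->sk_write_pending, and a counter stands in for calls to tls_tx_records(). Socket locking and cancel_delayed_work() are deliberately left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the TLS TX state; not kernel code. */
struct tx_model {
	atomic_bool tx_scheduled;	/* models BIT_TX_SCHEDULED */
	int write_pending;		/* models sk->sk_write_pending */
	int records_sent;		/* counts would-be tls_tx_records() calls */
};

/* Models test_and_clear_bit(): atomically clear the flag, return old value. */
static bool model_test_and_clear(atomic_bool *bit)
{
	return atomic_exchange(bit, false);
}

/*
 * Models the tail of tls_sw_do_sendpage() after the patch: the flush is
 * attempted whether or not any async encryptions were started.
 */
static void model_sendpage_tail(struct tx_model *m)
{
	if (model_test_and_clear(&m->tx_scheduled))
		m->records_sent++;
}

/*
 * Models tx_work_handler() after the patch: the handler still consumes
 * the scheduled flag, but leaves transmission to the writer when one is
 * already pending on the socket.
 */
static void model_work_handler(struct tx_model *m)
{
	if (!model_test_and_clear(&m->tx_scheduled))
		return;
	if (!m->write_pending)
		m->records_sent++;
}
```

In this model, calling model_work_handler() with write_pending set clears the flag without bumping records_sent, which mirrors the handler backing off in favour of the writer. Whether the writer then always picks the records up is exactly the part that needs testing on real hardware.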