On Thu, 17 Oct 2019 16:48:25 -0700, Jakub Kicinski wrote:
> > The only patch that we have been able to make consistently work
> > without crashing and also without compromising performance, is the
> > previously submitted one where later thread bails out of
> > tls_tx_records. And as mentioned, it can perhaps be made more
> > efficient by rescheduling delayed work in the case where work handler
> > thread turns out to be the later thread that has to bail.  
> 
> Let me try to find a way to repro this reliably without any funky
> accelerators. The sleep in do_tcp_sendpages() should affect all cases.
> I should have some time today and tomorrow to look into this, bear with
> me..

Could you please try this?

---->8-----

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index c2b5e0d2ba1a..ab7b0af162a7 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1204,12 +1204,10 @@ static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
                goto alloc_payload;
        }
 
-       if (num_async) {
-               /* Transmit if any encryptions have completed */
-               if (test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask)) {
-                       cancel_delayed_work(&ctx->tx_work.work);
-                       tls_tx_records(sk, flags);
-               }
+       /* Transmit if any encryptions have completed */
+       if (test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask)) {
+               cancel_delayed_work(&ctx->tx_work.work);
+               tls_tx_records(sk, flags);
        }
 sendpage_end:
        ret = sk_stream_error(sk, flags, ret);
@@ -2171,7 +2169,8 @@ static void tx_work_handler(struct work_struct *work)
        if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
                return;
        lock_sock(sk);
-       tls_tx_records(sk, -1);
+       if (!sk->sk_write_pending)
+               tls_tx_records(sk, -1);
        release_sock(sk);
 }
 

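For anyone who wants to reason about the gating outside the kernel, here is a rough userspace sketch of the idea, not the kernel code itself. It assumes C11 atomics as a stand-in for test_and_clear_bit(), and the names tx_ctx, claim_tx, schedule_tx, sendpage_path and work_handler_path are hypothetical, made up for illustration. Locking, the delayed work itself, and what happens to records the work handler skips are not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for BIT_TX_SCHEDULED in ctx->tx_bitmask. */
#define TX_SCHEDULED_MASK 1UL

/* Hypothetical, heavily reduced model of the TX context. */
struct tx_ctx {
	atomic_ulong tx_bitmask;  /* bit 0 set: a transmit has been scheduled */
	int write_pending;        /* stand-in for sk->sk_write_pending */
	int records_sent;         /* number of times the transmit step ran */
};

static void tx_ctx_init(struct tx_ctx *ctx)
{
	assert(ctx != NULL);
	atomic_init(&ctx->tx_bitmask, 0UL);
	ctx->write_pending = 0;
	ctx->records_sent = 0;
}

/* Mark a transmit as scheduled (models set_bit). */
static void schedule_tx(struct tx_ctx *ctx)
{
	atomic_fetch_or(&ctx->tx_bitmask, TX_SCHEDULED_MASK);
}

/* Atomically clear the bit and report whether it was set
 * (models test_and_clear_bit). Only one caller can win. */
static bool claim_tx(struct tx_ctx *ctx)
{
	unsigned long old = atomic_fetch_and(&ctx->tx_bitmask,
					     ~TX_SCHEDULED_MASK);
	return (old & TX_SCHEDULED_MASK) != 0;
}

/* Sender path: transmit whenever it wins the bit, with no
 * num_async precondition. Returns true if it transmitted. */
static bool sendpage_path(struct tx_ctx *ctx)
{
	if (!claim_tx(ctx))
		return false;
	ctx->records_sent++;      /* models tls_tx_records(sk, flags) */
	return true;
}

/* Work handler path: still claims the bit, but skips the
 * transmit while a writer is pending on the socket. */
static bool work_handler_path(struct tx_ctx *ctx)
{
	if (!claim_tx(ctx))
		return false;
	if (ctx->write_pending)
		return false;
	ctx->records_sent++;      /* models tls_tx_records(sk, -1) */
	return true;
}
```

In this model, whichever path calls claim_tx() first after schedule_tx() is the only one that can transmit, and the work handler backs off entirely when write_pending is set. Note that the handler has already cleared the bit by the time it backs off; whether the pending writer then picks up those records depends on kernel logic that this sketch does not cover.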