From: Maxim Mikityanskiy <maxi...@mellanox.com>

tls_device_offload_cleanup_rx doesn't clear tls_ctx->netdev after calling
tls_dev_del if TLS TX offload is also enabled. Clearing tls_ctx->netdev
gets postponed until tls_device_gc_task. This leaves a time frame in which
tls_device_down may get called and call tls_dev_del for RX one extra
time, confusing the driver, which may lead to a crash.

This patch corrects this racy behavior by adding a flag to prevent
tls_device_down from calling tls_dev_del the second time.

Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure")
Signed-off-by: Maxim Mikityanskiy <maxi...@mellanox.com>
Signed-off-by: Saeed Mahameed <sae...@nvidia.com>
---
For -stable: 5.3

 include/net/tls.h    | 1 +
 net/tls/tls_device.c | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/net/tls.h b/include/net/tls.h
index baf1e99d8193..a0deddfde412 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -199,6 +199,7 @@ enum tls_context_flags {
 	 * to be atomic.
 	 */
 	TLS_TX_SYNC_SCHED = 1,
+	TLS_RX_DEV_RELEASED = 2,
 };
 
 struct cipher_context {
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index cec86229a6a0..b2261caac6be 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -1241,6 +1241,7 @@ void tls_device_offload_cleanup_rx(struct sock *sk)
 
 	netdev->tlsdev_ops->tls_dev_del(netdev, tls_ctx,
 					TLS_OFFLOAD_CTX_DIR_RX);
+	set_bit(TLS_RX_DEV_RELEASED, &tls_ctx->flags);
 
 	if (tls_ctx->tx_conf != TLS_HW) {
 		dev_put(netdev);
@@ -1274,7 +1275,7 @@ static int tls_device_down(struct net_device *netdev)
 		if (ctx->tx_conf == TLS_HW)
 			netdev->tlsdev_ops->tls_dev_del(netdev, ctx,
 							TLS_OFFLOAD_CTX_DIR_TX);
-		if (ctx->rx_conf == TLS_HW)
+		if (ctx->rx_conf == TLS_HW && !test_bit(TLS_RX_DEV_RELEASED, &ctx->flags))
 			netdev->tlsdev_ops->tls_dev_del(netdev, ctx,
 							TLS_OFFLOAD_CTX_DIR_RX);
 		WRITE_ONCE(ctx->netdev, NULL);
-- 
2.26.2