From: Sameeh Jubran <same...@amazon.com> There is a race condition that can occur when calling ena_down(). The ena_clean_tx_irq() - which is a part of the napi handler - function might wake up the tx queue when the queue is supposed to be down (during recovery or changing the size of the queues for example) This causes the ena_start_xmit() function to trigger and possibly try to access the destroyed queues.
The race is illustrated below: Flow A: Flow B(napi handler) ena_down() netif_carrier_off() netif_tx_disable() ena_clean_tx_irq() netif_tx_wake_queue() ena_napi_disable_all() ena_destroy_all_io_queues() After these flows the tx queue is active and ena_start_xmit() accesses the destroyed queue which leads to a kernel panic. fixes: 1738cd3ed342 (net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)) Signed-off-by: Sameeh Jubran <same...@amazon.com> --- drivers/net/ethernet/amazon/ena/ena_netdev.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c index 664e3ed97..d118ed4c5 100644 --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c @@ -823,7 +823,8 @@ static int ena_clean_tx_irq(struct ena_ring *tx_ring, u32 budget) above_thresh = ena_com_sq_have_enough_space(tx_ring->ena_com_io_sq, ENA_TX_WAKEUP_THRESH); - if (netif_tx_queue_stopped(txq) && above_thresh) { + if (netif_tx_queue_stopped(txq) && above_thresh && + test_bit(ENA_FLAG_DEV_UP, &tx_ring->adapter->flags)) { netif_tx_wake_queue(txq); u64_stats_update_begin(&tx_ring->syncp); tx_ring->tx_stats.queue_wakeup++; -- 2.17.1