On Mon, Aug 31, 2020 at 5:59 PM Yunsheng Lin <linyunsh...@huawei.com> wrote:
>
> Currently there is concurrent reset and enqueue operation for the
> same lockless qdisc when there is no lock to synchronize the
> q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in
> qdisc_deactivate() called by dev_deactivate_queue(), which may cause
> out-of-bounds access for priv->ring[] in hns3 driver if user has
> requested a smaller queue num when __dev_xmit_skb() still enqueue a
> skb with a larger queue_mapping after the corresponding qdisc is
> reset, and call hns3_nic_net_xmit() with that skb later.

Can you be more specific here? Which call path requests a smaller
tx queue num? If you mean netif_set_real_num_tx_queues(), clearly
we already have a synchronize_net() there.

>
> Avoid the above concurrent op by calling synchronize_rcu_tasks()
> after assigning new qdisc to dev_queue->qdisc and before calling
> qdisc_deactivate() to make sure skb with larger queue_mapping
> enqueued to old qdisc will always be reset when qdisc_deactivate()
> is called.

Like Eric said, it is not nice to call such a blocking function when
we have a large number of TX queues. Possibly we just need to add a
synchronize_net() as in netif_set_real_num_tx_queues(), if it is missing.

Thanks.