On 2020/11/3 0:55, Cong Wang wrote:
> On Fri, Oct 30, 2020 at 12:38 AM Yunsheng Lin <linyunsh...@huawei.com> wrote:
>>
>> On 2020/10/30 3:05, Cong Wang wrote:
>>>
>>> I do not see how and why it should. synchronize_net() is merely an optimized
>>> version of synchronize_rcu(), it should wait for RCU readers, softirqs are not
>>> necessarily RCU readers, net_tx_action() does not take RCU read lock either.
>>
>> Ok, make sense.
>>
>> Taking RCU read lock in net_tx_action() does not seems to solve the problem,
>> what about the time window between __netif_reschedule() and net_tx_action()?
>>
>> It seems we need to re-dereference the qdisc whenever RCU read lock is released
>> and qdisc is still in sd->output_queue or wait for the sd->output_queue to
>> drain?
>
> Not suggesting you to take RCU read lock. We already wait for TX action with
> a loop of sleep. To me, the only thing missing is just moving the reset after
> that wait.

__QDISC_STATE_SCHED is cleared before calling qdisc_run() in net_tx_action(),
so some_qdisc_is_busy() does not seem to wait fully for the TX action; at least,
the qdisc is still being accessed even if __QDISC_STATE_DEACTIVATED is set.

>
>
>>>>>> If we do any additional reset that is not related to qdisc in
>>>>>> dev_reset_queue(), we can move it after some_qdisc_is_busy() checking.
>>>>>
>>>>> I am not suggesting to do an additional reset, I am suggesting to move
>>>>> your reset after the busy waiting.
>>>>
>>>> There maybe a deadlock here if we reset the qdisc after the
>>>> some_qdisc_is_busy() checking, because some_qdisc_is_busy() may require
>>>> the qdisc reset to clear the skb, so that
>>>
>>> some_qdisc_is_busy() checks the status of qdisc, not the skb queue.
>>
>> Is there any reason why we do not check the skb queue in the qdisc?
>> It seems there may be skb left when netdev is deactivated, maybe at least warn
>> about that when there is still skb left when netdev is deactivated?
>> Is that why we call qdisc_reset() to clear the leftover skb in
>> qdisc_destroy()?
>>
>>>
>>>
>>>> some_qdisc_is_busy() can return false. I am not sure this is really a
>>>> problem, but sch_direct_xmit() may requeue the skb when
>>>> dev_hard_start_xmit return TX_BUSY.
>>>
>>> Sounds like another reason we should move the reset as late as possible?
>>
>> Why?
>
> You said "sch_direct_xmit() may requeue the skb", I agree. I assume you mean
> net_tx_action() calls sch_direct_xmit() which does the requeue then races with
> reset. No?
>

Looking at the current code again, I think there is no race between
sch_direct_xmit() in net_tx_action() and dev_reset_queue() in
dev_deactivate_many(), because qdisc_lock(qdisc) or qdisc->seqlock is held
when calling either sch_direct_xmit() or dev_reset_queue().
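
The locking argument above can be illustrated with a small userspace model.
This is only a sketch under stated assumptions: model_qdisc, model_init(),
model_trylock(), model_unlock(), model_requeue() and model_reset() are
hypothetical names, not kernel APIs, and a single atomic flag stands in for
qdisc_lock(qdisc) or qdisc->seqlock. It shows only that a reset which takes
the same lock cannot interleave with a requeue done under that lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_qdisc {
	atomic_bool locked;	/* stands in for qdisc_lock() or qdisc->seqlock */
	int qlen;		/* number of skbs queued in the model */
};

static void model_init(struct model_qdisc *q)
{
	atomic_init(&q->locked, false);
	q->qlen = 0;
}

/* Returns true if the caller acquired the lock. */
static bool model_trylock(struct model_qdisc *q)
{
	/* atomic_exchange() returns the old value: false means it was free */
	return !atomic_exchange(&q->locked, true);
}

static void model_unlock(struct model_qdisc *q)
{
	atomic_store(&q->locked, false);
}

/* Models the requeue in the xmit path; the caller must hold the lock. */
static void model_requeue(struct model_qdisc *q)
{
	assert(atomic_load(&q->locked));
	q->qlen++;
}

/* Models the reset path: the queue is only touched with the lock held. */
static bool model_reset(struct model_qdisc *q)
{
	if (!model_trylock(q))
		return false;	/* xmit path holds the lock, reset must wait */
	q->qlen = 0;
	model_unlock(q);
	return true;
}
```

In this model, a reset attempted while the xmit path holds the lock fails and
leaves the requeued skb in place; it only succeeds after the lock is released.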
>
>>
>> There current netdev down order is mainly below:
>>
>> netif_tx_stop_all_queues()
>>
>> dev_deactivate_queue()
>>
>> synchronize_net()
>>
>> dev_reset_queue()
>>
>> some_qdisc_is_busy()
>>
>>
>> You suggest to change it to below order, right?
>>
>> netif_tx_stop_all_queues()
>>
>> dev_deactivate_queue()
>>
>> synchronize_net()
>>
>> some_qdisc_is_busy()
>>
>> dev_reset_queue()
>
> Yes.
>
>>
>>
>> What is the semantics of some_qdisc_is_busy()?
>
> Waiting for flying TX action.

It waits for __QDISC_STATE_SCHED to clear and for the qdisc run to finish, but
there is still a time window between clearing __QDISC_STATE_SCHED and the
qdisc actually running, right?

>
>> From my understanding, we can do anything about the old qdisc (including
>> destroying the old qdisc) after some_qdisc_is_busy() return false.
>
> But the current code does the reset _before_ some_qdisc_is_busy(). ;)

If the lock is taken when doing the reset, it does not matter whether the
reset is done before some_qdisc_is_busy(), right?

>
> Thanks.
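
To make the time window mentioned above concrete, here is a minimal
single-threaded sketch of the states the TX path goes through as described in
this thread: the SCHED bit is cleared first, and only then does the qdisc
start running. All names (win_qdisc, win_apply(), win_is_busy(), the tx_step
values) are hypothetical and exist only for this model; the busy check simply
mirrors the "scheduled or running" test discussed here and is not the
kernel's some_qdisc_is_busy().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the two bits discussed in the thread. */
struct win_qdisc {
	bool sched;	/* stands in for __QDISC_STATE_SCHED */
	bool running;	/* stands in for "qdisc is running" */
};

/* Steps of the TX path in the order described: clear SCHED, then run. */
enum tx_step { TX_QUEUED, TX_SCHED_CLEARED, TX_RUNNING, TX_DONE };

static void win_apply(struct win_qdisc *q, enum tx_step step)
{
	switch (step) {
	case TX_QUEUED:		/* qdisc put on the output queue */
		q->sched = true;
		q->running = false;
		break;
	case TX_SCHED_CLEARED:	/* bit cleared, run not yet started */
		q->sched = false;
		q->running = false;
		break;
	case TX_RUNNING:	/* qdisc run in progress */
		q->sched = false;
		q->running = true;
		break;
	case TX_DONE:		/* run finished */
		q->sched = false;
		q->running = false;
		break;
	default:
		assert(0 && "unknown step");
	}
}

/* Models a busy check that only looks at the two bits above. */
static bool win_is_busy(const struct win_qdisc *q)
{
	return q->sched || q->running;
}
```

In this model the check reports "not busy" in the TX_SCHED_CLEARED state even
though a run is about to start, which is the window in question. Whether that
state is observable in the real code depends on which locks are held around
the clear and the run.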