On Tue, Sep 18, 2018 at 2:56 PM Song Liu <songliubrav...@fb.com> wrote:
>
> > On Sep 18, 2018, at 2:46 PM, Eric Dumazet <eduma...@google.com> wrote:
> >
> > On Tue, Sep 18, 2018 at 2:41 PM Song Liu <songliubrav...@fb.com> wrote:
> >>
> >> I would submit the patch if Eric prefer not to. :)
> >
> > Hmmm.... maybe the bug is really in ixgbe_netpoll()
> >
> > This whole ndo_poll_controller() stuff is crazy.
> >
> > All sane implementations should only call napi_schedule()
>
> Current implement is about identical to napi_schedule_irqoff(). Do
> we really need napi_schedule() instead?
>
> On the other hand, I think we should check napi_complete_done() in
> ixgbe_poll() anyway.

It seems the netpoll code is racy, since another CPU might be calling
ixgbe_poll() and return early from napi_complete_done():

	if (unlikely(n->state & (NAPIF_STATE_NPSVC |
				 NAPIF_STATE_IN_BUSY_POLL)))
		return false;

This is why netpoll-enabled drivers _must_ check the
napi_complete[_done]() return value, otherwise they might re-enable
IRQs when they should not.
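
To make the point above concrete, here is a rough userspace sketch of the pattern. It is not kernel code and not the actual ixgbe fix: every mock_* name and MOCK_* flag is a hypothetical stand-in, and the "IRQ enable" is just a boolean. It only illustrates that the poll routine should re-arm interrupts when the completion call returns true, and leave them alone otherwise.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for NAPIF_STATE_NPSVC / NAPIF_STATE_IN_BUSY_POLL. */
#define MOCK_STATE_NPSVC        (1u << 0)
#define MOCK_STATE_IN_BUSY_POLL (1u << 1)

/* Hypothetical, heavily reduced model of a NAPI context. */
struct mock_napi {
	unsigned int state;   /* who currently owns the poll */
	bool irq_enabled;     /* models the device IRQ being re-armed */
};

/*
 * Models the early return quoted above: if netpoll or busy polling owns
 * this context, completion is refused and false is returned.
 */
static bool mock_napi_complete_done(struct mock_napi *n, int work_done)
{
	(void)work_done;
	if (n->state & (MOCK_STATE_NPSVC | MOCK_STATE_IN_BUSY_POLL))
		return false;
	return true;
}

/*
 * Models a driver poll routine: IRQs are re-enabled only when the budget
 * was not exhausted AND completion actually succeeded.
 */
static int mock_poll(struct mock_napi *n, int work_done, int budget)
{
	if (work_done < budget && mock_napi_complete_done(n, work_done))
		n->irq_enabled = true;
	return work_done;
}
```

With state set to MOCK_STATE_NPSVC, mock_poll() leaves irq_enabled false. A driver that ignored the return value would re-arm the IRQ while the other CPU is still polling.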