On Fri, Jan 15, 2021 at 4:44 PM Wei Wang <wei...@google.com> wrote:
>
> On Fri, Jan 15, 2021 at 3:08 PM Alexander Duyck
> <alexander.du...@gmail.com> wrote:
> >
> > On Fri, Jan 15, 2021 at 1:54 PM Wei Wang <wei...@google.com> wrote:
> > >
> > > On Thu, Jan 14, 2021 at 7:34 PM Alexander Duyck
> > > <alexander.du...@gmail.com> wrote:
> > > >
> > > > On Thu, Jan 14, 2021 at 4:34 PM Wei Wang <wei...@google.com> wrote:
> > > > >
> > > > > This patch adds a new sysfs attribute to the network device class.
> > > > > Said attribute provides a per-device control to enable/disable the
> > > > > threaded mode for all the napi instances of the given network device.
> > > > >
> > > > > Co-developed-by: Paolo Abeni <pab...@redhat.com>
> > > > > Signed-off-by: Paolo Abeni <pab...@redhat.com>
> > > > > Co-developed-by: Hannes Frederic Sowa <han...@stressinduktion.org>
> > > > > Signed-off-by: Hannes Frederic Sowa <han...@stressinduktion.org>
> > > > > Co-developed-by: Felix Fietkau <n...@nbd.name>
> > > > > Signed-off-by: Felix Fietkau <n...@nbd.name>
> > > > > Signed-off-by: Wei Wang <wei...@google.com>
> > > > > ---
> > > > >  include/linux/netdevice.h |  2 ++
> > > > >  net/core/dev.c            | 28 +++++++++++++++++
> > > > >  net/core/net-sysfs.c      | 63 +++++++++++++++++++++++++++++++++++++++
> > > > >  3 files changed, 93 insertions(+)
> > > > >
> > > > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > > > > index c24ed232c746..11ae0c9b9350 100644
> > > > > --- a/include/linux/netdevice.h
> > > > > +++ b/include/linux/netdevice.h
> > > > > @@ -497,6 +497,8 @@ static inline bool napi_complete(struct napi_struct *n)
> > > > >  	return napi_complete_done(n, 0);
> > > > >  }
> > > > >
> > > > > +int dev_set_threaded(struct net_device *dev, bool threaded);
> > > > > +
> > > > >  /**
> > > > >   *	napi_disable - prevent NAPI from scheduling
> > > > >   *	@n: NAPI context
> > > > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > > > index edcfec1361e9..d5fb95316ea8 100644
> > > > > --- a/net/core/dev.c
> > > > > +++ b/net/core/dev.c
> > > > > @@ -6754,6 +6754,34 @@ static int napi_set_threaded(struct napi_struct *n, bool threaded)
> > > > >  	return err;
> > > > >  }
> > > > >
> > > > > +static void dev_disable_threaded_all(struct net_device *dev)
> > > > > +{
> > > > > +	struct napi_struct *napi;
> > > > > +
> > > > > +	list_for_each_entry(napi, &dev->napi_list, dev_list)
> > > > > +		napi_set_threaded(napi, false);
> > > > > +}
> > > > > +
> > > > > +int dev_set_threaded(struct net_device *dev, bool threaded)
> > > > > +{
> > > > > +	struct napi_struct *napi;
> > > > > +	int ret;
> > > > > +
> > > > > +	dev->threaded = threaded;
> > > > > +	list_for_each_entry(napi, &dev->napi_list, dev_list) {
> > > > > +		ret = napi_set_threaded(napi, threaded);
> > > > > +		if (ret) {
> > > > > +			/* Error occurred on one of the napi,
> > > > > +			 * reset threaded mode on all napi.
> > > > > +			 */
> > > > > +			dev_disable_threaded_all(dev);
> > > > > +			break;
> > > > > +		}
> > > > > +	}
> > > > > +
> > > > > +	return ret;
> > > >
> > > > This doesn't seem right. The NAPI instances can be active while this
> > > > is occurring, can they not? I would think at a minimum you need to go
> > > > through a napi_disable/napi_enable in order to toggle this value for
> > > > each NAPI instance. Otherwise aren't you at risk of a race where a
> > > > napi_schedule attempt tries to wake up the thread before it has been
> > > > allocated?
> > >
> > > Yes. The napi instance could be active when this occurs, and I think
> > > that is OK. That is because napi_set_threaded() only sets the
> > > NAPI_STATE_THREADED bit after successfully creating the kthread, and
> > > ___napi_schedule() only tries to wake up the kthread after testing
> > > that bit.
> >
> > But what do you have guaranteeing that the kthread pointer has been
> > written to memory? That is what I was getting at. Just because you
> > have written the value doesn't mean it is in memory yet, so you would
> > probably need an smp_mb__before_atomic() barrier call before you set
> > the bit.
>
> Noted. Will look into this.
>
> > Also, I am not sure it is entirely safe to have the instance polling
> > while you are doing this. That is why I am thinking that if the
> > instance is enabled, a napi_disable/napi_enable would be preferable.
>
> When the napi is actively being polled in threaded mode, we will keep
> rescheduling the kthread and calling __napi_poll() until
> NAPI_STATE_SCHED is cleared by napi_complete_done(). The next time
> napi_schedule() is called, we re-evaluate the NAPI_STATE_THREADED bit
> to decide whether to wake up the kthread or raise a softirq.
> For the other direction, if the napi is being handled in
> net_rx_action(), toggling the bit won't cause an immediate wake-up of
> the kthread; the switch only takes effect after NAPI_STATE_SCHED is
> cleared and napi_schedule() is called again.
> I think it is OK. WDYT?
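[Editor's note] The ordering concern above can be illustrated outside the kernel. Below is a minimal user-space C sketch of the publish/observe pattern under discussion: the writer stores the kthread pointer before setting the flag bit, with release ordering playing the role of smp_mb__before_atomic() followed by set_bit(); the reader only dereferences the pointer after observing the bit with acquire ordering, as the test_bit() path in ___napi_schedule() would need. All names here (napi_sim, publish_threaded, observe_thread, THREADED_BIT) are hypothetical stand-ins for illustration, not kernel APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects in the discussion. */
struct kthread { int dummy; };

struct napi_sim {
	struct kthread *thread;	/* must be visible before the bit is */
	atomic_ulong state;	/* bit 0 stands in for NAPI_STATE_THREADED */
};

#define THREADED_BIT 0x1UL

/* Writer side: publish the kthread pointer, then set the bit.
 * memory_order_release here plays the role of the
 * smp_mb__before_atomic() + set_bit() pairing suggested above. */
static void publish_threaded(struct napi_sim *n, struct kthread *t)
{
	n->thread = t;
	atomic_fetch_or_explicit(&n->state, THREADED_BIT,
				 memory_order_release);
}

/* Reader side (what the scheduling path would do): only dereference
 * n->thread after observing the bit with acquire semantics, so the
 * pointer store is guaranteed to be visible. */
static struct kthread *observe_thread(struct napi_sim *n)
{
	unsigned long s = atomic_load_explicit(&n->state,
					       memory_order_acquire);

	return (s & THREADED_BIT) ? n->thread : NULL;
}
```

Without the release/acquire pairing (or the kernel's explicit barrier before set_bit()), a reader could observe the bit set while still seeing a stale NULL thread pointer, which is exactly the wake-up-before-allocation race being raised.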
It is hard to say. The one spot that gives me a bit of concern is the NAPIF_STATE_MISSED case in napi_complete_done(). It would essentially become a switchover point between the two contexts while we are actively polling inside the driver: you end up with NAPI_STATE_SCHED never being cleared, but execution jumping from one context to the other.
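[Editor's note] To make the switchover scenario concrete, here is a deliberately simplified user-space C model of the MISSED handling in napi_complete_done(): when MISSED was set during the poll, SCHED is left set and the instance is immediately rescheduled, so the next poll runs in whichever context the (possibly just-toggled) THREADED bit now selects, without SCHED ever being released. The shortened flag names and complete_done() are illustrative stand-ins; the real kernel code does this through a cmpxchg loop over napi->state.

```c
#include <assert.h>

/* Shortened stand-ins for the kernel's napi state bits. */
enum { SCHED = 1 << 0, MISSED = 1 << 1, THREADED = 1 << 2 };

/* Where the next poll runs after the MISSED path reschedules. */
enum ctx { CTX_NONE, CTX_SOFTIRQ, CTX_KTHREAD };

/* Simplified model of the MISSED case: if an event was missed while
 * polling, consume MISSED, leave SCHED set, and hand ownership
 * directly to the next poller -- selected by the current value of
 * the THREADED bit, even if that bit was flipped mid-poll. */
static enum ctx complete_done(unsigned int *state)
{
	if (*state & MISSED) {
		*state &= ~MISSED;
		/* SCHED intentionally stays set: ownership is handed
		 * straight over without ever being released. */
		return (*state & THREADED) ? CTX_KTHREAD : CTX_SOFTIRQ;
	}
	*state &= ~SCHED;	/* normal completion */
	return CTX_NONE;
}
```

In this model, a poll that started in softirq context with THREADED toggled on midway exits through the MISSED path straight into the kthread, with SCHED never cleared in between -- the jump from one context to the other described above.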