On Sun, 11 Oct 2020 15:42:24 +0200 Heiner Kallweit wrote: > On 10.10.2020 17:22, Jakub Kicinski wrote: > > On Sat, 10 Oct 2020 15:08:15 +0200 Heiner Kallweit wrote: > >> On 09.10.2020 18:06, Heiner Kallweit wrote: > >>> On 09.10.2020 17:58, Jakub Kicinski wrote: > >>>> On Fri, 9 Oct 2020 16:54:06 +0200 Heiner Kallweit wrote: > >>>>> I'm thinking about a __napi_schedule version that disables hard irq's > >>>>> conditionally, based on variable force_irqthreads, exported by the irq > >>>>> subsystem. This would allow to behave correctly with threadirqs set, > >>>>> whilst not loosing the _irqoff benefit with threadirqs unset. > >>>>> Let me come up with a proposal. > >>>> > >>>> I think you'd need to make napi_schedule_irqoff() behave like that, > >>>> right? Are there any uses of napi_schedule_irqoff() that are disabling > >>>> irqs and not just running from an irq handler? > >>>> > >>> Right, the best approach depends on the answer to the latter question. > >>> I didn't check this yet, therefore I described the least intrusive > >>> approach. > >>> > >> > >> With some help from coccinelle I identified the following functions that > >> call napi_schedule_irqoff() or __napi_schedule_irqoff() and do not run > >> from an irq handler (at least not at the first glance). > >> > >> dpaa2_caam_fqdan_cb > >> qede_simd_fp_handler > >> mlx4_en_rx_irq > >> mlx4_en_tx_irq > > > > Don't know the others but FWIW the mlx4 ones run from an IRQ handler, > > AFAICT: > > > > static irqreturn_t mlx4_interrupt(int irq, void *dev_ptr) > > static irqreturn_t mlx4_msi_x_interrupt(int irq, void *eq_ptr) > > mlx4_eq_int() > > mlx4_cq_completion > > cq->comp() > > > >> qeth_qdio_poll > >> netvsc_channel_cb > >> napi_watchdog > > > > This one runs from a hrtimer, which I believe will be a hard irq > > context on anything but RT. I could be wrong. > > > > Typically forced irq threading will not be enabled, therefore going > back to use napi_schedule() in drivers in most cases will cause > losing the benefit of the irqoff version. Something like the following > should be better. Only small drawback I see is that in case of forced > irq threading hrtimers will still run in hardirq context and we lose > the benefit of the irqoff version in napi_watchdog(). > > diff --git a/net/core/dev.c b/net/core/dev.c > index a146bac84..7d18560b2 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -6393,7 +6393,11 @@ EXPORT_SYMBOL(napi_schedule_prep); > */ > void __napi_schedule_irqoff(struct napi_struct *n) > { > - ____napi_schedule(this_cpu_ptr(&softnet_data), n); > + /* hard irqs may not be masked in case of forced irq threading */ > + if (force_irqthreads) > + __napi_schedule(n); > + else > + ____napi_schedule(this_cpu_ptr(&softnet_data), n); > } > EXPORT_SYMBOL(__napi_schedule_irqoff);
Does if (force_irqthreads) local_irq_save(flags); ____napi_schedule(this_cpu_ptr(&softnet_data), n); if (force_irqthreads) local_irq_restore(flags); not produce more concise assembly?