On Fri, Sep 25, 2020 at 9:06 PM Jakub Kicinski <k...@kernel.org> wrote:
>
> On Fri, 25 Sep 2020 15:48:35 +0200 Magnus Karlsson wrote:
> > I really like this RFC and would encourage you to submit it as a
> > patch. I would love to see it make it into the kernel.
> >
> > I see the same positive effects as you when trying it out with AF_XDP
> > sockets. I made some simple experiments where I sent 64-byte packets to
> > a single AF_XDP socket. I have not managed to figure out how to get
> > percentiles out of my load generator, so this is going to be min, avg
> > and max only. The application using the AF_XDP socket just performs a
> > mac swap on the packet and sends it back to the load generator, which
> > then measures the round-trip latency. The kthread is taskset to the
> > same core that ksoftirqd would run on, so in each experiment they
> > always run on the same core id (which is not the same as the
> > application's).
> >
> > Rate 12 Mpps with 0% loss:
> >
> >               Latencies (us)      Delay variation between packets (us)
> >             min    avg    max                 avg    max
> > softirq   11.0   17.1   78.4               0.116   63.0
> > kthread   11.2   17.1   35.0               0.116   20.9
> >
> > Rate ~58 Mpps (line rate at 40 Gbit/s) with substantial loss:
> >
> >               Latencies (us)      Delay variation between packets (us)
> >             min    avg    max                 avg    max
> > softirq   87.6  194.9  282.6               0.062   25.9
> > kthread   86.5  185.2  271.8               0.061   22.5
> >
> > For the last experiment, I also get 1.5% to 2% higher throughput with
> > your kthread approach. Moreover, just from the per-second throughput
> > printouts from my application, I can see that the kthread numbers are
> > more stable: the softirq numbers can vary quite a lot from second to
> > second, around ±3%, while the kthread numbers are nice and stable. I
> > have not examined why.
>
> Sure, it's better than the status quo for AF_XDP, but it's going to be
> far inferior to well-implemented busy polling.
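For concreteness, the mac-swap application in the measurements quoted
above boils down to a receive, swap, transmit loop over the AF_XDP
rings. Below is only a sketch against the libbpf xsk API: socket/UMEM
setup, fill and completion ring handling, and error paths are omitted,
and the batch size and names are assumptions rather than the actual
test program.

    #include <string.h>
    #include <sys/socket.h>
    #include <linux/if_ether.h>
    #include <linux/types.h>
    #include <bpf/xsk.h>

    #define BATCH_SIZE 64   /* assumed; not stated in the mail */

    /* One iteration of the mac-swap loop: take a batch of frames off the
     * RX ring, swap the Ethernet source and destination addresses in
     * place, and put the same frames on the TX ring.
     */
    static void macswap_batch(struct xsk_socket *xsk, struct xsk_ring_cons *rx,
                              struct xsk_ring_prod *tx, void *umem_area)
    {
            __u32 idx_rx = 0, idx_tx = 0;
            unsigned int i, rcvd;

            rcvd = xsk_ring_cons__peek(rx, BATCH_SIZE, &idx_rx);
            if (!rcvd)
                    return;

            /* A real program must retry or back off when the TX ring is full. */
            if (xsk_ring_prod__reserve(tx, rcvd, &idx_tx) != rcvd)
                    return;

            for (i = 0; i < rcvd; i++) {
                    const struct xdp_desc *rx_desc =
                            xsk_ring_cons__rx_desc(rx, idx_rx + i);
                    struct ethhdr *eth = xsk_umem__get_data(umem_area, rx_desc->addr);
                    unsigned char tmp[ETH_ALEN];

                    memcpy(tmp, eth->h_dest, ETH_ALEN);
                    memcpy(eth->h_dest, eth->h_source, ETH_ALEN);
                    memcpy(eth->h_source, tmp, ETH_ALEN);

                    /* Recycle the RX buffer directly as the TX buffer. */
                    xsk_ring_prod__tx_desc(tx, idx_tx + i)->addr = rx_desc->addr;
                    xsk_ring_prod__tx_desc(tx, idx_tx + i)->len = rx_desc->len;
            }

            xsk_ring_prod__submit(tx, rcvd);
            xsk_ring_cons__release(rx, rcvd);

            /* Kick the kernel if the TX ring asks for a wakeup. */
            if (xsk_ring_prod__needs_wakeup(tx))
                    sendto(xsk_socket__fd(xsk), NULL, 0, MSG_DONTWAIT, NULL, 0);
    }

Whether the NAPI poll feeding these rings runs in ksoftirqd or in the
kthread makes no difference to this loop, which is what makes the two
modes directly comparable in the numbers above.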
I agree completely on the busy-polling point. Björn is looking into
this at the moment, so I will let him comment on it and post some
patches.

> We already discussed the potential scheme with Björn. Since you
> prompted me again, let me shoot some code from the hip at ya:
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 74ce8b253ed6..8dbdfaeb0183 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6668,6 +6668,7 @@ static struct napi_struct *napi_by_id(unsigned int napi_id)
>
>  static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock)
>  {
> +        ktime_t to;
>          int rc;
>
>          /* Busy polling means there is a high chance device driver hard irq
> @@ -6682,6 +6683,13 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock)
>          clear_bit(NAPI_STATE_MISSED, &napi->state);
>          clear_bit(NAPI_STATE_IN_BUSY_POLL, &napi->state);
>
> +        if (READ_ONCE(napi->dev->napi_defer_hard_irqs)) {
> +                netpoll_poll_unlock(have_poll_lock);
> +                to = ns_to_ktime(READ_ONCE(napi->dev->gro_flush_timeout));
> +                hrtimer_start(&napi->timer, to, HRTIMER_MODE_REL_PINNED);
> +                return;
> +        }
> +
>          local_bh_disable();
>
>          /* All we really want here is to re-enable device interrupts.
>
> With basic busy polling implemented for AF_XDP, this is all** you need
> to make busy polling work very well.
>
> ** once the bugs are fixed :D I haven't even compiled this
>
> Eric & co. already implemented hard-IRQ deferral. All we need to do is
> push the timer away whenever the application picks up frames. I think.
>
> Please, no loose threads for AF_XDP apps (or other busy polling apps).
> Let the application burn 100% of the core :(
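For context, the knobs the diff reads already exist: Eric's hard-IRQ
deferral work added /sys/class/net/<dev>/napi_defer_hard_irqs (roughly,
how many empty NAPI polls to ride out before re-enabling the device
interrupt) and /sys/class/net/<dev>/gro_flush_timeout (the hrtimer
period, in nanoseconds). The application side of Jakub's "burn 100% of
the core" model is syscall-driven busy polling. A minimal sketch
follows, using an ordinary socket with SO_BUSY_POLL purely for
illustration (AF_XDP-specific busy polling is exactly what Björn's
upcoming patches are meant to add); the 50 us budget and the buffer
size are arbitrary assumptions.

    #include <errno.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    /* Application-side sketch: ask the kernel to busy poll the NAPI
     * context that feeds this socket, then spin in recv(). The idea is
     * that while the application keeps picking up frames this way, the
     * deferral logic above keeps re-arming the hrtimer and the device
     * IRQ stays disabled.
     */
    static void busy_poll_rx_loop(int fd)
    {
            char buf[2048];         /* arbitrary buffer size */
            int usecs = 50;         /* arbitrary busy-poll budget in us */

            setsockopt(fd, SOL_SOCKET, SO_BUSY_POLL, &usecs, sizeof(usecs));

            for (;;) {
                    ssize_t n = recv(fd, buf, sizeof(buf), MSG_DONTWAIT);

                    if (n < 0 && errno != EAGAIN)
                            break;  /* real error: bail out */
                    /* process the frame in buf here */
            }
    }

If that works as intended, the loop keeps finding frames,
busy_poll_stop keeps pushing the hrtimer away, and the hard IRQ never
fires, which is the behaviour the quoted text is after.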