On Wed, May 31, 2023 at 11:55 AM Jason Wang <jasow...@redhat.com> wrote:
>
> On Wed, May 31, 2023 at 11:47 AM Jon Kohler <j...@nutanix.com> wrote:
> >
> >
> >
> > > On May 30, 2023, at 11:35 PM, Jason Wang <jasow...@redhat.com> wrote:
> > >
> > > On Wed, May 31, 2023 at 11:32 AM Jason Wang <jasow...@redhat.com> wrote:
> > >>
> > >> On Wed, May 31, 2023 at 11:17 AM Jon Kohler <j...@nutanix.com> wrote:
> > >>>
> > >>> If kernel supports IFF_NAPI, lets use it, which is especially useful
> > >>> on kernels containing fb3f903769e8 ("tun: support NAPI for packets
> > >>> received from batched XDP buffs"), as IFF_NAPI allows the
> > >>> vhost_tx_batch path to use NAPI on XDP buffs.
> > >>>
> > >>> Benchmark w/ iperf -c (remote srv) -P (thread count) -l (stream size)
> > >>> from a guest running kernel 5.10.105 to remote bare metal running
> > >>> patched code on kernel 5.10.139. Guests were configured 1x virtio-net
> > >>> device with 4x queues, resulting in 4x vhost-worker threads. Hosts are
> > >>> identical with Intel ICX 4314 @ 2.4 GHz with Mellanox CX5 25GbE NIC ->
> > >>> Arista 25GbE switch. vhost-worker threads largely maxed out on CPU on
> > >>> "Before" and around ~50-60% utilization "After".
> > >>>
> > >>> Single Stream: iperf -P 1
> > >>> iperf -l size | Before          | After          | Increase
> > >>> 64B           | 593 Mbits/sec   | 712 Mbits/sec  | ~20%
> > >>> 128B          | 1.00 Gbits/sec  | 1.18 Gbits/sec | ~18%
> > >>> 4KB           | 17.6 Gbits/sec  | 22.7 Gbits/sec | ~29%
> > >>>
> > >>> Multiple Stream: iperf -P 12
> > >>> iperf -l size | Before          | After          | Increase
> > >>> 64B           | 6.35 Gbits/sec  | 7.78 Gbits/sec | ~23%
> > >>> 128B          | 10.8 Gbits/sec  | 14.2 Gbits/sec | ~31%
> > >>> 4KB           | 23.6 Gbits/sec  | 23.6 Gbits/sec | (line speed)
> > >>>
> > >>> Signed-off-by: Jon Kohler <j...@nutanix.com>
> > >>
> > >> Great, but I would suggest having an option.
> > >>
> > >> So we can:
> > >>
> > >> 1) ease the debug and compare
> > >> 2) enable this by default only for 8.1, disable it for pre 8.1
> >
> > Fair enough, one favor to ask though -
> > Would you be able to point me to an existing option like what you’re
> > proposing so I could make sure I’m on the same page?
>
> For example, the vhost option for tap. Maybe we can have an napi option.
>
> >
> > >
> > > More thought, if the performance boost only after fb3f903769e8, we
> > > probably need to disable it by default and let the mgmt layer to
> > > enable it.
> > >
> >
> > I focused my testing with that commit, but I could take it out and
> > we still should get benefit. Would you like me to profile that to validate?
> >
>
> One problem is that NAPI for TAP was originally used for kernel
> hardening. Looking at the history, it introduces a lot of bugs.
>
> Consider:
>
> 1) it has been merged for several years
> 2) tap has been widely used for a long time as well
>
> I think it would be still safe to keep the option off (at least for
> pre 8.1 machines).
>
> > Asking as NAPI support in tun.c has been there for a while, guessing
> > at first glance that there would be non-zero gains, with little downsides.
> > Looking at git blame, seems about ~5-6 years of support.
>
> Yes.
>
> >
> > Also for posterity, that commit has been in since 5.18, so a little over 1 
> > year.
>
> Then I think we can make it enabled by default for 8.1 and see.

Btw, it would be better if we can have some PPS benchmark as well.

Thanks

>
> Thanks
>
> >
> > > Thanks
> > >
> > >>
> > >> Thanks
> > >>
> > >> Thanks
> > >>
> > >>> ---
> > >>> net/tap-linux.c | 4 ++++
> > >>> net/tap-linux.h | 1 +
> > >>> 2 files changed, 5 insertions(+)
> > >>>
> > >>> diff --git a/net/tap-linux.c b/net/tap-linux.c
> > >>> index f54f308d359..fd94df166e0 100644
> > >>> --- a/net/tap-linux.c
> > >>> +++ b/net/tap-linux.c
> > >>> @@ -62,6 +62,10 @@ int tap_open(char *ifname, int ifname_size, int 
> > >>> *vnet_hdr,
> > >>>         ifr.ifr_flags |= IFF_ONE_QUEUE;
> > >>>     }
> > >>>
> > >>> +    if (features & IFF_NAPI) {
> > >>> +        ifr.ifr_flags |= IFF_NAPI;
> > >>> +    }
> > >>> +
> > >>>     if (*vnet_hdr) {
> > >>>         if (features & IFF_VNET_HDR) {
> > >>>             *vnet_hdr = 1;
> > >>> diff --git a/net/tap-linux.h b/net/tap-linux.h
> > >>> index bbbb62c2a75..f4d8e55270b 100644
> > >>> --- a/net/tap-linux.h
> > >>> +++ b/net/tap-linux.h
> > >>> @@ -37,6 +37,7 @@
> > >>>
> > >>> /* TUNSETIFF ifr flags */
> > >>> #define IFF_TAP          0x0002
> > >>> +#define IFF_NAPI         0x0010
> > >>> #define IFF_NO_PI        0x1000
> > >>> #define IFF_ONE_QUEUE    0x2000
> > >>> #define IFF_VNET_HDR     0x4000
> > >>> --
> > >>> 2.30.1 (Apple Git-130)
> >


Reply via email to