> On Sep 16, 2024, at 10:47 PM, Aleksandr Fedorov <wignedd...@yandex.ru> wrote:
>
> If we are talking about local traffic between jails and/or the host, then in
> terms of TCP throughput we have room to improve, for example:
Without the RSS option enabled, if_epair will only use one thread to move packets
between the pair of interfaces. I reviewed the code and I think it can be
improved even without RSS.
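To illustrate the bottleneck, here is a simplified sketch of the
single-consumer pattern (illustrative names, not the actual if_epair code):
packets from one side are queued and a single taskqueue task drains them, so
one thread does all the forwarding work no matter how many cores are idle.
With RSS, packets would instead be hashed onto multiple queues and tasks.

```c
#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/mbuf.h>
#include <sys/taskqueue.h>
#include <net/if.h>
#include <net/if_var.h>

struct sketch_queue {
	struct mtx	 sq_mtx;
	struct mbufq	 sq_pkts;	/* packets waiting for the peer */
	struct task	 sq_task;	/* the one and only drain task,
					   TASK_INIT'ed to sketch_drain */
	struct ifnet	*sq_peer;	/* other end of the pair */
};

/* Producer side: any CPU may enqueue, but all wake the same task. */
static int
sketch_transmit(struct sketch_queue *sq, struct mbuf *m)
{
	int error;

	mtx_lock(&sq->sq_mtx);
	error = mbufq_enqueue(&sq->sq_pkts, m);
	mtx_unlock(&sq->sq_mtx);
	if (error == 0)
		taskqueue_enqueue(taskqueue_swi, &sq->sq_task);
	return (error);
}

/* Consumer side: a single thread delivers everything to the peer. */
static void
sketch_drain(void *arg, int pending __unused)
{
	struct sketch_queue *sq = arg;
	struct mbuf *m;

	for (;;) {
		mtx_lock(&sq->sq_mtx);
		m = mbufq_dequeue(&sq->sq_pkts);
		mtx_unlock(&sq->sq_mtx);
		if (m == NULL)
			break;
		if_input(sq->sq_peer, m);
	}
}
```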
>
> 1. Stop calculating checksums for packets between VNET jails and host.
I have a local WIP for this, inspired by the introduction of IFCAP_VLAN_MTU.
It should give a bigger improvement, especially on low-frequency CPUs.
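For context on why marking packets as already verified saves the work: the
inbound path trusts the mbuf checksum flags and only falls back to a software
checksum when they are absent. A paraphrase of the IP header checksum branch
in ip_input() (simplified, not verbatim kernel code):

```c
#include <sys/param.h>
#include <sys/mbuf.h>
#include <machine/in_cksum.h>

/*
 * Paraphrase of the checksum branch in ip_input(): when the
 * interface marks the packet as already checked, the software
 * in_cksum() is skipped entirely. Likewise, CSUM_DATA_VALID |
 * CSUM_PSEUDO_HDR with csum_data == 0xFFFF lets tcp_input()
 * accept the segment without summing the payload, which is where
 * most of the CPU time goes on an epair link.
 */
static int
ip_cksum_sketch(struct mbuf *m, int hlen)
{
	if (m->m_pkthdr.csum_flags & CSUM_IP_CHECKED)
		return (!(m->m_pkthdr.csum_flags & CSUM_IP_VALID));
	return (in_cksum(m, hlen));	/* software fallback */
}
```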
>
> 2. Use large packets (TSO) up to 64k in size.
>
> Just for example, a simple patch increases the throughput of if_epair(4)
> between two ends from 10 Gbps to 30 Gbps.
That is impressive!
>
> diff --git a/sys/net/if_epair.c b/sys/net/if_epair.c
> index aeed993249f5..79c2dfcfc445 100644
> --- a/sys/net/if_epair.c
> +++ b/sys/net/if_epair.c
> @@ -164,6 +164,10 @@ epair_tx_start_deferred(void *arg, int pending)
> while (m != NULL) {
> n = STAILQ_NEXT(m, m_stailqpkt);
> m->m_nextpkt = NULL;
> +
> + m->m_pkthdr.csum_flags = CSUM_IP_CHECKED | CSUM_IP_VALID | CSUM_DATA_VALID | CSUM_PSEUDO_HDR;
> + m->m_pkthdr.csum_data = 0xFFFF;
> +
> if_input(ifp, m);
> m = n;
> }
> @@ -538,8 +542,9 @@ epair_setup_ifp(struct epair_softc *sc, char *name, int unit)
> ifp->if_dunit = unit;
> ifp->if_flags = IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST;
> ifp->if_flags |= IFF_KNOWSEPOCH;
> - ifp->if_capabilities = IFCAP_VLAN_MTU;
> - ifp->if_capenable = IFCAP_VLAN_MTU;
> + ifp->if_capabilities = IFCAP_VLAN_MTU | IFCAP_HWCSUM | IFCAP_HWCSUM_IPV6 | IFCAP_TSO;
> + ifp->if_capenable = ifp->if_capabilities;
> + ifp->if_hwassist = (CSUM_IP | CSUM_TCP | CSUM_UDP | CSUM_IP_TSO);
I have not tried TSO on if_epair yet. TSO needs special treatment, so I guess
the above is not sufficient; see the sketch after the quoted diff.
> ifp->if_transmit = epair_transmit;
> ifp->if_qflush = epair_qflush;
> ifp->if_start = epair_start;
>
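For reference, a minimal sketch of the extra setup I would expect TSO to
require. The if_hw_tsomax* fields are real struct ifnet members (accessed
directly here to match the style of the patch above), but the values are
illustrative assumptions, not a tested patch; the receive side would also
have to cope with larger-than-MTU frames, which is part of the special
treatment.

```c
#include <sys/param.h>
#include <net/if.h>
#include <net/if_var.h>
#include <netinet/in.h>
#include <netinet/ip.h>

/*
 * Illustrative sketch, not a tested patch: a driver claiming
 * IFCAP_TSO should also advertise its TSO limits so tcp_output()
 * knows how large an mbuf chain it may hand down. The values below
 * are assumptions picked for illustration.
 */
static void
epair_tso_limits_sketch(struct ifnet *ifp)
{
	ifp->if_hw_tsomax = IP_MAXPACKET;	/* max TSO frame size */
	ifp->if_hw_tsomaxsegcount = 35;		/* max S/G segments */
	ifp->if_hw_tsomaxsegsize = IP_MAXPACKET; /* max bytes per segment */
}
```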
> 14.09.2024, 05:45, "Zhenlei Huang" <z...@freebsd.org>:
>
>
>
> On Sep 13, 2024, at 10:54 PM, Sad Clouds <cryintotheblue...@gmail.com> wrote:
>
> On Fri, 13 Sep 2024 08:08:02 -0400
> Mark Saad <nones...@longcount.org> wrote:
>
> Sad
> Can you go back a bit? You mentioned there is an RPi in the mix. Some of
> the Raspberry Pis have their NIC attached via USB under the covers, which
> will kill the total speed of things.
>
> Can you cobble together a diagram of what you have on either end ?
> Hello, I'm not sending data across the network, only between the host
> and the jails. I'm trying to evaluate how FreeBSD handles TCP data
> locally within a single host.
>
> When you take vnet into account, the **local** traffic stays within
> a single vnet jail. If you want traffic across vnet jails, if_epair or
> netgraph hooks have to be employed, and that of course introduces some
> overhead.
>
>
> I understand that vnet jails will have more overhead, compared to a
> shared TCP/IP stack via localhost. So I'm trying to measure it and see
> where the bottlenecks are.
>
> The overhead of a vnet jail should be negligible compared to a legacy jail
> or no jail. Bear in mind that when the VIMAGE option is enabled, there is a
> default vnet 0. It is not visible via jls and cannot be destroyed. So when
> you see bottlenecks, as in this case, they are most likely caused by other
> components such as if_epair, not by the vnet jail itself.
>
>
> The Raspberry Pi 4 host has a single vnet jail, exchanging data with
> the host via epair(4) and if_bridge(4) interfaces. I don't really know
> what topology FreeBSD uses to represent all this, so I can't draw any
> diagrams, but I think all data flows through the kernel internally and
> never leaves the physical network interface.
>
> For vnet jails, when you describe the network topology, you can
> treat them as VMs / physical boxes.
>
> I have one box with dozens of vnet jails. Each of them has a single
> responsibility, e.g. DHCP, LDAP, pf firewall, OOB access. The topology is
> quite clear and easy to maintain. The only overhead is the extra hops
> between the vnet jail instances. For my use case performance is not
> critical, and it has worked great for years.
>
>
>
> Best regards,
> Zhenlei