On Mon, 2016-04-04 at 16:57 +0200, Jesper Dangaard Brouer wrote:
> On Fri, 1 Apr 2016 19:47:12 -0700 Alexei Starovoitov
> <alexei.starovoi...@gmail.com> wrote:
>
> > My guess we're hitting 14.5Mpps limit for empty bpf program
> > and for program that actually looks into the packet because we're
> > hitting 10G phy limit of 40G nic. Since physically 40G nic
> > consists of four 10G phys. There will be the same problem
> > with 100G and 50G nics. Both will be hitting 25G phy limit.
> > We need to vary packets somehow. Hopefully Or can explain that
> > bit of hw design.
> > Jesper's experiments with mlx4 showed the same 14.5Mpps limit
> > when sender blasting the same packet over and over again.
>
> That is an interesting observation Alexei, and could explain the pps limit
> I hit on 40G, with single flow testing. AFAIK 40G is 4x 10G PHYs, and
> 100G is 4x 25G PHYs.
>
> I have a pktgen script that tried to avoid this pitfall. By creating a
> new flow per pktgen kthread. I call it "pktgen_sample05_flow_per_thread.sh"[1]
>
> [1]
> https://github.com/netoptimizer/network-testing/blob/master/pktgen/pktgen_sample05_flow_per_thread.sh
>

A single flow is able to use the full 40Gbit on those 40Gbit NICs, so a
given flow is not confined to a single 10Gbit trunk. This 14Mpps limit
seems to be a queue limitation on mlx4.
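
For reference, a rough back-of-the-envelope sketch of why a ceiling around 14.5Mpps resembles a 10G limit. It assumes minimum-size 64-byte Ethernet frames plus the standard 20 bytes of preamble and inter-frame gap per frame; the thread does not state the packet size used, and the helper name line_rate_pps is made up for illustration.

```python
# Assumption: standard Ethernet per-frame wire overhead,
# 8 bytes preamble/SFD + 12 bytes inter-frame gap.
WIRE_OVERHEAD_BYTES = 8 + 12


def line_rate_pps(link_bps, frame_bytes=64):
    """Theoretical maximum packets per second for a given link speed.

    frame_bytes is the Ethernet frame size including the FCS
    (64 is the minimum frame size).
    """
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_per_frame


if __name__ == "__main__":
    # Link speeds mentioned in the thread, in Gbit/s.
    for gbit in (10, 25, 40, 100):
        mpps = line_rate_pps(gbit * 1e9) / 1e6
        print(f"{gbit}G: {mpps:.2f} Mpps at 64-byte frames")
```

Under these assumptions 10G works out to about 14.88Mpps and 40G to about 59.52Mpps. With full-size 1518-byte frames, line_rate_pps(40e9, 1518) is only roughly 3.25Mpps, so filling 40Gbit with one flow of large packets stays well below the packet rate discussed here.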