On Mon, Apr 04, 2016 at 08:22:03AM -0700, Eric Dumazet wrote:
> On Mon, 2016-04-04 at 16:57 +0200, Jesper Dangaard Brouer wrote:
> > On Fri, 1 Apr 2016 19:47:12 -0700 Alexei Starovoitov 
> > <alexei.starovoi...@gmail.com> wrote:
> > 
> > > My guess is we're hitting a 14.5Mpps limit both for the empty bpf program
> > > and for the program that actually looks into the packet, because we're
> > > hitting the 10G phy limit of the 40G nic: physically, a 40G nic
> > > consists of four 10G phys. There will be the same problem
> > > with 100G and 50G nics; both will be hitting the 25G phy limit.
> > > We need to vary packets somehow. Hopefully Or can explain that
> > > bit of hw design.
> > > Jesper's experiments with mlx4 showed the same 14.5Mpps limit
> > > when the sender blasts the same packet over and over again.
> > 
> > That is an interesting observation, Alexei, and could explain the pps limit
> > I hit on 40G with single-flow testing.  AFAIK 40G is 4x 10G PHYs, and
> > 100G is 4x 25G PHYs.
> > 
> > I have a pktgen script that tries to avoid this pitfall by creating a
> > new flow per pktgen kthread. I call it
> > "pktgen_sample05_flow_per_thread.sh" [1]
> > 
> > [1] 
> > https://github.com/netoptimizer/network-testing/blob/master/pktgen/pktgen_sample05_flow_per_thread.sh
> > 
> 
> A single flow is able to use 40Gbit on those 40Gbit NICs, so there is not
> a single 10Gbit trunk used for a given flow.
> 
> This 14Mpps thing seems to be a queue limitation on mlx4.

Yeah, could be queueing related.
Multiple CPUs can send ~30Mpps of the same 64-byte packet,
but mlx4 can only receive 14.5Mpps. Odd.

Or (and other Mellanox guys),
what is really going on inside the 40G nic?
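
In case anyone wants to reproduce the flow-per-thread test without pulling
Jesper's full script: below is a minimal sketch of the idea using pktgen's
/proc interface. The device name, MAC/IP addresses, thread count and port
base are placeholders of mine, not the actual values from
pktgen_sample05_flow_per_thread.sh.

#!/bin/bash
# Minimal sketch: one pktgen kthread per CPU, each sending its own UDP
# flow by giving every thread a distinct destination port.
# Requires: modprobe pktgen. DEV/DST_IP/DST_MAC/THREADS/BASEPORT are
# placeholders.

DEV=eth0
DST_IP=198.18.0.2
DST_MAC=00:11:22:33:44:55
THREADS=4
BASEPORT=9000

pgset() {   # write one pktgen command to a /proc/net/pktgen file
    echo "$1" > "$2" || { echo "command failed: $1 > $2" >&2; exit 1; }
}

for ((t = 0; t < THREADS; t++)); do
    THREAD=/proc/net/pktgen/kpktgend_$t
    pgset "rem_device_all" $THREAD          # detach devices from old runs
    pgset "add_device $DEV@$t" $THREAD      # one device clone per thread

    DEVFILE=/proc/net/pktgen/$DEV@$t
    pgset "count 0" $DEVFILE                # 0 = send until stopped
    pgset "pkt_size 64" $DEVFILE
    pgset "dst $DST_IP" $DEVFILE
    pgset "dst_mac $DST_MAC" $DEVFILE
    port=$((BASEPORT + t))                  # distinct flow per thread
    pgset "udp_dst_min $port" $DEVFILE
    pgset "udp_dst_max $port" $DEVFILE
done

echo "start" > /proc/net/pktgen/pgctrl      # blocks until stopped

The point is simply that each kthread transmits a different 5-tuple, so
the receiver's flow hashing can spread the load across queues/lanes
instead of funneling everything through one.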
