> On 04 Aug 2016, at 11:40, Ben RUBSON <ben.rub...@gmail.com> wrote:
>
>> On 02 Aug 2016, at 22:11, Ben RUBSON <ben.rub...@gmail.com> wrote:
>>
>>> On 02 Aug 2016, at 21:35, Hans Petter Selasky <h...@selasky.org> wrote:
>>>
>>> The CX-3 driver doesn't bind the worker threads to specific CPU cores by
>>> default, so if your CPU has more than one so-called numa, you'll end up
>>> that the bottle-neck is the high-speed link between the CPU cores and not
>>> the card. A quick and dirty workaround is to "cpuset" iperf and the
>>> interrupt and taskqueue threads to specific CPU cores.
>>
>> My CPUs : 2x E5-2620v3 with DDR4@1866.
>
> OK, so I cpuset all Mellanox interrupts to one NUMA, as well as the iPerf
> processes, and I'm able to reach max bandwidth.
> Choosing the wrong NUMA (or both, or one for interrupts, the other one for
> iPerf, etc...) totally kills throughput.
>
> However, full-duplex throughput is still limited, I can't manage to reach
> 2x40Gb/s, throttle is at about 45Gb/s.
> I tried many different cpuset layouts, but I never went above 45Gb/s.
> (Linux allowed me to reach 2x40Gb/s so hardware is not a bottleneck)
>
>>> Are you using "options RSS" and "options PCBGROUP" in your kernel config?
>
> I will then give RSS a try.
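For anyone wanting to reproduce the pinning discussed above, here is a rough sketch of how it could be scripted. It is not the exact set of commands used in this thread. The CPU list 0-5 for one NUMA domain, the "mlx" pattern for interrupt names in `vmstat -i`, and the helper names `irqs_for`, `pin_irqs` and `run` are all assumptions for illustration, so check your own topology and interrupt names first. By default it only prints the `cpuset` commands instead of running them.

```shell
#!/bin/sh
# Sketch only: pin NIC interrupts to the CPUs of one NUMA domain on FreeBSD.
# Assumptions (not taken from the thread): domain 0 is CPUs 0-5 and the
# NIC interrupt names shown by `vmstat -i` contain "mlx".
CPUS="${CPUS:-0-5}"
NIC_PATTERN="${NIC_PATTERN:-mlx}"

# Print the command by default; set DRY_RUN=0 (as root) to really run it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Read `vmstat -i` style lines on stdin ("irq264: mlx4_core0 ...") and
# print the IRQ numbers of the lines matching the given pattern.
irqs_for() {
    awk -v pat="$1" '$0 ~ pat && $1 ~ /^irq[0-9]+:$/ {
        sub(/^irq/, "", $1); sub(/:$/, "", $1); print $1
    }'
}

# Bind every matching interrupt to the chosen CPU list.
pin_irqs() {
    for irq in $(irqs_for "$NIC_PATTERN"); do
        run cpuset -l "$CPUS" -x "$irq"
    done
}
```

Typical use would be `vmstat -i | pin_irqs`, followed by starting iperf on the same CPUs, e.g. `run cpuset -l "$CPUS" iperf -s`. Taskqueue threads are not handled by this sketch.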
Without RSS:
A ---> B : 40Gbps (unidirectional)
A <--> B : 45Gbps (bidirectional)

With RSS:
A ---> B : 28Gbps (unidirectional)
A <--> B : 28Gbps (bidirectional)

So it looks like RSS does not help :/

Why, without RSS, is it so difficult to reach 2x40Gbps (full-duplex)?

Thank you!