While testing two Chelsio T580-CR dual-port cards under FreeBSD 10-STABLE, I've been able to use a collection of clients to generate a sustained 1.5-1.6 million TCP packets per second, and to routinely hit 10GB/s, both as measured by netstat -d -b -w1 -W (I usually add -h for a quick read, accepting the loss of granularity).
While performance has so far been stellar, and I honestly suspect I will need more CPU depth and horsepower to go much faster, I'm curious whether there is any gain to be had from tweaking performance settings. Under multiple streams, with N targets connecting to N servers, interrupts on all CPUs peg at 99-100%, and I'd like to know whether tweaking the configuration will help, or whether that's a free clue to get more horsepower. So far, apart from temporarily turning off pflogd and setting the following sysctl variables, I've not done any performance tuning on the system:

/etc/sysctl.conf:

    net.inet.ip.fastforwarding=1
    kern.random.sys.harvest.ethernet=0
    kern.random.sys.harvest.point_to_point=0
    kern.random.sys.harvest.interrupt=0

a) One of the first things I did in prior testing was to turn hyperthreading off (see the P.S. for how). I presume this is still prudent, as HT doesn't help with interrupt handling?

b) I briefly experimented with using cpuset(1) to pin interrupts to physical CPUs (again, see the P.S.), but it offered no performance enhancement, and indeed appeared to decrease performance by 10-20%. Has anyone else tried this? What were your results?

c) The defaults for the cxgbe driver appear to be 8 rx queues and N tx queues, with N being the number of CPUs detected. For a system running multiple cards, routing or firewalling, does this make sense, or would balancing tx and rx be better? And would reducing the queues per card, based on NUM-CPUS and NUM-CHELSIO-PORTS, make sense at all? (The tunables I have in mind are in the P.S.)

d) dev.cxl.$PORT.qsize_rxq: 1024 and dev.cxl.$PORT.qsize_txq: 1024. These appear not to be writeable while if_cxgbe is loaded, so I speculate they are either not to be messed with, or are loader.conf variables (my guess at the spelling is in the P.S.). Is there any benefit to changing them?

e) dev.t5nex.$CARD.toe.sndbuf: 262144. This one is writeable, but changing the value did not yield an immediate benefit (example in the P.S.). Am I barking up the wrong tree in trying?

f) Based on prior experiments with other vendors' hardware, I tried tweaks to the net.isr.* settings (the ones I mean are in the P.S.), but did not see any benefit worth discussing. Is that consistent with others' experience?

g) Are there other settings I should be looking at that might squeeze out a few more packets?

Thanks in advance!

--
John Jasen (jja...@gmail.com)
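P.S. To make the questions concrete, here is roughly what I mean in each case; all values below are illustrative, not recommendations.

For (a), besides the BIOS, one boot-time way to turn HT off is the machdep tunable:

    # /boot/loader.conf
    machdep.hyperthreading_allowed=0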
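For (b), the pinning experiment was along these lines, with the vector numbers read out of vmstat -i (the IRQ and CPU numbers here are placeholders):

    # list the per-queue interrupt vectors (assuming they show up under t5nex)
    vmstat -i | grep t5nex
    # bind one vector to one physical core
    cpuset -l 2 -x 264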
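For (c), the knobs I believe control the per-port queue counts are the cxgbe(4) loader tunables; assuming the 10G names also cover these ports, capping them would look like:

    # /boot/loader.conf -- rx/tx queues per port (illustrative values)
    hw.cxgbe.nrxq10g=4
    hw.cxgbe.ntxq10g=4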
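For (d), if the ring sizes are indeed boot-time tunables, my guess at the loader.conf spelling is:

    # /boot/loader.conf -- descriptor ring sizes (illustrative values)
    hw.cxgbe.qsize_rxq=2048
    hw.cxgbe.qsize_txq=2048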
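For (e), all I did was bump the value at run time, e.g.:

    # double the TOE send buffer on card 0 (illustrative value)
    sysctl dev.t5nex.0.toe.sndbuf=524288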
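For (f), the netisr knobs I mean are these (the first two are boot-time tunables; dispatch can be changed at run time):

    # /boot/loader.conf
    net.isr.maxthreads=8
    net.isr.bindthreads=1
    # run-time sysctl
    net.isr.dispatch=deferred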