On 2016-02-03 08:37, Meyer, Wolfgang wrote:
> Hello,
>
> we are evaluating network performance on a Dell server (PowerEdge R930 with 4 sockets, hw.model: Intel(R) Xeon(R) CPU E7-8891 v3 @ 2.80GHz) with 10 GbE cards. We use programs in which the server side accepts connections on an IP address and port from the client side; after the connection is established, data is sent in turns between server and client in a predefined pattern (the server side sends more data than the client side), with sleeps between the send phases. The test set-up is chosen in such a way that every client process initiates 500 connections handled in threads, and on the server side each process, representing an IP/port pair, also handles 500 connections in threads.
>
> The number of connections is then increased and the overall network throughput is observed using nload. With FreeBSD on the server side, errors begin to occur at roughly 50,000 connections and the overall throughput won't increase further with more connections. With Linux on the server side it is possible to establish more than 120,000 connections, and at 50,000 connections the overall throughput is double that of FreeBSD with the same sending pattern. Furthermore, system load on FreeBSD is much higher, with 50 % system usage on each core and 80 % interrupt usage on the 8 cores handling the interrupt queues for the NIC. In comparison, Linux has <10 % system usage, <10 % user usage and about 15 % interrupt usage on the 16 cores handling the network interrupts for 50,000 connections.
>
> Varying the number of NIC interrupt queues doesn't change the performance (rather, it worsens the situation). Disabling Hyper-Threading (utilising 40 cores) degrades the performance. Increasing MAXCPU to utilise all 80 cores brings no improvement over 64 cores; atkbd and uart had to be disabled to avoid kernel panics with the increased MAXCPU (thanks to Andre Oppermann for investigating this). Initially the tests were made on 10.2-RELEASE; later I switched to 10-STABLE (later with ixgbe driver version 3.1.0), but that didn't change the numbers.
>
> Some sysctl tunables were modified following the network performance guidelines found on the net (e.g. https://calomel.org/freebsd_network_tuning.html, https://www.freebsd.org/doc/handbook/configtuning-kernel-limits.html, https://pleiades.ucsc.edu/hyades/FreeBSD_Network_Tuning), but most of them didn't have any measurable impact. See below for the final sysctl.conf and loader.conf settings. Actually, the only tunables that provided any improvement were hw.ix.txd and hw.ix.rxd, which were reduced (!) to the minimum value of 64, and hw.ix.tx_process_limit and hw.ix.rx_process_limit, which were set to -1.
>
> Any ideas what tunables might be changed to get a higher number of TCP connections (it's not a question of overall throughput, as changing the sending pattern allows me to fully utilise the 10 Gb bandwidth)? How can I determine where the kernel is spending its time that causes the high CPU load? Any pointers are highly appreciated; I can't believe that there is such a blatant difference in network performance compared to Linux.
>
> Regards,
> Wolfgang
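On the question of where the kernel is spending its time: hwpmc/pmcstat(8) sampling is usually the quickest way to get a kernel profile. A minimal sketch, assuming hwpmc works on that box; the sampling event alias may differ per CPU, pmccontrol -L lists what yours supports:

  # load the sampling driver if it is not already in the kernel
  kldload hwpmc
  # sample system-wide while the load test is running, here for ~30 seconds
  pmcstat -S unhalted-cycles -O /tmp/samples.out sleep 30
  # post-process the samples into a kernel call graph
  pmcstat -R /tmp/samples.out -G /tmp/callgraph.txt

If most of the samples land in locking primitives, the driver, or the TCP input path, that will tell you a lot more than the aggregate CPU percentages do.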
I wonder if this might be NUMA related. Specifically, it might help to make sure that the 8 CPU cores the NIC queues are pinned to are on the same physical CPU (socket) that backs the PCI-E slot the NIC is installed in.
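A rough sketch of how to check and adjust that, assuming the card shows up as ix0 and the queue interrupts use the usual "ix0:que N" naming; the IRQ number and CPU ID below are placeholders for whatever your box actually reports:

  # IRQ numbers of the NIC queue interrupts
  vmstat -i | grep 'ix0:que'
  # which CPU IDs belong to which package/cache group
  sysctl kern.sched.topology_spec
  # bind a queue IRQ to a core on the socket behind the NIC's PCI-E slot,
  # e.g. the queue on IRQ 264 to CPU 2
  cpuset -l 2 -x 264

If the default placement has the queues serviced from the other socket, every interrupt and mbuf crosses the QPI link, which could account for part of the extra system and interrupt load you are seeing.

--
Allan Jude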