Some time ago I reported on problems I had with a simple TCP benchmark running on a T1000. I was able to solve the problems, and now have a fairly sturdy application for testing raw TCP performance.
The testbed includes an 8-core T1000 with 4 Intel Gigabit NICs. On Linux, the maximum throughput with the default scheduler assignments is 2.6 Gbps, while the best manually tuned assignment (4 cores each dedicated to handling the interrupts and soft IRQs of a single NIC, with all processes on the other 4 cores) reached just under 3.5 Gbps.

To my surprise, OpenSolaris delivered out-of-the-box throughput of over 3.3 Gbps - quite close to the best I could get out of Linux. Initial investigation suggests that OpenSolaris may be able to use more than one strand for handling received packets. However, the latest document I have on the Solaris network stack (dated January 2006) claims that only one virtual processor can handle packets from a given NIC unless the fanout option is used, in which case all available processors are used. On my system the fanout option is disabled (and turning it on makes no visible difference). Moreover, a look at mpstat suggests that 2 virtual processors are dedicated to each NIC (the benchmark processes are assigned to processor sets, to isolate the virtual processors performing incoming packet handling).

Can anybody shed some light on how networking tasks are assigned to processors, and whether there has been any change from the design specified in "Solaris OS Networking - The Magic Revealed"?

Thanks,
--Elad

_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
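P.S. In case it helps anyone reproduce the Linux side: pinning each NIC's interrupt to its own core is typically done by writing a hex CPU bitmask to /proc/irq/<irq>/smp_affinity. Below is a minimal sketch of the mask computation; the NIC names and the NIC-to-IRQ mapping are placeholders, and on a real machine the IRQ numbers have to be read from /proc/interrupts first:

```shell
#!/bin/sh
# Sketch: steer each NIC's interrupt to one dedicated core.
# Core i corresponds to the CPU bitmask (1 << i), written in hex to
# /proc/irq/<irq>/smp_affinity (requires root; IRQ numbers are
# placeholders here - look them up in /proc/interrupts).
core=0
for nic in eth0 eth1 eth2 eth3; do
    mask=$(printf '%x' $((1 << core)))
    echo "$nic -> CPU $core, affinity mask 0x$mask"
    # As root, with $irq taken from /proc/interrupts for $nic:
    # echo "$mask" > /proc/irq/$irq/smp_affinity
    core=$((core + 1))
done
```

With the four NICs pinned to cores 0-3 this way, the benchmark processes can then be confined to the remaining four cores (e.g. with taskset), matching the assignment described above.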