(Sorry for posting a Solaris 10 networking question here; let me know if I should post it elsewhere.)
Problem: we don't see four 1 GigE cards producing 4 Gb/s of aggregate throughput.

Our setup: two nodes (n1, n2) are connected back to back over 4 GigE NICs. Each individual NIC can sustain about 100 MB/s. n1 is the client and n2 is the server; n1 reads data held in n2's memory, so the disks are never touched. Running the same application over all 4 NICs at once yields at most 200 MB/s; over 2 NICs it yields 150 MB/s.

I observed that CPU 6 is pegged at 100% and that "ithr" in mpstat is very high for CPU 6; here is a sample:

n2# mpstat 1 1000
CPU minf mjf  xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    0   0 71414 42597  241 3660  111  728  188    0  1336    0  46   0  54
  1    0   0 49839 45700    0 4228  100  635  149    0   906    0  40   0  60
  2    0   0 67422 41955    0 1484   47  267  178    0  1243    0  43   0  57
  3    0   0 60928 43176    0 1260   44  198  424    0  1061    0  43   0  57
  4    0   0 27945 47010    3  552    8   63  187    0   571    1  29   0  70
  5    0   0 29726 46722    1  626    7   73   63    0   515    0  27   0  73
  6    0   0     0 52581 1872  387  114   10  344    0     8    0  99   0   1
  7    0   0 48189 44176    0 1077   25  152  150    0   858    0  34   0  66

On n1, processor 6 is about 60% loaded and the rest of the processors are below 50%. These results are with default system parameters and an MTU of 1500 on the broadcom GigE NICs. When I use MTU 9000 I get throughput close to 3.8 Gbps; CPU 6 is still >90% busy and its ithr is still very high, the only difference being that the other CPUs are now also busy (close to 90%). (A sketch of roughly how we enabled jumbo frames is below.)

I tried changing some /etc/system parameters, without effect:

* distribute squeues among all cpus
* do this when NICs are faster than CPUs
set ip:ip_squeue_fanout=1
(this was not the case in our setup, since we have 8 x 2.33 GHz processors vs. 4 x 1 GigE NICs, but I tried it anyway)

* if the number of cpus is far greater than the number of nics
set ip:tcp_squeue_wput=1
(since this was our case, I tried it, without any improvement)

* latency-sensitive machines should set this to zero
* default: worker threads wait for 10 ms
* val=0 means no wait, serve immediately
set ip:ip_squeue_wait=0

(A consolidated fragment, plus how to read the live values back with mdb, is sketched below.) Note that with ip:ip_squeue_fanout=1 set, the benchmark fails to run, although TCP connections otherwise work fine.

1) So, my question is: why is the CPU utilization so high on CPU 6 only? Although jumbo frames solve the problem for 2 machines, this scalability problem will reappear as we add nodes (3, 4, 5, ...), since CPU utilization is already very high in the current state. Is there a kernel tunable I should try in order to distribute the load differently? Are all TCP connections (and squeues) getting tied to processor 6? Is there a way to distribute connections among the other processors? (A sketch of how I plan to check the interrupt binding is below.)

2) With 24 TCP connections established, I again see that only CPU 6's squeue has any mblks queued (and none for the others), although I didn't capture the mblks for CPU 6 in the output at the end of this message. Is something wrong here; shouldn't each TCP connection have its own squeue?
For reference, here is the ::squeue output from mdb on n2:

[3]> ::squeue
ADDR             STATE CPU FIRST            LAST             WORKER
ffffffff98e199c0 02060   7 0000000000000000 0000000000000000 fffffe8001139c80
ffffffff98e19a80 02060   6 0000000000000000 0000000000000000 fffffe8001133c80
ffffffff98e19b40 02060   5 0000000000000000 0000000000000000 fffffe80010dfc80
ffffffff98e19c00 02060   4 0000000000000000 0000000000000000 fffffe800108bc80
ffffffff98e19cc0 02060   3 0000000000000000 0000000000000000 fffffe8001037c80
ffffffff98e19d80 02060   2 0000000000000000 0000000000000000 fffffe8000fe3c80
ffffffff98e19e40 02060   1 0000000000000000 0000000000000000 fffffe80004ebc80
ffffffff98e19f00 02060   0 0000000000000000 0000000000000000 fffffe8000293c80
[3]> :c

thanks,
som ([EMAIL PROTECTED], ph: 650-527-1566)