Hello folks,

In looking at a few benchmarks (especially netperf) run locally, it seems that TCP is unable to make full use of available CPU cycles because the sender is throttled waiting for ACKs to arrive. The problem is exacerbated when the sender is using a small send buffer -- running netperf -C -c -- -s 1024 shows a miserable 420Kbit/s at essentially 0% CPU usage. Tests over gigabit ethernet with the same small send buffer are similarly constrained to a mere 96Mbit/s.
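As a rough sanity check on the local figure, the sketch below computes the ceiling a stalled sender would hit if it could only refill its send buffer once per delayed-ACK timeout. The 40 ms delay is an assumption (the usual minimum delayed-ACK timer on Linux), not something measured in these runs, and ack_limited_bps is only an illustrative helper name.

```c
#include <stdio.h>

/*
 * Upper bound on throughput, in bits per second, for a sender that may
 * have at most sndbuf_bytes outstanding and must wait ack_delay_s
 * seconds for each ACK before it can send again.
 *
 * ASSUMPTION: one full send buffer is released per ACK; real TCP
 * behaviour (segment sizes, ACK-every-other-segment) is more subtle.
 */
double ack_limited_bps(double sndbuf_bytes, double ack_delay_s)
{
    return sndbuf_bytes * 8.0 / ack_delay_s;
}

/* Print the bound in the same unit netperf uses (10^6 bits/s). */
void print_ack_limited(double sndbuf_bytes, double ack_delay_s)
{
    printf("%.0f bytes per %.3f s -> %.2f x 10^6 bits/s\n",
           sndbuf_bytes, ack_delay_s,
           ack_limited_bps(sndbuf_bytes, ack_delay_s) / 1e6);
}
```

Calling print_ack_limited(2048, 0.040), with 2048 being the effective send buffer size netperf reports for -s 1024 in the results further down, gives roughly 0.41 x 10^6 bits/s. That is in the same range as the local throughput quoted above, which would be consistent with the sender sitting idle until a delayed ACK arrives.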
Since there is no way for the receiver to know if the sender is being blocked on transmit space, would it not make sense for the receiver to send out any delayed ACKs when it is clear that the receiving process is waiting for more data? The patch below attempts this (I make no guarantees of its correctness with respect to the rest of the delayed ack code). One point I'm still contemplating is what to do if the receiver is waiting in poll/select/epoll.

[All tests run with maxcpus=1 on a 2.67GHz Woodcrest system.]

Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send     Recv
Size   Size    Size     Time     Throughput  local    remote   local    remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB    us/KB

Base (2.6.17-rc4):

default send buffer size
netperf -C -c
87380  16384   16384    10.02    14127.79    99.90    99.90    0.579    0.579
87380  16384   16384    10.02    13875.28    99.90    99.90    0.590    0.590
87380  16384   16384    10.01    13777.25    99.90    99.90    0.594    0.594
87380  16384   16384    10.02    13796.31    99.90    99.90    0.593    0.593
87380  16384   16384    10.01    13801.97    99.90    99.90    0.593    0.593

netperf -C -c -- -s 1024
87380  2048    2048     10.02    0.43        -0.04    -0.04    -7.105   -7.377
87380  2048    2048     10.02    0.43        -0.01    -0.01    -2.337   -2.620
87380  2048    2048     10.02    0.43        -0.03    -0.03    -5.683   -5.940
87380  2048    2048     10.02    0.43        -0.05    -0.05    -9.373   -9.625
87380  2048    2048     10.02    0.43        -0.05    -0.05    -9.373   -9.625

from a remote system over gigabit ethernet
netperf -H woody -C -c
87380  16384   16384    10.03    936.23      19.32    20.47    3.382    1.791
87380  16384   16384    10.03    936.27      17.67    20.95    3.091    1.833
87380  16384   16384    10.03    936.17      19.18    20.77    3.356    1.817
87380  16384   16384    10.03    936.26      18.22    20.26    3.188    1.773
87380  16384   16384    10.03    936.26      17.35    20.54    3.036    1.797

netperf -H woody -C -c -- -s 1024
87380  2048    2048     10.00    95.72       10.04    6.64     17.188   5.683
87380  2048    2048     10.00    95.94       9.47     6.42     16.170   5.478
87380  2048    2048     10.00    96.83       9.62     5.72     16.283   4.840
87380  2048    2048     10.00    95.91       9.58     6.13     16.368   5.236
87380  2048    2048     10.00    95.91       9.58     6.13     16.368   5.236

Patched:
default send buffer size
netperf -C -c
87380  16384   16384    10.01    13923.16    99.90    99.90    0.588    0.588
87380  16384   16384    10.01    13854.59    99.90    99.90    0.591    0.591
87380  16384   16384    10.02    13840.42    99.90    99.90    0.591    0.591
87380  16384   16384    10.01    13810.96    99.90    99.90    0.593    0.593
87380  16384   16384    10.01    13771.27    99.90    99.90    0.594    0.594

netperf -C -c -- -s 1024
87380  2048    2048     10.02    2473.48     99.90    99.90    3.309    3.309
87380  2048    2048     10.02    2421.46     99.90    99.90    3.380    3.380
87380  2048    2048     10.02    2288.07     99.90    99.90    3.577    3.577
87380  2048    2048     10.02    2405.41     99.90    99.90    3.402    3.402
87380  2048    2048     10.02    2284.41     99.90    99.90    3.582    3.582

netperf -H woody -C -c
87380  16384   16384    10.04    936.10      23.04    21.60    4.033    1.890
87380  16384   16384    10.03    936.20      18.52    21.06    3.242    1.843
87380  16384   16384    10.03    936.52      17.61    21.05    3.082    1.841
87380  16384   16384    10.03    936.18      18.24    20.73    3.191    1.814
87380  16384   16384    10.03    936.28      18.30    21.04    3.202    1.841

netperf -H woody -C -c -- -s 1024
87380  2048    2048     10.00    142.46      10.19    7.53     11.714   4.332
87380  2048    2048     10.00    147.28      9.73     7.93     10.829   4.412
87380  2048    2048     10.00    143.37      10.64    6.54     12.161   3.738
87380  2048    2048     10.00    146.41      9.18     7.43     10.277   4.158
87380  2048    2048     10.01    145.58      9.80     7.25     11.032   4.081

Comments/thoughts?

		-ben
-- 
"Time is of no importance, Mr. President, only life is important."

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 934396b..e554ceb 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1277,8 +1277,11 @@ #endif
 			/* Do not sleep, just process backlog. */
 			release_sock(sk);
 			lock_sock(sk);
-		} else
+		} else {
+			if (inet_csk_ack_scheduled(sk))
+				tcp_send_ack(sk);
 			sk_wait_data(sk, &timeo);
+		}
 
 #ifdef CONFIG_NET_DMA
 		tp->ucopy.wakeup = 0;
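
For anyone wanting to check the small-send-buffer setup outside of netperf, a userspace sketch along these lines can be used. It assumes that netperf's test-specific -s option ends up as an SO_SNDBUF request on the sending socket; effective_sndbuf is a hypothetical helper name and is not part of netperf or of the patch above.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/*
 * Create a TCP socket, request a send buffer of `requested` bytes and
 * return the size the kernel reports back, or -1 on any error.
 *
 * ASSUMPTION: this mirrors what "netperf ... -- -s 1024" asks for on
 * the sending side; it does not send any data itself.
 */
int effective_sndbuf(int requested)
{
    int actual = -1;
    socklen_t len = sizeof(actual);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF,
                   &requested, sizeof(requested)) < 0 ||
        getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &actual, &len) < 0)
        actual = -1;

    close(fd);
    return actual;
}
```

Linux doubles the requested value to leave room for bookkeeping overhead and enforces a minimum, which would account for the 2048 shown in the Send Socket Size column for the -s 1024 runs. The exact number returned by effective_sndbuf(1024) depends on the kernel version.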