too long for a subject, but more descriptive: receiver-throttled tcp, and tcp on large bdp (bandwidth delay product) networks
i submitted a patch (tcpnewreno) that deals with two separate problems i've seen when streaming over tcp, and one discovered while debugging. the tool used for debugging (nettest(8)) has also been submitted.

caveat: for rpc-like protocols (e.g. 9p) that don't use multiple outstanding requests (e.g. the mnt driver), these changes will likely not have a large effect. the difference is going to be seen mostly in streaming scenarios.

the problems:

0. the current tcp stack loves to send less-than-mss-sized segments and can send tinygrams for no apparent reason.

1. if the reader on the receiving end is reading so slowly that the receive queue fills, the connection can slow or even livelock. throughput of 1/10th to 1/50th of the calculated bandwidth can be seen.

2. much lower than expected (again 1/10th to 1/50th) throughput can be seen on large bandwidth-delay product networks.

the solutions:

0. nix had a fix for the tinygram problem. i'm not sure who did the work, but thanks. good stuff.

1. corrections to zero-window probing fixed the receiver-pacing and livelock problem. (some of this was included in nix.)

2. NewReno was fully implemented and a few rfc-compliance issues were addressed.

with the constants in the submitted code, BDPs of up to 1mb should be doable. that should be enough for 100Mbps @ 100ms. i'd be curious to know if anyone has a longer or fatter network.

you can use the mathis* formula to calculate the theoretical speed of your network as

	bw = √(3/4) * (mss/rtt) * (1/√p)

where
	bw	bandwidth
	mss	maximum segment size
	rtt	round-trip time
	p	probability of a loss event (i.e. entering newreno recovery, not the probability of a lost packet)

obviously this doesn't work for p=0, and gets questionable for p<ε, for some ε = f(bdp).

* "the macroscopic behavior of the tcp congestion avoidance algorithm", Matthew Mathis, Jeffrey Semke, Jamshid Mahdavi, ACM SIGCOMM, volume 27, number 3, July 1997. ISSN # 0146-4833.

a detailed discussion of the NewReno changes and observed performance is here: http://www.quanstro.net/plan9

- erik
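
for anyone who wants to plug their own numbers into the mathis estimate above, here is a minimal sketch in python. the function name mathis_bw and the sample values (mss 1460 bytes, rtt 100ms, one loss event per 10,000 segments) are made up for illustration and are not taken from the patch or from measurements. it uses the form bw = c * (mss/rtt) / √p with c = √(3/4).

```python
import math

# constant from the formula above; this is an assumption of the model,
# and other variants of the estimate use a different constant.
C = math.sqrt(3 / 4)

def mathis_bw(mss, rtt, p, c=C):
    """estimate steady-state tcp throughput in bytes per second.

    mss  maximum segment size, in bytes
    rtt  round-trip time, in seconds
    p    probability of a loss event (0 < p <= 1)
    """
    if not 0 < p <= 1:
        # the model is undefined at p = 0 and meaningless above 1
        raise ValueError("p must be in (0, 1]")
    if mss <= 0 or rtt <= 0:
        raise ValueError("mss and rtt must be positive")
    return c * (mss / rtt) / math.sqrt(p)

if __name__ == "__main__":
    # hypothetical link: 1460-byte segments, 100ms rtt, p = 1e-4
    bw = mathis_bw(1460, 0.100, 1e-4)
    print(f"{bw:.0f} bytes/s, {bw * 8 / 1e6:.1f} Mbit/s")
```

since bw goes as 1/√p, cutting the loss-event rate by a factor of four only doubles the estimate, which is why long, fat paths are so sensitive to even occasional loss events.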