-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

We've recently been doing some TCP congestion control research, and have written a small logging module for FreeBSD 6.2 that writes the cwnd of a TCP flow to a log file. The logging module is called SIFTR and is available on our website: http://caia.swin.edu.au/urp/newtcp/tools.html

In analysing the output, we noticed that despite the rfc3390 sysctl variable being enabled, some flows started with a cwnd of 1 MSS instead of the 4380 bytes suggested by rfc3390. We realised that the cwnd was constrained to 1 MSS whenever its starting value was influenced by the hostcache, so we created the attached patch to log the relevant variable values in the tcp_mss function in tcp_input.c.

The following are the logged values for 2 sequential ssh sessions to the same host. The first time, cwnd is set according to rfc3390; the second time, it is set from the hostcache.

- -------
one:
Jul 16 13:31:20 jhealy kernel: setting snd_cwnd according to rfc3390
Jul 16 13:31:20 jhealy kernel: tp->snd_cwnd now set to: 4380

two:
Jul 16 13:31:31 jhealy kernel: setting snd_cwnd from hostcache
Jul 16 13:31:31 jhealy kernel: tp->snd_cwnd: 1073725440
Jul 16 13:31:31 jhealy kernel: mss: 1448
Jul 16 13:31:31 jhealy kernel: metrics.rmx_cwnd: 14480
Jul 16 13:31:31 jhealy kernel: tp->snd_wnd: 0
Jul 16 13:31:31 jhealy kernel: so->so_snd.sb_hiwat: 33304
Jul 16 13:31:31 jhealy kernel: tp->snd_cwnd now set to: 1448
- ------

The formula used to calculate the cwnd when a relevant entry exists in the hostcache is on line 3054 of tcp_input.c:

    tp->snd_cwnd = max(mss, min(metrics.rmx_cwnd / 2, min(tp->snd_wnd, so->so_snd.sb_hiwat)));

Using the values that we logged earlier, this breaks down as:

    = max(1448, min(7240, min(0, 33304)))
    = max(1448, min(7240, 0))
    = max(1448, 0)
    = 1448

Given that the snd_wnd value during connection initiation always seems to be 0, the inner min() always evaluates to 0, so the cwnd is always going to be set to 1 MSS whenever a hostcache entry is used.

This behaviour seems a little odd to us - can anyone shed some light on it? Our assumption is that the hostcache is meant to improve performance where appropriate by seeding the initial cwnd based on past experience.

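For anyone who wants to check the arithmetic above outside the kernel, the following is a minimal userland sketch. The helper names (ul_min, ul_max, hostcache_cwnd, rfc3390_cwnd, check_logged_values) are ours and do not exist in tcp_input.c. It assumes the first session also used an mss of 1448 (only the second session logged it), and the snd_wnd value of 65535 in the last check is purely hypothetical.

```c
#include <assert.h>

/* Userland stand-ins for the kernel's min()/max() helpers (our names). */
static unsigned long ul_min(unsigned long a, unsigned long b) { return a < b ? a : b; }
static unsigned long ul_max(unsigned long a, unsigned long b) { return a > b ? a : b; }

/* Hostcache branch, transcribed from the formula quoted above. */
static unsigned long
hostcache_cwnd(unsigned long mss, unsigned long rmx_cwnd,
    unsigned long snd_wnd, unsigned long sb_hiwat)
{
    return ul_max(mss, ul_min(rmx_cwnd / 2, ul_min(snd_wnd, sb_hiwat)));
}

/* rfc3390 branch, transcribed from the context lines of the attached patch. */
static unsigned long
rfc3390_cwnd(unsigned long mss)
{
    return ul_min(4 * mss, ul_max(2 * mss, 4380));
}

static void
check_logged_values(void)
{
    /* Session two: snd_wnd == 0 forces the result down to 1 MSS. */
    assert(hostcache_cwnd(1448, 14480, 0, 33304) == 1448);

    /* Session one: assuming mss was 1448 here as well. */
    assert(rfc3390_cwnd(1448) == 4380);

    /* Hypothetical non-zero snd_wnd: half the cached cwnd is used instead. */
    assert(hostcache_cwnd(1448, 14480, 65535, 33304) == 7240);
}
```

Calling check_logged_values() from a small main() should pass silently if the transcription is right. The last assertion suggests that, with these inputs, the zero snd_wnd is the only term pulling the result down to 1 MSS.
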
For this section of code to return a cwnd that is successfully influenced by the hostcache, it would seem that the use of tp->snd_wnd should be avoided while the connection is still being initialised:

    tp->snd_cwnd = max(mss, min(metrics.rmx_cwnd / 2, so->so_snd.sb_hiwat));

James Healy & Lawrence Stewart
Centre for Advanced Internet Architectures
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (FreeBSD)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGmvkW4oawkrbYo/kRAkD1AKCMKJfbt3UYb2NCCh+wt4m+SlLuZgCfYJnD
stwB3B5BzaMPuyJa5uw3wuQ=
=LWnf
-----END PGP SIGNATURE-----

Swinburne University of Technology
CRICOS Provider Code: 00111D

--- tcp_input.c.orig	Thu Jul 12 10:28:47 2007
+++ tcp_input.c	Mon Jul 16 12:29:28 2007
@@ -3051,22 +3067,41 @@
 #define TCP_METRICS_CWND
 #ifdef TCP_METRICS_CWND
 	if (metrics.rmx_cwnd)
+	{
+		printf("setting snd_cwnd from hostcache\n");
+		printf("tp->snd_cwnd: %li\n", tp->snd_cwnd);
+		printf("mss: %u\n", mss);
+		printf("metrics.rmx_cwnd: %li\n", metrics.rmx_cwnd);
+		printf("tp->snd_wnd: %li\n", tp->snd_wnd);
+		printf("so->so_snd.sb_hiwat: %u\n", so->so_snd.sb_hiwat);
 		tp->snd_cwnd = max(mss, min(metrics.rmx_cwnd / 2, min(tp->snd_wnd, so->so_snd.sb_hiwat)));
+	}
 	else
 #endif
 	if (tcp_do_rfc3390)
+	{
+		printf("setting snd_cwnd according to rfc3390\n");
 		tp->snd_cwnd = min(4 * mss, max(2 * mss, 4380));
+	}
 #ifdef INET6
 	else if ((isipv6 && in6_localaddr(&inp->in6p_faddr)) || (!isipv6 && in_localaddr(inp->inp_faddr)))
+	{
 #else
 	else if (in_localaddr(inp->inp_faddr))
+	{
 #endif
+		printf("setting snd_cwnd according to local sysctl variable\n");
 		tp->snd_cwnd = mss * ss_fltsz_local;
+	}
 	else
+	{
+		printf("setting snd_cwnd according to local sysctl variable\n");
 		tp->snd_cwnd = mss * ss_fltsz;
+	}
+	printf("tp->snd_cwnd now set to: %li\n", tp->snd_cwnd);
 }
 
 /*
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"