Module Name:    src
Committed By:   dyoung
Date:           Tue May 3 18:28:46 UTC 2011
Modified Files:
        src/distrib/sets/lists/comp: mi
        src/sys/dist/pf/net: pf.c
        src/sys/netinet: Makefile files.netinet in_pcb.c in_pcb.h in_pcb_hdr.h
            tcp_input.c tcp_subr.c tcp_usrreq.c tcp_var.h udp_usrreq.c
        src/sys/netinet6: in6_pcb.c in6_pcb.h in6_src.c ip6_input.c raw_ip6.c
            udp6_usrreq.c
        src/sys/rump/net/lib/libnetinet: Makefile.inc
        src/usr.bin/netstat: Makefile inet.c inet6.c main.c netstat.h
Added Files:
        src/sys/netinet: tcp_vtw.c tcp_vtw.h
        src/usr.bin/netstat: vtw.c vtw.h

Log Message:
Reduce the resources demanded by TCP sessions in TIME_WAIT state using
methods called Vestigial Time-Wait (VTW) and Maximum Segment Lifetime
Truncation (MSLT).

MSLT and VTW were contributed by Coyote Point Systems, Inc.

Even after a TCP session enters the TIME_WAIT state, its corresponding
socket and protocol control blocks (PCBs) stick around until the TCP
Maximum Segment Lifetime (MSL) expires. On a host whose workload
necessarily creates and closes down many TCP sockets, the sockets and
PCBs for TCP sessions in TIME_WAIT state amount to many megabytes of
dead weight in RAM.

Maximum Segment Lifetime Truncation (MSLT) assigns each TCP session to
a class based on the nearness of the peer. Corresponding to each class
is an MSL, and a session uses the MSL of its class. The classes are
loopback (local host equals remote host), local (local host and remote
host are on the same link/subnet), and remote (local host and remote
host communicate via one or more gateways). Classes corresponding to
nearer peers have lower MSLs by default: 2 seconds for loopback, 10
seconds for local, 60 seconds for remote. Loopback and local sessions
expire more quickly when MSLT is used.

Vestigial Time-Wait (VTW) replaces a TIME_WAIT session's PCB/socket
dead weight with a compact representation of the session, called a
"vestigial PCB".
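The MSLT class selection described above can be sketched in C as
follows. This is only an illustration under stated assumptions, not the
code in tcp_vtw.c: every identifier here (mslt_class, mslt_classify,
mslt_msl, mslt_default_msl) is hypothetical, addresses are IPv4 values
in host byte order, and "same link/subnet" is approximated by a single
netmask comparison. Only the three classes and their default MSLs come
from the log message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the committed code may use different ones. */
enum mslt_class { MSLT_LOOPBACK, MSLT_LOCAL, MSLT_REMOTE };

/* Default MSL per class, in seconds, as stated in the log message. */
static const unsigned mslt_default_msl[] = {
	[MSLT_LOOPBACK] = 2,
	[MSLT_LOCAL]    = 10,
	[MSLT_REMOTE]   = 60,
};

/*
 * Classify a session by the nearness of its peer.
 * Assumption of this sketch: one netmask decides "same subnet".
 */
static enum mslt_class
mslt_classify(uint32_t laddr, uint32_t faddr, uint32_t netmask)
{
	if (laddr == faddr)
		return MSLT_LOOPBACK;	/* local host equals remote host */
	if ((laddr & netmask) == (faddr & netmask))
		return MSLT_LOCAL;	/* same link/subnet */
	return MSLT_REMOTE;		/* reached via one or more gateways */
}

/* Return the MSL, in seconds, that a session of this kind would use. */
static unsigned
mslt_msl(uint32_t laddr, uint32_t faddr, uint32_t netmask)
{
	enum mslt_class c = mslt_classify(laddr, faddr, netmask);

	assert(c <= MSLT_REMOTE);
	return mslt_default_msl[c];
}
```

With a /24 netmask, a session from 192.168.0.1 to 192.168.0.2 would fall
in the local class in this sketch and use the 10-second default, while a
session to an address outside 192.168.0.0/24 would use 60 seconds.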
VTW data structures are designed to be very fast and memory-efficient:
for fast insertion and lookup of vestigial PCBs, the PCBs are stored in
a hash table that is designed to minimize the number of cacheline
visits per lookup/insertion. The memory both for vestigial PCBs and for
elements of the PCB hashtable comes from fixed-size pools, and linked
data structures exploit this to conserve memory by representing
references with a narrow index/offset from the start of a pool instead
of a pointer. When space for new vestigial PCBs runs out, VTW makes
room by discarding old vestigial PCBs, oldest first. VTW cooperates
with MSLT.

It may help to think of VTW as a "FIN cache" by analogy to the SYN
cache.

A 2.8-GHz Pentium 4 running a test workload that creates TIME_WAIT
sessions as fast as it can is approximately 17% idle when VTW is active
versus 0% idle when VTW is inactive. It has 103 megabytes more free RAM
when VTW is active (approximately 64k vestigial PCBs are created) than
when it is inactive.
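The pool-and-index idea above can be illustrated with a small C sketch.
It is not the tcp_vtw.c implementation: all names (vtw_entry, vtw_table,
vtw_insert, vtw_lookup and so on), the 16-bit index width, the tiny pool
size, and the hash function are assumptions made for illustration, and
the cacheline-aware bucket layout of the real code is not modelled.
What the sketch does show is a fixed-size pool, hash chains linked by
narrow indices rather than pointers, and oldest-first reuse when the
pool is full.

```c
#include <assert.h>
#include <stdint.h>

#define VTW_POOL_SIZE	4		/* tiny, for illustration only */
#define VTW_NBUCKETS	8
#define VTW_NIL		UINT16_MAX	/* "no entry" index */

struct vtw_entry {			/* hypothetical vestigial PCB */
	uint32_t laddr, faddr;
	uint16_t lport, fport;
	uint16_t next;			/* pool index of next chain entry */
	uint8_t  in_use;
};

struct vtw_table {
	struct vtw_entry pool[VTW_POOL_SIZE];
	uint16_t bucket[VTW_NBUCKETS];	/* chain heads, as pool indices */
	uint16_t next_slot;		/* ring cursor: oldest slot when full */
};

static void
vtw_init(struct vtw_table *t)
{
	for (unsigned i = 0; i < VTW_NBUCKETS; i++)
		t->bucket[i] = VTW_NIL;
	for (unsigned i = 0; i < VTW_POOL_SIZE; i++)
		t->pool[i].in_use = 0;
	t->next_slot = 0;
}

static unsigned
vtw_hash(uint32_t laddr, uint16_t lport, uint32_t faddr, uint16_t fport)
{
	uint32_t h = laddr ^ faddr ^ ((uint32_t)lport << 16) ^ fport;

	h ^= h >> 16;
	return h % VTW_NBUCKETS;
}

/* Remove pool[idx] from its hash chain and mark the slot free. */
static void
vtw_unlink(struct vtw_table *t, uint16_t idx)
{
	struct vtw_entry *e = &t->pool[idx];
	uint16_t *link =
	    &t->bucket[vtw_hash(e->laddr, e->lport, e->faddr, e->fport)];

	while (*link != VTW_NIL && *link != idx)
		link = &t->pool[*link].next;
	assert(*link == idx);
	*link = e->next;
	e->in_use = 0;
}

/* Insert a session; if the pool is full, the oldest entry is discarded. */
static uint16_t
vtw_insert(struct vtw_table *t, uint32_t laddr, uint16_t lport,
    uint32_t faddr, uint16_t fport)
{
	uint16_t idx = t->next_slot;
	struct vtw_entry *e = &t->pool[idx];
	unsigned b = vtw_hash(laddr, lport, faddr, fport);

	if (e->in_use)
		vtw_unlink(t, idx);	/* make room, oldest first */
	e->laddr = laddr; e->lport = lport;
	e->faddr = faddr; e->fport = fport;
	e->next = t->bucket[b];		/* push onto the chain head */
	t->bucket[b] = idx;
	e->in_use = 1;
	t->next_slot = (uint16_t)((idx + 1) % VTW_POOL_SIZE);
	return idx;
}

/* Return the pool index of a matching entry, or VTW_NIL if none. */
static uint16_t
vtw_lookup(const struct vtw_table *t, uint32_t laddr, uint16_t lport,
    uint32_t faddr, uint16_t fport)
{
	uint16_t i = t->bucket[vtw_hash(laddr, lport, faddr, fport)];

	for (; i != VTW_NIL; i = t->pool[i].next) {
		const struct vtw_entry *e = &t->pool[i];

		if (e->laddr == laddr && e->lport == lport &&
		    e->faddr == faddr && e->fport == fport)
			return i;
	}
	return VTW_NIL;
}
```

Because entries leave this sketch only by eviction, the ring cursor
always points at the oldest entry once the pool is full. A 16-bit index
takes a quarter of the space of a 64-bit pointer per link, which is the
kind of saving the log message attributes to index/offset references.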
To generate a diff of this commit:
cvs rdiff -u -r1.1619 -r1.1620 src/distrib/sets/lists/comp/mi
cvs rdiff -u -r1.64 -r1.65 src/sys/dist/pf/net/pf.c
cvs rdiff -u -r1.19 -r1.20 src/sys/netinet/Makefile
cvs rdiff -u -r1.21 -r1.22 src/sys/netinet/files.netinet
cvs rdiff -u -r1.137 -r1.138 src/sys/netinet/in_pcb.c
cvs rdiff -u -r1.47 -r1.48 src/sys/netinet/in_pcb.h
cvs rdiff -u -r1.5 -r1.6 src/sys/netinet/in_pcb_hdr.h
cvs rdiff -u -r1.311 -r1.312 src/sys/netinet/tcp_input.c
cvs rdiff -u -r1.240 -r1.241 src/sys/netinet/tcp_subr.c
cvs rdiff -u -r1.158 -r1.159 src/sys/netinet/tcp_usrreq.c
cvs rdiff -u -r1.165 -r1.166 src/sys/netinet/tcp_var.h
cvs rdiff -u -r0 -r1.1 src/sys/netinet/tcp_vtw.c src/sys/netinet/tcp_vtw.h
cvs rdiff -u -r1.179 -r1.180 src/sys/netinet/udp_usrreq.c
cvs rdiff -u -r1.112 -r1.113 src/sys/netinet6/in6_pcb.c
cvs rdiff -u -r1.34 -r1.35 src/sys/netinet6/in6_pcb.h
cvs rdiff -u -r1.49 -r1.50 src/sys/netinet6/in6_src.c
cvs rdiff -u -r1.129 -r1.130 src/sys/netinet6/ip6_input.c
cvs rdiff -u -r1.107 -r1.108 src/sys/netinet6/raw_ip6.c
cvs rdiff -u -r1.88 -r1.89 src/sys/netinet6/udp6_usrreq.c
cvs rdiff -u -r1.7 -r1.8 src/sys/rump/net/lib/libnetinet/Makefile.inc
cvs rdiff -u -r1.33 -r1.34 src/usr.bin/netstat/Makefile
cvs rdiff -u -r1.95 -r1.96 src/usr.bin/netstat/inet.c
cvs rdiff -u -r1.53 -r1.54 src/usr.bin/netstat/inet6.c
cvs rdiff -u -r1.77 -r1.78 src/usr.bin/netstat/main.c
cvs rdiff -u -r1.41 -r1.42 src/usr.bin/netstat/netstat.h
cvs rdiff -u -r0 -r1.1 src/usr.bin/netstat/vtw.c src/usr.bin/netstat/vtw.h

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.