On Mon, Jul 04, 2011 at 09:01:12PM +0100, Stuart Henderson wrote:
> On 2011/07/04 19:02, Claudio Jeker wrote:
> > On Mon, Jul 04, 2011 at 12:32:51PM -0400, David Hill wrote:
> > > On Mon, Jul 04, 2011 at 09:47:36AM +0100, Stuart Henderson wrote:
> > > :let's try again with this. regenerated diffs because the only
> > > :complaints were about whitespace issues, no real changes compared
> > > :to the diffs I sent out previously.
> > > :
> > > :- add sysctl net.inet.tcp.always_keepalive to act as if
> > > :SO_KEEPALIVE was set on all TCP connections.
> > >
> > > I see good/bad with it. Fixes using apps behind crappy apps, but
> > > doesn't fix crappy apps? I have no opinion on this.
> > >
> >
> > Same here. Sometimes I would like such a knob when using crappy NAT boxes.
> > On the other hand this only matters for ssh an there you could use the ssh
> > specific knob.
>
> Thanks for reading..
>
> It matters for ftp(1) (this is why espie needed the NOOP hack), and some
> chat protocols if they can only connect behind an aggresively timing out
> NAT. (I suspect we will see more not less of these in future). And while
> there might be some occasions we always want a certain protocol to use
> keepalives (expiring dead connections in ftpd or sshd), whether we really
> need it or not it is more commonly a feature of the network we're on.
> i.e. crappy hotel nat -> we typically want it no matter what software
> we're using over the connection.
>
> > > :- fix old bug where net.inet.tcp.keepintvl was ignored,
> > > :keepalives were instead sent at keepidle/slow_hz seconds.
> > >
> > > I believe this is correct. Tested it. OK, but get more OKs :)
> > >
> >
> > I belive it is wrong.
> >
> > * The TCPT_KEEP timer is used to keep connections alive. If an
> > * connection is idle (no segments received) for TCPTV_KEEP_INIT amount of time,
> > * but not yet established, then we drop the connection.
>
> This part works correctly, attempt to connect to a TCP port which
> does not respond to SYNs and your connection attempt times out after
> TCPTV_KEEP_INIT/slowhz seconds.
>
> > Once the connection
> > * is established, if the connection is idle for TCPTV_KEEP_IDLE time
> > * (and keepalives have been enabled on the socket), we begin to probe
> > * the connection.
> >
> > So when a TCP packet is received we arm the timer with tcp_keepidle and
> > only when that timer fires we switch to tcp_keepintvl. From my quick look
> > this is what the code does and you would change that to always use
> > tcp_keepintvl when keepalive probes are enabled.
>
> But this is broken. With a snapshot kernel and nc hacked to set
> SO_KEEPALIVE, using the following:
>
> $ sysctl net.inet.tcp|grep keep
> net.inet.tcp.keepinittime=150
> net.inet.tcp.keepidle=28
> net.inet.tcp.keepintvl=7
>
> I see this,
>
> tcpdump: listening on trunk0, link-type EN10MB
> 20:42:27.755730 10.15.1.46.30261 > 85.158.44.150.21: S
> 1388456603:1388456603(0) win 16384 <mss 1460,nop,nop,sackOK,nop,wscale
> 3,nop,nop,timestamp 1760787434 0> (DF)
> 20:42:27.756867 85.158.44.150.21 > 10.15.1.46.30261: S
> 2818783366:2818783366(0) ack 1388456604 win 16384 <mss
> 1460,nop,nop,sackOK,nop,wscale 3,nop,nop,timestamp 1869746683 1760787434> (DF)
> 20:42:27.756963 10.15.1.46.30261 > 85.158.44.150.21: . ack 1 win 2048
> <nop,nop,timestamp 1760787434 1869746683> (DF)
> 20:42:27.803695 85.158.44.150.21 > 10.15.1.46.30261: P 1:269(268) ack 1 win
> 2172 <nop,nop,timestamp 1869746683 1760787434> (DF)
> 20:42:27.999104 10.15.1.46.30261 > 85.158.44.150.21: . ack 269 win 2048
> <nop,nop,timestamp 1760787435 1869746683> (DF)
> 20:42:41.798956 10.15.1.46.30261 > 85.158.44.150.21: . ack 269 win 2048 (DF)
> 20:42:41.799947 85.158.44.150.21 > 10.15.1.46.30261: . ack 1 win 2172
> <nop,nop,timestamp 1869746711 1760787435> (DF)
> 20:42:55.798809 10.15.1.46.30261 > 85.158.44.150.21: . ack 269 win 2048 (DF)
> 20:42:55.799832 85.158.44.150.21 > 10.15.1.46.30261: . ack 1 win 2172
> <nop,nop,timestamp 1869746739 1760787435> (DF)
> 20:43:09.798620 10.15.1.46.30261 > 85.158.44.150.21: . ack 269 win 2048 (DF)
> 20:43:09.799644 85.158.44.150.21 > 10.15.1.46.30261: . ack 1 win 2172
> <nop,nop,timestamp 1869746767 1760787435> (DF)
>
> So we can see keepalives every 28/2 seconds i.e. keepidle interval.
> (Actually what happens is that the timer is armed for keepintvl and then
> it's rearmed with keepidle). So the current behaviour is definitely wrong.
>

But this is correct: because we received a response to the TCP keepalive,
we no longer consider the TCP session idle, so the timer is rearmed with
keepidle. keepintvl only comes into play when the other side does not
respond. At least that is how I think keepalives are supposed to work.

-- 
:wq Claudio
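
To make the two cases concrete, here is a rough sketch of the timer
behaviour as described above. It is not the kernel code, just a toy model:
the function name probe_times and its parameters are made up for
illustration, and it assumes the slow timer runs at 2 ticks per second
(the "28/2 seconds" from the thread). It also ignores the limit on the
number of unanswered probes before the connection is dropped.

```python
SLOW_HZ = 2  # assumption: slow timer ticks per second, as in "28/2 seconds"


def probe_times(keepidle, keepintvl, peer_responds, max_probes):
    """Return the times, in seconds after the last received segment, at
    which keepalive probes would be sent.

    Hypothetical model: any received segment, including the reply to a
    probe, re-arms the timer with keepidle; an unanswered probe is
    followed by the next probe after keepintvl.
    """
    idle_s = keepidle / SLOW_HZ
    intvl_s = keepintvl / SLOW_HZ
    times = []
    now = idle_s  # first probe fires after keepidle of silence
    for _ in range(max_probes):
        times.append(now)
        # Reply received: session is not idle anymore, wait keepidle again.
        # No reply: keep probing at the shorter keepintvl.
        now += idle_s if peer_responds else intvl_s
    return times


# Values from the sysctl output quoted in the thread.
print(probe_times(28, 7, peer_responds=True, max_probes=3))
print(probe_times(28, 7, peer_responds=False, max_probes=3))
```

With a responding peer the model gives probes 14 seconds apart, which is
the spacing visible in the tcpdump output. With a silent peer the probes
after the first one come 3.5 seconds apart.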
