FreeBSD 9.0 generates incorrect SEQ/ACK numbers under load
Hello.

I've run into a problem with a web server running FreeBSD 9.0/amd64. What I believe is happening is that the server loses track of the correct SEQ/ACK numbers on some connections. Here is an example:

15:20:00.347514 IP (tos 0x68, ttl 123, id 1181, offset 0, flags [DF], proto TCP (6), length 52)
    93.72.14.220.49239 > 193.178.147.113.80: Flags [S], cksum 0x6995 (correct), seq 3881466934, win 8192, options [mss 1460,nop,wscale 2,nop,nop,sackOK], length 0
15:20:00.347526 IP (tos 0x10, ttl 254, id 28065, offset 0, flags [DF], proto TCP (6), length 44)
    193.178.147.113.80 > 93.72.14.220.49239: Flags [S.], cksum 0x79fa (correct), seq 2151790680, ack 3881466935, win 0, options [mss 1460], length 0
15:20:00.361812 IP (tos 0x68, ttl 123, id 1183, offset 0, flags [DF], proto TCP (6), length 40)
    93.72.14.220.49239 > 193.178.147.113.80: Flags [.], cksum 0x96c6 (correct), seq 3881466935, ack 2151790681, win 64240, length 0
15:20:00.361869 IP (tos 0x10, ttl 254, id 31305, offset 0, flags [DF], proto TCP (6), length 40)
    193.178.147.113.80 > 93.72.14.220.49239: Flags [.], cksum 0x71b7 (correct), seq 2151790681, ack 3881466935, win 8192, length 0

Client sends "GET" request

15:20:48.236181 IP (tos 0x68, ttl 123, id 1353, offset 0, flags [DF], proto TCP (6), length 626)
    93.72.14.220.49239 > 193.178.147.113.80: Flags [P.], cksum 0x7fc9 (correct), seq 3881466935:3881467521, ack 2151790681, win 64240, length 586

and then the "ping-pong" starts:

15:20:48.236198 IP (tos 0x0, ttl 254, id 63530, offset 0, flags [DF], proto TCP (6), length 40)
    193.178.147.113.80 > 93.72.14.220.49239: Flags [.], cksum 0x8a97 (correct), seq 2991748588, ack 1985077892, win 8760, length 0
15:20:48.255998 IP (tos 0x68, ttl 123, id 1357, offset 0, flags [DF], proto TCP (6), length 40)
    93.72.14.220.49239 > 193.178.147.113.80: Flags [.], cksum 0x947c (correct), seq 3881467521, ack 2151790681, win 64240, length 0
15:20:48.256015 IP (tos 0x0, ttl 254, id 53518, offset 0, flags [DF], proto TCP (6), length 40)
    193.178.147.113.80 > 93.72.14.220.49239: Flags [.], cksum 0x8a97 (correct), seq 2991748588, ack 1985077892, win 8760, length 0
15:20:48.276084 IP (tos 0x68, ttl 123, id 1360, offset 0, flags [DF], proto TCP (6), length 40)
    93.72.14.220.49239 > 193.178.147.113.80: Flags [.], cksum 0x947c (correct), seq 3881467521, ack 2151790681, win 64240, length 0
15:20:48.276099 IP (tos 0x0, ttl 254, id 42983, offset 0, flags [DF], proto TCP (6), length 40)
    193.178.147.113.80 > 93.72.14.220.49239: Flags [.], cksum 0x8a97 (correct), seq 2991748588, ack 1985077892, win 8760, length 0
15:20:48.290914 IP (tos 0x68, ttl 123, id 1361, offset 0, flags [DF], proto TCP (6), length 40)
    93.72.14.220.49239 > 193.178.147.113.80: Flags [.], cksum 0x947c (correct), seq 3881467521, ack 2151790681, win 64240, length 0

This happens on about 0.01% of connections. This tcpdump was recorded on 193.178.147.113 itself, before the traffic hits the wire, so it's not a NIC fault. The server is running nginx and serving static content at 200-500 requests per second.

Any ideas?

--
Sergey Smitienko
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
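One way to read the "ping-pong" in the trace above: each side keeps receiving a segment whose sequence number lies outside its receive window and answers with a bare ACK, which is what RFC 793 prescribes for unacceptable segments. Below is a minimal sketch of that window test for zero-length segments. The function name seq_in_window is hypothetical and this is not the FreeBSD kernel's code; it only illustrates the modular comparison, using numbers taken from the capture.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * RFC 793 acceptability test for a zero-length segment when the
 * receive window is non-zero: RCV.NXT <= SEG.SEQ < RCV.NXT + RCV.WND.
 * Unsigned subtraction wraps modulo 2^32, which takes care of
 * sequence number wraparound. Hypothetical helper, illustration only;
 * the zero-window case (SEG.SEQ must equal RCV.NXT) is not covered.
 */
static bool
seq_in_window(uint32_t seg_seq, uint32_t rcv_nxt, uint32_t rcv_wnd)
{
    return (uint32_t)(seg_seq - rcv_nxt) < rcv_wnd;
}
```

With the client's state from the trace (next expected seq 2151790681, window 64240), the server's handshake ACK passes the test, while the later server segments with seq 2991748588 fail it, so the client answers each one with an ACK carrying the numbers it still expects.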
Re: FreeBSD 9.0 generates incorrect SEQ/ACK numbers under load
Here you go, two sessions, one with win set in Syn/Ack packet and other with separate "windows open" Ack packet.

16:59:47.629750 IP (tos 0x0, ttl 123, id 55648, offset 0, flags [DF], proto TCP (6), length 48)
    195.64.148.12.61153 > 193.178.147.113.80: Flags [S], cksum 0x5721 (correct), seq 770400880, win 65535, options [mss 1460,nop,nop,sackOK], length 0
16:59:47.629774 IP (tos 0x0, ttl 254, id 43755, offset 0, flags [DF], proto TCP (6), length 48)
    193.178.147.113.80 > 195.64.148.12.61153: Flags [S.], cksum 0xeaa3 (correct), seq 2323563246, ack 770400881, win 8192, options [mss 1460,sackOK,eol], length 0
16:59:47.631873 IP (tos 0x0, ttl 123, id 40733, offset 0, flags [DF], proto TCP (6), length 40)
    195.64.148.12.61153 > 193.178.147.113.80: Flags [.], cksum 0x3667 (correct), ack 2323563247, win 65535, length 0
16:59:47.633613 IP (tos 0x0, ttl 123, id 36942, offset 0, flags [DF], proto TCP (6), length 840)
    195.64.148.12.61153 > 193.178.147.113.80: Flags [P.], cksum 0xcfb6 (correct), seq 770400881:770401681, ack 2323563247, win 65535, length 800
16:59:47.633710 IP (tos 0x0, ttl 254, id 22412, offset 0, flags [DF], proto TCP (6), length 1500)
    193.178.147.113.80 > 195.64.148.12.61153: Flags [.], seq 2323563247:2323564707, ack 770401681, win 8760, length 1460
16:59:47.633721 IP (tos 0x0, ttl 254, id 22395, offset 0, flags [DF], proto TCP (6), length 340)
    193.178.147.113.80 > 195.64.148.12.61153: Flags [P.], cksum 0x8323 (correct), seq 2323564707:2323565007, ack 770401681, win 8760, length 300
16:59:47.633745 IP (tos 0x0, ttl 254, id 17184, offset 0, flags [DF], proto TCP (6), length 40)
    193.178.147.113.80 > 195.64.148.12.61153: Flags [F.], cksum 0x0a2e (correct), seq 2323565007, ack 770401681, win 8760, length 0
16:59:47.636215 IP (tos 0x0, ttl 123, id 65415, offset 0, flags [DF], proto TCP (6), length 40)
    195.64.148.12.61153 > 193.178.147.113.80: Flags [.], cksum 0x2c67 (correct), ack 2323565007, win 65535, length 0
16:59:47.636607 IP (tos 0x0, ttl 123, id 48103, offset 0, flags [DF], proto TCP (6), length 40)
    195.64.148.12.61153 > 193.178.147.113.80: Flags [.], cksum 0x2c66 (correct), ack 2323565008, win 65535, length 0
16:59:47.636841 IP (tos 0x0, ttl 123, id 39732, offset 0, flags [DF], proto TCP (6), length 40)
    195.64.148.12.61153 > 193.178.147.113.80: Flags [F.], cksum 0x2c65 (correct), seq 770401681, ack 2323565008, win 65535, length 0
16:59:47.636855 IP (tos 0x0, ttl 254, id 37717, offset 0, flags [DF], proto TCP (6), length 40)
    193.178.147.113.80 > 195.64.148.12.61153: Flags [.], cksum 0x0a2e (correct), ack 770401682, win 8759, length 0

17:01:58.437891 IP (tos 0x0, ttl 121, id 23760, offset 0, flags [DF], proto TCP (6), length 48)
    92.231.64.37.61153 > 193.178.147.113.80: Flags [S], cksum 0x5c46 (correct), seq 3652856772, win 16384, options [mss 1452,nop,nop,sackOK], length 0
17:01:58.437907 IP (tos 0x10, ttl 254, id 61730, offset 0, flags [DF], proto TCP (6), length 44)
    193.178.147.113.80 > 92.231.64.37.61153: Flags [S.], cksum 0x2c06 (correct), seq 3164719252, ack 3652856773, win 0, options [mss 1452], length 0
17:01:58.514354 IP (tos 0x0, ttl 121, id 23780, offset 0, flags [DF], proto TCP (6), length 40)
    92.231.64.37.61153 > 193.178.147.113.80: Flags [.], cksum 0xffaa (correct), ack 3164719253, win 17424, length 0
17:01:58.514412 IP (tos 0x10, ttl 254, id 17560, offset 0, flags [DF], proto TCP (6), length 40)
    193.178.147.113.80 > 92.231.64.37.61153: Flags [.], cksum 0x23bb (correct), ack 3652856773, win 8192, length 0
17:01:58.605052 IP (tos 0x0, ttl 121, id 23789, offset 0, flags [DF], proto TCP (6), length 690)
    92.231.64.37.61153 > 193.178.147.113.80: Flags [P.], cksum 0x6b1c (correct), seq 3652856773:3652857423, ack 3164719253, win 17424, length 650
17:01:58.605123 IP (tos 0x0, ttl 254, id 54275, offset 0, flags [DF], proto TCP (6), length 1492)
    193.178.147.113.80 > 92.231.64.37.61153: Flags [.], seq 3164719253:3164720705, ack 3652857423, win 8712, length 1452
17:01:58.605142 IP (tos 0x0, ttl 254, id 28400, offset 0, flags [DF], proto TCP (6), length 346)
    193.178.147.113.80 > 92.231.64.37.61153: Flags [P.], cksum 0x6b55 (correct), seq 3164720705:3164721011, ack 3652857423, win 8712, length 306
17:01:58.605162 IP (tos 0x0, ttl 254, id 4658, offset 0, flags [DF], proto TCP (6), length 40)
    193.178.147.113.80 > 92.231.64.37.61153: Flags [F.], cksum 0x184a (correct), seq 3164721011, ack 3652857423, win 8712, length 0
17:01:58.67 IP (tos 0x0, ttl 121, id 23803, offset 0, flags [DF], proto TCP (6), length 40)
    92.231.64.37.61153 > 193.178.147.113.80: Flags [.], cksum 0xf642 (correct), ack 3164721011, win 17424, length 0
17:01:58.680737 IP (tos 0x0, ttl 121, id 23804, offset 0, flags [DF], proto TCP (6), length 40)
    92.231.64.37.61153 > 193.178.147.113.80: Flags [.], cksum 0xf641 (correct), ack 3164721012, win 17424, length 0
17:01:58.682290 IP (tos 0x0, ttl 121, id 23806, offset 0, flags [DF], proto TCP (6), length 40
Re: FreeBSD 9.0 generates incorrect SEQ/ACK numbers under load
30.03.12 18:13, Andre Oppermann wrote:
> On 30.03.2012 15:04, Sergey Smitienko wrote:
>> Here you go, two sessions, one with win set in Syn/Ack packet and other
>> with separate "windows open" Ack packet.
>
> Thanks for the tcpdumps. The window update issue seems to be separate
> from the seq#ack# problem.

No, it's not. You gave me an idea. I have pf running on the server with a basic ruleset. We have a table with 4k+ networks of our usual visitors. The pf rules look like this:

pass in quick from to port 80 keep state
pass in quick from any to port 80 synproxy state

So, in the synproxy case, pf answers the Syn packet with a Syn/Ack without knowing the window size, then passes the connection to the kernel TCP stack and generates the "window open" Ack packet. I've replaced "synproxy state" with the usual "keep state" in pf, and I no longer see any Syn/Ack packets with zero window size.

On the other side, I have 20Gb of tcpdump files with 10^8 packets recorded. I wrote a simple parser that detects sessions with incorrect seq/ack numbers. Then I checked all IP addresses with failed TCP sessions, and none of them was from set. So 100% of the failed sessions came through pf synproxy state. Synproxy state also includes the modulate state function, which is basically the addition of a random number to the seq/ack numbers. So I think there is a case when TCP traffic coming from the kernel is not properly modulated/demodulated by pf, and this causes the generation of incorrect seq/ack numbers.

> Why do set the recvspace to the very low value of 8192?

8K is big enough for a usual GET request or for a POST with a login or comment.

--
Sergey Smitienko
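The modulation hypothesis above can be checked against the first trace in this thread. The following is a sketch only, assuming that modulation is plain addition of a fixed per-direction offset modulo 2^32, as described above; it is not pf's actual source, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Apply a per-direction offset; uint32_t arithmetic wraps modulo 2^32. */
static uint32_t
seq_modulate(uint32_t seq, uint32_t offset)
{
    return seq + offset;
}

/* Undo the offset applied by seq_modulate(). */
static uint32_t
seq_demodulate(uint32_t seq, uint32_t offset)
{
    return seq - offset;
}

/* Offset that would turn the expected number into the observed one. */
static uint32_t
seq_offset(uint32_t expected, uint32_t observed)
{
    return observed - expected;
}
```

In the first trace the client expected seq 2151790681 and ack 3881467521 (3881466935 plus the 586-byte GET) from the server, but saw seq 2991748588 and ack 1985077892. seq_offset() gives 839957907 for the seq field and 2398577667 for the ack field. Two different constants, one per direction, would fit a packet that left the host with the translation step skipped. That is consistent with the hypothesis but does not prove it.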
pf with divert is not working properly. FreeBSD 9.1
Hello.

I think I found a bug in divert socket processing in pf, or maybe I'm missing something. Here is my setup: machine 192.168.250.103 is a DNS server, and UDP traffic coming to port 53 gets diverted to a test application. The test application is very simple: it just prints some info about the packet and reinjects it back into the kernel.

When the diversion is done by IPFW, all works as expected. The IPFW rule is "divert 1025 udp from any to 192.168.250.103 dst-port 53".

When I divert packets using the pf rule "pass in log quick on em0 inet proto udp from any to 192.168.250.103 port 53 divert-to 127.0.0.1 port 1025", I start getting a loop of the same packet coming back from the divert socket again and again. If I change my sendto() call to n = sendto(fd, packet, n, 0, (struct sockaddr*) &org, sizeof(org));, the packet reaches the DNS server, but then I get the DNS reply in my divert socket and the reply gets looped all over again. I've also tried the sample code from the OpenBSD divert man page, and I get the same loop once again.
Here is my test code:

/* Headers required by the calls used below. */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/in_systm.h>
#include <netinet/ip.h>
#include <netinet/udp.h>
#include <arpa/inet.h>
#include <err.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void
run(int port)
{
    int fd;
    struct sockaddr_in sin;
    struct sockaddr_in org;
    socklen_t len;
    int n;
    struct ip *ip;
    struct udphdr *udp;
    char *packet;

    packet = malloc(65536);
    if (packet == NULL) {
        warn("malloc()");
        exit(1);
    }
    ip = (struct ip *)packet;

    fd = socket(PF_INET, SOCK_RAW, IPPROTO_DIVERT);
    if (fd < 0) {
        warn("socket(divert)");
        exit(1);
    }

    /* Bind the divert socket to 127.0.0.1:port. */
    memset(&sin, 0, sizeof(sin));
    sin.sin_len = sizeof(struct sockaddr_in);
    sin.sin_family = AF_INET;
    sin.sin_port = htons(port);
    sin.sin_addr.s_addr = inet_addr("127.0.0.1");
    len = sizeof(struct sockaddr_in);
    if (bind(fd, (struct sockaddr *)&sin, len) < 0) {
        warn("binding");
        exit(1);
    }

    while (1) {
        /* Address the socket is bound to, printed next to each packet. */
        len = sizeof(struct sockaddr_in);
        if (getsockname(fd, (struct sockaddr *)&org, &len) < 0) {
            warn("getsockname");
            continue;
        }

        memset(packet, 0, 65536);
        memset(&sin, 0, sizeof(sin));
        len = sizeof(sin);
        n = recvfrom(fd, packet, 65536, 0, (struct sockaddr *)&sin, &len);
        if (n < 0) {
            warn("recvfrom");
            continue;
        }
        if ((size_t)n < sizeof(struct ip))
            continue;

        printf("Got %d bytes from %s:%d | ", n,
            inet_ntoa(sin.sin_addr), ntohs(sin.sin_port));
        printf("%s:%d\n", inet_ntoa(org.sin_addr), ntohs(org.sin_port));
        printf("%s -> ", inet_ntoa(ip->ip_src));
        printf("%s ", inet_ntoa(ip->ip_dst));
        printf("TTL %d, PROTO %d, hlen %d, CSUM %x\n",
            ip->ip_ttl, ip->ip_p, ip->ip_hl, ip->ip_sum);

        udp = (struct udphdr *)(packet + ip->ip_hl * 4);
        printf("UDP src_port %d, dst_port %d\n",
            ntohs(udp->uh_sport), ntohs(udp->uh_dport));

        /* Reinject the packet using the address recvfrom() returned. */
        n = sendto(fd, packet, n, 0, (struct sockaddr *)&sin, sizeof(sin));
        if (n < 0) {
            warn("sendto");
        }
    }
}

int
main(void)
{
    run(1025);
    return 0;
}

--
Sergey Smitienko