FreeBSD 9.0 generates incorrect SEQ/ACK numbers under load

2012-03-29 Thread Sergey Smitienko
Hello.

I've run into a problem with a web server running FreeBSD 9.0/amd64. What
I believe is happening is that the server loses track of the correct SEQ/ACK
numbers on some connections. Here is an example:

15:20:00.347514 IP (tos 0x68, ttl 123, id 1181, offset 0, flags [DF],
proto TCP (6), length 52)
93.72.14.220.49239 > 193.178.147.113.80: Flags [S], cksum 0x6995
(correct), seq 3881466934, win 8192, options [mss 1460,nop,wscale
2,nop,nop,sackOK], length 0
15:20:00.347526 IP (tos 0x10, ttl 254, id 28065, offset 0, flags [DF],
proto TCP (6), length 44)
193.178.147.113.80 > 93.72.14.220.49239: Flags [S.], cksum 0x79fa
(correct), seq 2151790680, ack 3881466935, win 0, options [mss 1460],
length 0
15:20:00.361812 IP (tos 0x68, ttl 123, id 1183, offset 0, flags [DF],
proto TCP (6), length 40)
93.72.14.220.49239 > 193.178.147.113.80: Flags [.], cksum 0x96c6
(correct), seq 3881466935, ack 2151790681, win 64240, length 0
15:20:00.361869 IP (tos 0x10, ttl 254, id 31305, offset 0, flags [DF],
proto TCP (6), length 40)
193.178.147.113.80 > 93.72.14.220.49239: Flags [.], cksum 0x71b7
(correct), seq 2151790681, ack 3881466935, win 8192, length 0

The client sends a "GET" request:
15:20:48.236181 IP (tos 0x68, ttl 123, id 1353, offset 0, flags [DF],
proto TCP (6), length 626)
93.72.14.220.49239 > 193.178.147.113.80: Flags [P.], cksum 0x7fc9
(correct), seq 3881466935:3881467521, ack 2151790681, win 64240, length 586

and then the "ping-pong" starts:

15:20:48.236198 IP (tos 0x0, ttl 254, id 63530, offset 0, flags [DF],
proto TCP (6), length 40)
193.178.147.113.80 > 93.72.14.220.49239: Flags [.], cksum 0x8a97
(correct), seq 2991748588, ack 1985077892, win 8760, length 0
15:20:48.255998 IP (tos 0x68, ttl 123, id 1357, offset 0, flags [DF],
proto TCP (6), length 40)
93.72.14.220.49239 > 193.178.147.113.80: Flags [.], cksum 0x947c
(correct), seq 3881467521, ack 2151790681, win 64240, length 0
15:20:48.256015 IP (tos 0x0, ttl 254, id 53518, offset 0, flags [DF],
proto TCP (6), length 40)
193.178.147.113.80 > 93.72.14.220.49239: Flags [.], cksum 0x8a97
(correct), seq 2991748588, ack 1985077892, win 8760, length 0
15:20:48.276084 IP (tos 0x68, ttl 123, id 1360, offset 0, flags [DF],
proto TCP (6), length 40)
93.72.14.220.49239 > 193.178.147.113.80: Flags [.], cksum 0x947c
(correct), seq 3881467521, ack 2151790681, win 64240, length 0
15:20:48.276099 IP (tos 0x0, ttl 254, id 42983, offset 0, flags [DF],
proto TCP (6), length 40)
193.178.147.113.80 > 93.72.14.220.49239: Flags [.], cksum 0x8a97
(correct), seq 2991748588, ack 1985077892, win 8760, length 0
15:20:48.290914 IP (tos 0x68, ttl 123, id 1361, offset 0, flags [DF],
proto TCP (6), length 40)
93.72.14.220.49239 > 193.178.147.113.80: Flags [.], cksum 0x947c
(correct), seq 3881467521, ack 2151790681, win 64240, length 0

This happens on about 0.01% of connections. This tcpdump was recorded on
193.178.147.113 itself, before the traffic hits the wire, so it's not a NIC
fault. The server is running nginx and serving static content at 200-500
requests per second.

Any ideas?

-- 
Sergey Smitienko

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: FreeBSD 9.0 generates incorrect SEQ/ACK numbers under load

2012-03-30 Thread Sergey Smitienko
Here you go, two sessions: one with the window set in the Syn/Ack packet and
the other with a separate "window open" Ack packet.


16:59:47.629750 IP (tos 0x0, ttl 123, id 55648, offset 0, flags [DF],
proto TCP (6), length 48)
195.64.148.12.61153 > 193.178.147.113.80: Flags [S], cksum 0x5721
(correct), seq 770400880, win 65535, options [mss 1460,nop,nop,sackOK],
length 0
16:59:47.629774 IP (tos 0x0, ttl 254, id 43755, offset 0, flags [DF],
proto TCP (6), length 48)
193.178.147.113.80 > 195.64.148.12.61153: Flags [S.], cksum 0xeaa3
(correct), seq 2323563246, ack 770400881, win 8192, options [mss
1460,sackOK,eol], length 0
16:59:47.631873 IP (tos 0x0, ttl 123, id 40733, offset 0, flags [DF],
proto TCP (6), length 40)
195.64.148.12.61153 > 193.178.147.113.80: Flags [.], cksum 0x3667
(correct), ack 2323563247, win 65535, length 0
16:59:47.633613 IP (tos 0x0, ttl 123, id 36942, offset 0, flags [DF],
proto TCP (6), length 840)
195.64.148.12.61153 > 193.178.147.113.80: Flags [P.], cksum 0xcfb6
(correct), seq 770400881:770401681, ack 2323563247, win 65535, length 800
16:59:47.633710 IP (tos 0x0, ttl 254, id 22412, offset 0, flags [DF],
proto TCP (6), length 1500)
193.178.147.113.80 > 195.64.148.12.61153: Flags [.], seq
2323563247:2323564707, ack 770401681, win 8760, length 1460
16:59:47.633721 IP (tos 0x0, ttl 254, id 22395, offset 0, flags [DF],
proto TCP (6), length 340)
193.178.147.113.80 > 195.64.148.12.61153: Flags [P.], cksum 0x8323
(correct), seq 2323564707:2323565007, ack 770401681, win 8760, length 300
16:59:47.633745 IP (tos 0x0, ttl 254, id 17184, offset 0, flags [DF],
proto TCP (6), length 40)
193.178.147.113.80 > 195.64.148.12.61153: Flags [F.], cksum 0x0a2e
(correct), seq 2323565007, ack 770401681, win 8760, length 0
16:59:47.636215 IP (tos 0x0, ttl 123, id 65415, offset 0, flags [DF],
proto TCP (6), length 40)
195.64.148.12.61153 > 193.178.147.113.80: Flags [.], cksum 0x2c67
(correct), ack 2323565007, win 65535, length 0
16:59:47.636607 IP (tos 0x0, ttl 123, id 48103, offset 0, flags [DF],
proto TCP (6), length 40)
195.64.148.12.61153 > 193.178.147.113.80: Flags [.], cksum 0x2c66
(correct), ack 2323565008, win 65535, length 0
16:59:47.636841 IP (tos 0x0, ttl 123, id 39732, offset 0, flags [DF],
proto TCP (6), length 40)
195.64.148.12.61153 > 193.178.147.113.80: Flags [F.], cksum 0x2c65
(correct), seq 770401681, ack 2323565008, win 65535, length 0
16:59:47.636855 IP (tos 0x0, ttl 254, id 37717, offset 0, flags [DF],
proto TCP (6), length 40)
193.178.147.113.80 > 195.64.148.12.61153: Flags [.], cksum 0x0a2e
(correct), ack 770401682, win 8759, length 0


17:01:58.437891 IP (tos 0x0, ttl 121, id 23760, offset 0, flags [DF],
proto TCP (6), length 48)
92.231.64.37.61153 > 193.178.147.113.80: Flags [S], cksum 0x5c46
(correct), seq 3652856772, win 16384, options [mss 1452,nop,nop,sackOK],
length 0
17:01:58.437907 IP (tos 0x10, ttl 254, id 61730, offset 0, flags [DF],
proto TCP (6), length 44)
193.178.147.113.80 > 92.231.64.37.61153: Flags [S.], cksum 0x2c06
(correct), seq 3164719252, ack 3652856773, win 0, options [mss 1452],
length 0
17:01:58.514354 IP (tos 0x0, ttl 121, id 23780, offset 0, flags [DF],
proto TCP (6), length 40)
92.231.64.37.61153 > 193.178.147.113.80: Flags [.], cksum 0xffaa
(correct), ack 3164719253, win 17424, length 0
17:01:58.514412 IP (tos 0x10, ttl 254, id 17560, offset 0, flags [DF],
proto TCP (6), length 40)
193.178.147.113.80 > 92.231.64.37.61153: Flags [.], cksum 0x23bb
(correct), ack 3652856773, win 8192, length 0
17:01:58.605052 IP (tos 0x0, ttl 121, id 23789, offset 0, flags [DF],
proto TCP (6), length 690)
92.231.64.37.61153 > 193.178.147.113.80: Flags [P.], cksum 0x6b1c
(correct), seq 3652856773:3652857423, ack 3164719253, win 17424, length 650
17:01:58.605123 IP (tos 0x0, ttl 254, id 54275, offset 0, flags [DF],
proto TCP (6), length 1492)
193.178.147.113.80 > 92.231.64.37.61153: Flags [.], seq
3164719253:3164720705, ack 3652857423, win 8712, length 1452
17:01:58.605142 IP (tos 0x0, ttl 254, id 28400, offset 0, flags [DF],
proto TCP (6), length 346)
193.178.147.113.80 > 92.231.64.37.61153: Flags [P.], cksum 0x6b55
(correct), seq 3164720705:3164721011, ack 3652857423, win 8712, length 306
17:01:58.605162 IP (tos 0x0, ttl 254, id 4658, offset 0, flags [DF],
proto TCP (6), length 40)
193.178.147.113.80 > 92.231.64.37.61153: Flags [F.], cksum 0x184a
(correct), seq 3164721011, ack 3652857423, win 8712, length 0
17:01:58.67 IP (tos 0x0, ttl 121, id 23803, offset 0, flags [DF],
proto TCP (6), length 40)
92.231.64.37.61153 > 193.178.147.113.80: Flags [.], cksum 0xf642
(correct), ack 3164721011, win 17424, length 0
17:01:58.680737 IP (tos 0x0, ttl 121, id 23804, offset 0, flags [DF],
proto TCP (6), length 40)
92.231.64.37.61153 > 193.178.147.113.80: Flags [.], cksum 0xf641
(correct), ack 3164721012, win 17424, length 0
17:01:58.682290 IP (tos 0x0, ttl 121, id 23806, offset 0, flags [DF],
proto TCP (6), length 40

Re: FreeBSD 9.0 generates incorrect SEQ/ACK numbers under load

2012-03-30 Thread Sergey Smitienko
30.03.12 18:13, Andre Oppermann wrote:
> On 30.03.2012 15:04, Sergey Smitienko wrote:
>> Here you go, two sessions, one with win set in Syn/Ack packet and other
>> with separate "windows open" Ack packet.
>
> Thanks for the tcpdumps.  The window update issue seems to be separate
> from the seq#ack# problem.
No, it's not. You gave me an idea.

I have pf running on the server. It has a basic ruleset.
We have a table with 4k+ networks of our usual visitors.
The pf rules look like this:

pass in quick from  to  port 80 keep state
pass in quick from any to  port 80 synproxy state.

So, in the synproxy case, pf answers the Syn packet with a Syn/Ack without
knowing the window size, then passes the connection to the kernel TCP stack
and generates a "window open" Ack packet.
I've replaced "synproxy state" with the usual "keep state" in pf, and I no
longer see any Syn/Ack packets
with zero window size.

On the other side, I have 20 GB of tcpdump files with 10^8 packets recorded.
I wrote a simple parser that detects sessions with incorrect seq/ack
numbers. Then I checked all IP addresses with failed TCP sessions, and none
of them was from that set of networks.
So 100% of the failed sessions were coming through pf synproxy state.
Synproxy state also includes the modulate state function, which is basically
the addition of a random number to the seq/ack numbers.
So I think there is a case where TCP traffic coming from the kernel is not
properly modulated/demodulated by pf, and this causes the incorrect seq/ack
numbers.

> Why do set the recvspace to the very low value of 8192?
8K is big enough for usual GET request or for POST with login or comment.

-- 
Sergey Smitienko




pf with divert is not working properly. FreeBSD 9.1

2013-01-16 Thread Sergey Smitienko
Hello.
I think I found a bug in divert socket processing in pf, or maybe I'm
missing something.
Here is my setup: machine 192.168.250.103 is a DNS server, and UDP
traffic coming to port 53 gets diverted to a test application. The test
application is very simple: it just prints some info about the packet and
reinjects it back into the kernel. When the diversion is done by IPFW, all
works as expected.
The IPFW rule is "divert 1025 udp from any to 192.168.250.103 dst-port 53".
When I divert packets using the pf rule "pass in log quick on em0 inet proto
udp from any to 192.168.250.103 port 53 divert-to 127.0.0.1 port 1025", I
start to get a loop of the same packet coming back from the divert socket
again and again. If I change my sendto() call to
n = sendto(fd, packet, n, 0, (struct sockaddr*) &org, sizeof(org));
the packet reaches the DNS server, but then I get the DNS reply in my divert
socket and the reply gets looped all over again.

I've also tried the sample code from the OpenBSD divert man page, and I get
the same loop once again.
Here is my test code:

#include <sys/types.h>
#include <sys/socket.h>

#include <netinet/in.h>
#include <netinet/in_systm.h>
#include <netinet/ip.h>
#include <netinet/udp.h>
#include <arpa/inet.h>

#include <err.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void run(int port)
{
  int fd, n;
  socklen_t len;
  struct sockaddr_in sin;   /* bind address, then source of each packet */
  struct sockaddr_in org;   /* local address of the divert socket */
  struct ip *ip;
  struct udphdr *udp;
  char *packet;

  packet = malloc(65536);
  if (packet == NULL) {
    warn("malloc()");
    exit(1);
  }
  ip = (struct ip *)packet;

  fd = socket(PF_INET, SOCK_RAW, IPPROTO_DIVERT);
  if (fd < 0) {
    warn("socket(divert)");
    exit(1);
  }

  /* Zero the address first so no field is left uninitialized. */
  memset(&sin, 0, sizeof(sin));
  sin.sin_len = sizeof(struct sockaddr_in);
  sin.sin_family = AF_INET;
  sin.sin_port = htons(port);
  sin.sin_addr.s_addr = inet_addr("127.0.0.1");

  if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
    warn("binding");
    exit(1);
  }

  while (1) {
    len = sizeof(org);
    if (getsockname(fd, (struct sockaddr *)&org, &len) < 0) {
      warn("getsockname");
      continue;
    }
    memset(packet, 0, 65536);
    memset(&sin, 0, sizeof(sin));
    len = sizeof(sin);
    n = recvfrom(fd, packet, 65536, 0, (struct sockaddr *)&sin, &len);
    if (n < 0) {
      warn("recvfrom");
      continue;
    }
    /* Skip packets too short to hold the IP and UDP headers. */
    if ((size_t)n < sizeof(struct ip) ||
        (size_t)n < ip->ip_hl * 4 + sizeof(struct udphdr))
      continue;

    /* inet_ntoa() uses a static buffer, so print one address per call. */
    printf("Got %d bytes from %s:%d | ", n, inet_ntoa(sin.sin_addr),
        ntohs(sin.sin_port));
    printf("%s:%d\n", inet_ntoa(org.sin_addr), ntohs(org.sin_port));
    printf("%s -> ", inet_ntoa(ip->ip_src));
    printf("%s ", inet_ntoa(ip->ip_dst));
    printf("TTL %d, PROTO %d, hlen %d, CSUM %x\n", ip->ip_ttl,
        ip->ip_p, ip->ip_hl, ip->ip_sum);

    udp = (struct udphdr *)(packet + ip->ip_hl * 4);
    printf("UDP src_port %d, dst_port %d\n", ntohs(udp->uh_sport),
        ntohs(udp->uh_dport));

    /* Reinject the packet using the address recvfrom() returned. */
    n = sendto(fd, packet, n, 0, (struct sockaddr *)&sin, sizeof(sin));
    if (n < 0)
      warn("sendto");
  }
}

int main(void)
{
  run(1025);
  return 0;
}

-- 
Sergey Smitienko
