On 04/16/2018 08:52 PM, Dominique Martinet wrote: > Thank you for the hints. Out of order reply. > > Florian Westphal wrote on Mon, Apr 16, 2018: >> echo 1 > /proc/sys/net/netfilter/nf_conntrack_tcp_be_liberal >> >> which stops conntrack from marking packets with out-of-window >> acks as invalid. > > That worked, the connection stays established with this set. > > >> echo 6 > /proc/sys/net/netfilter/nf_conntrack_log_invalid > > Took me a while to figure how to log that properly (why do distros not > ship ulogd?!) but that is helpful. > > I have attached the full logs to this mail as text attachments in case I > missed something in my excerpts > > > Specifically, there are many outbound invalid packets, but as Marcelo > pointed out there is no conntrack netfilter for outbound packets... > The first invalid incoming packet is that one (json output seemed to be > the most convenient as it has tcp seq/ackseq and the oob.prefix message, > feel free to ask for some other format) > > {"timestamp": "2018-04-17T10:29:14.956485", "dvc": "Netfilter", > "raw.pktlen": 52, "raw.pktcount": 1, "oob.prefix": "nf_ct_tcp: ACK is > over the upper bound (ACKed data not seen yet) ", "oob.time.sec": > 1523928554, "oob.time.usec": 956485, "oob.mark": 0, "oob.hook": 0, > "oob.family": 2, "oob.protocol": 2048, "raw.label": 0, "ip.protocol": 6, > "ip.tos": 0, "ip.ttl": 52, "ip.totlen": 52, "ip.ihl": 5, "ip.csum": > 50909, "ip.id": 30328, "ip.fragoff": 16384, "src_port": 30280, > "dest_port": 29543, "tcp.seq": 4048786673, "tcp.ackseq": 3024392506, > "tcp.window": 781, "tcp.offset": 0, "tcp.reserved": 0, "tcp.urg": 0, > "tcp.ack": 1, "tcp.psh": 0, "tcp.rst": 0, "tcp.syn": 0, "tcp.fin": 0, > "tcp.res1": 0, "tcp.res2": 0, "tcp.csum": 17752, "oob.in": "", > "oob.out": "", "src_ip": "client", "dest_ip": "server"} > > which is the first ack that isn't seen alright (sent packet, blocked > ack, replays): > 10:29:14.926115 IP server.29543 > client.30280: Flags [.], seq > 3024391132:3024392506, ack 4048786673, win 307, options [nop,nop,TS val > 681965809 ecr 1634528339], length 1374 > ... > 10:29:14.956485 IP client.30280 > server.29543: Flags [.], ack 3024392506, > win 781, options [nop,nop,TS val 1634528369 ecr 681965809], length 0 > ... > 10:29:15.255489 IP server.29543 > client.30280: Flags [.], seq > 3024391132:3024392506, ack 4048786673, win 307, options [nop,nop,TS val > 681966138 ecr 1634528369], length 1374 > 10:29:15.281581 IP client.30280 > server.29543: Flags [.], ack 3024436474, > win 1444, options [nop,nop,TS val 1634528697 ecr 681965904,nop,nop,sack 1 > {3024391132:3024392506}], length 0 > 10:29:15.555466 IP client.30280 > server.29543: Flags [P.], seq > 4048786673:4048786709, ack 3024436474, win 1444, options [nop,nop,TS val > 1634528869 ecr 681965904], length 36 > 10:29:15.719510 IP server.29543 > client.30280: Flags [.], seq > 3024391132:3024392506, ack 4048786673, win 307, options [nop,nop,TS val > 681966602 ecr 1634528369], length 1374 > > So now I guess the question is why netfilter think this is over the > upper bound. > ..And surely enough the answer is within the invalid outbound packets > that aren't considered in the conntrack state machine, the first of whih > is this one: > {"timestamp": "2018-04-17T10:29:14.926143", "dvc": "Netfilter", > "raw.pktlen": 2800, "raw.pktcount": 1, "oob.prefix": "nf_ct_tcp: SEQ is > over the upper bound (over the window of the receiver)", "oob.time.sec": > 1523928554, "oob.time.usec": 926143, "oob.mark": 0, "oob.hook": 0, > "oob.family": 2, "oob.protocol": 2048, "oob.uid": 1000, "oob.gid": 100, > "raw.label": 0, "ip.protocol": 6, "ip.tos": 8, "ip.ttl": 64, > "ip.totlen": 2800, "ip.ihl": 5, "ip.csum": 47600, "ip.id": 27809, > "ip.fragoff": 16384, "src_port": 29543, "dest_port": 30280, "tcp.seq": > 3024391132, "tcp.ackseq": 4048786673, "tcp.window": 307, "tcp.offset": > 0, "tcp.reserved": 0, "tcp.urg": 0, "tcp.ack": 1, "tcp.psh": 0, > "tcp.rst": 0, "tcp.syn": 0, "tcp.fin": 0, "tcp.res1": 0, "tcp.res2": 0, > "tcp.csum": 5201, "oob.in": "", "oob.out": "", "src_ip": "server", > "dest_ip": "client"} > > which sequence matches the start seq of the first packet we don't ack, > so that's why conntrack doesn't recongnize that as a valid ack. > > > Here's some context before that packet: > 10:29:14.921305 IP server.29543 > client.30280: Flags [.], seq > 3024378314:3024379688, ack 4048786673, win 307, options [nop,nop,TS val > 681965804 ecr 1634528336], length 1374 > 10:29:14.921311 IP server.29543 > client.30280: Flags [.], seq > 3024379688:3024381062, ack 4048786673, win 307, options [nop,nop,TS val > 681965804 ecr 1634528336], length 1374 > 10:29:14.922166 IP client.30280 > server.29543: Flags [.], ack 3024365948, > win 329, options [nop,nop,TS val 1634528336 ecr 681965778], length 0 > 10:29:14.922178 IP server.29543 > client.30280: Flags [.], seq > 3024381062:3024382436, ack 4048786673, win 307, options [nop,nop,TS val > 681965805 ecr 1634528336], length 1374 > 10:29:14.922182 IP server.29543 > client.30280: Flags [.], seq > 3024382436:3024383810, ack 4048786673, win 307, options [nop,nop,TS val > 681965805 ecr 1634528336], length 1374 > 10:29:14.923848 IP client.30280 > server.29543: Flags [.], ack 3024367322, > win 352, options [nop,nop,TS val 1634528338 ecr 681965778], length 0 > 10:29:14.923860 IP server.29543 > client.30280: Flags [.], seq > 3024383810:3024385184, ack 4048786673, win 307, options [nop,nop,TS val > 681965807 ecr 1634528338], length 1374 > 10:29:14.923863 IP server.29543 > client.30280: Flags [.], seq > 3024385184:3024386558, ack 4048786673, win 307, options [nop,nop,TS val > 681965807 ecr 1634528338], length 1374 > 10:29:14.924175 IP client.30280 > server.29543: Flags [.], ack 3024368696, > win 375, options [nop,nop,TS val 1634528338 ecr 681965778], length 0 > 10:29:14.924187 IP server.29543 > client.30280: Flags [.], seq > 3024386558:3024387932, ack 4048786673, win 307, options [nop,nop,TS val > 681965807 ecr 1634528338], length 1374 > 10:29:14.924190 IP server.29543 > client.30280: Flags [.], seq > 3024387932:3024389306, ack 4048786673, win 307, options [nop,nop,TS val > 681965807 ecr 1634528338], length 1374 > 10:29:14.925410 IP client.30280 > server.29543: Flags [.], ack 3024370070, > win 397, options [nop,nop,TS val 1634528339 ecr 681965779], length 0 > 10:29:14.925422 IP server.29543 > client.30280: Flags [.], seq > 3024389306:3024390680, ack 4048786673, win 307, options [nop,nop,TS val > 681965808 ecr 1634528339], length 1374 > 10:29:14.925425 IP server.29543 > client.30280: Flags [P.], seq > 3024390680:3024391132, ack 4048786673, win 307, options [nop,nop,TS val > 681965808 ecr 1634528339], length 452 > 10:29:14.926030 IP client.30280 > server.29543: Flags [.], ack 3024371444, > win 420, options [nop,nop,TS val 1634528339 ecr 681965779], length 0 > 10:29:14.926115 IP server.29543 > client.30280: Flags [.], seq > 3024391132:3024392506, ack 4048786673, win 307, options [nop,nop,TS val > 681965809 ecr 1634528339], length 1374 > > But the way I see it, (3024392506-3024371444) = 21064 is smaller than > 420*64 = 26880. > Even looking at the previous (ack 3024370070, win 397) we get 22436 and > 25408 which still work, same with ack 3024368696, win 375 (23810 < > 24000) > So that means conntrack is looking at an ack at least as old as ack > 3024367322, win 352?... > Is there any way to confirm that? In the code it looks like we match > ! before(seq, sender->td_maxend + 1) > > I turned pr_debug on in tcp_in_window() for another try and it's a bit > mangled because the information on multiple lines and the function is > called in parallel but it looks like I do have some seq > maxend +1 > > Although it's weird, the maxend was set WAY earlier apparently? > Apr 17 11:13:14 res=1 sender end=1913287798 maxend=1913316998 maxwin=29312 > receiver end=505004284 maxend=505033596 maxwin=29200 > then window decreased drastically e.g. previous ack just before refusal: > Apr 17 11:13:53 seq=1913292311 ack=505007789+(0) sack=505007789+(0) win=284 > end=1913292311 > Apr 17 11:13:53 sender end=1913292311 maxend=1913331607 maxwin=284 scale=0 > receiver end=505020155 maxend=505033596 maxwin=39296 scale=7
scale=0 is suspect. Really if conntrack does not see SYN SYNACK packets, it should not make any window check, since windows can be scaled up to 14 :/ > > first res=0 (refused) packet: > Apr 17 11:13:53 odin kernel: seq=505042139 ack=1913292311+(0) > ack=1913292311+(0) win=307 end=505044887 > Apr 17 11:13:53 res=0 sender end=505033895 maxend=505033596 maxwin=39296 > receiver end=1913292311 maxend=1913331607 maxwin=674 > > >> (Earlier email implies this is related to timestamps, but unfortunately >> to best of my knowledge conntrack doesn't look at tcp timestamps). > > tcp_timestamp introduces other changes in behaviour so it might still be > relateed. > > > > Recap of attachments: > - new tcpdump output of a failed attempt > - matching invalid packet dumped in json format > - logs of tcp_in_window function unfortunately of a different attempt > > Happy to provide new/different logs if it would be useful. Once again > thanks for the extra eyes and useful commands. >