Re: Issues with TCP Timestamps allocation

Michael Tuexen Wed, 17 Jul 2019 11:25:54 -0700

> On 17. Jul 2019, at 09:42, Vitalij Satanivskij <sa...@ukr.net> wrote:
> 
> 
> 
> Hello. 
> 
> Is there any news on this problem?
Please find a patch in https://reviews.freebsd.org/D20980


If possible, please test and report.

Best regards
Michael
> 
> 
> I'm using FreeBSD 12 on my desktop and can confirm that the problem occurs
> with some hosts.
> 
> 
> 
> Michael Tuexen wrote:
> MT> 
> MT> 
> MT> > On 9. Jul 2019, at 14:58, Paul <de...@ukr.net> wrote:
> MT> > 
> MT> > Hi Michael,
> MT> > 
> MT> > 9 July 2019, 15:34:29, by "Michael Tuexen" <tue...@freebsd.org>:
> MT> > 
> MT> >> 
> MT> >> 
> MT> >>> On 8. Jul 2019, at 17:22, Paul <de...@ukr.net> wrote:
> MT> >>> 
> MT> >>> 
> MT> >>> 
> MT> >>> 8 July 2019, 17:12:21, by "Michael Tuexen" <tue...@freebsd.org>:
> MT> >>> 
> MT> >>>>> On 8. Jul 2019, at 15:24, Paul <de...@ukr.net> wrote:
> MT> >>>>> 
> MT> >>>>> Hi Michael,
> MT> >>>>> 
> MT> >>>>> 8 July 2019, 15:53:15, by "Michael Tuexen" <tue...@freebsd.org>:
> MT> >>>>> 
> MT> >>>>>>> On 8. Jul 2019, at 12:37, Paul <de...@ukr.net> wrote:
> MT> >>>>>>> 
> MT> >>>>>>> Hi team,
> MT> >>>>>>> 
> MT> >>>>>>> Recently we upgraded to 12 Stable. Immediately afterwards, we
> MT> >>>>>>> started seeing strange connection establishment timeouts to a
> MT> >>>>>>> fixed set of external (world) hosts. The issue was persistent and
> MT> >>>>>>> easy to reproduce. Thanks to the patience and dedication of our
> MT> >>>>>>> system engineer, we have tracked this issue down to a specific
> MT> >>>>>>> commit:
> MT> >>>>>>> 
> MT> >>>>>>> https://svnweb.freebsd.org/base?view=revision&revision=338053
> MT> >>>>>>> 
> MT> >>>>>>> This patch was also back-ported into 11 Stable:
> MT> >>>>>>> 
> MT> >>>>>>> https://svnweb.freebsd.org/base?view=revision&revision=348435
> MT> >>>>>>> 
> MT> >>>>>>> Among other things, this patch changes the timestamp allocation
> MT> >>>>>>> strategy by introducing deterministic randomness via a hash
> MT> >>>>>>> function that takes into account a random key as well as the
> MT> >>>>>>> source address, source port, destination address and destination
> MT> >>>>>>> port. As a result, the timestamp offsets of different tuples
> MT> >>>>>>> (SA,SP,DA,DP) will be wildly different and will jump from small
> MT> >>>>>>> to large numbers and back whenever something in the tuple
> MT> >>>>>>> changes.
> MT> >>>>>> Hi Paul,
> MT> >>>>>> 
> MT> >>>>>> this is correct.
> MT> >>>>>> 
> MT> >>>>>> Please note that the same happens with the old method, if two
> MT> >>>>>> hosts with different uptimes are behind a consumer grade NAT.
> MT> >>>>> 
> MT> >>>>> If NAT does not replace timestamps then yes, it should be the case.
> MT> >>>>> 
> MT> >>>>>>> 
> MT> >>>>>>> After performing various tests of hosts that produce the
> MT> >>>>>>> above-mentioned issue, we came to the conclusion that there are
> MT> >>>>>>> some interesting implementations that drop SYN packets with
> MT> >>>>>>> timestamps smaller than the largest timestamp value from streams
> MT> >>>>>>> of all recent or current connections from a specific address.
> MT> >>>>>>> This looks like some kind of SYN flood protection.
> MT> >>>>>> This also breaks multiple hosts with different uptimes behind a
> MT> >>>>>> consumer level NAT talking to such a server.
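
For illustration, the filtering behaviour described above could be sketched as follows. This is only a guess at the kind of check such a peer or middlebox might perform, not an observed implementation: the names `src_state` and `syn_accepted` are hypothetical, and the plain comparison ignores 32-bit timestamp wraparound.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-source-address state kept by the filtering peer. */
struct src_state {
	bool     seen;       /* has any TSval been recorded for this address? */
	uint32_t max_tsval;  /* largest TSval seen so far from this address */
};

/*
 * Returns true if a SYN carrying `tsval` would be accepted, false if it
 * would be dropped. Assumption: the filter simply requires the timestamp
 * to be no smaller than the largest one already seen from the same
 * address (wraparound is deliberately not handled in this sketch).
 */
static bool
syn_accepted(struct src_state *st, uint32_t tsval)
{
	if (st->seen && tsval < st->max_tsval)
		return (false);
	st->seen = true;
	st->max_tsval = tsval;
	return (true);
}
```

Under this assumption, a SYN with TSval 2000 arriving after one with TSval 5000 from the same address is dropped. That is the situation both with per-connection offsets and with two hosts of different uptimes sharing one NAT address.
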
> MT> >>>>>>> 
> MT> >>>>>>> To ensure that each external host does not see wild jumps in
> MT> >>>>>>> timestamp values, I propose a patch that removes the ports from
> MT> >>>>>>> the equation altogether when calculating the timestamp offset:
> MT> >>>>>>> 
> MT> >>>>>>> Index: sys/netinet/tcp_subr.c
> MT> >>>>>>> ===================================================================
> MT> >>>>>>> --- sys/netinet/tcp_subr.c        (revision 348435)
> MT> >>>>>>> +++ sys/netinet/tcp_subr.c        (working copy)
> MT> >>>>>>> @@ -2224,7 +2224,22 @@
> MT> >>>>>>> uint32_t
> MT> >>>>>>> tcp_new_ts_offset(struct in_conninfo *inc)
> MT> >>>>>>> {
> MT> >>>>>>> - return (tcp_keyed_hash(inc, V_ts_offset_secret));
> MT> >>>>>>> +        /*
> MT> >>>>>>> +         * Some implementations show a strange behaviour when wildly random
> MT> >>>>>>> +         * timestamps are allocated for different streams. It seems that only the
> MT> >>>>>>> +         * SYN packets are affected. Observed implementations drop SYN packets
> MT> >>>>>>> +         * with timestamps smaller than the largest timestamp value of all
> MT> >>>>>>> +         * recent or current connections from a specific address. To mitigate
> MT> >>>>>>> +         * this we are going to ensure that each host will always observe
> MT> >>>>>>> +         * timestamps as increasing no matter the stream: by dropping ports
> MT> >>>>>>> +         * from the equation.
> MT> >>>>>>> +         */
> MT> >>>>>>> +        struct in_conninfo inc_copy = *inc;
> MT> >>>>>>> +
> MT> >>>>>>> +        inc_copy.inc_fport = 0;
> MT> >>>>>>> +        inc_copy.inc_lport = 0;
> MT> >>>>>>> +
> MT> >>>>>>> + return (tcp_keyed_hash(&inc_copy, V_ts_offset_secret));
> MT> >>>>>>> }
> MT> >>>>>>> 
> MT> >>>>>>> /*
> MT> >>>>>>> 
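
The effect of the patch above can be sketched in a standalone form. This is a toy model under stated assumptions: `toy_conn` and `toy_keyed_hash` are hypothetical stand-ins for `struct in_conninfo` and `tcp_keyed_hash()`, and FNV-1a is used only to keep the example short. It is not the hash the kernel uses.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct in_conninfo (IPv4 only). */
struct toy_conn {
	uint32_t laddr, faddr;   /* local and foreign address */
	uint16_t lport, fport;   /* local and foreign port */
};

/* Feed the low `nbytes` bytes of `v` into an FNV-1a state. */
static uint32_t
fnv1a_mix(uint32_t h, uint32_t v, int nbytes)
{
	for (int i = 0; i < nbytes; i++) {
		h ^= (v >> (8 * i)) & 0xffu;
		h *= 16777619u;
	}
	return (h);
}

/* Toy keyed hash over secret and 4-tuple; NOT the kernel's function. */
static uint32_t
toy_keyed_hash(const struct toy_conn *c, uint32_t secret)
{
	uint32_t h = 2166136261u;

	h = fnv1a_mix(h, secret, 4);
	h = fnv1a_mix(h, c->laddr, 4);
	h = fnv1a_mix(h, c->faddr, 4);
	h = fnv1a_mix(h, c->lport, 2);
	h = fnv1a_mix(h, c->fport, 2);
	return (h);
}

/* Offset as before the patch: ports take part in the hash. */
static uint32_t
offset_with_ports(const struct toy_conn *c, uint32_t secret)
{
	return (toy_keyed_hash(c, secret));
}

/* Offset as in the proposed patch: ports zeroed in a private copy. */
static uint32_t
offset_without_ports(const struct toy_conn *c, uint32_t secret)
{
	struct toy_conn copy = *c;

	copy.fport = 0;
	copy.lport = 0;
	return (toy_keyed_hash(&copy, secret));
}
```

Two connections between the same pair of addresses that differ only in the local port get different offsets from `offset_with_ports()` but the same offset from `offset_without_ports()`, so a given peer would see timestamps from one client address advance together.
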
> MT> >>>>>>> In any case, the fix for the uptime leak implemented in rev338053
> MT> >>>>>>> is not going to suffer, because a supposed attacker is currently
> MT> >>>>>>> able to use any fixed values of SP and DP, albeit not 0, to
> MT> >>>>>>> remove them from the equation anyway.
> MT> >>>>>> Can you describe how a peer can compute the uptime from two
> MT> >>>>>> observed timestamps? I don't see how you can do that...
> MT> >>>>> 
> MT> >>>>> A supposed attacker could run a script that continuously monitors
> MT> >>>>> timestamps, for example via a periodic TCP connection from a fixed
> MT> >>>>> local port (e.g. 12345) and a fixed local address to the victim's
> MT> >>>>> fixed address and port (e.g. 80). Whenever a large discrepancy is
> MT> >>>>> observed, the attacker can assume that a reboot has happened (due
> MT> >>>>> to V_ts_offset_secret re-generation), hence the received timestamp
> MT> >>>>> is considered an approximate point of reboot from which the uptime
> MT> >>>>> can be calculated, until the next reboot and so on.
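
The monitoring idea described above can be sketched as a small discontinuity detector. It assumes, purely for illustration, that the peer's timestamp clock ticks once per millisecond and that samples come from connections using one fixed 4-tuple. The names `ts_monitor` and `ts_monitor_sample` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical observer state for one fixed (SA,SP,DA,DP) tuple. */
struct ts_monitor {
	bool     have_prev;          /* at least one sample recorded */
	uint32_t prev_tsval;         /* TSval of the previous sample */
	uint64_t prev_time_ms;       /* observer's clock at that sample */
	bool     jump_seen;          /* a discontinuity has been observed */
	uint64_t last_jump_time_ms;  /* observer's clock at the last jump */
};

/*
 * Record one sample. Returns true if the TSval advanced by an amount that
 * differs from the elapsed wall-clock time by more than `tolerance_ms`,
 * which the observer would interpret as a reboot or a key change.
 * Assumption: 1 ms timestamp granularity on the monitored host.
 */
static bool
ts_monitor_sample(struct ts_monitor *m, uint64_t now_ms, uint32_t tsval,
    uint32_t tolerance_ms)
{
	bool jump = false;

	if (m->have_prev) {
		/* Modular difference; a backwards step becomes very large. */
		uint32_t ts_delta = tsval - m->prev_tsval;
		uint64_t elapsed = now_ms - m->prev_time_ms;
		uint64_t diff = ts_delta > elapsed ?
		    ts_delta - elapsed : elapsed - ts_delta;

		if (diff > tolerance_ms) {
			jump = true;
			m->jump_seen = true;
			m->last_jump_time_ms = now_ms;
		}
	}
	m->have_prev = true;
	m->prev_tsval = tsval;
	m->prev_time_ms = now_ms;
	return (jump);
}
```

After a jump, the observer's own clock minus `last_jump_time_ms` gives an estimate of the uptime, accurate to within the sampling interval.
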
> MT> >>>> Ahh, I see. The patch we are talking about is not intended to
> MT> >>>> protect against continuous monitoring, which is something you can
> MT> >>>> always do. You could even watch for service availability and detect
> MT> >>>> reboots. A change of the local key would also look similar to a
> MT> >>>> reboot without a temporary loss of connectivity.
> MT> >>>> 
> MT> >>>> Thanks for the clarification.
> MT> >>>>> 
> MT> >>>>>>> 
> MT> >>>>>>> Here is a list of example hosts with which we were able to
> MT> >>>>>>> reproduce the issue:
> MT> >>>>>>> 
> MT> >>>>>>> curl -v http://88.99.60.171:80
> MT> >>>>>>> curl -v http://163.172.71.252:80
> MT> >>>>>>> curl -v http://5.9.242.150:80
> MT> >>>>>>> curl -v https://185.134.205.105:443
> MT> >>>>>>> curl -v https://136.243.1.231:443
> MT> >>>>>>> curl -v https://144.76.196.4:443
> MT> >>>>>>> curl -v http://94.127.191.194:80
> MT> >>>>>>> 
> MT> >>>>>>> To reproduce, call curl repeatedly with the same URL a number of
> MT> >>>>>>> times. You are going to see some of the requests stuck in
> MT> >>>>>>> `*    Trying XXX.XXX.XXX.XXX...`
> MT> >>>>>>> 
> MT> >>>>>>> For some reason, the easiest way to reproduce the issue is with nc:
> MT> >>>>>>> 
> MT> >>>>>>> $ echo "foooooo" | nc -v 88.99.60.171 80
> MT> >>>>>>> 
> MT> >>>>>>> Only a few such calls are required until one of them gets stuck
> MT> >>>>>>> in connect(), issuing SYN packets with an exponential backoff.
> MT> >>>>>> Thanks for providing an end-point to test with. I'll take a look.
> MT> >>>>>> Just to be clear: You are running a FreeBSD client against one of
> MT> >>>>>> the above servers and experience the problem with the new
> MT> >>>>>> timestamp computations.
> MT> >>>>>> 
> MT> >>>>>> You are not running arbitrary clients against a FreeBSD server...
> MT> >>>>> 
> MT> >>>>> We are talking about FreeBSD being the client. The peers that
> MT> >>>>> yield this unwanted behaviour are unknown. A little bit of
> MT> >>>>> tinkering showed that some of them run Debian:
> MT> >>>>> 
> MT> >>>>> telnet 88.99.60.171 22
> MT> >>>>> Trying 88.99.60.171...
> MT> >>>>> Connected to 88.99.60.171.
> MT> >>>>> Escape character is '^]'.
> MT> >>>>> SSH-2.0-OpenSSH_6.7p1 Debian-5+deb8u3
> MT> >>>> Also some are hosted by Hetzner, but not all. I will look into
> MT> >>>> this tomorrow, since I'm on a deadline today (well it is 2am tomorrow
> MT> >>>> morning, to be precise)...
> MT> >>> 
> MT> >>> Thanks a lot, I would appreciate that.
> MT> >> Hi Paul,
> MT> >> 
> MT> >> I have looked into this.
> MT> >> 
> MT> >> * The FreeBSD behaviour is the one which is specified in the last
> MT> >>  bullet item in https://tools.ietf.org/html/rfc7323#section-5.4
> MT> >>  It is also the one which is RECOMMENDED in
> MT> >>  https://tools.ietf.org/html/rfc7323#section-7.1
> MT> >> 
> MT> >> * My NAT box (a popular one in Germany) does NOT rewrite TCP
> MT> >>  timestamps.
> MT> >> 
> MT> >> This means that the hosts you are referring to have some sort of
> MT> >> protection which makes incorrect assumptions. It will also break
> MT> >> multiple hosts behind a NAT.
> MT> >> 
> MT> >> I can run
> MT> >> curl -v http://88.99.60.171:80
> MT> >> in a loop without any problems from a FreeBSD head system. I tested
> MT> >> 1000 iterations or so. The TS.val is jumping up and down as expected.
> MT> >> I'm wondering why you are observing errors in this case, too.
> MT> >> 
> MT> >> However, doing something like
> MT> >> echo "foooooo" | nc -v 88.99.60.171 80
> MT> >> triggers the problem.
> MT> >> 
> MT> >> So I think there is some functionality (in a middlebox or running on
> MT> >> the host) which incorrectly assumes monotonic timestamps between
> MT> >> multiple TCP connections coming from the same IP address, but only
> MT> >> in case of errors at the application layer.
> MT> > 
> MT> > Yeah, exactly, some hosts seem to enable this only in case of an error
> MT> > in HTTP communication (some smart proxy?). However, there are some
> MT> > that behave this way regardless of errors, for example these:
> MT> > 
> MT> > curl -v https://185.134.205.105:443
> MT> > curl -v https://136.243.1.231:443
> MT> Wireshark sees an Encrypted Alert in both cases. So I guess this is
> MT> another indication of "error at the application layer".
> MT> > 
> MT> >> 
> MT> >> Do you have any insight into whether the hosts you listed have
> MT> >> something in common? Some of them are hosted by Hetzner, but not all.
> MT> > 
> MT> > Nope. The whole set of endpoints that we have detected so far is
> MT> > pretty diverse, covering many different geographic locations as well
> MT> > as different hosters.
> MT> OK. Thanks for the clarification.
> MT> > 
> MT> >> 
> MT> >> I think in general, it is the correct thing to include the port
> MT> >> numbers in the offset computation. We might add a sysctl variable to
> MT> >> control the inclusion. This would allow interworking with broken
> MT> >> middleboxes.
> MT> > 
> MT> > Yeah, I completely agree that these rare cases should not dictate the
> MT> > implementation. But an ability to enable a work-around via sysctl
> MT> > would be greatly appreciated. Currently we are unable to roll out the
> MT> > upgrade across all servers because of this issue: even though it does
> MT> > not happen very often, a lot of requests from our users get stuck or
> MT> > fail altogether. For example, the host 185.134.205.105 is a kind of
> MT> > social network that our proxy servers connect to in order to securely
> MT> > access content, such as images, on behalf of our users.
> MT> > 
> MT> >> 
> MT> >> Please note, this does not fix the case of multiple clients behind a
> MT> >> NAT.
> MT> > 
> MT> > Yeah, that's true. Fortunately we don't use NAT.
> MT> > 
> MT> >> 
> MT> >> I'm also trying to figure out how and why Linux and Windows are
> MT> >> handling this.
> MT> > 
> MT> > Thanks for bothering!
> MT> Will let you know what I figure out.
> MT> 
> MT> Best regards
> MT> Michael
> MT> > 
> MT> >> 
> MT> >> Best regards
> MT> >> Michael
> MT> >> 
> MT> >>> 
> MT> >>>> 
> MT> >>>> Best regards
> MT> >>>> Michael 
> MT> >>>>> 
> MT> >>>>> 
> MT> >>>>>> 
> MT> >>>>>> Best regards
> MT> >>>>>> Michael
> MT> >>>>>> 
> MT> >>>>>> 
> MT> >>>> 
> MT> >>>> 
> MT> >> 
> MT> >> 
> MT> 
> MT> _______________________________________________
> MT> freebsd-net@freebsd.org mailing list
> MT> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> MT> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

_______________________________________________
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"