Sorry for the long wait; I had a free weekend, and none of the site techs got back to me until late today.

On 01/29/16 22:03, Stuart Henderson wrote:
If you have contact with any of the site admins see if they are
running on linux with tcp_tw_recycle=1, I think there is a strong
possibility that they are, and if so then they should fix their
configuration.
I wrote to our contact there and am trying to find out whether
they are using this setting.
I managed to get the information from their server and, sadly, the option is not set:

net.ipv4.tcp_tw_recycle = 0



Typical Linux behaviour (at least the version I tried) is to use a single
counter for all TCP sessions from the host so it would be more likely to
use 1,2,3 - 7,8,9 - 49,50,51 - 67,68,69.

This isn't required by TCP though - that only needs timestamps *within a
session* (i.e. the src+dest host+port quad) to be increasing. Multiple sessions
are treated separately and can be in any order wrt each other. If I understand
correctly, tw_recycle reduces it to just src+dest *host*.

If you have two hosts with the simple behaviour (single counter) going
through a NAT, it doesn't usually touch timestamps so they will be
out of order - maybe 49,50,51 - 67,68,69 - 1,2,3 - 7,8,9. This is
OK as far as TCP goes but breaks with tw_recycle. But in the NAT case
it's usually only noticed if two people from behind the same NAT visit
the site within the TIME_WAIT timeout window.
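
The difference between the two rules can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not Linux's actual code: the function name, the tuple layout and the address are made up, the server side of the quad is left out because there is only one server, and the TIME_WAIT time limit is ignored.

```python
def accepted(segments, key):
    """Return one bool per segment: True if its timestamp is not older than
    the last accepted timestamp seen under the same key."""
    last_seen = {}
    verdicts = []
    for seg in segments:
        k = key(seg)
        ts = seg[2]
        ok = ts >= last_seen.get(k, ts)
        if ok:
            last_seen[k] = ts
        verdicts.append(ok)
    return verdicts

# Each segment: (source address as the server sees it, source port, timestamp).
# Two hosts behind one NAT share a single address (documentation range).
NAT = "203.0.113.1"
segments = (
    [(NAT, 40001, ts) for ts in (49, 50, 51)]
    + [(NAT, 40002, ts) for ts in (67, 68, 69)]
    + [(NAT, 40003, ts) for ts in (1, 2, 3)]
    + [(NAT, 40004, ts) for ts in (7, 8, 9)]
)

# Plain TCP rule: one counter per session (address + port here).
per_session = accepted(segments, key=lambda s: (s[0], s[1]))
# tw_recycle-like rule (as understood above): one counter per remote host.
per_host = accepted(segments, key=lambda s: s[0])

print(all(per_session))       # True: every session is increasing on its own
print(per_host.count(False))  # 6: the 1,2,3 and 7,8,9 sessions look "old"
```

With the per-session key every segment passes; with the per-host key the two sessions from the second machine are rejected, because their timestamps are lower than the 69 already seen from the shared address.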

For a proxy, there is a cutoff. There are two TCP sessions end-to-end,
the packet data are copied across but not headers. The headers are subject
to the proxy's OS's behaviour.

Now... OpenBSD randomizes these per session. A random offset is applied
and stored as part of the TCP state. This is good because it's extra
entropy to help protect against blind spoofing, and avoids leaking
information about the host's uptime. So, as a simplified example, you could
have 4 consecutive sessions using 1,2,3 - 49,50,51 - 67,68,69 - 7,8,9 --
and that's OK. It is within the TCP spec, it is suggested by the newer RFC,
and, as you can see above, it's totally normal for a NATted connection to act like
this. It's just that Linux's tw_recycle misfeature gets confused.
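
A rough Python sketch of the per-session offset idea follows. It is not OpenBSD's implementation: the helper name, the tick values and the 32-bit random offset drawn from Python's `random` module are all assumptions chosen for illustration.

```python
import random

def session_timestamps(clock_ticks, rng):
    """Timestamps for one session: the host clock plus a random 32-bit
    offset picked once per session, wrapping modulo 2**32."""
    offset = rng.getrandbits(32)
    return [(t + offset) % 2**32 for t in clock_ticks]

rng = random.Random(1)  # fixed seed only so that runs are repeatable
clock = 1000            # hypothetical host-wide tick counter
sessions = []
for _ in range(4):
    ticks = [clock, clock + 1, clock + 2]
    sessions.append(session_timestamps(ticks, rng))
    clock += 10         # time passes between sessions

for s in sessions:
    print(s)
```

Inside each session the values still advance by one tick per segment (modulo 2**32), so the per-session rule holds. The starting values of consecutive sessions have no fixed order relative to each other, which is exactly what a per-host check cannot cope with.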

If you run the proxy on an OS which doesn't offset timestamps like this
(note that OpenBSD has done this for many years), you won't trigger it,
but run it on OpenBSD and it's easy. You'll also be able to trigger it
by connecting from a single machine with a simple timestamp but running
the connection through a PF nat with the "modulate timestamps" option.

It can be worked around on your side. But if you do that the server admins
will likely never fix things (and maybe blame it on OpenBSD) so I'm
reluctant to mention it on list - and that workaround will throttle tcp
for all connections to/from the server, limiting you to about 5Mb max
for transatlantic connections.
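
For a sense of where a figure of that order can come from, here is a small window/RTT calculation in Python. The workaround itself is not named above, so both inputs are assumptions: a TCP window capped at 65,535 bytes (no window scaling) and a transatlantic round-trip time of roughly 100 ms.

```python
def max_throughput_mbit(window_bytes, rtt_seconds):
    """Upper bound on throughput when at most one window of data
    can be in flight per round trip, in Mbit/s."""
    return window_bytes * 8 / rtt_seconds / 1e6

# Assumed values: unscaled 16-bit window, ~100 ms round trip.
print(round(max_throughput_mbit(65535, 0.100), 2))  # 5.24
```

A longer round trip lowers the bound further, since the same window has to cover more time.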


Thank you again, Stuart, for this great explanation of the behaviour.
Sadly, as noted above, the server doesn't have this option set.

I am currently at a loss and will gladly provide more information.

Cheers
Kim
