On Tue, Jun 12, 2012 at 9:37 AM, Datty <datty....@gmail.com> wrote:
> On Tue, Jun 12, 2012 at 2:21 PM, Michael Mol <mike...@gmail.com> wrote:
>> On Jun 12, 2012 8:59 AM, "Datty" <datty....@gmail.com> wrote:

[snip]

>> More detail later...but make sure your vpn link is not TCP. UDP, fine,
>> IP-IP, fine, but not TCP. TCP transport for a VPN tunnel leads to ugly
>> traffic problems.

> Ah it is TCP at the moment. Not something I could change too easily either.
> Is it possible to work around or is it not worth fighting with?

If all of these cases are true:

* You only have TCP traffic going over that VPN
* You don't have any latency-sensitive traffic going over that VPN (no
VoIP, no interactive terminal sessions, and you won't pull your hair
out over round-trip times of 10 seconds or more slowing down page
loads)
* You don't have large bulk data transfers going over that VPN (my
best example of personal experience here was trying to locally sync my
work-related IMAP mailbox)

...then it's not worth fighting with.

It's very unlikely you fall in that camp.

The problem of TCP VPN transport is a confluence of three issues:

1) You're likely to have packet loss underneath that transport due to
things like congestion...the TCP transport will hide this from
tunneled traffic and retransmit itself.
2) In TCP, congestion control (slow start and congestion avoidance)
handles flow throttling, but it depends on detecting packet loss to
limit how many packets it pushes.
3) Your VPN endpoint will very probably buffer a very large amount of
data for sending if its TCP transport link is acting slow.

Here's what happens:

1) Your TCP app on your computer opens a connection with a remote
host. This connection is encapsulated inside your TCP OpenVPN tunnel.
2) Your app's TCP connection starts exchanging data. For as long as
it's not losing any packets, it figures it can send more and more data
without waiting for a response; this is TCP's congestion control
growing your congestion window, and it's why TCP can scale from
dial-up speeds up to 10G Ethernet.
3) Your VPN link's TCP connection experiences packet loss. Maybe it's
because of a congested router between you and the remote side of the
VPN, or maybe it's because someone's ADSL connection is pushing more
than its measly 768 kb/s upstream speed allows for. Or maybe it's noise
on the copper causing packet loss on the ADSL link. Or maybe it's a
frame collision on the PPPoE link.

...time passes...

4) Your VPN link's TCP stack notes the packet loss and retransmits the
lost packet until the packet gets through.
5) The connection traffic from step (2) is completely unaware that the
VPN's TCP connection is fielding packet loss issues for it, and its
congestion control figures, 'hey, this is a high-bandwidth link! Let's
shove more data!'
6) The OpenVPN link is now receiving data it can't stuff into the pipe
right this second, so it buffers it for a moment, and then sends it
when its turn has come. Still, no data is lost.

...time passes...

7) Steps 4-6 repeat themselves, causing your original connection to
become more and more confident about the bandwidth of the pipe.

...time passes...

8) The connection from step 2 is now so confident of the connection
speed of the pipe, it's pushing data to OpenVPN faster than OpenVPN
could conceivably push out, even if there were no packet loss issues.
You've now got a cycle of just steps 5 and 6.
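
The steps above can be sketched as a toy model. This is only an
illustration, not a measurement of OpenVPN: the function name, the
per-tick packet counts and the linear window growth are all
assumptions made up for the sketch (real TCP grows its window
differently). The point it shows is that a sender that never sees loss
keeps growing its window, so the backlog at the VPN endpoint, and with
it the delay, only goes up.

```python
def simulate_tcp_over_tcp(rounds, link_capacity=10, loss_every=4):
    """Toy model of a TCP connection tunneled over a TCP VPN.

    Units are packets per round-trip "tick". All parameters are
    hypothetical. The inner sender never sees loss (the outer TCP
    hides it), so its window only ever grows.
    """
    window = 1      # inner sender's packets per tick
    backlog = 0     # packets buffered at the VPN endpoint
    history = []
    for tick in range(1, rounds + 1):
        backlog += window
        capacity = link_capacity
        if tick % loss_every == 0:
            # The outer TCP spends one slot retransmitting a lost packet.
            capacity -= 1
        backlog -= min(backlog, capacity)
        # Queueing delay, in ticks, seen by a packet arriving now.
        delay = backlog / link_capacity
        history.append((tick, window, backlog, delay))
        window += 1  # no loss was visible, so keep growing
    return history


if __name__ == "__main__":
    for tick, window, backlog, delay in simulate_tcp_over_tcp(50)[::10]:
        print(tick, window, backlog, round(delay, 1))
```

Once the window passes what the link can drain per tick, nothing in
this model ever pushes it back down, which is the cycle of steps 5 and
6.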

Presumably, you'd eventually hit OpenVPN's buffer limit. I don't know
what that is, and I don't think it's tuneable. The one time I
personally saw, measured and helped diagnose this, I was getting up to
a *fifteen minute* round-trip ping time over the VPN, even though the
round-trip time for ping outside the VPN between the VPN endpoints was
only about 100ms. Watching that round-trip time climb was surreal
until I figured out what was happening.
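
To get a feel for how much data that implies, queueing delay is
roughly backlog divided by drain rate. The sketch below is plain
arithmetic with hypothetical helper names; the 768 kb/s drain rate is
borrowed from the ADSL example earlier and is an assumption, not the
measured bottleneck from that incident.

```python
def queueing_delay_seconds(backlog_bytes, drain_bits_per_second):
    """Time for a packet at the back of the queue to reach the wire."""
    return backlog_bytes * 8 / drain_bits_per_second


def backlog_for_delay(delay_seconds, drain_bits_per_second):
    """Bytes that must be queued to produce the given delay."""
    return delay_seconds * drain_bits_per_second / 8


if __name__ == "__main__":
    # Assumed: a 15-minute delay behind a 768 kb/s upstream link.
    needed = backlog_for_delay(15 * 60, 768_000)
    print(f"{needed / 1e6:.1f} MB buffered")
```

Under that assumption, a fifteen-minute delay corresponds to tens of
megabytes sitting in buffers somewhere along the path.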

Switching the VPN transport to UDP allowed the tunneled connections'
TCP stacks to properly gauge and react to available throughput. Even
SIP started working over that VPN link.
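
For contrast, here is the same kind of toy model with loss visible to
the tunneled connection, as it is over a UDP transport. Again, the
function name, the queue limit and the simple "add one, halve on loss"
rule are assumptions chosen for illustration, not a description of any
real TCP stack or of OpenVPN's internals.

```python
def simulate_tcp_over_udp(rounds, link_capacity=10, queue_limit=5):
    """Toy model of a TCP connection tunneled over a UDP VPN.

    Units are packets per round-trip "tick". All parameters are
    hypothetical. Packets that overflow the queue are dropped, and the
    inner sender sees that loss and backs off.
    """
    window = 1      # inner sender's packets per tick
    backlog = 0     # packets queued at the bottleneck
    history = []
    for tick in range(1, rounds + 1):
        offered = backlog + window
        backlog = max(0, offered - link_capacity)
        dropped = max(0, backlog - queue_limit)
        backlog -= dropped
        history.append((tick, window, backlog, dropped))
        if dropped:
            window = max(1, window // 2)  # loss seen: back off
        else:
            window += 1                   # no loss: probe for more
    return history


if __name__ == "__main__":
    h = simulate_tcp_over_udp(100)
    print("max window:", max(w for _, w, _, _ in h))
    print("max backlog:", max(b for _, _, b, _ in h))
```

Because drops feed back into the sender, the window oscillates around
what the link can carry and the backlog stays bounded by the queue
limit instead of growing without end.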

-- 
:wq
