libsvn_ra_serf: options.c ra_serf.h serf.c util.c

Ben Reser Mon, 08 Jul 2013 18:53:09 -0700

First of all a bit of top posting here because I want to paint my
response to Greg and then ultimately Ivan's email with a bit of
background.  In the process of preparing to do the 1.8.1 release I
spent quite a bit of time reviewing nominated changes.  I'll admit I
hadn't paid too much attention to this issue while I was on vacation
or when I got back last week.  It seemed like there was a agreed upon
solution.  What I didn't realize is that we'd agreed upon an option
named "busted-proxy" and agreed upon forcing users to set an option in
order to work in these circumstances.

I'm replying to Greg's and Ivan's emails because after re-reading this
entire thread these two emails present the two essential arguments for
the option to exist at all.

1) That the proxies are busted and that having an option named
busted-proxy we will encourage proxy administrators to fix the
problem.  Greg's email makes this argument.

2) That there is a performance cost of making the extra request and
that the request is too high of a cost for users that don't have these
proxies in their way.  Ivan's email makes this argument.

Most people on this thread have argued that we should automatically
detect this behavior and adjust accordingly.  The above are the two
strongest arguments for the option.

On Fri, Jun 28, 2013 at 8:37 PM, Greg Stein <gst...@gmail.com> wrote:
> I'm not seeing the point. Subversion will (now) work, but we still view the
> proxy as busted. It doesn't provide the performance characteristics that
> Subversion wants and expects. Note that Subversion is built to work against
> mod_dav_svn which is HTTP/1.1 with chunked requests. The interposition of a
> proxy that disables chunked requests... busted.

Greg's argument here mostly depends on the idea that Subversion is
built to work against mod_dav_svn and any proxy or implementation that
doesn't implement HTTP/1.1 with chunked requests support as
mod_dav_svn does is busted.  I find this argument very weak and
unconvincing.  Especially since this assumes that mod_dav_svn is the
only server side implementation of our protocol.  We know this is not
the case.  GitHub has their own implementation of the DAV server side
in order to support Subversion clients checking out of git repos
hosted there.  I'm sure there are others out there we aren't
necessarily even aware of (I can think of at least one more but I'm
not going to go into the details since it's not important here).

If we fail to follow the standard those sorts of implementations are
the things we risk losing as a project.  In this case I don't think
GitHub's functionality breaks because of this issue.  But let's say
that their implementation for some reason chose not to support chunked
requests?  If we started breaking GitHub's server implementation with
our clients on a regular basis because we wanted to require the use
features of HTTP that we have not required in the past what do you
suppose they would do?  I'd expect that they'd probably eventually
give up on bothering to support our client.

I see this as a turning point for this project.  If we go down this
road, then even our DAV protocol is not following a standard.  I know
we threw out DeltaV with HTTPv2, but the distinction here is that we
were backwards compatible.  In this case we are not backwards
compatible with what we supported in Subversion 1.7.x because we did
not fail when proxied through a proxy that refused chunked requests.

While I think it's been sufficiently proven elsewhere in this thread
I'll make the argument that we are not HTTP/1.1 compliant if we don't
handle this since part of my argument above assumes that we're not
compliant if we don't handle this case.

The draft RFC for improving the HTTP/1.1 standard, says that you don't
need to support chunked requests to be HTTP/1.1 compliant.  Kevin
linked it earlier but here it is again:
http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-22#section-3.3.3

The important text is
[[[
   A server MAY reject a request that contains a message body but not a
   Content-Length by responding with 411 (Length Required).
]]]

The chair of the working group posted this:
http://www.mnot.net/blog/2011/07/11/what_proxies_must_do

Obviously, it would be nice if everyone supported chunked requests.
But the reality is that hasn't fully happened.  RFC 2616 left the door
open to HTTP/1.1 servers to not support chunked requests by providing
the 411 status response.  Eventually the HTTPbis draft will become an
RFC and these ambiguities will be entirely removed.  As such I assert
that the HTTP/1.1 standard does allow this behavior and that these
proxies are not broken, busted, non-compliant or whatever phrase you
want to use to that effect. (Yes I originally thought chunked was
required, but the above two links have convinced me otherwise).

It's hard to argue that we're following the HTTP/1.1 standard when our
behavior when proxied through a server that is exercising it's option
to not implement chunked requests is to fail.  A fundamental feature
of HTTP is negotiation to prevent these sorts of failures so that
everyone can independently implement the standard.

> "Prefer" is not the same as "must" :-)

This recommendation is a suggestion to avoid the very problem we're
talking about here.

Specifically it says (again from the IETF link above):
[[[
   Unless a transfer coding other than chunked has been applied, a
   client that sends a request containing a message body SHOULD use a
   valid Content-Length header field if the message body length is known
   in advance, rather than the chunked transfer coding, since some
   existing services respond to chunked with a 411 (Length Required)
   status code even though they understand the chunked transfer coding.
   This is typically because such services are implemented via a gateway
   that requires a content-length in advance of being called and the
   server is unable or unwilling to buffer the entire request before
   processing.
]]]

This is flat out the "be liberal with what you accept and conservative
with what you send policy".  We've now been bitten by the very issue
that this paragraph exists to warn us of.  Are we really so arrogant
as to suggest that we should ignore this advice?

> In our current model, we prefer chunked. But there is a pretty
> straightforward extension to serf's bucket model that should allow us to use
> C-L in many situations. We *might* be able to do that in a serf 1.x, but I'm
> not sure. Worst case, we'll add the feature in serf 2.x.

Ivan has managed to implement this on Serf trunk.  It seems that the
plan is now to include it in Serf 1.4.x which Ivan says should be out
in a month (not holding my breath but still relatively soon).

Given this development I'm wondering if the option makes sense at all.
 Shouldn't we just eat the extra round trip to detect this issue and
then when we build against Serf 1.4.x simply stop issuing the extra
request to detect it?

Which brings us to Ivan's argument, which is that the cost is too
expensive to just detect this always.

On Tue, Jun 25, 2013 at 3:21 PM, Ivan Zhakov <i...@visualsvn.com> wrote:
> Please note that this extra request is per session and currently we
> create many sessions even during one operation. And I'm also not happy
> to make performance worse for users who doesn't use reverse proxies
> and etc.

First and foremost I'd argue that this is not backed by any
performance measurements.  We have the feature implemented.  How bad
does it really impact performance.

We limit RA serf to a max of 8 connections, but with things like
svn:externals we open new connections and then close them, so 8 may
not be the absolute worst case scenario.
[[[
#define SVN_RA_SERF__MAX_CONNECTIONS_LIMIT 8
]]]

We also require that we can make at least 2 connections to the server.
 So it's hard to say what the typical behavior would be across so many
different configurations.  But worst case scenario is always going to
be number_connections_opened * round_trip_time (RTT).

The corporate users that Ivan is worried about that are likely to have
no issues here because they have no such proxy in their way are the
ones likely to have the lowest RTT.

So let's say you have a RTT of 500ms, that means your command is going
to be delayed by 4 seconds assuming 8 connections, it also means that
you're going to have a pretty bad time using Subversion anyway.  Yes
that sucks, but I doubt very many of our commands actually make use of
all 8 of those connections, remember we only actually require 2.
Right now from the US I have a RTT to svn.eu.apache.org of about 180ms
and to svn.us.apache.org of about 20ms, meaning worst case scenario
for those is a delay of 1.4 seconds and 160ms.

I have a very hard time believing that the above is worth the concern.
 We should implement this detection behavior by default.

Given Ivan's work on Serf to prefer sending Content-Length, we can
disable it if we're built against a Serf that behaves this way.

I think that we should defer from adding an option to manipulate this
behavior until such a time that we find significant performance
issues.  Especially if down the line we're not going to need it
because we can prefer CL.

If we feel that we MUST have the option then I think a different name
should be selected.  I realize that Greg pushed for this name in order
to encourage proxy admins to fix their proxies to support chunked
requests.  But that name seems like a huge mistake in light of the
trend towards HTTP/1.1 explicitly allowing servers to behave this way.

Ignoring any issues with the busted part of "busted-proxy" it also
doesn't follow our naming conventions.  We have several http-proxy-*
options.  I realize these are for client forward requests, but if
we're going to add an option we should follow our naming conventions,
meaning at a minimum http- should be prefixed.  If we must stay with
this "busted-proxy" name then "http-proxy-busted" is a better match to
what we have.  However, i think that risks confusing users who think
it only applies when they have a forward proxy configured.

I'd suggest http-detect-chunking with a description of "Detect if the
server supports chunked requests.  This may be necessary for use when
a proxy does not support chunked requests.  By default this is off
because mod_dav_svn always supports chunked requests and detection of
this hurts performance."

I'll look some more tomorrow into other options here with respect to a
better way to detect this but I'm having a hard time believe are
options are anything other than 1) extra request, 2) don't use chunked
requests, 3) punt on supporting these proxies.

Re: svn commit: r1495419 - in /subversion/trunk/subversion/libsvn_ra_serf: options.c ra_serf.h serf.c util.c

Reply via email to