On 29.01.2016 22:35, Evgeny Kotkov wrote:
> Stefan Fuhrmann <stefanfuhrm...@alice-dsl.de> writes:
>> This branch adds support for HTTP2-style transactions to all backends.
>> The feature is opt-in and supported in all 3 FS implementations.
>> I'd like to merge this to /trunk next weekend, so that e.g. Bert can
>> continue the work on the network layer. Because it is opt-in, removing
>> the feature later should be simple - in case we find a major problem
>> with it.
>> Any objections?
> I have several concerns about the particular design and implementation:
>
> (1) Requiring twice the disk writes and space
>
> Changes that don't fit into memory require twice the disk writes and space
> in the parallel mode, because data is first spilled to a temporary file, and
> then appended to the protorev file. Even if we negotiate the use of this
> feature, and only enable it for HTTP/2 capable clients and servers, that
> might still be slower in every case except when committing over a
> high-latency network.
The extra temporary space is not a concern: your server would run out of
disk space just one equally large revision earlier than it does today.

In practice, representations are small enough to never be spilled to
disk. If I/O were a concern, we would not use that scheme for checkouts
over HTTP. Note that the extra traffic on the client side is one order
of magnitude higher than what parallel puts add to the server side, even
for large files (plain text vs. delta + compression).

So, I don't think this will be an issue at all.
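To make the "twice the disk writes" point concrete, here is a minimal
sketch (hypothetical names, not Subversion's actual code) of the
spill-then-append write path: representation data is buffered in memory,
spilled to a temporary file once it exceeds a cap, and finally copied
into the protorev file - so only data beyond the cap hits the disk
twice.

```c
/* Sketch of a spill buffer: writes stay in memory up to SPILL_LIMIT,
 * overflow goes to a temp file, and on close everything is appended
 * to the protorev file. All names here are illustrative. */
#include <stdio.h>
#include <string.h>

#define SPILL_LIMIT 16384   /* hypothetical in-memory cap */

typedef struct spill_buf_t {
  char mem[SPILL_LIMIT];
  size_t used;
  FILE *spill;              /* temp file, created on first overflow */
} spill_buf_t;

static int
spill_write(spill_buf_t *b, const char *data, size_t len)
{
  /* Once spilling has started, keep spilling to preserve write order. */
  if (b->spill == NULL && b->used + len <= SPILL_LIMIT)
    {
      memcpy(b->mem + b->used, data, len);
      b->used += len;
      return 0;
    }
  if (b->spill == NULL)
    b->spill = tmpfile();                 /* first disk write begins */
  return fwrite(data, 1, len, b->spill) == len ? 0 : -1;
}

static int
spill_close(spill_buf_t *b, FILE *protorev)
{
  fwrite(b->mem, 1, b->used, protorev);   /* in-memory part: one write */
  if (b->spill)
    {
      char chunk[4096];
      size_t n;
      rewind(b->spill);
      while ((n = fread(chunk, 1, sizeof(chunk), b->spill)) > 0)
        fwrite(chunk, 1, n, protorev);    /* spilled part: second write */
      fclose(b->spill);
      b->spill = NULL;
    }
  return 0;
}
```

With representations small enough to stay under the cap, the temp file
is never created and the data is written to disk exactly once.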
> (2) Making the feature opt-in
>
> Since the feature is opt-in and disabled by default, we would be adding
> another layer to the amount of different code execution paths that we need
> to support and test (FS formats, addressing, parallel / non-parallel writes).
> And we cannot enable it by default due to (1).
Fair enough. We can make the feature mandatory. After all, this fixes a
regression vs. BDB.

The downside is that parallel puts require extra locking and must do
without the DAG cache for in-txn nodes. That overhead seems to be small,
though. I just loaded the first 100k revs of the freebsd repo into a
repo on ramdisk and got less than 10% CPU overhead: 305s (212s user,
93s sys) instead of 281s (198s user, 83s sys).

Shall I just enable the feature unconditionally?
> (3) Opening a denial of service attack vector
>
> In the parallel mode, a server holds up to 0.5 MB of data in memory per
> every change. Combined with the fact that writes to the protorev files
> are serialized via rev-lock, an evil doer could be able to exhaust all
> available memory on the server by issuing a huge PUT request, waiting for
> the moment when the rev-lock is acquired and spamming with a big amount of
> smaller PUT requests. The latter requests would block until the rev-lock
> file is released, but the associated data is going to be kept in memory.
>
> Alternatively, this could lead to a starvation between httpd workers, where
> most of them are blocked waiting for a single PUT to complete and are
> locking out other clients.
How is that different from today?

As explained in the docstring, the text delta window size is 100k, of
which you need two. So, your evildoer only needs to open many txns and
begin a single PUT request in each of them. Note that it is sufficient
if the modified file is 100+k full-text size in the repo and that the
delta may effectively be empty. So, no bandwidth etc. is required.

Using BDB, the attack should already be possible and may only be
prevented by something in the HTTP processing code. In the future,
mod_dav_svn (or some other module) could limit the number of concurrent
PUT requests per connection.

If you are truly concerned about memory usage with commits, we can
reduce the in-memory buffer size to e.g. 4x16k, which is still plenty
in most scenarios. Note that this is 64kB actual representation size
after deltification and compression.
> (4) Being incompatible with released versions of Subversion
>
> Support for parallel writes is implemented without a format bump, and can
> result in failing commits or data corruption due to mismatching concurrency
> settings. Consider an environment with a reverse proxy doing load balancing
> between 1.9 and 1.10 mod_dav_svn servers that work with the same repository.
> A client performs two parallel PUT requests, and each of them is rerouted to
> each of the servers. Consequently, either the 1.9 server fails to acquire
> a nonblocking rev-lock and the commit fails, or, in a racy scenario, the
> file/offset manipulation logic in transaction.c:rep_write_contents_close()
> results in a corrupted revision.
This setup will produce random error messages for the
concurrent-put-enabled clients, because the 1.9 code will prevent
concurrent puts. So, your corrupting setup will not remain silent for
long.

Two solutions:

A: Don't enable concurrent writes in that setup. That requires them to
   be optional, which they currently are.

B: Make the feature dependent on a format bump. That, too, means the
   feature is optional.

I'd be fine with either approach. Which one do you prefer? How does
that relate to (2)?
-- Stefan^2.