Stefan Fuhrmann <stef...@apache.org> writes:

> The extra temporary space is not a concern: Your server would run out of
> disk space just one equally large revision earlier than it does today.
I wouldn't say it is not a concern at all.  Consider, for example, a
situation where a user cannot commit a 4 GB file simply because doing so
now requires at least 8 GB of free disk space on the server.  While it
might sound like an edge case, this could be important for some of the
users.

> Shall I just enable the feature unconditionally?

I'm not sure about this.  The feature has a price, and there are cases
where enabling parallel writes has a visible performance impact.  Below
are my results for a couple of quick tests:

  Importing 2000 files of Subversion's source code:  22.233 s → 30.546 s  (37% slower)
  Importing a 300 MB .zip file:                      36.650 s → 46.255 s  (26% slower)
  Importing a 4 GB .iso file:                       159.372 s → 212.559 s (33% slower)

(The first two tests should be reproducible, since they were performed on
an Azure VM; the last one was done on a spinning disk in my environment.
All tests were executed over the https:// protocol.)

After giving this topic a second thought, I wonder whether we are heading
in the right direction.  We aim for a faster svn commit over high-latency
networks, and to achieve that, we are trying to implement parallel PUTs,
starting from the FS layer.  This leaves a couple of questions:

(1) Why do we start with adding a rather complex FS feature, given that we
    don't know what kind of problems are associated with implementing this
    in ra_serf?  (Can we actually do it?  What can be parallelized while
    keeping the necessary order of operations on the transaction?  How do
    we plug that into the commit editor?  And, as of today, HTTP/2 is not
    officially supported by either httpd or serf.)

(2) Is making parallel PUTs the proper way to speed up commits?  As far as
    I know, squashing everything into a single POST would make a commit up
    to 10-20 times faster, depending on the amount of changes.  Although
    that approach has its own challenges, it doesn't require us to deal
    with concurrency and doesn't introduce a dependency on httpd.

How much faster is a commit going to be with parallel PUTs?  Would it be
at least twice as fast?  Even if so, that would require us to keep
non-trivial code that is prone to deadlocks and various kinds of race
conditions.  For instance, transaction.c is quite complex by itself and
already contains a mechanism to *prevent* concurrent writes.  Adding a
layer that allows concurrent writes *on top of that* makes it even more
complex (see the sketch in the P.S. below).

So, are we sure that we need to implement it this way?

Regards,
Evgeny Kotkov
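
P.S. To illustrate the layering concern in (2): conceptually, the current
design serializes writers to a transaction behind a single lock, along the
lines of the sketch below.  This is a simplified illustration with made-up
names (with_txn_write_lock() and txn_write_lock are hypothetical, not the
actual transaction.c API), and it uses plain pthreads instead of APR for
brevity:

  #include <pthread.h>

  /* Hypothetical sketch: one lock serializes all writes to a
     transaction, so concurrent writers simply queue up behind
     each other. */
  static pthread_mutex_t txn_write_lock = PTHREAD_MUTEX_INITIALIZER;

  static int
  with_txn_write_lock(int (*body)(void *baton), void *baton)
  {
    int err;

    pthread_mutex_lock(&txn_write_lock);
    err = body(baton);
    pthread_mutex_unlock(&txn_write_lock);

    return err;
  }

Allowing parallel PUTs would mean that each in-flight write needs its own
region or temporary file, plus an ordered merge step at the end: that is,
a second locking and consistency scheme layered on top of a design whose
invariant is "one writer at a time".  That is where I would expect the
deadlock and race-condition hazards to come from.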