> -----Original Message-----
> From: Mattias Engdegård [mailto:matti...@bredband.net]
> Sent: Monday, 20 May 2013 22:23
> To: Subversion Development
> Cc: rhuij...@collab.net
> Subject: over-tolerant code in run_file_install?
>
> When trying to consolidate file copying into svn_io_copy_file, I ran
> into the code around workqueue.c:590 which derailed my plans somewhat:
>
>   /* With a single db we might want to install files in a missing
>      directory. Simply trying this scenario on error won't do any harm
>      and at least one user reported this problem on IRC. */
>
> Apparently added in r1142088, the exact motivation behind the code
> remains somewhat unclear. Does anyone remember precisely what prompted
> it, and what, if anything, it is needed for today? (The "won't do any
> harm" and "at least one user reported this problem on IRC" parts are
> red flags here; those phrases suggest that the code attempts to do
> more than promised, and that's almost as bad as not doing enough.)
Before we had a single DB we stored the workqueue items per directory
(we had a wc.db per directory, just like the old entries files). If the
workqueue contained an instruction to install a file in a directory, we
could safely assume that the directory existed; otherwise it would
certainly not have contained a .svn/wc.db. During release testing of
1.7 we encountered more missing-directory scenarios than our test suite
covered (it was designed in a pre-single-db world where this was
impossible), so we added this automatic creation of the directory.

> The reason why it is troublesome is that it makes it hard to use
> svn_io_copy_file for the "atomic copy" (copy to temporary + rename)
> pattern in run_file_install, since the "helpful" code attempts to
> create some missing directories if the final rename fails, and then
> re-try the rename.

The workqueue implements atomicity in a different way: we store the
operation until it has succeeded. Every operation can be retried
indefinitely, and only after it succeeds and is marked completed do we
move on to the next operation.

For the install operation the common case is installing from the
pristine store (not a plain file-to-file copy), and we apply filtering
streams while doing so.

> But we want svn_io_copy_file to take care of it, because it already
> does an atomic copy, and it appears fairly clearly to be the right
> place for any file copy improvements. Of course we could use
> svn_io_copy_file for an atomic copy to a temporary, and then a second
> rename to the final destination, but that would be wasteful and silly.

Looking at my performance traces, I doubt there is actual wall-clock
time to be won by optimizing the file copy itself. The delays there are
100% in the disk I/O, and we can't make disks spin harder than they do.
I tried the same thing a few years ago, when I had a less advanced
profiler than I have now... (sorry)

In most cases our working copy code uses streams for its operations,
and where possible we try to do many things in a single pass: keyword
expansion, newline normalization, hash calculation, and so on. (This
also allows different backend implementations, such as a compressed
backend.) We can certainly optimize the single-copy case (though that
fix would really belong in APR, our platform abstraction layer), but
then in many cases we would have to introduce another copy.

What really makes a difference is reducing the number of file accesses,
such as the three-file compare introduced by Markus Schaber some time
ago. For files that don't fit in the OS file cache this makes a huge
difference.

> Why improve file copying? Because it consumes a fair bit of time and
> I/O bandwidth of many operations, such as checkouts and updates. In
> addition, exact file copies account for almost half the disk space
> used by a working copy, and being able to use fast and
> space-conserving COW copies where available would be nice.
>
> There are also possibilities of server-side file copies (NFSv4.2,
> SMB2) as well as new client-side file copying APIs (win32 has
> CopyFileEx; copy_range or something similar may be coming to Linux).

For most operations we assume a working copy to be local and access to
be cheap, and my profile runs show that for working copy file access we
are 100% I/O bound, so there is not much actual time to gain by
optimizing here. In almost every working copy operation we do far more
than copying: we calculate hashes, normalize newlines, and so on, all
during the primary copy/compare operations.
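To make that "single pass" point concrete, here is a minimal sketch in
plain C. This is not our actual svn_stream_* code, and the trivial
additive checksum only stands in for the real MD5/SHA-1 work; the point
is that the copy and the checksum share one pass over the data, so
nothing is read twice:

    /* Hypothetical sketch: copy a file while computing a checksum in
       the same pass over the data. */
    #include <stdint.h>
    #include <stdio.h>

    static int
    copy_and_checksum(FILE *in, FILE *out, uint32_t *sum)
    {
      unsigned char buf[65536];
      size_t n, i;

      *sum = 0;
      while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
        {
          for (i = 0; i < n; i++)
            *sum += buf[i];                 /* checksum the chunk... */
          if (fwrite(buf, 1, n, out) != n)  /* ...and write it out */
            return -1;
        }
      return ferror(in) ? -1 : 0;
    }

Newline normalization or keyword expansion would be just another
transformation of buf inside the same loop, at no extra I/O cost.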
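For comparison, the copy-to-temporary-plus-rename pattern this thread
is about looks roughly like this. It is only a sketch using plain POSIX
calls rather than our svn_io_*/APR wrappers, with error handling
trimmed and a fixed ".tmp" suffix where real code would generate a
unique temporary name:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    static int
    copy_file_atomically(const char *src, const char *dst)
    {
      char tmp[4096];
      char buf[65536];
      ssize_t n = 0;
      int in, out;

      /* Create the temporary next to the destination, so the final
         rename stays on one filesystem and is atomic. */
      snprintf(tmp, sizeof(tmp), "%s.tmp", dst);

      if ((in = open(src, O_RDONLY)) < 0)
        return -1;
      if ((out = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0666)) < 0)
        { close(in); return -1; }

      while ((n = read(in, buf, sizeof(buf))) > 0)
        if (write(out, buf, (size_t)n) != n)
          { close(in); close(out); unlink(tmp); return -1; }

      close(in);
      if (n < 0 || close(out) != 0)
        { unlink(tmp); return -1; }

      /* A reader sees either the old file or the complete new one,
         never a half-written file. */
      return rename(tmp, dst);
    }

The directory-creation code Mattias ran into wraps the final rename: if
it fails because the parent directory is missing, create the parents
and try once more. And since run_file_install runs as a workqueue item,
a failure at any point simply leaves the item queued to be retried on a
later run.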
As I mentioned before: moving away from this single-pass stream pattern
and using additional copies will add to the overall I/O the operations
require. If we want to gain speed, it will come from doing more things
at once with the same I/O, not from adding more I/O operations.

One thing that might help, and that I haven't investigated yet, is
keeping file handles open a bit longer. SMB2 has oplocks that, while
held, allow remote filesystems to be cached on client computers. If the
file changes on the backend, the server tells the client that it should
flush its cache. When used properly this can avoid reading files
multiple times.

	Bert