When trying to consolidate file copying into svn_io_copy_file, I ran into the code around workqueue.c:590 which derailed my plans somewhat:

/* With a single db we might want to install files in a missing directory. Simply trying this scenario on error won't do any harm and at least
       one user reported this problem on IRC. */

Apparently added in r1142088, the exact motivation behind the code remains somewhat unclear. Does anyone remember precisely what prompted it, and what, if anything, it is needed for today? (The "won't do any harm" and "at least one user reported this problem on IRC" parts are red flags here; those phrases suggest that the code attempts to do more than promised, and that's almost as bad as not doing enough.)

The reason why it is troublesome is that it makes it hard to use svn_io_copy_file for the "atomic copy" (copy to temporary + rename) pattern in run_file_install, since the "helpful" code attempts to create some missing directories if the final rename fails, and then re- try the rename.

But we want svn_io_copy_file to take care of it, because it already does an atomic copy, and it appears fairly clearly to be the right place for any file copy improvements. Of course we could use svn_io_copy_file for an atomic copy to a temporary, and then a second rename to the final destination, but that would be wasteful and silly.

Why improve file copying? Because it consumes a fair bit of time and I/ O bandwidth of many operations, such as checkouts and updates. In addition, exact file copies account for almost half the disk space used by a working copy, and being able to use fast and space- conserving COW copies where available would be nice.

There are also possibilities of server-side file copies (NFSv4.2, SMB2) as well as new client-side file copying APIs (win32 has CopyFileEx; copy_range or something similar may be coming to Linux).

Reply via email to