> -----Original Message-----
> From: Mattias Engdegård [mailto:matti...@bredband.net]
> Sent: Monday, 20 May 2013 22:23
> To: Subversion Development
> Cc: rhuij...@collab.net
> Subject: over-tolerant code in run_file_install?
>
> When trying to consolidate file copying into svn_io_copy_file, I ran
> into the code around workqueue.c:590 which derailed my plans somewhat:
>
>   /* With a single db we might want to install files in a missing
>      directory. Simply trying this scenario on error won't do any harm
>      and at least one user reported this problem on IRC. */
>
> Apparently added in r1142088, the exact motivation behind the code
> remains somewhat unclear. Does anyone remember precisely what prompted
> it, and what, if anything, it is needed for today? (The "won't do any
> harm" and "at least one user reported this problem on IRC" parts are
> red flags here; those phrases suggest that the code attempts to do
> more than promised, and that's almost as bad as not doing enough.)
Before we had a single DB we stored the workqueue items per directory
(we had a wc.db per directory, just like the old entries files). If the
workqueue contained an instruction to install a file in a directory, we
could safely assume that the directory existed; otherwise it would
certainly not have contained a .svn/wc.db. During release testing of
1.7 we encountered more missing-directory scenarios than our test suite
covered (it was designed in a pre-single-db world where this was
impossible), so we added this automatic creation of the directory.

> The reason why it is troublesome is that it makes it hard to use
> svn_io_copy_file for the "atomic copy" (copy to temporary + rename)
> pattern in run_file_install, since the "helpful" code attempts to
> create some missing directories if the final rename fails, and then
> re-try the rename.

The workqueue implements atomicity in a different way: we store the
operation until it has succeeded. Every operation can be retried
indefinitely, and only after it succeeds and is marked completed do we
move on to the next operation.

For the install operation the common case is installing from the
pristine store (not a plain file-to-file copy), and we apply filtering
streams while doing so.

> But we want svn_io_copy_file to take care of it, because it already
> does an atomic copy, and it appears fairly clearly to be the right
> place for any file copy improvements. Of course we could use
> svn_io_copy_file for an atomic copy to a temporary, and then a second
> rename to the final destination, but that would be wasteful and silly.

Looking at my performance traces, I doubt there is actual wall-clock
time to be won by optimizing the file copy itself. The delays there are
100% in the disk I/O, and we can't make disks spin harder than they do.
I tried the same thing a few years ago, when I had a less advanced
profiler than I have now... (sorry)

In most cases our working copy code uses streams for its operations,
and where possible we try to do many things in a single pass: keyword
expansion, newline normalization, hash calculation, and so on. (This
also allows different backend implementations, such as a compressed
backend.) We can certainly optimize the single-copy case (though that
fix would really belong in APR, our platform abstraction layer), but
then in many cases we would have to introduce another copy.

What really makes a difference is reducing the number of file accesses,
such as the three-file compare introduced by Markus Schaber some time
ago. For files that don't fit in the OS file cache this makes a huge
difference.

> Why improve file copying? Because it consumes a fair bit of time and
> I/O bandwidth of many operations, such as checkouts and updates. In
> addition, exact file copies account for almost half the disk space
> used by a working copy, and being able to use fast and
> space-conserving COW copies where available would be nice.
>
> There are also possibilities of server-side file copies (NFSv4.2,
> SMB2) as well as new client-side file copying APIs (win32 has
> CopyFileEx; copy_range or something similar may be coming to Linux).

For most operations we assume a working copy to be local and access to
be cheap, and my profile runs show that for working copy file access we
are 100% I/O bound, so there is not much actual time to gain by
optimizing here. In almost every working copy operation we do far more
than copying: we calculate hashes, normalize newlines, and so on, all
during the primary copy/compare operations.
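To make that "single pass" point concrete, here is a minimal sketch in
plain C. This is not our actual svn_stream_* code, and the trivial
additive checksum only stands in for the real MD5/SHA-1 work; the point
is that the copy and the checksum share one pass over the data, so
nothing is read twice:

    /* Hypothetical sketch: copy a file while computing a checksum in
       the same pass over the data. */
    #include <stdint.h>
    #include <stdio.h>

    static int
    copy_and_checksum(FILE *in, FILE *out, uint32_t *sum)
    {
      unsigned char buf[65536];
      size_t n, i;

      *sum = 0;
      while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
        {
          for (i = 0; i < n; i++)
            *sum += buf[i];                 /* checksum the chunk... */
          if (fwrite(buf, 1, n, out) != n)  /* ...and write it out */
            return -1;
        }
      return ferror(in) ? -1 : 0;
    }

Newline normalization or keyword expansion would be just another
transformation of buf inside the same loop, at no extra I/O cost.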
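For comparison, the copy-to-temporary-plus-rename pattern this thread
is about looks roughly like this. It is only a sketch using plain POSIX
calls rather than our svn_io_*/APR wrappers, with error handling
trimmed and a fixed ".tmp" suffix where real code would generate a
unique temporary name:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    static int
    copy_file_atomically(const char *src, const char *dst)
    {
      char tmp[4096];
      char buf[65536];
      ssize_t n = 0;
      int in, out;

      /* Create the temporary next to the destination, so the final
         rename stays on one filesystem and is atomic. */
      snprintf(tmp, sizeof(tmp), "%s.tmp", dst);

      if ((in = open(src, O_RDONLY)) < 0)
        return -1;
      if ((out = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0666)) < 0)
        { close(in); return -1; }

      while ((n = read(in, buf, sizeof(buf))) > 0)
        if (write(out, buf, (size_t)n) != n)
          { close(in); close(out); unlink(tmp); return -1; }

      close(in);
      if (n < 0 || close(out) != 0)
        { unlink(tmp); return -1; }

      /* A reader sees either the old file or the complete new one,
         never a half-written file. */
      return rename(tmp, dst);
    }

The directory-creation code Mattias ran into wraps the final rename: if
it fails because the parent directory is missing, create the parents
and try once more. And since run_file_install runs as a workqueue item,
a failure at any point simply leaves the item queued to be retried on a
later run.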
As I mentioned before: moving away from this single-pass stream pattern
and using additional copies will add to the overall I/O the operations
require. If we want to gain speed, it will come from doing more things
at once with the same I/O, not from adding more I/O operations.

One thing that might help, and that I haven't investigated yet, is
keeping file handles open a bit longer. SMB2 has oplocks that, while
held, allow remote filesystems to be cached on client computers. If the
file changes on the backend, the server tells the client that it should
flush its cache. When used properly this can avoid reading files
multiple times.

	Bert