Folks,

Thanks for your replies. So, in principle, I should not expect any problems. The machine would be a decent one-core Athlon3500+ with 2GB RAM, doing nothing else other than serving bugzilla, reviewboard and mediawiki with lighttpd, and the repo(s) is/are on a permanently mounted USB disk. Network throughput would not be the issue since the working copies would be on the same machine.
I am aware that operations like copy, move, update, commit would take time and/or space, but these files and their locations rarely change. I was more concerned about the likelihood of repository corruption. But this does not seem to be of any concern.

Thanks again.

- Winston

----------------------------------------
> Date: Tue, 24 May 2011 12:04:26 +0530
> From: ar...@collab.net
> To: dev@subversion.apache.org
> Subject: Re: large number of large binary files in subversion
>
> On Tuesday 24 May 2011 12:58 AM, Stefan Sperling wrote:
> > On Mon, May 23, 2011 at 11:07:50PM +0400, Konstantin Kolinko wrote:
> >> In svn 1.7 there is pristine storage area in the working copy, where
> >> all present files are stored by their checksums. If I understand this
> >> pristine storage correctly, if you move a file remotely on the server
> >> (svn mv URL URL) then when you update your working copy and both old
> >> and new paths are in the same working copy, Subversion will find the
> >> file in its pristine storage and won't re-download it over network. If
> >> what I wrote is true (I have not verified whether this actually works
> >> this way, but I have some hopes),
> >
> > Unfortunately, that's not how it works.
> >
> > When a new file is added during an update, the entire file content is
> > first spooled to a temporary file to calculate its checksum.
> > If a pristine with the same checksum is already present, the temporary
> > file is deleted.
> >
> > (see pristine_install_txn() in subversion/libsvn_wc/wc_db_pristine.c)
> Why can't we send the recorded checksum from the server instead of
> sending the whole file and then calculating it on the client side?
>
> If the checksum matches one of the pristine files, then use that to
> populate the nodes table. If there is no match, only then do we spool to
> a temporary file and what not.
>
> This seems like a straightforward idea. Any pitfalls to this approach?
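
To make the two behaviours discussed in the quoted thread concrete, here is a rough Python sketch. It is only a toy model, not Subversion code: PristineStore, install() and add_file() are hypothetical names, SHA-1 is assumed as the checksum, and the idea that the server can hand over a checksum before the content is exactly the proposal under discussion, not something I know the protocol to provide. install() mimics the behaviour Stefan describes (spool everything to a temporary file, checksum it, then keep or delete it), while add_file() mimics the proposed shortcut (look up the checksum first and fetch only on a miss).

```python
import hashlib
import os
import tempfile


class PristineStore:
    """Toy content-addressed store; files are keyed by SHA-1 hex digest."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def path_for(self, checksum):
        return os.path.join(self.root, checksum)

    def has(self, checksum):
        return os.path.exists(self.path_for(checksum))

    def install(self, chunks):
        """Described behaviour: spool to a temp file, then keep or discard."""
        digest = hashlib.sha1()
        fd, tmp_path = tempfile.mkstemp(dir=self.root)
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                digest.update(chunk)
                tmp.write(chunk)
        checksum = digest.hexdigest()
        if self.has(checksum):
            os.remove(tmp_path)  # duplicate content, drop the spooled copy
        else:
            os.replace(tmp_path, self.path_for(checksum))
        return checksum


def add_file(store, server_checksum, fetch_content):
    """Proposed behaviour: consult the store first, fetch only on a miss.

    fetch_content is a hypothetical callable returning an iterable of
    byte chunks. Returns (checksum, downloaded).
    """
    if store.has(server_checksum):
        return server_checksum, False  # reuse the existing pristine
    checksum = store.install(fetch_content())
    if checksum != server_checksum:
        raise ValueError("content does not match the advertised checksum")
    return checksum, True
```

In this model the second add_file() call for the same content never invokes fetch_content, which is the saving the proposal is after. The sketch says nothing about the pitfalls asked about, such as whether the checksum is available to the client before the content arrives.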