(sorry for the delay; didn't want to reply while sleepy)

Bert Huijben wrote on Tue, Aug 17, 2010 at 09:30:08 -0700:
>
> > -----Original Message-----
> > From: Ramkumar Ramachandra [mailto:artag...@gmail.com]
> > Sent: Tuesday 17 August 2010 9:09
> > To: Daniel Shahaf
> > Cc: Subversion-dev Mailing List
> > Subject: Re: svnrdump: The BIG update
> >
> > Hi Daniel,
> >
> > Daniel Shahaf writes:
> > > Ramkumar Ramachandra wrote on Thu, Aug 12, 2010 at 12:17:34 +0530:
> > > > > > The dump functionality is also complete- thanks to Stefan's
> > > > > > review and MANY others for cleaning it up. It's however hit a
> > > > > > brick wall now because of missing headers in the RA layer.
> > > > > > Until I (or someone else) figures out how to fix the RA layer,
> > > > > > we can't do better than the XFail copy-and-modify test I've
> > > > > > committed.
> > > > >
> > > > > Part of the diff there is lack of SHA-1 headers --- which is
> > > > > unavoidable until editor is revved --- but part of it is a
> > > > > missing Text-copy-source-md5. Why don't you output that
> > > > > information --- doesn't the editor give it to you?
> > > >
> > > > Afaik, no. I don't see Text-copy-source-* anywhere in the RA
> > > > layer. Maybe I'm not looking hard enough?
> > > >
> > >
> > > Hmm. It seems you're right. So you might have to use two RA
> > > sessions in parallel...
> > >
> > > (and then, you might have to have the user authenticate twice?)
> >
> > Hm, I also have to find out if it's allowed. The commit_editor doesn't
> > allow it for instance. Besides, it's a very inelegant solution- I'd
> > rather fix the RA layer than do this.
>
> @Daniel, what would adding these headers add?
>
> The extra headers are for making it easier to detect corruptions by
> checking them along the transfer.
>
> If we are just doing additional work to add headers via a different
> process, it slows the dumping down more than a bit and it doesn't make
> the dump file any safer, because it uses a different process to obtain
> the header.
> I think you would have to obtain the source of the copyfrom and get some
> checksum from that; maybe you can do that without transferring the file
> again, but I'm not sure about that.
>
I'm a bit surprised, but indeed I don't see a way to obtain the checksum
via svn_ra.h. (The word 'checksum' doesn't appear there, and it isn't
included in svn_dirent_t either.) I wonder how we got away without
having it...

> (And without the added headers the process is already as safe as
> svnsync.)
>
> Yes, we can add more and more processing to also get those new SHA-1
> headers by recalculating them while dumping, but the idea for svnrdump
> was to create a fast and secure way to dump and load repositories... not
> an incredibly slow one that has to transfer files multiple times just to
> make all the optional headers match the output of svnadmin.
>
> Those headers were made optional for a reason: you don't always have
> them. And different conversion processes have different headers
> available. Svnadmin looks at the FS layer for dumping, so it sees
> different things than an RA layer API. E.g. the dump in svnadmin has to
> create diffs from fulltexts itself, while svnrdump has diffs and must
> apply these itself to get full texts. The checksums have a similar
> mangling. The FS has access to some of the checksums and recalculates
> others for you. (See the performance drop in 1.6 of svnadmin dump)
>

Okay, agreed. I assumed the editor would provide the copyfrom's checksum
for free (or, at least, that svn_ra_stat() would provide it), but of
course I won't suggest adding those copyfrom-checksum headers if
calculating them is as expensive as it now appears to be.

> There is a similar case at the import side. Applying commits can't check
> all the checksums, but the really important ones are already handled.
> Svnrdump dump and svnrdump load are a nice match.
>
> Bert
>

Thanks for doubting,

Daniel