Re: [PATCH] Server-transmitted "final SHA1 checksums" (Was: "large number of large binary files in subversion")

Arwin Arni Tue, 24 May 2011 08:54:05 -0700

On Tuesday 24 May 2011 09:00 PM, C. Michael Pilato wrote:

On 05/24/2011 04:23 AM, Stefan Sperling wrote:

On Tue, May 24, 2011 at 12:04:26PM +0530, Arwin Arni wrote:

Why can't we send the recorded checksum from the server instead of
sending the whole file and then calculating it on the client side?


If the checksum matches one of the pristine files, then use that to
populate the nodes table. If there is no match, only then do we
spool to a temporary file and what not.

This seems like a straightforward idea. Any pitfalls to this approach?
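
To make the proposal concrete, here is a minimal Python sketch of the checksum-first lookup. It is not Subversion code: `install_file`, the in-memory `pristines` dict and the `fetch_text` callback are all hypothetical stand-ins for the pristine store and the network transfer.

```python
import hashlib


def install_file(server_sha1, pristines, fetch_text):
    """Return (text, transferred) for a file whose final SHA-1 the server sent first.

    pristines  -- hypothetical dict: SHA-1 hex digest -> bytes already held locally
    fetch_text -- hypothetical callable that downloads the full text from the server
    """
    if server_sha1 in pristines:
        # Match: reuse the local pristine text, nothing is transferred.
        return pristines[server_sha1], False

    # No match: only now transfer the text, then verify it before storing.
    text = fetch_text()
    actual = hashlib.sha1(text).hexdigest()
    if actual != server_sha1:
        raise ValueError("checksum mismatch: expected %s, got %s"
                         % (server_sha1, actual))
    pristines[server_sha1] = text
    return text, True
```

The second return value is only there so a caller (or a test) can see whether the transfer was actually avoided.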

The pitfall is that you need to change the client<->server interface.

The current interface at subversion/include/svn_delta.h only transmits
checksums before and after applying deltas. First it transmits the checksum
of the expected base text for the delta (apply_textdelta), and then,
after the delta has been applied, it sends the expected checksum of the
content resulting from application of the delta to the base (close_file).
This interface doesn't prevent data from being transmitted.
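
The ordering described above can be illustrated with a toy model. This is a sketch under assumptions, not the real editor API: `ToyFile` is hypothetical, the "delta" is simplified to a complete new text, and MD5 is used only as a stand-in digest. The method names `apply_textdelta` and `close_file` are borrowed from the interface purely for orientation.

```python
import hashlib


def digest(data):
    # Stand-in digest for the sketch.
    return hashlib.md5(data).hexdigest()


class ToyFile:
    """Toy receiver that mimics the checksum ordering of the editor drive."""

    def __init__(self, base_text):
        self.text = base_text
        self.bytes_received = 0

    def apply_textdelta(self, base_checksum, new_text):
        # Step 1: the sender states the checksum of the base it expects.
        if base_checksum is not None and digest(self.text) != base_checksum:
            raise ValueError("base checksum mismatch")
        # The content is transmitted here, before the result checksum is known.
        self.bytes_received += len(new_text)
        self.text = new_text

    def close_file(self, text_checksum):
        # Step 2: only after the data has arrived is the result checksum sent.
        if digest(self.text) != text_checksum:
            raise ValueError("result checksum mismatch")
```

In this model the receiver has already counted every byte of `new_text` by the time `close_file` tells it what the result should hash to, which is why the checksum cannot be used to skip the transfer.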

Heheh... I'm sitting on a patch which transmits the "final sha1 checksum"
*before* file contents would be transmitted over DAV.  Something I whipped
up before hopping on the plane out of Berlin.


Nice :)

I was hoping to slip in
support for avoiding those text transfers altogether where possible.  But I
ran into the obvious problems with the editor interface.  (Also, I became
concerned about the possibility of a race condition in the pristines table
-- client says "Yep, I've got that text already!", edit continues while
simultaneously in some other part of the tree that text is removed, edit
goes to reference the supposedly redundant checksum and it ain't there no mo'.)

Can't we increase the ref-count of the pristine node on getting the
<add-file...> element? This way, we need not worry about a parallel
<delete-file..> (IIUC delete-file will only reduce the ref-count, and
the actual removal of the pristine node is taken care of after the
entire editor drive. At least, this is the way it is in my head.)
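
A rough Python sketch of the ref-count idea, as I understand it. `PristineStore` and its methods are hypothetical and do not reflect the actual pristine-store schema or API. The point is only that a delete lowers the count while the text itself survives until a cleanup pass after the drive.

```python
class PristineStore:
    """Hypothetical ref-counted store keyed by SHA-1 hex digest."""

    def __init__(self):
        self.texts = {}  # sha1 -> bytes
        self.refs = {}   # sha1 -> reference count

    def add_ref(self, sha1, text=None):
        # Called on add-file: claim the text, storing it if it is new.
        if sha1 not in self.texts:
            if text is None:
                raise KeyError(sha1)
            self.texts[sha1] = text
        self.refs[sha1] = self.refs.get(sha1, 0) + 1

    def drop_ref(self, sha1):
        # Called on delete-file: only lowers the count, never removes text.
        self.refs[sha1] -= 1

    def cleanup(self):
        # Called once after the entire editor drive: remove unreferenced texts.
        for sha1 in [k for k, n in self.refs.items() if n <= 0]:
            del self.refs[sha1]
            del self.texts[sha1]
```

With this shape, an add-file that says "I've got that text already" bumps the count first, so a delete-file elsewhere in the tree cannot make the text vanish mid-drive.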

I held off on committing my RA work because there were no consumers.  But
I'm still considering making the change because a) there's no penalty for
doing so, and b) maybe if we add the requisite client-side magic in 1.8, 1.7
servers would already be advertising the sha1 checksum that code would need.

Thoughts?

I'll attach my patch which, if accepted in 1.7, should probably be expanded
to cover svnserve, too.
