This is a bit of an oversimplification. Just cutting a file into fixed-size chunks does not work. You need a rolling checksum; otherwise you will be unable to detect data that moved within a file.
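
To make this concrete, here is a minimal sketch (not zsync's actual implementation) that uses the 15-byte sample file and 5-byte chunks discussed in this message. The function names, the simple additive weak checksum, and MD5 as the strong checksum are assumptions chosen for illustration. It compares fixed-offset chunks against a search that slides a rolling checksum over every byte offset.

```python
import hashlib

BLOCK = 5  # chunk size in bytes (illustrative)

def static_chunks(data: bytes, size: int = BLOCK):
    """Cut data at fixed offsets 0, size, 2*size, ..."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def weak_sum(block: bytes) -> int:
    """Cheap additive checksum; assumed here in place of a real rolling sum."""
    return sum(block) & 0xFFFF

def find_known_blocks(old: bytes, new: bytes, size: int = BLOCK):
    """Map offset-in-new -> index of the old block found at that offset."""
    table = {}
    for idx, blk in enumerate(static_chunks(old, size)):
        if len(blk) == size:
            strong = hashlib.md5(blk).digest()
            table.setdefault(weak_sum(blk), []).append((idx, strong))
    found = {}
    if len(new) < size:
        return found
    rolling = weak_sum(new[:size])
    pos = 0
    while True:
        window = new[pos:pos + size]
        # Only compute the strong checksum when the weak one matches.
        for idx, strong in table.get(rolling, []):
            if hashlib.md5(window).digest() == strong:
                found[pos] = idx
                break
        if pos + size >= len(new):
            break
        # Roll the window one byte: drop new[pos], add new[pos + size].
        rolling = (rolling - new[pos] + new[pos + size]) & 0xFFFF
        pos += 1
    return found

old = b"aaaaabbbbbccccc"
new = b"adaaaabbbbbccccc"

# Fixed offsets: no chunk of the new file matches any old chunk.
print(set(static_chunks(old)) & set(static_chunks(new)))  # set()

# Rolling search: 'bbbbb' and 'ccccc' are found at their shifted offsets.
print(find_known_blocks(old, new))  # {6: 1, 11: 2}
```

With the rolling search, only the first 6 bytes of the new file would have to be transferred. The other two blocks are already known on the other side.
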
Assume a file 'aaaaabbbbbccccc' and a chunk size of 5 bytes. You then modify the file locally to 'adaaaabbbbbccccc'. With 'static' chunks, everything after the inserted byte shifts by one position, so no chunk matches anymore and you are forced to re-upload all chunks. Edits like this are not uncommon.

But this is indeed exactly what zsync is. Basically you need to store the zsync file as metadata, because calculating the checksums on the server is not really a scalable solution.

Long story short: change detection is not only about changed chunks. It is also about moved data (that did not change).

I agree that the blog series is a good place to start on how to design the API.

Cheers,
--Roeland

From: Klaas Freitag <frei...@owncloud.com>
To: <devel@owncloud.org>
Sent: 22-3-2016 15:16
Subject: Re: [owncloud-devel] GSoC Proposal for Large File Sync

On 21.03.2016 23:58, Tomaz Canabrava wrote:
> Hi,
>
> I can work on a proof of concept for large text files and virtual
> machine images (which would already be a win-situation for some users)
> and then focus on *some* of the hard to sync files (like powerpoint
> presentations) and see what I could get.
>

I do not think you should consider the file type at all. Just try to implement the zsync-based approach, I'd say, and only for the chunked upload mode.

Raw steps:
1. On the client, chop the file into chunks and create a list with, per chunk: chunk number, start byte, end byte, checksum?
2. Send this list to the server to get the server's checksums.
3. While waiting for the server's list of checksums, calculate the client checksums.
4. Compare the lists once both are ready and decide which chunks need to be uploaded.
5. Upload the chunks that changed.

The trick is in the cutting of the chunks. The number of chunks that do not change can be increased by picking clever boundaries.

This project requires both client and server work.
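
The raw steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, MD5 stands in for whatever checksum is eventually chosen, chunks are cut at fixed offsets, and the server's list is simulated locally instead of being fetched through the chunking API.

```python
import hashlib

def chunk_list(data: bytes, size: int):
    """Steps 1 and 3: one entry per chunk (number, start, end, checksum)."""
    entries = []
    for number, start in enumerate(range(0, len(data), size)):
        end = min(start + size, len(data)) - 1  # inclusive end offset
        digest = hashlib.md5(data[start:end + 1]).hexdigest()
        entries.append((number, start, end, digest))
    return entries

def chunks_to_upload(client_list, server_list):
    """Step 4: a chunk needs upload if the server lacks it or its checksum differs."""
    server = {(start, end): digest for _, start, end, digest in server_list}
    return [e for e in client_list if server.get((e[1], e[2])) != e[3]]

# Step 2 is simulated: the "server list" is computed from a local copy.
server_copy = b"aaaaabbbbbccccc"
client_copy = b"aaaaabbXbbccccc"  # one byte overwritten in the second chunk

changed = chunks_to_upload(chunk_list(client_copy, 5), chunk_list(server_copy, 5))

# Step 5 would upload only these chunks.
print([entry[0] for entry in changed])  # [1]
```

Because the chunks here are cut at fixed offsets, an overwrite in place touches only one chunk, but an insertion would still invalidate every chunk after it. That is where the choice of boundaries matters.
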
Please do the server work based on what is described in the blog series about the new chunking API. There is a branch with a basic implementation of that here:
https://github.com/owncloud/core/pull/20118

Makes sense?

regards,

Klaas

_______________________________________________
Devel mailing list
Devel@owncloud.org
http://mailman.owncloud.org/mailman/listinfo/devel