Re: Thought on large files

Brendan Grieve Wed, 23 Jan 2008 20:53:56 -0800

Matt McCutchen wrote:

On Wed, 2008-01-23 at 13:38 +0900, Brendan Grieve wrote:

Let's say 
the file, whatever it is, is a 10GB file, and that some small amount of 
data changes in it. This is efficiently sent across by rsync, BUT the 
rsync server side will correctly break the hard-link and create a new 
file with the changed bits. This means, if even 1 byte of that 10GB file 
changes, you now have to store that whole file again.

My thought was that if the server could transparently break a 
large file into chunks and store them that way, then one could still make 
use of hard-links efficiently.


This is a fine idea, but I don't think support for this should be added
to rsync.  Instead, I suggest that you use rdiff-backup
( http://www.nongnu.org/rdiff-backup/ ), a backup tool that stores an
ordinary latest snapshot of the source along with reverse deltas for
previous snapshots and redundant attribute information, both in its own
format.


Matt

I had a look at rdiff-backup, but I was trying to get something that spoke native rsync (i.e., something that does not force any change on the client side).

I do, however, agree that support should NOT be added to rsync. Rsync is a mirroring tool, not an elaborate tool that needs to know how files are actually stored. In fact, I'd go as far as to say that many of the options rsync does support (e.g., backup) veer away from being a simple mirroring tool.

After some thought, I think the best place to put such a change would be at the filesystem level. For example, one could have a FUSE filesystem that simply runs on top of an existing one and writes its files as I described (or uses diff-like methods), but presents a clean filesystem for rsync (or indeed any tool) to make use of. I believe I may look in that direction instead of hacking rsync.
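
As a rough illustration of the chunked layout (a sketch only, not part of rsync or of any existing FUSE filesystem), something like the following could store each snapshot of a large file as a directory of fixed-size chunk files, hard-linking every chunk that is unchanged from the previous snapshot. The names store_chunked, read_chunked and CHUNK_SIZE, the 4 MiB chunk size, and the numbered-file layout are all assumptions made for the example.

```python
import os

CHUNK_SIZE = 4 * 1024 * 1024  # hypothetical chunk size (4 MiB)


def _chunk_matches(path, data):
    """Return True if the chunk file at `path` holds exactly `data`."""
    if os.path.getsize(path) != len(data):
        return False
    with open(path, "rb") as f:
        return f.read() == data


def store_chunked(src_path, snap_dir, prev_dir=None, chunk_size=CHUNK_SIZE):
    """Store src_path as numbered chunk files inside snap_dir.

    Chunks identical to the same-numbered chunk in prev_dir are
    hard-linked rather than copied. Returns the number of chunks
    that had to be written as new files.
    """
    os.makedirs(snap_dir, exist_ok=True)
    written = 0
    index = 0
    with open(src_path, "rb") as src:
        while True:
            data = src.read(chunk_size)
            if not data:
                break
            name = f"{index:08d}"
            dest = os.path.join(snap_dir, name)
            prev = os.path.join(prev_dir, name) if prev_dir else None
            if prev and os.path.isfile(prev) and _chunk_matches(prev, data):
                os.link(prev, dest)  # unchanged chunk: share storage
            else:
                with open(dest, "wb") as out:
                    out.write(data)  # new or changed chunk
                written += 1
            index += 1
    return written


def read_chunked(snap_dir):
    """Reassemble the original file contents from a snapshot directory."""
    parts = []
    for name in sorted(os.listdir(snap_dir)):
        with open(os.path.join(snap_dir, name), "rb") as f:
            parts.append(f.read())
    return b"".join(parts)
```

With a layout like this, changing 1 byte of a 10GB file would cost one new chunk per snapshot rather than a whole new copy. A change that inserts or removes bytes shifts every later chunk, though, so fixed-size chunking only helps for in-place modifications.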

Brendan Grieve

