On Mon, Feb 16, 2004 at 10:48:32PM -0800, Craig Barratt wrote:
> "Jason M. Felice" writes:
>
> > This patch adds the --link-by-hash=DIR option, which hard links received
> > files in a link farm arranged by MD4 file hash.  The result is that the system
> > will only store one copy of the unique contents of each file, regardless of
> > the file's name.
> >
> > (rev 2)
> > * This revision is actually against CVS HEAD (I didn't realize I was working
> >   from a stale rsync'd CVS).
> > * Apply permissions after linking (permissions were lost if we already had
> >   a copy of the file in the link farm).
>
> I haven't studied your patch, but I have a couple of comments/questions:
>
> - If you update permissions, then all hardlinks will change too.
>   Does that mean that all instances of an identical file will get
>   the last mtime/permissions/ownership?  Or does the link farm have
>   unique entries for contents plus meta data (vs just contents)?

All instances of the file will have the last mtime/permissions/ownership.
This is not such a big deal for me (although it is annoying), but I can't
afford to keep multiple copies of files just because the metadata is
different.  If anyone has any suggestions to solve this which aren't too
incredibly hackish, I'll implement (all I can think of is to store
permissions in dotfiles or implement my original idea of a "database
backend" as opposed to a "filesystem backend").

> - Some file systems have a hardlink limit of 32000.  You will need to
>   roll to a new file when that limit is exceeded (ie: link() fails).

Ick.  Well, I *do* need it.

>   Also, empty files tend to be quite prevalent, so it is probably
>   easier to just create those files and not link them (should be no
>   difference in disk usage).

Sounds good.  In my test rsyncs (/etc from several machines), the zero-byte
file got 117 links.

> - How does this patch interact with -H?

They should be compatible.

>
> Craig

I'll update the patch and post.

-- 
Jason M. Felice
Cronosys, LLC <http://www.cronosys.com/>
216.221.4600 x302
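
For illustration only, the two behaviours discussed above (rolling to a new
farm entry when link() fails with EMLINK, and leaving empty files unlinked)
could be sketched roughly as follows. This is Python rather than the patch's
C, and it is not taken from the patch: the function name link_by_hash, the
two-character bucket directories, the ".N" rollover suffix and the ".lbh-tmp"
temporary name are all hypothetical. SHA-256 stands in for MD4 because MD4 is
not reliably available in Python's hashlib.

```python
import errno
import hashlib
import os


def link_by_hash(path, farm_dir):
    """Hard-link `path` into a content-addressed farm under `farm_dir`.

    Returns the farm entry that `path` shares an inode with afterwards,
    or None if the file is empty and was deliberately left alone.
    """
    # Empty files use no data blocks, so linking them saves nothing.
    if os.path.getsize(path) == 0:
        return None

    # Assumption: SHA-256 instead of the MD4 hash mentioned in the thread.
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()

    # Hypothetical layout: farm_dir/<first two hex chars>/<digest>.<n>
    bucket = os.path.join(farm_dir, digest[:2])
    os.makedirs(bucket, exist_ok=True)

    n = 0
    while True:
        entry = os.path.join(bucket, f"{digest}.{n}")
        if not os.path.exists(entry):
            # First copy of this content, or a fresh rollover slot:
            # the received file itself becomes the farm entry.
            os.link(path, entry)
            return entry
        if os.path.samefile(path, entry):
            # Already linked to this entry; nothing to do.
            return entry
        tmp = path + ".lbh-tmp"
        try:
            os.link(entry, tmp)
        except OSError as e:
            if e.errno == errno.EMLINK:
                # This entry hit the filesystem's hard-link limit:
                # roll to the next numbered entry and try again.
                n += 1
                continue
            raise
        # Atomically swap the received file for a link to the farm entry.
        os.replace(tmp, path)
        return entry
```

The sketch trusts the hash and never compares file contents, and it does not
address the metadata question: every name linked to one entry still shares a
single mtime, mode and owner.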