In response to Michel Meyers <[EMAIL PROTECTED]>:

> Bill Moran wrote:
> > I expect that what happens is when a file with a duplicate filename is
> > backed up for the first time, a checksum is generated to compare it to
> > files of the same name already in the system.  When incrementals are run,
> > if the file is recently modified, the checksums are checked again.
> > 
> > I think the first thing that would need to occur for Bacula to do this,
> > is the use of something stronger than MD5.  Perhaps SHA256.
> 
> Why would Bacula need to use SHA256? MD5 should be more than sufficient
> to distinguish 2 different files that happen to have the same name and
> filesize.

I don't have the math background to say one way or the other.  I mentioned
it because I recall reading that rsync uses strong hash functions.
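
For anyone who wants to try the comparison being discussed, here is a
minimal sketch in Python. It is not Bacula code; the function names
file_digest and same_content are made up for illustration, and the
algorithm is just a parameter so MD5 and SHA-256 can be swapped freely.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=65536):
    """Return the hex digest of a file, reading in chunks to bound memory use."""
    h = hashlib.new(algorithm)  # accepts "md5", "sha256", etc.
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (EOF)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_content(path_a, path_b, algorithm="md5"):
    """Hypothetical helper: treat two files as duplicates if their digests match."""
    return file_digest(path_a, algorithm) == file_digest(path_b, algorithm)
```

Calling same_content(a, b, "sha256") instead of the default only changes
which digest is computed; the comparison logic is identical either way.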

> From a checksum/hashing standpoint, Bacula should be ready to go. It's
> the implementation of the duplicate detection and elimination algorithms
> that requires careful planning and a lot of work to implement everywhere.

I think the big thing is that it would require a new protocol between
the Director and the FD.  As I understand it, the Director simply tells
the FD to back up everything modified after $date.  To do incrementals
with deduplication, the FD would have to send filenames with hash values
back to the Director, which would have to look them up to determine
whether each file needed to be backed up, and then tell the FD what
to do.
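
To make that round trip concrete, here is a rough sketch in Python. It is
purely illustrative and says nothing about how Bacula actually works: the
Director and FileDaemon classes, the in-memory catalog, and the use of
SHA-256 are all assumptions of mine.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Director:
    # Hypothetical catalog: content hash -> first filename stored with that hash.
    catalog: dict = field(default_factory=dict)

    def decide(self, candidates):
        """Given (filename, hash) pairs from the FD, return the names to back up."""
        wanted = []
        for name, digest in candidates:
            if digest not in self.catalog:
                self.catalog[digest] = name  # record new content
                wanted.append(name)
        return wanted


@dataclass
class FileDaemon:
    files: dict  # filename -> bytes, standing in for the client's disk

    def report(self, changed):
        """Step 1: send filenames with hash values to the Director."""
        return [(n, hashlib.sha256(self.files[n]).hexdigest()) for n in changed]

    def send(self, wanted):
        """Step 3: transfer only the files the Director asked for."""
        return {n: self.files[n] for n in wanted}


def incremental(director, fd, changed):
    candidates = fd.report(changed)       # FD -> Director: names + hashes
    wanted = director.decide(candidates)  # Director: catalog lookup
    return fd.send(wanted)                # Director -> FD: back up these only
```

With files a.txt and b.txt holding identical bytes, a first run over both
transfers only a.txt, and a later run over either transfers nothing. The
cost is the extra round trip per job, which is the protocol change I mean.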

-- 
Bill Moran
http://www.potentialtech.com

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
