On Wednesday 27 February 2008, Stroller wrote:

> > Of course, this does not detect a successful, but somehow corrupted,
> > copy (which should be exceptionally rare, anyway).
>
> Well perhaps I'm just being paranoid today.
> But how do I know that a successful, but somehow corrupted, copy has
> not occurred?
>
> What makes you confident that these are rare? I don't ask this to be
> antagonistic, just to increase my own confidence in the `cp` command.

Ah well, I have no statistics here. But I can say that such a thing has
never occurred to me in the past (or at least, if it did occur, I did not
notice it). Not definitive proof, I know; just my experience. You are of
course free not to trust me and, if you're truly paranoid, you probably
shouldn't :-)

> I have to admit that I haven't run this command and I don't have any
> idea what its actual resource usage would be. I guess I'd be happy
> with a lower-grade of checksumming, if it would reduce the runtime to
> acceptable levels. With md5sum one can be - barring certain malicious
> external attacks - quite certain that a copied file is identical to
> the original. I would be happy with a "the file's there and it looks
> ok" level of confidence.

Well, md5deep has already been suggested. If you are content with
lower-grade checksumming, you could write your own script that compares
file lengths and calculates checksums only on the first n and last m
bytes of each file, for some reasonable values of n and m (bigger is
better, as you might guess). This is what backuppc (an excellent backup
program) does when it has to decide whether a file has changed (and
thus has to be backed up) compared with the copy stored in the backup
pool.
Read this for more info:

http://backuppc.sourceforge.net/faq/BackupPC.html#some_design_issues

See the "The hashing function" paragraph. Do note that (of course) this
method is not 100% accurate: it can miss corruption (a false negative)
when the damage is in the middle of the file and the file length did
not change.
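
For what it's worth, here is a rough sketch of the kind of script I
mean, in Python. It is only my guess at a reasonable approach, not
backuppc's actual algorithm: the function names (quick_digest,
looks_identical) and the 1 MiB defaults for n and m are made up for
illustration, so pick values that suit your files.

```python
import hashlib
import os


def quick_digest(path, n=1 << 20, m=1 << 20):
    """MD5 over the file length, the first n bytes and the last m bytes.

    Hypothetical helper: n and m are arbitrary defaults, not values
    taken from any existing tool.
    """
    size = os.path.getsize(path)
    h = hashlib.md5()
    # Mixing in the length means a truncated or extended copy is caught.
    h.update(str(size).encode("ascii"))
    with open(path, "rb") as f:
        h.update(f.read(n))
        # Read the tail only where it does not overlap the head.
        if size > n:
            f.seek(max(n, size - m))
            h.update(f.read(m))
    return h.hexdigest()


def looks_identical(src, dst, n=1 << 20, m=1 << 20):
    """True if both files have the same length, head and tail."""
    return quick_digest(src, n, m) == quick_digest(dst, n, m)
```

You would call looks_identical(original, copy) for each pair of files
after the cp. A True result only means "the file's there and it looks
ok": a flipped byte between the first n and the last m bytes goes
unnoticed, exactly as described above.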
-- 
gentoo-user@lists.gentoo.org mailing list
