On Sun, 17 Apr 2005 22:06:04 -0600, Ivan Van Laningham <[EMAIL PROTECTED]> wrote:
[snip]
> So I wrote a set of
> programs to both index the disk versions with the cd versions, and to
> compare, using filecmp.cmp(), the cd and disk version.  Works fine.
> Turned up several dozen files that had been inadvertently rotated or
> saved with the wrong quality, various fat-fingered mistakes like that.
>
> However, it didn't flag the files that I know have bitrot.  I seem to
> remember that diff uses a checksum algorithm on binary files, not a
> byte-by-byte comparison.  Am I wrong?

According to the docs:

"""
cmp( f1, f2[, shallow[, use_statcache]])
Compare the files named f1 and f2, returning True if they seem equal,
False otherwise. Unless shallow is given and is false, files with
identical os.stat() signatures are taken to be equal
"""

and what is an os.stat() signature, you ask? So did I. According to
the code itself:

    def _sig(st):
        return (stat.S_IFMT(st.st_mode),
                st.st_size,
                st.st_mtime)

Looks like it assumes two files are the same if they are of the same
type, same size, and same time-last-modified. Normally I guess that's
good enough, but maybe the phantom bit-toggler is bypassing the file
system somehow. What OS are you running?

You might like to do two things:
(1) run your comparison again with shallow=False
(2) submit a patch to the docs. (-:

You have of course attempted to eliminate other variables by checking
that the bit-rot effect is apparent using different display software,
a different computer, an observer who's not on the same medication as
you, ... haven't you? :-)

HTH,
John
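
P.S. In case it helps, here is a minimal sketch of suggestion (1). It
assumes a current Python 3, where cmp() takes only f1, f2 and shallow
(the use_statcache argument quoted above belongs to older releases).
The helper make_bitrot_pair() and the file names are made up purely for
illustration: it writes two files of the same size that differ by one
bit and forces their modification times to match, which is my guess at
what your bitrot looks like to os.stat(), not something I know about
your disks.

```python
import filecmp
import os
import tempfile


def compare_both_ways(path_a, path_b):
    """Return (shallow_result, deep_result) from filecmp.cmp()."""
    # Default call: matching os.stat() signatures are taken as "equal".
    shallow = filecmp.cmp(path_a, path_b)
    # shallow=False skips that shortcut and checks the file contents.
    deep = filecmp.cmp(path_a, path_b, shallow=False)
    return shallow, deep


def make_bitrot_pair(directory, size=1024):
    """Hypothetical helper: same size, same mtime, one bit different."""
    good = os.path.join(directory, "good.bin")
    rotten = os.path.join(directory, "rotten.bin")
    data = bytearray(size)              # all zero bytes
    with open(good, "wb") as f:
        f.write(data)
    data[size // 2] ^= 0x01             # flip a single bit
    with open(rotten, "wb") as f:
        f.write(data)
    # Force identical timestamps so both os.stat() signatures match.
    stamp = os.stat(good).st_mtime_ns
    os.utime(good, ns=(stamp, stamp))
    os.utime(rotten, ns=(stamp, stamp))
    return good, rotten


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        a, b = make_bitrot_pair(tmp)
        shallow, deep = compare_both_ways(a, b)
        print("shallow says equal:", shallow)
        print("deep says equal:   ", deep)
```

If the two answers disagree on your real cd/disk pairs as well, the
stat-signature shortcut is the likely reason the damaged files were not
flagged.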