Steven D'Aprano <[EMAIL PROTECTED]> writes:
> Sure. But if you are just comparing two files, is there any reason to
> bother with a checksum? (MD5 or other.)
No, of course not, except in special situations, such as when there is some problem opening and reading both files simultaneously. E.g.: the files are on two different DVD-Rs, they are too big to fit in RAM, and you only have one DVD drive. If you want to compare byte by byte, you have to either copy one of the DVDs to your hard disk (if you have the space available) or else swap DVDs back and forth in the drive, reading and comparing a bufferload at a time. But you can easily read in the first DVD and compute its hash on the fly, then read and hash the second DVD and compare the hashes.

If it's a normal situation with two files on hard disk, just open both files simultaneously and use large buffers to keep the amount of seeking reasonable. That will be faster than big MD5 computations, and more reliable (there are known ways to construct pairs of distinct files that have the same MD5 hash).
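
For illustration, here is a rough sketch of both approaches, assuming Python 3 and the standard hashlib module. The names file_digest, same_contents and CHUNK are made up for this example, and the 1 MiB buffer size is an arbitrary guess at "large". The sketch defaults to SHA-256 rather than MD5 because of the collision issue mentioned above; pass "md5" if you really want MD5.

```python
import hashlib

CHUNK = 1 << 20  # 1 MiB read buffer; arbitrary, tune for your hardware


def file_digest(path, algo="sha256", chunk_size=CHUNK):
    """Hash a file on the fly, one buffer at a time, without loading it all."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling read() until it returns b""
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def same_contents(path_a, path_b, chunk_size=CHUNK):
    """Compare two regular files byte for byte, reading both in large buffers."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                return False  # contents (or lengths) differ
            if not a:
                return True   # both files hit end-of-file together
```

In the one-drive DVD case you would call file_digest() on the first disc, swap discs, call it again, and compare the two strings. With both files on hard disk, same_contents() stops at the first differing buffer, so it never reads more than it has to. The standard library's filecmp.cmp(a, b, shallow=False) does much the same byte-by-byte comparison.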