On 2009-04-15, Martin <mar...@marcher.name> wrote:
> On Wed, Apr 15, 2009 at 11:03 AM, Steven D'Aprano

> I'd still say rather burn CPU cycles than development hours (if I got
> the question right),

_Hours_?  Calling the file compare module takes
_one_line_of_code_.  Implementing a file compare from scratch
takes about a half dozen lines of code.

> if not then with binary files you will have to find some way
> of representing differences between the 2 files in a readable
> manner anyway.

1) Who said anything about a readable representation of the
   differences?

2) How does a checksum provide that?

>> Hashing is a *lot* more work than just comparing two bytes.
>> The MD5 checksum has been specifically designed to be fast and
>> compact, and the algorithm is still complicated:
>
> I know that the various checksum algorithms aren't exactly
> cheap, but I do think that just to know whether 2 files are
> different a solution which takes 5mins to implement wins
> against a lengthy discussion

Bah.  A direct compare is trivial.  The discussion of which
checksum to use, how to implement it, and how reliable it is
will be far longer than any discussion over a direct
comparison.

> which optimizes too early wins hands down.

Optimizes too early?  Comparing the bytes is the simplest and
most direct, obvious solution.  It takes one line of code to
call the file compare module.  Implementing it from scratch
takes about five lines of code.

We all rail against premature optimization, but using a
checksum instead of a direct comparison is premature
unoptimization.  ;)

-- 
Grant Edwards                   grante             Yow! Hmmm ... A hash-singer
                                  at               and a cross-eyed guy were
                               visi.com            SLEEPING on a deserted island, when ...
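
For reference, here is a minimal sketch of the two approaches mentioned
above, assuming "the file compare module" means the standard library's
filecmp. The function names and the chunk_size parameter are made up
for illustration; they are not from the thread.

```python
import filecmp


def files_identical_stdlib(path_a, path_b):
    # The one-line version: shallow=False forces a byte-by-byte
    # comparison instead of trusting os.stat() signatures.
    return filecmp.cmp(path_a, path_b, shallow=False)


def files_identical(path_a, path_b, chunk_size=64 * 1024):
    # The from-scratch version: read both files in blocks and stop
    # at the first block that differs.
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            block_a = fa.read(chunk_size)
            block_b = fb.read(chunk_size)
            if block_a != block_b:
                return False
            if not block_a:  # both files hit EOF at the same time
                return True
```

Either version can bail out at the first differing block, whereas a
checksum has to read and digest both files in full before it can say
anything.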