On Tue, 08 Feb 2005 16:13:43 +0000, rumours say that Robin Becker <[EMAIL PROTECTED]> might have written:
>Ola Natvig wrote:
>> Hi all
>>
>> Does anyone know of a fast way to calculate checksums for a large
>> file? I need a way to generate ETag keys for a webserver; the ETags
>> of large files are not really necessary, but it would be nice if I
>> could do it. I'm using the Python hash function on dynamically
>> generated strings (like page content), but on things like images I
>> use shutil's copyfileobj function, and the hash of a file object is
>> just its handler's memory address.
>>
>> Does anyone know of a Python utility that could be used, perhaps
>> something like the md5sum utility on *nix systems?
>
>well, md5sum is usable on many systems. I run it on win32 and darwin.

[snip use of some md5sum.exe]

Why not use the md5 module? The following md5sum.py is in use and
tested, but not "failproof":

|import sys, md5
|from glob import glob
|
|for arg in sys.argv[1:]:
|    for filename in glob(arg):
|        fp = file(filename, "rb")   # binary mode matters on win32
|        md5sum = md5.new()
|        while True:
|            data = fp.read(65536)   # hash the file in 64 KiB chunks
|            if not data:
|                break
|            md5sum.update(data)
|        fp.close()
|        print md5sum.hexdigest(), filename

It's fast enough, especially if you cache results.
--
TZOTZIOY, I speak England very best.
"Be strict when sending and tolerant when receiving." (from RFC 1958)
I really should keep that in mind when talking with people, actually...
--
http://mail.python.org/mailman/listinfo/python-list
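
[Editorial aside, not part of the original thread.] Since the original
question was about ETag generation and the reply suggests caching
results, here is a minimal sketch of how the same chunked-md5 loop
might be wrapped into an ETag helper with a simple mtime/size cache.
The function name file_etag and the module-level cache dict are
illustrative assumptions, not an established API:

|import md5, os
|
|_etag_cache = {}   # path -> ((mtime, size), etag); assumed helper state
|
|def file_etag(path, blocksize=65536):
|    """Chunked md5 of a file, formatted as a quoted ETag value.
|    Results are cached on (mtime, size) so unchanged files are
|    not re-read."""
|    st = os.stat(path)
|    cached = _etag_cache.get(path)
|    if cached and cached[0] == (st.st_mtime, st.st_size):
|        return cached[1]
|    digest = md5.new()
|    fp = file(path, "rb")
|    try:
|        while True:
|            data = fp.read(blocksize)
|            if not data:
|                break
|            digest.update(data)
|    finally:
|        fp.close()
|    etag = '"%s"' % digest.hexdigest()   # quoted, as HTTP ETags are
|    _etag_cache[path] = ((st.st_mtime, st.st_size), etag)
|    return etag

A caller would then do something like file_etag("/var/www/logo.png")
when building the response headers; if the file has not changed since
the last request, only an os.stat is paid, not a full re-hash.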