Robin Becker wrote:

>> Does anyone know of a fast way to calculate checksums for a large file?
>> I need a way to generate ETag keys for a webserver. ETags for large
>> files are not really necessary, but it would be nice if I could do it.
>> I'm using Python's hash function on dynamically generated strings (like
>> page content), but for things like images I use shutil's copyfileobj
>> function, and the hash of a file object is just its handle's memory
>> address.
>>
>> Does anyone know of a Python utility I could use, perhaps something
>> like the md5sum utility on *nix systems?
>>
> well, md5sum is usable on many systems. I run it on win32 and darwin.
>
> I tried this in 2.4 with the new subprocess module
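(Robin's actual snippet isn't reproduced above; a minimal sketch of the
subprocess+md5sum approach he describes, with a hypothetical
md5sum_external helper, might look like this:)

import subprocess

def md5sum_external(path):
    # run the external md5sum utility and read its
    # "digest  filename" output from a pipe
    proc = subprocess.Popen(["md5sum", path], stdout=subprocess.PIPE)
    output = proc.communicate()[0]
    # the hex digest is the first whitespace-separated field
    return output.split()[0]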
on my machine, Python's md5+mmap is a little bit faster than
subprocess+md5sum:

import os, md5, mmap

# fn is the path of the file being hashed
file = open(fn, "r+")
size = os.path.getsize(fn)
hash = md5.md5(mmap.mmap(file.fileno(), size)).hexdigest()

(I suspect that md5sum also uses mmap, so the difference is probably
just the subprocess overhead)

</F>
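(A side note, not from the thread: for files too large to mmap
comfortably, a chunked-read variant avoids mapping the file at all.
This is a minimal sketch using the same md5 module as above; on
Python 2.5 and later, hashlib.md5 is the drop-in replacement.)

import md5

def file_md5(path, blocksize=65536):
    # read the file in fixed-size blocks and feed each block to the
    # md5 object, so the whole file never has to be in memory at once
    digest = md5.new()
    f = open(path, "rb")
    while True:
        block = f.read(blocksize)
        if not block:
            break
        digest.update(block)
    f.close()
    return digest.hexdigest()

The resulting hex digest can be used directly as a strong ETag value.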