Hi, Christopher Culver wrote:
Returning to Python after several years away, I'm working on a little script that will download a ZIP archive from a website and unzip it to a mounted filesystem. The code is below, and it works so far, but I'm unsure of a couple of things.

The first is, is there a way to read the .zip into memory without the use of a temporary file? If I do

    archive = zipfile.ZipFile(remotedata.read())

directly without creating a temporary file, the zipfile module complains that the data is in the wrong string type.
Which makes sense given the documentation (note you can either browse the HTML online/offline or just use help() within the interpreter/IDE):

    Help on class ZipFile in module zipfile:

    class ZipFile
     |  Class with methods to open, read, write, close, list zip files.
     |
     |  z = ZipFile(file, mode="r", compression=ZIP_STORED, allowZip64=True)
     |
     |  file: Either the path to the file, or a file-like object.
     |        If it is a path, the file will be opened and closed by ZipFile.
     |  mode: The mode can be either read "r", write "w" or append "a".
     |  compression: ZIP_STORED (no compression) or ZIP_DEFLATED (requires zlib).
     |  allowZip64: if True ZipFile will create files with ZIP64 extensions when
     |              needed, otherwise it will raise an exception when this would
     |              be necessary.
     |

... so instead you would use

    archive = zipfile.ZipFile(remotedata)
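One caveat on passing the response object straight in: ZipFile has to seek within the file-like object it is given (the archive's directory sits at the end of the file), and the object returned by urlopen generally cannot seek. A minimal sketch of the in-memory route, assuming the archive is small enough to hold in RAM, wraps the downloaded bytes in io.BytesIO (StringIO.StringIO on Python 2). It is written for Python 3; the helper names open_zip_from_bytes and make_sample_zip are made up for illustration, and the sample archive is built locally to stand in for remotedata.read().

```python
import io
import zipfile


def open_zip_from_bytes(raw):
    """Wrap raw archive bytes in a seekable in-memory buffer and open them."""
    return zipfile.ZipFile(io.BytesIO(raw))


def make_sample_zip():
    """Build a tiny archive in memory (hypothetical stand-in for remotedata.read())."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", "hello world")
    return buf.getvalue()


archive = open_zip_from_bytes(make_sample_zip())
names = archive.namelist()
first_bad = archive.testzip()        # None when every member passes its CRC check
content = archive.read("hello.txt")  # member contents as bytes
```

In the original script this would amount to replacing the os.tmpfile() step with a buffer built from remotedata.read().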
The second issue is that I don't know if this is the correct way to unpack a file onto the filesystem. It's strange that the zipfile module has no one simple function to unpack a zip onto the disk. Does this code seem especially liable to break?

    try:
        remotedata = urllib2.urlopen(theurl)
    except IOError:
        print("Network down.")
        sys.exit()
    data = os.tmpfile()
    data.write(remotedata.read())
    archive = zipfile.ZipFile(data)
    if archive.testzip() != None:
        print "Invalid zipfile"
        sys.exit()
    contents = archive.namelist()
    for item in contents:
... here you should check the ZipInfo entry and normalize and clean the path, just in case, to avoid unpacking a zipfile with specially crafted paths (like /etc/passwd and such). Checking for the various encodings (like UTF-8) in pathnames may also make sense.

The directory creation could be put into a class that caches the subdirectories it has already created and recursively creates missing ones, as well as making sure you do not ascend out of your target directory by accident (or through a crafted zip, see above).

Regards
Tino
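A rough sketch of the path check described above, written for Python 3 and not meant as a definitive implementation: every member name is resolved against the target directory, and the whole archive is rejected before anything is written if a name would land outside it. The names safe_extract and build_zip are invented for this example; os.makedirs takes the place of the suggested directory-caching class, since it already creates missing parents recursively.

```python
import io
import os
import tempfile
import zipfile


def safe_extract(archive, target_dir):
    """Extract all members below target_dir; raise ValueError on unsafe paths."""
    root = os.path.realpath(target_dir)
    planned = []
    # First pass: resolve every destination and refuse the archive if any escapes root.
    for info in archive.infolist():
        dest = os.path.realpath(os.path.join(root, info.filename))
        if dest != root and not dest.startswith(root + os.sep):
            raise ValueError("unsafe path in archive: %r" % info.filename)
        planned.append((info, dest))

    # Second pass: create directories as needed and write the files.
    written = []
    for info, dest in planned:
        if info.filename.endswith("/"):      # directory entry
            os.makedirs(dest, exist_ok=True)
            continue
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        with archive.open(info) as src, open(dest, "wb") as out:
            out.write(src.read())
        written.append(dest)
    return written


def build_zip(members):
    """Hypothetical helper: build an in-memory archive from a {name: text} dict."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    buf.seek(0)
    return zipfile.ZipFile(buf)


target = tempfile.mkdtemp()
written = safe_extract(build_zip({"docs/readme.txt": "fine"}), target)

try:
    safe_extract(build_zip({"../evil.txt": "oops"}), target)
    rejected = False
except ValueError:
    rejected = True
```

As for a single call that unpacks to disk, ZipFile.extractall(path) exists from Python 2.6 onward; how thoroughly it cleans member names depends on the Python version, so an explicit check like the one above may still be worth keeping.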
-- http://mail.python.org/mailman/listinfo/python-list