I have a large ASCII data set that is zipped to a reasonable size.
Can I access the data without decompressing the whole file first?
I would like to run through the data to produce a much smaller
extract and some summary statistics, but without unzipping
it (if that is even possible).

Yes, though if you're running pre-2.6 Python you'll need to hack your install slightly.

I had the same question[1], and Gabriel suggested[2] I try dropping the 2.6 version of zipfile.py into a directory on my $PYTHONPATH so it's found before the existing version.

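Once it's available, you can use the ZipFile.open() method, which returns a file-like object you can iterate over line by line rather than reading the entire content into memory. The data is decompressed as you read it, so nothing has to be extracted to disk first. Read through the thread for further details.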
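
Here's a rough sketch of what that looks like for the extract-plus-summary case. It's written for a current Python (where open() hands back bytes, hence the TextIOWrapper, and ZipFile works as a context manager); on 2.6 you'd call close() yourself. The function name, the "keep" callback and the choice of a line count as the summary are all made up for illustration, so adapt to taste:

```python
import io
import zipfile


def extract_and_summarize(zip_path, member_name, keep):
    """Stream one archive member line by line.

    Returns (kept_lines, line_count). `keep` is a hypothetical
    callback that decides which lines go into the extract.
    """
    kept = []
    line_count = 0
    with zipfile.ZipFile(zip_path) as archive:
        # open() returns a file-like object that decompresses on demand,
        # so the member is never fully loaded or written to disk.
        raw = archive.open(member_name)
        # Assumes the member really is ASCII text, as in the question.
        with io.TextIOWrapper(raw, encoding="ascii") as text:
            for line in text:
                line_count += 1
                if keep(line):
                    kept.append(line.rstrip("\n"))
    return kept, line_count
```

Only the lines you keep stay in memory, so this should scale to archives much larger than RAM as long as the extract itself is small.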

Works on My Machine(tm)[3]

-tkc

[1]
http://mail.python.org/pipermail/python-list/2007-December/469254.html

[2]
http://mail.python.org/pipermail/python-list/2007-December/469320.html

[3]
http://www.codinghorror.com/blog/archives/000818.html



