Craig Ringer wrote:

>> These are huge files. My goal is to analyze the content of the gzip
>> file in the tar file without having to un-gzip it, if that is possible.
>
> As far as I know, gzip is a stream compression algorithm that can't be
> decompressed in small blocks. That is, I don't think you can seek 500k
> into a 1MB file and decompress the next 100k.
correct.

> I'd say you'll have to progressively read the file from the beginning,
> processing and discarding as you go. It looks like a no-brainer to me -
> see zlib.decompressobj.

it can be a bit tricky to set things up properly, though. here's a piece
of code that uses Python's good old consumer interface to decode things
incrementally:

    http://effbot.org/zone/consumer-gzip.htm

you can either use this as is: just create a "target consumer", wrap it
in the gzip consumer, and feed data to the gzip consumer in suitable
pieces. alternatively, hack it until it does what you want.

</F>
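for reference, here's a rough sketch of the same idea built on the
standard library alone, without the consumer classes from the page above:
stream a gzip member out of the tar, pushing decompressed chunks to a
callable as you go. the function name, the chunk size, the LineCounter
consumer, and the file names are all made up for illustration -- adapt
them to whatever analysis you actually need:

    import tarfile
    import zlib

    CHUNK = 64 * 1024  # amount of compressed data to read per iteration

    def stream_gzip_member(tar_path, member_name, consumer):
        # feed decompressed data to consumer(data) one chunk at a time,
        # without ever holding the whole uncompressed file in memory.
        # wbits=16+MAX_WBITS tells zlib to expect a gzip header/trailer.
        decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
        tar = tarfile.open(tar_path, "r")
        try:
            # extractfile gives a file-like view of the member;
            # nothing is written to disk
            fileobj = tar.extractfile(member_name)
            while True:
                raw = fileobj.read(CHUNK)
                if not raw:
                    break
                consumer(decomp.decompress(raw))
            consumer(decomp.flush())
        finally:
            tar.close()

    # example "target consumer": count lines without building the file
    class LineCounter:
        def __init__(self):
            self.count = 0
        def __call__(self, data):
            self.count += data.count(b"\n")

    counter = LineCounter()
    stream_gzip_member("archive.tar", "huge.gz", counter)
    print(counter.count)

the point is the same as with the consumer-based version: memory use
stays bounded by the chunk size, no matter how big the gzip member is.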