Craig Ringer wrote:

>> These are huge files. My goal is to analyze the content of the gzip
>> file in the tar file without having to un-gzip it, if that is possible.
>
> As far as I know, gzip is a stream compression algorithm that can't be
> decompressed in small blocks. That is, I don't think you can seek 500k
> into a 1MB file and decompress the next 100k.
correct.

> I'd say you'll have to progressively read the file from the beginning,
> processing and discarding as you go. It looks like a no-brainer to me -
> see zlib.decompressobj.

it can be a bit tricky to set things up properly, though. here's a piece
of code that uses Python's good old consumer interface to decode things
incrementally:

    http://effbot.org/zone/consumer-gzip.htm

you can either use this as is: just create a "target consumer", wrap it
in the gzip consumer, and feed data to the gzip consumer in suitable
pieces. alternatively, hack it until it does what you want.

</F>
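for reference, here's a rough sketch of the same idea built on the
standard library alone, without the consumer classes from the page above:
stream a gzip member out of the tar, pushing decompressed chunks to a
callable as you go. the function name, the chunk size, the LineCounter
consumer, and the file names are all made up for illustration -- adapt
them to whatever analysis you actually need:

    import tarfile
    import zlib

    CHUNK = 64 * 1024  # amount of compressed data to read per iteration

    def stream_gzip_member(tar_path, member_name, consumer):
        # feed decompressed data to consumer(data) one chunk at a time,
        # without ever holding the whole uncompressed file in memory.
        # wbits=16+MAX_WBITS tells zlib to expect a gzip header/trailer.
        decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
        tar = tarfile.open(tar_path, "r")
        try:
            # extractfile gives a file-like view of the member;
            # nothing is written to disk
            fileobj = tar.extractfile(member_name)
            while True:
                raw = fileobj.read(CHUNK)
                if not raw:
                    break
                consumer(decomp.decompress(raw))
            consumer(decomp.flush())
        finally:
            tar.close()

    # example "target consumer": count lines without building the file
    class LineCounter:
        def __init__(self):
            self.count = 0
        def __call__(self, data):
            self.count += data.count(b"\n")

    counter = LineCounter()
    stream_gzip_member("archive.tar", "huge.gz", counter)
    print(counter.count)

the point is the same as with the consumer-based version: memory use
stays bounded by the chunk size, no matter how big the gzip member is.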