On 29Oct2015 11:39, Chris Angelico <ros...@gmail.com> wrote:
If it's only zipped, it's not opaque.  Just `zcat` or `zgrep` and
process away.  The whole base64+minus_newlines thing does opaquify
and doesn't really save all that much for the trouble.

If you zip the whole file as a whole, yes. If you zip individual
pieces, you can't zcat it (at least, I don't think so?).

If it is pure gzip, then yes you can. So this:

 gunzip < file1.gz; gunzip < file2.gz

and this:

 cat file1.gz file2.gz | gunzip

should produce the same output. I think this works at the record level too.
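
A quick way to check the concatenation claim from Python, as a minimal sketch: the two byte strings below are made-up stand-ins for file1.gz and file2.gz, and it relies on `gzip.decompress` accepting multi-member gzip data, which it is documented to do.

```python
import gzip

# Hypothetical stand-ins for the contents of file1.gz and file2.gz.
part1 = gzip.compress(b"first record\n")
part2 = gzip.compress(b"second record\n")

# Equivalent of: gunzip < file1.gz; gunzip < file2.gz
separate = gzip.decompress(part1) + gzip.decompress(part2)

# Equivalent of: cat file1.gz file2.gz | gunzip
combined = gzip.decompress(part1 + part2)

print(separate == combined)
```

Each `gzip.compress` call emits a complete gzip member, so the concatenation is itself a valid gzip stream.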

Of course, all bets are off once you wrap the records in some outer layer (I have a file format made of little records whose data sections may be zipped).

Conversely,
zipping the whole file means you have no choice but to sequentially
scan it - you can't pull up the last section of the file. It's still a
binary blob to many tools - we as humans may have handy tools around,
but it's still going to be an extra step for any tool that doesn't
intrinsically support it.
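
To illustrate the trade-off, here is a rough sketch of the per-record alternative: each record is compressed on its own and an offset index is kept alongside, so the last record can be pulled out without scanning from the start. The helper names and the in-memory index are assumptions made for illustration, not anything proposed in the thread.

```python
import io
import zlib

def write_records(records):
    """Compress each record independently; return (blob, index).

    index holds one (offset, length) pair per record (hypothetical layout).
    """
    buf = io.BytesIO()
    index = []
    for rec in records:
        chunk = zlib.compress(rec)
        index.append((buf.tell(), len(chunk)))
        buf.write(chunk)
    return buf.getvalue(), index

def read_record(blob, index, n):
    """Decompress only record n, using the index to locate it."""
    offset, length = index[n]
    return zlib.decompress(blob[offset:offset + length])

records = [b"alpha state\n", b"beta state\n", b"gamma state\n"]
blob, index = write_records(records)
print(read_record(blob, index, -1))
```

The cost is that each small record is compressed with no shared history, so the ratio is typically worse than compressing the file as a whole.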

Yes. But if you're keeping a lot of data or you're using a very constrained system you probably do want compression somewhere in there. Maybe the OP is optimising prematurely, but again, maybe not.

However, it sounds like the OP wants a text log encoding some test state, and is just compressing to gain a little room. I suspect that with the kind of short record you might put on one line, the compression obtained will be small, and the loss from any base64 post-step will undo it all. He may be better off keeping conventional text logs and just rotating them and compressing the rotated copies.
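
The size argument is easy to measure. A small sketch, assuming a made-up log line of roughly the length one might write per test (the line's content is purely illustrative):

```python
import base64
import zlib

# Hypothetical short log record; not taken from the OP's data.
line = b"2015-10-29 11:39:00 test_case_17 PASS duration=0.42s"

compressed = zlib.compress(line, 9)
encoded = base64.b64encode(compressed)

# Compare raw, compressed, and compressed+base64 sizes in bytes.
print(len(line), len(compressed), len(encoded))
```

Base64 emits 4 output bytes for every 3 input bytes, so the compression has to shave off more than a quarter of the record just to break even, and a line this short has very little redundancy for zlib to exploit.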

Cheers,
Cameron Simpson <c...@zip.com.au>

Hoping to shave precious seconds off the time it would take me to get through the checkout process and on my way home, I opted for the express line ("9 Items Or Less [sic]" Why nine items? Where do they come up with these rules, anyway? It's the same way at most stores -- always some oddball number like that, instead of a more understandable multiple of five. Like "five.")
- Geoff Miller, geo...@purplehaze.corp.sun.com