On Thu, Oct 29, 2015 at 9:53 AM, Tim Chase <python.l...@tim.thechases.com> wrote:
> On 2015-10-29 09:38, Chris Angelico wrote:
>> On Thu, Oct 29, 2015 at 9:30 AM, Marc Aymerich
>> <glicer...@gmail.com> wrote:
>> > I'm writing an application that saves historical state in a log
>> > file. I want to be really efficient in terms of used bytes.
>>
>> Why, exactly?
>>
>> By zipping the state, you make it utterly opaque.
>
> If it's only zipped, it's not opaque. Just `zcat` or `zgrep` and
> process away. The whole base64+minus_newlines thing does opaquify
> and doesn't really save all that much for the trouble.

If you zip the file as a whole, yes. If you zip individual pieces, you
can't zcat it (at least, I don't think so?). Conversely, zipping the
whole file means you have no choice but to scan it sequentially - you
can't pull up the last section of the file. It's still a binary blob
to many tools - we as humans may have handy tools around, but it's
still going to be an extra step for any tool that doesn't
intrinsically support it.

>> Disk space is not expensive. Even if you manage to cut your file by
>> a factor of four (75% compression, which is entirely possible if
>> your content is plain text, but far from guaranteed)
>
> Though one also has to consider the speed of reading it off the drive
> for processing. If you have spinning-rust drives, it's pretty slow
> (and SSD is still not like accessing RAM), and reading zipped
> content can shovel a LOT more data at your CPU than if it is coming
> off the drive uncompressed. Logs aren't much good if they aren't
> being monitored and processed for the information they contain. If
> nobody is monitoring the logs, just write them to /dev/null for 100%
> compression. ;-)

Yeah. There are lots of considerations, but frankly, I don't think
disk _capacity_ is a big one. Sometimes you _might_ get some benefit
from compression (writing fewer sectors might save you time), but I
almost never fill up my hard drives, and when I do, it's usually with
already-compressed data (movies and stuff).

ChrisA
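
On the "zip individual pieces" question, here is a minimal sketch of one way it could work. It assumes each record is written as its own gzip member, which is not the zlib+base64 scheme discussed in this thread. The gzip format allows members to be concatenated, so the result can still be read front to back as one stream, and a single record can be pulled out if its starting offset is known. The `append_record` helper, the sample records, and the in-memory `offsets` list are all hypothetical; a real log would need to keep that index somewhere.

```python
import gzip
import io


def append_record(log, record):
    """Append one record as its own gzip member; return the member's start offset."""
    offset = log.seek(0, io.SEEK_END)
    log.write(gzip.compress(record.encode("utf-8")))
    return offset


# BytesIO stands in for a log file opened in binary mode (assumption).
log = io.BytesIO()
records = [
    "state=1 user=alice\n",
    "state=2 user=bob\n",
    "state=3 user=carol\n",
]
offsets = [append_record(log, r) for r in records]
blob = log.getvalue()

# Sequential read of the whole log: every member is decompressed in order.
everything = gzip.decompress(blob).decode("utf-8")

# Random access to the last record only, using its stored offset.
last = gzip.decompress(blob[offsets[-1]:]).decode("utf-8")

print(everything, end="")
print("last record:", last, end="")
```

The catch is overhead: every gzip member carries at least 18 bytes of header and trailer, and short records give the compressor very little to work with, so this may well come out larger than the plain text for small state snapshots.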