On Fri, 2009-09-04 at 13:41 -0700, Richard Elling wrote:
> On Sep 4, 2009, at 12:23 PM, Len Zaifman wrote:
>
> > We have groups generating terabytes a day of image data from lab
> > instruments and saving them to an X4500.
>
> Wouldn't it be easier to compress at the application, or between the
> application and the archiving file system?
Preamble: I am actively doing research into image set compression, specifically JPEG2000, so that is my point of reference.

I think it would be easier to compress at the application level. I would suggest getting the image from the source, compressing it with lossless JPEG2000, and saving the result to a ZFS filesystem with compression turned off. JPEG2000 uses arithmetic coding for its final compression step. Arithmetic coding achieves (in general) a better compression ratio than gzip-9, lzjb or the others.

There is an open source implementation of JPEG2000 called JasPer[1]. JasPer is a reference implementation for JPEG2000, meaning that other JPEG2000 programs are expected to check their output against JasPer's (kinda).

Saving the JPEG2000 image to an uncompressed ZFS filesystem will be the fastest option. Since JPEG2000 data is already compressed, trying to compress it again will not yield any storage space reduction; in fact, a general-purpose compressor may _increase_ its size. Good compression algorithms produce output that looks like random data, and random data does not compress, so on a compressed filesystem you would spend CPU time on every write for no gain.

[1] http://www.ece.uvic.ca/~mdadams/jasper

On a side note, if you want to know how arithmetic coding works, Wikipedia[2] has a really nice explanation. Suffice it to say that in theory (without considering implementation details) arithmetic coding can encode _any_ data in roughly data_entropy*num_of_symbols + data_symbol_table, with the entropy measured in bits per symbol. In practice this doesn't quite happen because of limited arithmetic precision and some other issues.

[2] http://en.wikipedia.org/wiki/Arithmetic_coding

-- 
Louis-Frédéric Feuillette <jeb...@gmail.com>
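P.S. For anyone who wants to see both points with numbers, here is a rough Python sketch. It is not JPEG2000 and not ZFS: it uses zlib from the standard library as a stand-in for a general-purpose compressor such as gzip, and a made-up skewed byte stream as a stand-in for image data. The function names (make_sample, entropy_bound_bytes, compress_twice) are my own, purely for illustration. It computes the order-0 bound data_entropy*num_of_symbols (leaving out the symbol table), then compresses the sample once and compresses that output a second time.

```python
import math
import random
import zlib
from collections import Counter
from typing import Tuple


def make_sample(n: int = 100_000) -> bytes:
    """Hypothetical stand-in for image data: 4 symbols with skewed odds."""
    rng = random.Random(0)  # fixed seed so runs are repeatable
    return bytes(rng.choices(range(4), weights=[70, 20, 7, 3], k=n))


def entropy_bound_bytes(data: bytes) -> float:
    """Order-0 Shannon bound: bits per symbol * number of symbols, in bytes.

    The symbol-table overhead is deliberately left out.
    """
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    bits_per_symbol = sum(c / n * math.log2(n / c) for c in counts.values())
    return bits_per_symbol * n / 8


def compress_twice(data: bytes) -> Tuple[int, int]:
    """Return (size after one zlib pass, size after a second pass on that)."""
    once = zlib.compress(data, 9)
    twice = zlib.compress(once, 9)
    return len(once), len(twice)


if __name__ == "__main__":
    sample = make_sample()
    first, second = compress_twice(sample)
    print("raw bytes:          ", len(sample))
    print("order-0 bound bytes:", round(entropy_bound_bytes(sample)))
    print("zlib pass 1 bytes:  ", first)
    print("zlib pass 2 bytes:  ", second)
```

On data like this I would expect the first pass to shrink the input a lot and the second pass to gain essentially nothing, which is the same situation as putting JPEG2000 files on a compressed filesystem. Real image data has structure between neighbouring pixels that an order-0 bound ignores, so treat the numbers as an illustration only.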