Dennis is correct, in that compressibility is inversely related to
randomness.

And he's also right that binaries share a lot of common symbols and headers.

All of which makes for excellent DEDUP, but not necessarily really good
compression - since what compression needs is duplication /inside/ each
file, while dedup only needs identical blocks /across/ files.
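
To make the distinction concrete, here's a quick sketch with made-up
pool/dataset names (not my actual test setup).  Random data doesn't
compress at all, but an identical second copy of it dedups perfectly,
because dedup only has to match whole blocks across files:

$ zfs create -o compression=on -o dedup=on tank/demo
$ dd if=/dev/urandom of=/tank/demo/a.bin bs=8192 count=8192
$ cp /tank/demo/a.bin /tank/demo/b.bin
$ zfs get compressratio tank/demo    # expect ~1.00x
$ zpool get dedupratio tank          # expect ~2.00x (pool-wide)

Flip it around (one file full of repeated text, no copies) and you get
the opposite result: a big compressratio and a dedupratio near 1.00x.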



However, VM images aren't all binaries: typical OS images have text
files and pre-compressed stuff all over the place (man pages, Windows
.cab files, text config files, etc.) in addition to the standard binary
executables.  So, let's see what a typical example brings us...




As a sample, I just compressed a basic installation of Windows 2000
(C:\WINNT).

Here's what I found:

Using a standard LZ77/LZSS-based encoder (DEFLATE, i.e. the algorithm
used in gzip, ZIP, PDF, etc.), I get about 40% compression (i.e. a
~1.6x ratio):  1.8GB -> 858MB
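
(For anyone who wants to reproduce that kind of whole-tree number, the
following is one way to do it; the path is a placeholder, and note that
gzip over a tar stream can also pick up a little cross-file redundancy
within its 32KB window, so it's only an approximation of per-file
compression:)

$ tar cf - WINNT | wc -c             # raw size of the tree
$ tar cf - WINNT | gzip -9 | wc -c   # DEFLATE-compressed size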

Using ZFS filesystem compression on the same data, I get a 1.58x
compression ratio:  2.2GB -> 1.4GB


A ZFS filesystem with dedup on, using the same WINNT data, produces a
1.86x dedup factor.  (2.15GB "used", for 1.17GB allocated)


Finally, a ZFS filesystem with compression & dedup turned on:
compressratio = 1.66x, dedupratio = 1.77x, 1.27GB data stored, 741MB
allocated (remember, this is for a 2.2GB data set, so about a 3.0x
total reduction)
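
(For reference, the ZFS side of this needs nothing more exotic than the
per-dataset properties.  Roughly, with hypothetical pool/dataset names,
and keeping in mind that dedupratio is reported pool-wide, so it's
cleanest to test one configuration at a time:)

$ zfs create -o compression=on tank/comp
$ zfs create -o dedup=on tank/dedup
$ zfs create -o compression=on -o dedup=on tank/both
  ... copy the WINNT data into the dataset under test ...
$ zfs get compressratio,used tank/comp
$ zpool get dedupratio tank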



I also just did a quick rsync of the / partition of an Ubuntu 10.04 x64
system onto a compressed ZFS dataset - I'm getting about a 1.5x
compressratio for that data (i.e. roughly 33% compression):  6.3GB -> 4.2GB



So, yes, I should have been more exact, in that system binaries /DO/
compress. However, they're not especially compressible, and in a
comparison between a filesystem with compression and one with dedup,
dedup wins hands down, even if I'm not storing lots of identical images.
I'm still not convinced that compression is really worthwhile for OS
images, but I'm open to talk...   :-)


Also, here's the time output for each of the unzip operations (i.e.
unpacking the test data into each of the four dataset configurations;
I'm doing this in a VirtualBox instance, so it's a bit slow):

(no dedup, no compression)
real    1m41.865s
user    0m23.270s
sys     0m26.400s

(no dedup, compression)
real    1m40.465s
user    0m23.088s
sys     0m25.190s

(no compression, dedup)
real    2m1.400s
user    0m23.162s
sys     0m27.639s

(dedup & compression)
real    1m51.122s
user    0m23.294s
sys     0m26.953s
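
(Each run is just the same unpack into the dataset with the
corresponding properties set; something along the lines of the
following, with a made-up archive name:)

$ time unzip -q winnt.zip -d /tank/both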




I'll also be a bit more careful with the broad generalizations. :-)


-Erik






On Fri, 2010-05-07 at 11:23 -0400, Dennis Clarke wrote:
> > On 06/05/2010 21:07, Erik Trimble wrote:
> >> VM images contain large quantities of executable files, most of which
> >> compress poorly, if at all.
> >
> > What data are you basing that generalization on ?
> 
> note : I can't believe someone said that.
> 
> warning : I just detected a fast rise time on my pedantic input line and I
> am in full geek herd mode :
> http://www.blastwave.org/dclarke/blog/?q=node/160
> 
> The degree to which a file can be compressed is often related to the
> degree of randomness or "entropy" in the bit sequences in that file. We
> tend to look at files in chunks of bits called "bytes" or "words" or
> "blocks" of some given length, but the harsh reality is that it is just a
> sequence of one and zero values and nothing more. However, I can spot
> blocks or patterns in there and then create tokens that represent
> repeating blocks. If you want a really random file that you are certain
> has nearly perfect high entropy, then just get a coin and flip it 1024
> times while recording the heads and tails results. Then input that data
> into a file as a sequence of one and zero bits and you have a very neatly
> random chunk of data.
> 
> Good luck trying to compress that thing.
> 
> Pardon me .. here it comes. I spent waay too many years in labs doing work
> with RNG hardware and software to just look the other way. And I'm in a
> good mood.
> 
> Suppose that C is some discrete random variable. That means that C can
> have well defined values like HEAD or TAIL. You usually have a bunch ( n
> of them ) of possible values x1, x2, x3, ..., xn that C can be. Each of
> those shows up in the data set with specific probabilities p1, p2, p3,
> ..., pn where the sum of those adds up to exactly one. This means that x1
> will appear in the dataset with an "expected" probability of p1. All of
> those probabilities are expressed as a value between 0 and 1. A value of 1
> means "certainty". Okay, so in the case of a coin ( not the one in Batman:
> The Dark Knight ) you have x1=TAIL and x2=HEAD with ( we hope ) p1=0.5=p2
> such that p1+p2 = 1 exactly unless the coin lands on its edge and the
> universe collapses due to entropy implosion. That is a joke. I used to
> teach this as a TA in university so bear with me.
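> 
> ( For completeness: the number that puts a hard floor under lossless
> compression is the Shannon entropy of that distribution,
> H(C) = -( p1*log2(p1) + p2*log2(p2) + ... + pn*log2(pn) )
> bits per symbol, and it is maximized when all the pi are equal, which
> is exactly the fair-coin case. )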
> 
> So go flip a coin a few thousand times and you will get fairly random
> data. That is a Random Number Generator that you have, and it's always
> kicking around your lab or in your pocket or on the street. Pretty cheap,
> but the baud rate is hellish low.
> 
> If you get tired of flipping bits using a coin then you may have to just
> give up on that ( or buy a radioactive source where you can monitor
> particles emitted as it decays for input data ) OR be really cheap and
> look at /dev/urandom on a decent Solaris machine :
> 
> $ ls -lap /dev/urandom
> lrwxrwxrwx   1 root     root          34 Jul  3  2008 /dev/urandom ->
> ../devices/pseudo/ran...@0:urandom
> 
> That thing right there is a pseudo-random number generator. It will make
> for really random data, but there is no promise that over a given number
> of bits the observed p1 + p2 will sum to precisely 1.  It will be really
> close, however, to a very random ( high entropy ) data source.
> 
> Need 1024 bits of random data ?
> 
> $ /usr/xpg4/bin/od -Ax -N 128 -t x1 /dev/urandom
> 0000000 ef c6 2b ba 29 eb dd ec 6d 73 36 06 58 33 c8 be
> 0000010 53 fa 90 a2 a2 70 25 5f 67 1b c3 72 4f 26 c6 54
> 0000020 e9 83 44 c6 b9 45 3f 88 25 0c 4d c7 bc d5 77 58
> 0000030 d3 94 8e 4e e1 dd 71 02 dc c2 d0 19 f6 f4 5c 44
> 0000040 ff 84 56 9f 29 2a e5 00 33 d2 10 a4 d2 8a 13 56
> 0000050 d1 ac 86 46 4d 1e 2f 10 d9 0b 33 d7 c2 d4 ef df
> 0000060 d9 a2 0b 7f 24 05 72 39 2d a6 75 25 01 bd 41 6c
> 0000070 eb d9 4f 23 d9 ee 05 67 61 7c 8a 3d 5f 3a 76 e3
> 0000080
> 
> There ya go. That was faster than flipping a coin eh? ( my Canadian bit
> just flipped )
> 
> So you were saying ( or someone somewhere had the crazy idea ) that ZFS
> with dedupe and compression enabled won't really be of great benefit
> because of all the binary files in the filesystem. Well, that's just nuts.
> Sorry, but it is. Those binary files are made up of ELF headers and
> opcodes drawn from a specific set for a given architecture, and that means
> the input set C consists of a "discrete set of possible values" and NOT
> pure random high entropy data.
> 
> Want a demo ?
> 
> Here :
> 
> (1) take a nice big lib
> 
> $ uname -a
> SunOS aequitas 5.11 snv_138 i86pc i386 i86pc
> $ ls -lap /usr/lib | awk '{ print $5 " " $9 }' | sort -n | tail
> 4784548 libwx_gtk2u_core-2.8.so.0.6.0
> 4907156 libgtkmm-2.4.so.1.1.0
> 6403701 llib-lX11.ln
> 8939956 libicudata.so.2
> 9031420 libgs.so.8.64
> 9300228 libCg.so
> 9916268 libicudata.so.3
> 14046812 libicudata.so.40.1
> 21747700 libmlib.so.2
> 40736972 libwireshark.so.0.0.1
> 
> $ cp /usr/lib/libwireshark.so.0.0.1 /tmp
> 
> $ ls -l /tmp/libwireshark.so.0.0.1
> -r-xr-xr-x   1 dclarke  csw      40736972 May  7 14:20
> /tmp/libwireshark.so.0.0.1
> 
> What is the SHA256 hash for that file ?
> 
> $ cd /tmp
> 
> Now compress it with gzip ( a good test case ) :
> 
> $ /opt/csw/bin/gzip -9v libwireshark.so.0.0.1
> libwireshark.so.0.0.1:   76.1% -- replaced with libwireshark.so.0.0.1.gz
> 
> $ ls -l libwireshark.so.0.0.1.gz
> -r-xr-xr-x   1 dclarke  csw      9754053 May  7 14:20
> libwireshark.so.0.0.1.gz
> 
> $ bc
> scale=9
> 9754053/40736972
> 0.239439814
> 
> I see compression there.
> 
> Let's see what happens with really random data :
> 
> $ dd if=/dev/urandom of=/tmp/foo.dat bs=8192 count=8192
> 8192+0 records in
> 8192+0 records out
> $ ls -l /tmp/foo.dat
> -rw-r--r--   1 dclarke  csw      67108864 May  7 15:21 /tmp/foo.dat
> 
> $ ls -l /tmp/foo.dat.gz
> -rw-r--r--   1 dclarke  csw      67119130 May  7 15:21 /tmp/foo.dat.gz
> 
> QED.
> 
> 
> 
> 
> 
> 

-- 
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

