On 12/7/06, Andrew Miller <[EMAIL PROTECTED]> wrote:
Quick question about the interaction of ZFS filesystem compression and the
filesystem cache. We have an OpenSolaris (actually Nexenta alpha-6) box
running RRD collection. These files seem to be quite compressible. A test
filesystem containing about 3,000 of these files shows a compressratio of 12.5x.

Be careful here.  If you are using files that have no data in them yet
you will get much better compression than later in life.  Judging by
the fact that you got only 12.5x, I suspect that your files are at
least partially populated.  Expect the compression to get worse over
time.

Looking at some RRD files that come from very active servers (i.e.
the numbers vary frequently), with data filling about 2/3 of the
configured time periods, I see the following ratios:

1.8 mpstat.rrd
1.8 vmstat.rrd
1.9 exacct_PROJECT_user.oracle.rrd
2.0 net-ce2.rrd
2.1 iostat-c14.rrd
2.1 iostat-c15.rrd
2.1 iostat-c16.rrd
. . .
7.6 net-ce912005.rrd
7.7 net-ce912016.rrd
9.1 exacct_PROJECT_user.gemsadm.rrd
12.2 exacct_PROJECT_exacct_interval.rrd
18.1 exacct_PROJECT_user.patrol.rrd
18.1 exacct_PROJECT_user.precise.rrd
18.1 exacct_PROJECT_user.precise6.rrd
31.8 net-ce8.rrd
39.6 net-eri3.rrd
45.1 net-eri2.rrd

The first column is the compression ratio.  The net-eri{2,3} files are
almost empty.
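
To illustrate why nearly empty files show such high ratios, here is a minimal Python sketch. It is only an approximation under stated assumptions: zlib stands in for whatever compressor ZFS is actually using, the 128K block size is an assumed record size, and the buffers are synthetic rather than real RRD data, so the absolute numbers will not match compressratio. The function name estimated_ratio is made up for this example.

```python
import os
import zlib

BLOCK_SIZE = 128 * 1024  # assumed record size for this sketch


def estimated_ratio(data: bytes, block_size: int = BLOCK_SIZE) -> float:
    """Compress data block by block; return uncompressed size / compressed size."""
    if not data:
        return 1.0
    compressed = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # Simplification: a block that does not shrink is counted at its raw size.
        compressed += min(len(block), len(zlib.compress(block)))
    return len(data) / compressed


one_mib = 1024 * 1024
empty_like = bytes(one_mib)            # all zeros, like an unpopulated file
populated_like = os.urandom(one_mib)   # worst case: incompressible data
half_full = empty_like[:one_mib // 2] + populated_like[:one_mib // 2]

for label, buf in [("empty", empty_like),
                   ("half full", half_full),
                   ("full", populated_like)]:
    print(f"{estimated_ratio(buf):8.1f} {label}")
```

The half-full buffer lands just under 2x no matter how well the empty half compresses, which is the same effect as a file's ratio dropping while its time periods fill up.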


My question is about how the filesystem cache works with compressed files.  
Does the fscache keep a copy of the compressed data, or the uncompressed 
blocks?   To update one of these RRD files, I believe the whole contents are 
read into memory, modified, and then written back out.   If the filesystem 
cache maintained a copy of the compressed data, a lot more, maybe more than 10x 
more, of these files could be maintained in the cache.  That would mean we 
could have a lot more data files without ever needing to do a physical read.

Here is an insert of a value:

25450:  open("/opt/perfstat/rrd/somehost/iostat-c4.rrd", O_RDWR) = 3
25450:  fstat64(3, 0xFFBFF5E0)                          = 0
25450:  fstat64(3, 0xFFBFF640)                          = 0
25450:  fstat64(3, 0xFFBFF4E8)                          = 0
25450:  ioctl(3, TCGETA, 0xFFBFF5CC)                    Err#25 ENOTTY
25450:  read(3, " R R D\0 0 0 0 1\0\0\0\0".., 8192)     = 8192
25450:  llseek(3, 0, SEEK_CUR)                          = 8192
25450:  lseek(3, 0xFFFFFC68, SEEK_CUR)                  = 7272
25450:  fcntl(3, F_SETLK, 0xFFBFF7D0)                   = 0
25450:  llseek(3, 0, SEEK_CUR)                          = 7272
25450:  lseek(3, 2230952, SEEK_SET)                     = 2230952
25450:  write(3, " @ x S = pA3D7\v ?E6 f f".., 64)      = 64
25450:  lseek(3, 1864, SEEK_SET)                        = 1864
25450:  write(3, " E xA0 # U N K N\0\0\0\0".., 5408)    = 5408
25450:  close(3)                                        = 0

Notice that it does the following:

Open the file
Read the first 8K
Seek to a particular spot
Take a lock
Seek
Write 64 bytes
Seek
Write 5408 bytes
Close
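
The same sequence can be sketched in Python against a scratch file. This is not rrdtool's code, only an imitation of the system calls in the trace. The function name update_in_place, the 4 MiB sparse scratch file, and the payload bytes are all invented for the example; the offsets and lengths are copied from the trace. Note that the second write covers offsets 1864 through 7272, which lies inside the 8K that was read first.

```python
import fcntl
import os
import tempfile

HEADER_READ = 8192  # size of the initial read() in the trace


def update_in_place(path, data_offset, sample, header_offset, header):
    """Imitate the traced pattern; return (bytes_read, bytes_written)."""
    fd = os.open(path, os.O_RDWR)
    try:
        bytes_read = len(os.read(fd, HEADER_READ))
        # Non-blocking exclusive lock, comparable to fcntl(fd, F_SETLK, ...).
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        os.lseek(fd, data_offset, os.SEEK_SET)
        bytes_written = os.write(fd, sample)       # the small 64-byte write
        os.lseek(fd, header_offset, os.SEEK_SET)
        bytes_written += os.write(fd, header)      # the 5408-byte write
        return bytes_read, bytes_written
    finally:
        os.close(fd)  # closing the descriptor also drops the lock


# Hypothetical stand-in for an RRD file: a 4 MiB sparse scratch file.
with tempfile.NamedTemporaryFile(delete=False) as scratch:
    scratch.truncate(4 * 1024 * 1024)
    scratch_path = scratch.name

stats = update_in_place(scratch_path, 2230952, b"\x01" * 64, 1864, b"\x02" * 5408)
size_after = os.path.getsize(scratch_path)
os.unlink(scratch_path)
print(stats, size_after)
```

The file size does not change, since both writes land inside the existing file.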

The RRD file in question is 8.6 MB.  The update read 8 KB and wrote
5472 bytes.  This is one of the big wins of the current binary
RRD format over the original ASCII version that came with MRTG.
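
For a sense of scale, the arithmetic behind those numbers is below. It assumes "8.6 MB" means 8.6 * 2**20 bytes; using 10**6 changes the result only slightly.

```python
FILE_SIZE = 8.6 * 1024 * 1024   # assumption: MB taken as 2**20 bytes

bytes_read = 8192               # the single read() in the trace
bytes_written = 64 + 5408       # the two write() calls in the trace
touched = bytes_read + bytes_written

fraction = touched / FILE_SIZE
print(f"{touched} bytes of I/O, {fraction:.2%} of the file size")
```

So one update moves well under one percent of the file's size through read() and write().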

Mike

--
Mike Gerdts
http://mgerdts.blogspot.com/
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
