We are using zfs compression across 5 zpools, about 45TB of data on
iSCSI storage. I/O is very fast, with small fractional CPU usage (seat
of the pants metrics here, sorry). We have one other large 10TB volume
for nearline Networker backups, and that one isn't compressed. We
already compress these data on the backup client, and there wasn't any
more compression to be had on the zpool, so it isn't worth it there.
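For anyone wanting to sanity-check the same decision on their own pools, the achieved ratio can be read back per dataset. This is a sketch with made-up pool/dataset names; a compressratio near 1.00x (as on already client-compressed backup data) means the pool-side compression isn't buying anything:

```shell
# Hypothetical dataset names -- substitute your own.
# A ratio near 1.00x means compression isn't helping on this dataset.
zfs get compression,compressratio backup/nearline

# Compare against one of the compressed pools:
zfs get compression,compressratio tank1
```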
There's no doubt that heavier weight compression would be a problem as
you say. One thing that would be ultra cool on the backup pool would be
to have post-write compression. After backups are done, the backup
server sits more or less idle. It would be cool to have a
compress-on-scrub operation that could apply some really heavy
compression. Then we could zfs send | ssh remote | zfs receive to an
off-site location with far less network bandwidth, not to mention the
remote storage could be really small. Data Domain (www.datadomain.com)
does block-level checksumming to store files as linked lists of common
blocks. They get very high compression ratios (in our tests about 6:1,
but with more frequent full backups, more like 20:1). Then off-site
transfers go that much faster.
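A rough sketch of what that off-site step could look like today, compressing on the wire instead of post-write. Pool, dataset, snapshot, and host names below are placeholders, not our actual setup:

```shell
# Placeholder names throughout (tank/backup, backup-remote).
# gzip on the wire stands in for the post-write heavy compression
# wished for above; the remote end stores the stream uncompressed.
SNAP="tank/backup@offsite-$(date +%Y%m%d)"
zfs snapshot "$SNAP"
zfs send "$SNAP" \
  | gzip -9 \
  | ssh backup-remote 'gunzip | zfs receive -F tank/backup'
```

Incremental sends (zfs send -i) would cut the transfer further once an initial full stream is in place.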
Jon
Dave Johnson wrote:
From: "Robert Milkowski" <[EMAIL PROTECTED]>
LDAP servers with several dozen millions accounts?
Why? First you get about 2:1 compression ratio with lzjb, and you also
get better performance.
a busy LDAP server certainly seems a good fit for compression, but when I
said "large" I meant as in bytes and numbers of files :)
seriously, is anyone out there using ZFS for large "storage" servers? you
know, the same usage that 90% of the storage sold in the world is used for?
(yes, I pulled that figure out of my *ss ;)
are my concerns invalid with the current implementation of ZFS with
compression? is the compression so lightweight that it can be decompressed
as fast as the disks can stream uncompressed backup data to tape while the
server is still servicing clients? the days of "nightly" backups seem long
gone in the space I've been working in the last several years... backups run
almost 'round the clock it seems on our biggest systems (15-30TB and
150-300 million files, which may be small by the standards of others of you
out there.)
what really got my eyes rolling about c9n and prompted my question was all
this talk about gzip compression and other even heavier-weight compression
algorithms. lzjb is relatively lightweight, but I could still see it being a
bottleneck in a 'weekly full backups' scenario unless you had a very new
system with kilowatts of CPU to spare. gzip? puh-lease. bzip and lzma?
someone has *got* to be joking. I see these as ideal candidates for AVS
scenarios where the application never requires full dumps to tape, but on a
typical storage server? the compression would be ideal but would also make
it impossible to back up in any reasonable "window".
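The weight-class gap is easy to demonstrate outside ZFS. This is an illustrative sketch using Python's stdlib codecs; lzjb itself has no stdlib binding, so zlib at level 1 is a rough stand-in for "lightweight", with bz2 and lzma as the heavyweights being debated:

```python
# Compare compression ratio vs. CPU cost across weight classes,
# using Python stdlib codecs as stand-ins (zlib ~ gzip family,
# bz2 ~ bzip, lzma). lzjb itself is not available in the stdlib.
import bz2
import lzma
import time
import zlib

# Compressible sample data, loosely imitating log/backup content.
data = b"2007-10-17 02:35:01 backup client host42 OK\n" * 20000

def measure(name, fn):
    """Compress `data` with fn, report ratio and wall time."""
    t0 = time.perf_counter()
    out = fn()
    dt = time.perf_counter() - t0
    ratio = len(data) / len(out)
    print(f"{name:8s} ratio {ratio:8.1f}:1  time {dt * 1000:8.2f} ms")
    return ratio, dt

results = {
    "zlib-1": measure("zlib-1", lambda: zlib.compress(data, 1)),
    "zlib-9": measure("zlib-9", lambda: zlib.compress(data, 9)),
    "bz2-9":  measure("bz2-9",  lambda: bz2.compress(data, 9)),
    "lzma":   measure("lzma",   lambda: lzma.compress(data)),
}
```

On real mixed data the ratios are far lower than on this repetitive sample, but the shape of the trade-off holds: the heavier codecs buy ratio with CPU time, which is exactly the backup-window concern above.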
back to my postulation, if it is correct, what about some NDMP interface to
ZFS ? it seems a more than natural candidate. in this scenario,
compression would be a boon since the blocks would already be in a
compressed state. I'd imagine this fitting into the 'zfs send' codebase
somewhere.
thoughts (on either c9n and/or 'zfs send ndmp') ?
-=dave
----- Original Message -----
From: "Robert Milkowski" <[EMAIL PROTECTED]>
To: "Dave Johnson" <[EMAIL PROTECTED]>
Cc: "roland" <[EMAIL PROTECTED]>; <zfs-discuss@opensolaris.org>
Sent: Wednesday, October 17, 2007 2:35 AM
Subject: Re[2]: [zfs-discuss] HAMMER
Hello Dave,
Tuesday, October 16, 2007, 9:17:30 PM, you wrote:
DJ> you mean c9n ? ;)
DJ> does anyone actually *use* compression ? i'd like to see a poll on
DJ> how many people are using (or would use) compression on production
DJ> systems that are larger than your little department catch-all dumping
DJ> ground server. i mean, unless you had some NDMP interface directly to
DJ> ZFS, daily tape backups for any large system will likely be an
DJ> exercise in futility unless the systems are largely just archive
DJ> servers, at which point it's probably smarter to perform backups less
DJ> often, coinciding with the workflow of migrating archive data to it.
DJ> otherwise wouldn't the system just plain get pounded?
LDAP servers with several dozen millions accounts?
Why? First you get about 2:1 compression ratio with lzjb, and you also
get better performance.
--
Best regards,
Robert Milkowski mailto:[EMAIL PROTECTED]
http://milek.blogspot.com
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
--
- _____/ _____/ / - Jonathan Loran - -
- / / / IT Manager -
- _____ / _____ / / Space Sciences Laboratory, UC Berkeley
- / / / (510) 643-5146 [EMAIL PROTECTED]
- ______/ ______/ ______/ AST:7731^29u18e3