We are using ZFS compression across 5 zpools, about 45 TB of data on iSCSI storage. I/O is very fast, with small fractional CPU usage (seat-of-the-pants metrics here, sorry).

We have one other large 10 TB volume for nearline Networker backups, and that one isn't compressed. We already compress that data on the backup client, so there wasn't any more compression to be had on the zpool, and it isn't worth it there. There's no doubt that heavier-weight compression would be a problem, as you say.

One thing that would be ultra cool on the backup pool would be post-write compression. After backups are done, the backup server sits more or less idle. It would be cool to have a compress-on-scrub operation that could apply some really high-level compression. Then we could zfs send | ssh remote | zfs receive to an off-site location with far less network bandwidth, not to mention the remote storage could be really small.

Data Domain (www.datadomain.com) does block-level checksumming to store files as linked lists of common blocks. They get very high compression ratios (about 6:1 in our tests, but more like 20:1 with more frequent full backups). Then off-site transfers go that much faster.
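For what it's worth, the off-site replication step could be sketched something like this (a minimal sketch only; the pool, snapshot, and host names are made up, and it assumes ssh keys are already in place). Since the send stream carries the data uncompressed, squeezing it in transit recovers the bandwidth savings:

```shell
# Hypothetical names throughout (tank/backups, vault/backups, backup-remote).
# Compress the replication stream on the wire with gzip, then
# decompress and receive on the far side.
zfs send tank/backups@weekly \
  | gzip -c \
  | ssh backup-remote "gunzip -c | zfs receive -F vault/backups"
```

Heavier compressors (bzip2 etc.) could be dropped into the same pipe if CPU on the idle backup server is cheaper than network bandwidth.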

Jon

Dave Johnson wrote:
From: "Robert Milkowski" <[EMAIL PROTECTED]>
LDAP servers with several dozen millions accounts?
Why? First you get about 2:1 compression ratio with lzjb, and you also
get better performance.

a busy ldap server certainly seems a good fit for compression, but when i said "large" i meant large as in bytes and numbers of files :)

seriously, is anyone out there using zfs for large "storage" servers? you know, the same usage that 90% of the storage sold in the world is used for ? (yes, i pulled that figure out of my *ss ;)

are my concerns invalid with the current implementation of zfs with compression ? is the compression so lightweight that it can be decompressed as fast as the disks can stream uncompressed backup data to tape while the server is still servicing clients ? the days of "nightly" backups seem long gone in the space I've been working in the last several years... backups run almost 'round the clock it seems on our biggest systems (15-30Tb and 150-300mil files , which may be small by the standard of others of you out there.)

what really got my eyes rolling about c9n and prompted my question was all this talk about gzip compression and other even heavier-weight compression algorithms. lzjb is relatively lightweight, but i could still see it being a bottleneck in a 'weekly full backups' scenario unless you had a very new system with kilowatts of cpu to spare. gzip ? pulease. bzip and lzma ? someone has *got* to be joking. i see these as ideal candidates for AVS scenarios where the application never requires full dumps to tape, but on a typical storage server ? the compression would be ideal but would also make it impossible to backup in any reasonable "window".

back to my postulation, if it is correct, what about some NDMP interface to ZFS ? it seems a more than natural candidate. in this scenario, compression would be a boon since the blocks would already be in a compressed state. I'd imagine this fitting into the 'zfs send' codebase somewhere.

thoughts (on either c9n and/or 'zfs send ndmp') ?

-=dave

----- Original Message ----- From: "Robert Milkowski" <[EMAIL PROTECTED]>
To: "Dave Johnson" <[EMAIL PROTECTED]>
Cc: "roland" <[EMAIL PROTECTED]>; <zfs-discuss@opensolaris.org>
Sent: Wednesday, October 17, 2007 2:35 AM
Subject: Re[2]: [zfs-discuss] HAMMER


Hello Dave,

Tuesday, October 16, 2007, 9:17:30 PM, you wrote:

DJ> you mean c9n ? ;)

DJ> does anyone actually *use* compression ? i'd like to see a poll on how many
DJ> people are using (or would use) compression on production systems that are
DJ> larger than your little department catch-all dumping ground server. i mean,
DJ> unless you had some NDMP interface directly to ZFS, daily tape backups for
DJ> any large system will likely be an exercise in futility unless the systems
DJ> are largely just archive servers, at which point it's probably smarter to
DJ> perform backups less often, coinciding with the workflow of migrating
DJ> archive data to it. otherwise wouldn't the system just plain get pounded?

LDAP servers with several dozen millions accounts?
Why? First you get about 2:1 compression ratio with lzjb, and you also
get better performance.


--
Best regards,
Robert Milkowski                            mailto:[EMAIL PROTECTED]
                                      http://milek.blogspot.com



_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

--


-     _____/     _____/      /           - Jonathan Loran -           -
-    /          /           /                IT Manager               -
-  _____  /   _____  /     /     Space Sciences Laboratory, UC Berkeley
-        /          /     /      (510) 643-5146 [EMAIL PROTECTED]
- ______/    ______/    ______/           AST:7731^29u18e3

