>>>>> "sl" == Scott Lawson <scott.law...@manukau.ac.nz> writes: >>>>> "wa" == Wilkinson, Alex <alex.wilkin...@dsto.defence.gov.au> writes: >>>>> "dg" == Dale Ghent <da...@elemental.org> writes: >>>>> "djm" == Darren J Moffat <darr...@opensolaris.org> writes:
    sl> Specifically I am talking of ZFS snapshots, rollbacks,
    sl> cloning, clone promotion, [...]

    sl> Of course to take maximum advantage of ZFS in full, then as
    sl> everyone has mentioned it is a good idea to let ZFS manage the
    sl> underlying raw disks if possible.

Okay, but these two feature groups are completely orthogonal.  You can
get the ZFS revision tree that helped you so much, and all the other
features you mentioned, with a single-LUN zpool.

    wa> So, shall I forget ZFS and use UFS ?

Naturally you will find mostly people here who have chosen to use ZFS,
so I think you will have to decide on your own rather than taking a
poll of the ZFS list.

Myself, I use ZFS.  I would probably use it on a single-LUN SAN pool,
but only if I had a backup onto a second zpool, and only if I could do
a restore/cutover really quickly should the primary zpool become
corrupt.  Some people have zpools that take days to restore, and in
that case I would not do it---I'd want direct-attached storage,
restore-by-cutover, or at the very least zpool-level redundancy.

I'm using ZFS on a SAN right now, but my SAN is just Linux iSCSI
targets, and it exports many JBOD LUNs with zpool-level redundancy, so
I'm less exposed to the single-LUN lost-pool problem than you'd be with
a single-LUN EMC.  And I have a full backup onto another zpool, on a
machine capable enough to assume the role of the master, albeit not
automatically.

For a lighter filesystem I'm looking forward to the liberation of QFS,
too.  And in the future I think Solaris plans to offer redundancy
options above the filesystem level, like pNFS and Lustre, which may end
up being the ultimate win because of the way they can move the storage
mesh onto a big network switch, rather than what we have with ZFS,
where it's a couple of bonded gigabit ethernet cards and a single PCIe
backplane.  Not all of ZFS's features will remain useful in such a
world.

However, I don't think there is ANY situation in which you should run
UFS over a zvol (which is one of the things you mentioned).  That's
only interesting for debugging or performance comparison (meaning it
should always perform worse, or else there's a bug).  If you read the
replies you got more carefully, you'll find doing that addresses none
of the concerns people raised.

    dg> Not at all. Just export lots of LUNs from your EMC to get the
    dg> IO scheduling win, not one giant one, and configure the zpool
    dg> as a stripe.

I've never heard of using multiple-LUN stripes for storage QoS before.
Have you actually measured an improvement in this configuration over a
single LUN?  If so, that's interesting.  But it's important to
understand there's no difference between a multiple-LUN stripe and a
single big LUN w.r.t. reliability, as far as we know to date.

The advice I've seen here to use multiple LUNs over SAN vendor storage
has, until now, been not for QoS but for one of two reasons:

 * availability: a zpool mirror of LUNs on physically distant, or at
   least separate, storage vendor gear (roughly the sketch below).

 * avoiding the lost-zpool problem when there are SAN reboots or
   storage fabric disruptions without a host reboot.
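To make the first bullet concrete, a minimal sketch.  The pool and
device names are placeholders, not anything from your setup; the point
is just one LUN from each array, mirrored at the zpool level:

    # one LUN exported by each of two separate arrays (names made up)
    zpool create tank mirror c4t0d0 c6t0d0
    zpool status tank        # both halves of the mirror should show ONLINE

    # for contrast, the single-LUN pool everyone is worried about,
    # which has no zpool-level redundancy at all:
    #   zpool create tank c4t0d0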
    djm> Not if you want ZFS to actually be able to recover from
    djm> checksum detected failures.

While we agree that recovering from checksum failures is an advantage
of zpool-level redundancy, I don't think it predominates among the
actual failures observed by people using SANs.  The lost-my-whole-zpool
failure mode predominates, and in the two or three cases where it was
examined closely enough to recover the zpool, it didn't look like a
checksum problem.  It looked like either ZFS bugs or lost writes, or
one leading to the other.  Zpool-level redundancy may happen to make
this failure mode much less common, but it won't eliminate it,
especially since we still haven't tracked down the root cause.

We also need to point out that there *is* an availability advantage to
letting the SAN manage a layer of redundancy, because SANs are, so far,
much better than ZFS at dealing with failing disks without crashing or
slowing down.

I've never heard of anyone actually exporting JBOD from EMC yet.  Is
someone actually doing this?  So far I've only heard of people burning
huge $$$$$$ of disk by exporting two RAID LUNs from the SAN and then
mirroring them with zpool.

    djm> If ZFS is just given 1 or more LUNs in a stripe then it is
    djm> unlikely to be able to recover from data corruption, it might
    djm> be able to recover metadata because it is always stored with
    djm> at least copies=2 but that is best efforts.

Okay, fine, nice feature (the data-side version of that knob is
sketched at the bottom of this message).  But this failure is not
actually happening, based on reports to the list.  It's redundancy in
space, while the reports we've seen from SANs show what's really needed
is redundancy in time, if that's even possible.
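And for completeness, since djm brought up copies: the metadata ditto
blocks are automatic, but if you want the same trick for file data on a
single-LUN pool, it's one property (the dataset name is made up, and it
only affects data written after you set it):

    # redundancy in space on a single LUN: two copies of each new data block.
    # it will not bring back a pool that the fabric or the SAN ate.
    zfs set copies=2 tank/home
    zfs get copies tank/home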