Folks' feedback on my spam communications was that I jump from point to point too fast, am too lazy to explain, and am often somewhat misleading. ;-)

On the NetApp thing, please note they had their time talking about how SW RAID can be as good as or better than HW RAID. However, from a customer's point of view, the math is done in reverse, starting from the availability target.

Roughly: for 3-9 (99.9%) availability, the customer has about 8.8 hours of annual downtime, and RAID could help. For 4-9 (99.99%) availability, the customer has about 53 minutes of annual downtime, and RAID alone won't do; H/A clustering may be needed (without clustering, a big iron box, such as ES70000, can do 99.98%, but in our past field studies it was hard to reach 99.99%). For 5-9 (99.999%) availability, the customer has about 5 minutes of annual downtime, and H/A clustering with automated stateful failover is a must.

So, for every additional 9, the customer needs to learn additional pages in the NetApp price book, which I think is the real issue with NetApp (enterprise customers with the checkbooks may have absolutely no idea how RAID checksums would impact their SLO/SLA costs).

I have not done a cost study on ZFS towards the 9999999s, but I guess we can do better with more system- and I/O-based assurance over just RAID checksums, so customers can get to more 99998888s with less redundant hardware and fewer software feature enablement fees.

Also note that the upcoming NetApp ONTAP/GX converged release will hopefully improve the NetApp solution cost structure at some level, but I cannot discuss that until it's officially released [beyond continuing to scream "6920+ZFS"]. ;-)

best,
z

----- Original Message -----
From: "Richard Elling" <richard.ell...@sun.com>
To: "Tim" <t...@tcsac.net>
Cc: <zfs-discuss@opensolaris.org>; "Ulrich Graef" <ulrich.gr...@sun.com>
Sent: Friday, January 02, 2009 2:35 PM
Subject: Re: [zfs-discuss] ZFS vs HardWare raid - data integrity?

> Tim wrote:
>>
>> The Netapp paper mentioned by JZ
>> (http://pages.cs.wisc.edu/~krioukov/ParityLostAndParityRegained-FAST08.ppt)
>> talks about write verify.
>>
>> Would this feature make sense in a ZFS environment? I'm not sure if
>> there is any advantage. It seems quite unlikely, when data is written in
>> a redundant way to two different disks, that both disks lose or
>> misdirect the same writes.
>>
>> Maybe ZFS could have an option to enable instant readback of written
>> blocks, if one wants to be absolutely sure, data is written correctly to
>> disk.
>>
>>
>> Seems to me it would make a LOT of sense in a WORM type system.
>
> Since ZFS only deals with block devices, how would we guarantee
> that the subsequent read was satisfied from the media rather than a
> cache? If the answer is that we just wait long enough for the caches
> to be emptied, then the existing scrub should work, no?
> -- richard
>
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
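
For anyone who wants to check the availability arithmetic in the reply at the top of this message, here is a minimal sketch. It assumes a 365-day year and treats availability as a simple fraction of wall-clock time; the function names are made up for illustration and are not from any NetApp or ZFS tool.

```python
# Annual downtime budget implied by an availability target.
# Assumption: a 365-day year (leap years ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes(availability: float) -> float:
    """Minutes of downtime per year allowed at the given availability (0..1)."""
    return MINUTES_PER_YEAR * (1.0 - availability)


def availability_for_nines(nines: int) -> float:
    """Availability fraction for N nines, e.g. 3 -> 0.999."""
    return 1.0 - 10.0 ** (-nines)


if __name__ == "__main__":
    for n in (3, 4, 5):
        a = availability_for_nines(n)
        m = downtime_minutes(a)
        print(f"{n} nines: {m:.2f} min/year ({m / 60:.2f} hours)")
    # The "big iron box without clustering" figure from the field studies.
    print(f"99.98%: {downtime_minutes(0.9998):.2f} min/year")
```

This works out to roughly 8.8 hours for three nines, 53 minutes for four nines, and a little over 5 minutes for five nines; 99.98% leaves about 105 minutes, twice the four-nines budget.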