> On Wed, Nov 07, 2007 at 01:47:04PM -0800, can you guess? wrote:
> > I do consider the RAID-Z design to be somewhat brain-damaged [...]
>
> How so? In my opinion, it seems like a cure for the brain damage of RAID-5.

Nope. A decent RAID-5 hardware implementation has no 'write hole' to worry about, and one can make a software implementation similarly robust with some effort (e.g., by using a transaction log to protect the data-plus-parity double update, or by using COW mechanisms like ZFS's in a more intelligent manner).

The part of RAID-Z that's brain-damaged is its performance for concurrent small-to-medium-sized accesses (at least up to request sizes equal to the largest block size that ZFS supports, and arguably somewhat beyond that). Conventional RAID-5 can satisfy N+1 small-to-medium read accesses or (N+1)/2 small-to-medium write accesses in parallel (though the latter also take an extra rev to complete). RAID-Z can satisfy only one small-to-medium access request at a time (well, plus a smidge for read accesses if it doesn't verify the parity), effectively providing RAID-3-style performance.

The easiest way to fix ZFS's deficiency in this area would probably be to map each group of N blocks in a file as a stripe with its own parity, which would have the added benefit of removing any need to handle parity groups at the disk level. (This would, incidentally, not be a bad idea to use for mirroring as well, if my impression is correct that there's a remnant of LVM-style internal management there.) While this wouldn't allow use of parity RAID for very small files, in most installations they really don't occupy much space compared to that used by large files, so this should not constitute a significant drawback.

- bill

This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
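
For anyone who wants to plug in numbers, here is a rough sketch of the concurrency arithmetic from the post, taking the array to have N+1 disks. The function name and the layout/op strings are made up for illustration, the rounding down for an odd number of disks is an assumption, and the model ignores everything except how many disks a single small access ties up (so no caching, queueing, or the extra rev on RAID-5 writes).

```python
def max_parallel_small_accesses(total_disks: int, layout: str, op: str) -> int:
    """Upper bound on small accesses an array can service at once.

    Simplified model of the argument in the post (hypothetical helper):
      - raid5 read:  one disk per access, so every disk can serve its own.
      - raid5 write: data disk + parity disk per access, so half as many.
      - raidz:       each block is spread over the whole group, so every
                     access ties up all disks and only one proceeds.
    """
    if total_disks < 3:
        raise ValueError("parity RAID needs at least 3 disks")
    if op not in ("read", "write"):
        raise ValueError(f"unknown op: {op!r}")

    if layout == "raid5":
        if op == "read":
            return total_disks
        # Assumption: round down when the disk count is odd.
        return total_disks // 2
    if layout == "raidz":
        return 1
    raise ValueError(f"unknown layout: {layout!r}")


if __name__ == "__main__":
    disks = 5  # N = 4 data disks' worth of capacity, plus parity
    for layout in ("raid5", "raidz"):
        for op in ("read", "write"):
            n = max_parallel_small_accesses(disks, layout, op)
            print(f"{layout:5s} {op:5s}: {n} concurrent small accesses")
```

With 5 disks this gives 5 reads or 2 writes in parallel for RAID-5 against 1 of either for RAID-Z, which is the gap being complained about; the "smidge" for unverified RAID-Z reads is deliberately left out.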