On Thu, 13 Sep 2007, Pawel Jakub Dawidek wrote:

> On Wed, Sep 12, 2007 at 11:20:52PM +0100, Peter Tribble wrote:
>> On 9/10/07, Pawel Jakub Dawidek <[EMAIL PROTECTED]> wrote:
>>> Hi.
>>>
>>> I've a prototype RAID5 implementation for ZFS. It only works in
>>> non-degraded state for now. The idea is to compare RAIDZ vs. RAID5
>>> performance, as I suspected that RAIDZ, because of full-stripe
>>> operations, doesn't work well for random reads issued by many processes
>>> in parallel.
>>>
>>> There is of course write-hole problem, which can be mitigated by running
>>> scrub after a power failure or system crash.
>>
>> If I read your suggestion correctly, your implementation is much
>> more like traditional raid-5, with a read-modify-write cycle?
>>
>> My understanding of the raid-z performance issue is that it requires
>> full-stripe reads in order to validate the checksum. [...]
>
> No, checksum is independent thing, and this is not the reason why RAIDZ
> needs to do full-stripe reads - in non-degraded mode RAIDZ doesn't read
> parity.
>
> This is how RAIDZ fills the disks (follow the numbers):
>
>       Disk0   Disk1   Disk2   Disk3
>
>       D0      D1      D2      P3
>       D4      D5      D6      P7
>       D8      D9      D10     P11
>       D12     D13     D14     P15
>       D16     D17     D18     P19
>       D20     D21     D22     P23
>
> D is data, P is parity.
>
> And RAID5 does this:
>
>       Disk0   Disk1   Disk2   Disk3
>
>       D0      D3      D6      P0,3,6
>       D1      D4      D7      P1,4,7
>       D2      D5      D8      P2,5,8
>       D9      D12     D15     P9,12,15
>       D10     D13     D16     P10,13,16
>       D11     D14     D17     P11,14,17

Surely the above is not accurate? You're showing the parity data only
being written to disk3. In RAID5 the parity is distributed across all
disks in the RAID5 set. What is illustrated above, with a dedicated
parity disk, is RAID3 (or, with block-level striping, RAID4).

> As you can see even small block is stored on all disks in RAIDZ, where
> on RAID5 small block can be stored on one disk only.
>
> --

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
           Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
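
The difference between parity fixed on one disk and parity distributed across the set can be sketched in a few lines of Python. This is only an illustration, not code from ZFS or from the prototype discussed in this thread: the function names are hypothetical, the block numbering is a simplified sequential one rather than the numbering in the quoted tables, and the rotation shown (parity moving one disk to the left per stripe) is just one common convention that is assumed here.

```python
def dedicated_parity_row(stripe, n_disks):
    """One stripe with parity always on the last disk (as drawn in the quoted tables)."""
    data = [f"D{stripe * (n_disks - 1) + i}" for i in range(n_disks - 1)]
    return data + [f"P{stripe}"]


def rotating_parity_row(stripe, n_disks):
    """One stripe with parity moving one disk to the left on each stripe (assumed rotation)."""
    parity_disk = (n_disks - 1) - (stripe % n_disks)
    data = iter(f"D{stripe * (n_disks - 1) + i}" for i in range(n_disks - 1))
    # Place the parity block on its disk; fill the other disks with data in order.
    return [f"P{stripe}" if d == parity_disk else next(data) for d in range(n_disks)]


def parity_disks(row_fn, n_stripes, n_disks):
    """Set of disk indexes that hold parity over the first n_stripes stripes."""
    return {row_fn(s, n_disks).index(f"P{s}") for s in range(n_stripes)}


if __name__ == "__main__":
    for name, fn in (("dedicated", dedicated_parity_row),
                     ("rotating", rotating_parity_row)):
        print(name)
        for s in range(4):
            print("  " + "".join(f"{cell:<6}" for cell in fn(s, 4)))
        print("  parity lives on disks:", sorted(parity_disks(fn, 8, 4)))
```

With four disks, the dedicated layout reports parity on disk 3 only, while the rotating layout reports parity on every disk, which is the distinction raised in the reply.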