On Thu, 13 Sep 2007, Pawel Jakub Dawidek wrote:

> On Wed, Sep 12, 2007 at 11:20:52PM +0100, Peter Tribble wrote:
>> On 9/10/07, Pawel Jakub Dawidek <[EMAIL PROTECTED]> wrote:
>>> Hi.
>>>
>>> I've a prototype RAID5 implementation for ZFS. It only works in
>>> non-degraded state for now. The idea is to compare RAIDZ vs. RAID5
>>> performance, as I suspected that RAIDZ, because of full-stripe
>>> operations, doesn't work well for random reads issued by many processes
>>> in parallel.
>>>
>>> There is of course the write-hole problem, which can be mitigated by
>>> running a scrub after a power failure or system crash.
>>
>> If I read your suggestion correctly, your implementation is much
>> more like traditional raid-5, with a read-modify-write cycle?
>>
>> My understanding of the raid-z performance issue is that it requires
>> full-stripe reads in order to validate the checksum. [...]
>
> No, the checksum is an independent thing, and it is not the reason why
> RAIDZ needs to do full-stripe reads - in non-degraded mode RAIDZ doesn't
> read parity.
>
> This is how RAIDZ fills the disks (follow the numbers):
>
>       Disk0   Disk1   Disk2   Disk3
>
>       D0      D1      D2      P3
>       D4      D5      D6      P7
>       D8      D9      D10     P11
>       D12     D13     D14     P15
>       D16     D17     D18     P19
>       D20     D21     D22     P23
>
> D is data, P is parity.
>
> And RAID5 does this:
>
>       Disk0   Disk1   Disk2   Disk3
>
>       D0      D3      D6      P0,3,6
>       D1      D4      D7      P1,4,7
>       D2      D5      D8      P2,5,8
>       D9      D12     D15     P9,12,15
>       D10     D13     D16     P10,13,16
>       D11     D14     D17     P11,14,17

Surely the above is not accurate?  You're showing the parity data being
written only to disk3.  In RAID5 the parity is distributed across all
disks in the RAID5 set.  What is illustrated above is a dedicated-parity
layout, as in RAID3 (or, with block-level striping, RAID4).
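
To make the point about distributed parity concrete, here is a minimal
Python sketch of one common rotation scheme (often called
"left-asymmetric"); real implementations differ in the rotation they use.
The function names parity_disk and raid5_rows are made up for this
illustration and do not come from any ZFS or RAID code.

```python
def parity_disk(stripe, ndisks):
    """Disk index holding parity for a stripe.

    Assumption: parity starts on the last disk and moves one disk to the
    left on each successive stripe, wrapping around.
    """
    return (ndisks - 1) - (stripe % ndisks)


def raid5_rows(nstripes, ndisks):
    """Build a layout table like the diagrams above, one row per stripe."""
    rows = []
    next_data = 0  # running number of the next data block
    for stripe in range(nstripes):
        p = parity_disk(stripe, ndisks)
        row = []
        for disk in range(ndisks):
            if disk == p:
                row.append(f"P{stripe}")
            else:
                row.append(f"D{next_data}")
                next_data += 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    print("\t".join(f"Disk{d}" for d in range(4)))
    for row in raid5_rows(4, 4):
        print("\t".join(row))
```

With four disks, the parity block lands on Disk3, Disk2, Disk1 and Disk0
in turn, so no single disk carries all the parity traffic.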

> As you can see, even a small block is stored on all disks in RAIDZ,
> whereas on RAID5 a small block can be stored on one disk only.
>
> --
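
The claim quoted above, that a small block touches every data disk on
RAIDZ but possibly only one disk on RAID5, can be sketched as a rough
count of disks read in non-degraded mode.  This is a toy model, not how
either implementation computes anything: the 512-byte sector size, the
64 KB RAID5 stripe unit, the assumption that the block starts on a
stripe-unit boundary, and both function names are mine.

```python
import math


def raidz_disks_read(block_bytes, ndisks, sector_bytes=512):
    """Data disks touched when reading one block from single-parity RAIDZ.

    Assumption: the block is split into sectors spread across the data
    disks, and parity is not read in non-degraded mode.
    """
    sectors = math.ceil(block_bytes / sector_bytes)
    return min(sectors, ndisks - 1)


def raid5_disks_read(block_bytes, ndisks, stripe_unit_bytes=64 * 1024):
    """Data disks touched when reading one block from RAID5.

    Assumption: the block starts at the beginning of a stripe unit, so a
    block no larger than one stripe unit lives on a single disk.
    """
    units = math.ceil(block_bytes / stripe_unit_bytes)
    return min(units, ndisks - 1)


if __name__ == "__main__":
    for size in (512, 8 * 1024, 128 * 1024):
        print(size, raidz_disks_read(size, 4), raid5_disks_read(size, 4))
```

Under these assumptions an 8 KB read on a four-disk set occupies all three
data disks on RAIDZ but only one on RAID5, which is why many parallel
random readers would be expected to fare better on the RAID5 layout.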

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
            Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss