On Thu, Sep 13, 2007 at 04:58:10AM +0000, Marc Bevand wrote:
> Pawel Jakub Dawidek <pjd <at> FreeBSD.org> writes:
> > 
> > This is how RAIDZ fills the disks (follow the numbers):
> > 
> >     Disk0   Disk1   Disk2   Disk3
> > 
> >     D0      D1      D2      P3
> >     D4      D5      D6      P7
> >     D8      D9      D10     P11
> >     D12     D13     D14     P15
> >     D16     D17     D18     P19
> >     D20     D21     D22     P23
> > 
> > D is data, P is parity.
> 
> This layout assumes of course that large stripes have been written to
> the RAIDZ vdev. As you know, the stripe width is dynamic, so it is
> possible for a single logical block to span only 2 disks (for those who
> don't know what I am talking about, see the "red" block occupying LBAs
> D3 and E3 on page 13 of these ZFS slides [1]).

Yes I'm aware of that.

> To read this logical block (and validate its checksum), only D_0 needs 
> to be read (LBA E3). So in this very specific case, a RAIDZ read
> operation is as cheap as a RAID5 read operation. [...]

If you do single-sector writes - yes, but this is very inefficient,
for two reasons:
1. Bandwidth - writing one sector at a time? Come on.
2. Space - when you write one sector and its parity you consume two
   sectors. You may have more than one parity column in that case, e.g.:
        Disk0   Disk1   Disk2   Disk3   Disk4   Disk5
        D0      P0      D1      P1      D2      P2
   In this case space overhead is the same as in mirror.
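
To make the space argument concrete, here is a rough sketch of the arithmetic in Python. It assumes a simplified model: one parity sector per parity level for every row of up to (ndisks - nparity) data sectors, with no allocation padding or rounding. The function names are made up for illustration and are not taken from the ZFS sources.

```python
import math

def raidz_sectors(data_sectors, ndisks, nparity=1):
    """Return (data, parity) sector counts for one logical block.

    Simplified model (an assumption, not the real allocator): each row
    holds up to (ndisks - nparity) data sectors and carries nparity
    parity sectors. Padding and allocation rounding are ignored.
    """
    data_per_row = ndisks - nparity
    rows = math.ceil(data_sectors / data_per_row)
    return data_sectors, rows * nparity

def parity_fraction(data_sectors, ndisks, nparity=1):
    """Fraction of the consumed sectors that hold parity."""
    data, parity = raidz_sectors(data_sectors, ndisks, nparity)
    return parity / (data + parity)

# Single-sector write on 6 disks: one data sector plus one parity sector.
print(parity_fraction(1, 6))
# Full-width write on 6 disks: five data sectors share one parity sector.
print(parity_fraction(5, 6))
```

Under this model a single-sector write spends half of its space on parity, the same as a two-way mirror, while a full-width write on six disks spends one sixth.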

> [...] The existence of these
> small stripes could explain why RAIDZ doesn't perform as bad as RAID5
> in Pawel's benchmark...

No, as I said, the smallest block I used was 2kB, which means four
512-byte data sectors plus one 512-byte parity sector, so each 2kB
block uses all 5 disks.
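
The same count can be sketched in Python. This assumes 512-byte sectors, single parity, and a simplified model in which a block occupies one column per data sector plus one per parity sector, capped at the number of disks. columns_used is a hypothetical helper, not a ZFS function.

```python
SECTOR_BYTES = 512  # assumed sector size, as in the benchmark discussed here

def columns_used(block_bytes, ndisks, nparity=1, sector=SECTOR_BYTES):
    """Number of disks touched by one logical block (simplified model)."""
    # Round the block size up to whole sectors.
    data_sectors = -(-block_bytes // sector)
    # One column per data sector plus the parity columns, at most ndisks.
    return min(ndisks, data_sectors + nparity)

print(columns_used(2048, 5))  # 2kB block on a 5-disk RAIDZ
print(columns_used(512, 5))   # single-sector block on the same vdev
```

With these assumptions a 2kB block touches all 5 disks, while only a single-sector block would be confined to 2 of them.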

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
[EMAIL PROTECTED]                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!


_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
