On Wed, 2006-06-28 at 13:24 -0400, Jonathan Edwards wrote:
> On Jun 28, 2006, at 12:32, Erik Trimble wrote:
> > The main reason I don't see ZFS mirror / HW RAID5 as useful is this:
> >
> > ZFS mirror / RAID5:   capacity = (N / 2) - 1
> >                       speed << N / 2 - 1
> >                       minimum # disks to lose before loss of data: 4
> >                       maximum # disks to lose before loss of data: (N / 2) + 2
>
> shouldn't that be capacity = ((N - 1) / 2) ?

Nope. Take 12 drives, for instance: a mirror of two 6-drive RAID5 arrays, where each RAID5 set actually provides only 5 drives' worth of capacity. N = 12, so (12 / 2) - 1 = 6 - 1 = 5.
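If anyone wants to sanity-check that arithmetic, here's a quick back-of-the-envelope Python sketch (the function and names are just mine for illustration, nothing ZFS-specific) that reproduces the capacity and disk-loss figures for the mirror-of-RAID5 layout above and the mirror-of-stripe layout below:

    # Figures for a ZFS mirror built on top of two hardware arrays of
    # n/2 drives each.  Capacity is counted in drives' worth of space;
    # the min/max numbers are the disk-loss figures quoted in the thread.
    def zfs_mirror_figures(hw_layout: str, n: int) -> dict:
        side = n // 2
        if hw_layout == "raid5":
            return {
                "capacity": side - 1,        # each side gives up one drive to parity
                "min_disks_to_lose": 4,      # two failures on each side
                "max_disks_to_lose": side + 2,  # an entire side plus two on the survivor
            }
        if hw_layout == "stripe":
            return {
                "capacity": side,            # no parity overhead
                "min_disks_to_lose": 2,      # one failure on each side
                "max_disks_to_lose": side + 1,  # an entire side plus one on the survivor
            }
        raise ValueError(f"unknown layout: {hw_layout}")

    for layout in ("raid5", "stripe"):
        print(layout, zfs_mirror_figures(layout, 12))

For 12 drives that prints a capacity of 5 vs. 6 and loss thresholds of 4/8 vs. 2/7, which is where the 5-drive figure above comes from.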
>
> loss of a single disk would cause a rebuild on the R5 stripe which
> could affect performance on that side of the mirror. Generally
> speaking good RAID controllers will dedicate processors and channels
> to calculate the parity and write it out so you're not impacted from
> the host access PoV. There is a similar sort of CoW behaviour that
> can happen between the array cache and the drives, but in the ideal
> case you're dealing with this in dedicated hw instead of shared hw.

But in every case I've ever observed, even with hardware assist, writing to an N-drive RAID5 array is slower than writing to an (N-1)-drive HW striped array. NVRAM can of course mitigate this somewhat, but the bottom line is that RAID 5/6 always requires more work than simple striping, and an N-drive striped array will always outperform an N-drive RAID5/6 array. Always. I agree there is some latitude here for array design, cache performance, and workload variance, but I've compared what should be the optimal RAID5 workload (large streaming writes and streaming reads) against an identical number of striped drives, and BEST CASE you are looking at the RAID5 performing at (N-1)/N of the stripe.

[In reality, that isn't quite the best case. The true best case is that RAID5 matches striping, for reads of size <= (stripe size) * (N-1).]

> > ZFS mirror / HW Stripe:   capacity = N / 2
> >                           speed >= N / 2
> >                           minimum # disks to lose before loss of data: 2
> >                           maximum # disks to lose before loss of data: (N / 2) + 1
> >
> > Given a reasonable number of hot-spares, I simply can't see the
> > (very) marginal increase in safety given by using HW RAID5 as
> > outweighing the considerable speed hit using RAID5 takes.
>
> I think you're comparing this to software R5, or at least badly
> implemented array code, and divining that there is a considerable speed
> hit when using R5. In practice this is not always the case, provided
> that the response time and interaction between the array cache and
> drives is sufficient for the incoming stream. By moving your
> operation to software you're now introducing more layers between the
> CPU, L1/L2 cache, memory bus, and system bus before you get to the
> interconnect, and further latencies on the storage port and underlying
> device (virtualized or not). Ideally it would be nice to see ZFS-style
> improvements in array firmware, but given the state of embedded
> Solaris and the predominance of 32-bit controllers, I think we're
> going to have some issues. We'd also need some sort of client
> mechanism to interact with the array if we're talking about moving the
> filesystem layer out there .. just a thought
>
> Jon E

What I was trying to provide was the best configuration for those using HW arrays AND ZFS together. I'm not arguing either/or; the discussion centered on the best way to do BOTH.

-- 
Erik Trimble
Java System Support
Mailstop:  usca14-102
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss