Re: [OpenIndiana-discuss] Recommendations for fast storage

Edward Ned Harvey (openindiana) Wed, 17 Apr 2013 05:39:23 -0700

> From: Sašo Kiselkov [mailto:skiselkov...@gmail.com]
> 
> Raid-Z indeed does stripe data across all
> leaf vdevs (minus parity) and does so by splitting the logical block up
> into equally sized portions.


Jay, there you have it.  You asked why use mirrors, and you said you would use 
raidz2 or raidz3 unless CPU overhead is too much.  I recommended using mirrors 
and avoiding raidzN, and here is the answer why.

If you have 16 disks arranged as 8 two-way mirrors, versus 10 disks in raidz2 which 
stripes across 8 disks plus 2 parity disks, then the sequential write speed of each 
configuration is about the same; that is, 8x the sustained write speed of a 
single device.  But if you have two or more parallel sequential read threads, 
then the sequential read speed of the mirrors will be 16x while the raidz2 is 
only 8x.  The mirror configuration can do 8x random write while the raidz2 is 
only 1x.  And the mirror can do 16x random read while the raidz2 is only 1x.

In the case you care about the least, they're equal.  In the case you care 
about most, the mirror configuration is 16x faster.
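
To make the comparison above easier to scan, here is a small sketch that just 
tabulates the multipliers quoted in this message (relative to a single disk) and 
prints the mirror-to-raidz2 ratio per workload.  The dictionary and function 
names are made up for illustration, and the numbers are the rough estimates 
from the text, not measurements.

```python
# Throughput multipliers relative to one disk, as estimated in the post:
# (8 two-way mirrors on 16 disks, one 10-disk raidz2 vdev).
MULTIPLIERS = {
    "sequential write": (8, 8),
    "parallel sequential read": (16, 8),
    "random write": (8, 1),
    "random read": (16, 1),
}


def mirror_advantage(workload):
    """Return how many times faster the mirror layout is for a workload."""
    mirrors, raidz2 = MULTIPLIERS[workload]
    return mirrors / raidz2


if __name__ == "__main__":
    for name in MULTIPLIERS:
        print(f"{name:26s} mirrors are {mirror_advantage(name):g}x raidz2")
```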

You also said the raidz2 will offer more protection against failure, because 
you can survive any two disk failures (but no more).  I would argue this is 
incorrect (I've done the probability analysis before).  Mostly because the 
resilver time in the mirror configuration is 8x to 16x faster (there's 1/8 as 
much data to resilver, and IOPS is limited by a single disk, not the "worst" of 
several disks, which introduces another factor up to 2x, increasing the 8x as 
high as 16x), so the smaller resilver window means lower probability of 
"concurrent" failures on the critical vdev.  We're talking about 12 hours 
versus 1 week, actual results from my machines in production.  Also, while it's 
possible to fault the pool with only 2 failures in the mirror configuration, 
the odds are against that happening.  The first failure is equally likely 
(1/16) to hit any given disk.  If you then have a 2nd concurrent failure, 
there's a 14/15 probability that it occurs on a different (safe) mirror.  
The 3rd concurrent failure has a 12/14 chance of being safe.  The 4th 
concurrent failure has a 10/13 chance of being safe.  Etc.  The mirror configuration 
can probably withstand a higher number of failures, and also the resilver 
window for each failure is smaller.  When you look at the total probability of 
pool failure, both came out around 10^-17 or something like that.  In other 
words, we're splitting hairs but as long as we are, we might as well point out 
that they're both about the same.
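
The per-failure odds above (14/15, 12/14, 10/13, ...) can be multiplied together 
to get the chance that the pool of 8 two-way mirrors is still alive after a 
given number of concurrent failures.  The sketch below does only that.  It 
assumes every surviving disk is equally likely to be the next one to fail, and 
it ignores resilver windows entirely, so it is not the full probability 
analysis mentioned above.  The function name and parameters are hypothetical.

```python
from fractions import Fraction


def mirror_survival(failures, mirrors=8):
    """Chance that `failures` concurrent disk failures leave every
    two-way mirror with at least one working disk.

    Assumes each surviving disk is equally likely to fail next.
    """
    disks = 2 * mirrors
    p = Fraction(1)
    # The first failure is always survivable; each later one must miss
    # the partners of the disks that have already failed.
    for already_failed in range(1, failures):
        remaining = disks - already_failed
        safe = max(0, remaining - already_failed)
        p *= Fraction(safe, remaining)
    return p


if __name__ == "__main__":
    for k in (2, 3, 4):
        print(k, mirror_survival(k))
    # prints:
    # 2 14/15
    # 3 4/5
    # 4 8/13
```

So under these assumptions the 16-disk mirror pool still survives 4 concurrent 
failures about 62% of the time, while a single raidz2 vdev survives 2 failures 
with certainty and 3 never.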


_______________________________________________
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss
