Some history below...
Scott Lawson wrote:
Michael Shadle wrote:
On Mon, Apr 27, 2009 at 4:51 PM, Scott Lawson
<scott.law...@manukau.ac.nz> wrote:
If possible though, you would be best to let the 3ware controller expose the 16 disks as a JBOD to ZFS and create a RAIDZ2 within Solaris, as you will then gain the full benefits of ZFS: block self-healing, etc.
There isn't an issue in using a larger number of disks in a RAIDZ2, just that it is not the optimal size. Rebuild times are longer for larger vdevs in a zpool (although this is proportional to how full the pool is). Two parity disks give you greater cover in the event of a drive failing in a large vdev stripe.
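For what it's worth, with the controller in JBOD mode that layout is a one-liner. A rough sketch, assuming Solaris sees the sixteen drives as c0t0d0 through c0t15d0 (placeholder names; substitute whatever the 3ware controller actually presents):

    # single 16-disk RAIDZ2 vdev in a pool called "tank"
    zpool create tank raidz2 c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0 \
        c0t6d0 c0t7d0 c0t8d0 c0t9d0 c0t10d0 c0t11d0 c0t12d0 c0t13d0 \
        c0t14d0 c0t15d0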
Hmm, this is a bit disappointing to me. I would have dedicated only 2 disks out of 16 then to a single large raidz2, instead of two 8-disk raidz2s (meaning 4 disks went to parity).
No, I was referring to a single RAIDZ2 vdev of 16 disks in your pool, so you would effectively lose ~2 disks to parity. The larger the stripe, potentially the slower the rebuild. If you had multiple smaller-stripe vdevs in a pool, you would get less performance degradation by virtue of I/O isolation; of course, you lose pool capacity that way. With smaller vdevs, you could also potentially just use RAIDZ rather than RAIDZ2 and still have an equivalent-size pool with two parity disks in total, one per vdev.
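A rough sketch of that alternative layout, again with placeholder device names; each "raidz" keyword starts a new vdev, so this gives two 8-disk single-parity stripes in the one pool:

    zpool create tank \
        raidz c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0 c0t6d0 c0t7d0 \
        raidz c0t8d0 c0t9d0 c0t10d0 c0t11d0 c0t12d0 c0t13d0 c0t14d0 c0t15d0

A resilver (and its I/O impact) is then confined to the vdev that holds the failed disk.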
A few years ago, Sun introduced the X4500 (aka Thumper), which had 48 disks in the chassis. Of course, the first thing customers did was make a single-level 46- or 48-disk raidz set. The second thing they did was complain that the resulting performance sucked. So the "solution" was to try to put some sort of practical limit into the docs to help people not hurt themselves. After much research (down at the pub? :-) the recommendation you see in the man page was the consensus. It has absolutely nothing to do with correctness of design or implementation. It has everything to do with setting expectations of "goodness."
One thing you haven't mentioned is the drive type and size that you are planning to use, as this greatly influences what people here would recommend. RAIDZ2 is built for big, slow SATA disks, as reconstruction times in large RAIDZs and RAIDZ2s increase the risk of vdev failure significantly due to the time taken to resilver to a replacement drive. Hot spares are your friend!
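A minimal sketch of that, assuming you hold a disk or two back from the data vdevs (c2t0d0 here is just a placeholder for whichever drive you set aside):

    # add a hot spare to the pool; when a disk faults, ZFS can activate
    # the spare automatically and start resilvering onto it right away
    zpool add tank spare c2t0d0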
The concern with large drives is unrecoverable reads during resilvering. One contributor to this is superparamagnetic decay, where bits are lost over time as the medium tries to revert to a more steady state. To some extent, periodic scrubs will help repair these while the disks are otherwise still good. At least one study found that this can occur even when scrubs are done, so there is an open research opportunity to determine the risk and recommend scrubbing intervals. To a lesser extent, hot spares can help reduce the hours it may take to physically replace the failed drive.
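Running a scrub is a single command, and since the ideal interval is (as noted) still an open question, the schedule below is only a guess, not a researched recommendation:

    # run a scrub now, then check progress and any repaired errors
    zpool scrub tank
    zpool status tank

    # example cron entry: scrub every Sunday at 02:00
    0 2 * * 0 /usr/sbin/zpool scrub tank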
I was still operating under the impression that vdevs larger than 7-8
disks typically make baby Jesus nervous.
You did also state that this is a system to be used for backups? So availability is five 9s?
I do not believe you can achieve five 9s with current consumer disk
drives for an extended period, say >1 year.
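For context, five 9s is 99.999% availability, which is a downtime budget of about 0.00001 x 525,600 minutes, i.e. roughly 5 minutes per year.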
Are you planning on using OpenSolaris or mainstream Solaris 10? Mainstream Solaris 10 is more conservative and is capable of being placed under a support agreement if need be.
Mainstream Solaris 10 gets a port of ZFS from OpenSolaris, so its
features are fewer and later. As time ticks away, fewer features
will be back-ported to Solaris 10. Meanwhile, you can get a production
support agreement for OpenSolaris.
http://www.sun.com/service/opensolaris/index.jsp
-- richard