On Oct 20, 2010, at 6:03 AM, Edward Ned Harvey wrote:

> In a discussion a few weeks back, it was mentioned that the Best Practices
> Guide says something like "Don't put more than ___ disks into a single
> vdev."  At first, I challenged this idea, because I see no reason why a
> 21-disk raidz3 would be bad.  It seems like a good thing.

It is a choice.  The reason we have the "best practices" guide is because 
the implications of such choices are not always immediately obvious.

Anecdote: when the X4500 was first released, people built 46-disk wide
raidz1 pools.  This is not a good idea for most cases. Hence the man page
and ZFS best practices guide recommendations to limit the number of 
disks in a set.

> I was operating on assumption that resilver time was limited by sustainable
> throughput of disks, which was wrong.

It is limited by the random write capacity of the resilvering disk.
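
A back-of-envelope sketch of why that matters, assuming (hypothetically)
that every block costs one random write on the resilvering disk. The
function names and all figures are illustrative assumptions, not
measurements, and the model ignores the throttle and any write aggregation.

```python
def random_bound_hours(allocated_bytes, avg_block_bytes, random_write_iops):
    """Hours to rebuild if each block costs one random write (assumed model)."""
    blocks = allocated_bytes / avg_block_bytes
    return blocks / random_write_iops / 3600.0


def sequential_hours(allocated_bytes, seq_bytes_per_sec):
    """Hours to write the same amount of data as one sequential stream."""
    return allocated_bytes / seq_bytes_per_sec / 3600.0


if __name__ == "__main__":
    allocated = 500e9   # assumed: 500 GB to rebuild on the new disk
    block = 8 * 1024    # assumed: 8 KiB average block
    iops = 150          # assumed: random write IOPS of one disk
    stream = 100e6      # assumed: 100 MB/s sequential write rate
    print("random-bound hours:", random_bound_hours(allocated, block, iops))
    print("sequential hours:  ", sequential_hours(allocated, stream))
```

With small blocks the random-bound estimate comes out far larger than the
sequential one, which is the gap being discussed.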

>  At present, resilver time is limited
> by random IO, so the ZFS resilver time is typically much longer than it
> would be if you were resilvering the whole disk serially.

Resilver is also throttled.

> But that was the only negative against 21-disk raidz3.  

Untrue.  The performance of a 21-disk raidz3 will be nowhere near the
performance of a 20-disk 2-way mirror.
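
A rough way to see the gap, using a common rule of thumb that is an
assumption here and not something stated in this thread: for small random
I/O, a raidz vdev delivers roughly the IOPS of one disk regardless of its
width, while every disk in a mirror can serve reads independently. The
per-disk IOPS figure and the function names are hypothetical.

```python
def raidz_random_iops(num_vdevs, disk_iops):
    # Assumed rule of thumb: each raidz vdev behaves like about one disk
    # for small random I/O, no matter how many disks it contains.
    return num_vdevs * disk_iops


def mirror_random_read_iops(num_disks, ways, disk_iops):
    # Assumed: each side of each mirror can serve a different read.
    vdevs = num_disks // ways
    return vdevs * ways * disk_iops


def mirror_random_write_iops(num_disks, ways, disk_iops):
    # Every write goes to all sides, so each mirror vdev writes like one disk.
    return (num_disks // ways) * disk_iops


if __name__ == "__main__":
    disk_iops = 100  # assumed per-disk random IOPS
    print("21-disk raidz3:         ", raidz_random_iops(1, disk_iops))
    print("20-disk mirrors, reads: ", mirror_random_read_iops(20, 2, disk_iops))
    print("20-disk mirrors, writes:", mirror_random_write_iops(20, 2, disk_iops))
```

Under these assumptions the mirrored pool is an order of magnitude ahead on
small random I/O. Large sequential transfers are a different story and are
not modelled here.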

> That was the only
> negative, against using more than ___ disks in a single vdev.  Assuming this
> one problem is improved at some point, is there any other reason to stay
> below ___ disks in a vdev?

Taking this to a limit, would you say a 1,000 disk raidz3 set is a good thing?
10,000 disks?

> Does the random IO resilver performance problem also apply to scrub or zfs
> send?  Again the problem is:  resilver is done in effectively random order,
> so the disks perform zillions of random seeks instead of serializing IO and
> minimizing seeks during resilver.  Is the same thing true for scrub or zfs
> send?

To recap:
+ The data will be read from where it was written.
+ Resilver is throttled. 
+ Resilver performance is typically bound by the random write capability of
   the resilvering disk.
+ Resilvering is done in temporal order.
+ If too many disks fail during resilver, the resulting zpool can still be
   on-disk consistent.
+ ZFS is open source, feel free to modify and share your ideas for improvement.
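
To illustrate the first and fourth points together, here is a toy model,
not ZFS code. It assumes that blocks written over time end up at effectively
random offsets, as they might on an aged, fragmented pool; on a freshly
written pool, birth order and on-disk order would be much closer. All names
and sizes are hypothetical.

```python
import random


def head_travel(offsets):
    """Total distance between consecutive offsets in the order given."""
    return sum(abs(b - a) for a, b in zip(offsets, offsets[1:]))


def compare_orders(num_blocks=10000, disk_size=10**9, seed=0):
    """Return (travel in birth order, travel in offset order)."""
    rng = random.Random(seed)
    # Assumption: list position is birth order, and placement is random.
    by_birth = [rng.randrange(disk_size) for _ in range(num_blocks)]
    by_offset = sorted(by_birth)
    return head_travel(by_birth), head_travel(by_offset)


if __name__ == "__main__":
    temporal, sequential = compare_orders()
    print("travel in temporal order:", temporal)
    print("travel in offset order:  ", sequential)
```

Visiting blocks in offset order can never travel farther than visiting the
same blocks in any other order, which is why a temporal walk over scattered
data turns into many seeks.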

 -- richard

-- 
OpenStorage Summit, October 25-27, Palo Alto, CA
http://nexenta-summit2010.eventbrite.com
USENIX LISA '10 Conference, November 7-12, San Jose, CA
ZFS and performance consulting
http://www.RichardElling.com

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
