On Oct 20, 2010, at 6:03 AM, Edward Ned Harvey wrote:

> In a discussion a few weeks back, it was mentioned that the Best Practices
> Guide says something like "Don't put more than ___ disks into a single
> vdev." At first, I challenged this idea, because I see no reason why a
> 21-disk raidz3 would be bad. It seems like a good thing.

It is a choice. We have the "best practices" guide because the implications
of such choices are not always immediately obvious. Anecdote: when the X4500
was first released, people built 46-disk wide raidz1 pools. This is not a
good idea for most cases. Hence the man page and ZFS best practices guide
recommendations to limit the number of disks in a set.

> I was operating on assumption that resilver time was limited by sustainable
> throughput of disks, which was wrong.

It is limited by the random write capacity of the resilvering disk.

> At present, resilver time is limited
> by random IO, so the ZFS resilver time is typically much longer than it
> would be if you were resilvering the whole disk serially.

Resilver is also throttled.

> But that was the only negative against 21-disk raidz3.

Untrue. The performance of a 21-disk raidz3 will be nowhere near the
performance of a 20-disk set of 2-way mirrors.

> That was the only
> negative, against using more than ___ disks in a single vdev. Assuming this
> one problem is improved at some point, is there any other reason to stay
> below ___ disks in a vdev?

Taking this to a limit, would you say a 1,000-disk raidz3 set is a good
thing? 10,000 disks?

> Does the random IO resilver performance problem also apply to scrub or zfs
> send? Again the problem is: resilver is done in effectively random order,
> so the disks perform zillions of random seeks instead of serializing IO and
> minimizing seeks during resilver. Is the same thing true for scrub or zfs
> send?

To recap:
+ The data will be read from where it was written.
+ Resilver is throttled.
+ Resilver performance is typically bound by the random write capability of
  the resilvering disk.
+ Resilvering is done in temporal order.
+ If too many disks fail during resilver, the resulting zpool can still be
  on-disk consistent.
+ ZFS is open source, feel free to modify and share your ideas for
  improvement.
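
To put rough numbers on the raidz3-versus-mirrors comparison and on the resilver bound recapped above, here is a back-of-the-envelope sketch. It is not taken from ZFS itself. It assumes the common rule of thumb that a raidz vdev serves small random reads at about the rate of one disk, while every side of a mirror can serve reads independently. The function names, the 100 IOPS per-disk figure, and the block count are all made up for illustration, and the resilver estimate ignores throttling, so real times would be longer.

```python
# Rule-of-thumb model only. All names and figures here are illustrative
# assumptions, not measurements and not ZFS internals.

def raidz_pool_read_iops(vdevs: int, per_disk_iops: float) -> float:
    """Small random reads: assume each raidz vdev behaves like one disk."""
    return vdevs * per_disk_iops


def mirror_pool_read_iops(vdevs: int, ways: int, per_disk_iops: float) -> float:
    """Small random reads: assume every side of every mirror serves reads."""
    return vdevs * ways * per_disk_iops


def resilver_hours(blocks_to_rewrite: int, random_write_iops: float) -> float:
    """Lower bound on resilver time if it is limited only by the random
    write rate of the resilvering disk (throttling is ignored)."""
    return blocks_to_rewrite / random_write_iops / 3600.0


PER_DISK_IOPS = 100.0  # assumed figure for a single spinning disk

# One 21-disk raidz3 vdev versus 20 disks arranged as ten 2-way mirrors.
wide_raidz3 = raidz_pool_read_iops(vdevs=1, per_disk_iops=PER_DISK_IOPS)
mirrors = mirror_pool_read_iops(vdevs=10, ways=2, per_disk_iops=PER_DISK_IOPS)

print(f"21-disk raidz3 : ~{wide_raidz3:.0f} random read IOPS")
print(f"10 x 2-way     : ~{mirrors:.0f} random read IOPS")
print(f"resilver bound : ~{resilver_hours(3_600_000, PER_DISK_IOPS):.1f} hours")
```

Under these assumptions the mirrored layout comes out around 20 times faster for small random reads, and rewriting 3.6 million blocks at 100 random writes per second takes at least 10 hours. Treat the ratio as an order-of-magnitude guide; the absolute numbers depend entirely on the assumed per-disk rate.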

 -- richard

--
OpenStorage Summit, October 25-27, Palo Alto, CA
http://nexenta-summit2010.eventbrite.com
USENIX LISA '10 Conference, November 7-12, San Jose, CA
ZFS and performance consulting
http://www.RichardElling.com

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss