On Wed, Mar 21, 2012 at 7:56 AM, Jim Klimov <jimkli...@cos.ru> wrote:
> 2012-03-21 7:16, MLR wrote:
> One thing to note is that many people would not recommend using
> a "disbalanced" ZFS array - one expanded by adding a TLVDEV after
> many writes, or one consisting of differently-sized TLVDEVs.
>
> ZFS does a rather good job of trying to use available storage
> most efficiently, but it was often reported that it hits some
> algorithmic bottleneck when one of the TLVDEVs is about 80-90%
> full (even if others are new and empty). Blocks are balanced
> across TLVDEVs on write, so your old data is not magically
> redistributed until you explicitly rewrite it (i.e. zfs send
> or rsync into another dataset on this pool).

I have been running ZFS in a mission-critical application since zpool
version 10 and have not seen any issues with some of the vdevs in a
zpool full while others are virtually empty. We have been running
commercial Solaris 10 releases.

The configuration was that each business unit had a separate zpool
consisting of mirrored pairs of 500 GB LUNs from SAN-based storage.
Each zpool started with enough storage for that business unit. As each
business unit filled their space, we added additional mirrored pairs
of LUNs. So the smallest unit had one mirror vdev and the largest had
13 vdevs. In the case of the two largest (13 and 11 vdevs), most of
the vdevs were well above 90% utilized and there were 2 or 3 almost
empty vdevs. We never saw any reliability issues with this condition.
In terms of performance, the storage was NOT our performance
bottleneck, so I do not know whether there were any performance issues
with this situation.

> So I'd suggest that you keep your disks separate, with two
> pools made from 1.5Tb disks and from 3Tb disks, and use these
> pools for different tasks (i.e. a working set with relatively
> high turnaround and fragmentation, and WORM static data with
> little fragmentation and high read performance).
> Also this would allow you to more easily upgrade/replace the
> whole set of 1.5Tb disks when the time comes.

I have never tried mixing drives of different size or performance
characteristics in the same zpool or vdev, except as a temporary
migration strategy. You already know that growing a RAIDz vdev is
currently impossible, so with a RAIDz strategy your only option for
growth is to add complete RAIDz vdevs, and you _want_ those to match
in terms of performance or you will have unpredictable performance.

For situations where you _might_ want to grow the data capacity in the
future I recommend mirrors, but (and Richard Elling posted hard data on
this to the list a while back) to get the reliability of RAIDz2 you
need more than a 2-way mirror. In my mind, the larger the amount of
data (and the size of the drives), the _more_ reliability you need.

We are no longer using the configuration described above. The current
configuration is five JBOD chassis of 24 drives each. We have 22
vdevs, each a RAIDz2 consisting of one drive from each chassis, plus
10 hot spares. Our priority was reliability, followed by capacity and
performance. If we could have, we would have just used 3- or 4-way
mirrors, but we needed more capacity than that would have provided. I
note that in pre-production testing we did have two of the five JBOD
chassis go offline at once and did not lose _any_ data. The total pool
size is about 40 TB.
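For anyone who wants to picture it, the layout works out to roughly
the commands below. Pool and device names are invented for
illustration (not our real ones); in each raidz2 set the five disks
come one from each chassis.

    # The old business-unit pools grew by adding another mirrored pair
    # of SAN LUNs whenever a unit ran low (hypothetical device names):
    zpool add bu_pool mirror c2t10d0 c3t10d0

    # Current pool: each raidz2 vdev takes one disk from each of the
    # five chassis. Only the first two of the 22 vdevs are shown.
    zpool create tank \
        raidz2 c1t0d0 c2t0d0 c3t0d0 c4t0d0 c5t0d0 \
        raidz2 c1t1d0 c2t1d0 c3t1d0 c4t1d0 c5t1d0
    # ... the remaining raidz2 sets are added the same way ...
    zpool add tank raidz2 c1t2d0 c2t2d0 c3t2d0 c4t2d0 c5t2d0

    # The 10 hot spares are shared across the whole pool:
    zpool add tank spare c1t22d0 c2t22d0 c3t22d0 c4t22d0 c5t22d0 \
        c1t23d0 c2t23d0 c3t23d0 c4t23d0 c5t23d0

    zpool status tank

Spreading every raidz2 set across the five chassis is what let us lose
two whole chassis in that test and still keep the data, since each
vdev only lost two of its five drives.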
We also have a redundant copy of the data on a remote system. That
system only has two JBOD chassis, and capacity is the priority there.
The zpool consists of two vdevs, each a RAIDz2 of 23 drives, plus two
hot spares. The performance is dreadful, but we _have_ the data in
case of a real disaster.

--
{--------1---------2---------3---------4---------5---------6---------7---------}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company
   ( http://www.sloctheater.org/ )
-> Technical Advisor, Troy Civic Theatre Company
-> Technical Advisor, RPI Players