pet peeve below...

Kent Watsen wrote:
>
>> I think I have managed to confuse myself so I am asking outright hoping for
>> a straight answer.
>>
> Straight answer:
>
> ZFS does not (yet) support adding a disk to an existing raidz set -
> the only way to expand an existing pool is by adding a stripe.
> Stripes can either be mirror, raid5, or raid6 (raidz w/ single or
> double parity) - these striped pools are also known as raid10,
> raid50, and raid60 respectively. Each stripe in a pool may be
> different in both size and type - essentially, each offers space at
> a resiliency rating. However, since apps can't control which stripe
> their data is written to, all stripes in a pool generally have the
> same amount of parity. Thus, in practice, stripes differ only in
> size, which can be achieved either by using larger disks or by using
> more disks (in a raidz). When stripes are of different size, ZFS
> will, in time, consume all the space each stripe offers - assuming
> data-access is completely balanced, larger stripes effectively have
> more I/O. Regarding matching the amount of parity in each stripe,
> note that a 2-disk mirror has the same amount of parity as RAID5 and
> a 3-disk mirror has the same parity as RAID6.
>
>
> So, if the primary goal is to grow a pool over time by adding as few
> disks as possible each time while having 1 bit of parity, you need to
> plan on each time adding two disks in a mirrored configuration. Thus
> your number of disks would grow like this: 2, 4, 6, 8, 10, etc.
>
>
> But since folks apparently want to be able to just add disks to a RAIDZ,
> let's compare that to adding 2-disk mirror stripes in terms of impact to
> space, resiliency, and performance. In both cases I'm assuming 500GB
> disks having a MTBF of 4 years, 7,200 rpm, and 8.5 ms average read seek.

MTBF=4 years is *way too low*! Disk MTBF should be more like 114 years.
This is also a common misapplication of reliability analysis. To excerpt
from http://blogs.sun.com/relling/entry/using_mtbf_and_time_dependent

    For example, data collected for the years 1996-1998 in the US showed
    that the annual death rate for children aged 5-14 was 20.8 per
    100,000 resident population. This shows an average failure rate of
    0.0208% per year. Thus, the MTBF for children aged 5-14 in the US is
    approximately 4,807 years. Clearly, no human child could be expected
    to live 5,000 years.

That said (ok, it is a pet peeve for RAS guys :-) the relative merit of
the rest of the analysis is good :-) And, for the record, I mirror.
 -- richard

> Let's first consider adding disks to a RAID5:
>
> Following the ZFS best-practice rule of (N+P), where N={2,4,8} and
> P={1,2}, the disk-count should grow as follows: 3, 5, 9. That is,
> you would start with 3, add 2, and then add 4 - note: this would be
> the limit of the raidz expansion since ZFS discourages N>8. So,
> the pool's MTTDL would be:
>
>     3 disks: space=1000 GB, mttdl=760.42 years, iops=79
>     5 disks: space=2000 GB, mttdl=228.12 years, iops=79
>     9 disks: space=4000 GB, mttdl=63.37 years, iops=79
>
> Now let's consider adding 2-disk mirror stripes:
>
> We already said that the disks would grow by twos: 2, 4, 6, 8, 10,
> etc. - so the pool's MTTDL would be:
>
>     2 disks: space=500 GB, mttdl=760.42 years, iops=158
>     4 disks: space=1000 GB, mttdl=380 years, iops=316
>     6 disks: space=1500 GB, mttdl=190 years, iops=474
>     8 disks: space=2000 GB, mttdl=95 years, iops=632
>
> So, adding 2-disk mirrors:
>
>   1. is less expensive per addition (it's always just two disks)
>   2. is not limited in number of stripes (a raidz should only hold up to 8
>      data disks)
>   3. drops mttdl at about the same rate (though the raidz is dropping a
>      little faster)
>   4. increases performance (adding disks to a raidz set has no impact)
>   5. increases space more slowly (the only negative - can you live with
>      it?)
>
>
> Highly Recommended Resources:
>
>   http://blogs.sun.com/relling/entry/zfs_raid_recommendations_space_performance
>   http://blogs.sun.com/relling/entry/raid_recommendations_space_vs_mttdl
>
>
> Hope that helps,
>
> Kent
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
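
For anyone who wants to check the space and iops columns in Kent's two tables, here is a rough Python sketch. It assumes only what the post states (500 GB disks, 7,200 rpm, 8.5 ms average read seek) plus two modelling assumptions of mine: per-disk iops is taken as 1 / (seek + half a rotation), and a raidz set is treated as delivering one disk's worth of small random reads while every disk in a mirror pool can serve reads. The function names are made up for illustration. The MTTDL helper only gives ratios, using the common single-parity approximation MTTDL ~ MTBF^2 / (N * (N-1) * MTTR); the post does not state an MTTR, so no absolute figures are computed. Those ratios do line up with the raidz rows (760.42 : 228.12 : 63.37 is about 1 : 0.3 : 1/12).

```python
# Figures stated in the post.
DISK_GB = 500
RPM = 7200
AVG_SEEK_MS = 8.5


def disk_iops(rpm=RPM, avg_seek_ms=AVG_SEEK_MS):
    """Small random reads per second for one disk: 1 / (seek + half a rotation)."""
    half_rotation_ms = 60_000 / rpm / 2  # average rotational latency in ms
    return 1000 / (avg_seek_ms + half_rotation_ms)


def raidz_row(disks, parity=1):
    """(usable GB, iops) for one raidz set, modelled as a single disk's iops."""
    return (disks - parity) * DISK_GB, round(disk_iops())


def mirror_row(disks):
    """(usable GB, read iops) for a pool of 2-disk mirrors; all disks serve reads."""
    return (disks // 2) * DISK_GB, disks * round(disk_iops())


def raidz1_mttdl_relative(disks, baseline=3):
    """MTTDL of an N-disk raidz1 relative to a baseline set.

    Assumes MTTDL ~ MTBF^2 / (N * (N-1) * MTTR), so MTBF and MTTR cancel out.
    """
    return (baseline * (baseline - 1)) / (disks * (disks - 1))


if __name__ == "__main__":
    for n in (3, 5, 9):
        space, iops = raidz_row(n)
        rel = raidz1_mttdl_relative(n)
        print(f"raidz  {n} disks: space={space} GB, iops={iops}, mttdl x{rel:.3f}")
    for n in (2, 4, 6, 8):
        space, iops = mirror_row(n)
        print(f"mirror {n} disks: space={space} GB, iops={iops}")
```

Because only ratios are computed, swapping in Richard's 114-year MTBF would change the absolute MTTDL values but not the relative comparison between raidz sizes.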