Good timing, I'd like some feedback for some work I'm doing below...
Matt B wrote:
I am trying to determine the best way to move forward with about 35 x86 X4200s. Each box has 4x 73GB internal drives.
Cool. Nice box.
All the boxes will be built using Solaris 10 11/06. Additionally, these boxes are part of a highly available production environment with an uptime expectation of 6 9's (just a few seconds per month of unscheduled downtime allowed).
Just for my curiosity, how do you measure 6 9's?
Ideally, I would like to use a single RAID-Z2 pool of all 4 disks, but apparently that is not supported yet. I understand there is the ZFSmount software for making a ZFS root, but I don't think I want to use that in an environment of this grade, and I can't wait until Sun comes out with it integrated later this year... I have to use 11/06.
It is supported to use 4 disks in a pool, but it isn't yet supported to use ZFS for the root file system. So, you'll end up mixing UFS and ZFS on the same disk, as a likely option.
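As a rough illustration of that mixed layout (the device names, slice numbers, and sizes below are hypothetical examples, not from the thread), each disk would be sliced with format(1M) so that UFS root and swap take their slices and ZFS gets the remainder:

```shell
# Hypothetical layout -- c0t0d0/c0t1d0 and the slice numbers are
# placeholders. Assume format(1M) has already carved each disk into:
#   s0 = UFS root, s1 = swap, s7 = remainder for ZFS

# UFS root file system on slice 0
newfs /dev/rdsk/c0t0d0s0

# ZFS pool built from the leftover slice on each disk
zpool create tank mirror c0t0d0s7 c0t1d0s7
```

Note that giving ZFS a slice instead of a whole disk means ZFS won't enable the disk's write cache for you, which is one of the costs of sharing disks with UFS.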
For perspective, these systems are currently running pure UFS, with only 2 of the 4 disks in use as a software RAID 1 mirror:

/     = 5GB
/var  = 5GB
/tmp  = 4GB
/home = 2GB
/data = 50GB
I'm not a fan of a separate /var; it just complicates things. I'll also presume that by "/tmp" you really mean "swap".
I am looking for recommendations on how to maximize the use of ZFS and minimize the use of UFS without resorting to anything "experimental".
Put / in UFS, swap as raw, and everything else in a zpool. I would mirror /+swap on two disks, with the other two disks used as a LiveUpgrade alternate boot environment. When you patch or upgrade, you will normally have better availability (shorter planned outages) with LiveUpgrade. Also, you'll be able to roll back to the previous boot environment, potentially saving more time.
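A minimal sketch of that LiveUpgrade flow (the boot environment name, device, and media path are made-up examples):

```shell
# Hedged sketch -- "be1", c0t2d0s0, and the media path are placeholders.

# Create an alternate boot environment on one of the second pair of disks
lucreate -n be1 -m /:/dev/dsk/c0t2d0s0:ufs

# Patch or upgrade the inactive BE while production keeps running
luupgrade -u -n be1 -s /cdrom/cdrom0

# Activate the new BE; the only planned outage is the reboot itself
luactivate be1
init 6

# If the upgrade misbehaves, activate the old BE again to roll back
```

The availability win is that all the upgrade work happens on the inactive environment, so the planned outage shrinks to a single reboot.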
So assuming that each 73GB disk yields 70GB of usable space... Would it make sense to create a UFS root partition of 5GB that is a 4-way mirror across all 4 disks? I haven't used SVM to create these types of mirrors before, so if anyone has any experience here let me know. My expectation is that up to any 3 of the 4 disks could fail while leaving the root partition intact. Basically, every time root data is updated, that write would be replicated to each of the other three disks.
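For reference, an SVM n-way root mirror along those lines would look roughly like this (slice and metadevice names are illustrative only):

```shell
# Hedged sketch of a 4-way SVM root mirror -- d10..d14 and the
# cXtYd0s0 slices are placeholder names.

# One single-slice submirror per disk
metainit -f d11 1 1 c0t0d0s0
metainit d12 1 1 c0t1d0s0
metainit d13 1 1 c0t2d0s0
metainit d14 1 1 c0t3d0s0

# Build the mirror from the first submirror and make it the root device
metainit d10 -m d11
metaroot d10

# After the post-metaroot reboot, attach the rest to get a 4-way mirror
metattach d10 d12
metattach d10 d13
metattach d10 d14
```

Each metattach triggers a full resync of that submirror, so attach them one at a time.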
I don't see any practical gain for a 4-way mirror over a 3-way mirror. With such configs you are much more likely to see some other fault which will ruin your day (e.g. an accidental rm).
So this would leave each disk with 68GB of free space. I would then create a 4GB UFS /tmp (swap) partition that would be 4-way mirrored across the remaining space on all 4 disks, just as I am suggesting above with the root partition. So again, up to any 3 disks could fail and the swap filesystem would still be intact.

This would leave each disk with 64GB of free space, totaling 256GB. I would then create a single ZFS pool of all the remaining free space on each of the 4 disks. How should this be done? Perhaps some form of mirroring? What would be the difference between:

zpool create tank mirror c1d0 c2d0 c3d0 c4d0
and
zpool create tank mirror c1d0 c2d0 mirror c3d0 c4d0

Would it be better to use RAID-Z with a hot spare, or RAID-Z2?

I would like /data, /home, and /var to be able to grow as needed and be able to withstand at least 2 disk failures (doesn't have to be any 2). I am open to using a hot spare. Suggestions?
Prioritize your requirements. Then take a look at the attached spreadsheet. The spreadsheet contains a report from RAIDoptimizer for the type of disk you'll likely have, based upon the disk vendor's data sheet (Seagate Savvio). The algorithms are described in my blog, http://blogs.sun.com/relling and an enterprising person could key them into a spreadsheet.

There are 4 main portions of the data:
+ configuration info: raid type, set size, spares, available space
+ mean time to data loss (MTTDL) info: for two different MTTDL models
+ performance info: random read iops and media bandwidths
+ mean time between services (MTBS) info: how often you expect to repair something

I'm particularly interested in feedback on MTBS. The various MTBS models consider the immediate effect of having a bunch of disks, and the deferred repair strategies of waiting until you have to replace a disk, based upon the RAID config and spares. In any case, a higher MTBS is better, though there is more risk for each MTBS model. Let me know if this is helpful.

As Tomas said, you could look at some of this data in graphical form on my blog, though those graphs assume 46 disks instead of 4. For 4 disks, you have far fewer possible combinations, so it fits reasonably in a spreadsheet.

Caveat: the numbers are computed by algorithms, and the code has not yet been verified to properly implement those algorithms. Models are simplifications of real life; don't expect real life to follow a model. If you do follow models, then note that Elizabeth Hurley is off the market :-)
 -- richard
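To make the tradeoffs concrete, here are the candidate 4-disk pool layouts side by side (device names are the placeholders from the question; the space and failure-tolerance notes are from the standard ZFS redundancy rules, not the spreadsheet):

```shell
# (a) one 4-way mirror: survives any 3 disk failures, 1 disk of usable space
zpool create tank mirror c1d0 c2d0 c3d0 c4d0

# (b) two 2-way mirrors, striped: survives 1 failure per mirror pair,
#     2 disks of usable space, best random-read performance
zpool create tank mirror c1d0 c2d0 mirror c3d0 c4d0

# (c) raidz2: survives any 2 disk failures, 2 disks of usable space
zpool create tank raidz2 c1d0 c2d0 c3d0 c4d0

# (d) 3-disk raidz plus a hot spare: survives 1 failure at a time;
#     the spare resilvers in automatically to restore redundancy
zpool create tank raidz c1d0 c2d0 c3d0 spare c4d0
```

Note that (a) and (b) differ only in how the four disks are grouped: the first command makes one vdev with 4-way redundancy, while the second stripes across two independent mirror pairs.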
for_matt.ods
Description: application/vnd.oasis.opendocument.spreadsheet
_______________________________________________ zfs-discuss mailing list [email protected] http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
