Good timing; I'd like some feedback on some work I'm doing below...

Matt B wrote:
I am trying to determine the best way to move forward with about 35 x86 X4200s.
Each box has 4x 73GB internal drives.

Cool. Nice box.

All the boxes will be built using Solaris 10 11/06. Additionally, these boxes are part of a highly available production environment with an uptime expectation of 6 9's (just a few seconds of unscheduled downtime allowed per month).

Just out of curiosity, how do you measure 6 9's?

Ideally, I would like to use a single RAIDZ2 pool of all 4 disks, but apparently that is not supported yet. I understand there is the ZFSmount software for making a ZFS root, but I don't think I want to use that for an environment of this grade, and I can't wait until Sun comes out with it integrated later this year; I have to use 11/06.

Using 4 disks in a pool is supported, but using ZFS for the root file system isn't
yet.  So, as a likely option, you'll end up mixing UFS and ZFS on the same disks.
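
A per-disk slice layout along these lines is what I have in mind (sizes are purely
illustrative, roughly matching what you describe below):

        s0   ~5GB   UFS /   (SVM mirrored, bootable)
        s1   ~4GB   swap    (raw)
        s7   rest   ZFS pool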

For perspective, these systems are currently running pure UFS, with only 2 of the 4 disks being used in a software RAID 1:
/ = 5GB
/var = 5GB
/tmp = 4GB
/home = 2GB
/data = 50GB

I'm not a fan of a separate /var; it just complicates things.  I'll also presume
that by "/tmp" you really mean "swap".

I am looking for recommendations on how to maximize the use of ZFS and minimize the use of UFS without resorting to anything "experimental".

Put / in UFS, swap as raw, and everything else in a zpool.
I would mirror /+swap on two disks, with the other two disks used as a LiveUpgrade
alternate boot environment.  When you patch or upgrade, you will normally have
better availability (shorter planned outages) with LiveUpgrade.  Also, you'll be
able to roll back to the previous boot environment, potentially saving more time.
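
A rough sketch of what I mean; the metadevice, BE, and device names here are made
up, so adjust them to your layout:

# UFS root on an SVM mirror (d10) across disks 0 and 1, swap on the raw s1 slices.
# With a second SVM mirror (d20) built across disks 2 and 3, the alternate boot
# environment is just:
lucreate -n altBE -m /:/dev/md/dsk/d20:ufs
# Patch or upgrade altBE with luupgrade, then luactivate altBE and reboot;
# LiveUpgrade lets you fall back to the old BE if something goes wrong.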

So, assuming that each 73GB disk yields 70GB usable space...
Would it make sense to create a UFS root partition of 5GB that is a 4-way mirror
across all 4 disks? I haven't used SVM to create these types of mirrors before, so
if anyone has any experience here, let me know. My expectation is that up to any 3 of the 4 disks could fail while leaving the root partition intact. Basically, every time root has data updated, that data would also be written to each of the other three disks.

I don't see any practical gain for a 4-way mirror over a 3-way mirror.  With such
configs you are much more likely to see some other fault which will ruin your
day (e.g. an accidental rm).
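
In case it helps, the SVM part is straightforward.  A sketch for a 3-way root
mirror, with slice and metadevice names that are only examples:

metainit -f d11 1 1 c0t0d0s0   # submirror on the current boot slice (-f because it's mounted)
metainit d12 1 1 c0t1d0s0      # second submirror
metainit d13 1 1 c0t2d0s0      # third submirror
metainit d10 -m d11            # create the mirror, one-way to start
metaroot d10                   # updates /etc/vfstab and /etc/system; reboot here
metattach d10 d12              # then attach the other submirrors and let them sync
metattach d10 d13
# On x86 you'll also want installgrub on each of the extra boot slices so that
# any of them can actually boot the box.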

So this would leave each disk with 68GB of free space. I would then create a 4GB UFS /tmp (swap) partition that would be 4-way mirrored across all 4 disks, just as I am suggesting above for the root partition. So again, up to any 3 disks could fail and swap would still be intact.

This would leave each disk with 64GB of free space, totaling 256GB. I would then create a single ZFS pool of all the remaining free space on each of the 4 disks. How should this be done?
Perhaps a form of mirroring? What would be the difference in doing:
zpool create tank mirror c1d0 c2d0 c3d0 c4d0
or
zpool create tank mirror c1d0 c2d0 mirror c3d0 c4d0

Would it be better to use RAIDZ with a hot spare, or RAIDZ2?

I would like /data, /home, and /var to be able to grow as needed and be able to withstand at least 2 disk failures (it doesn't have to be any 2). I am open to using a hot spare.

Suggestions?
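
To the zpool question first: the two commands you list build quite different
pools.  Roughly, keeping your device names (in practice these would be the
leftover s7 slices) and assuming ~64GB of pool space per disk:

zpool create tank mirror c1d0 c2d0 c3d0 c4d0          # one 4-way mirror: ~64GB usable, survives any 3 failures
zpool create tank mirror c1d0 c2d0 mirror c3d0 c4d0   # two 2-way mirrors, striped: ~128GB usable, survives 1 failure per pair
zpool create tank raidz2 c1d0 c2d0 c3d0 c4d0          # raidz2: ~128GB usable, survives any 2 failures
zpool create tank raidz c1d0 c2d0 c3d0 spare c4d0     # raidz + hot spare: ~128GB usable, survives 1 failure, then resilvers onto the spare

Note that of these, only the 4-way mirror and raidz2 survive *any* two failures.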

Prioritize your requirements.  Then take a look at the attached spreadsheet.
What the spreadsheet contains is a report from RAIDoptimizer for the type of
disk you'll likely have, based upon the disk vendor's data sheet (Seagate
Savvio).  The algorithms are described in my blog, http://blogs.sun.com/relling,
and an enterprising person could key them into a spreadsheet.

There are 4 main portions of the data:
        + configuration info: raid type, set size, spares, available space
        + mean time to data loss (MTTDL) info: for two different MTTDL models
          (a rough sketch of the simplest such model follows below)
        + performance info: random read iops and media bandwidths
        + mean time between services (MTBS) info: how often do you expect to
          repair something
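
For reference, the simplest textbook single-parity MTTDL model (not necessarily
exactly what the spreadsheet computes, so treat this as a sketch) is:

        MTTDL ~= MTBF^2 / (N * (N-1) * MTTR)

where N is the number of disks in the set and MTTR is the time it takes to
notice, replace, and resync a failed disk.  The double-parity version gains
roughly another factor of MTBF/MTTR.  The blog posts have the full derivations.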

I'm particularly interested in feedback on MTBS.  The various MTBS models consider
the immediate effect of having a bunch of disks, and the deferred repair strategies
of waiting until you have to replace a disk, based upon the RAID config and spares.
In any case, a higher MTBS is better, though there is more risk for each MTBS model.
Let me know if this is helpful.

As Tomas said, you could look at some of this data in graphical form on my blog,
though those graphs assume 46 disks instead of 4.  For 4 disks, you have far fewer
possible combinations, so it fits reasonably in a spreadsheet.

Caveat: the numbers are computed by algorithms, and the code has not yet been
verified to properly implement those algorithms.  Models are simplifications of
real life, so don't expect real life to follow a model.  If you do follow models,
then note that Elizabeth Hurley is off the market :-)
 -- richard

Attachment: for_matt.ods
