On Mon, 29 Jan 2007, Toby Thain wrote:

> Hi,
>
> This is not exactly ZFS specific, but this still seems like a
> fruitful place to ask.
>
> It occurred to me today that hot spares could sit in standby (spun
> down) until needed (I know ATA can do this, I'm supposing SCSI does
> too, but I haven't looked at a spec recently). Does anybody do this?
> Or does everybody do this already?

I don't work with enough disk storage systems to know what is the industry
norm.  But there are 3 broad categories of disk drive spares:

a) Cold Spare.  A spare where the power is not connected until it is
required.  [1]

b) Warm Spare.  A spare that is active but placed into a low power mode,
or into a "low mechanical wear and tear" mode.  In the case of a disk
drive, the controller board is active but the HDA (Head Disk Assembly) is
inactive (platters are stationary, heads unloaded [if the heads are
physically unloaded]); it has power applied and can be made "hot" by a
command over its data/command (bus) connection.  The supervisory
hardware/software/firmware "knows" how long it *should* take the drive to
go from warm to hot.

c) Hot Spare.  A spare that is spun up and ready to accept
read/write/position (etc) requests.

> Does the tub curve (chance of early life failure) imply that hot
> spares should be burned in, instead of sitting there doing nothing
> from new? Just like a data disk, seems to me you'd want to know if a
> hot spare fails while waiting to be swapped in. Do they get tested
> periodically?

The ideal scenario, as you already allude to, would be for the disk
subsystem to initially configure the drive as a hot spare and send it
periodic "test" events for, say, the first 48 hours.  This would get it
past the first segment of the "bathtub" reliability curve - often referred
to as the "infant mortality" phase.  After that, (ideally) it would be
placed into "warm standby" mode and it would be periodically tested (once
a month??).
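
As a rough sketch of that schedule (not something any disk subsystem is
known to implement as written), the burn-in and periodic test timing could
be modelled like this.  The names spare_mode and next_test_times are made
up for illustration, and the 48 hour and 30 day figures are just the
guesses from the paragraph above:

```python
from datetime import datetime, timedelta

# Assumed values, taken from the rough figures in the text above.
BURN_IN = timedelta(hours=48)       # "say, the first 48 hours"
TEST_INTERVAL = timedelta(days=30)  # "once a month??"

def spare_mode(installed, now):
    """Return 'hot' while the spare is still in burn-in, 'warm' afterwards."""
    return "hot" if now - installed < BURN_IN else "warm"

def next_test_times(installed, count):
    """Return the first `count` periodic test times after burn-in ends."""
    burn_in_end = installed + BURN_IN
    return [burn_in_end + TEST_INTERVAL * (i + 1) for i in range(count)]

if __name__ == "__main__":
    installed = datetime(2007, 1, 29)
    print(spare_mode(installed, installed + timedelta(hours=12)))
    for when in next_test_times(installed, 3):
        print("test spare at", when)
```

A real implementation would also have to spin the drive up for each test
and spin it down again afterwards; that part is left out here.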

If saving power were the highest priority, then the ideal situation would
be where the disk subsystem could apply/remove power to the spare and move
it from warm to cold upon command.

One "trick" with disk subsystems like ZFS, which have yet to have FMA-type
functionality added and which (today) provide for hot spares only, is to
initially configure a pool with one (hot) spare, and then add a 2nd hot
spare, based on installing a brand new device, say, 12 months later.  And
another spare 12 months later.  What you are trying to achieve, with this
strategy, is to avoid the scenario whereby mechanical systems, like disk
drives, tend to "wear out" within the same general, relatively short,
timeframe.
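
To illustrate the effect of staggering, here is a minimal sketch that
lists how old each spare is at a given point in time, assuming one new
spare every 12 months as described above.  The function name spare_ages
and its parameters are hypothetical:

```python
def spare_ages(now_month, interval=12, count=3):
    """Ages (in months) of spares installed at months 0, interval, 2*interval, ...

    Only spares that have already been installed by `now_month` are listed.
    """
    install_months = [i * interval for i in range(count)]
    return [now_month - m for m in install_months if m <= now_month]

if __name__ == "__main__":
    for month in (0, 12, 24, 36):
        print("month", month, "spare ages:", spare_ages(month))
```

At any point after the second purchase the spares differ in age by at
least 12 months, which is the whole point of the strategy.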

One (obvious) issue with this strategy is that it may be impossible to
purchase the same disk drive 12 and 24 months later.  However, it's always
possible to purchase a larger disk drive and simply accept that the extra
space provided by the newer drive will be wasted.

[1] The most common example is a disk drive mounted on a carrier but not
seated within the disk drive enclosure.  Simply "push in" when required.

Off Topic: To go off on a tangent - the same strategy applies to a UPS
(Uninterruptible Power Supply).  As per the following time line:

year 0: purchase the UPS and one battery cabinet
year 1: purchase and attach an additional battery cabinet
year 2: purchase and attach an additional battery cabinet
year 3: purchase and attach an additional battery cabinet
year 4: purchase and attach an additional battery cabinet and remove the
oldest battery cabinet
year 5 ... N: repeat year 4's scenario until it's time to replace the UPS.

The advantage of this scenario is that you can budget a *fixed* cost for
the UPS and your management understands that there is a recurring cost so
that, when the power fails, your UPS will have working batteries!!
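
The UPS time line above can be sketched in a few lines.  This is only an
illustration of the rotation; the function name cabinets_in_service is
made up, and the limit of 4 cabinets is inferred from the time line (the
oldest cabinet is first removed in year 4):

```python
def cabinets_in_service(year, max_cabinets=4):
    """Purchase years of the battery cabinets attached at the end of `year`.

    One cabinet is bought every year starting in year 0.  Once more than
    `max_cabinets` would be attached, the oldest one is removed.
    """
    oldest = max(0, year - max_cabinets + 1)
    return list(range(oldest, year + 1))

if __name__ == "__main__":
    for year in range(7):
        bought = cabinets_in_service(year)
        ages = [year - b for b in bought]
        print("year", year, "cabinets bought in years", bought, "ages", ages)
```

Every year has exactly one purchase, which is where the fixed cost comes
from, and from year 3 onwards the oldest cabinet in service is never more
than 3 years old.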

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
           Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
             OpenSolaris Governing Board (OGB) Member - Feb 2006
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
