Re: [zfs-discuss] ZFS configuration for VMware

Miles Nordin Sat, 28 Jun 2008 09:35:13 -0700

>>>>> "et" == Erik Trimble <[EMAIL PROTECTED]> writes:


    et> SSD used to refer strictly to standard DRAM backed with a
    et> battery (and, maybe some sort of a fancy enclosure with a hard
    et> drive to write all DRAM data to after a power outage).

    et> * 3.5" LP disk form factor, SCSI hotswap/SATA2 and 4-8 Compact
    et> Flash card slots

I'm suspicious of things that are supposed to protect you during
extreme circumstances, which are themselves complicated and weird.  I
wonder if they are really helping, or if people use them like witches'
incantations, and believe the weird things work because along with the
tens of thousands of dollars they spend on weird things, they also
take care to never encounter the extreme circumstances.  If they do
encounter said circumstances, they think they were ``asking for it'',
not that their <weird complicated device> failed to do what it
promised.  so, this thing with battery sockets and DRAM sockets and
probably also lights and buttons doesn't inspire in me the same
confidence as a hard disk which is sealed and never had any buttons,
lights, or sockets.  but whatever, maybe I'm turning into a grouchy old
man.

One thing I do like about the iram is that it attaches like a disk and
can be moved from one machine to another along with the rest of the
disks that make up the array.  I don't know how the storage vendors
are doing this---do they use their batteries to power the disks
themselves for a few seconds?---but it seems kind of mickey-mouse
that, while normally you can move all your ``hot swap'' disks from one
enclosure to another, if the enclosure breaks, sometimes the enclosure
itself is ``dirty''.  Maybe you have some way of powering it down
habitually that leaves the NVRAM dirty, and you never realize you're
doing this because the NVRAM works well.  One day, you plan some
``downtime'' to ``migrate'' the disks to a newer stepping of
enclosure, which corrupts the array because you've separated the disks
from the NVRAM.  Instead of blaming the shitty NVRAM architecture, you
decide that since you were ``asking for it'' by moving so much stuff
around, you should have asked for a half day maintenance window to do
a full backup right before moving the disks, and the imbecile who
stuffed this RAM into the guts of the beast and turned it into a
bulleted marketing point without so much as a red LED to warn it
wasn't empty never gets the blame he deserves.

I guess these days you would want battery-backed RAM, which gets
copied onto FLASH after a power outage.

I think the feature should go right on the motherboard.  The battery
should back up all DRAM in the system (or all DRAM attached to
processor 0 in big systems, or something like that).  but it should
only supply ~60min of power.

There's also a microcontroller and a bunch of CF under battery power.
During normal operation, the kernel registers chunks of physical
memory with the microcontroller that need to be nonvolatile.

During a power outage, and also during a weekly test, the battery
supplies power to DRAM, CF cards, and the small microcontroller.  The
microcontroller copies the requested chunks of DRAM into CF.  During
the weekly test, the microcontroller deliberately does two copies, to
make sure the battery has some extra capacity.  And the kernel
immediately ``scrubs'' the CF to make sure it's really working.

During a normal shutdown, the kernel deregisters all chunks with the
microcontroller so the microcontroller knows it can leave the CF card
empty and power down.
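
A minimal sketch of the register/dump/deregister cycle described above,
in Python.  Everything in it is hypothetical: the class and method names
are made up for illustration, DRAM is modelled as a bytearray, the CF
card as a list, and a CRC32 check stands in for whatever the kernel's
``scrub'' would really do.

```python
# Hypothetical model of the proposed microcontroller.  None of these names
# correspond to real hardware, a real driver, or anything in ZFS.
import zlib


class NvMicrocontroller:
    def __init__(self):
        self.regions = {}   # region id -> (offset, length) within DRAM
        self.cf_card = []   # dumped records: (region id, data, crc32)

    def register(self, region_id, offset, length):
        """Kernel marks a chunk of physical memory as needing to be nonvolatile."""
        self.regions[region_id] = (offset, length)

    def deregister_all(self):
        """Clean shutdown: nothing needs saving, so the card is left empty."""
        self.regions.clear()
        self.cf_card.clear()

    def dump(self, dram, copies=1):
        """Runs on battery power: copy every registered chunk of DRAM to CF.

        copies=2 models the weekly test, which deliberately does the copy
        twice to show the battery has some extra capacity.
        """
        for _ in range(copies):
            self.cf_card = []
            for rid, (off, n) in sorted(self.regions.items()):
                data = bytes(dram[off:off + n])
                self.cf_card.append((rid, data, zlib.crc32(data)))

    def scrub(self):
        """Kernel-side check that each dumped record still matches its checksum."""
        return all(zlib.crc32(data) == crc for _, data, crc in self.cf_card)


# Weekly test: register a region, dump twice, scrub, then shut down cleanly.
dram = bytearray(range(32))
mc = NvMicrocontroller()
mc.register("zil", 4, 8)
mc.dump(dram, copies=2)
scrub_ok = mc.scrub()
mc.deregister_all()
```

The point of deregister_all() emptying the card is that an empty card
after a clean shutdown means the disks can leave without it.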

 * something will have to simulate the load of the DRAM on the battery
   during the weekly test.  You can't just switch the DRAM's power
   source to the battery during the test, because what if the test
   fails?

 * next to each CF slot is an orange/green light.  Whenever the
   microcontroller has power, especially while the machine is off, it
   makes the light orange if the CF card has data on it, green if it's
   empty, and dark if the battery test has failed.

   if the light's green, then the CF card is safe to consider part of
   the motherboard, not part of the array, and the array can be moved
   to another system with no NV-CF-RAM option.

   if the light is orange, then the CF card is part of the array and
   must be moved along with all the other disks.

 * on boot, the kernel reads the log directly from the CF card.  The
   kernel can read the log from CF cards in USB-to-CF adapters, too,
   so whatever format the microcontroller uses for dumps needs to be
   respected by ZFS as at least an acceptable read-only ZIL format for
   disk devices.

 * if the CF card has data on it which the kernel has seen, but
   refused to assimilate (a log for a ZFS pool which isn't ONLINE),
   then the light turns Red to indicate the NV-CF-RAM feature is
   disabled, and it might not be safe to throw away the card.  so,
   during a normal dirty-shutdown bootup, the light is:

    + powered off after clean shutdown: Green

    + running normally, with DRAM regions registered: Orange

      here, the card doesn't actually contain any data.  but it can't
      be removed.

    + cord yanked: (?) for a few seconds during the DRAM dump

    + while the power is off after cord yanking: orange

      this is actual-orange.  The card has a ZIL on it.

    + after the kernel boots up and attaches the microcontroller
      driver: red (for a few seconds)

    + after all the ZFS pools are probed, and the dirty pool attached: Orange

      the card has no ZIL on it, but it can't be removed because DRAM
      regions are registered with the microcontroller again.

 * In Red condition, when the kernel tries to register a region of
   memory with the microcontroller, the microcontroller refuses,
   causing ZFS to proceed safely with no log (not blindly with a
   silently volatile log).

 * there are multiple CF slots, so the microcontroller can be told to
   make mirrored dumps.

 * some day the other slots can be used for other purposes.  For
   example, systems where CPR is working could auto-hibernate when
   their cords are pulled.
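
The light rules in the list above can be written down as one small
function.  This is only a sketch in Python with made-up names.  Two
things in it are my assumptions, not part of the proposal: that a
failed battery test (dark) wins over every other color, and that Red
wins over Orange.  The color during the few seconds of the DRAM dump
is left open above, so it isn't modelled here.

```python
# Hypothetical model of the per-slot light and the Red-condition refusal.
# All names are invented for illustration.
from dataclasses import dataclass


@dataclass
class CardState:
    battery_ok: bool = True           # last battery test passed
    card_has_data: bool = False       # a dump (a ZIL) is sitting on the card
    regions_registered: bool = False  # kernel has DRAM regions registered
    unassimilated_log: bool = False   # kernel saw the data but did not replay it


def led_color(s):
    if not s.battery_ok:
        return "dark"      # assumption: battery failure overrides the rest
    if s.unassimilated_log:
        return "red"       # feature disabled; card may not be safe to discard
    if s.card_has_data or s.regions_registered:
        return "orange"    # card belongs with the array and must not be removed
    return "green"         # card is empty and is just part of the motherboard


def try_register(s):
    """Refuse in Red condition, so the caller proceeds with no log at all
    instead of with a silently volatile one."""
    if led_color(s) == "red":
        return False
    s.regions_registered = True
    return True


# Walk through the cord-yank sequence from the list above.
off_clean = led_color(CardState())
running = led_color(CardState(regions_registered=True))
off_dirty = led_color(CardState(card_has_data=True))
booting = led_color(CardState(card_has_data=True, unassimilated_log=True))
attached = led_color(CardState(regions_registered=True))
```

Only the Red refusal is taken from the list; whether registration
should also be refused when the light is dark is not covered there,
and the sketch does not decide it.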


_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
