Re: [zfs-discuss] New ZFS Intent Log (ZIL) device available - Beta program now open!

Miles Nordin Thu, 14 Jan 2010 12:35:29 -0800

>>>>> "cg" == Christopher George <cgeo...@ddrdrive.com> writes:


    cg> I agree, it would be very informative if RAID HBA vendors
    cg> would publish failure statistics of their Li-Ion based BBU
    cg> products.

If they haven't, then on what are you basing your decision *not* to
use one?  Just the random thought that they might fail?

    cg> inflexible proprietary nature of Li-Ion

You can get complete systems with charging microcontroller and battery
without any undue encumbrances I can detect on sparkfun.com.  What's
``proprietary'' mean in this context?

    cg> the ignition risk, thermal wear-out, and the inflexible
    cg> proprietary nature of Li-Ion solutions simply outweigh the
    cg> benefits of internal or all inclusive mounting for enterprise
    cg> bound NVRAM.

well...for *HOME* use based on the failure modes I've observed I'd
prefer to keep the battery next to the SDRAM like ACARD and LSI do.

for the enterprise, someone should warn netapp/hitachi/emc/storagetek
who are presumably Li-Ion based nvram users.

One thing on which I can agree: if the vendor has used Li-Ion it's hard
to tell if the implementation is proper, e.g. whether it will warn of an
aged battery without enough capacity.  For slog, IMHO the ideal
behavior would be:

 1. weekly test-flushes to CF or USBstick or whatever is the NAND
    backing-store

 2. the device should shut itself off, as if SATA cable were pulled,
    or in some other way ZFS detects instantly, if the battery's not
    got capacity left after the test flush completes.  One way would
    be to require *two* consecutive successful test flushes each week.

 3. there should be a button you can press to simulate the
    battery-failure-powerdown behavior, so you can test that ZFS and
    your controller respond properly.

 4. ``redundant'' power should mean the device has (1) power from
    host, and (2) enough stored energy in the battery to do two
    consecutive flushes.  Whenever the device does not have
    ``redundant'' power, it should:

    a. disable itself as in (3)

    b. flush SDRAM to NAND.

    This means, if the device's battery is exhausted, the system may
    boot with the device disconnected.  The host will have to support
    hotplug so the slog can come back after the battery charges.

so, (2) is really a special case of (4).  
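
A minimal sketch of rules (3) and (4) as a toy state model, to make the
"redundant power" definition concrete.  Everything here is hypothetical:
the class name, the fields, and the arbitrary energy units are
illustration only and do not describe any real card's firmware.

```python
from dataclasses import dataclass


@dataclass
class SlogDevice:
    flush_cost: float      # energy one SDRAM-to-NAND flush needs (arbitrary units)
    battery: float         # energy currently stored in the battery
    host_power: bool = True
    online: bool = True    # False looks like a pulled SATA cable to the host
    dirty: bool = False    # True while SDRAM holds data not yet in NAND

    def has_redundant_power(self) -> bool:
        # Rule 4: host power AND enough battery for two consecutive flushes.
        return self.host_power and self.battery >= 2 * self.flush_cost

    def flush_to_nand(self) -> bool:
        # Assumption: on host power the flush costs no battery energy.
        if not self.host_power:
            if self.battery < self.flush_cost:
                return False          # not enough energy: SDRAM contents lost
            self.battery -= self.flush_cost
        self.dirty = False
        return True

    def enforce(self) -> None:
        if self.has_redundant_power():
            self.online = True        # hotplug back in once power is redundant
            return
        self.online = False           # rule 4a: disappear from the host
        if self.dirty:
            self.flush_to_nand()      # rule 4b: get the data into NAND now

    def press_test_button(self) -> None:
        # Rule 3: simulate the battery-failure power-down on demand.
        self.online = False
        if self.dirty:
            self.flush_to_nand()


# Aged battery: host power is present but only 1.5 flushes of energy remain.
dev = SlogDevice(flush_cost=1.0, battery=1.5, dirty=True)
dev.enforce()
print(dev.online, dev.dirty)

# After the battery recharges, the device reappears via hotplug.
dev.battery = 3.0
dev.enforce()
print(dev.online)
```

The point of the model is that the device goes offline while it can
still afford a flush, rather than discovering the shortfall during a
real power loss.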

and AIUI Li-Ion will last longer if you don't charge it to 100%.
laptops usually want 100% because they compete on mAh/kg at initial
purchase, but for this application charging to 70% should be fine,
which from what I've heard will make them last a lot longer before
crystallizing.

    cg> can detect not only a disconnect but any loss of power.  In
    cg> all cases, the card throws an interrupt so that the device
    cg> driver (and ultimately user space) can be immediately
    cg> notified.

We need to look at the overall system, though.  Does a ZFS system
using the card disable the slog when this happens?  or does it just
print a warning in dmesg and do nothing?

When you're using an LSI BBU, the disks behind the controller have
their write cache disabled.  so, if you evil-tune ZFS to skip issuing
SYNC CACHE, but then the BBU dies and becomes write-through, the
overall system is still safe (albeit slow).
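
The reasoning above can be written out as a small truth table.  This is
a sketch under assumed semantics, not a description of LSI firmware: the
function name and its three parameters are made up for illustration, and
it assumes a SYNC CACHE from the host really reaches the disks.

```python
from itertools import product


def acked_write_survives_power_loss(bbu_healthy: bool,
                                    disk_write_cache: bool,
                                    host_sends_sync_cache: bool) -> bool:
    # Assumed model of the controller, for illustration only.
    if not disk_write_cache:
        # Healthy BBU: the write sits in battery-backed controller RAM.
        # Dead BBU: the controller is write-through, so the ack means the
        # write reached a disk that has no volatile cache of its own.
        return True
    # Disk cache on: only an explicit SYNC CACHE pushes data to stable media.
    return host_sends_sync_cache


for bbu, cache, sync in product([True, False], repeat=3):
    mode = "write-back" if bbu else "write-through"
    safe = acked_write_survives_power_loss(bbu, cache, sync)
    print(f"{mode:13} disk_cache={cache!s:5} sync_cache={sync!s:5} safe={safe}")
```

With the disk cache off, BBU health only changes the speed (write-back
vs. write-through), never the safety, which is the property in question
for the slog card.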

Also what you describe still doesn't seem to detect the failure case
you brought up yourself, of a worn-out battery.  UPSes do test their
batteries, but ones with worn-out batteries enter bypass mode, they
don't turn themselves off, which seems to be the only way your card
would have to hear a warning.

    cg> attaching/detaching the external power cable has no effect on
    cg> data integrity as long as the host is powered on.

In other words, as long as you don't trip over both cables at once.  :(

Does the device partially obey my (4) and immediately flush to NAND
when the host is powered off?  or does it keep the data in SDRAM only
for as long as possible, until told to do otherwise by ``the user'' or
something?


