>>>>> "et" == Erik Trimble <[EMAIL PROTECTED]> writes:
    et> SSD used to refer strictly to standard DRAM backed with a
    et> battery (and, maybe some sort of a fancy enclosure with a hard
    et> drive to write all DRAM data to after a power outage).

    et> * 3.5" LP disk form factor, SCSI hotswap/SATA2 and 4-8 Compact
    et> Flash card slots

I'm suspicious of things that are supposed to protect you during
extreme circumstances, which are themselves complicated and weird.  I
wonder if they are really helping, or if people use them like witches'
incantations, and believe the weird things work because along with the
tens of thousands of dollars they spend on weird things, they also
take care to never encounter the extreme circumstances.  If they do
encounter said circumstances, they think they were ``asking for it'',
not that their <weird complicated device> failed to do what it
promised.

so, this thing with battery sockets and DRAM sockets and probably also
lights and buttons doesn't inspire in me the same confidence as a hard
disk which is sealed and never had any buttons, lights, or sockets.
but whatever, maybe I'm turning into a grouchy old man.

One thing I do like about the iram is that it attaches like a disk and
can be moved from one machine to another along with the rest of the
disks that make up the array.  I don't know how the storage vendors
are doing this---do they use their batteries to power the disks
themselves for a few seconds?---but it seems kind of mickey-mouse
that, while normally you can move all your ``hot swap'' disks from one
enclosure to another if the enclosure breaks, sometimes the enclosure
itself is ``dirty''.  Maybe you have some way of powering it down
habitually that leaves the NVRAM dirty, and you never realize you're
doing this because the NVRAM works well.  One day, you plan some
``downtime'' to ``migrate'' the disks to a newer stepping of
enclosure, which corrupts the array because you've separated the disks
from the NVRAM.
Instead of blaming the shitty NVRAM architecture, you decide that
since you were ``asking for it'' by moving so much stuff around, you
should have asked for a half-day maintenance window to do a full
backup right before moving the disks.  The imbecile who stuffed this
RAM into the guts of the beast and turned it into a bulleted marketing
point, without so much as a red LED to warn it wasn't empty, never gets
the blame he deserves.

I guess these days you would want battery-backed RAM, which gets
copied onto FLASH after a power outage.

I think the feature should go right on the motherboard.  The battery
should back up all DRAM in the system (or all DRAM attached to
processor 0 in big systems, or something like that), but it should
only supply ~60min of power.  There's also a microcontroller and a
bunch of CF under battery power.

During normal operation, the kernel registers chunks of physical
memory with the microcontroller that need to be nonvolatile.

During a power outage, and also during a weekly test, the battery
supplies power to DRAM, CF cards, and the small microcontroller.  The
microcontroller copies the requested chunks of DRAM into CF.  During
the weekly test, the microcontroller deliberately does two copies, to
make sure the battery has some extra capacity.  And the kernel
immediately ``scrubs'' the CF to make sure it's really working.

During a normal shutdown, the kernel deregisters all chunks with the
microcontroller so the microcontroller knows it can leave the CF card
empty and power down.

 * something will have to simulate the load of the DRAM on the battery
   during the weekly test.  You can't just switch the DRAM's power
   source to the battery during the test, because what if the test
   fails?

 * next to each CF slot is an orange/green light.  Whenever the
   microcontroller has power, especially while the machine is off, it
   makes the light orange if the CF card has data on it, green if it's
   empty, and dark if the battery test has failed.
   if the light's green, then the CF card is safe to consider part of
   the motherboard, not part of the array, and the array can be moved
   to another system with no NV-CF-RAM option.  if the light is
   orange, then the CF card is part of the array and must be moved
   along with all the other disks.

 * on boot, the kernel reads the log directly from the CF card.  The
   kernel can read the log from CF cards in USB-to-CF adapters, too,
   so whatever format the microcontroller uses for dumps needs to be
   respected by ZFS as at least an acceptable read-only ZIL format for
   disk devices.

 * if the CF card has data on it which the kernel has seen, but
   refused to assimilate (a log for a ZFS pool which isn't ONLINE),
   then the light turns red to indicate the NV-CF-RAM feature is
   disabled, and it might not be safe to throw away the card.

   so, during a normal dirty-shutdown bootup, the light is:

   + powered off after clean shutdown: green

   + running normally, with DRAM regions registered: orange

     here, the card doesn't actually contain any data, but it can't
     be removed.

   + cord yanked: (?) for a few seconds during the DRAM dump

   + while the power is off after cord yanking: orange

     this is actual-orange.  The card has a ZIL on it.

   + after the kernel boots up and attaches the microcontroller
     driver: red (for a few seconds)

   + after all the ZFS pools are probed, and the dirty pool attached:
     orange

     the card has no ZIL on it, but it can't be removed because DRAM
     regions are registered with the microcontroller again.

 * in red condition, when the kernel tries to register a region of
   memory with the microcontroller, the microcontroller refuses,
   causing ZFS to proceed safely with no log (not blindly with a
   silently volatile log).

 * there are multiple CF slots, so the microcontroller can be told to
   make mirrored dumps.

 * some day the other slots can be used for other purposes.  For
   example, systems where CPR is working could auto-hibernate when
   their cords are pulled.
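
To make the light rules above concrete, here is a rough sketch of the bookkeeping such a microcontroller might do.  None of this is a real interface: NvController, its method names, and the priority of dark over red over orange are all made up for illustration.  Refusing registrations after a failed battery test is also my own assumption; the text above only says registrations are refused in the red condition.

```python
from enum import Enum


class Light(Enum):
    DARK = "dark"      # battery test failed
    GREEN = "green"    # card empty, nothing registered: card is not part of the array
    ORANGE = "orange"  # card holds a dump, or DRAM regions are registered
    RED = "red"        # kernel saw a dump it has not assimilated; feature disabled


class NvController:
    """Hypothetical model of the microcontroller's state, not a real API."""

    def __init__(self):
        self.regions = set()         # (start, length) chunks the kernel registered
        self.card_has_dump = False   # CF card holds a DRAM dump
        self.dump_unclaimed = False  # kernel has seen the dump but not replayed it
        self.battery_ok = True

    def light(self):
        if not self.battery_ok:
            return Light.DARK
        if self.dump_unclaimed:
            return Light.RED
        if self.card_has_dump or self.regions:
            return Light.ORANGE
        return Light.GREEN

    def register(self, start, length):
        """Refuse rather than accept a region that could not be saved."""
        # Refusing on DARK is an assumption of this sketch, see the lead-in.
        if self.light() in (Light.RED, Light.DARK):
            return False
        self.regions.add((start, length))
        return True

    def deregister_all(self):
        """Clean shutdown: nothing needs saving, so the card stays empty."""
        self.regions.clear()

    def power_loss(self):
        """Cord yanked: dump registered regions to CF; DRAM contents are then gone."""
        if self.regions and self.battery_ok:
            self.card_has_dump = True
        self.regions.clear()

    def driver_attach(self):
        """Kernel has booted and seen whatever is on the card."""
        self.dump_unclaimed = self.card_has_dump

    def dump_replayed(self):
        """Kernel assimilated the log into its pool; the card is empty again."""
        self.card_has_dump = False
        self.dump_unclaimed = False


def dirty_shutdown_walkthrough():
    """Replay the dirty-shutdown sequence and collect the light at each step."""
    c = NvController()
    seen = [c.light()]                  # powered off after clean shutdown
    c.register(0x1000, 4096)
    seen.append(c.light())              # running normally, regions registered
    c.power_loss()
    seen.append(c.light())              # power off after cord yank
    c.driver_attach()
    seen.append(c.light())              # kernel has seen the dump
    refused = not c.register(0x1000, 4096)
    c.dump_replayed()
    c.register(0x1000, 4096)
    seen.append(c.light())              # dirty pool attached, regions registered
    return seen, refused
```

Calling dirty_shutdown_walkthrough() steps through the same sequence as the list above, skipping the few seconds of the dump itself, where the light is an open question.  The one property the sketch is meant to show is that register() returns False instead of quietly accepting a region it cannot protect, so the caller can fall back to running with no log.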
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss