Due to recent experiences, and discussion on this list, my colleague and I performed some tests:

Using Solaris 10, fully upgraded. (zpool version 15 is the latest there, and it
does not have the log device removal that was introduced in zpool version 19.)
If you lose an unmirrored log device, in any way possible, the OS will crash,
and the whole zpool is permanently gone, even after reboots.

Using OpenSolaris, upgraded to latest, which includes zpool version 22. (Or was
it 23? I forget now.) Anyway, it's >=19, so it has log device removal.

First test:

1. Created a pool, with unmirrored log device.
2. Started a benchmark of sync writes, and verified the log device was getting
   heavily used.
3. Yanked out the log device.

Behavior was good. The pool became "degraded", which is to say it started
using the primary storage for the ZIL. Performance presumably degraded, but the
system remained operational and error free. I was able to restore perfect
health by using "zpool remove" on the failed log device, and "zpool add" to add
a new log device.

Next:

1. Created a pool, with unmirrored log device.
2. Started a benchmark of sync writes, and verified the log device was getting
   heavily used.
3. Yanked out both power cords.
4. While the system was down, also removed the log device.

(OOoohhh, that's harsh.) I created a situation where an unmirrored log device
is known to have unplayed records, there is an ungraceful shutdown, *and* the
device disappears. That's the absolute worst case scenario possible, other
than the whole building burning down. Anyway, the system behaved as well as it
possibly could. During boot, the faulted pool did not come up, but the OS came
up fine. My "zpool status" showed this:

# zpool status
  pool: junkpool
 state: FAULTED
status: An intent log record could not be read.
        Waiting for adminstrator intervention to fix the faulted pool.
action: Either restore the affected device(s) and run 'zpool online',
        or ignore the intent log records by running 'zpool clear'.
   see: http://www.sun.com/msg/ZFS-8000-K4
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        junkpool    FAULTED      0     0     0  bad intent log
          c8t4d0    ONLINE       0     0     0
          c8t5d0    ONLINE       0     0     0
        logs
          c8t3d0    UNAVAIL      0     0     0  cannot open

(---------------------------)

I know the unplayed log device data is lost forever. So I clear the error,
remove the faulted log device, and acknowledge that I have lost the last few
seconds of written data, up to the system crash:

# zpool clear junkpool
# zpool status
  pool: junkpool
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist
        for the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        junkpool    DEGRADED     0     0     0
          c8t4d0    ONLINE       0     0     0
          c8t5d0    ONLINE       0     0     0
        logs
          c8t3d0    UNAVAIL      0     0     0  cannot open

# zpool remove junkpool c8t3d0
# zpool status junkpool
  pool: junkpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        junkpool    ONLINE       0     0     0
          c8t4d0    ONLINE       0     0     0
          c8t5d0    ONLINE       0     0     0
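
For anyone who wants to repeat this, the pool setup and the recovery steps above can be collected into a small script. This is only a sketch, assuming the same pool and device names as in my test (junkpool, c8t4d0, c8t5d0, c8t3d0) and a zpool version >= 19. The run helper, the RUN and NEW_LOG_DEV variables, and the function names are made up for illustration. By default it only prints the commands; it does not touch any pool unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch of the log-device test setup and recovery described above.
# Dry run by default: commands are printed with a leading "+", not executed.
# Set RUN=1 to really execute them (destructive, use a scratch pool only).

POOL="${POOL:-junkpool}"                 # pool name used in the test
DATA_DEVS="${DATA_DEVS:-c8t4d0 c8t5d0}"  # data disks used in the test
LOG_DEV="${LOG_DEV:-c8t3d0}"             # unmirrored log device used in the test
NEW_LOG_DEV="${NEW_LOG_DEV:-}"           # optional replacement log device (assumption)

# Hypothetical helper: execute the command if RUN=1, otherwise just print it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Step 1 of both tests: a pool with a single, unmirrored log device.
create_pool() {
    # DATA_DEVS is left unquoted on purpose so it splits into separate devices.
    run zpool create "$POOL" $DATA_DEVS log "$LOG_DEV"
}

# Recovery after the log device is gone for good: accept the loss of the
# unplayed intent log records, then drop the dead device from the pool.
recover_lost_log() {
    run zpool clear "$POOL"
    run zpool remove "$POOL" "$LOG_DEV"
    if [ -n "$NEW_LOG_DEV" ]; then
        run zpool add "$POOL" log "$NEW_LOG_DEV"
    fi
    run zpool status "$POOL"
}

case "${1:-}" in
    create)  create_pool ;;
    recover) recover_lost_log ;;
    "")      ;;  # no argument: only define the functions
    *)       echo "usage: $0 [create|recover]" >&2 ;;
esac
```

Running it with "recover" prints the clear, remove, and status commands in the order I used them. Setting NEW_LOG_DEV to a spare device adds the "zpool add ... log" step from the first test. Keep in mind that "zpool clear" on a pool in the "bad intent log" state throws away the unplayed records, so only do that once you are sure the old log device is not coming back.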
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss