OK, so this is another "my pool got eaten" problem. Our setup: the failure happened on Nevada build 77; we're now running build 87. The pool is a simple stripe across 9 iSCSI vdevs, each exported from a Linux box backed by hardware RAID (we run Linux on those boxes for the RAID controller drivers).
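For reference, a pool laid out this way would have been created roughly as below (pool name and device names are made up; the point is that a plain stripe has no mirror/raidz keywords, so ZFS has no redundancy of its own to repair from):

```shell
# Hypothetical recreation of the layout described above -- "tank" and
# the c*t*d0 names are placeholders for the 9 iSCSI LUNs. No mirror or
# raidz vdevs, so a damaged top-level vdev can't be reconstructed.
zpool create tank \
    c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 \
    c2t5d0 c2t6d0 c2t7d0 c2t8d0
```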
Our problem: power got yanked to 8 of the 9 vdevs. At the time we had the ZIL disabled and write-back caching enabled on the vdevs, for performance reasons. The ZIL *was* going to be re-enabled, but Murphy's Law says things crash beforehand. On attempting to bring the system back up after a reboot, every vdev and the pool itself are marked FAULTED with corrupted data.

What we've tried: since last Thursday (today is the following Wednesday), we've used this weekend's nightly build to attempt a zpool import -F, to no avail. I've also been applying dtrace probes in the kernel to see where and how the import is dying, to determine whether this is a "turn off the sanity checks and mount read-only" situation or whether our data is hopelessly munged. That attempt has turned into a bit of a goose chase, with possibilities popping up and failure modes branching faster than I can take a close look at them. My partner here is working on the possibility of an offline file-grabbing program, which shows some progress, but not much yet.

Our biggest problem is that neither of us is experienced in kernel-land debugging or filesystems, and I at least am fairly inexperienced with the debugging power tools available on Solaris, such as mdb, and with uses of dtrace beyond looking at function entry arguments and return values. Is there someone with a bit more experience who can help us?

-- Matt

This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
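[For anyone following along, the import attempts and the dtrace usage described above look roughly like this; "tank" is an assumed pool name, and whether -F is available depends on the build:]

```shell
# Dry-run the recovery import first: with -n, -F reports whether a
# rewind could make the pool importable without actually writing
# anything to the devices.
zpool import -F -n tank

# If the dry run looks sane, attempt the actual rewind import,
# discarding the last few transactions:
zpool import -F tank

# The kind of dtrace one-liner used to see where the import dies:
# count every function entered in the zfs kernel module (fbt provider).
dtrace -n 'fbt:zfs::entry { @[probefunc] = count(); }'
```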