Hi all,

I have a RAID-Z zpool made up of 4 x SATA drives running on Nexenta 1.0.1 (OpenSolaris b85 kernel). It holds some ZFS filesystems and a few volumes that are shared to various Windows boxes over iSCSI. On one particular iSCSI volume, I discovered that I had mistakenly deleted some files from the FAT32 partition on it. The files were still in a ZFS snapshot made earlier that morning, so I used "zfs clone" to create a separate copy of the volume. I accessed the clone in Windows, got the files I needed, and then proceeded to delete the clone using "zfs destroy". During the destroy, disk activity stopped, my SSH windows stopped responding, and Windows lost all iSCSI connections, reporting "delayed write failed" for the volumes that disappeared.
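For anyone following along, the clone-then-destroy sequence described above would look roughly like the sketch below. The dataset, snapshot, and clone names are hypothetical placeholders (the real names are not given in this post), and the script only prints the commands rather than running them.

```shell
#!/bin/sh
# Dry-run sketch: prints the zfs commands instead of executing them.
# All dataset, snapshot, and clone names are hypothetical placeholders.

# Print the clone/destroy sequence for recovering files from a volume snapshot.
#   $1 = volume dataset, $2 = snapshot name, $3 = name for the temporary clone
recovery_plan() {
    vol=$1
    snap=$2
    clone=$3
    echo "zfs clone ${vol}@${snap} ${clone}"
    echo "# ... attach ${clone} over iSCSI and copy the files off ..."
    echo "zfs destroy ${clone}"
}

recovery_plan zp/winvol morning zp/winvol-recover
```

The hang reported here began during the last step, the "zfs destroy" of the clone.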
I powered down the Nexenta box and started it back up, where it hung with the following output:

    SunOS Release 5.11 Version NexentaOS_20080312 64-bit
    Loading Nexenta...
    Hostname: mammoth

This is before the usual "Reading ZFS config: done" and "Mounting ZFS filesystems" indicators. The only way I could bring the system up was to disconnect all four SATA drives before power-on. I can then export the zpool, reboot, and the system comes up without complaint. However, of course, the pool isn't imported. When I execute "zpool import", the pool is detected fine:

      pool: zp
        id: 2070286287887108251
     state: ONLINE
    action: The pool can be imported using its name or numeric identifier.
    config:

            zp          ONLINE
              raidz1    ONLINE
                c0t1d0  ONLINE
                c0t0d0  ONLINE
                c0t3d0  ONLINE
                c0t2d0  ONLINE

The next issue is that when the pool is actually imported ("zpool import -f zp"), that too hangs the whole system, albeit after a minute or so of disk activity. The output of "zpool iostat zp 10" during that time is below:

                   capacity     operations    bandwidth
    pool         used  avail   read  write   read  write
    ----------  -----  -----  -----  -----  -----  -----
    zp          1.73T  1018G  1.13K      7  4.43M  23.6K
    zp          1.73T  1018G  1.05K      0  4.07M      0
    zp          1.73T  1018G  1.15K      0  4.88M      0
    zp          1.73T  1018G    457      0  1.36M      0
    zp          1.73T  1018G    668      0  2.49M      0
    zp          1.73T  1018G    411      0  1.80M      0
    [system stopped at this point and wouldn't accept keypresses any more]

I'm lost as to what to do. Every time the pool is imported, it briefly turns up in "zpool status", but then hangs the system to the extent that I must power off, disconnect the drives, power up, "zpool export", and reboot, just to be able to start typing commands again!

So far I've tried:

1. Rebooting with only one of the SATA drives attached at a time. All four times the OS came up fine, but of course "zpool status" reported the pool as having insufficient replicas. I don't know whether powering up with two or three drives will work; I didn't want to try any permutations in case I made things worse.

2. Checking with "fmdump -e". The only output relating to ZFS concerns missing vdevs and is presumably from when I have been rebooting with drives disconnected.

3. "dd if=/dev/rdsk/c0t0d0 of=/dev/null bs=1048576" and the equivalents for the other three drives are all currently running, and I await the results. Given that a scrub takes about 7 hours, I expect I'll have to leave this overnight.

4. "zdb -e zp" is now at the stage of "Traversing all blocks to verify checksums and verify nothing leaked ...". I expect this will also take some time.

While I wait for the results from "dd" and "zdb", is there anything else I can try in order to get the pool up and running again?

I have spotted some previous, similar posts regarding hanging, notably this one:

http://opensolaris.org/jive/thread.jspa?threadID=70205&tstart=15

Unfortunately, I am a bit of a Nexenta/OpenSolaris/Unix newbie, so a lot of that is way over my head, and when the system completely hangs, I have no choice but to power off.

Any help is much appreciated!

Thanks,
Chris

This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss