I recently upgraded a box to Solaris 10 U8. Since then I've been getting more timeouts, and I suspect the Adaptec card: it may not be able to keep up, so it issues bus resets at times. This has apparently corrupted some files on the pool; zpool status -v showed 2 files and one dataset as corrupt. I was initially able to bring the pool up and salvage some of the files. It would not let me remove the files listed, failing with a "Bad exchange descriptor" error. So I figured I'd salvage and remove those two datasets and try again.

While I was salvaging what I could, the load apparently stressed the card too much (it was constantly at 98-100% busy). Eventually the service times climbed high enough that I/O failed with timeouts and another bus reset. Then the system crashed with the following:

panic[cpu2]/thread=c603adc0: assertion failed: 0 == dmu_buf_hold_array(os, object, offset, size, FALSE, FTAG, &numbufs, &dbp), file: ../../common/fs/zfs/dmu.c, line: 591

c603abec genunix:assfail+51 (edf9094c, edf90930,)
c603ac34 zfs:dmu_write+150 (c5aa3a20, 86, 0, b5)
c603ac9c zfs:space_map_sync+2ed (c6fde4cc, 1, c6fde3)
c603acec zfs:metaslab_sync+245 (c6fde340, 904f005, )
c603ad14 zfs:vdev_sync+a8 (c0bad040, 904f005, )
c603ad5c zfs:spa_sync+38e (c23196c0, 904f005, )
c603ada8 zfs:txg_sync_thread+22c (c1016600, 0)
c603adb8 unix:thread_start+8 ()

syncing file systems... [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] done (not all i/o completed)
dumping to /dev/dsk/c0t0d0s1, offset 215547904, content: kernel
WARNING: This system contains a SCSI HBA card/driver that doesn't support software reset. This means that memory being used by the HBA for DMA based reads could have been updated after we panic'd.

After that it would not boot anymore; it just went into a panic loop. I hopped in the car and went to the data center. I managed to boot off CD, mounted the root file system, and moved /etc/zfs/zpool.cache out of the way, so now I can boot the OS again.

If I try to import the pool, I get the same panic as above. If I just run "zpool import" with no arguments, I get the following:

 state: ONLINE
status: The pool is formatted using an older on-disk version.
action: The pool can be imported using its name or numeric identifier, though
        some features will not be available without an explicit 'zpool upgrade'.
config:

        pool0       ONLINE
          c2t4d0    ONLINE
          c2t4d2    ONLINE

So the pool appears to still be there, but I can't import it.

The two devices are actually hardware RAID devices of 750G each, so I have no redundancy at the ZFS level, only in the hardware RAIDs. I'm not sure how to use zdb to inspect the pool. Any ideas as to what I can do to recover the rest of the data? There are still some database files on there that I need.

Thanks,
Brian