I've got ZFS running on Solaris s10x_u3wos_10 X86 on a v40z, which has two PCI SCSI controllers, each connected to its own external HP Diskarray (MSA30) with 7 disks + hot spare.
Both controllers are:
  LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI

The disks are a mix of:
  COMPAQ-BD3008A4C6-HPB4-279.40GB
  COMPAQ-BD30089BBA-HPB1-279.40GB
  COMPAQ-BD3008856C-HPB2-279.40GB

For the past few months, we've had behavior from ZFS which we wouldn't expect. We've had previous issues where we've seen a particular disk's service time go through the roof (while other disks in the same pool were idle) and had to reboot because the pool locked up.

The most recent issue happened today: the ZFS pool locked up and we couldn't do anything about it besides reboot the system. We couldn't run zpool status, we couldn't run df -k; all commands related to I/O just seemed to hang.

When the system came back up, ZFS showed one of the disks as UNAVAIL.

# zpool status
  pool: dbzpool
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: resilver completed with 0 errors on Mon Feb  4 17:16:39 2008
config:

        NAME         STATE     READ WRITE CKSUM
        dbzpool      DEGRADED     0     0     0
          mirror     ONLINE       0     0     0
            c2t0d0   ONLINE       0     0     0
            c3t0d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c2t1d0   ONLINE       0     0     0
            c3t1d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c2t2d0   ONLINE       0     0     0
            c3t2d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c2t3d0   ONLINE       0     0     0
            c3t3d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c2t4d0   ONLINE       0     0     0
            c3t4d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c2t5d0   ONLINE       0     0     0
            c3t5d0   ONLINE       0     0     0
          mirror     DEGRADED     0     0     0
            c2t8d0   ONLINE       0     0     0
            c3t8d0   UNAVAIL      0     0     0  cannot open
        spares
          c2t15d0    AVAIL
          c3t15d0    AVAIL

errors: No known data errors

I've tried:

# zpool offline dbzpool c3t8d0
cannot offline c3t8d0: no valid replicas

# zpool replace dbzpool c3t8d0
cannot replace c3t8d0 with c3t8d0: c3t8d0 is busy

# zpool online dbzpool c3t8d0
Bringing device c3t8d0 online

Note that even though the last command appears to succeed, the disk's status remains UNAVAIL.

I've also tried writing to the disk directly, both before and after the above zpool commands.

# dd if=/dev/zero of=/dev/rdsk/c3t8d0s0 bs=1024 count=1048576
1048576+0 records in
1048576+0 records out

# smartctl -H /dev/rdsk/c3t8d0s0
smartctl version 5.37 [i386-pc-solaris2.10] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

SMART Health Status: OK

# iostat -nx 5 2 | grep c3t8
    0.5  203.7    6.0 1648.2  0.0  0.0    0.0    0.1   0   3 c3t8d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t8d0

After all this data, my questions are as follows:

1. What do I have to do (short of replacing the seemingly good disk) to get c3t8d0 back online?

2. Is there an alternative to the seemingly necessary reboot when the ZFS pool locks?

3. Is the pool locking due to a possible problem in u3 that is addressed in u4 and beyond?

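On the earlier symptom (one disk's service time going through the roof while the other disks in the pool sat idle), something like the following might help spot the outlier from iostat -nx output. It is only a rough sketch: the function names, the thresholds (factor, floor_ms) and the sample figures are made up for illustration, and it assumes the usual 11-column iostat -nx layout with asvc_t as the eighth column.

```python
from statistics import median


def parse_iostat_nx(text):
    """Return {device: asvc_t} from `iostat -nx` output.

    Assumes 11 whitespace-separated columns per data row:
    r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device.
    When several samples are present, the last one per device wins.
    """
    asvc_by_dev = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 11:
            continue  # banner or blank line
        try:
            values = [float(f) for f in fields[:10]]
        except ValueError:
            continue  # column header row
        asvc_by_dev[fields[10]] = values[7]  # asvc_t, in milliseconds
    return asvc_by_dev


def slow_disks(asvc_by_dev, factor=10.0, floor_ms=50.0):
    """Devices whose asvc_t is at least floor_ms and more than
    `factor` times the median across all devices (hypothetical thresholds)."""
    if not asvc_by_dev:
        return []
    med = median(asvc_by_dev.values())
    return sorted(
        dev for dev, t in asvc_by_dev.items()
        if t >= floor_ms and t > factor * med
    )


# Made-up sample rows for illustration only, not captured from the system.
SAMPLE = """\
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.5  203.7    6.0 1648.2  0.0  0.0    0.0    0.1   0   3 c2t8d0
    0.4    1.2    5.0    9.6  0.0  2.0    0.0  850.0   0 100 c3t8d0
    0.6  198.3    6.2 1601.0  0.0  0.0    0.0    0.2   0   3 c2t0d0
"""

print(slow_disks(parse_iostat_nx(SAMPLE)))
```

In practice the text would come from a periodic run such as iostat -nx 5 2 rather than from a fixed string; since the first sample is an average since boot, keeping only the last sample per device (as the parser does) is the more useful reading.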
-- 
Jeremy Kister
http://jeremy.kister.net./

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss