On Jul 13, 2009, at 11:33 AM, Ross <no-re...@opensolaris.org> wrote:

Gaaah, looks like I spoke too soon:

$ zpool status
 pool: rc-pool
state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
       using 'zpool clear' or replace the device with 'zpool replace'.
  see: http://www.sun.com/msg/ZFS-8000-9P
scrub: resilver in progress for 2h59m, 77.89% done, 0h50m to go
config:

       NAME              STATE     READ WRITE CKSUM
       rc-pool           DEGRADED     0     0     0
         mirror          DEGRADED     0     0     0
           c4t1d0        ONLINE       0     0     0  218M resilvered
            replacing     UNAVAIL      0  963K     0  insufficient replicas
             c4t2d0s0/o  FAULTED  1.71M 23.4M     0  too many errors
             c4t2d0      REMOVED      0  964K     0  67.0G resilvered
           c5t1d0        ONLINE       0     0     0  218M resilvered
         mirror          ONLINE       0     0     0
           c4t3d0        ONLINE       0     0     0
           c5t2d0        ONLINE       0     0     0
           c5t0d0        ONLINE       0     0     0
         mirror          ONLINE       0     0     0
           c5t3d0        ONLINE       0     0     0
           c4t5d0        ONLINE       0     0     0
           c4t4d0        ONLINE       0     0     0
         mirror          ONLINE       0     0     0
           c5t4d0        ONLINE       0     0     0
           c5t5d0        ONLINE       0     0     0
           c4t6d0        ONLINE       0     0     0
         mirror          ONLINE       0     0     0
           c4t7d0        ONLINE       0 13.0K     0
           c5t6d0        ONLINE       0     0     0
           c5t7d0        ONLINE       0     0     0
       logs              DEGRADED     0     0     0
         c6d1p0          ONLINE       0     0     0

errors: No known data errors
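For reference, the two recovery paths that status output suggests look roughly like this (a hedged sketch using the device names from the pool above; run them only after the resilver completes and you trust the hardware again, and note that c4t8d0 is a hypothetical spare, not a device from this pool):

```shell
# If the errors were transient (e.g. a cabling glitch that has since
# been fixed), clear the error counters and let ZFS carry on:
zpool clear rc-pool c4t2d0

# If the disk itself is bad, replace it with a new device
# (c4t8d0 here is a hypothetical spare):
zpool replace rc-pool c4t2d0 c4t8d0

# Then watch resilver progress:
zpool status rc-pool
```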


There are a whole bunch of errors in /var/adm/messages:

Jul 13 15:56:53 rob-036 scsi: [ID 107833 kern.warning] WARNING: /p...@1,0/pci1022,7...@1/pci11ab,1...@2/d...@2,0 (sd3):
Jul 13 15:56:53 rob-036        Error for Command: write(10)    Error Level: Retryable
Jul 13 15:56:53 rob-036 scsi: [ID 107833 kern.notice]  Requested Block: 83778048    Error Block: 83778048
Jul 13 15:56:53 rob-036 scsi: [ID 107833 kern.notice]  Vendor: ATA    Serial Number:
Jul 13 15:56:53 rob-036 scsi: [ID 107833 kern.notice]  Sense Key: Aborted_Command
Jul 13 15:56:53 rob-036 scsi: [ID 107833 kern.notice]  ASC: 0x0 (no additional sense info), ASCQ: 0x0, FRU: 0x0


Jul 13 15:57:31 rob-036 scsi: [ID 107833 kern.warning] WARNING: /p...@1,0/pci1022,7...@1/pci11ab,1...@2/d...@2,0 (sd3):
Jul 13 15:57:31 rob-036        Command failed to complete...Device is gone


Not what I would expect from a brand new drive!!

Does anybody have any tips on how I can work out where the fault lies here? I wouldn't expect the controller to be at fault with so many other drives working, and what on earth is the proper technique for replacing a drive that failed partway through a resilver?
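For what it's worth, the `zpool status` output above shows a stuck "replacing" vdev: the old half (c4t2d0s0/o, FAULTED) and the new disk (c4t2d0, REMOVED mid-resilver). The usual sequence in that situation, sketched with the device names from this pool, is something like:

```shell
# Detach the failed old half so the 'replacing' vdev collapses:
zpool detach rc-pool c4t2d0s0/o

# After fixing the underlying problem, bring the new disk back
# online so it can finish resilvering:
zpool online rc-pool c4t2d0

# ...or, if the new disk is truly dead, start over:
# zpool replace rc-pool c4t2d0
```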

I really believe there is a problem with either the cabling or the enclosure's backplane here.

Two disks failing is statistical coincidence; three disks means it ain't the disks that are bad (assuming you checked that there was no recall and the firmware is correct and up to date).

Fix the real problem and the disks already in place should resilver without further interruption.
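To back that up with data before swapping hardware, the standard Solaris tools for localizing this kind of fault are (a hedged sketch; all of these are read-only):

```shell
# Per-device error counters.  Transport errors spread across several
# disks on the same path point at cabling/backplane/controller
# rather than the disks themselves:
iostat -En

# FMA's error telemetry and any faults it has already diagnosed:
fmdump -eV | less
fmadm faulty

# Driver-level complaints, as already seen above:
grep scsi /var/adm/messages | tail -50
```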

-Ross

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
