[zfs-discuss] Corrupt Array

Gareth de Vaux Wed, 21 Dec 2011 11:47:12 -0800

Hi guys, after a scrub my raidz array status showed:

# zpool status
  pool: pool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scan: scrub repaired 85.5K in 1h21m with 0 errors on Mon Dec 19 06:24:25 2011
config:


        NAME        STATE     READ WRITE CKSUM
        pool        ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ad18    ONLINE       0     0     1
            ad19    ONLINE       0     0     0
            ad10    ONLINE       0     0     1
            ad4     ONLINE       0     0     0

errors: No known data errors
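
As an illustration of reading the table above, here is a minimal Python sketch that lists the rows with a non-zero READ, WRITE or CKSUM counter. The function name `devices_with_errors` and the regex are hypothetical and not part of ZFS. The sketch assumes plain integer counters, so abbreviated values would be skipped.

```python
import re

# Config table copied from the 'zpool status' output above.
STATUS = """\
        NAME        STATE     READ WRITE CKSUM
        pool        ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            ad18    ONLINE       0     0     1
            ad19    ONLINE       0     0     0
            ad10    ONLINE       0     0     1
            ad4     ONLINE       0     0     0
"""

# One config row: name, state, then the READ / WRITE / CKSUM counters.
# Assumption: counters are plain integers.
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def devices_with_errors(status_text):
    """Return (name, read, write, cksum) for every row with a non-zero counter."""
    hits = []
    for line in status_text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header row, blank lines and any non-table text
        name, _state, rd, wr, ck = m.groups()
        counts = (int(rd), int(wr), int(ck))
        if any(counts):
            hits.append((name, *counts))
    return hits


if __name__ == "__main__":
    for name, rd, wr, ck in devices_with_errors(STATUS):
        print(f"{name}: read={rd} write={wr} cksum={ck}")
```

For the table above this reports only ad18 and ad10, each with a single checksum error.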


I assume the checksum counts are current and could not be repaired. If so,
why does the scan line say 'repaired ... with 0 errors'?

What should one do at this point?

I rebooted and ran another scrub. This time it came up with 0 errors
and 0 checksum counts. What does that mean?

I then backed up the array, kicked out ad18 and resilvered it from scratch:

# zpool status
  pool: pool
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scan: resilvered 218G in 1h25m with 14 errors on Wed Dec 21 14:48:47 2011
config:

        NAME             STATE     READ WRITE CKSUM
        pool             DEGRADED     0     0    14
          raidz1-0       DEGRADED     0     0    28
            replacing-0  OFFLINE      0     0     0
              ad18/old   OFFLINE      0     0     0
              ad18       ONLINE       0     0     0
            ad19         ONLINE       0     0     0
            ad10         ONLINE       0     0     0
            ad4          ONLINE       0     0     0

errors: 11 data errors, use '-v' for a list


and 'zpool status -v' gives me a list of affected files.

I assume I should delete those files and then follow the same procedure on ad10?


# uname -a
FreeBSD file 8.2-STABLE FreeBSD 8.2-STABLE #0: Sat Nov 12 17:51:22 SAST 2011     root@file:/usr/obj/usr/src/sys/COWNEL  amd64

ZFS filesystem version 5
ZFS storage pool version 28


PS: I did get one disk alert in the logs during this whole process, half an
hour before resilvering:

Dec 21 12:41:48 file kernel: ad10: WARNING - READ_DMA48 UDMA ICRC error (retrying request) LBA=306763504
Dec 21 12:41:48 file kernel: ad10: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=10<NID_NOT_FOUND> LBA=306763504
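
To line up alerts like these against scrub and resilver times, a minimal Python sketch can pull the device, severity, command and LBA out of each kernel line. The function name `ata_events` and the regex are hypothetical, and the pattern is only modelled on the two lines quoted here, so other driver messages may not match.

```python
import re

# The two kernel messages quoted above, one per line.
LOG = (
    "Dec 21 12:41:48 file kernel: ad10: WARNING - READ_DMA48 UDMA ICRC error "
    "(retrying request) LBA=306763504\n"
    "Dec 21 12:41:48 file kernel: ad10: FAILURE - READ_DMA48 "
    "status=51<READY,DSC,ERROR> error=10<NID_NOT_FOUND> LBA=306763504\n"
)

# Assumed layout: "kernel: <dev>: <LEVEL> - <COMMAND> ... LBA=<n>"
EVENT = re.compile(r"kernel: (\w+): (WARNING|FAILURE) - (\S+) .*LBA=(\d+)")


def ata_events(log_text):
    """Return (device, level, command, lba) for each matching log line."""
    events = []
    for line in log_text.splitlines():
        m = EVENT.search(line)
        if m:
            dev, level, cmd, lba = m.groups()
            events.append((dev, level, cmd, int(lba)))
    return events


if __name__ == "__main__":
    for dev, level, cmd, lba in ata_events(LOG):
        print(f"{dev} {level} {cmd} LBA={lba}")
```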
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
