On Mar 11, 2010, at 11:28 PM, Paul Tetley wrote:
> Hi,
> 
> My zpool is reporting unrecoverable errors with the metadata:  
> 
>    pool: rpool2
>   state: ONLINE
>  status: One or more devices has experienced an error resulting in data
>              corruption.  Applications may be affected.
>  action: Restore the file in question if possible. Otherwise restore the 
> entire pool from backup.
>      see: http://www.sun.com/msg/ZFS-8000-8A
> 
> (snip)
> 
> errors: Permanent errors have been detected in the following files:
>     <metadata>:<0x0>
>     <metadata>:<0x1>
> 
> It initially reported a DEGRADED pool, but after a reboot, the pool is now 
> ONLINE and a quick inspection indicates that my data is present and intact 
> (though the errors stop the file-systems in the pool from mounting at boot - 
> it drops into maintenance mode). My reading of 
> http://www.sun.com/msg/ZFS-8000-8A indicates I should destroy the pool and 
> start again, but http://www.crypticide.com/dropsafe/article/2162 gives me 
> some small hope that this might be fixable...
> 
> The pool has been 'a little flaky' since I built it two months back.  I've 
> been getting small numbers of read and checksum errors on a few of the disks 
> each day.  Initially I replaced the disks, but they would always pass all 
> testing, so lately I've just been clearing the errors each day and looking 
> for another solution.  I thought I had found it when I discovered WD had a 
> firmware patch (http://www.3ware.com/kb/article.aspx?id=15592 , 
> http://blog.insanegenius.com/2009/09/western-digital-re4-gp-2tb-drive.html) 
> that fixes bugs in the drive spin-up behaviour, which have been causing 
> problems for various hardware RAID controllers.  So, yesterday I shut down 
> the machine, pulled 2 of the 4 troublesome disks, and applied the firmware 
> upgrade (04.05G05).  When I booted, the pool was degraded and showed 
> metadata errors.  After a shutdown and cold start, the pool was ONLINE but 
> still had metadata errors (so somewhat inconsistent guidance from zpool 
> status).
> 
> Can anyone explain what this 'metadata' is?
> 
> More Details:
>       • This is a backup server so I can rebuild if necessary, but on 
> principle I'd like to have a go at fixing it...
>       • The zpool has 96 x 2TB drives divided into RAIDZ2 sets of 8 (6+2).
>       • The drives are Western Digital RE-4's  (WD2002FYPS).
>       • Running OpenSolaris build snv_111b.
>       • Drives are in two AIC JBODs connected via SAS.
>       • HBA is an LSI 3801E

Others have suggested checking the firmware release on this controller.

This CR has a workaround that might help, too.
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6894775
 -- richard

>       • Server is 1RU SuperMicro Intel.
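
For reference, the layout described in the list above can be worked out with a few lines of arithmetic. This is only a rough sketch: the function and variable names are made up for illustration, the capacities are nominal decimal terabytes, and ZFS metadata and allocation overhead are ignored.

```python
# Pool layout as described in the message:
# 96 x 2 TB drives, grouped into RAIDZ2 sets of 8 (6 data + 2 parity).
TOTAL_DRIVES = 96
DRIVE_TB = 2          # nominal size per drive, decimal TB
DRIVES_PER_VDEV = 8   # drives in each RAIDZ2 set
PARITY_PER_VDEV = 2   # RAIDZ2 keeps two parity drives' worth per set


def pool_layout(total=TOTAL_DRIVES, per_vdev=DRIVES_PER_VDEV,
                parity=PARITY_PER_VDEV, drive_tb=DRIVE_TB):
    """Return vdev count and nominal capacities for an evenly divided pool."""
    if total % per_vdev != 0:
        raise ValueError("drives do not divide evenly into vdevs")
    vdevs = total // per_vdev
    data_drives = vdevs * (per_vdev - parity)
    return {
        "vdevs": vdevs,
        "data_drives": data_drives,
        "parity_drives": vdevs * parity,
        "raw_tb": total * drive_tb,
        "nominal_data_tb": data_drives * drive_tb,
    }


layout = pool_layout()
for key, value in layout.items():
    print(f"{key}: {value}")
```

Each RAIDZ2 set can lose any two of its eight drives and still be readable, so pulling two drives that happen to sit in the same set leaves that set with no remaining redundancy until they are back.
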
> Any advice appreciated!
> 
> :-)
> Paul Tetley
> NearMap Pty Ltd
> 
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Atlanta, March 16-18, 2010 http://nexenta-atlanta.eventbrite.com 
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 



