Hi Mike,
Yes, this is looking much better.
Some combination of removing the corrupted files indicated in the zpool
status -v output, running zpool scrub and then zpool clear should
resolve the corruption, but it depends on how bad the corruption is.
First, I would try the least destructive method: try to remove the
files listed below using the rm command.
This entry probably means that the metadata is corrupted or some
other file (like a temp file) no longer exists:
tank1/argus-data:<0xc6>
If you are able to remove the individual file with rm, run another
zpool scrub and then a zpool clear to clear the pool errors. You
might need to repeat the zpool scrub/zpool clear combo.
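Roughly, the sequence would look something like this (the paths are just
the ones from your zpool status -v output below; adjust as needed):
# rm /tank1/argus-data/previous/argus-sites-radium.2011.01.28.16.00
# rm /tank1/argus-data/argus-sites-radium
# zpool scrub tank1
# zpool status -v tank1      (wait for the scrub to complete)
# zpool clear tank1
# zpool status -v tank1      (the permanent-error list should shrink or be gone)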
If you can't remove the individual files, then you might have to
destroy the tank1/argus-data file system.
Let us know what actually works.
Thanks,
Cindy
On 01/31/11 12:20, Mike Tancsa wrote:
On 1/29/2011 6:18 PM, Richard Elling wrote:
On Jan 29, 2011, at 12:58 PM, Mike Tancsa wrote:
On 1/29/2011 12:57 PM, Richard Elling wrote:
0(offsite)# zpool status
pool: tank1
state: UNAVAIL
status: One or more devices could not be opened. There are insufficient
replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
see: http://www.sun.com/msg/ZFS-8000-3C
scrub: none requested
config:
NAME          STATE     READ WRITE CKSUM
tank1         UNAVAIL      0     0     0  insufficient replicas
  raidz1      ONLINE       0     0     0
    ad0       ONLINE       0     0     0
    ad1       ONLINE       0     0     0
    ad4       ONLINE       0     0     0
    ad6       ONLINE       0     0     0
  raidz1      ONLINE       0     0     0
    ada4      ONLINE       0     0     0
    ada5      ONLINE       0     0     0
    ada6      ONLINE       0     0     0
    ada7      ONLINE       0     0     0
  raidz1      UNAVAIL      0     0     0  insufficient replicas
    ada0      UNAVAIL      0     0     0  cannot open
    ada1      UNAVAIL      0     0     0  cannot open
    ada2      UNAVAIL      0     0     0  cannot open
    ada3      UNAVAIL      0     0     0  cannot open
0(offsite)#
This is usually easily solved without data loss by making the
disks available again. Can you read anything from the disks using
any program?
That's the strange thing: the disks are readable. The drive cage just
reset a couple of times prior to the crash, but they seem OK now. Same
order as well.
# camcontrol devlist
<WDC WD\021501FASR\25500W2B0 \200 0956> at scbus0 target 0 lun 0
(pass0,ada0)
<WDC WD\021501FASR\25500W2B0 \200 05.01D\0205> at scbus0 target 1 lun 0
(pass1,ada1)
<WDC WD\021501FASR\25500W2B0 \200 05.01D\0205> at scbus0 target 2 lun 0
(pass2,ada2)
<WDC WD\021501FASR\25500W2B0 \200 05.01D\0205> at scbus0 target 3 lun 0
(pass3,ada3)
# dd if=/dev/ada2 of=/dev/null count=20 bs=1024
20+0 records in
20+0 records out
20480 bytes transferred in 0.001634 secs (12534561 bytes/sec)
0(offsite)#
The next step is to run "zdb -l" and look for all 4 labels. Something like:
zdb -l /dev/ada2
If all 4 labels exist for each drive and appear intact, then look more closely
at how the OS locates the vdevs. If you can't solve the "UNAVAIL" problem,
you won't be able to import the pool.
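For instance, a quick sketch (assuming the four new disks still show up
as ada0 through ada3) to dump the label headers on each one:
# for d in ada0 ada1 ada2 ada3; do echo "== /dev/$d =="; zdb -l /dev/$d | grep LABEL; done
Each healthy disk should report LABEL 0 through LABEL 3.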
-- richard
On 1/29/2011 10:13 PM, James R. Van Artsdalen wrote:
On 1/28/2011 4:46 PM, Mike Tancsa wrote:
I had just added another set of disks to my zfs array. It looks like the
drive cage with the new drives is faulty. I had added a couple of files
to the main pool, but not much. Is there any way to restore the pool
below? I have a lot of files on ad0,1,4,6 and ada4,5,6,7 and perhaps
one file on the new drives in the bad cage.
Get another enclosure and verify it works OK. Then move the disks from
the suspect enclosure to the tested enclosure and try to import the pool.
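Once the disks are in the known-good enclosure, something along these
lines should show whether ZFS can see the pool again (sketch only):
# zpool import
(with no arguments, this just lists pools that are visible for import)
# zpool import tank1
(you may need -f if the pool was never cleanly exported)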
The problem may be cabling or the controller instead - you didn't
specify how the disks were attached or which version of FreeBSD you're
using.
First off, thanks to all who responded on- and off-list!
Good news (for me), it seems: with the new cage, everything is recognized
correctly. The history is
...
2010-04-22.14:27:38 zpool add tank1 raidz /dev/ada4 /dev/ada5 /dev/ada6
/dev/ada7
2010-06-11.13:49:33 zfs create tank1/argus-data
2010-06-11.13:49:41 zfs create tank1/argus-data/previous
2010-06-11.13:50:38 zfs set compression=off tank1/argus-data
2010-08-06.12:20:59 zpool replace tank1 ad1 ad1
2010-09-16.10:17:51 zpool upgrade -a
2011-01-28.11:45:43 zpool add tank1 raidz /dev/ada0 /dev/ada1 /dev/ada2
/dev/ada3
FreeBSD RELENG_8 from last week, 8G of RAM, amd64.
zpool status -v
pool: tank1
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scrub: none requested
config:
NAME          STATE     READ WRITE CKSUM
tank1         ONLINE       0     0     0
  raidz1      ONLINE       0     0     0
    ad0       ONLINE       0     0     0
    ad1       ONLINE       0     0     0
    ad4       ONLINE       0     0     0
    ad6       ONLINE       0     0     0
  raidz1      ONLINE       0     0     0
    ada0      ONLINE       0     0     0
    ada1      ONLINE       0     0     0
    ada2      ONLINE       0     0     0
    ada3      ONLINE       0     0     0
  raidz1      ONLINE       0     0     0
    ada5      ONLINE       0     0     0
    ada8      ONLINE       0     0     0
    ada7      ONLINE       0     0     0
    ada6      ONLINE       0     0     0
errors: Permanent errors have been detected in the following files:
        /tank1/argus-data/previous/argus-sites-radium.2011.01.28.16.00
        tank1/argus-data:<0xc6>
        /tank1/argus-data/argus-sites-radium
0(offsite)# zpool get all tank1
NAME   PROPERTY       VALUE                SOURCE
tank1  size           14.5T                -
tank1  used           7.56T                -
tank1  available      6.94T                -
tank1  capacity       52%                  -
tank1  altroot        -                    default
tank1  health         ONLINE               -
tank1  guid           7336939736750289319  default
tank1  version        15                   default
tank1  bootfs         -                    default
tank1  delegation     on                   default
tank1  autoreplace    off                  default
tank1  cachefile      -                    default
tank1  failmode       wait                 default
tank1  listsnapshots  on                   local
Do I just want to do a scrub?
Unfortunately, http://www.sun.com/msg/ZFS-8000-8A gives a 503.
zdb now shows
0(offsite)# zdb -l /dev/ada0
--------------------------------------------
LABEL 0
--------------------------------------------
version=15
name='tank1'
state=0
txg=44593174
pool_guid=7336939736750289319
hostid=3221266864
hostname='offsite.sentex.ca'
top_guid=6980939370923808328
guid=16144392433229115618
vdev_tree
type='raidz'
id=1
guid=6980939370923808328
nparity=1
metaslab_array=38
metaslab_shift=35
ashift=9
asize=4000799784960
is_log=0
children[0]
type='disk'
id=0
guid=16144392433229115618
path='/dev/ada4'
whole_disk=0
DTL=341
children[1]
type='disk'
id=1
guid=1210677308003674848
path='/dev/ada5'
whole_disk=0
DTL=340
children[2]
type='disk'
id=2
guid=2517076601231706249
path='/dev/ada6'
whole_disk=0
DTL=339
children[3]
type='disk'
id=3
guid=16621760039941477713
path='/dev/ada7'
whole_disk=0
DTL=338
--------------------------------------------
LABEL 1
--------------------------------------------
version=15
name='tank1'
state=0
txg=44592523
pool_guid=7336939736750289319
hostid=3221266864
hostname='offsite.sentex.ca'
top_guid=6980939370923808328
guid=16144392433229115618
vdev_tree
type='raidz'
id=1
guid=6980939370923808328
nparity=1
metaslab_array=38
metaslab_shift=35
ashift=9
asize=4000799784960
is_log=0
children[0]
type='disk'
id=0
guid=16144392433229115618
path='/dev/ada4'
whole_disk=0
DTL=341
children[1]
type='disk'
id=1
guid=1210677308003674848
path='/dev/ada5'
whole_disk=0
DTL=340
children[2]
type='disk'
id=2
guid=2517076601231706249
path='/dev/ada6'
whole_disk=0
DTL=339
children[3]
type='disk'
id=3
guid=16621760039941477713
path='/dev/ada7'
whole_disk=0
DTL=338
--------------------------------------------
LABEL 2
--------------------------------------------
version=15
name='tank1'
state=0
txg=44593174
pool_guid=7336939736750289319
hostid=3221266864
hostname='offsite.sentex.ca'
top_guid=6980939370923808328
guid=16144392433229115618
vdev_tree
type='raidz'
id=1
guid=6980939370923808328
nparity=1
metaslab_array=38
metaslab_shift=35
ashift=9
asize=4000799784960
is_log=0
children[0]
type='disk'
id=0
guid=16144392433229115618
path='/dev/ada4'
whole_disk=0
DTL=341
children[1]
type='disk'
id=1
guid=1210677308003674848
path='/dev/ada5'
whole_disk=0
DTL=340
children[2]
type='disk'
id=2
guid=2517076601231706249
path='/dev/ada6'
whole_disk=0
DTL=339
children[3]
type='disk'
id=3
guid=16621760039941477713
path='/dev/ada7'
whole_disk=0
DTL=338
--------------------------------------------
LABEL 3
--------------------------------------------
version=15
name='tank1'
state=0
txg=44592523
pool_guid=7336939736750289319
hostid=3221266864
hostname='offsite.sentex.ca'
top_guid=6980939370923808328
guid=16144392433229115618
vdev_tree
type='raidz'
id=1
guid=6980939370923808328
nparity=1
metaslab_array=38
metaslab_shift=35
ashift=9
asize=4000799784960
is_log=0
children[0]
type='disk'
id=0
guid=16144392433229115618
path='/dev/ada4'
whole_disk=0
DTL=341
children[1]
type='disk'
id=1
guid=1210677308003674848
path='/dev/ada5'
whole_disk=0
DTL=340
children[2]
type='disk'
id=2
guid=2517076601231706249
path='/dev/ada6'
whole_disk=0
DTL=339
children[3]
type='disk'
id=3
guid=16621760039941477713
path='/dev/ada7'
whole_disk=0
DTL=338
0(offsite)#
---Mike
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss