Victor Latushkin wrote:
Liam Slusser wrote:
Long story short, my cat jumped on my server at my house, crashing two
drives at the same time. It was a 7-drive raidz (next time I'll do
raidz2).
Long story short - we've been able to get access to the data in the pool.
This involved finding a better older state with the help of 'zdb -t', then
verifying metadata checksums with 'zdb -eubbcsL', then extracting the
configuration from the pool, making a cache file from the extracted
configuration, and finally importing the pool (read-only at the moment) to
back up the data.
As soon as it is backed up, we'll try to do a read-write import...
Update - after copying all the data from the pool, we've been able to
import the pool read-write without any issues, and a scrub discovered only 2
or 3 files with checksum errors.
victor
The server crashed complaining about a drive failure, so I rebooted
into single-user mode, not realizing that two drives had failed. I put in
a new 500 GB replacement and had ZFS start a replace operation, which
failed at about 2% because there were two broken drives. At that
point I turned off the computer and sent both drives to a data
recovery place. They were able to recover the data on one of the two
drives (the one that I started the replace operation on) - great -
that should be enough to get my data back.
I popped the newly recovered drive back in. It had an older txg number
than the other drives, so I made a backup of each drive and then
modified the txg number to an earlier txg number so they all match.
However, I am still unable to mount the array - I'm getting the
following error (it doesn't matter if I use -f or -F):
bash-3.2# zpool import data
  pool: data
    id: 6962146434836213226
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
        devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-6X
config:

        data            UNAVAIL   missing device
          raidz1        DEGRADED
            c0t0d0      ONLINE
            c0t1d0      ONLINE
            replacing   ONLINE
              c0t2d0    ONLINE
              c0t7d0    ONLINE
            c0t3d0      UNAVAIL   cannot open
            c0t4d0      ONLINE
            c0t5d0      ONLINE
            c0t6d0      ONLINE

        Additional devices are known to be part of this pool, though their
        exact configuration cannot be determined.
Now I should have enough online devices to mount and get my data off,
but no luck. I'm not really sure where to go at this point.
Do I have to fake a c0t3d0 drive so it thinks all drives are there?
Can somebody point me in the right direction?
thanks,
liam
P.S. To help me find which uberblocks to modify to reset the txg, I
wrote a little Perl program that finds uberblocks and prints out the
information needed to revert to an earlier txg value.
It's a little messy since I wrote it quickly, super late at night - but
maybe it will help somebody else out.
http://liam821.com/findUberBlock.txt (it's just a Perl script)
It's easy to run. It pulls in 256k of data (after skipping X kbytes if
you use -s ###), sorts it, and then searches for uberblocks.
(Remember there are 4 labels: at 0, at 256k, and two at the end of the
disk. You need to manually figure out the end skip value...)
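For the "end skip value", here is a rough sketch of how the four label offsets could be computed. It assumes the usual on-disk layout (256 KiB labels, two at the front and two at the end of the device, with the device size rounded down to a multiple of 256 KiB) and that -s expects the label start in kbytes, as the -s 256 example further down suggests. The function names and the 500 GB size are made up for illustration, and the size has to be that of the slice or partition ZFS actually uses.

```python
LABEL_SIZE = 256 * 1024  # assumed size of one vdev label, in bytes


def label_offsets(device_size):
    """Return the byte offsets of labels L0..L3 for a device of device_size bytes."""
    # Assumption: the usable size is rounded down to a multiple of the label size.
    usable = device_size - (device_size % LABEL_SIZE)
    return [0, LABEL_SIZE, usable - 2 * LABEL_SIZE, usable - LABEL_SIZE]


def skip_values_kb(device_size):
    """Same offsets expressed in kbytes, the unit the -s option appears to use."""
    return [offset // 1024 for offset in label_offsets(device_size)]


if __name__ == "__main__":
    size = 500 * 10**9  # hypothetical 500 GB device, not a value from this thread
    for index, kb in enumerate(skip_values_kb(size)):
        print(f"label L{index}: -s {kb}")
```

The first two values are always 0 and 256; only the last two depend on the device size.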
Calculating the GUID seems to always fail because the number is too
large for Perl, so it returns a negative number. Meh, it wasn't important
enough to try to figure out.
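The negative number is most likely just the unsigned 64-bit value being read as signed. A minimal sketch of the conversion, shown in Python rather than in the Perl script itself (the helper names are made up); the sample value is the guid_sum from the verbose output in this message:

```python
MASK64 = (1 << 64) - 1


def as_unsigned64(value):
    """Reinterpret a (possibly negative) integer as an unsigned 64-bit value."""
    return value & MASK64


def as_signed64(value):
    """Reinterpret the low 64 bits of an integer as a signed 64-bit value."""
    value &= MASK64
    return value - (1 << 64) if value >= (1 << 63) else value


if __name__ == "__main__":
    signed_guid_sum = -14861410676147539
    print(hex(as_unsigned64(signed_guid_sum)))  # 0xffcb33a02fc97aad
```

The hex result matches the raw words 7aad 2fc9 33a0 ffcb in the dump once they are read as a little-endian 64-bit number.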
(The info below has NOTHING to do with my disk problem above; it's from a
happy and healthy server that I wrote the tool on.)
- find newest txg number
bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -n
block=148 (0025000) transaction=15980419
- print verbose output
bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -n -v
block=148 (0025000)
zfs_ver=3 (0003 0000 0000 0000)
transaction=15980419 (d783 00f3 0000 0000)
guid_sum=-14861410676147539 (7aad 2fc9 33a0 ffcb)
timestamp=1253958103 (e1d7 4abd 0000 0000)
(Sat Sep 26 02:41:43 2009)
raw = 0025000 b10c 00ba 0000 0000 0003 0000 0000 0000
0025010 d783 00f3 0000 0000 7aad 2fc9 33a0 ffcb
0025020 e1d7 4abd 0000 0000 0001 0000 0000 0000
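To show how the verbose fields map onto the raw dump, here is a small sketch that decodes the same bytes. It assumes the dump shows little-endian 16-bit words (as od -x prints them on x86) and that an uberblock starts with five 64-bit fields: magic, version, txg, guid_sum and timestamp. The function names are invented for the example; this is not how the Perl script is written.

```python
import struct

UBERBLOCK_MAGIC = 0x00BAB10C  # magic number visible at the start of the raw dump


def od_words_to_bytes(words):
    """Turn 16-bit hex words, as printed by a little-endian od -x, into raw bytes."""
    return b"".join(struct.pack("<H", int(word, 16)) for word in words)


def parse_uberblock(raw):
    """Decode the first five little-endian 64-bit fields of an uberblock."""
    magic, version, txg, guid_sum, timestamp = struct.unpack_from("<5Q", raw)
    if magic != UBERBLOCK_MAGIC:
        raise ValueError("no little-endian uberblock magic at this offset")
    return {"version": version, "txg": txg,
            "guid_sum": guid_sum, "timestamp": timestamp}


if __name__ == "__main__":
    # The three raw lines from the verbose output, without the offsets.
    words = ("b10c 00ba 0000 0000 0003 0000 0000 0000 "
             "d783 00f3 0000 0000 7aad 2fc9 33a0 ffcb "
             "e1d7 4abd 0000 0000 0001 0000 0000 0000").split()
    print(parse_uberblock(od_words_to_bytes(words)))
```

Unpacking with the unsigned "Q" format also gives guid_sum as a positive number.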
- list all uberblocks
bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -l
block=145 (0024400) transaction=15980288
block=146 (0024800) transaction=15980289
block=147 (0024c00) transaction=15980290
block=148 (0025000) transaction=15980291
block=149 (0025400) transaction=15980292
block=150 (0025800) transaction=15980293
block=151 (0025c00) transaction=15980294
block=152 (0026000) transaction=15980295
block=153 (0026400) transaction=15980296
block=154 (0026800) transaction=15980297
block=155 (0026c00) transaction=15980298
block=156 (0027000) transaction=15980299
block=157 (0027400) transaction=15980300
block=158 (0027800) transaction=15980301
.
.
.
- skip 256k into the disk and find the newest uberblock
bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -n -s 256
block=507 (7ec00) transaction=15980522
Now let's say I want to go back in time on this; the program can
help me do that. If I wanted to go back in time to txg 15980450...
bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -t 15980450
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=180 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=181 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=182 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=183 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=184 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=185 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=186 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=187 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=188 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=189 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=190 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=191 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=192 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=193 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=194 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=195 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=196 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=197 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=198 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=199 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=200 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=201 count=1 conv=notrunc
It prints out the dd commands you would use to do it. It won't
run them for you!
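The idea behind that output can be sketched roughly like this, assuming the rule is simply "zero every 1k uberblock slot whose txg is newer than the target". The function name and the sample (block, txg) pairs are made up for illustration; like the script, it only builds the command strings and runs nothing.

```python
def rollback_commands(uberblocks, target_txg, device):
    """Build dd commands that zero every uberblock newer than target_txg.

    uberblocks is an iterable of (block_kb, txg) pairs, where block_kb is the
    offset of the 1k uberblock slot from the start of the device, in kbytes.
    """
    return [
        f"dd if=/dev/zero of={device} bs=1k oseek={block} count=1 conv=notrunc"
        for block, txg in sorted(uberblocks)
        if txg > target_txg
    ]


if __name__ == "__main__":
    # Hypothetical slots and txg values, not taken from a real disk.
    found = [(179, 15980450), (180, 15980451), (181, 15980452)]
    for command in rollback_commands(found, 15980450, "/dev/dsk/c0t1d0"):
        print(command)
```

Presumably the same has to be repeated for each of the four labels on each disk, and only after backing up the disks first.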
Anyway, maybe it will help somebody out sometime...
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss