Chris,

I might be able to help you recover the pool, but I will need access to your
system. If you think this is possible, just ping me off-list and let me know.

Thanks,
George


On Sun, Feb 6, 2011 at 4:56 PM, Chris Forgeron <cforge...@acsi.ca> wrote:

> Hello all,
>
>  Long time reader, first time poster.
>
>
>
> I’m on day two of a rather long struggle with ZFS and my data. It seems we
> have a difference of opinion – ZFS doesn’t think I have any, and I’m pretty
> sure I saw a crapload of it just the other day.
>
>
>
> I’ve been researching and following various bits of information that I’ve
> found from so many helpful people on this list, but I’m running into a
> slightly different problem than the rest of you;
>
>
>
> My zdb doesn’t seem to recognize the pool for any command other than
> zdb -e <pool>
>
>
>
> I think my problem is a corrupt set of uberblocks, and if I could go back
> in time a bit, everything would be rosy. But how do you do that when zdb
> doesn’t give you the output that you need?
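>
> (To be clear about what I mean: the kind of output I’m after is what you’d
> normally get from label and uberblock dumps along these lines, with the
> device name only as an example:
>
> # dump the vdev labels from one member disk, and the active uberblock
> # for the exported pool
> zdb -l /dev/rdsk/c9t0d0s0
> zdb -e -uuu 13666181038508963033
>
> The -l form does work for me, as you’ll see below, but the pool-level
> commands don’t get that far.)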
>
>
>
>
>
> Let’s start at the beginning, as this will be a rather long post. Hopefully
> it will be of use to others in similar situations.
>
>
>
> I was running Solaris Express 11, keeping my pool at v28 so I could
> occasionally switch back into FreeBSD-9-Current for tests, comparisons, etc.
>
>
>
> I’ve built a rather large raidz pool comprising 25 1.5 TB drives, organized
> as a stripe of five 5-drive raidz1 vdevs.
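>
> For reference, a pool of this shape would have been created with something
> along these lines (reconstructed for illustration from the status output
> further down, not pulled from my command history):
>
> # 5 x 5-disk raidz1, plus two log devices and a hot spare
> zpool create tank \
>     raidz c9t0d0 c9t0d1 c9t0d2 c9t0d3 c9t0d4 \
>     raidz c9t1d0 c9t1d1 c9t1d2 c9t1d3 c9t1d4 \
>     raidz c9t2d0 c9t2d1 c9t2d2 c9t2d3 c9t2d4 \
>     raidz c9t3d0 c9t3d1 c9t3d2 c9t3d3 c9t3d4 \
>     raidz c9t4d0 c9t4d1 c9t4d2 c9t4d3 c9t4d4 \
>     log c9t14d0p0 c9t14d1p0 \
>     spare c9t15d1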
>
>
>
> Friday night, one of the 1.5 TB drives faulted, and the resilvering process
> started onto the spare 1.5 TB drive. All was normal.
>
>
>
> In the morning, the resilver was around 86% complete when I started working
> on the CIFS functionality of Solaris – I wanted to take its authentication
> from Workgroup to Domain mode, so I was following the procedure for that,
> setting up krb5.conf, etc. I also changed the hostname at this point to
> better label the system.
>
>
>
> I had rebooted once during this, and everything came back up fine. The
> drive was still resilvering. I then went for a second reboot, and when the
> system came back up, I was shocked to see my pool was in a faulted state.
>
>
>
> Here’s a zpool status output from that fateful moment:
>
>
>
> -=-=-=-=-
>
>   pool: tank
> state: FAULTED
> status: The pool metadata is corrupted and the pool cannot be opened.
> action: Destroy and re-create the pool from
>         a backup source.
>    see: http://www.sun.com/msg/ZFS-8000-72
> scan: none requested
> config:
>
>         NAME             STATE     READ WRITE CKSUM
>         tank             FAULTED      0     0     1  corrupted data
>           raidz1-0       ONLINE       0     0     2
>             c9t0d0       ONLINE       0     0     0
>             c9t0d1       ONLINE       0     0     0
>             c9t0d2       ONLINE       0     0     0
>             c9t0d3       ONLINE       0     0     0
>             c9t0d4       ONLINE       0     0     0
>           raidz1-1       ONLINE       0     0     0
>             c9t1d0       ONLINE       0     0     0
>             c9t1d1       ONLINE       0     0     0
>             c9t1d2       ONLINE       0     0     0
>             c9t1d3       ONLINE       0     0     0
>             c9t1d4       ONLINE       0     0     0
>           raidz1-2       ONLINE       0     0     0
>             c9t2d0       ONLINE       0     0     0
>             c9t2d1       ONLINE       0     0     0
>             c9t2d2       ONLINE       0     0     0
>             c9t2d3       ONLINE       0     0     0
>             c9t2d4       ONLINE       0     0     0
>           raidz1-3       ONLINE       0     0     2
>             c9t3d0       ONLINE       0     0     0
>             c9t3d1       ONLINE       0     0     0
>             c9t3d2       ONLINE       0     0     0
>             c9t3d3       ONLINE       0     0     0
>             c9t3d4       ONLINE       0     0     0
>           raidz1-6       ONLINE       0     0     2
>             c9t4d0       ONLINE       0     0     0
>             c9t4d1       ONLINE       0     0     0
>             c9t4d2       ONLINE       0     0     0
>             c9t4d3       ONLINE       0     0     0
>             replacing-4  ONLINE       0     0     0
>               c9t4d4     ONLINE       0     0     0
>               c9t15d1    ONLINE       0     0     0
>         logs
>           c9t14d0p0      ONLINE       0     0     0
>           c9t14d1p0      ONLINE       0     0     0
>
> -=-=-=-=-=-
>
>
>
> After a “holy crap”, and a check of /tank to see if it really was gone, I
> executed a zpool export and then a zpool import.
>
>
>
> (Notice the CKSUM count of 2 on the raidz1-3 vdev, as well as on raidz1-6.)
>
>
>
> The export worked fine, but the import failed with an I/O error.
>
>
>
> At this stage, I thought it was something stupid with the resilver being
> jammed, and since I had 4 out of 5 drives functional in my raidz1-6 vdev, I
> figured I’d just remove those two drives and try my import again. Still no
> go.
>
>
>
>
>
> At this stage, I started keeping a log, so I could more accurately record
> what steps I was taking and what results I was seeing. The log becomes
> more detailed as the gravity of the situation sinks in.
>
>
>
> There are copious reboots here at times to make sure I have a solid system.
>
>
>
>
> About the pool: it’s big, I’d say 10-12 TB in size, with about 12 ZFS
> filesystems, some compression, some dedup, etc. It really was a nice setup
> when it was working.
>
>
>
> I have backups of my most important stuff, but I don’t have backups of my
> more recent work from the last week to three months, and I also have a few
> TB of media that is not backed up – so I’d really like to get this back. My
> secondary backup server was still in the process of being built, so backing
> up was painful, and thus not done often enough. I know.. I know.. and the
> funny thing is, from day 1 I decided I needed two SANs so the data would
> exist on two completely different pieces of hardware to protect against big
> whoopsies like this.
>
>
>
> Here’s what I started doing:
>
>
>
> Tried: time zpool import -fFX -o ro -o failmode=continue -R /mnt
> 13666181038508963033
>
> Result (20 min later): cannot import 'tank': one or more devices is
> currently unavailable
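>
> One variant I haven’t tried yet is the dry-run form of the recovery import,
> which is supposed to report whether discarding the most recent transactions
> would make the pool importable, without actually changing anything:
>
> # -n with -F: check whether a rewind/recovery import would succeed,
> # but don't actually perform it
> zpool import -nF 13666181038508963033
>
> If that reports success, the real -F import should in theory be able to roll
> back to a good txg.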
>
>
>
> I found out about the -V option for zpool import by reading the source, and
> decided to try it:
>
>
>
> Tried: zpool import -f -V 13666181038508963033
>
> Result: it worked, right away, but I was back where I started: a faulted
> pool that wouldn’t mount.
>
>
>
> Tried: zpool clear -F tank
>
> Result: cannot clear errors for tank: I/O error
>
>
>
> There was more playing around with the same command, always with the same
> results.
>
>
>
> I exported again, as I felt the key was in the import.
>
>
>
> I read about the importance of labels on the drives. I used zdb -l to check
> all my drives. What I found here was interesting – since I had been doing so
> many tests with ZFS over the last few months, I had many labels on the
> drives, some at, say, c9t4d0 and others at c9t4d0s0. You will notice that my
> zpool status output shows the drives as c9tXdX without the s0 slice.
>
>
>
> BUT – both drives c9t4d4 and c9t15d1 (the bad drive and the replacement
> drive) lacked any labels at that level – the correct labels were at c9t4d4s0
> and c9t15d1s0! Checking all the drives, I also found that an OLD zpool
> called “tank” was on the base of the drive, and my new zpool called tank was
> on the s0 part of the drive. The guids were different, so Solaris shouldn’t
> be confusing them.
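>
> In other words, for each drive I compared the whole-disk device against the
> s0 slice, along these lines (the guid and pool name in each label are what
> tell the old and new pools apart):
>
> # p0 is the whole disk on x86 Solaris; s0 is the slice the current pool uses
> zdb -l /dev/rdsk/c9t0d0p0
> zdb -l /dev/rdsk/c9t0d0s0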
>
>
>
> Now, I’ve done exports and imports like this before, and Solaris / FreeBSD
> always figured things out just fine. After reading about how this could
> cause problems ( http://opensolaris.org/jive/thread.jspa?threadID=104654 ) I
> decided to make sure it wasn’t causing an issue here.
>
>
>
> I followed the instructions, particularly this part:
>
>
>
>
>
> mkdir /mytempdev
> cd /mytempdev
> for i in /dev/rdsk/c[67]d*s* ; do
>     ln -s $i
> done
> zpool import -d /mytempdev
>
>
>
>
>
> This creates a new directory, which I made sure was populated only with the
> drives that I knew were in this pool, and only with the cXtXdXs0
> designation.
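>
> In my case the symlink loop ended up looking more like this (adapted to my
> c9 controller; the exact glob isn’t critical as long as only the right
> slices end up in the directory):
>
> mkdir /mytempdev
> cd /mytempdev
> # only the s0 slices of the 25 pool members, plus the replacement drive
> for i in /dev/rdsk/c9t[0-4]d[0-4]s0 /dev/rdsk/c9t15d1s0 ; do
>     ln -s $i
> done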
>
>
>
> Tried: time zpool import -d /mytempdev -fFX -o ro -o failmode=continue -R
> /mnt 13666181038508963033
>
> Result (1 min): The devices below are missing, use '-m' to import the pool
> anyway:
>             c9t14d0p0 [log]
>             c9t14d1p0 [log]
>
>
>
> Okay.. so that’s a bit of progress. I did what it said.
>
>
>
> Tried: time zpool import -m -d /mytempdev -fFX -o ro -o failmode=continue
> -R /mnt 13666181038508963033
>
> Result (1 min): cannot import 'tank': one or more devices is currently
> unavailable
>
>
>
>
>
> Then I downloaded Victor's dtrace script (
> http://markmail.org/search/?q=more+ZFS+recovery#query:more%20ZFS%20recovery+page:1+mid:6l2zrn36qis6rydv+state:results
> )
>
> and then executed these commands:
>
>
>
> Tried: dtrace -s ./zpool.d -c "zpool import -m -d /mytempdev -fFX -o ro -o
> failmode=continue -R /mnt 13666181038508963033" > dtrace.dump
>
> Results (6 min): a huge dump to dtrace.dump, about 12 GB – I couldn’t open
> it in vim or compress it with bzip2. I decided to abandon that, as it’s so
> large I doubt I could do much with it.
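>
> For what it’s worth, carving it up would probably be more practical than
> trying to load the whole thing, e.g.:
>
> # sample the start and end of the trace, and split the rest into pieces
> # small enough to search through
> head -n 500 dtrace.dump
> tail -n 500 dtrace.dump
> split -b 500m dtrace.dump dtrace.part.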
>
>
>
>
>
> Tried: zdb -e -dddd 13666181038508963033
>
> Results: zdb: can't open 'tank': I/O error
>
>
>
> Tried: zdb -e 13666181038508963033
>
> Results: starts listing labels, then it tanks with an I/O error. It lists
> all the vdevs.
>
>
>
> This is about the only zdb command that works for me.
>
>
>
> The only other one that at least tries something is:
>
>
>
> Tried: zdb -U -uuuv 13666181038508963033
>
> Result: Assertion failed: thr_create(0, 0, (void *(*)(void *))func, arg,
> THR_DETACHED, &tid) == 0, file ../common/kernel.c, line 73, function
> zk_thread_create
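>
> I suspect part of the trouble with that one is my syntax: as far as I can
> tell, -U expects a cachefile path as its argument, so -uuuv probably got
> swallowed as the path. Something like this is probably closer to the
> intended form (untested):
>
> # point -U at an alternate (empty) cachefile and name the exported pool
> zdb -U /dev/null -e -uuuv 13666181038508963033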
>
>
>
> So I then typed zpool export tank, then I deleted the /mytempdev/c9t4d4s0
> and c9t15d1s0 dev links, and will try my import again. The thinking here is
> that the drives involved in the resilvering may somehow be bad.
>
>
>
> Tried: time zpool import -m -d /mytempdev -fFX -o ro -o failmode=continue
> -R /mnt 13666181038508963033
>
> Result: cannot import 'tank': one or more devices is currently unavailable
>
>
>
> So it doesn’t like not having enough vdevs – which is fair.
>
>
>
> I gave it two blank drives to act as my two resilvering drives, put the
> links back in /mytempdev and tried my import again. Same problems, nothing
> different.
>
>
>
> Here’s what my pool looks like at this stage:
>
>
>
>
>
>   pool: tank
> state: FAULTED
> status: The pool metadata is corrupted and the pool cannot be opened.
> action: Destroy and re-create the pool from
>         a backup source.
>    see: http://www.sun.com/msg/ZFS-8000-72
> scan: none requested
> config:
>
>         NAME                        STATE     READ WRITE CKSUM
>         tank                        FAULTED      0     0     1  corrupted data
>           raidz1-0                  ONLINE       0     0     2
>             /mytempdev/c9t0d0s0     ONLINE       0     0     0
>             /mytempdev/c9t0d1s0     ONLINE       0     0     0
>             /mytempdev/c9t0d2s0     ONLINE       0     0     0
>             /mytempdev/c9t0d3s0     ONLINE       0     0     0
>             /mytempdev/c9t0d4s0     ONLINE       0     0     0
>           raidz1-1                  ONLINE       0     0     0
>             /mytempdev/c9t1d0s0     ONLINE       0     0     0
>             /mytempdev/c9t1d1s0     ONLINE       0     0     0
>             /mytempdev/c9t1d2s0     ONLINE       0     0     0
>             /mytempdev/c9t1d3s0     ONLINE       0     0     0
>             /mytempdev/c9t1d4s0     ONLINE       0     0     0
>           raidz1-2                  ONLINE       0     0     0
>             /mytempdev/c9t2d0s0     ONLINE       0     0     0
>             /mytempdev/c9t2d1s0     ONLINE       0     0     0
>             /mytempdev/c9t2d2s0     ONLINE       0     0     0
>             /mytempdev/c9t2d3s0     ONLINE       0     0     0
>             /mytempdev/c9t2d4s0     ONLINE       0     0     0
>           raidz1-3                  ONLINE       0     0     2
>             /mytempdev/c9t3d0s0     ONLINE       0     0     0
>             /mytempdev/c9t3d1s0     ONLINE       0     0     0
>             /mytempdev/c9t3d2s0     ONLINE       0     0     0
>             /mytempdev/c9t3d3s0     ONLINE       0     0     0
>             /mytempdev/c9t3d4s0     ONLINE       0     0     0
>           missing-4                 ONLINE       0     0     0
>           missing-5                 ONLINE       0     0     0
>           raidz1-6                  ONLINE       0     0     2
>             /mytempdev/c9t4d0s0     ONLINE       0     0     0
>             /mytempdev/c9t4d1s0     ONLINE       0     0     0
>             /mytempdev/c9t4d2s0     ONLINE       0     0     0
>             /mytempdev/c9t4d3s0     ONLINE       0     0     0
>             replacing-4             ONLINE       0     0     0
>               /mytempdev/c9t4d4s0   ONLINE       0     0     0
>               /mytempdev/c9t15d1s0  ONLINE       0     0     0
>
>
>
> (Hey! Look, I just noticed: it says missing-4 and missing-5. These never
> existed. The reason it jumps to raidz1-6, as far as I know, is that raidz1-6
> is the start of a new backplane. The other raidz1-[0-3] are on a different
> backplane. I wonder if ZFS suddenly thinks we need those?)
>
>
>
> Tried: zdb -e -dddd 13666181038508963033
>
> Result: (same as before) can't open 'tank': I/O error
>
>
>
> Tried: zpool history tank
>
> Result: no output
>
>
>
> Tried:  zdb -U -lv 13666181038508963033
>
> Result:  zdb: can't open '13666181038508963033': No such file or directory
>
>
>
> Tried: zdb -e 13666181038508963033
>
> Result: lists all the vdevs, then gets an I/O error at the end of c9t15d1s0
>
>
>
>
>
> It thinks we have 7 vdev children, and it lists 7 (0-6)
>
>
>
>
>
> Tried: time zpool import -V -m -d /mytempdev -fFX -o ro -o
> failmode=continue -R /mnt 13666181038508963033
>
> Result: it works; it took 25 min, and all the vdevs are the proper
> /mytempdev devices, not other ones.
>
>
>
> Tried: zpool clear -F tank
>
> Result: cannot clear errors for tank: I/O error
>
>
>
> zdb itself does work; my commands run against “rpool” come back properly. Look:
>
>
>
> solaris:/# zdb -R tank 0:11600:200
> zdb: can't open 'tank': No such file or directory
>
> solaris:/# zdb -R rpool 0:11600:200
> Found vdev: /dev/dsk/c8d0s0
> DVA[0]=<0:11600:200:STD:1> [L0 unallocated] off uncompressed LE contiguous
> unique unencrypted 1-copy size=200L/200P birth=4L/4P fill=0 cksum=0:0:0:0
>
>           0 1 2 3 4 5 6 7   8 9 a b c d e f  0123456789abcdef
> 000000:  070c89010c000254  1310050680101c00  T...............
> 000010:  58030001001f0728  060d201528830a07  (......X...(. ..
> 000020:  3bdf081041020c10  0f00cc0c00588256  ...A...;V.X.....
> 000030:  000800003d2fe64b  130016580df44f09  K./=.....O..X...
> 000040:  48b49e8ec3ac74c0  42fc03fcff2f0064  .t.....Hd./....B
> 000050:  42fc42fc42fc42fc  fc42fcff42fc42fc  .B.B.B.B.B.B..B.
> [..snip..]
>
>
>
>
>
> I’ve tried booting into FreeBSD, and I get pretty much the same results as
> I do under Solaris Express. The only difference with FreeBSD seems to be
> that after I import with -V and try a zdb command, the zpool doesn’t exist
> anymore (crashes out?). My advantage in FreeBSD is that I have the source
> code that I can browse through, and possibly edit if I need to.
>
>
>
>
>
>
>
> I’m thinking that if I could try some of the uberblock invalidation tricks,
> I could get somewhere here – but how do I do that when zdb won’t give me the
> information that I need?
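>
> For reference, the sort of thing I mean: each 256 KB vdev label keeps a ring
> of uberblocks in its last 128 KB, tagged with the magic number 0x00bab10c,
> so even with zdb being unhelpful it should be possible to at least locate
> candidate uberblocks with something crude like this (device name is only an
> example):
>
> # dump the first 256 KB label of one member disk and look for the uberblock
> # magic (on a little-endian box od's 8-byte words show it as ...00bab10c)
> dd if=/dev/rdsk/c9t0d0s0 bs=1k count=256 2>/dev/null | od -A d -t x8 | grep 00bab10c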
>
>
>
> Hopefully some kind soul will take pity and dive into this mess with me. :)
>
>
>
>
>
>


-- 
George Wilson


M: +1.770.853.8523
F: +1.650.494.1676
275 Middlefield Road, Suite 50
Menlo Park, CA 94025
http://www.delphix.com


_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
