On Thu, Feb 18, 2010 at 16:03, Ethan <notet...@gmail.com> wrote:
> On Thu, Feb 18, 2010 at 15:31, Daniel Carosone <d...@geek.com.au> wrote:
>
>> On Thu, Feb 18, 2010 at 12:42:58PM -0500, Ethan wrote:
>> > On Thu, Feb 18, 2010 at 04:14, Daniel Carosone <d...@geek.com.au> wrote:
>> > Although I do notice that right now, it imports just fine using the p0
>> > devices using just `zpool import q`, no longer having to use import -d
>> > with the directory of symlinks to p0 devices. I guess this has to do
>> > with having repaired the labels and such? Or whatever it's repaired
>> > having successfully imported and scrubbed.
>>
>> It's the zpool.cache file at work, storing extra copies of labels with
>> corrected device paths. For curiosity's sake, what happens when you
>> remove (rename) your dir with the symlinks?
>
> I'll let you know when the current scrub finishes.
>
>> > After the scrub finished, this is the state of my pool:
>> > /export/home/ethan/qdsk/c9t1d0p0  DEGRADED  4  0  60  too many errors
>>
>> Ick. Note that there are device errors as well as content (checksum)
>> errors, which means it can't only be correctly-copied damage from your
>> original pool that was having problems.
>>
>> zpool clear and rescrub, for starters, and see if they continue.
>
> Doing that now.
>
>> I suggest also:
>> - carefully checking and reseating cables, etc
>> - taking backups now of anything you really wanted out of the pool,
>>   while it's still available.
>> - choosing that disk as the first to replace, and scrubbing again
>>   after replacing onto it, perhaps twice.
>> - doing a dd to overwrite that entire disk with random data and let
>>   it remap bad sectors, before the replace (not just zeros, and not
>>   just the sectors a zfs resilver would hit. openssl enc of /dev/zero
>>   with a lightweight cipher and whatever key; for extra caution read
>>   back and compare with a second openssl stream using the same key)
>> - being generally very watchful and suspicious of that disk in
>>   particular, look at error logs for clues, etc.
>
> Very thorough. I have no idea how to do that with openssl, but I will
> look into learning this.
>
>> - being very happy that zfs deals so well with all this abuse, and
>>   you know your data is ok.
>
> Yes indeed - very happy.
>
>> > I have no idea what happened to the one disk, but "No known data
>> > errors" is what makes me happy. I'm not sure if I should be concerned
>> > about the physical disk itself
>>
>> given that it's reported disk errors as well as damaged content, yes.
>
> Okay. Well, it's a brand-new disk and I can exchange it easily enough.
>
>> > or just assume that some data got screwed up with all this mess. I
>> > guess maybe I'll see how the disk behaves during the replace
>> > operations (restoring to it and then restoring from it four times
>> > seems like a pretty good test of it), and if it continues to error,
>> > replace the physical drive and if necessary restore from the original
>> > truecrypt volumes.
>>
>> Good plan; note the extra scrubs at key points in the process above.
>
> Will do. Thanks for the tip.
>
>> > So, current plan:
>> > - export the pool.
>>
>> shouldn't be needed; zpool offline <dev> would be enough
>>
>> > - format c9t1d0 to have one slice being the entire disk.
>>
>> Might not have been needed, but given Victor's comments about reserved
>> space, you may need to do this manually, yes. Be sure to use EFI
>> labels. Pick the suspect disk first.
>>
>> > - import. should be degraded, missing c9t1d0p0.
>>
>> no need if you didn't export
>>
>> > - replace missing c9t1d0p0 with c9t1d0
>>
>> yup, or if you've manually partitioned you may need to mention the
>> slice number to prevent it repartitioning with the default reserved
>> space again. You may even need to use some other slice (s5 or
>> whatever), but I don't think so.
>>
>> > - wait for resilver.
>> > - repeat with the other four disks.
>>
>> - tell us how it went
>> - drink beer.
>>
>> --
>> Dan.
>
> Okay. Plan is updated to reflect your suggestions. Beer was already in
> the plan, but I forgot to list it. Speaking of which, I see your e-mail
> address is .au, but if you're ever in New York City I'd love to buy you
> a beer as thanks for all your excellent help with this. And anybody else
> in this thread - you guys are awesome.
>
> -Ethan

Update: I'm stuck. Again.
To answer "For curiosity's sake, what happens when you remove (rename)
your dir with the symlinks?": it finds the devices on p0 with no problem,
with the symlinks directory deleted.

After clearing the errors and scrubbing again, no errors were encountered
in the second scrub. Then I offlined the disk which had errors in the
first scrub. I followed the suggestion to thoroughly test the disk (and
remap any bad sectors), filling it with random-looking data by encrypting
/dev/zero. Reading back and decrypting the drive, it all read back as
zeros - all good. I then checked the SMART status of the drive, which had
0 error rates for everything. I ran the several-hour "extended
self-test", whatever that does, after which I had two write errors on one
drive which weren't there before. I believe it's the same drive that had
the zfs errors, but I did the SMART stuff in linux, not being able to
find SMART tools in solaris, and I haven't been able to figure out which
drive is which. Is there a way to get a drive's serial number in solaris?
I could identify it by that.

I scrubbed again with the pool degraded. No errors.

        NAME                     STATE     READ WRITE CKSUM
        q                        ONLINE       0     0     0
          raidz1                 ONLINE       0     0     0
            c9t4d0p0             ONLINE       0     0     0
            c9t5d0p0             ONLINE       0     0     0
            c9t2d0p0             ONLINE       0     0     0
            3763020893739678459  UNAVAIL      0     0     0  was /dev/dsk/c9t1d0p0
            c9t0d0p0             ONLINE       0     0     0

errors: No known data errors

I tried zpool replace on the drive:

# zpool replace q 3763020893739678459 c9t1d0
cannot replace 3763020893739678459 with c9t1d0: device is too small

Victor was right. I went into 'format' and fought with it for a while.
Moving the beginning of slice 0 from block 256 down to block 34 was
simple enough, but I cannot figure out how to tell it I don't want 8MB in
slice 8. Is it even possible? I haven't got 8MB to spare (as silly as
that sounds for a 1.5TB drive) - if I can't get rid of slice 8, I may
have to stick with using p0's. I haven't encountered a problem using them
so far (who needs partition tables anyway?) but I figured I'd ask if
anybody had ideas about getting back that space.

What's the 8MB for, anyway? Some stuff seems to indicate that it has to
do with booting the drive, but this will never be a boot drive. That
seems to be for VTOC stuff, not EFI, though. I did look at switching to
VTOC labels, but it seems they don't support disks as large as I am
using, so I think that's out. I also see "Information that was stored in
the alternate cylinders area, the last two cylinders of the disk, is now
stored in slice 8."
(http://docsun.cites.uiuc.edu/sun_docs/C/solaris_9/SUNWaadm/SYSADV1/p117.html)
Not sure what an "alternate cylinders area" is - it sounds sort of like
remapping bad sectors, but that's something that the disk does on its
own.

So, can I get the 8MB back? Should I use p0? Is there another option I'm
not thinking of? (I could always try diving into the EFI label with a hex
editor and set it the way I please with no silly slice 8)

-Ethan
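P.S. For the archives, the fill-and-verify step ended up looking roughly
like the following. This is a sketch, not my exact command line: the
passphrase is a placeholder, I've pointed it at a small scratch file
instead of the raw device (on the real run you'd use the whole disk,
e.g. the p0 device), and the cipher here is just any fast stream cipher
openssl happens to offer.

```shell
# Placeholders: a 1 MiB scratch file stands in for the raw disk
# (on the real run, something like /dev/rdsk/c9t1d0p0), and the
# passphrase is made up.
DEV=disk.img
PASS=notmyrealkey
BLOCKS=1024   # 1024 x 1 KiB = 1 MiB for the demo

# Fill: encrypt a stream of zeros with a stream cipher, so the data
# written looks random but is reproducible from the same key.
dd if=/dev/zero bs=1024 count=$BLOCKS 2>/dev/null |
  openssl enc -aes-128-ctr -pass pass:$PASS -nosalt > "$DEV"

# Verify: regenerate the identical stream and compare it against what
# the disk actually stored; any mismatch means bad sectors.
dd if=/dev/zero bs=1024 count=$BLOCKS 2>/dev/null |
  openssl enc -aes-128-ctr -pass pass:$PASS -nosalt |
  cmp -s - "$DEV" && echo "match" || echo "MISMATCH"
```

(Re: my serial-number question above, it looks like plain `iostat -En`
on solaris prints a `Serial No:` field per device, which should let me
match drives against what linux's SMART tools report - haven't confirmed
yet.)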
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss