On Thu, Feb 18, 2010 at 15:31, Daniel Carosone <d...@geek.com.au> wrote:

> On Thu, Feb 18, 2010 at 12:42:58PM -0500, Ethan wrote:
> > On Thu, Feb 18, 2010 at 04:14, Daniel Carosone <d...@geek.com.au> wrote:
> > Although I do notice that right now, it imports just fine using the p0
> > devices using just `zpool import q`, no longer having to use import -d
> > with the directory of symlinks to p0 devices. I guess this has to do
> > with having repaired the labels and such? Or with whatever it repaired
> > after having successfully imported and scrubbed.
>
> It's the zpool.cache file at work, storing extra copies of labels with
> corrected device paths.  For curiosity's sake, what happens when you
> remove (rename) your dir with the symlinks?
>

I'll let you know when the current scrub finishes.


>
> > After the scrub finished, this is the state of my pool:
> >             /export/home/ethan/qdsk/c9t1d0p0  DEGRADED     4     0    60  too many errors
>
> Ick.  Note that there are device errors as well as content (checksum)
> errors, which means it can't only be correctly-copied damage from
> your original pool that was having problems.
>
> zpool clear and rescrub, for starters, and see if they continue.
>

Doing that now.
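
For reading the status output after the clear and rescrub: the three numeric columns after the state are READ, WRITE and CKSUM, so non-zero values in the first two are the device errors and the last is the checksum errors. A minimal sketch of a filter that lists only the devices with non-zero counters follows. The helper name `flag_errors` is made up for illustration, and it assumes the usual `zpool status` column layout.

```shell
# Hypothetical helper: print every vdev line of `zpool status` output
# (read from stdin) whose READ, WRITE or CKSUM counter is non-zero.
# Assumes the usual column order: NAME STATE READ WRITE CKSUM.
flag_errors() {
    awk '
        # Device lines carry three counters in fields 3 to 5; large counts
        # may be abbreviated (e.g. 1.2K), so allow a unit suffix.
        NF >= 5 &&
        $3 ~ /^[0-9.]+[KMGT]?$/ &&
        $4 ~ /^[0-9.]+[KMGT]?$/ &&
        $5 ~ /^[0-9.]+[KMGT]?$/ {
            if ($3 != "0" || $4 != "0" || $5 != "0")
                printf "%s read=%s write=%s cksum=%s\n", $1, $3, $4, $5
        }
    '
}

# Intended use on the pool named q (not run here):
#   zpool clear q
#   zpool scrub q
#   zpool status -v q | flag_errors
```

If the counters on the same device climb again after the clear, that points at the disk or its cabling rather than at damage copied over from the old pool.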


>
> I suggest also:
>  - carefully checking and reseating cables, etc
>  - taking backups now of anything you really wanted out of the pool,
>   while it's still available.
>  - choosing that disk as the first to replace, and scrubbing again
>   after replacing onto it, perhaps twice.
>  - doing a dd to overwrite that entire disk with random data and let
>   it remap bad sectors, before the replace (not just zeros, and not
>   just the sectors a zfs resilver would hit. openssl enc of /dev/zero
>   with a lightweight cipher and whatever key; for extra caution read
>   back and compare with a second openssl stream using the same key)
>  - being generally very watchful and suspicious of that disk in
>   particular, look at error logs for clues, etc.
>

Very thorough. I have no idea how to do that with openssl, but I will look
into learning this.
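
A rough sketch of how the openssl suggestion might look, for anyone else following along. Everything here is an assumption rather than Dan's exact recipe: the function names, the placeholder key and IV, the choice of `-aes-128-ctr` (which needs an OpenSSL build that offers it), a `head` that supports `-c`, and comparing SHA-256 digests on read-back instead of a byte-for-byte compare.

```shell
# Sketch only. KEY and IV are arbitrary placeholder hex values; any
# 128-bit values work as long as the write and the read-back use the same.
KEY=00112233445566778899aabbccddeeff
IV=00000000000000000000000000000000

# Emit exactly $1 bytes of reproducible pseudo-random data:
# the AES-128-CTR keystream, obtained by encrypting /dev/zero.
prng_stream() {
    openssl enc -aes-128-ctr -K "$KEY" -iv "$IV" -in /dev/zero 2>/dev/null |
        head -c "$1"
}

# Overwrite the first $2 bytes of target $1 with the stream.
fill_pseudorandom() {
    prng_stream "$2" | dd of="$1" bs=1024k conv=notrunc 2>/dev/null
}

# Read the same $2 bytes back and compare digests with a second,
# identically keyed stream. Returns 0 only if they match.
verify_pseudorandom() {
    want=$(prng_stream "$2" | openssl dgst -sha256)
    got=$(head -c "$2" "$1" | openssl dgst -sha256)
    [ "$want" = "$got" ]
}
```

On the real disk this would be called with the raw device as the target and the disk's capacity in bytes as the size, e.g. `fill_pseudorandom /dev/rdsk/c9t1d0p0 "$SIZE" && verify_pseudorandom /dev/rdsk/c9t1d0p0 "$SIZE"` (the device path is a guess from the names in this thread). It destroys everything on the target, so the disk has to be out of the pool first, and it is worth trying on a scratch file beforehand.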


>  - being very happy that zfs deals so well with all this abuse, and
>   you know your data is ok.
>

Yes indeed - very happy.


>
> > I have no idea what happened to the one disk, but "No known data errors"
> > is what makes me happy. I'm not sure if I should be concerned about the
> > physical disk itself
>
> given that it's reported disk errors as well as damaged content, yes.
>

Okay. Well, it's a brand-new disk and I can exchange it easily enough.


>
> > or just assume that some data got screwed up with all
> > this mess. I guess maybe I'll see how the disk behaves during the replace
> > operations (restoring to it and then restoring from it four times seems
> > like a pretty good test of it), and if it continues to error, replace the
> > physical drive and if necessary restore from the original truecrypt
> > volumes.
>
> Good plan; note the extra scrubs at key points in the process above.
>

Will do. Thanks for the tip.


>
> > So, current plan:
> > - export the pool.
>
> shouldn't be needed; zpool offline <dev> would be enough
>
> > - format c9t1d0 to have one slice being the entire disk.
>
> Might not have been needed, but given Victor's comments about reserved
> space, you may need to do this manually, yes.  Be sure to use EFI
> labels.  Pick the suspect disk first.
>
> > - import. should be degraded, missing c9t1d0p0.
>
> no need if you didn't export
>
> > - replace missing c9t1d0p0 with c9t1d0
>
> yup, or if you've manually partitioned you may need to mention the
> slice number to prevent it repartitioning with the default reserved
> space again. You may even need to use some other slice (s5 or
> whatever), but I don't think so.
>
> > - wait for resilver.
> > - repeat with the other four disks.
>
>  - tell us how it went
>  - drink beer.
>
> --
> Dan.
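
The per-disk steps above can be sketched as a dry run that only prints the commands, so the sequence can be checked before anything touches the pool. The function name `plan_replace` is made up for illustration. The old device has to be given exactly as `zpool status` shows it (which may be the full symlink path), and the new device may need an explicit slice, as Dan notes.

```shell
# Dry run: print (do not execute) the commands for swapping one
# p0-based vdev for its whole-disk equivalent in pool $1.
# $2 = vdev as currently shown by zpool status, $3 = replacement device.
plan_replace() {
    pool=$1 old=$2 new=$3
    echo "zpool offline $pool $old"
    echo "# relabel $new with format (EFI label, one slice spanning the disk) if needed"
    echo "zpool replace $pool $old $new"
    echo "zpool status $pool    # repeat until the resilver completes"
    echo "zpool scrub $pool"
}

# Example for the suspect disk (prints five lines, runs nothing):
#   plan_replace q c9t1d0p0 c9t1d0
```

Running it once per disk, suspect disk first, gives a checklist for all five replacements, with the extra scrub after each one.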


Okay. Plan is updated to reflect your suggestions. Beer was already in the
plan, but I forgot to list it. Speaking of which, I see your e-mail address
is .au, but if you're ever in New York City I'd love to buy you a beer as
thanks for all your excellent help with this. And anybody else in this
thread - you guys are awesome.

-Ethan