On Tue, 10 Aug 2010, seth keith wrote:
First off, I don't have the exact failure messages here, and I didn't take good
notes of the failures, so I'll do the best I can. Please try to give me
advice anyway.
I have a 7-drive raidz1 pool with 500 GB drives, and I wanted to replace them all
with 2 TB drives. Immediately I ran into trouble. If I tried:
zpool offline brick <device>
Were you doing an in-place replace, i.e. pulling out the old disk and
putting in the new one?
I got a message like: insufficient replicas
This means that there was a problem with the pool already. When ZFS opens
a pool, it looks at the disks that are part of that pool. For raidz1, if
more than one disk is unopenable, then the pool will report that there are
"no valid replicas", which is probably the error message you saw.
If that's the case, then your pool already had one failed drive in it, and
you were attempting to disable a second one. Do you have a copy of the
output from "zpool status brick" from before you tried your experiment?
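For future reference, a quick health check before offlining anything would
look something like this ("brick" is your pool name; -x prints a one-line
summary, -v the full per-device state):

zpool status -x brick    # prints "pool 'brick' is healthy" if every device is fine
zpool status -v brick    # per-device state, plus any known data errors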
I tried to
zpool replace brick <old device> <new device>
and I got something like: <new device> must be a single disk
Unfortunately, this just means that we got back an EINVAL from the kernel,
which could mean any one of a number of things, but probably there was an
issue with calculating the drive size. I'd try plugging it in separately and
using 'format' to see how big Solaris thinks the drive is.
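Something along these lines, assuming the new disk shows up as c2t0d0 (a
placeholder name):

format                       # interactive; its disk list shows each drive's reported size
prtvtoc /dev/rdsk/c2t0d0s2   # prints the label geometry for one disk, if it has a label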
I finally got replace and offline to work by:
zpool export brick
[reboot]
zpool import brick
Probably didn't need to reboot there.
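The export/import pair alone is usually sufficient, something like:

zpool export brick
zpool import brick           # or: zpool import -d /dev/dsk brick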
now
zpool offline brick <old device>
zpool replace brick <old device> <new device>
If you use this form for the replace command, you don't need to offline
the old disk first. You only need to offline a disk if you're going to
pull it out. And then you can do an in-place replace just by issuing
"zpool replace brick <device-you-swapped>"
This worked. zpool status showed replacing in progress, and then after
about 26 hours of resilvering, everything looked fine. The <old device>
was gone, and no errors in the pool. Now I tried to do it again with the
next device. I missed the "zpool offline" part however. Immediately, I
started getting disk errors on both the drive I was replacing and the
first drive I replaced.
Read errors? Write errors? Checksum errors? Sounds like a full scrub
would have been a good idea prior to replacing the second disk.
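For example:

zpool scrub brick      # read and verify every block in the pool
zpool status brick     # shows scrub progress and any errors it turns up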
I have the two original drives; they are in good shape and should still
have all the data on them. Can I somehow put my original zpool back?
How? Please help!
You can try exporting the pool, plugging the original drives back in, and
then doing a recovery on it. See the zpool manpage under "zpool import" for
the recovery options and what the flags mean.
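Roughly, on a build recent enough to have the recovery flags (-n is a dry
run and only makes sense with -F; -F rewinds the pool to its last
consistent state, discarding the most recent transactions):

zpool export brick
<reattach the original drives>
zpool import -nF brick   # dry run: reports whether recovery looks possible
zpool import -F brick    # the actual recovery import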