On Tue, 10 Aug 2010, seth keith wrote:

> First off, I don't have the exact failure messages here, and I did not take
> good notes of the failures, so I will do the best I can. Please try to give
> me advice anyway.

> I have a 7 drive raidz1 pool with 500G drives, and I wanted to replace them
> all with 2TB drives. Immediately I ran into trouble. If I tried:
>
>    zpool offline brick <device>

Were you doing an in-place replace? i.e. pulling out the old disk and putting in the new one?

> I got a message like: insufficient replicas

This means that there was a problem with the pool already. When ZFS opens a pool, it looks at the disks that are part of that pool. For raidz1, if more than one disk is unopenable, then the pool will report that there are "no valid replicas", which is probably the error message you saw.

If that's the case, then your pool already had one failed drive in it, and you were attempting to disable a second one. Do you have a copy of the output from "zpool status brick" from before you tried your experiment?
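For reference, here's roughly what a degraded raidz1 looks like in "zpool status" (device names invented for the example):

     pool: brick
    state: DEGRADED
   status: One or more devices could not be opened.  Sufficient replicas
           exist for the pool to continue functioning in a degraded state.
   config:

           NAME        STATE     READ WRITE CKSUM
           brick       DEGRADED     0     0     0
             raidz1    DEGRADED     0     0     0
               c1t0d0  ONLINE       0     0     0
               c1t1d0  UNAVAIL      0     0     0  cannot open
               c1t2d0  ONLINE       0     0     0

If any disk shows UNAVAIL or FAULTED there, offlining a second one is refused, since raidz1 can only survive the loss of a single device.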


> I tried to
>
>    zpool replace brick <old device> <new device>
>
> and I got something like: <new device> must be a single disk

Unfortunately, this just means that we got back an EINVAL from the kernel, which could mean any one of a number of things, but most likely there was a problem calculating the drive size. I'd try plugging the new drive in separately and using 'format' to see how big Solaris thinks it is.
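For example (c5t2d0 here is just a placeholder for whatever name the new drive comes up under):

   # format
   Searching for disks...done

   AVAILABLE DISK SELECTIONS:
       0. c5t2d0 <ATA-WDC WD20EARS-1.82TB>
       ...

   # prtvtoc /dev/rdsk/c5t2d0s2     (prints sector count and partition map)

If format reports the wrong capacity for the drive, zpool replace is going to have trouble too.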


> I finally got replace and offline to work by:
>
>    zpool export brick
>    [reboot]
>    zpool import brick

You probably didn't need the reboot there; export followed by import is normally enough to make ZFS re-read the device configuration.
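i.e. this alone should have done it:

   zpool export brick
   zpool import          (with no arguments: lists the pools available for import)
   zpool import brick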

> Now
>
>    zpool offline brick <old device>
>    zpool replace brick <old device> <new device>

If you use this form of the replace command, you don't need to offline the old disk first. You only need to offline a disk if you're going to pull it out, and in that case you can do an in-place replace just by issuing "zpool replace brick <device-you-swapped>".
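To spell out the two forms (disk names invented for the example):

   # Two-device form: old disk stays put, new disk on a spare port.
   # No offline needed; the pool keeps full redundancy while resilvering.
   zpool replace brick c1t3d0 c1t8d0

   # In-place form: offline the disk, physically swap it, then replace.
   zpool offline brick c1t3d0
   [swap the drive]
   zpool replace brick c1t3d0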

> This worked. zpool status showed replacing in progress, and then after about
> 26 hours of resilvering, everything looked fine. The <old device> was gone,
> and no errors in the pool. Now I tried to do it again with the next device.
> I missed the "zpool offline" part however. Immediately, I started getting
> disk errors on both the drive I was replacing and the first drive I replaced.

Read errors? Write errors? Checksum errors? Sounds like a full scrub would have been a good idea prior to replacing the second disk.
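For next time, before touching the second disk:

   zpool scrub brick
   zpool status brick    (repeat until it reports the scrub completed and the
                          READ/WRITE/CKSUM columns are clean on every device)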

> I have the two original drives, they are in good shape and should still have
> all the data on them. Can I somehow put my original zpool back? How? Please
> help!

You can try exporting the pool, plugging in the original drives, and then do a recovery on it. See the zpool manpage under "zpool import" for the recovery options and what the flags mean.
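Roughly, and assuming your bits are recent enough to have the -F rewind option (-n makes it a dry run):

   zpool export brick           (if the pool is still imported)
   [reconnect the original drives]
   zpool import                 (check the pool and all its devices are visible)
   zpool import -F -n brick     (dry run: reports what recovery would discard)
   zpool import -F brick        (rewind to the last good txg and import)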
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
