Cindy,

How does the SS7000 do it? Today I demoed pulling a disk, and the spare automatically became part of the pool. After it had resilvered I pulled three more (latest Q3 version with triple-parity RAID-Z). I then plugged all the drives back in (different slots) and everything was back to normal. Being nosy, I also had a shell running zpool status in a while loop whilst "practising" this little stunt, but I wasn't watching to see what commands it was issuing. I even had brain fade and pulled all four at once - Doh! The S7000 recovered, however, once I plugged the disks back in and rebooted (sweaty palms time :-) ). Unfortunately my borrowing time is up and it's now in a box on the way back to my local distributor, otherwise I would poke around more...

Trevor

Cindy Swearingen wrote:

I think it is difficult to cover all the possible ways to replace a disk with a spare. This example in the ZFS Admin Guide didn't work for me:

http://docs.sun.com/app/docs/doc/819-5461/gcvcw?a=view

See the manual replacement example. After the zpool detach and zpool replace operations, the spare is not removed from the spare pool; it's left in some unknown state. I'll fix this.

Cindy

On 10/14/09 15:26, Jason Frank wrote:

Thank you, that did the trick. That's not terribly obvious from the man page, though. The man page says it detaches the device from a mirror, and I had a raidz2. Since I'm messing with production data, I decided I wasn't going to chance it when I was reading the man page. You might consider changing the man page and explaining a little more what it means, maybe even what the circumstances look like where you might use it. Actually, an official and easily searchable "What to do when you have a ZFS disk failure" guide with lots of examples would be great. There are a lot of attempts out there, but nothing I've found is comprehensive.

Jason

On Wed, Oct 14, 2009 at 4:23 PM, Eric Schrock <eric.schr...@sun.com> wrote:

On 10/14/09 14:17, Cindy Swearingen wrote:

Hi Jason,

I think you are asking how you tell ZFS that you want to replace the failed disk c8t7d0 with the spare, c8t11d0. I just tried this on my Nevada build 124 lab system, simulating a disk failure and using zpool replace to replace the failed disk with the spare. The spare is now busy and it fails. This has to be a bug.

You need to 'zpool detach' the original (c8t7d0).

- Eric

Another way to recover, if you have a replacement disk for c8t7d0, is like this:

1. Physically replace c8t7d0. You might have to unconfigure the disk first; it depends on the hardware.

2. Tell ZFS that you replaced it:

   # zpool replace tank c8t7d0

3. Detach the spare:

   # zpool detach tank c8t11d0

4. Clear the pool, or the device specifically:

   # zpool clear tank c8t7d0

Cindy

On 10/14/09 14:44, Jason Frank wrote:

So, my Areca controller has been complaining via email of read errors for a couple of days on SATA channel 8. The disk finally gave up last night at 17:40. I've got to say, I really appreciate the Areca controller taking such good care of me. For some reason, I wasn't able to log into the server last night or this morning, probably because my home dir was on the zpool with the failed disk (although it's a raidz2, so I don't know why that was a problem). So, I went ahead and rebooted it the hard way this morning. The reboot went OK, and I was able to get access to my home directory by waiting about 5 minutes after authenticating. I checked my zpool, and it was resilvering. But it had only been running for a few minutes.
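A loop like the one Trevor describes at the top of the thread is a handy way to watch a resilver like this one in progress. A minimal sketch, assuming the pool is named tank:

# while true; do zpool status tank; sleep 10; done

(Interrupt it with Ctrl-C once the resilver completes.)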
Evidently, it didn't start resilvering until I rebooted it. I would have expected it to do that when the disk failed last night (I had already set up a hot spare disk). All of the zpool commands were taking minutes to complete while c8t7d0 was UNAVAIL, so I offline'd it. When I say all, that includes iostat, status, upgrade - just about anything non-destructive that I could try. That was a little odd. Once I offlined the drive, my resilver restarted, which surprised me. After all, I had simply changed an UNAVAIL drive to OFFLINE; in either case, you can't use it for operations. But no big deal there. That fixed the login slowness and the zpool command slowness. The resilver completed, and now I'm left with the following zpool config. I'm not sure how to get things back to normal, though, and I hate to do something stupid...

r...@datasrv1:~# zpool status tank
  pool: tank
 state: DEGRADED
 scrub: scrub stopped after 0h10m with 0 errors on Wed Oct 14 15:23:06 2009
config:

        NAME           STATE     READ WRITE CKSUM
        tank           DEGRADED     0     0     0
          raidz2       DEGRADED     0     0     0
            c8t0d0     ONLINE       0     0     0
            c8t1d0     ONLINE       0     0     0
            c8t2d0     ONLINE       0     0     0
            c8t3d0     ONLINE       0     0     0
            c8t4d0     ONLINE       0     0     0
            c8t5d0     ONLINE       0     0     0
            c8t6d0     ONLINE       0     0     0
            spare      DEGRADED     0     0     0
              c8t7d0   REMOVED      0     0     0
              c8t11d0  ONLINE       0     0     0
            c8t8d0     ONLINE       0     0     0
            c8t9d0     ONLINE       0     0     0
            c8t10d0    ONLINE       0     0     0
        spares
          c8t11d0      INUSE     currently in use

Since it may not be obvious, the spare line has both t7 and t11 indented under it. When the resilver completed, I yanked the hard drive on target 7. I'm assuming that t11 has the same content as t7, but that's not necessarily clear from the output above. So, now I'm left with the config above. I can't zpool remove t7, because it's not a hot spare or a cache disk. I can't zpool replace t7 with t11; I'm told that t11 is busy. And I didn't see any other zpool subcommands that look likely to fix the problem. Here are my system details:

SunOS datasrv1 5.11 snv_118 i86pc i386 i86xpv Solaris

This system is currently running ZFS pool version 16. Pool 'tank' is already formatted using the current version.

How do I tell the system that t11 is the replacement for t7, and how do I then add t7 as the hot spare (after I replace the disk)?

Thanks
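Putting Eric's and Cindy's advice together, the recovery Jason is asking about would look roughly like the following. This is a sketch using the pool and device names from his output, not a verified transcript from his system. First, detach the original disk from the spare pair; c8t11d0 then takes its place permanently in the raidz2 vdev:

# zpool detach tank c8t7d0

After physically replacing the failed disk, return it to the spare pool:

# zpool add tank spare c8t7d0

Finally, clear any lingering error counts on the pool:

# zpool clear tank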
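For anyone who, like Jason, would rather not experiment on production data, the whole scenario is easy to rehearse on file-backed vdevs, much as Cindy did on her lab system. A sketch, with arbitrary file names and sizes; the pool name testpool is made up for the example:

# mkfile 128m /var/tmp/d1 /var/tmp/d2 /var/tmp/d3 /var/tmp/d4 /var/tmp/d5
# zpool create testpool raidz2 /var/tmp/d1 /var/tmp/d2 /var/tmp/d3 /var/tmp/d4
# zpool add testpool spare /var/tmp/d5
# zpool offline testpool /var/tmp/d1
# zpool replace testpool /var/tmp/d1 /var/tmp/d5
# zpool status testpool

From there you can practise the detach/add/clear sequence above, then discard the whole thing with zpool destroy testpool.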