On 9 apr 2010, at 10.58, Andreas Höschler wrote:

> Hi all,
>
> I need to replace a disk in a zfs pool on a production server (X4240 running
> Solaris 10) today and won't have access to my documentation there. That's why
> I would like to have a good plan on paper before driving to that location. :-)
>
> The current tank pool looks as follows:
>
>   pool: tank
>  state: ONLINE
>  scrub: none requested
> config:
>
>         NAME         STATE     READ WRITE CKSUM
>         tank         ONLINE       0     0     0
>           mirror     ONLINE       0     0     0
>             c1t2d0   ONLINE       0     0     0
>             c1t3d0   ONLINE       0     0     0
>           mirror     ONLINE       0     0     0
>             c1t5d0   ONLINE       0     0     0
>             c1t4d0   ONLINE       0     0     0
>           mirror     ONLINE       0     0     0
>             c1t15d0  ONLINE       0     0     0
>             c1t7d0   ONLINE       0     0     0
>           mirror     ONLINE       0     0     0
>             c1t8d0   ONLINE       0     0     0
>             c1t9d0   ONLINE       0     0     0
>           mirror     ONLINE       0     0     0
>             c1t10d0  ONLINE       0     0     0
>             c1t11d0  ONLINE       0     0     0
>           mirror     ONLINE       0     0     0
>             c1t12d0  ONLINE       0     0     0
>             c1t13d0  ONLINE       0     0     0
>
> errors: No known data errors
>
> Note that disk c1t15d0 is being used and has taken over the duty of c1t6d0.
> c1t6d0 failed and was replaced with a new disk a couple of months ago.
> However, the new disk does not show up in /dev/rdsk and /dev/dsk. I was told
> that the disk has to be initialized first with the SCSI BIOS. I am going to do
> so today (reboot the server). Once the disk shows up in /dev/rdsk I am
> planning to do the following:

I don't think the BIOS and reboot part should ever be necessary, or at least I hope not. You shouldn't have to reboot just because you replaced a hot-plug disk. Depending on the hardware and the state of your system, the BIOS might not be the problem at all, and rebooting may not help.

Are the device links for c1t6* gone from /dev/(r)dsk? Then someone must have run "devfsadm -C" or something like that. You could try "devfsadm -sv" to see if it wants to (re)create any device links. If the output looks good, run it for real with "devfsadm -v".

If it is the HBA/RAID controller acting up and not showing recently inserted drives, you should be able to talk to it with a program from within the OS: raidctl for some LSI HBAs, and arcconf for some Sun/StorageTek HBAs.

> zpool attach tank c1t7d0 c1t6d0
>
> This hopefully gives me a three-way mirror:
>
>           mirror     ONLINE       0     0     0
>             c1t15d0  ONLINE       0     0     0
>             c1t7d0   ONLINE       0     0     0
>             c1t6d0   ONLINE       0     0     0
>
> And then a
>
> zpool detach tank c1t15d0
>
> to get c1t15d0 out of the mirror to finally have
>
>           mirror     ONLINE       0     0     0
>             c1t6d0   ONLINE       0     0     0
>             c1t7d0   ONLINE       0     0     0
>
> again. Is that a good plan?

I believe so, and I tried it, as I don't actually do this very often by hand (only in my test shell scripts, which I currently run some dozens of times a day :-):

-bash-4.0$ pfexec zpool create tank mirror c3t5d0 c3t6d0
-bash-4.0$ zpool status tank
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c3t5d0  ONLINE       0     0     0
            c3t6d0  ONLINE       0     0     0

errors: No known data errors
-bash-4.0$ pfexec zpool attach tank c3t6d0 c3t7d0
-bash-4.0$ zpool status tank
  pool: tank
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Fri Apr  9 11:30:13 2010
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c3t5d0  ONLINE       0     0     0
            c3t6d0  ONLINE       0     0     0
            c3t7d0  ONLINE       0     0     0  73.5K resilvered

errors: No known data errors
-bash-4.0$ pfexec zpool detach tank c3t5d0
-bash-4.0$ zpool status tank
  pool: tank
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Fri Apr  9 11:30:13 2010
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c3t6d0  ONLINE       0     0     0
            c3t7d0  ONLINE       0     0     0  73.5K resilvered

errors: No known data errors
-bash-4.0$

> I am then intending to do
>
> zpool add tank mirror c1t14d0 c1t15d0

I believe that will work too:

-bash-4.0$ pfexec zpool add tank mirror c3t1d0 c3t2d0
-bash-4.0$ zpool status tank
  pool: tank
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Fri Apr  9 11:30:13 2010
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c3t6d0  ONLINE       0     0     0
            c3t7d0  ONLINE       0     0     0  73.5K resilvered
          mirror-1  ONLINE       0     0     0
            c3t1d0  ONLINE       0     0     0
            c3t2d0  ONLINE       0     0     0

errors: No known data errors
-bash-4.0$

> to add another 146GB to the pool.
>
> Please let me know if I am missing anything. This is a production server. A
> failure of the pool would be fatal.

Then I'd recommend a second opinion; don't just take my word for it. I have used zfs quite a bit now, but I don't do these things every day. I hope someone else answers too!
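
Since you won't have your documentation on site, the whole sequence could be put on paper as one checklist. What follows is only a sketch: it prints the commands discussed in this thread in order and runs none of them. The variable names and the plan function are made up for illustration, and the device names are simply the ones from your plan, so they need checking against the real zpool status output first.

```shell
#!/bin/sh
# Checklist sketch: only prints commands, never runs them.
# Device names are taken from the plan quoted in this thread (assumption:
# they still match the real system when the work is done).

POOL=tank
NEW=c1t6d0      # replacement disk, once it is visible in /dev/dsk
KEEP=c1t7d0     # existing side of the mirror to attach to
OLD=c1t15d0     # disk that took over for the failed c1t6d0
EXTRA=c1t14d0   # second disk for the additional mirror

plan() {
    echo "devfsadm -sv"                         # dry run: show link changes only
    echo "devfsadm -v"                          # create the links if that looked sane
    echo "zpool attach $POOL $KEEP $NEW"        # temporary three-way mirror
    echo "zpool status $POOL"                   # repeat until the resilver is done
    echo "zpool detach $POOL $OLD"              # back to a two-way mirror
    echo "zpool add $POOL mirror $EXTRA $OLD"   # new top-level mirror
    echo "zpool status $POOL"                   # final check
}

plan
```

The one ordering point the list tries to make explicit is the status check between attach and detach: on a pool with real data the resilver will take a good deal longer than in my empty test pool, and detaching c1t15d0 before it completes would leave that mirror without redundancy.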

/ragge

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss