So for a general-purpose fileserver using standard SATA connectors on the motherboard, with no status LED for each drive, and using the info above from myxiplx, this faulty-drive replacement routine should work in the event that a drive fails. (I have copied and pasted the example from myxiplx and made a few changes for my array/drive ids.)
---------------------------
- Have a cron task do a 'zpool status pool' periodically and email you if it detects a 'FAULTED' status using grep. (A sketch of such a script is appended at the end of this message.)

- When you see the email, see which drive is faulted from the email text grepped from doing a 'zpool status pool | grep FAULTED' -- e.g. c1t1d0.

- Offline the drive with:

  # zpool offline pool c1t1d0

- Then identify the SATA controller that maps to this drive by running:

  # cfgadm | grep Ap_Id ; cfgadm | grep c1t1d0
  Ap_Id                          Type         Receptacle   Occupant     Condition
  sata0/1::dsk/c1t1d0            disk         connected    configured   ok
  #

  And offline it with:

  # cfgadm -c unconfigure sata0/1

  Verify that it is now offline with:

  # cfgadm | grep sata0/1
  sata0/1                        disk         connected    unconfigured ok

- Now remove and replace the disk. For my motherboard (M2N-SLI Deluxe), SATA controller 0/1 maps to "SATA 1" in the manual -- i.e. SATA connector #1.

- Bring the disk online and check its status with:

  # cfgadm -c configure sata0/1
  # cfgadm | grep sata0/1
  sata0/1::dsk/c1t1d0            disk         connected    configured   ok

- Bring the disk back into the zfs pool. You will get a warning:

  # zpool online pool c1t1d0
  warning: device 'c1t1d0' onlined, but remains in faulted state
  use 'zpool replace' to replace devices that are no longer present
  # zpool replace pool c1t1d0

- You will now see zpool status report that a resilver is in progress, with detail as follows (example from myxiplx's array). Resilvering is the process whereby ZFS recreates the data on the new disk from redundant data: data held on the other drives in the array plus parity data.

            raidz2                DEGRADED     0     0     0
              spare               DEGRADED     0     0     0
                replacing         DEGRADED     0     0     0
                  c5t7d0s0/o      UNAVAIL      0     0     0  corrupted data
                  c5t7d0          ONLINE       0     0     0

- Once the resilver finishes, run zpool status again and it should appear fine -- i.e. array and drives marked as ONLINE and no errors shown. Note: I sometimes had to run zpool status twice to get an up-to-date status of the devices.
---------------------------

Now I need to print out this info and keep it safe for the time when a drive fails. Also I should print out the SATA connector mapping for each drive currently in my array, in case I'm unable to get it for any reason later -- the rough sketches appended below cover capturing that mapping, the monitoring cron task, and the pre/post-swap commands.
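To capture the drive-to-connector mapping now, while everything is healthy, something along these lines should do. This is only a sketch: the output file name /root/pool-sata-map.txt is just an example, it assumes the pool is named "pool" as above, and the final lp step assumes a printer is configured.

  #!/bin/sh
  # Rough sketch: capture the current drive -> SATA connector mapping to a file.
  # The file name and the pool name "pool" are assumptions; adjust as needed.
  OUT=/root/pool-sata-map.txt
  zpool status pool      >  $OUT
  cfgadm | grep Ap_Id    >> $OUT    # header line (Ap_Id Type Receptacle Occupant Condition)
  cfgadm | grep sata     >> $OUT    # one line per SATA attachment point
  lp $OUT                           # print it (assumes a printer is set up)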
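For the cron task in the first step, a minimal sketch like the following should work. Assumptions: the pool is named "pool", mail is delivered via mailx, and the address is a placeholder to substitute with your own.

  #!/bin/sh
  # Rough sketch of the monitoring task: mail me if the pool has a FAULTED device.
  # Assumptions: pool is named "pool", mailx delivers mail, ADDR is a placeholder.
  # Example crontab entry (hourly):  0 * * * * /root/check_pool.sh
  POOL=pool
  ADDR=me@example.com
  if zpool status $POOL | grep FAULTED > /dev/null 2>&1 ; then
          zpool status $POOL | mailx -s "ZFS: FAULTED device in pool $POOL" $ADDR
  fi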
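And the replacement procedure itself could be wrapped into a rough helper, run once before pulling the disk and once after inserting the new one. Again just a sketch, not something I've tested: it assumes the pool is named "pool" and that the disk id and SATA attachment point passed in have been double-checked against cfgadm first.

  #!/bin/sh
  # Rough helper wrapping the pre-swap and post-swap commands from the procedure above.
  # Assumptions: the pool is named "pool"; the disk (e.g. c1t1d0) and SATA attachment
  # point (e.g. sata0/1) are passed in and have been verified with cfgadm beforehand.
  # Usage: replace_disk.sh pre  c1t1d0 sata0/1   (before physically removing the disk)
  #        replace_disk.sh post c1t1d0 sata0/1   (after inserting the replacement)
  POOL=pool
  PHASE=$1
  DISK=$2
  AP=$3
  case "$PHASE" in
  pre)
          zpool offline $POOL $DISK &&
          cfgadm -c unconfigure $AP
          cfgadm | grep $AP                 # should now show "unconfigured"
          ;;
  post)
          cfgadm -c configure $AP &&
          zpool online $POOL $DISK          # warns that the device is still faulted
          zpool replace $POOL $DISK
          zpool status $POOL                # resilver should now be in progress
          ;;
  *)
          echo "usage: $0 pre|post <disk> <sata-attachment-point>"
          exit 1
          ;;
  esac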