On Thu, Feb 9, 2012 at 2:13 PM, Roy Sigurd Karlsbakk <r...@karlsbakk.net> wrote:
>> I don't quite understand what happened in your specific case. Let's
>> say you had this setup:
>> raidz2 c1d0 c1d1 c1d2 c1d3 spare c1d4 c1d5
>>
>> Let's say c1d3 failed. Resilver started and d4 took d3's place -
>> you now have a non-degraded raidz2. You then physically swapped out d3
>> for a new drive and did "zpool replace". Until the replace command
>> completes, you still have the fully-functioning zpool of c1d0 c1d1
>> c1d2 c1d4. When another drive, e.g. c1d2, fails, I would hope the
>> replace command is cancelled (it's cosmetic - d4 is doing fine instead
>> of d3) and that instead the array is resilvered with c1d5 in place of c1d2.
>>
>> Is this what happened (other than the specific disk numbers)?
>
> What happened was this:
>
> Server Urd has four RAIDz2 VDEVs, somewhat non-optimally balanced (because
> of a few factors, lack of time the dominant one), so the largest has 12
> drives (the others have 7). In this VDEV, c14t19d0 died, and the common
> spare, c9t7d0, stepped in. I replaced c14t19d0 (zpool offline, cfgadm -c
> unconfigure ..., zpool replace dpool c14t19d0 c14t19d0, zpool detach dpool
> c9t7d0). So, all ok, and the resilver was almost done when c14t12d0 died
> and c9t7d0 took over once more. Now the resilver has restarted and is
> still running (with high load on the pool as well).
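
Just so I'm reading the sequence right, written out step by step it was
roughly this (a sketch reconstructed from your message; the cfgadm target
is elided there, so "..." below stands for the actual port):

    # c14t19d0 had died and the hot spare c9t7d0 had already kicked in
    zpool offline dpool c14t19d0            # take the dead disk offline
    cfgadm -c unconfigure ...               # unconfigure its port (path elided in your message)
    # ... physically swap in the new drive and configure the port again ...
    zpool replace dpool c14t19d0 c14t19d0   # resilver onto the new disk in the same slot
    zpool detach dpool c9t7d0               # drop the hot spare out of the vdev (on the
                                            # timing of this step, see below)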
This is a side comment: you should only have run the "zpool detach dpool
c9t7d0" *after* the pool was done resilvering back onto the new c14t19d0.

> Now, I can somewhat see the argument for resilvering more drives in
> parallel to save time if the drives fail at the same time, but how often
> do they really do that? Mostly, a drive will fail rather out of sync with
> the others. This leads me to think it would be better to let the pool
> finish resilvering the first failed device and then go on to the second,
> or perhaps allow for a manual override somewhere.
>
> What are your thoughts?

I agree there is a tradeoff between letting a resilver finish and attempting
to replace the newly-failed drive as soon as possible. I would probably set
the threshold at 50%: if the current resilver is >= 50% complete, let it
finish (if possible) before working on the next drive.

I think you will get a better explanation on the (still active)
"zfs-disc...@opensolaris.org" mailing list. Someone on that list might be
able to explain the design decision (or confirm that it was simply an
arbitrary choice).

Jan

_______________________________________________
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss