On Thu, 6 Jan 2011 18:33:58 -0800 buh...@lothlorien.nfbcal.org (Brian Buhrow) wrote:
> Hello. Ok. I have more information, perhaps this is a known
> issue. If not, I can file a bug.

Please, do file a PR... this is a new one.

> the problem seems to be that if you partition a raid set with
> gpt instead of disklabel, if a component of that raid set fails, the
> underlying component is held open even after raidframe declares it
> dead. Thus, when you try to ask raidframe to do a reconstruct on the
> dead component, it can't open the component because the component is
> busy.

I think the culprit is in
src/sys/dev/raidframe/rf_netbsdkintf.c:rf_find_raid_components()
where in the:

    if (wedge) {
            ...
            ac_list = rf_get_component(ac_list, dev, vp,
                device_xname(dv), dkw.dkw_size);
            continue;
    }

case that little "continue" is not letting the execution reach the:

    /* don't need this any more. We'll allocate it again
       a little later if we really do... */
    vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
    VOP_CLOSE(vp, FREAD | FWRITE, NOCRED);
    vput(vp);

code which would close the opened wedge. :( Both 5.1 and -current
suffer from the same issue (though the code in -current is slightly
different).

Thanks for the investigation and report...

Later...

Greg Oster