On Fri, 7 Jan 2011 15:22:03 -0800 buh...@lothlorien.nfbcal.org (Brian Buhrow) wrote:
> 	hello Greg.  Regarding problem 1, the inability to
> reconstruct disks in raid sets with wedges in them, I confess I don't
> understand the vnode stuff entirely, but rf_getdisksize() in
> rf_netbsdkintf.c looks suspicious to me.  I'm a little unclear, but
> it looks like it tries to get the disk size a number of ways,
> including by checking for a possible wedge on the component.  I
> wonder if that's what's sending the reference count too high?
> -thanks

In rf_reconstruct.c:rf_ReconstructInPlace() we have this:

    retcode = VOP_IOCTL(vp, DIOCGPART, &dpart, FREAD, curlwp->l_cred);

I think that will fail for wedges... it should be doing:

    retcode = VOP_IOCTL(vp, DIOCGWEDGEINFO, &dkw, FREAD, l->l_cred);

for the wedge case (see rf_getdisksize()).

Now: since the kernel prints:

    raid2: initiating in-place reconstruction on column 4
    raid2: Recon write failed!
    raid2: reconstruction failed.

it's somehow making it past that point... but maybe with the wrong
values?? (Is there an old label on the disk or something???)

Later...

Greg Oster

> On Jan 7, 2:17pm, Greg Oster wrote:
> } Subject: Re: Problems with raidframe under NetBSD-5.1/i386
> } On Fri, 7 Jan 2011 05:34:11 -0800
> } buh...@lothlorien.nfbcal.org (Brian Buhrow) wrote:
> }
> } > 	hello.  OK.  Still more info.  There seem to be two bugs here:
> } >
> } > 1.  Raid sets with gpt partition tables in the raid set are not able
> } > to reconstruct failed components because, for some reason, the failed
> } > component is still marked open by the system even after the raidframe
> } > code has marked it dead.  Still looking into the fix for that one.
> }
> } Is this just with autoconfig sets, or with non-autoconfig sets too?
> } When RF marks a disk as 'dead', it only does so internally, and doesn't
> } write anything to the 'dead' disk.  It also doesn't even try to close
> } the disk (maybe it should?).  Where it does try to close the disk is
> } when you do a reconstruct-in-place -- there, it will close the disk
> } before re-opening it...
> }
> } rf_netbsdkintf.c:rf_close_component() should take care of closing a
> } component, but does something special need to be done for wedges there?
> }
> } > 2.  Raid sets with gpt partition tables on them cannot be
> } > unconfigured and reconfigured without rebooting.  This is because
> } > dkwedge_delall() is not called during the raid shutdown process.  I
> } > have a patch for this issue which seems to work fine.  See the
> } > following output:
> } [snip]
> } >
> } > Here's the patch.  Note that this is against NetBSD-5.0 sources, but
> } > it should be clean for 5.1, and, I'm guessing, -current as well.
> }
> } Ah, good!  Thanks for your help with this.  I see Christos has already
> } committed your changes too.  (Thanks, Christos!)
> }
> } Later...
> }
> } Greg Oster
> >-- End of excerpt from Greg Oster