Re: RAIDframe parity errors and rebuild

David Wilk Fri, 17 Mar 2006 20:15:10 -0800

On 3/17/06, John Eisenschmidt <[EMAIL PROTECTED]> wrote:
>  ----- David Wilk ([EMAIL PROTECTED]) wrote: -----
> > Howdy all,
> >
> > I've been testing a 3.8 system with RAIDframe and root on raid in a
> > RAID1 configuration.  Performance and stability are quite good, but
> > there's one thing that's a bit irksome and I wonder if I might not be
> > doing something right.
> >
> > I've had a couple crashes (potentially hardware related) and every
> > time the RAID requires a parity rebuild.  That seems fine, but it
> > refuses to bring the array on line during this time.  It takes several
> > hours to rebuild a 232GB RAID1 array!
>
> Is the raidframe driver causing the panic? Pedro sent out an email on
> 2/26 about testing a patch that's being included in 3.9 -STABLE
> (Subject: Re: raid(4) users, please test this). I had a problem with
> my 3.2 raidframe mirrors causing the system to panic because a call
> wasn't being made to VOP_UNLOCK() when VOP_ISLOCKED() was true. I put
> the disks in my unpatched 3.8 box and I got an immediate panic. Applied
> the patch and they were fine.


good question.  I don't think so, although it may have been the problem
once when I filled up the RAID volume and it dropped into the kernel
debugger.  This last time looks like a problem with my ATA controller
(bus resets resulting in a kernel hang).  I've been tracking
3.8-STABLE figuring that would be the safest route.  Is this patch
going to make it into that tree or would one have to use CURRENT?  I
didn't think there was a 3.9-STABLE already...
>
> > Is this normal?  this seems like quite a bit of time to be down with
> > every improper shutdown of the system.
>
> I've used OpenVMS volume shadowing, Solaris Disk Suite (circa Solaris
> 2.8), software raid for Mac OS system 8, raidframe, etc all for
> software RAID 1 and all of them took a long time to check after a
> crash. Essentially when the system hard crashes, it needs to
> compare the parity information between both disks sector by sector to
> ensure that the mirrors are in sync. From where I sit, the kernelized
> raidframe driver stops the system from moving on to multiuser mode
> until it has verified the disks are both in sync (the safest route to
> take). Parity for my 100GB volume would take about 90 minutes to check
> after a crash.
right, that makes total sense; however, I was assuming this would occur
in the background (like a hardware RAID solution).
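
As a rough sanity check on the times quoted in this thread, here is a
minimal sketch that scales the 100GB / 90 minute figure to a 232GB
volume.  It assumes check time grows linearly with volume size, which
is only an assumption, and the function name is hypothetical; nothing
here is measured.

```python
def estimate_check_minutes(ref_gb, ref_minutes, target_gb):
    """Scale a known check time to another volume size.

    Assumes the check proceeds at a constant rate (GB per minute),
    which is a simplification.
    """
    rate_gb_per_minute = ref_gb / ref_minutes
    return target_gb / rate_gb_per_minute


# Figures taken from this thread: 100GB checked in about 90 minutes,
# and a 232GB RAID1 array.
minutes = estimate_check_minutes(100, 90, 232)
print(f"{minutes:.0f} minutes (~{minutes / 60:.1f} hours)")
# prints "209 minutes (~3.5 hours)"
```

Under that assumption the 232GB array lands at roughly three and a
half hours, which is in line with "several hours".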
>
> Where I've seen it take less time on other implementations is when it
> pushes the system straight to multiuser mode and checks in the
> background, which raidframe will do if you hit CTRL-C on the console
> when it starts checking. You can't pass by the fsck but you can stop
> the interactive parity check and it will run in the background.

ah, so that's the key.  Awesome.  I'll give that a try.  Is there
any way to configure this behavior to be the default?  I'm imagining my
server being power-cycled when I'm not around and wanting it to come up
ASAP, especially since disk load would be very light.
>
> > thanks for your thoughts.
> >
> > Dave
>
> --
> John W. Eisenschmidt ([EMAIL PROTECTED])
>   website: http://www.eisenschmidt.org/jweisen/
>   my blog: http://thealphajohn.blogspot.com/
>   house blog: http://4104-chestnut-street.blogspot.com/
>
> Law of False Alerts: As the rate of erroneous alerts increases, operator
>         reliance, or belief, in subsequent warnings decreases.
