Re: 2.6.20.3 AMD64 oops in CFQ code

2007-04-04 Thread Tejun Heo
Lee Revell wrote: > On 4/4/07, Bill Davidsen <[EMAIL PROTECTED]> wrote: >> I won't say that's voodoo, but if I ever did it I'd wipe down my >> keyboard with holy water afterward. ;-) >> >> Well, I did save the message in my tricks file, but it sounds like a >> last-ditch effort after something gets

Re: 2.6.20.3 AMD64 oops in CFQ code

2007-04-04 Thread Lee Revell
On 4/4/07, Bill Davidsen <[EMAIL PROTECTED]> wrote: I won't say that's voodoo, but if I ever did it I'd wipe down my keyboard with holy water afterward. ;-) Well, I did save the message in my tricks file, but it sounds like a last-ditch effort after something gets very wrong. Would it really b

Re: 2.6.20.3 AMD64 oops in CFQ code

2007-04-04 Thread Bill Davidsen
Tejun Heo wrote: [resending. my mail service was down for more than a week and this message didn't get delivered.] [EMAIL PROTECTED] wrote: Anyway, what's annoying is that I can't figure out how to bring the drive back on line without resetting the box. It's in a hot-swap enclosure

Re: 2.6.20.3 AMD64 oops in CFQ code

2007-04-03 Thread Tejun Heo
[EMAIL PROTECTED] wrote: > [EMAIL PROTECTED] wrote: >>> Anyway, what's annoying is that I can't figure out how to bring the >>> drive back on line without resetting the box. It's in a hot-swap enclosure, >>> but power cycling the drive doesn't seem to help. I thought libata hotplug >>> was workin

Re: 2.6.20.3 AMD64 oops in CFQ code

2007-04-03 Thread linux
[EMAIL PROTECTED] wrote: >> Anyway, what's annoying is that I can't figure out how to bring the >> drive back on line without resetting the box. It's in a hot-swap enclosure, >> but power cycling the drive doesn't seem to help. I thought libata hotplug >> was working? (SiI3132 card, using the si

Re: 2.6.20.3 AMD64 oops in CFQ code

2007-04-02 Thread Tejun Heo
[resending. my mail service was down for more than a week and this message didn't get delivered.] [EMAIL PROTECTED] wrote: > > Anyway, what's annoying is that I can't figure out how to bring the > > drive back on line without resetting the box. It's in a hot-swap enclosure, > > but power cycling

Re: 2.6.20.3 AMD64 oops in CFQ code

2007-03-23 Thread linux
As an additional data point, here's a libata problem I'm having trying to rebuild the array. I have six identical 400 GB drives (ST3400832AS), and one is giving me hassles. I've run SMART short and long diagnostics, badblocks, and Seagate's "seatools" diagnostic software, and none of these find p

Re: 2.6.20.3 AMD64 oops in CFQ code

2007-03-22 Thread Neil Brown
On Thursday March 22, [EMAIL PROTECTED] wrote: > > Not a cfq failure, but I have been able to reproduce a different oops > at array stop time while i/o's were pending. I have not dug into it > enough to suggest a patch, but I wonder if it is somehow related to > the cfq failure since it involves

Re: 2.6.20.3 AMD64 oops in CFQ code

2007-03-22 Thread Dan Williams
On 3/22/07, Dan Williams <[EMAIL PROTECTED]> wrote: On 3/22/07, Neil Brown <[EMAIL PROTECTED]> wrote: > On Thursday March 22, [EMAIL PROTECTED] wrote: > > On Thu, Mar 22 2007, [EMAIL PROTECTED] wrote: > > > > 3 (I think) separate instances of this, each involving raid5. Is your > > > > array degr

Re: 2.6.20.3 AMD64 oops in CFQ code

2007-03-22 Thread Dan Williams
On 3/22/07, Neil Brown <[EMAIL PROTECTED]> wrote: On Thursday March 22, [EMAIL PROTECTED] wrote: > On Thu, Mar 22 2007, [EMAIL PROTECTED] wrote: > > > 3 (I think) separate instances of this, each involving raid5. Is your > > > array degraded or fully operational? > > > > Ding! A drive fell out th

Re: 2.6.20.3 AMD64 oops in CFQ code

2007-03-22 Thread Neil Brown
On Thursday March 22, [EMAIL PROTECTED] wrote: > On Thu, Mar 22 2007, [EMAIL PROTECTED] wrote: > > > 3 (I think) separate instances of this, each involving raid5. Is your > > > array degraded or fully operational? > > > > Ding! A drive fell out the other day, which is why the problems only > > app

Re: 2.6.20.3 AMD64 oops in CFQ code

2007-03-22 Thread Jens Axboe
On Thu, Mar 22 2007, [EMAIL PROTECTED] wrote: > > 3 (I think) separate instances of this, each involving raid5. Is your > > array degraded or fully operational? > > Ding! A drive fell out the other day, which is why the problems only > appeared recently. > > md5 : active raid5 sdf4[5] sdd4[3] sdc

Re: 2.6.20.3 AMD64 oops in CFQ code

2007-03-22 Thread linux
> 3 (I think) separate instances of this, each involving raid5. Is your > array degraded or fully operational? Ding! A drive fell out the other day, which is why the problems only appeared recently. md5 : active raid5 sdf4[5] sdd4[3] sdc4[2] sdb4[1] sda4[0] 1719155200 blocks level 5, 64k ch

Re: 2.6.20.3 AMD64 oops in CFQ code

2007-03-22 Thread Jens Axboe
On Thu, Mar 22 2007, [EMAIL PROTECTED] wrote: > This is a uniprocessor AMD64 system running software RAID-5 and RAID-10 > over multiple PCIe SiI3132 SATA controllers. The hardware has been very > stable for a long time, but has been acting up of late since I upgraded > to 2.6.20.3. ECC memory sho

Re: 2.6.20.3 AMD64 oops in CFQ code

2007-03-22 Thread Aristeu Sergio Rozanski Filho
> This is a uniprocessor AMD64 system running software RAID-5 and RAID-10 > over multiple PCIe SiI3132 SATA controllers. The hardware has been very > stable for a long time, but has been acting up of late since I upgraded > to 2.6.20.3. ECC memory should preclude the possibility of bit-flip > err

2.6.20.3 AMD64 oops in CFQ code

2007-03-22 Thread linux
This is a uniprocessor AMD64 system running software RAID-5 and RAID-10 over multiple PCIe SiI3132 SATA controllers. The hardware has been very stable for a long time, but has been acting up of late since I upgraded to 2.6.20.3. ECC memory should preclude the possibility of bit-flip errors. Kern