On Tuesday January 2, [EMAIL PROTECTED] wrote:
> 
> Perhaps a deadlock with a normal (not irq) spinlock.
> 
> Could you enable SysRQ and press <Alt>+<SysRq>+<P> ("showPc")
> Then write down the EIP values (including the [< >] brackets) and
> translate them with ksymoops.
> 
> Ksymoops repeats only the EIP values.
> 
> But searching through the System.map file shows only labels from
> the raid5 stuff around those addresses.
> 
> As stated in my first mail, I am actually running my raid5 devices in
> degraded mode, and as I remember some raid5 stuff was changed between
> test13p3 and newer kernels.

So tell us, why do you run your raid5 devices in degraded mode??  It
cannot be good for performance, and certainly isn't good for
redundancy!!!  But I'm not complaining as you found a bug...


> 
> Hope this gives someone an idea?

Yep.  This, combined with a related bug report from [EMAIL PROTECTED],
strongly suggests the following patch.
Writes to the failed drive never complete, so you eventually
run out of stripes in the stripe cache and you block waiting for a
stripe to become free.  

Please test this and confirm that it works.

NeilBrown


--- ./drivers/md/raid5.c        2001/01/03 09:04:05     1.1
+++ ./drivers/md/raid5.c        2001/01/03 09:04:13
@@ -1096,8 +1096,10 @@
                                bh->b_rdev = bh->b_dev;
                                bh->b_rsector = bh->b_blocknr * (bh->b_size>>9);
                                generic_make_request(action[i]-1, bh);
-                       } else
+                       } else {
                                PRINTK("skip op %d on disc %d for sector %ld\n", action[i]-1, i, sh->sector);
+                               clear_bit(BH_Lock, &bh->b_state);
+                       }
                }
 }
 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
