On Wed, 07 Mar 2007 14:12:16 -0800
Dave Hansen <[EMAIL PROTECTED]> wrote:

> I'm seeing weird hangs running ltp on 2.6.21-rc2-mm2.  It manifests
> itself by the waitpid06 test in LTP hanging.  This is very, very
> reproducible in about 5 seconds by adding '-s wait' to the ltp command
> line.
> 
> I see 4 waitpid06 processes on my 4-way machine spinning in userspace.
> But, the weird part is that I can't ssh in once this happens, but I can
> log in to the console.  I've bisected it down to:
>         
>         sched-fix-idle-load-balancing-in-softirqd-context
> 
> http://sr71.net/~dave/linux/elm3b82-config
> 
> sysrq-t:
> 
> bash          S 00000001     0  1670   1667  1676               (NOTLB)
>        f761ff18 00000086 00000000 00000001 f7afb3ac c16f5f6c f761ff1c c0146eb5
>        e93d3067 fffb8000 f761ff20 c2d922e0 f7ec4030 f7ec413c 00006db1 1457ab10
>        00000025 c038eb60 fffffe00 f7ec4030 f7ec40e4 f761ff88 c011bc38 c014800d
> Call Trace:
>  [<c011bc38>] do_wait+0x271/0x316
>  [<c011bd86>] sys_wait4+0x2f/0x31
>  [<c011bdaf>] sys_waitpid+0x27/0x2c
>  [<c010265c>] syscall_call+0x7/0xb
>  =======================
> runltp        S 00000001     0  1676   1670  1780               (NOTLB)
>        f7625f18 00000082 00000000 00000001 f7adf3ac c16f5bec f7625f1c c0146eb5
>        e8cf4067 fffab000 f7625f20 c2d9a2e0 c30ee050 c30ee15c 00007d8f 3c8acf22
>        00000025 c309e560 fffffe00 c30ee050 c30ee104 f7625f88 c011bc38 c014800d
> Call Trace:
>  [<c011bc38>] do_wait+0x271/0x316
>  [<c011bd86>] sys_wait4+0x2f/0x31
>  [<c011bdaf>] sys_waitpid+0x27/0x2c
>  [<c010265c>] syscall_call+0x7/0xb
>  =======================
> pan           S 00000000     0  1780   1676  1831               (NOTLB)
>        f7639f30 00000086 00000000 00000000 00000000 00000246 f7ad095c f7f0e760
>        00000000 f7639f0c c0162748 c2d922e0 f7a0aab0 f7a0abbc 0000edaf abaf1d13
>        00000026 c038eb60 fffffe00 f7a0aab0 f7a0ab64 f7639fa0 c011bc38 e97e7065
> Call Trace:
>  [<c011bc38>] do_wait+0x271/0x316
>  [<c011bd86>] sys_wait4+0x2f/0x31
>  [<c010265c>] syscall_call+0x7/0xb
>  =======================
> waitpid06     S 00000001     0  1831   1780  1832               (NOTLB)
>        f7637f18 00000082 00000000 00000001 f769b134 c16ed36c f7637f1c c0146eb5
>        f7a38c80 00000000 f7bce030 c2d9a2e0 f7e78030 f7e7813c 00000f64 abdcd3a1
>        00000026 f7bce030 fffffe00 f7e78030 f7e780e4 f7637f88 c011bc38 c014800d
> Call Trace:
>  [<c011bc38>] do_wait+0x271/0x316
>  [<c011bd86>] sys_wait4+0x2f/0x31
>  [<c011bdaf>] sys_waitpid+0x27/0x2c
>  [<c010265c>] syscall_call+0x7/0xb
>  =======================
> 

Michal has reported a similar hang in recent -mm kernels.
sched-fix-idle-load-balancing-in-softirqd-context was added in
2.6.21-rc2-mm1.

I'll drop sched-fix-idle-load-balancing-in-softirqd-context.patch and
sched-fix-idle-load-balancing-in-softirqd-context-fix.patch and, as a
consequence, sched-dynticks-idle-load-balancing-v3.patch.

Thanks for doing the bisect - it really helps.
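
As a reading aid for the sysrq-t dump quoted above, here is a minimal
sketch that pulls out each task's name, state, pid, parent pid and
innermost call-trace symbol. The column meanings (pid followed by parent
pid) are an assumption inferred from the parent chain visible in the dump
(bash -> runltp -> pan -> waitpid06), and the names summarize, TASK_RE,
FRAME_RE and SAMPLE are made up for this illustration; it is not part of
any kernel or LTP tooling.

```python
import re

# Task header line: name, state letter, 8 hex digits, one number, pid, parent pid.
# The pid / parent-pid interpretation is an assumption based on the dump itself.
TASK_RE = re.compile(r"^(\S+)\s+([A-Z])\s+[0-9a-f]{8}\s+\d+\s+(\d+)\s+(\d+)")
# Call-trace frame: [<address>] symbol+0xoffset/0xsize
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")

# Two entries copied from the quoted dump (raw stack words omitted).
SAMPLE = """\
> bash          S 00000001     0  1670   1667  1676               (NOTLB)
> Call Trace:
>  [<c011bc38>] do_wait+0x271/0x316
>  [<c011bd86>] sys_wait4+0x2f/0x31
>  =======================
> pan           S 00000000     0  1780   1676  1831               (NOTLB)
> Call Trace:
>  [<c011bc38>] do_wait+0x271/0x316
>  [<c011bd86>] sys_wait4+0x2f/0x31
>  =======================
"""


def summarize(dump):
    """Return one dict per task found in a (possibly mail-quoted) dump."""
    tasks = []
    for raw in dump.splitlines():
        # Drop leading mail-quote markers and indentation.
        line = raw.lstrip("> ").rstrip()
        header = TASK_RE.match(line)
        if header:
            tasks.append({
                "name": header.group(1),
                "state": header.group(2),
                "pid": int(header.group(3)),
                "ppid": int(header.group(4)),
                "top": None,  # innermost frame, filled in below
            })
            continue
        frame = FRAME_RE.search(line)
        # Only the first frame after a header is the innermost one.
        if frame and tasks and tasks[-1]["top"] is None:
            tasks[-1]["top"] = frame.group(1)
    return tasks


if __name__ == "__main__":
    for t in summarize(SAMPLE):
        print(t["name"], t["state"], t["pid"], t["ppid"], t["top"])
```

Run against the full dump, this should make it easy to see that every
task listed there is in state S with do_wait as its innermost frame.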
