On Wed, 6 Feb 2008 16:33:20 -0800 (PST) [EMAIL PROTECTED] wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=9906
>
> Summary: Weird hang with NPTL and SIGPROF.
> Product: Process Management
> Version: 2.5
> KernelVersion: 2.6.24-rc4
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: high
> Priority: P1
> Component: Scheduler
> AssignedTo: [EMAIL PROTECTED]
> ReportedBy: [EMAIL PROTECTED]
>
>
> Latest working kernel version: None
> Earliest failing kernel version: 2.6.18
> Distribution: Ubuntu
> Hardware Environment: Any
> Problem Description:
> I have a testcase that demonstrates a strange hang of the latest kernel
> (as well as previous ones). In the process of investigating the NPTL,
> we wrote a test that just creates a bunch of threads, then does a
> barrier wait to synchronize them all, after which everybody exits.
> That's all it does.
>
> This works fine under most circumstances. Unfortunately, we also want
> to do profiling, so we catch SIGPROF and turn on ITIMER_PROF. In this
> case, at somewhere between 4000 and 4500 threads, and using the NPTL,
> the system hangs. It's not a hard hang, interrupts are still working
> and clocks are ticking, but nothing is making progress. It becomes
> noticeable when the softlockup_tick() warning goes off after the
> watchdog has been starved long enough.
>
> Sometimes the system recovers and gets going again. Other times it
> doesn't. I've examined the state of things several times with kdb and
> there's certainly nothing obvious going on. Something, perhaps having
> to do with the scheduler, is certainly getting into a bad state, but I
> haven't yet been able to figure out what that is. I've even run it with
> KFT and have seen nothing obvious there, either, except for the fact
> that when it hangs it becomes obvious that it stops making progress and
> it begins to fill up with smp_apic_timer_interrupt() and do_softirq()
> entries. I've also seen smp_apic_timer_interrupt() appear twice or more
> on the stack, as if the previous run(s) didn't finish before the next
> tick happened.
>
> Steps to reproduce:
>
> I'll attach a testcase shortly.
>

It's probably better to handle this one via email, so please send that testcase via reply-to-all to this email, thanks.
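
For anyone following along before the real testcase shows up, here is a rough sketch of the kind of program the report describes: install a SIGPROF handler, arm ITIMER_PROF, create N threads that meet at a barrier, then let them all exit. This is not the reporter's testcase. The function name run_barrier_test(), the 10 ms timer interval, the 64 KiB stack size and the empty handler are all assumptions made up for illustration.

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>

static pthread_barrier_t barrier;

/* Hypothetical handler: does nothing, it only has to exist so that
 * SIGPROF does not kill the process. */
static void on_sigprof(int sig)
{
    (void)sig;
}

/* Each thread waits at the barrier and then exits, as in the report. */
static void *worker(void *arg)
{
    (void)arg;
    pthread_barrier_wait(&barrier);
    return NULL;
}

/* Hypothetical helper: returns 0 on success, -1 on any setup failure.
 * With use_prof != 0 it arms ITIMER_PROF first (assumed 10 ms period). */
int run_barrier_test(int nthreads, int use_prof)
{
    struct itimerval it;
    pthread_attr_t attr;
    pthread_t *tids;
    int i;

    if (nthreads < 1)
        return -1;

    if (use_prof) {
        struct sigaction sa;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_sigprof;
        sa.sa_flags = SA_RESTART;
        sigemptyset(&sa.sa_mask);
        if (sigaction(SIGPROF, &sa, NULL) != 0)
            return -1;

        it.it_interval.tv_sec = 0;
        it.it_interval.tv_usec = 10000;
        it.it_value = it.it_interval;
        if (setitimer(ITIMER_PROF, &it, NULL) != 0)
            return -1;
    }

    tids = malloc((size_t)nthreads * sizeof(*tids));
    if (tids == NULL)
        return -1;

    /* nthreads workers plus the creating thread meet at the barrier. */
    if (pthread_barrier_init(&barrier, NULL, (unsigned)nthreads + 1) != 0) {
        free(tids);
        return -1;
    }

    /* Small stacks so that thousands of threads fit in the address
     * space; if the size is rejected the default is used instead. */
    pthread_attr_init(&attr);
    (void)pthread_attr_setstacksize(&attr, 64 * 1024);

    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], &attr, worker, NULL) != 0) {
            /* Threads created so far stay blocked on the barrier; a
             * real test would exit the process at this point. */
            pthread_attr_destroy(&attr);
            return -1;
        }
    }
    pthread_attr_destroy(&attr);

    pthread_barrier_wait(&barrier);   /* release everybody */
    for (i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);

    if (use_prof) {
        memset(&it, 0, sizeof(it));   /* a zero it_value disarms the timer */
        setitimer(ITIMER_PROF, &it, NULL);
    }
    pthread_barrier_destroy(&barrier);
    free(tids);
    return 0;
}
```

Calling run_barrier_test(4500, 1) from main() and building with -pthread would correspond to the failing configuration in the report, and run_barrier_test(4500, 0) to the case said to work. Thread-count limits (ulimit -u, kernel.threads-max) may need raising first. Whether this sketch actually triggers the hang is untested.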