On Mon, 2013-06-24 at 19:35 +0200, Oleg Nesterov wrote: > On 06/24, Dave Jones wrote: > > > > On Sun, Jun 23, 2013 at 06:04:52PM +0200, Oleg Nesterov wrote: > > > > > > Could you please do the following: > > > > > > 1. # cd /sys/kernel/debug/tracing > > > # echo 0 >> options/function-trace > > > # echo preemptirqsoff >> current_tracer > > > > > > 2. reproduce the lockup again > > > > > > 3. show the result of > > > # cat trace > > > > Not sure this is helpful, but.. > > This makes me think that something is seriously broken. > > Or I do not understand this stuff at all. Quite possible too. > Steven, could you please help? > > > # preemptirqsoff latency trace v1.1.5 on 3.10.0-rc7+ > > # -------------------------------------------------------------------- > > # latency: 165015310 us, #4/4, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 > > #P:4) > > OK, 165015310/1000000 = 165, nice.
9600 baud modem serial console? > > > # ----------------- > > # | task: trinity-child1-3173 (uid:1000 nice:19 policy:0 rt_prio:0) > > # ----------------- > > # => started at: vprintk_emit > > # => ended at: vprintk_emit > > # > > # > > # _------=> CPU# > > # / _-----=> irqs-off > > # | / _----=> need-resched > > # || / _---=> hardirq/softirq > > # ||| / _--=> preempt-depth > > # |||| / delay > > # cmd pid ||||| time | caller > > # \ / ||||| \ | / > > trinity--3173 1dNh1 0us!: console_unlock <-vprintk_emit > > trinity--3173 1dNh1 165015310us : console_unlock <-vprintk_emit > > trinity--3173 1dNh1 165015311us+: stop_critical_timings <-vprintk_emit > > trinity--3173 1dNh1 165015315us : <stack trace> > > => console_unlock > > => vprintk_emit > > => printk > > => watchdog_timer_fn > > But this is already called in the non-preemtible context, how can > 'started at' blame vprintk_emit? Well, it looks to really have started with console_unlock() not vprintk_emit. > > > => __run_hrtimer > > => hrtimer_interrupt > > => smp_apic_timer_interrupt > > => apic_timer_interrupt > > => _raw_spin_lock > > This is where start_critical_timing() should be called? > > Or by TRACE_IRQS_OFF in apic_timer_interrupt... Also, what _raw_spin_lock is that. Unless the interrupt triggered at the start of the completion spin lock (before it disabled interrupts), it could have happened while spinning on inode_sb_list_lock? But you are correct, the critical timing should have started with the entering of smp_apic_timer_interrupt. Did anything re-enable interrupts? > > > => sync_inodes_sb > > => sync_inodes_one_sb > > => iterate_supers > > => sys_sync > > => tracesys > > Also. watchdog_timer_fn() calls printk() only if it detects the > lockup, so I assume you hit another one? Probably. -- Steve -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/