On Mon, 2013-06-24 at 19:35 +0200, Oleg Nesterov wrote:
> On 06/24, Dave Jones wrote:
> >
> > On Sun, Jun 23, 2013 at 06:04:52PM +0200, Oleg Nesterov wrote:
> >  >
> >  > Could you please do the following:
> >  >
> >  >  1. # cd /sys/kernel/debug/tracing
> >  >     # echo 0 >> options/function-trace
> >  >     # echo preemptirqsoff >> current_tracer
> >  >
> >  >  2. reproduce the lockup again
> >  >
> >  >  3. show the result of
> >  >     # cat trace
> >
> > Not sure this is helpful, but..
> 
> This makes me think that something is seriously broken.
> 
> Or I do not understand this stuff at all. Quite possible too.
> Steven, could you please help?
> 
> > # preemptirqsoff latency trace v1.1.5 on 3.10.0-rc7+
> > # --------------------------------------------------------------------
> > # latency: 165015310 us, #4/4, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 
> > #P:4)
> 
> OK, 165015310/1000000 = 165, nice.

9600 baud modem serial console?

> 
> > #    -----------------
> > #    | task: trinity-child1-3173 (uid:1000 nice:19 policy:0 rt_prio:0)
> > #    -----------------
> > #  => started at: vprintk_emit
> > #  => ended at:   vprintk_emit
> > #
> > #
> > #                  _------=> CPU#
> > #                 / _-----=> irqs-off
> > #                | / _----=> need-resched
> > #                || / _---=> hardirq/softirq
> > #                ||| / _--=> preempt-depth
> > #                |||| /     delay
> > #  cmd     pid   ||||| time  |   caller
> > #     \   /      |||||  \    |   /
> > trinity--3173    1dNh1    0us!: console_unlock <-vprintk_emit
> > trinity--3173    1dNh1 165015310us : console_unlock <-vprintk_emit
> > trinity--3173    1dNh1 165015311us+: stop_critical_timings <-vprintk_emit
> > trinity--3173    1dNh1 165015315us : <stack trace>
> >  => console_unlock
> >  => vprintk_emit
> >  => printk
> >  => watchdog_timer_fn
> 
> But this is already called in the non-preemtible context, how can
> 'started at' blame vprintk_emit?

Well, it looks to really have started with console_unlock() not
vprintk_emit.

> 
> >  => __run_hrtimer
> >  => hrtimer_interrupt
> >  => smp_apic_timer_interrupt
> >  => apic_timer_interrupt
> >  => _raw_spin_lock
> 
> This is where start_critical_timing() should be called?
> 
> Or by TRACE_IRQS_OFF in apic_timer_interrupt...

Also, what _raw_spin_lock is that. Unless the interrupt triggered at the
start of the completion spin lock (before it disabled interrupts), it
could have happened while spinning on inode_sb_list_lock?

But you are correct, the critical timing should have started with the
entering of smp_apic_timer_interrupt. Did anything re-enable interrupts?


> 
> >  => sync_inodes_sb
> >  => sync_inodes_one_sb
> >  => iterate_supers
> >  => sys_sync
> >  => tracesys
> 
> Also. watchdog_timer_fn() calls printk() only if it detects the
> lockup, so I assume you hit another one?

Probably.

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to