On Fri, Oct 24, 2014 at 05:20:48PM -0700, Jay Vosburgh wrote:
> Paul E. McKenney <paul...@linux.vnet.ibm.com> wrote:
>
> >On Fri, Oct 24, 2014 at 03:59:31PM -0700, Paul E. McKenney wrote:
> [...]
> >> Hmmm...  It sure looks like we have some callbacks stuck here.  I clearly
> >> need to take a hard look at the sleep/wakeup code.
> >>
> >> Thank you for running this!!!
> >
> >Could you please try the following patch?  If no joy, could you please
> >add rcu:rcu_nocb_wake to the list of ftrace events?
>
> 	I tried the patch, it did not change the behavior.
>
> 	I enabled the rcu:rcu_barrier and rcu:rcu_nocb_wake tracepoints
> and ran it again (with this patch and the first patch from earlier
> today); the trace output is a bit on the large side so I put it and the
> dmesg log at:
>
> 	http://people.canonical.com/~jvosburgh/nocb-wake-dmesg.txt
>
> 	http://people.canonical.com/~jvosburgh/nocb-wake-trace.txt

Thank you again!

Very strange part of the trace.  The only signs of CPUs 2 and 3 are:

ovs-vswitchd-902 [000] .... 109.896840: rcu_barrier: rcu_sched Begin cpu -1 remaining 0 # 0
ovs-vswitchd-902 [000] .... 109.896840: rcu_barrier: rcu_sched Check cpu -1 remaining 0 # 0
ovs-vswitchd-902 [000] .... 109.896841: rcu_barrier: rcu_sched Inc1 cpu -1 remaining 0 # 1
ovs-vswitchd-902 [000] .... 109.896841: rcu_barrier: rcu_sched OnlineNoCB cpu 0 remaining 1 # 1
ovs-vswitchd-902 [000] d... 109.896841: rcu_nocb_wake: rcu_sched 0 WakeNot
ovs-vswitchd-902 [000] .... 109.896841: rcu_barrier: rcu_sched OnlineNoCB cpu 1 remaining 2 # 1
ovs-vswitchd-902 [000] d... 109.896841: rcu_nocb_wake: rcu_sched 1 WakeNot
ovs-vswitchd-902 [000] .... 109.896842: rcu_barrier: rcu_sched OnlineNoCB cpu 2 remaining 3 # 1
ovs-vswitchd-902 [000] d... 109.896842: rcu_nocb_wake: rcu_sched 2 WakeNotPoll
ovs-vswitchd-902 [000] .... 109.896842: rcu_barrier: rcu_sched OnlineNoCB cpu 3 remaining 4 # 1
ovs-vswitchd-902 [000] d... 109.896842: rcu_nocb_wake: rcu_sched 3 WakeNotPoll
ovs-vswitchd-902 [000] .... 109.896843: rcu_barrier: rcu_sched Inc2 cpu -1 remaining 4 # 2

The pair of WakeNotPoll trace entries says that at that point, RCU believed
that CPU 2's and CPU 3's rcuo kthreads did not exist.  :-/

More diagnostics in order...

							Thanx, Paul
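
For anyone wanting to pull the same information out of the full trace file, here is a minimal sketch (not part of the original exchange) that groups rcu_nocb_wake events by target CPU and lists the CPUs that logged WakeNotPoll. The regular expression assumes the line layout of the excerpt quoted in this message; the names nocb_wake_reasons() and cpus_with_reason() are made up for illustration, and other kernel versions may format these events differently.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the excerpt in this message, e.g.:
#   ovs-vswitchd-902 [000] d... 109.896842: rcu_nocb_wake: rcu_sched 2 WakeNotPoll
NOCB_WAKE = re.compile(
    r"\[(?P<on_cpu>\d+)\]\s+\S+\s+(?P<ts>\d+\.\d+):\s+"
    r"rcu_nocb_wake:\s+(?P<flavor>\S+)\s+(?P<cpu>\d+)\s+(?P<reason>\S+)"
)


def nocb_wake_reasons(lines):
    """Map each target CPU to the list of rcu_nocb_wake reasons logged for it."""
    reasons = defaultdict(list)
    for line in lines:
        m = NOCB_WAKE.search(line)
        if m:
            reasons[int(m.group("cpu"))].append(m.group("reason"))
    return dict(reasons)


def cpus_with_reason(reasons, wanted):
    """Return the sorted CPUs that logged the given reason at least once."""
    return sorted(cpu for cpu, seen in reasons.items() if wanted in seen)


# A few lines copied from the excerpt above, used as sample input.
SAMPLE = """\
ovs-vswitchd-902 [000] .... 109.896841: rcu_barrier: rcu_sched OnlineNoCB cpu 0 remaining 1 # 1
ovs-vswitchd-902 [000] d... 109.896841: rcu_nocb_wake: rcu_sched 0 WakeNot
ovs-vswitchd-902 [000] d... 109.896841: rcu_nocb_wake: rcu_sched 1 WakeNot
ovs-vswitchd-902 [000] d... 109.896842: rcu_nocb_wake: rcu_sched 2 WakeNotPoll
ovs-vswitchd-902 [000] d... 109.896842: rcu_nocb_wake: rcu_sched 3 WakeNotPoll
""".splitlines()

if __name__ == "__main__":
    by_cpu = nocb_wake_reasons(SAMPLE)
    print(by_cpu)
    print("WakeNotPoll CPUs:", cpus_with_reason(by_cpu, "WakeNotPoll"))
```

To run it over a saved trace, pass an open file instead of SAMPLE, for example nocb_wake_reasons(open("nocb-wake-trace.txt")), assuming the trace was downloaded under that name.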