On Fri, Oct 24, 2014 at 05:20:48PM -0700, Jay Vosburgh wrote:
> Paul E. McKenney <paul...@linux.vnet.ibm.com> wrote:
> 
> >On Fri, Oct 24, 2014 at 03:59:31PM -0700, Paul E. McKenney wrote:
> [...]
> >> Hmmm...  It sure looks like we have some callbacks stuck here.  I clearly
> >> need to take a hard look at the sleep/wakeup code.
> >> 
> >> Thank you for running this!!!
> >
> >Could you please try the following patch?  If no joy, could you please
> >add rcu:rcu_nocb_wake to the list of ftrace events?
> 
>       I tried the patch, it did not change the behavior.
> 
>       I enabled the rcu:rcu_barrier and rcu:rcu_nocb_wake tracepoints
> and ran it again (with this patch and the first patch from earlier
> today); the trace output is a bit on the large side so I put it and the
> dmesg log at:
> 
> http://people.canonical.com/~jvosburgh/nocb-wake-dmesg.txt
> 
> http://people.canonical.com/~jvosburgh/nocb-wake-trace.txt

Thank you again!

Very strange part of the trace.  The only signs of CPUs 2 and 3 are:

    ovs-vswitchd-902   [000] ....   109.896840: rcu_barrier: rcu_sched Begin cpu -1 remaining 0 # 0
    ovs-vswitchd-902   [000] ....   109.896840: rcu_barrier: rcu_sched Check cpu -1 remaining 0 # 0
    ovs-vswitchd-902   [000] ....   109.896841: rcu_barrier: rcu_sched Inc1 cpu -1 remaining 0 # 1
    ovs-vswitchd-902   [000] ....   109.896841: rcu_barrier: rcu_sched OnlineNoCB cpu 0 remaining 1 # 1
    ovs-vswitchd-902   [000] d...   109.896841: rcu_nocb_wake: rcu_sched 0 WakeNot
    ovs-vswitchd-902   [000] ....   109.896841: rcu_barrier: rcu_sched OnlineNoCB cpu 1 remaining 2 # 1
    ovs-vswitchd-902   [000] d...   109.896841: rcu_nocb_wake: rcu_sched 1 WakeNot
    ovs-vswitchd-902   [000] ....   109.896842: rcu_barrier: rcu_sched OnlineNoCB cpu 2 remaining 3 # 1
    ovs-vswitchd-902   [000] d...   109.896842: rcu_nocb_wake: rcu_sched 2 WakeNotPoll
    ovs-vswitchd-902   [000] ....   109.896842: rcu_barrier: rcu_sched OnlineNoCB cpu 3 remaining 4 # 1
    ovs-vswitchd-902   [000] d...   109.896842: rcu_nocb_wake: rcu_sched 3 WakeNotPoll
    ovs-vswitchd-902   [000] ....   109.896843: rcu_barrier: rcu_sched Inc2 cpu -1 remaining 4 # 2

The pair of WakeNotPoll trace entries says that, at that point, RCU believed
that the rcuo kthreads for CPUs 2 and 3 did not exist.  :-/
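
For anyone who wants to look for the same pattern across the full trace
file, here is a minimal sketch that tallies rcu_nocb_wake reasons per CPU.
It is not part of the patches under discussion.  The function name and the
regular expression are illustrative, and the "<flavor> <cpu> <reason>" field
layout is assumed from the rcu_nocb_wake lines quoted above.

```python
import re
from collections import defaultdict

# Assumed payload layout of an rcu_nocb_wake event, taken from the trace
# lines quoted in this thread: "<flavor> <cpu> <reason>".
WAKE_RE = re.compile(r"rcu_nocb_wake:\s+(\S+)\s+(\d+)\s+(\S+)")


def wake_reasons_by_cpu(lines):
    """Return {cpu: {reason: count}} for every rcu_nocb_wake event found."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in lines:
        match = WAKE_RE.search(line)
        if match:
            _flavor, cpu, reason = match.groups()
            counts[int(cpu)][reason] += 1
    # Convert the nested defaultdicts into plain dicts for easy comparison.
    return {cpu: dict(reasons) for cpu, reasons in counts.items()}


# Sample input: the four rcu_nocb_wake lines from the excerpt above.
SAMPLE = """\
    ovs-vswitchd-902   [000] d...   109.896841: rcu_nocb_wake: rcu_sched 0 WakeNot
    ovs-vswitchd-902   [000] d...   109.896841: rcu_nocb_wake: rcu_sched 1 WakeNot
    ovs-vswitchd-902   [000] d...   109.896842: rcu_nocb_wake: rcu_sched 2 WakeNotPoll
    ovs-vswitchd-902   [000] d...   109.896842: rcu_nocb_wake: rcu_sched 3 WakeNotPoll
"""

if __name__ == "__main__":
    summary = wake_reasons_by_cpu(SAMPLE.splitlines())
    for cpu in sorted(summary):
        print(cpu, summary[cpu])
```

To run it against a saved trace, pass an open file object instead of
SAMPLE.splitlines(); any CPU that shows only WakeNotPoll entries would match
what is seen for CPUs 2 and 3 here.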

More diagnostics in order...

                                                        Thanx, Paul
