On Mon, 1 Jul 2013, Stephen Boyd wrote:
> On 07/01/13 14:24, Thomas Gleixner wrote:
> > On Mon, 1 Jul 2013, Stephen Boyd wrote:
> >> On 07/01/13 13:14, Thomas Gleixner wrote:
> >>> The issue is very subtle. What happens is:
> >>>
> >>> CPU0                                              CPU1
> >>>
> >>> Switch to oneshot mode
> >>>
> >>>  Copy the bits from tick_broadcast_mask to
> >>>  tick_broadcast_oneshot_mask. We need to do
> >>>  that so the other cpus reach the timer irq
> >>>  and the softirq which switches them to
> >>>  oneshot.
> >>>
> >>>  Kick the broadcast device into oneshot.
> >>>
> >>>                                           Timer interrupt fires
> >>>                                           
> >>>                                           irq_enter sees the cpu in
> >>>                                           tick_broadcast_oneshot_mask and
> >>>                                           sets the device to oneshot mode
> >>>                                           
> >>>                                           handle_periodic:
> >>>                                            Sees oneshot mode and adds
> >>>                                            period to
> >>>                                            dev->next_event(KTIME_MAX)
> >>>                   
> >> Yep. It is also racing with the timer interrupt so having more than two
> >> CPUs must help widen the window (which is why we see it on the higher
> >> numbered CPUs).
> > The race above is about the timer interrupt. You mean the broadcast
> > one which is still enabled due to the dummy -> functional transition
> > issue, right? That helps a lot to make this more visible, because we
> > double the number of events.
> 
> I was thinking that tick_check_oneshot_broadcast() is racing with
> tick_switch_to_oneshot() and because we have more CPUs we're more likely
> to have a CPU fix up the handler in tick_switch_to_oneshot() after
> tick_check_oneshot_broadcast() forces that CPU to oneshot mode and the
> periodic handler runs. I wonder if I can reproduce it locally by making
> tick_switch_to_oneshot() spin for a jiffy or two on CPU1.

tick_switch_to_oneshot() is invoked with interrupts disabled, so that
won't help.
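
For anyone following along, the sequence in the quoted diagram can be
modelled in userspace roughly as below. This is only a sketch and not
kernel code: every model_* / MODEL_* name is made up for illustration,
the 10ms period assumes HZ=100, and next_event starting out at
KTIME_MAX is taken from the diagram rather than derived here. It shows
nothing more than the ordering problem: once the mask bit is copied,
the next timer interrupt flips the device to oneshot and the periodic
handler adds a period to a value that is already at the maximum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model, NOT kernel code. All model_* / MODEL_* names are
 * hypothetical; they only mirror the steps in the quoted diagram.
 */
#define MODEL_NR_CPUS    2
#define MODEL_KTIME_MAX  INT64_MAX
#define MODEL_PERIOD_NS  INT64_C(10000000)  /* assumed tick period, HZ=100 */

enum model_mode { MODEL_PERIODIC, MODEL_ONESHOT };

enum model_result {
	MODEL_NO_REPROGRAM,    /* device still periodic, nothing is added */
	MODEL_REPROGRAM_OK,    /* next_event + period is representable */
	MODEL_REPROGRAM_BOGUS  /* next_event + period would overflow int64_t */
};

struct model_state {
	bool broadcast_mask[MODEL_NR_CPUS];
	bool oneshot_mask[MODEL_NR_CPUS];
	enum model_mode mode[MODEL_NR_CPUS];
	int64_t next_event[MODEL_NR_CPUS];
};

/* CPU0 step: copy the broadcast bits into the oneshot mask. */
static void model_switch_to_oneshot(struct model_state *s)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		s->oneshot_mask[cpu] = s->broadcast_mask[cpu];
}

/* CPU1 irq_enter step: a set oneshot bit forces the device to oneshot. */
static void model_irq_enter(struct model_state *s, int cpu)
{
	if (s->oneshot_mask[cpu])
		s->mode[cpu] = MODEL_ONESHOT;
}

/* CPU1 periodic handler step: in oneshot mode, add period to next_event. */
static enum model_result model_handle_periodic(const struct model_state *s,
					       int cpu, int64_t *next)
{
	if (s->mode[cpu] != MODEL_ONESHOT)
		return MODEL_NO_REPROGRAM;

	/* Check before adding so the model never hits signed overflow. */
	if (s->next_event[cpu] > MODEL_KTIME_MAX - MODEL_PERIOD_NS)
		return MODEL_REPROGRAM_BOGUS;

	*next = s->next_event[cpu] + MODEL_PERIOD_NS;
	return MODEL_REPROGRAM_OK;
}
```

Calling model_irq_enter() and model_handle_periodic() for CPU1 before
model_switch_to_oneshot() leaves the device periodic. Calling them
afterwards, with next_event at MODEL_KTIME_MAX, yields
MODEL_REPROGRAM_BOGUS. What the real code does with such a value is
deliberately not modelled.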
