On Wed, Oct 21, 2015 at 04:16:31PM +0530, Viresh Kumar wrote: > Cc'ing Frederic. > > On 20-10-15, 15:47, Yunhong Jiang wrote: > > On Sun, Oct 11, 2015 at 08:12:39PM +0200, Thomas Gleixner wrote: > > > On Mon, 28 Sep 2015, Yunhong Jiang wrote: > > > > static void internal_add_timer(struct tvec_base *base, struct > > > > timer_list *timer) > > > > { > > > > + bool kick_nohz = false; > > > > + > > > > /* Advance base->jiffies, if the base is empty */ > > > > if (!base->all_timers++) > > > > base->timer_jiffies = jiffies; > > > > @@ -424,9 +426,17 @@ static void internal_add_timer(struct tvec_base > > > > *base, struct timer_list *timer) > > > > */ > > > > if (!(timer->flags & TIMER_DEFERRABLE)) { > > > > if (!base->active_timers++ || > > > > - time_before(timer->expires, base->next_timer)) > > > > + time_before(timer->expires, base->next_timer)) { > > > > base->next_timer = timer->expires; > > > > - } > > > > + /* > > > > + * CPU in dynticks need reevaluate the timer > > > > wheel > > > > + * if newer timer added with next_timer updated. > > > > + */ > > > > + if (base->nohz_active) > > > > + kick_nohz = true; > > > > + } > > > > + } else if (base->nohz_active && tick_nohz_full_cpu(base->cpu)) > > > > + kick_nohz = true; > > > > > > Why do you want to kick the other cpu when a deferrable timer got added? > > > > This is what happens in current implementation and this patch does not > > change the logic. According to the comments, it's to avoid race with > > idle_cpu(). Frankly speaking, I didn't get the idea of the race. > > > > Viresh, do you have any hints? > > I haven't looked at the core since few months now and looks like I > don't remember anything :) > > This thread is where we discussed it initially: > http://marc.info/?l=linux-kernel&m=139039035809125 > > AFAIU, this is why we kick the other CPU for a deferrable timer: > - The other CPU is a full-dynticks capable CPU and may be running > tickless and we should serve the timer in time (even if it is > deferrable) if the CPU isn't idle. > - We could have saved the kick for a full-dynticks idle CPU, but a > race can happen where we thought the CPU is idle, but it has just > started serving userspace tick-lessly. And the timer wouldn't be > served for long time, even when the cpu was busy.
Yeah, deferrable implies "idle deferrable", not "user deferrable". So if the CPU is tickless in userland, we want to kick it to handle the deferrable timer. This is further optimizable by making sure that we don't kick nohz full CPUs when they are idle as well. That said the handling of deferrable timers is buggy in full dynticks because get_next_timer_interrupt() ignores deferrable timers always. I should pass an "ignore_deferrable" parameter for non-idle dynticks but I'm afraid this will uncover timers that people don't want to see. But we'll have to do it eventually and maybe audit which of these deferrable timers also implies deferrable in userspace. > > Ofcourse, Frederic will kick me if I forgot the lessons he gave me > earlier :) Oh I know too much how easy it is to forget what $CODE_I_REVIEWED_X_MONTH_AGO actually does ;-) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/