On Fri, Jun 12, 2015 at 09:36:50AM +0200, Peter Zijlstra wrote:
> On Thu, Jun 11, 2015 at 07:36:07PM +0200, Frederic Weisbecker wrote:
> > +static void tick_nohz_full_update_dependencies(void)
> > +{
> > +   struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
> > +
> > +   if (!posix_cpu_timers_can_stop_tick(current))
> > +           ts->tick_needed |= TICK_NEEDED_POSIX_CPU_TIMER;
> > +
> > +   if (!perf_event_can_stop_tick())
> > +           ts->tick_needed |= TICK_NEEDED_PERF_EVENT;
> > +
> > +   if (!sched_can_stop_tick())
> > +           ts->tick_needed |= TICK_NEEDED_SCHED;
> >  
> >  #ifdef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
> >     /*
> > +    * sched_clock_tick() needs us?
> > +    *
> >      * TODO: kick full dynticks CPUs when
> >      * sched_clock_stable is set.
> >      */
> >     if (!sched_clock_stable()) {
> > +           ts->tick_needed |= TICK_NEEDED_CLOCK_UNSTABLE;
> >             /*
> >              * Don't allow the user to think they can get
> >              * full NO_HZ with this machine.
> >              */
> >             WARN_ONCE(tick_nohz_full_running,
> >                       "NO_HZ FULL will not work with unstable sched clock");
> >     }
> >  #endif
> >  }
> 
> Colour me confused; why does this function exist at all? Should not
> these bits be managed by those respective subsystems?

So we have two choices here:

1) Something changes in a subsystem which needs the tick and that subsystem
   sends an IPI to the CPU that is concerned such that it changes the tick
   dependency state.
   
   pros: The dependency bits are always modified and read locally.
   cons: We also need to check the subsystems on task switch, because the
         next task may have different dependencies than prev. So that's
         context switch overhead.

2) Whenever a subsystem changes its dependency on the tick (needs it or
   doesn't need it anymore), that subsystem remotely changes the dependency
   bits, then sends an IPI in case we switched from "tick needed" to
   "tick not needed".

   pros: Less context switch overhead.
   cons: Works only for subsystems whose dependency is per CPU (scheduler).
         Others, whose dependency is exclusively per task or system wide,
         need more complicated treatment: posix cpu timers would then need
         to switch to a separate global flag.
         perf depends on both a global state and a per cpu state.
         The flags are read remotely. This involves some ordering but no
         full barrier since we have the IPI.

This patchset takes the simple 1) way, which can definitely be improved.
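
To make the cost of 1) concrete, here is a rough user-space sketch, not the
kernel code. The DEP_* bits, struct task, struct cpu_state and both functions
are hypothetical stand-ins for illustration. The scheduler condition is
assumed to be "more than one runnable task", and the mask is rebuilt from
scratch on each call, which is a simplification of the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the TICK_NEEDED_* bits of the patch. */
enum {
	DEP_POSIX_CPU_TIMER = 1 << 0,
	DEP_PERF_EVENT      = 1 << 1,
	DEP_SCHED           = 1 << 2,
};

/* Hypothetical per-task and per-CPU state, for illustration only. */
struct task {
	bool has_cpu_timer;
};

struct cpu_state {
	unsigned int tick_needed;
	int nr_running;
	bool perf_needs_tick;
};

/* Rebuild the local dependency mask by asking every subsystem in turn. */
static unsigned int update_dependencies(struct cpu_state *cs,
					const struct task *curr)
{
	unsigned int deps = 0;

	assert(cs && curr);

	if (curr->has_cpu_timer)
		deps |= DEP_POSIX_CPU_TIMER;
	if (cs->perf_needs_tick)
		deps |= DEP_PERF_EVENT;
	if (cs->nr_running > 1)		/* assumed scheduler condition */
		deps |= DEP_SCHED;

	cs->tick_needed = deps;
	return deps;
}

/*
 * Runs on every task switch because the per-task dependencies of next
 * may differ from those of prev. Returns true if the tick can stop.
 */
static bool context_switch(struct cpu_state *cs, const struct task *next)
{
	return update_dependencies(cs, next) == 0;
}
```

Every subsystem check sits on the task switch path here, which is the
overhead listed under the cons of 1).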

Perhaps we should do 2) with one global mask and one per cpu mask, with all
flags atomically and remotely set and cleared by the relevant subsystems.
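
A minimal user-space sketch of that idea, using C11 atomics in place of the
kernel primitives. Everything here is an assumption for illustration: NR_CPUS,
the DEP_* bits, the dep_*() helpers, and kick_cpu(), which only counts calls
where the kernel would send an IPI. The sketch kicks a CPU whenever its
combined state flips in either direction, and it expects a single bit per
call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4			/* hypothetical CPU count */

enum {
	DEP_POSIX_CPU_TIMER = 1 << 0,
	DEP_PERF_EVENT      = 1 << 1,
	DEP_SCHED           = 1 << 2,
};

static atomic_uint global_deps;		/* system wide dependencies */
static atomic_uint cpu_deps[NR_CPUS];	/* per cpu dependencies */
static unsigned int kicks[NR_CPUS];	/* counts the would-be IPIs */

static void kick_cpu(int cpu)
{
	kicks[cpu]++;
}

/* A CPU needs its tick if any global or local bit is set. */
static bool tick_needed(int cpu)
{
	return (atomic_load(&global_deps) | atomic_load(&cpu_deps[cpu])) != 0;
}

static void dep_set_cpu(int cpu, unsigned int bit)
{
	unsigned int prev;

	assert(cpu >= 0 && cpu < NR_CPUS);
	prev = atomic_fetch_or(&cpu_deps[cpu], bit);
	/* Kick only if this was the first dependency of any kind. */
	if (!prev && !atomic_load(&global_deps))
		kick_cpu(cpu);
}

static void dep_clear_cpu(int cpu, unsigned int bit)
{
	unsigned int prev;

	assert(cpu >= 0 && cpu < NR_CPUS);
	prev = atomic_fetch_and(&cpu_deps[cpu], ~bit);
	/* Kick only if this was the last dependency of any kind. */
	if (prev == bit && !atomic_load(&global_deps))
		kick_cpu(cpu);
}

/* Kick every CPU that has no local dependency of its own. */
static void kick_cpus_without_local_deps(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (!atomic_load(&cpu_deps[cpu]))
			kick_cpu(cpu);
}

static void dep_set_global(unsigned int bit)
{
	if (!atomic_fetch_or(&global_deps, bit))
		kick_cpus_without_local_deps();
}

static void dep_clear_global(unsigned int bit)
{
	if (atomic_fetch_and(&global_deps, ~bit) == bit)
		kick_cpus_without_local_deps();
}
```

The task switch path no longer has to poll the subsystems. The price is that
the masks are written and read remotely, so the ordering between the atomic
update and the kick has to be right.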