On Tue, May 16, 2017 at 07:27:42AM -0700, Paul E. McKenney wrote: > On Tue, May 16, 2017 at 05:46:06AM -0700, Paul E. McKenney wrote:
> > Something like this, yes. Maybe even exactly like this. ;-) > > Ah, one thing I forgot... If you are avoiding use of get_online_cpus(), > you usually also have to be very careful about how you use things like > cpu_online() and cpu_is_offline. OK, so I think I got it wrong there. This hunk should close any race between perf_pmu_register() and hotplug by tracking a global state protected by pmus_lock. Thereby insuring the cpuctx->online state gets initialized right. --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -8993,6 +8993,8 @@ static int pmu_dev_alloc(struct pmu *pmu static struct lock_class_key cpuctx_mutex; static struct lock_class_key cpuctx_lock; +static cpumask_var_t perf_online_mask; + int perf_pmu_register(struct pmu *pmu, const char *name, int type) { int cpu, ret; @@ -9056,7 +9058,7 @@ int perf_pmu_register(struct pmu *pmu, c lockdep_set_class(&cpuctx->ctx.mutex, &cpuctx_mutex); lockdep_set_class(&cpuctx->ctx.lock, &cpuctx_lock); cpuctx->ctx.pmu = pmu; - cpuctx->online = cpu_online(cpu); + cpuctx->online = cpumask_test_cpu(cpu, perf_online_mask); __perf_mux_hrtimer_init(cpuctx, cpu); } @@ -10952,6 +10954,8 @@ static void __init perf_event_init_all_c struct swevent_htable *swhash; int cpu; + zalloc_cpumask_var(&perf_online_mask, GFP_KERNEL); + for_each_possible_cpu(cpu) { swhash = &per_cpu(swevent_htable, cpu); mutex_init(&swhash->hlist_mutex); @@ -11011,6 +11015,7 @@ static void perf_event_exit_cpu_context( cpuctx->online = 0; mutex_unlock(&ctx->mutex); } + cpumask_clear_cpu(cpu, perf_online_mask); mutex_unlock(&pmus_lock); } #else @@ -11028,6 +11033,7 @@ int perf_event_init_cpu(unsigned int cpu perf_swevent_init_cpu(cpu); mutex_lock(&pmus_lock); + cpumask_set_cpu(cpu, perf_online_mask); list_for_each_entry(pmu, &pmus, entry) { cpuctx = per_cpu_ptr(pmu->pmu_cpu_context, cpu); ctx = &cpuctx->ctx; > > > --- > > > --- a/kernel/events/core.c > > > +++ b/kernel/events/core.c > > > @@ -8997,7 +8997,6 @@ int perf_pmu_register(struct pmu *pmu, c > > > { > > > int cpu, ret; > > > > > > - get_online_cpus(); > > > mutex_lock(&pmus_lock); > > > ret = -ENOMEM; > > > pmu->pmu_disable_count = alloc_percpu(int); > > > > There is usually also some state check in here somewhere for the CPU > > being offline from a perf perspective. Such a check might already exist, > > but I must plead ignorance of perf. This just allocates per-cpu storage, that is per definition on the possible mask and unrelated to the online mask. > > > @@ -9093,7 +9092,6 @@ int perf_pmu_register(struct pmu *pmu, c > > > ret = 0; > > > unlock: > > > mutex_unlock(&pmus_lock); > > > - put_online_cpus(); > > > > > > return ret; > > > > > > @@ -11002,10 +11000,9 @@ static void perf_event_exit_cpu_context( > > > struct perf_cpu_context *cpuctx; > > > struct perf_event_context *ctx; > > > struct pmu *pmu; > > > - int idx; > > > > > > - idx = srcu_read_lock(&pmus_srcu); > > > - list_for_each_entry_rcu(pmu, &pmus, entry) { > > > + mutex_lock(&pmus_lock); > > > > If the state change checked for by perf_pmu_register() needs to be also > > guarded by ctx->mutex, this looks right to me. Right, so we have two locks, pmus_lock that serializes hotplug vs perf_pmu_register and ctx->lock that serializes find_get_context() vs hotplug. Together they ensure that if a PMU is observed, the PMU's cpuctx's have the correct ->online state. > > Just for completeness, the other style is to maintain separate per-CPU > > state, in which case you would instead acquire pmus_lock, mark this > > CPU off-limits to more perf_pmu_register() usage, release pmus_lock, > > then clean up any old usage. I'm not immediately seeing the other style, but per the above, I need that too. Because the previous could race against hotplug if perf_pmu_register() would observe cpu_online() as set after perf_event_exit_cpu() or something. With the above change any chance of a race is gone and we don't need to worry about hotplug ordering too much. Now I just need to do something about the swevent hash, because that's got a hole in now.