On Fri, Oct 25, 2013 at 7:44 PM, Jiri Olsa <jo...@redhat.com> wrote:
> On Wed, Oct 23, 2013 at 02:58:05PM +0200, Stephane Eranian wrote:
>> The RAPL PMU counters do not interrupt on overflow.
>> Therefore, the kernel needs to poll the counters
>> to avoid missing an overflow. This patch adds
>> the hrtimer code to do this.
>>
>> The timer interval is calculated at boot time
>> based on the power unit used by the HW.
>>
>> Signed-off-by: Stephane Eranian <eran...@google.com>
>> ---
>>  arch/x86/kernel/cpu/perf_event_intel_rapl.c | 75 +++++++++++++++++++++++++--
>>  1 file changed, 70 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/perf_event_intel_rapl.c b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
>> index 3d71d39..ed0566a 100644
>> --- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
>> +++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
>> @@ -92,11 +92,13 @@ static struct kobj_attribute format_attr_##_var = \
>>
>>  struct rapl_pmu {
>>          spinlock_t lock;
>> -        atomic_t refcnt;
>>          int hw_unit; /* 1/2^hw_unit Joule */
>> -        int phys_id;
>> -        int n_active; /* number of active events */
>> +        struct hrtimer hrtimer;
>>          struct list_head active_list;
>> +        ktime_t timer_interval; /* in ktime_t unit */
>> +        int n_active; /* number of active events */
>> +        int phys_id;
>> +        atomic_t refcnt;
>>  };
>>
>>  static struct pmu rapl_pmu_class;
>> @@ -161,6 +163,47 @@ static u64 rapl_event_update(struct perf_event *event)
>>          return new_raw_count;
>>  }
>>
>> +static void rapl_start_hrtimer(struct rapl_pmu *pmu)
>> +{
>> +        __hrtimer_start_range_ns(&pmu->hrtimer,
>> +                        pmu->timer_interval, 0,
>> +                        HRTIMER_MODE_REL_PINNED, 0);
>> +}
>> +
>> +static void rapl_stop_hrtimer(struct rapl_pmu *pmu)
>> +{
>> +        hrtimer_cancel(&pmu->hrtimer);
>> +}
>> +
>> +static enum hrtimer_restart rapl_hrtimer_handle(struct hrtimer *hrtimer)
>> +{
>> +        struct rapl_pmu *pmu = container_of(hrtimer, struct rapl_pmu, hrtimer);
>> +        struct perf_event *event;
>> +        unsigned long flags;
>> +        if (!pmu->n_active)
>> +                return HRTIMER_NORESTART;
>> +
>> +        spin_lock_irqsave(&pmu->lock, flags);
>> +
>> +        list_for_each_entry(event, &pmu->active_list, active_entry) {
>> +                rapl_event_update(event);
>> +        }
>
> hi,
> I don't fully understand the reason for the timer,
> I'm probably missing something..
>
The reason is rather simple and is similar to what happens with uncore.
The counters are narrow (32-bit) and have no interrupt capability, so
we need to poll them and accumulate into the software counter to avoid
missing an overflow.
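
To make the accumulation concrete, here is a minimal userspace sketch of
folding a narrow 32-bit counter into a 64-bit software total. It is not the
kernel code from the patch: struct sw_counter and sw_counter_update() are
hypothetical names used only for illustration, and the sketch assumes the
counter wraps at most once between two polls, which is what the periodic
timer is meant to guarantee.

```c
#include <stdint.h>

/* Hypothetical software-side state for one narrow hardware counter. */
struct sw_counter {
	uint32_t prev_raw;	/* last raw value read from the hardware */
	uint64_t total;		/* accumulated 64-bit software count */
};

/*
 * Fold a new raw reading into the 64-bit total.
 * Unsigned 32-bit subtraction is modulo 2^32, so the delta is still
 * correct if the hardware counter wrapped once since the last poll.
 * If it wrapped more than once, the extra 2^32 counts are lost.
 */
static void sw_counter_update(struct sw_counter *c, uint32_t raw)
{
	uint32_t delta = (uint32_t)(raw - c->prev_raw);

	c->prev_raw = raw;
	c->total += delta;
}
```

For example, starting from prev_raw = 0xFFFFFF00 and then reading 0x00000010
gives a delta of 0x110 (272), even though the raw value went backwards.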
> - the timer calls rapl_event_update for all defined events

No, only for the defined RAPL events, which is what we want.

> - but rapl_pmu_event_read calls rapl_event_update any time the
>   event is read (sys_read)
>
Yes, but we want to avoid missing a counter overflow, which can happen
between reads if the counter counts in a unit that increments quickly.

> The rapl_event_update only read msr and updates
> event->count|hw,prev_count.

No, it does update the count:

    local64_add(sdelta, &event->count);
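
As a rough illustration of how quickly a 32-bit counter can wrap, the sketch
below estimates the wrap time from the energy unit (1/2^hw_unit Joule, as in
the struct comment in the patch) and an assumed sustained power draw. The
function name rapl_wrap_ms() and the wattage are assumptions for
illustration; this is not the interval formula used by the patch.

```c
#include <stdint.h>

/*
 * Estimate, in milliseconds, how long a 32-bit energy counter takes to
 * wrap when each increment is 1/2^hw_unit Joule and the package draws
 * a constant `watts` (Joules per second).
 *
 * A full wrap is 2^32 increments, i.e. 2^(32 - hw_unit) Joules.
 * Returns 0 for inputs outside the range this sketch handles.
 */
static uint64_t rapl_wrap_ms(int hw_unit, uint64_t watts)
{
	uint64_t joules_per_wrap;

	if (hw_unit < 0 || hw_unit > 32 || watts == 0)
		return 0;

	joules_per_wrap = 1ULL << (32 - hw_unit);
	return (joules_per_wrap * 1000) / watts;
}
```

With hw_unit = 16 and an assumed 100 W, one wrap is 65536 J, so
rapl_wrap_ms(16, 100) gives 655360 ms, a little under 11 minutes. A poll
interval of half that leaves margin for a single wrap between polls.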