On Wed, Feb 17, 2016 at 10:35:39AM +0100, Jiri Olsa wrote:
> On Wed, Feb 17, 2016 at 04:34:16AM +0100, Oleg Nesterov wrote:
> > Finally I reproduced... let me add CC's and reply to initial message. This
> > has nothing to do with arm/uprobes.
> > 
> > I simply can't understand how perf calculates ->total_time_enabled/running.
> > At all. But the problem is that
> > 
> >     1. perf_event_enable_on_exec() does enable first, then event_sched_in().
> > 
> >        After that tstamp_enabled < tstamp_running
> > 
> >     2. This means that after the next update_event_times()
> >            total_time_running < total_time_enabled
> > 
> >            again, I fail to understand these calculations, but this is what
> >            perf_event_read_value() reports to user-space.
> > 
> >     3. /usr/bin/perf calls perf_counts_values__scale() which does
> > 
> >             else if (count->run < count->ena) {
> >                     scaled = 1;
> >                     count->val = (u64)((double) count->val * count->ena / count->run + 0.5);
> >             }
> > 
> >        and this is why you see the wrong number. count->val was correct but
> >        wrongly updated because  total_time_running < total_time_enabled.
> > 
> > I leave this to Peter and Jiri ;)
> 
> I did not notice the other conversation wasn't public, reposting ;-)
> 
> jirka
> 
> ---
> ouch, I tested with fedora kernel.. I can reproduce with 4.5
> 
> Pratyush bisected this into following commit:
> [3e349507d12de93b08b0aa814fc2aa0dee91c5ba] perf: Fix perf_enable_on_exec() event scheduling
> 
> it seems the commit above introduced unwanted difference between
> counter's enabled and running times.. I'm checking on that ;-)

Does something like so work?

---

So prior to 3e349507d12d ("perf: Fix perf_enable_on_exec() event
scheduling") we used to call task_ctx_sched_out() before
event_enable_on_exec().

ctx_sched_out() will call update_context_time(), therefore
__perf_event_mark_enabled() would have an up-to-date ctx->time.

Now, not so much. So explicitly update the ctx time before calling
event_enable_on_exec().

ctx_resched() will again call update_context_time(), resulting in a
slight difference the other way (running > enabled), which doesn't make
any sense either, but that we can (and should) clip.


--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3173,6 +3173,10 @@ static void perf_event_enable_on_exec(in
 
        cpuctx = __get_cpu_context(ctx);
        perf_ctx_lock(cpuctx, ctx);
+
+       update_context_time(ctx);
+       update_cgrp_time_from_cpuctx(cpuctx);
+
        list_for_each_entry(event, &ctx->event_list, event_entry)
                enabled |= event_enable_on_exec(event, ctx);
 
