On Wed, Feb 17, 2016 at 04:34:16AM +0100, Oleg Nesterov wrote:
> Finally I reproduced... let me add CC's and reply to initial message. This has
> nothing to do with arm/uprobes.
> 
> I simply can't understand how perf calculates ->total_time_enabled/running.
> At all. But the problem is that
> 
>       1. perf_event_enable_on_exec() does enable first, then event_sched_in().
> 
>          After that tstamp_enabled < tstamp_running
> 
>       2. This means that after the next update_event_times()
>            total_time_running < total_time_enabled
> 
>            again, I fail to understand these calculations, but this is what
>            perf_event_read_value() reports to user-space.
> 
>       3. /usr/bin/perf calls perf_counts_values__scale() which does
> 
>               else if (count->run < count->ena) {
>                       scaled = 1;
>                       count->val = (u64)((double) count->val * count->ena / 
> count->run + 0.5);
>               }
> 
>          and this is why you see the wrong number. count->val was correct but
>          wrongly updated because  total_time_running < total_time_enabled.
> 
> I leave this to Peter and Jiri ;)

I did not notice the ther conversation wasn't public, reposting ;-)

jirka

---
ouch, I tested with fedora kernel.. I can reproduce with 4.5

Pratyush bisected this into following commit:
[3e349507d12de93b08b0aa814fc2aa0dee91c5ba] perf: Fix perf_enable_on_exec() 
event scheduling

it seems the commit above introduced unwanted difference between
counter's enabled and running times.. I'm checking on that ;-)

as a workaround you could use --no-scale option to disable
scaling:

  # perf stat --no-scale -x';' -e 'probe_exact:*' ./exact_counts
  65535;;probe_exact:f_65535x;0;100.00
  997;;probe_exact:f_997x;0;100.00
  103;;probe_exact:f_103x;0;100.00
  3;;probe_exact:f_3x;0;100.00
  2;;probe_exact:f_2x;0;100.00
  1;;probe_exact:f_1x;0;100.00

Reply via email to