On Wed, Feb 17, 2016 at 04:34:16AM +0100, Oleg Nesterov wrote: > Finally I reproduced... let me add CC's and reply to initial message. This has > nothing to do with arm/uprobes. > > I simply can't understand how perf calculates ->total_time_enabled/running. > At all. But the problem is that > > 1. perf_event_enable_on_exec() does enable first, then event_sched_in(). > > After that tstamp_enabled < tstamp_running > > 2. This means that after the next update_event_times() > total_time_running < total_time_enabled > > again, I fail to understand these calculations, but this is what > perf_event_read_value() reports to user-space. > > 3. /usr/bin/perf calls perf_counts_values__scale() which does > > else if (count->run < count->ena) { > scaled = 1; > count->val = (u64)((double) count->val * count->ena / > count->run + 0.5); > } > > and this is why you see the wrong number. count->val was correct but > wrongly updated because total_time_running < total_time_enabled. > > I leave this to Peter and Jiri ;)
I did not notice the ther conversation wasn't public, reposting ;-) jirka --- ouch, I tested with fedora kernel.. I can reproduce with 4.5 Pratyush bisected this into following commit: [3e349507d12de93b08b0aa814fc2aa0dee91c5ba] perf: Fix perf_enable_on_exec() event scheduling it seems the commit above introduced unwanted difference between counter's enabled and running times.. I'm checking on that ;-) as a workaround you could use --no-scale option to disable scaling: # perf stat --no-scale -x';' -e 'probe_exact:*' ./exact_counts 65535;;probe_exact:f_65535x;0;100.00 997;;probe_exact:f_997x;0;100.00 103;;probe_exact:f_103x;0;100.00 3;;probe_exact:f_3x;0;100.00 2;;probe_exact:f_2x;0;100.00 1;;probe_exact:f_1x;0;100.00