On Tue, Nov 27, 2018 at 3:36 PM Andi Kleen <a...@linux.intel.com> wrote:
>
> > It does seem that FREEZE_PERFMON_ON_PMI (misnamed as it is) is of
> > rather limited use (or even negative, in our case) to a counter that's
> > already restricted to ring 3.
>
> It's much faster. The PMI cost goes down dramatically.
>
> I still think the right fix is to add a perf event opt-out and let it be
> used by rr.
>
> V3 is without counter freezing.
> V4 is with counter freezing.
> The value is the average cost of the PMI handler.
> (lower is better)
>
> perf options                  V3(ns)  V4(ns)  delta
> -c 100000                     1088    894     -18%
> -g -c 100000                  1862    1646    -12%
> --call-graph lbr -c 100000    3649    3367    -8%
> --c.g. dwarf -c 100000        2248    1982    -12%
>

Is that measured on the same machine, i.e., do you force V3 on Skylake?

All it does, I think, is save one wrmsr(GLOBAL_CTRL) on entry to the PMU
interrupt handler, or am I missing something? Or does it save two, counting
the wrmsr(GLOBAL_CTRL) at the end that reactivates the counters?

>
> -Andi
>
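
As a sanity check on the quoted table, the delta column follows from the
two timing columns as (V4 - V3) / V3. A minimal sketch in Python; the
names pmi_cost_ns and delta_percent are made up for illustration and do
not come from perf or the kernel:

```python
# Average PMI handler cost in ns, copied from the quoted table.
# Keys are the perf options; values are (V3 without counter freezing,
# V4 with counter freezing).
pmi_cost_ns = {
    "-c 100000": (1088, 894),
    "-g -c 100000": (1862, 1646),
    "--call-graph lbr -c 100000": (3649, 3367),
    "--c.g. dwarf -c 100000": (2248, 1982),
}


def delta_percent(v3_ns, v4_ns):
    """Relative change from V3 to V4, rounded to a whole percent."""
    return round(100 * (v4_ns - v3_ns) / v3_ns)


# Percentage change per perf option, in table order.
deltas = {opts: delta_percent(v3, v4) for opts, (v3, v4) in pmi_cost_ns.items()}

# Absolute saving per PMI in ns, in table order.
savings_ns = {opts: v3 - v4 for opts, (v3, v4) in pmi_cost_ns.items()}

for opts, (v3, v4) in pmi_cost_ns.items():
    print(f"{opts:28s} {v3:5d} {v4:5d} {deltas[opts]:+d}% ({savings_ns[opts]} ns)")
```

The rounded percentages match the quoted delta column. In absolute terms
the saving is between 194 ns and 282 ns per PMI, which is the figure to
compare against the cost of the wrmsr(GLOBAL_CTRL) writes in question.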