* Marty McFadden <mcfadd...@llnl.gov> wrote:

> 
> This patch addresses the following two problems:
>   1. The current msr module grants all-or-nothing access to MSRs,
>      thus making user-level runtime performance adjustments 
>      problematic, particularly for power-constrained HPC systems.
> 
>   2. The current msr module requires a separate system call and the
>      acquisition of the preemption lock for each individual MSR access. 
>      This overhead degrades performance of runtime tools that would
>      ideally sample multiple MSRs at high frequencies.

No, we really don't want to touch the old MSR code - it's a very opaque API
with various deep limitations.

What I'd like to see instead is to use a modern system monitoring interface -
and in fact that already happened in the last kernel release, we added the
perf MSR access methods via:

 commit b7b7c7821d932ba18ef6c8eafc8536066b4c2ef4
 Author: Andy Lutomirski <l...@kernel.org>
 Date:   Mon Jul 20 11:49:06 2015 -0400

    perf/x86: Add an MSR PMU driver
    
    This patch adds an MSR PMU to support free running MSR counters. Such
    as time and freq related counters includes TSC, IA32_APERF, IA32_MPERF
    and IA32_PPERF, but also SMI_COUNT.
    
    The events are exposed in sysfs for use by perf stat and other tools.
    The files are under /sys/devices/msr/events/

see arch/x86/cpu/perf/msr.c, or arch/x86/events/msr.c in the latest perf tree:

  git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf/core

For example, with the perf ABIs, 'batch access' of a group of MSRs is easy: a
group of events can be read or sampled at once. It can be done in a
system-wide, per-task or per-task-hierarchy fashion, with cgroup management as
well - it's a modern API.
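
To make the 'batch access' point concrete, here is a rough user-space sketch
(not taken from the kernel tree) of opening several MSR PMU events as one perf
group and fetching them with a single read(). The helper names and MAX_GROUP
are made up for illustration. It assumes the PMU type id is read from
/sys/devices/msr/type and that the event config equals the perf_msr_id value
(e.g. 1 for aperf, 2 for mperf); check the files under
/sys/devices/msr/events/ on your kernel before relying on that.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_GROUP 8 /* hypothetical upper bound on events per group */

/* Read the dynamically assigned type id of the msr PMU; -1 if it is absent. */
static int msr_pmu_type(void)
{
    FILE *f = fopen("/sys/devices/msr/type", "r");
    int type = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &type) != 1)
        type = -1;
    fclose(f);
    return type;
}

/* Fill in a counting (non-sampling) attr that uses the group read format. */
static void msr_event_attr(struct perf_event_attr *attr, int pmu_type,
                           uint64_t config)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = (uint32_t)pmu_type;
    attr->config = config;              /* assumed: the perf_msr_id value */
    attr->read_format = PERF_FORMAT_GROUP;
}

/* Open one event on a CPU for all tasks; group_fd == -1 creates a leader. */
static int open_msr_event(int pmu_type, uint64_t config, int cpu, int group_fd)
{
    struct perf_event_attr attr;

    msr_event_attr(&attr, pmu_type, config);
    return (int)syscall(SYS_perf_event_open, &attr, -1, cpu, group_fd, 0UL);
}

/* PERF_FORMAT_GROUP layout with no other flags: { u64 nr; u64 value[nr]; } */
static int parse_group_buf(const uint64_t *buf, size_t bytes,
                           uint64_t *values, int max)
{
    uint64_t nr, i;

    if (bytes < sizeof(uint64_t))
        return -1;
    nr = buf[0];
    if (nr > (uint64_t)max || bytes < (1 + nr) * sizeof(uint64_t))
        return -1;
    for (i = 0; i < nr; i++)
        values[i] = buf[1 + i];
    return (int)nr;
}

/* One read() on the leader returns every counter in the group. */
static int read_msr_group(int leader_fd, uint64_t *values, int max)
{
    uint64_t buf[1 + MAX_GROUP];
    ssize_t n;

    if (max > MAX_GROUP)
        max = MAX_GROUP;
    n = read(leader_fd, buf, sizeof(buf));
    if (n < 0)
        return -1;
    return parse_group_buf(buf, (size_t)n, values, max);
}
```

A caller would open the leader with open_msr_event(type, 1, 0, -1), add a
sibling with open_msr_event(type, 2, 0, leader_fd), and then call
read_msr_group(leader_fd, vals, 2) as often as needed. Opening CPU-wide events
like this usually needs elevated privileges or a permissive
perf_event_paranoid setting.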

Right now the MSR PMU code is only at its first version, with only these few
MSRs exposed:

enum perf_msr_id {
        PERF_MSR_TSC                    = 0,
        PERF_MSR_APERF                  = 1,
        PERF_MSR_MPERF                  = 2,
        PERF_MSR_PPERF                  = 3,
        PERF_MSR_SMI                    = 4,

        PERF_MSR_EVENT_MAX,
};

but that can (and should) be expanded and more features can be added.
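
Since the set of exposed MSRs may grow, a tool probably should not hard-code
the list above. A minimal sketch of discovering the events at run time from
the sysfs directory mentioned in the commit message follows. The function
names are hypothetical, and it assumes each event file holds a string of the
form "event=0x01"; files in any other format are skipped.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Parse "event=0x01" into 1; returns -1 if the text has another format. */
static long parse_event_spec(const char *spec)
{
    unsigned long cfg;

    if (sscanf(spec, "event=%lx", &cfg) != 1)
        return -1;
    return (long)cfg;
}

/* Print "name config" for each event file; returns the count, -1 on error. */
static int list_msr_events(const char *dir)
{
    DIR *d = opendir(dir);
    struct dirent *e;
    int count = 0;

    if (!d)
        return -1;
    while ((e = readdir(d)) != NULL) {
        char path[512], spec[64];
        FILE *f;
        long cfg;

        if (e->d_name[0] == '.')
            continue;                   /* skip "." and ".." */
        snprintf(path, sizeof(path), "%s/%s", dir, e->d_name);
        f = fopen(path, "r");
        if (!f)
            continue;
        if (fgets(spec, sizeof(spec), f) &&
            (cfg = parse_event_spec(spec)) >= 0) {
            printf("%s %ld\n", e->d_name, cfg);
            count++;
        }
        fclose(f);
    }
    closedir(d);
    return count;
}
```

Calling list_msr_events("/sys/devices/msr/events") on a kernel with the MSR PMU
would print one line per available event; on a kernel without it the function
returns -1.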

Thanks,

        Ingo
