On 9 August 2016 at 17:28, David Carrillo-Cisneros <davi...@google.com> wrote:
> Introduce the flag PERF_EV_CAP_READ_ACTIVE_PKG, useful for uncore events,
> that allows a PMU to signal the generic perf code that an event is readable
> in the current CPU if the event is active in a CPU in the same package as
> the current CPU.
>
> This is an optimization that avoids an unnecessary IPI for the common case
> where uncore events are run and read in the same package but in
> different CPUs.
>
> As an example, the IPI removal speeds up perf_read in my Haswell system
> as follows:
>   - For event UNC_C_LLC_LOOKUP: From 260 us to 31 us.
>   - For event RAPL's power/energy-cores/: From 255 us to 27 us.
>
> For the optimization to work, all events in the group must have it
> (similarly to PERF_EV_CAP_SOFTWARE).
>
> Signed-off-by: David Carrillo-Cisneros <davi...@google.com>
> ---
>  include/linux/perf_event.h |  3 +++
>  kernel/events/core.c       | 26 ++++++++++++++++++++++++--
>  2 files changed, 27 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index fa5617f..c8bb1b3 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -501,8 +501,11 @@ typedef void (*perf_overflow_handler_t)(struct perf_event *,
>   * Event capabilities. For event_caps and groups caps.
>   *
>   * PERF_EV_CAP_SOFTWARE: Is a software event.
> + * PERF_EV_CAP_READ_ACTIVE_PKG: A CPU event (or cgroup event) that can be read
> + * from any CPU in the package where it is active.
>   */
>  #define PERF_EV_CAP_SOFTWARE		BIT(0)
> +#define PERF_EV_CAP_READ_ACTIVE_PKG	BIT(1)
>
>  #define SWEVENT_HLIST_BITS		8
>  #define SWEVENT_HLIST_SIZE		(1 << SWEVENT_HLIST_BITS)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 34049cc..38ec68d 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -3333,6 +3333,22 @@ struct perf_read_data {
>  	int ret;
>  };
>
> +static int find_cpu_to_read(struct perf_event *event, int local_cpu)
> +{
> +	int event_cpu = event->oncpu;
> +	u16 local_pkg, event_pkg;
> +
> +	if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
> +		event_pkg = topology_physical_package_id(event_cpu);
> +		local_pkg = topology_physical_package_id(local_cpu);
> +
> +		if (event_pkg == local_pkg)
> +			return local_cpu;
> +	}
> +
> +	return event_cpu;
> +}
> +
>  /*
>   * Cross CPU call to read the hardware event
>   */
> @@ -3454,7 +3470,7 @@ u64 perf_event_read_local(struct perf_event *event)
>
>  static int perf_event_read(struct perf_event *event, bool group)
>  {
> -	int ret = 0;
> +	int ret = 0, cpu_to_read, local_cpu;
>
>  	/*
>  	 * If event is enabled and currently active on a CPU, update the
> @@ -3466,8 +3482,14 @@ static int perf_event_read(struct perf_event *event, bool group)
>  			.group = group,
>  			.ret = 0,
>  		};
> -		ret = smp_call_function_single(event->oncpu,
> +
> +		local_cpu = get_cpu();
> +		cpu_to_read = find_cpu_to_read(event, local_cpu);
> +		put_cpu();
> +
> +		ret = smp_call_function_single(cpu_to_read,
>  					       __perf_event_read, &data, 1);
> +
>  		ret = ret ? : data.ret;
>  	} else if (event->state == PERF_EVENT_STATE_INACTIVE) {
>  		struct perf_event_context *ctx = event->ctx;
> --
> 2.8.0.rc3.226.g39d4020
>
David, thanks for making the changes. This patch looks good to me.

-- Nilay