As we've learned more about the observability capabilities of Gen graphics, we've found that it's not enough to configure the OA unit from userspace alone, without any dedicated support from the kernel.
As it stands, the i965 backends for both AMD_performance_monitor and INTEL_performance_query can't report normalized metrics useful to application developers, because of the limitations of configuring the OA unit from userspace via LRIs.

More recently we've developed a perf PMU (performance monitoring unit) driver within the drm i915 driver ("i915_oa") that lets userspace configure and open an event fd via the perf_event_open syscall, which gives us a more complete interface for configuring the Gen graphics OA unit.

With help from the kernel we can support periodic sampling (where the hardware writes reports into a gpu mapped circular buffer that we can forward as perf samples), deal with the clock gating + PM limitations imposed by the observability hw, and manage + maintain the selection of performance counters.

The perf_event_open(2) man page is a good starting point for anyone wanting to learn about the Linux perf interface.

Something to beware of is that there's currently no precedent upstream for exposing device metrics via a perf PMU. Although early feedback was sought for this work, some of it may still change based on feedback from the core perf maintainers as well as the i915 drm driver maintainers.

This PRM is a good starting point for anyone wanting to learn about the Gen graphics Observability hardware. Some important information is currently missing and this should be updated soon, but that's more directly related to the i915_oa perf driver. Notably though, the report formats described here need to be understood by Mesa, since the perf samples simply forward the raw reports from the OA hardware.

https://01.org/sites/default/files/documentation/observability_performance_counters_haswell.pdf

This series re-works the i965 driver's support for exposing performance counters, taking advantage of this i915_oa perf event interface.
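Since the perf samples simply forward raw OA reports, a consumer such as Mesa ends up walking variable-sized records in a circular buffer. The following is a minimal, self-contained sketch of that walk, not code from this series. The header struct copies the layout of struct perf_event_header from <linux/perf_event.h>; drain_records(), ring_copy(), count_record() and the 512 byte scratch limit are hypothetical names and values chosen for illustration. It also leaves out the memory barriers a real reader needs around data_head and data_tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the layout of struct perf_event_header; redeclared here so
 * the sketch builds without kernel headers. */
struct record_header {
    uint32_t type;
    uint16_t misc;
    uint16_t size; /* whole record in bytes, header included */
};

/* Hypothetical per-record callback type. */
typedef void (*record_cb)(const struct record_header *hdr,
                          const uint8_t *payload, size_t payload_len,
                          void *user_data);

/* Copy len bytes starting at free-running offset pos, wrapping around
 * the end of a power-of-two sized buffer. */
static void
ring_copy(void *dst, const uint8_t *buf, size_t buf_size,
          uint64_t pos, size_t len)
{
    size_t off = (size_t)(pos & (buf_size - 1));
    size_t first = buf_size - off;

    if (first > len)
        first = len;
    memcpy(dst, buf + off, first);
    memcpy((uint8_t *)dst + first, buf, len - first);
}

/* Hand every complete record between tail and head to cb and return
 * the new tail. head and tail are free-running byte offsets. */
static uint64_t
drain_records(const uint8_t *buf, size_t buf_size,
              uint64_t head, uint64_t tail,
              record_cb cb, void *user_data)
{
    uint8_t scratch[512]; /* assumed upper bound on a single record */

    assert(buf_size != 0 && (buf_size & (buf_size - 1)) == 0);

    while (head - tail >= sizeof(struct record_header)) {
        struct record_header hdr;

        ring_copy(&hdr, buf, buf_size, tail, sizeof(hdr));
        if (hdr.size < sizeof(hdr) || hdr.size > sizeof(scratch) ||
            hdr.size > head - tail)
            break; /* malformed, too large, or not fully written yet */

        /* Linearize the record so the callback never sees a wrap. */
        ring_copy(scratch, buf, buf_size, tail, hdr.size);
        cb(&hdr, scratch + sizeof(hdr), hdr.size - sizeof(hdr), user_data);
        tail += hdr.size;
    }
    return tail;
}

/* Example callback: count records and payload bytes. */
struct drain_stats {
    unsigned records;
    size_t payload_bytes;
};

static void
count_record(const struct record_header *hdr, const uint8_t *payload,
             size_t payload_len, void *user_data)
{
    struct drain_stats *stats = user_data;

    (void)hdr;
    (void)payload;
    stats->records++;
    stats->payload_bytes += payload_len;
}
```

A caller would pass the mmap'd data area together with the data_head value read from the perf mmap page and its own last data_tail, then write the returned tail back to data_tail once the records have been consumed.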

A corresponding kernel branch with an initial i915_oa driver for Haswell can be found here:

https://github.com/rib/linux wip/rib/oa-hsw-4.0.0

A corresponding libdrm branch can be found here:

https://github.com/rib/drm wip/rib/oa-hsw-4.0.0

In case it's helpful to see another example using the i915_oa perf interface, I've also been developing a 'gputop' tool that both lets me test the INTEL_performance_query interface to collect per-context metrics from Mesa and can also visualize system wide metrics (i.e. across all gpu contexts) using perf directly:

https://github.com/rib/gputop

Although I haven't updated the branches in a while, I could share some initial code adding support for Broadwell if anyone's interested in getting a sense of what's involved in supporting later hardware generations.

I still anticipate some (hopefully relatively minor) tweaking of implementation details based on review feedback for the i915_oa driver, but I hope that this is a good point to ask for some feedback on the Mesa changes.

If it's more convenient, these patches can also be fetched from here:

https://github.com/rib/mesa wip/rib/oa-hsw-4.0.0

Regards,
- Robert

Robert Bragg (6):
  i965: Remove perf monitor/query backend
  Separate INTEL_performance_query frontend
  Model INTEL perf query backend after query object BE
  i965: Implement INTEL_performance_query extension
  i965: Expose OA counters via INTEL_performance_query
  i965: Adds further support for "3D" OA counters

 src/mapi/glapi/gen/gl_genexec.py                  |    1 +
 src/mesa/Makefile.sources                         |    2 +
 src/mesa/drivers/dri/i965/Makefile.sources        |    2 +-
 src/mesa/drivers/dri/i965/brw_context.c           |    5 +-
 src/mesa/drivers/dri/i965/brw_context.h           |  101 +-
 .../drivers/dri/i965/brw_performance_monitor.c    | 1472 ------------
 src/mesa/drivers/dri/i965/brw_performance_query.c | 2356 ++++++++++++++++++++
 src/mesa/drivers/dri/i965/intel_batchbuffer.c     |   10 +-
 src/mesa/drivers/dri/i965/intel_extensions.c      |   69 +-
 src/mesa/main/context.c                           |    3 +
 src/mesa/main/dd.h                                |   39 +
 src/mesa/main/mtypes.h                            |   28 +
 src/mesa/main/performance_monitor.c               |  579 -----
 src/mesa/main/performance_monitor.h               |   39 -
 src/mesa/main/performance_query.c                 |  608 +++++
 src/mesa/main/performance_query.h                 |   80 +
 16 files changed, 3197 insertions(+), 2197 deletions(-)
 delete mode 100644 src/mesa/drivers/dri/i965/brw_performance_monitor.c
 create mode 100644 src/mesa/drivers/dri/i965/brw_performance_query.c
 create mode 100644 src/mesa/main/performance_query.c
 create mode 100644 src/mesa/main/performance_query.h

-- 
2.3.2