Hi Ian, On Mon, Feb 10, 2025 at 11:17:27PM -0800, Ian Rogers wrote: > DRM clients expose information through usage stats as documented in > Documentation/gpu/drm-usage-stats.rst (available online at > https://docs.kernel.org/gpu/drm-usage-stats.html). Add a tool like > PMU, similar to the hwmon PMU, that exposes DRM information.
Probably better to put this link in a file header comment. > For example on a tigerlake laptop: > ``` > $ perf list drm > > List of pre-defined events (to be used in -e or -M): > > drm: > drm-active-stolen-system0 > [Total memory active in one or more engines. Unit: drm_i915] > drm-active-system0 > [Total memory active in one or more engines. Unit: drm_i915] > drm-engine-capacity-video > [Engine capacity. Unit: drm_i915] > drm-engine-copy > [Utilization in ns. Unit: drm_i915] > drm-engine-render > [Utilization in ns. Unit: drm_i915] > drm-engine-video > [Utilization in ns. Unit: drm_i915] > drm-engine-video-enhance > [Utilization in ns. Unit: drm_i915] > drm-purgeable-stolen-system0 > [Size of resident and purgeable memory bufers. Unit: drm_i915] > drm-purgeable-system0 > [Size of resident and purgeable memory bufers. Unit: drm_i915] > drm-resident-stolen-system0 > [Size of resident memory bufers. Unit: drm_i915] > drm-resident-system0 > [Size of resident memory bufers. Unit: drm_i915] > drm-shared-stolen-system0 > [Size of shared memory bufers. Unit: drm_i915] > drm-shared-system0 > [Size of shared memory bufers. Unit: drm_i915] > drm-total-stolen-system0 > [Size of shared and private memory. Unit: drm_i915] > drm-total-system0 > [Size of shared and private memory. Unit: drm_i915] > ``` > > System wide data can be gathered: > ``` > $ perf stat -x, -I 1000 -e > drm-active-stolen-system0,drm-active-system0,drm-engine-capacity-video,drm-engine-copy,drm-engine-render,drm-engine-video,drm-engine-video-enhance,drm-purgeable-stolen-system0,drm-purgeable-system0,drm-resident-stolen-system0,drm-resident-system0,drm-shared-stolen-system0,drm-shared-system0,drm-total-stolen-system0,drm-total-system0 > 1.000904910,0,bytes,drm-active-stolen-system0,1,100.00,, > 1.000904910,0,bytes,drm-active-system0,1,100.00,, > 1.000904910,36,capacity,drm-engine-capacity-video,1,100.00,, > 1.000904910,0,ns,drm-engine-copy,1,100.00,, > 1.000904910,1472970566175,ns,drm-engine-render,1,100.00,, > 1.000904910,0,ns,drm-engine-video,1,100.00,, > 1.000904910,0,ns,drm-engine-video-enhance,1,100.00,, > 1.000904910,0,bytes,drm-purgeable-stolen-system0,1,100.00,, > 1.000904910,38199296,bytes,drm-purgeable-system0,1,100.00,, > 1.000904910,0,bytes,drm-resident-stolen-system0,1,100.00,, > 1.000904910,4643196928,bytes,drm-resident-system0,1,100.00,, > 1.000904910,0,bytes,drm-shared-stolen-system0,1,100.00,, > 1.000904910,1886871552,bytes,drm-shared-system0,1,100.00,, > 1.000904910,0,bytes,drm-total-stolen-system0,1,100.00,, > 1.000904910,4643196928,bytes,drm-total-system0,1,100.00,, > 2.264426839,0,bytes,drm-active-stolen-system0,1,100.00,, > ``` > > Or for a particular process: > ``` > $ perf stat -x, -I 1000 -e > drm-active-stolen-system0,drm-active-system0,drm-engine-capacity-video,drm-engine-copy,drm-engine-render,drm-engine-video,drm-engine-video-enhance,drm-purgeable-stolen-system0,drm-purgeable-system0,drm-resident-stolen-system0,drm-resident-system0,drm-shared-stolen-system0,drm-shared-system0,drm-total-stolen-system0,drm-total-system0 > -p 200027 > 1.001040274,0,bytes,drm-active-stolen-system0,6,100.00,, > 1.001040274,0,bytes,drm-active-system0,6,100.00,, > 1.001040274,12,capacity,drm-engine-capacity-video,6,100.00,, > 1.001040274,0,ns,drm-engine-copy,6,100.00,, > 1.001040274,1542300,ns,drm-engine-render,6,100.00,, > 1.001040274,0,ns,drm-engine-video,6,100.00,, > 1.001040274,0,ns,drm-engine-video-enhance,6,100.00,, > 1.001040274,0,bytes,drm-purgeable-stolen-system0,6,100.00,, > 1.001040274,13516800,bytes,drm-purgeable-system0,6,100.00,, > 1.001040274,0,bytes,drm-resident-stolen-system0,6,100.00,, > 1.001040274,27746304,bytes,drm-resident-system0,6,100.00,, > 1.001040274,0,bytes,drm-shared-stolen-system0,6,100.00,, > 1.001040274,0,bytes,drm-shared-system0,6,100.00,, > 1.001040274,0,bytes,drm-total-stolen-system0,6,100.00,, > 1.001040274,27746304,bytes,drm-total-system0,6,100.00,, > 2.016629075,0,bytes,drm-active-stolen-system0,6,100.00,, > ``` I've tested it briefly. $ ./perf stat -e drm-engine-render,drm-total-system0 -a sleep 1 Performance counter stats for 'system wide': 2,869,492,628,815 ns drm-engine-render 2,777,497,600 bytes drm-total-system0 1.004182447 seconds time elapsed It seems the numbers are quite big. $ ./perf stat -e drm-engine-render,drm-total-system0 -aA sleep 1 Performance counter stats for 'system wide': CPU0 2,870,871,280,238 ns drm_i915/drm-engine-render/ CPU1 <not counted> ns drm_i915/drm-engine-render/ CPU2 <not counted> ns drm_i915/drm-engine-render/ CPU3 <not counted> ns drm_i915/drm-engine-render/ CPU0 2,750,578,688 bytes drm_i915/drm-total-system0/ CPU1 <not counted> bytes drm_i915/drm-total-system0/ CPU2 <not counted> bytes drm_i915/drm-total-system0/ CPU3 <not counted> bytes drm_i915/drm-total-system0/ 1.001678363 seconds time elapsed Ok, it only reads from the CPU0. But I guess there are some duplications. Have you checked drm-client-id? $ ./perf stat -e drm-engine-render -a --per-thread sleep 1 Performance counter stats for 'system wide': Xorg-6900 852,545,872,646 ns drm-engine-render Xorg:disk$0-6901 852,545,872,646 ns drm-engine-render Xorg:sh0-6902 852,545,872,646 ns drm-engine-render Xorg:traceq0-6904 852,545,872,646 ns drm-engine-render Xorg:gdrv0-6906 852,545,872,646 ns drm-engine-render InputThread-6946 852,545,872,646 ns drm-engine-render gnome-shell-7127 808,521,145,191 ns drm-engine-render pool-spawner-7146 808,521,145,191 ns drm-engine-render gmain-7147 808,521,145,191 ns drm-engine-render gdbus-7149 808,521,145,191 ns drm-engine-render dconf worker-7150 808,521,145,191 ns drm-engine-render JS Helper-7151 808,521,145,191 ns drm-engine-render JS Helper-7152 808,521,145,191 ns drm-engine-render ... Trying record.. $ ./perf record -e drm-engine-render sleep 1 failed to mmap with 9 (Bad file descriptor) I think you can fail evsel__open() if attr.sample_period != 0. > > As with the hwmon PMU, high numbered PMU types are used to encode > multiple possible "DRM" PMUs. The appropriate fdinfo is found by > scanning /proc and filtering which fdinfos to read with stat. To avoid > some unneeding scanning, events not starting with "drm-" are ignored. It's sad that it should scan /proc whenever it reads the event but I don't think we have other options. > The patch builds on commit 57e13264dcea ("perf pmus: > Restructure pmu_read_sysfs to scan fewer PMUs") so that only if full > wild carding is being done, that the drm PMUs will be read. Can you please add a test case? Thanks, Namhyung