Since Feb 2023, DRM_USE_DYNAMIC_DEBUG has been marked BROKEN [1].
Although classmaps worked in normal operation (via sysfs), the "v1"
POC implementation failed to propagate drm.debug boot-args to built-in
drivers and helpers.
The API Fix:
The root cause was a "Define vs Refer" design error. By using
DECLARE_DYNDBG_CLASSMAP in both core and drivers, the implementation
lacked the formal linkage required for dyndbg to associate driver
callsites with the core's controlling parameter during early boot
init.
This series introduces a proper module-scoped API:
- DYNAMIC_DEBUG_CLASSMAP_DEFINE: Invoked once in drm_print.c (exported by drm.ko).
- DYNAMIC_DEBUG_CLASSMAP_USE: Invoked by 20+ DRM/Accel modules to reference the core.
This linkage allows dyndbg to trace a driver's USE back to the core's
DEFINE. At boot-time, dyndbg can now correctly apply drm.debug
settings to all referencing modules as they are initialized, restoring
full functionality for built-in drivers.
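As a rough illustration of the linkage described above, here is a userspace
toy model, not kernel code. All toy_* names are hypothetical; the real
macros and structures live in include/linux/dynamic_debug.h and differ in
detail. The point is only that a USE record carries a reference to the
single DEFINE, so a setting made before a module initializes can still be
applied to it.

```c
#include <assert.h>

/* One classmap, defined exactly once by the "core" module. */
struct toy_classmap {
	const char *owner;		/* defining module, e.g. "drm" */
	unsigned long bits;		/* value of the controlling parameter */
};

/* A USE record: a module's reference back to the core's definition. */
struct toy_classmap_user {
	const char *mod_name;
	const struct toy_classmap *map;	/* the link the v1 API lacked */
	unsigned long applied;		/* class bits applied to this module */
};

static struct toy_classmap drm_debug_classes = { "drm", 0 };

/* Module init: follow the USE reference and inherit the core's setting. */
static void toy_module_init(struct toy_classmap_user *u)
{
	assert(u->map);
	u->applied = u->map->bits;
}

/* Returns 1 if a boot-time setting reaches a module initialized later. */
static int toy_boot_demo(void)
{
	struct toy_classmap_user i915 = { "i915", &drm_debug_classes, 0 };

	drm_debug_classes.bits = 0x5;	/* as if booted with drm.debug=0x5 */
	toy_module_init(&i915);		/* built-in driver initializes later */
	return i915.applied == 0x5;
}
```

With two independent declarations, as in v1, there is no pointer to follow
at init time, which is the failure mode described above.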
The Benefit and Evidence (+c flag):
While the instructions saved by replacing bit-tests with NOOPs are
individually small, the scale of DRM's debug activity makes the
aggregate impact substantial. In particular, dyndbg elides the fetch
of __drm_debug for every drm_debug_enabled() bit test, eliminating the
fetch from main memory and cache-line thrashing.
To measure the call-counts, the final patch in this series adds a +c
flag to dyndbg, whereby enabled pr_debug* callsites increment a
per-cpu counter.
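A minimal userspace sketch of the counting idea, assuming a single counter
per callsite rather than the per-cpu counter the patch uses. The TOY_* flag
names and struct layout are hypothetical and only stand in for the real
descriptor flags.

```c
#include <assert.h>

#define TOY_FLAG_PRINT	(1u << 0)	/* stands in for +p */
#define TOY_FLAG_COUNT	(1u << 1)	/* stands in for the new +c */

struct toy_callsite {
	unsigned int flags;
	unsigned long hits;	/* the series describes this as per-cpu */
};

/* Called where a pr_debug would be; counts only when +c is set. */
static void toy_debug_hit(struct toy_callsite *cs)
{
	if (cs->flags & TOY_FLAG_COUNT)
		cs->hits++;
}

/* Drive one counting and one non-counting callsite 1000 times each. */
static unsigned long toy_count_demo(void)
{
	struct toy_callsite on = { TOY_FLAG_COUNT, 0 };
	struct toy_callsite off = { 0, 0 };
	int i;

	for (i = 0; i < 1000; i++) {
		toy_debug_hit(&on);
		toy_debug_hit(&off);
	}
	return on.hits + off.hits;	/* only the +c callsite contributes */
}
```

Summing such counters over all callsites gives a "total hits" figure of the
kind reported by the benchmark.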
The benchmark (in the last patch) sets the +c flag on all drm_dbg_*s,
and runs 12 vkcubes for 30 sec:
root@frodo:/home/jimc/projects/lx# count_hits 30 hammer_vk --
Banging on: hammer_vk (&)
[1] 100847
[1]+ Done hammer_vk
#: total hits: 2295401
A second run used 1 vkcube for 10 sec per class, counting one DRM_UT_* class at a time:
root@frodo:/home/jimc/projects/lx# isolate_drm_hits 2> /dev/null
Starting isolation study: 10s per class using vkcube
----------------------------------------------------------
DRM CLASS       | TOTAL HITS
----------------------------------------------------------
DRM_UT_CORE     | 85305
DRM_UT_DRIVER   | 0
DRM_UT_KMS      | 1435
DRM_UT_PRIME    | 0
DRM_UT_ATOMIC   | 13645
DRM_UT_VBL      | 4071
DRM_UT_STATE    | 1780
DRM_UT_LEASE    | 0
DRM_UT_DP       | 0
DRM_UT_DRMRES   | 0
FOO             | 0
Replacing this frequent memory fetch & bit-test with static-key NOOPs
could save approximately 200 peta-instructions per year across the
Steam Deck install base alone.
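For scale, the two runs above reduce to simple rates. The sketch below only
re-adds the figures already reported; the variable and function names are
mine, and it makes no assumption about install-base size or instructions
saved per hit.

```c
#include <assert.h>

/* Figures copied from the two runs reported above. */
static const unsigned long total_hits_12_cubes = 2295401; /* 30 s, 12 vkcubes */
static const unsigned long per_class_hits[] = {
	85305,	/* DRM_UT_CORE */
	1435,	/* DRM_UT_KMS */
	13645,	/* DRM_UT_ATOMIC */
	4071,	/* DRM_UT_VBL */
	1780,	/* DRM_UT_STATE */
};

/* Sum of the non-zero rows of the isolation table (10 s per class). */
static unsigned long isolated_total(void)
{
	unsigned long sum = 0;
	unsigned int i;

	for (i = 0; i < sizeof(per_class_hits) / sizeof(per_class_hits[0]); i++)
		sum += per_class_hits[i];
	return sum;
}

/* Whole hits per second across all 12 vkcubes in the 30 s run. */
static unsigned long hits_per_second(void)
{
	return total_hits_12_cubes / 30;
}
```

That is about 76.5k counted hits per second for the 12-cube load, and
106236 hits summed over the single-cube, 10 s per-class runs.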
Series Organization:
1. vmlinux.lds.h fix and cleanup (patches 1-4)
   fix section alignment on 32-bit arches
2. dyndbg internal refactorings (5-24)
   internal callchain grooming,
   struct refactoring, __section renames
   drop linked-list, use existing vector/array
3. core API fix (25-30)
   replace flawed DECLARE_DYNDBG_CLASSMAP with the DEFINE/USE model.
   fix boot-time propagation of drm.debug to built-in drivers/helpers.
   add compile-time validation of classmaps
4. interface improvements, documentation (31-38)
   query improvements: commas as token separators, % as query separators
   control-file epilogue
5. apply API to DRM
   call DYNAMIC_DEBUG_CLASSMAP_DEFINE(drm_debug_classes ...) in drm_print.c
   call DYNAMIC_DEBUG_CLASSMAP_USE(drm_debug_classes) in drivers, helpers
6. New additions in v14
   add +c flag for benchmarking
   add DYNAMIC_DEBUG_CLASSMAP_USEs to more drivers, helpers
   drm/nouveau: Fix NULL pointer dereferences in GETPARAM ioctl (RFC)
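The query-syntax changes in item 4 (commas as token separators, % as query
separators) can be pictured with a small userspace sketch. This is not the
parser in lib/dynamic_debug.c; toy_split_queries() is a hypothetical helper
that only shows the intended equivalence between the compact form and the
usual whitespace-separated form.

```c
#include <assert.h>
#include <string.h>

/*
 * Split buf in place on '%' into queries, and turn ',' into ' ' inside
 * each query so a whitespace tokenizer sees ordinary tokens.
 * Returns the number of queries stored in out[].
 */
static int toy_split_queries(char *buf, char **out, int max)
{
	int n = 0;
	char *p = buf;

	while (p && *p && n < max) {
		char *end = strchr(p, '%');
		char *c;

		if (end)
			*end = '\0';
		for (c = p; *c; c++)
			if (*c == ',')
				*c = ' ';
		out[n++] = p;
		p = end ? end + 1 : NULL;
	}
	return n;
}

/* Returns 1 if the compact string splits into the two expected queries. */
static int toy_split_demo(void)
{
	char buf[] = "class,DRM_UT_CORE,+p%class,DRM_UT_KMS,+p";
	char *q[4];
	int n = toy_split_queries(buf, q, 4);

	return n == 2 &&
	       strcmp(q[0], "class DRM_UT_CORE +p") == 0 &&
	       strcmp(q[1], "class DRM_UT_KMS +p") == 0;
}
```

The compact form avoids spaces and semicolons, which is convenient where
those characters need quoting, such as on a boot command line.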
In v13, to focus the review, I sent only the dyndbg core and skipped
the DRM uses. But the value of the optimization is best seen in
context; without it, v13 presented GregKH with a "maze with no cheese".
For v14, I've recombined them to show the full scale of the benefit.
While the performance gains accrue to DRM, the infrastructure resides
in dyndbg.
So I'd like to add some "cheese" (later); i.e. patchsets to:
1. reduce __dyndbg_* .data by 40%.
   This uses 3 maple trees to store module, filename, function, which
   collapses the first 2 columns by 90%. Looped `cat control` tests indicate
   a minor cost increase.
2. cache dynamic-prefixes, to avoid repeated work.
   This assembles the prefix from maple trees, and stores the prefix into
   another maple tree. The cache is minimal; for +m callsites, it keeps
   just 1 prefix per enabled module, for +mf prefixes just 1 per function.
Preliminary benchmarking suggests positive ROI on these.
Fixes: bb2ff6c27bc9 ("drm: Disable dynamic debug as broken")
Assisted-by: google gemini
Signed-off-by: Jim Cromie <[email protected]>
---
Jim Cromie (91):
dyndbg: fix NULL ptr on i386 due to section mis-alignment
vmlinux.lds.h: move BOUNDED_SECTION_* macros to reuse later
dyndbg.lds.S: fix lost dyndbg sections in modules
vmlinux.lds.h: drop unused HEADERED_SECTION* macros
dyndbg: factor ddebug_match_desc out from ddebug_change
dyndbg: add stub macro for DECLARE_DYNDBG_CLASSMAP
docs/dyndbg: update examples \012 to \n
docs/dyndbg: explain flags parse 1st
test-dyndbg: fixup CLASSMAP usage error
dyndbg: reword "class unknown," to "class:_UNKNOWN_"
dyndbg: make ddebug_class_param union members same size
dyndbg: drop NUM_TYPE_ARRAY
dyndbg: tweak pr_fmt to avoid expansion conflicts
dyndbg: reduce verbose/debug clutter
dyndbg: refactor param_set_dyndbg_classes and below
dyndbg: tighten fn-sig of ddebug_apply_class_bitmap
dyndbg: replace classmap list with a vector
dyndbg: macrofy a 2-index for-loop pattern
dyndbg,module: make proper substructs in _ddebug_info
dyndbg: move mod_name down from struct ddebug_table to _ddebug_info
dyndbg: hoist classmap-filter-by-modname up to ddebug_add_module
dyndbg-API: remove DD_CLASS_TYPE_(DISJOINT|LEVEL)_NAMES and code
selftests-dyndbg: add a dynamic_debug run_tests target
dyndbg: change __dynamic_func_call_cls* macros into expressions
dyndbg-API: replace DECLARE_DYNDBG_CLASSMAP
dyndbg: detect class_id reservation conflicts
dyndbg: check DYNAMIC_DEBUG_CLASSMAP_{DEFINE,USE_} args at compile-time
dyndbg-test: change do_prints testpoint to accept a loopct
dyndbg-API: promote DYNAMIC_DEBUG_CLASSMAP_PARAM to API
dyndbg: treat comma as a token separator
dyndbg: split multi-query strings with %
selftests-dyndbg: add test_mod_submod
dyndbg: resolve "protection" of class'd pr_debug
dyndbg: harden classmap and descriptor validation
docs/dyndbg: add classmap info to howto
dyndbg: add epilogue to dynamic_debug/control file
drm: use correct ccflags-y spelling
drm-dyndbg: adapt drm core to use dyndbg classmaps-v2
drm-dyndbg: adapt DRM to invoke DYNAMIC_DEBUG_CLASSMAP_PARAM
drm/i915: Register DRM_CLASSMAP_USE(drm_debug_classes)
drm-dyndbg: DRM_CLASSMAP_USE in amdgpu driver
drm-dyndbg: add DRM_CLASSMAP_USE to virtio_gpu
drm-dyndbg: add DRM_CLASSMAP_USE to Xe
drm/drm_crtc_helper: Register DRM_CLASSMAP_USE(drm_debug_classes)
drm/drm_dp_helper: Register DRM_CLASSMAP_USE(drm_debug_classes)
drm/nouveau: Register DRM_CLASSMAP_USE(drm_debug_classes)
drm/gma500: Register DRM classmap
drm/radeon: Register DRM classmap
drm/vmwgfx: Register DRM classmap
drm/vkms: Register DRM classmap
drm/udl: Register DRM classmap
drm/mgag200: Register DRM classmap
drm/gud: Register DRM classmap
drm/qxl: Register DRM classmap
drm/shmem-helper: Register DRM classmap
drm/ttm-helper: DRM_CLASSMAP_USE(drm_debug_classes);
drm/nouveau: Fix NULL pointer dereferences in GETPARAM ioctl
drm/vc4: Register DRM classmap
drm/msm: Register DRM classmap
drm/hibmc: Register DRM classmap
drm/imx: Register DRM classmap
drm/mediatek: Register DRM classmap
drm/rockchip: Register DRM classmap
drm/sti: Register DRM classmap
drm/stm: Register DRM classmap
accel: add -DDYNAMIC_DEBUG_MODULE to subdir-ccflags
accel/ivpu: implement IVPU_DBG_* as a dyndbg classmap
accel/ethosu: enable drm.debug control
accel/rocket: enable drm.debug control
drm/komeda: Register DRM classmap
drm/bridge/analogix: Register DRM classmap
drm/bridge/dw-hdmi: Register DRM classmap
drm/hisilicon/kirin: Register DRM classmap
drm/imx/dc: Register DRM classmap
drm/imx/dcss: Register DRM classmap
drm/logicvc: Register DRM classmap
drm/loongson: Register DRM classmap
drm/renesas/rcar-du: Register DRM classmap
drm/sysfb/simpledrm: Register DRM classmap
drm/tests: Register DRM classmap in drm_mm_test
drm/ttm: Register DRM classmap
drm: restore CONFIG_DRM_USE_DYNAMIC_DEBUG un-BROKEN
drm-print: fix config-dependent unused variable
drm_print: fix drm_printer dynamic debug bypass
drm: enable DRM_USE_DYNAMIC_DEBUG by default (for testing)
drm-dyndbg: add DRM_CLASSMAP_USE to etnaviv
drm/tiny: panel-mipi-dbi: Add DRM_CLASSMAP_USE
drm/bridge: ite-it6505: Add DRM_CLASSMAP_USE
drm/mipi-dbi: Add DRM_CLASSMAP_USE
drm/clients: Add DRM_CLASSMAP_USE to drm_client_setup
dyndbg: add +c flag to demonstrate advantage of classmaps for DRM
Philipp Hahn (1):
dyndbg: Ignore additional arguments from pr_fmt
Documentation/admin-guide/dynamic-debug-howto.rst | 184 ++++-
MAINTAINERS | 4 +-
drivers/accel/Makefile | 7 +-
drivers/accel/ethosu/ethosu_drv.c | 3 +
drivers/accel/ivpu/ivpu_drv.c | 27 +-
drivers/accel/ivpu/ivpu_drv.h | 45 +-
drivers/accel/rocket/rocket_gem.c | 2 +
drivers/gpu/drm/Kconfig.debug | 3 +-
drivers/gpu/drm/Makefile | 3 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 12 +-
drivers/gpu/drm/arm/display/komeda/komeda_drv.c | 4 +
drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 2 +
drivers/gpu/drm/bridge/ite-it6505.c | 2 +
drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 2 +
drivers/gpu/drm/clients/drm_client_setup.c | 2 +
drivers/gpu/drm/display/drm_dp_helper.c | 12 +-
drivers/gpu/drm/drm_crtc_helper.c | 12 +-
drivers/gpu/drm/drm_gem_shmem_helper.c | 1 +
drivers/gpu/drm/drm_gem_ttm_helper.c | 2 +
drivers/gpu/drm/drm_mipi_dbi.c | 2 +
drivers/gpu/drm/drm_print.c | 40 +-
drivers/gpu/drm/etnaviv/etnaviv_drv.c | 2 +
drivers/gpu/drm/gma500/psb_drv.c | 2 +
drivers/gpu/drm/gud/gud_drv.c | 2 +
drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c | 2 +
drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c | 2 +
drivers/gpu/drm/i915/i915_params.c | 12 +-
drivers/gpu/drm/imx/dc/dc-drv.c | 3 +
drivers/gpu/drm/imx/dcss/dcss-drv.c | 3 +
drivers/gpu/drm/imx/ipuv3/imx-drm-core.c | 2 +
drivers/gpu/drm/logicvc/logicvc_drm.c | 2 +
drivers/gpu/drm/loongson/lsdc_drv.c | 2 +
drivers/gpu/drm/mediatek/mtk_drm_drv.c | 3 +
drivers/gpu/drm/mgag200/mgag200_drv.c | 2 +
drivers/gpu/drm/msm/msm_drv.c | 3 +
drivers/gpu/drm/nouveau/nouveau_abi16.c | 25 +-
drivers/gpu/drm/nouveau/nouveau_drm.c | 12 +-
drivers/gpu/drm/qxl/qxl_drv.c | 2 +
drivers/gpu/drm/radeon/radeon_drv.c | 2 +
drivers/gpu/drm/renesas/rcar-du/rcar_du_drv.c | 2 +
drivers/gpu/drm/rockchip/rockchip_drm_drv.c | 2 +
drivers/gpu/drm/sti/sti_drv.c | 2 +
drivers/gpu/drm/stm/drv.c | 2 +
drivers/gpu/drm/sysfb/simpledrm.c | 2 +
drivers/gpu/drm/tests/drm_mm_test.c | 2 +
drivers/gpu/drm/tiny/panel-mipi-dbi.c | 2 +
drivers/gpu/drm/ttm/ttm_device.c | 3 +
drivers/gpu/drm/udl/udl_main.c | 2 +
drivers/gpu/drm/vc4/vc4_drv.c | 2 +
drivers/gpu/drm/virtio/virtgpu_drv.c | 2 +
drivers/gpu/drm/vkms/vkms_drv.c | 2 +
drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 2 +
drivers/gpu/drm/xe/xe_module.c | 3 +
include/asm-generic/bounded_sections.lds.h | 23 +
include/asm-generic/dyndbg.lds.h | 26 +
include/asm-generic/vmlinux.lds.h | 48 +-
include/drm/drm_print.h | 17 +-
include/linux/dynamic_debug.h | 334 ++++++--
kernel/module/main.c | 15 +-
lib/Kconfig.debug | 24 +-
lib/Makefile | 5 +
lib/dynamic_debug.c | 889 ++++++++++++++-------
lib/test_dynamic_debug.c | 211 +++--
lib/test_dynamic_debug_submod.c | 21 +
scripts/module.lds.S | 2 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/dynamic_debug/Makefile | 9 +
tools/testing/selftests/dynamic_debug/config | 7 +
.../selftests/dynamic_debug/dyndbg_selftest.sh | 373 +++++++++
69 files changed, 1891 insertions(+), 598 deletions(-)
---
base-commit: d662a710c668a86a39ebaad334d9960a0cc776c2
change-id: 20260419-submit-dyndbg-classmap-foundation-a3c77652c054
Best regards,
--
Jim Cromie <[email protected]>