Since Feb 2023, DRM_USE_DYNAMIC_DEBUG has been marked BROKEN [1].
Although classmaps worked in normal operation (via sysfs), the "v1"
POC implementation failed to propagate drm.debug boot-args to built-in
drivers and helpers.

The API Fix:

The root cause was a "Define vs Refer" design error. By using
DECLARE_DYNDBG_CLASSMAP in both core and drivers, the implementation
lacked the formal linkage required for dyndbg to associate driver
callsites with the core's controlling parameter during early boot
init.

This series introduces a proper module-scoped API:
- DYNAMIC_DEBUG_CLASSMAP_DEFINE: Invoked once in drm_print.c (exported by drm.ko).
- DYNAMIC_DEBUG_CLASSMAP_USE: Invoked by 20+ DRM/Accel modules to reference the core.

This linkage allows dyndbg to trace a driver's USE back to the core's
DEFINE. At boot-time, dyndbg can now correctly apply drm.debug
settings to all referencing modules as they are initialized, restoring
full functionality for built-in drivers.

The Benefit and Evidence (+c flag):

While the instructions saved by replacing bit-tests with NOOPs are
individually small, the scale of DRM's debug activity makes the
aggregate impact substantial.  In particular, dyndbg elides the fetch
of __drm_debug for every drm_debug_enabled() bit test, eliminating the
fetch from main memory and cache-line thrashing.

To measure the call-counts, the final patch in this series adds a +c
flag to dyndbg, whereby enabled pr_debug* callsites increment a
per-cpu counter.

The benchmark (in the last patch) sets the +c flag on all drm_dbg_*
callsites, and runs 12 vkcubes for 30 sec:

  root@frodo:/home/jimc/projects/lx# count_hits 30 hammer_vk --
  Banging on: hammer_vk (&)
  [1] 100847
  [1]+  Done                       hammer_vk
  #: total hits: 2295401

A second run used 1 vkcube for 10 sec per class, counting one DRM_UT_*
class at a time:

  root@frodo:/home/jimc/projects/lx# isolate_drm_hits 2> /dev/null
  Starting isolation study: 10s per class using vkcube
  ----------------------------------------------------------
  DRM CLASS            | TOTAL HITS
  ----------------------------------------------------------
  DRM_UT_CORE          | 85305
  DRM_UT_DRIVER        | 0
  DRM_UT_KMS           | 1435
  DRM_UT_PRIME         | 0
  DRM_UT_ATOMIC        | 13645
  DRM_UT_VBL           | 4071
  DRM_UT_STATE         | 1780
  DRM_UT_LEASE         | 0
  DRM_UT_DP            | 0
  DRM_UT_DRMRES        | 0
  FOO                  | 0

Replacing this frequent memory fetch & bit-test with static-key NOOPs
could save approximately 200 peta-instructions per year across the
Steam Deck install base alone.

Series Organization:

1. vmlinux.lds.h fix and cleanup (patches 1-4)
   fix section alignment on 32-bit arches

2. dyndbg internal refactorings (5-24)
   internal call-chain grooming,
   struct refactoring, __section renames
   drop linked-list, use existing vector/array

3. core API fix (25-30)
   replace flawed DECLARE_DYNDBG_CLASSMAP with the DEFINE/USE model.
   fix boot-time propagation of drm.debug to built-in drivers/helpers.
   add compile-time validation of classmaps

4. interface improvements, documentation (31-38)
   query improvements: commas as token separators, % as query separators
   control-file epilogue

5. apply API to DRM
   call DYNAMIC_DEBUG_CLASSMAP_DEFINE(drm_debug_classes ...) in drm_print.c
   call DYNAMIC_DEBUG_CLASSMAP_USE(drm_debug_classes) in drivers, helpers

6. New additions in v14
   add +c flag for benchmarking
   add DYNAMIC_DEBUG_CLASSMAP_USEs to more drivers, helpers
   drm/nouveau: Fix NULL pointer dereferences in GETPARAM ioctl (RFC)

In v13, to focus the review, I sent only the dyndbg core, and skipped
the DRM uses.  But the value of the optimization is best seen in
context; it presented GregKH a "maze with no cheese".

For v14, I've recombined them to show the full scale of the benefit.
While the performance gains accrue to DRM, the infrastructure resides
in dyndbg.

So I'd like to add some "cheese" (later), i.e. patchsets to:

1. reduce __dyndbg_* .data by 40%.

This uses 3 maple trees to store the module, filename, and function
names, which collapses the first 2 columns by 90%.  Looped
`cat control` tests indicate a minor cost increase.

2. cache dynamic-prefixes, to avoid repeated work.

This assembles the prefix from maple trees, and stores the prefix into
another maple tree.  The cache is minimal; for +m callsites, it keeps
just 1 prefix per enabled module, for +mf prefixes just 1 per function.

Preliminary benchmarking suggests positive ROI on these.

Fixes: bb2ff6c27bc9 ("drm: Disable dynamic debug as broken")

Assisted-by: google gemini
Signed-off-by: Jim Cromie <[email protected]>
---
Jim Cromie (91):
      dyndbg: fix NULL ptr on i386 due to section mis-alignment
      vmlinux.lds.h: move BOUNDED_SECTION_* macros to reuse later
      dyndbg.lds.S: fix lost dyndbg sections in modules
      vmlinux.lds.h: drop unused HEADERED_SECTION* macros
      dyndbg: factor ddebug_match_desc out from ddebug_change
      dyndbg: add stub macro for DECLARE_DYNDBG_CLASSMAP
      docs/dyndbg: update examples \012 to \n
      docs/dyndbg: explain flags parse 1st
      test-dyndbg: fixup CLASSMAP usage error
      dyndbg: reword "class unknown," to "class:_UNKNOWN_"
      dyndbg: make ddebug_class_param union members same size
      dyndbg: drop NUM_TYPE_ARRAY
      dyndbg: tweak pr_fmt to avoid expansion conflicts
      dyndbg: reduce verbose/debug clutter
      dyndbg: refactor param_set_dyndbg_classes and below
      dyndbg: tighten fn-sig of ddebug_apply_class_bitmap
      dyndbg: replace classmap list with a vector
      dyndbg: macrofy a 2-index for-loop pattern
      dyndbg,module: make proper substructs in _ddebug_info
      dyndbg: move mod_name down from struct ddebug_table to _ddebug_info
      dyndbg: hoist classmap-filter-by-modname up to ddebug_add_module
      dyndbg-API: remove DD_CLASS_TYPE_(DISJOINT|LEVEL)_NAMES and code
      selftests-dyndbg: add a dynamic_debug run_tests target
      dyndbg: change __dynamic_func_call_cls* macros into expressions
      dyndbg-API: replace DECLARE_DYNDBG_CLASSMAP
      dyndbg: detect class_id reservation conflicts
      dyndbg: check DYNAMIC_DEBUG_CLASSMAP_{DEFINE,USE_} args at compile-time
      dyndbg-test: change do_prints testpoint to accept a loopct
      dyndbg-API: promote DYNAMIC_DEBUG_CLASSMAP_PARAM to API
      dyndbg: treat comma as a token separator
      dyndbg: split multi-query strings with %
      selftests-dyndbg: add test_mod_submod
      dyndbg: resolve "protection" of class'd pr_debug
      dyndbg: harden classmap and descriptor validation
      docs/dyndbg: add classmap info to howto
      dyndbg: add epilogue to dynamic_debug/control file
      drm: use correct ccflags-y spelling
      drm-dyndbg: adapt drm core to use dyndbg classmaps-v2
      drm-dyndbg: adapt DRM to invoke DYNAMIC_DEBUG_CLASSMAP_PARAM
      drm/i915: Register DRM_CLASSMAP_USE(drm_debug_classes)
      drm-dyndbg: DRM_CLASSMAP_USE in amdgpu driver
      drm-dyndbg: add DRM_CLASSMAP_USE to virtio_gpu
      drm-dyndbg: add DRM_CLASSMAP_USE to Xe
      drm/drm_crtc_helper: Register DRM_CLASSMAP_USE(drm_debug_classes)
      drm/drm_dp_helper: Register DRM_CLASSMAP_USE(drm_debug_classes)
      drm/nouveau: Register DRM_CLASSMAP_USE(drm_debug_classes)
      drm/gma500: Register DRM classmap
      drm/radeon: Register DRM classmap
      drm/vmwgfx: Register DRM classmap
      drm/vkms: Register DRM classmap
      drm/udl: Register DRM classmap
      drm/mgag200: Register DRM classmap
      drm/gud: Register DRM classmap
      drm/qxl: Register DRM classmap
      drm/shmem-helper: Register DRM classmap
      drm/ttm-helper: DRM_CLASSMAP_USE(drm_debug_classes);
      drm/nouveau: Fix NULL pointer dereferences in GETPARAM ioctl
      drm/vc4: Register DRM classmap
      drm/msm: Register DRM classmap
      drm/hibmc: Register DRM classmap
      drm/imx: Register DRM classmap
      drm/mediatek: Register DRM classmap
      drm/rockchip: Register DRM classmap
      drm/sti: Register DRM classmap
      drm/stm: Register DRM classmap
      accel: add -DDYNAMIC_DEBUG_MODULE to subdir-ccflags
      accel/ivpu: implement IVPU_DBG_* as a dyndbg classmap
      accel/ethosu: enable drm.debug control
      accel/rocket: enable drm.debug control
      drm/komeda: Register DRM classmap
      drm/bridge/analogix: Register DRM classmap
      drm/bridge/dw-hdmi: Register DRM classmap
      drm/hisilicon/kirin: Register DRM classmap
      drm/imx/dc: Register DRM classmap
      drm/imx/dcss: Register DRM classmap
      drm/logicvc: Register DRM classmap
      drm/loongson: Register DRM classmap
      drm/renesas/rcar-du: Register DRM classmap
      drm/sysfb/simpledrm: Register DRM classmap
      drm/tests: Register DRM classmap in drm_mm_test
      drm/ttm: Register DRM classmap
      drm: restore CONFIG_DRM_USE_DYNAMIC_DEBUG un-BROKEN
      drm-print: fix config-dependent unused variable
      drm_print: fix drm_printer dynamic debug bypass
      drm: enable DRM_USE_DYNAMIC_DEBUG by default (for testing)
      drm-dyndbg: add DRM_CLASSMAP_USE to etnaviv
      drm/tiny: panel-mipi-dbi: Add DRM_CLASSMAP_USE
      drm/bridge: ite-it6505: Add DRM_CLASSMAP_USE
      drm/mipi-dbi: Add DRM_CLASSMAP_USE
      drm/clients: Add DRM_CLASSMAP_USE to drm_client_setup
      dyndbg: add +c flag to demonstrate advantage of classmaps for DRM

Philipp Hahn (1):
      dyndbg: Ignore additional arguments from pr_fmt

 Documentation/admin-guide/dynamic-debug-howto.rst  | 184 ++++-
 MAINTAINERS                                        |   4 +-
 drivers/accel/Makefile                             |   7 +-
 drivers/accel/ethosu/ethosu_drv.c                  |   3 +
 drivers/accel/ivpu/ivpu_drv.c                      |  27 +-
 drivers/accel/ivpu/ivpu_drv.h                      |  45 +-
 drivers/accel/rocket/rocket_gem.c                  |   2 +
 drivers/gpu/drm/Kconfig.debug                      |   3 +-
 drivers/gpu/drm/Makefile                           |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c            |  12 +-
 drivers/gpu/drm/arm/display/komeda/komeda_drv.c    |   4 +
 drivers/gpu/drm/bridge/analogix/analogix_dp_core.c |   2 +
 drivers/gpu/drm/bridge/ite-it6505.c                |   2 +
 drivers/gpu/drm/bridge/synopsys/dw-hdmi.c          |   2 +
 drivers/gpu/drm/clients/drm_client_setup.c         |   2 +
 drivers/gpu/drm/display/drm_dp_helper.c            |  12 +-
 drivers/gpu/drm/drm_crtc_helper.c                  |  12 +-
 drivers/gpu/drm/drm_gem_shmem_helper.c             |   1 +
 drivers/gpu/drm/drm_gem_ttm_helper.c               |   2 +
 drivers/gpu/drm/drm_mipi_dbi.c                     |   2 +
 drivers/gpu/drm/drm_print.c                        |  40 +-
 drivers/gpu/drm/etnaviv/etnaviv_drv.c              |   2 +
 drivers/gpu/drm/gma500/psb_drv.c                   |   2 +
 drivers/gpu/drm/gud/gud_drv.c                      |   2 +
 drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c    |   2 +
 drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c    |   2 +
 drivers/gpu/drm/i915/i915_params.c                 |  12 +-
 drivers/gpu/drm/imx/dc/dc-drv.c                    |   3 +
 drivers/gpu/drm/imx/dcss/dcss-drv.c                |   3 +
 drivers/gpu/drm/imx/ipuv3/imx-drm-core.c           |   2 +
 drivers/gpu/drm/logicvc/logicvc_drm.c              |   2 +
 drivers/gpu/drm/loongson/lsdc_drv.c                |   2 +
 drivers/gpu/drm/mediatek/mtk_drm_drv.c             |   3 +
 drivers/gpu/drm/mgag200/mgag200_drv.c              |   2 +
 drivers/gpu/drm/msm/msm_drv.c                      |   3 +
 drivers/gpu/drm/nouveau/nouveau_abi16.c            |  25 +-
 drivers/gpu/drm/nouveau/nouveau_drm.c              |  12 +-
 drivers/gpu/drm/qxl/qxl_drv.c                      |   2 +
 drivers/gpu/drm/radeon/radeon_drv.c                |   2 +
 drivers/gpu/drm/renesas/rcar-du/rcar_du_drv.c      |   2 +
 drivers/gpu/drm/rockchip/rockchip_drm_drv.c        |   2 +
 drivers/gpu/drm/sti/sti_drv.c                      |   2 +
 drivers/gpu/drm/stm/drv.c                          |   2 +
 drivers/gpu/drm/sysfb/simpledrm.c                  |   2 +
 drivers/gpu/drm/tests/drm_mm_test.c                |   2 +
 drivers/gpu/drm/tiny/panel-mipi-dbi.c              |   2 +
 drivers/gpu/drm/ttm/ttm_device.c                   |   3 +
 drivers/gpu/drm/udl/udl_main.c                     |   2 +
 drivers/gpu/drm/vc4/vc4_drv.c                      |   2 +
 drivers/gpu/drm/virtio/virtgpu_drv.c               |   2 +
 drivers/gpu/drm/vkms/vkms_drv.c                    |   2 +
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c                |   2 +
 drivers/gpu/drm/xe/xe_module.c                     |   3 +
 include/asm-generic/bounded_sections.lds.h         |  23 +
 include/asm-generic/dyndbg.lds.h                   |  26 +
 include/asm-generic/vmlinux.lds.h                  |  48 +-
 include/drm/drm_print.h                            |  17 +-
 include/linux/dynamic_debug.h                      | 334 ++++++--
 kernel/module/main.c                               |  15 +-
 lib/Kconfig.debug                                  |  24 +-
 lib/Makefile                                       |   5 +
 lib/dynamic_debug.c                                | 889 ++++++++++++++-------
 lib/test_dynamic_debug.c                           | 211 +++--
 lib/test_dynamic_debug_submod.c                    |  21 +
 scripts/module.lds.S                               |   2 +
 tools/testing/selftests/Makefile                   |   1 +
 tools/testing/selftests/dynamic_debug/Makefile     |   9 +
 tools/testing/selftests/dynamic_debug/config       |   7 +
 .../selftests/dynamic_debug/dyndbg_selftest.sh     | 373 +++++++++
 69 files changed, 1891 insertions(+), 598 deletions(-)
---
base-commit: d662a710c668a86a39ebaad334d9960a0cc776c2
change-id: 20260419-submit-dyndbg-classmap-foundation-a3c77652c054

Best regards,
-- 
Jim Cromie <[email protected]>

