The preempt and IRQ tracepoints currently impose measurable overhead
even when they are compiled in but not actively enabled. This overhead
stems from external function calls and unconditional tracepoint checks
in highly active code paths.

The v2 series optimized within the existing CONFIG_TRACE_IRQFLAGS
path, which still required the heavy lockdep and irqsoff
infrastructure to be enabled. This v3 takes a different approach by
providing independent, user-selectable configurations
(CONFIG_TRACE_PREEMPT_TOGGLE and CONFIG_TRACE_IRQFLAGS_TOGGLE) that
expose the tracepoints without pulling in the heavier infrastructure.

The preempt optimization uses inline static key checks, while the IRQ
optimization employs lightweight wrapper functions that check raw
hardware state. Making both configurations explicitly user-selectable
addresses upstream feedback regarding the impact on code generation,
ensuring that this optimization remains strictly opt-in.

---
Performance Measurements

Measurements were taken using the tracer-benchmark kernel module [1].
The module creates one kthread per online CPU. Each thread performs
a configurable number of iterations of
local_irq_disable()/local_irq_enable() and
preempt_disable()/preempt_enable() pairs, timing each pair with
ktime_get_ns(). All threads start simultaneously via a completion to
maximize contention. Per-CPU results (average, median) are aggregated
across CPUs; the 99th percentile is measured separately on a single
pinned CPU. The kernel used was version 7.0.0. All values are in
nanoseconds. Each run collected 10^7 samples.

Configurations compared:
 - 7.0.0: stock kernel
 - irqsoff: stock kernel with CONFIG_IRQSOFF_TRACER=y and
   CONFIG_PREEMPT_TRACER=y
 - preemptirq: patched kernel with CONFIG_TRACE_PREEMPT_TOGGLE=y
   and CONFIG_TRACE_IRQFLAGS_TOGGLE=y

The '+' suffix indicates the test ran with tracepoints enabled.

IRQ Metrics

          Metric          7.0.0  irqsoff  irqsoff+  preemptirq  preemptirq+
          average            19       27       175          19          166
          median             19       27       172          19          164
          99 percentile      21       29       234          21          221

Preempt Metrics

          Metric          7.0.0  irqsoff  irqsoff+  preemptirq  preemptirq+
          average            16       21       169          16          160
          median             16       21       165          17          159
          99 percentile      18       23       236          18          217

The preemptirq configuration matches stock kernel performance when
tracepoints are disabled, while the irqsoff configuration adds 30-40%
overhead even when inactive (27 vs 19 ns for IRQ, 21 vs 16 ns for
preempt). When tracepoints are enabled, preemptirq is also slightly
faster than irqsoff.

Binary size impact (stripped vmlinux, defconfig):

          7.0.0:       43404576 bytes
          preemptirq:  43429152 bytes (+24576 bytes, +0.057%)

Suggested-by: Steven Rostedt <[email protected]>

---
References:
[1] https://github.com/walac/tracer-benchmark

Changes in v3:
- Reworked series from 2 to 4 patches
- IRQ tracing rearchitected: instead of optimizing within
  CONFIG_TRACE_IRQFLAGS, introduced independent
  CONFIG_TRACE_IRQFLAGS_TOGGLE that provides irq_disable/irq_enable
  tracepoints without pulling in lockdep or irqsoff infrastructure
- Made TRACE_PREEMPT_TOGGLE user-selectable in Kconfig, addressing
  upstream feedback about code generation impact
- Preempt optimization now handles CONFIG_PREEMPT_TRACER alongside
  CONFIG_DEBUG_PREEMPT in the three-way #if split
- Fixed __preempt_trace_enabled() macro to accept val as parameter
- Resolved circular header dependency on m68k by placing
  tracepoint-defs.h include inside conditional blocks instead of
  at the top of preempt.h and irqflags.h
- Moved atomic.h include from tracepoint-defs.h to tracepoint.h
  to break circular dependency chain on ARM32
- Used EXPORT_TRACEPOINT_SYMBOL() instead of raw
  EXPORT_SYMBOL(__tracepoint_*) for proper tracepoint registration
- Narrowed core.c compilation guard to CONFIG_DEBUG_PREEMPT ||
  CONFIG_PREEMPT_TRACER since TRACE_PREEMPT_TOGGLE is now handled
  inline
- Updated performance benchmarks on 7.0.0, including
  tracepoint-enabled measurements and binary size impact

Changes in v2:
- Fixed build failure on arm32 (circular dependency:
  atomic.h -> cmpxchg.h -> irqflags.h -> tracepoint.h -> atomic.h)

Wander Lairson Costa (4):
  tracing/preemptirq: Optimize preempt_disable/enable() tracepoint
    overhead
  trace/preemptirq: make TRACE_PREEMPT_TOGGLE user-selectable
  trace/preemptirq: add TRACE_IRQFLAGS_TOGGLE
  trace/preemptirq: Implement trace_irqflags hooks

 include/linux/irqflags.h          | 62 +++++++++++++++++++++++++++-
 include/linux/preempt.h           | 49 ++++++++++++++++++++--
 include/linux/tracepoint-defs.h   |  1 -
 include/linux/tracepoint.h        |  1 +
 include/trace/events/preemptirq.h |  2 +-
 kernel/sched/core.c               |  2 +-
 kernel/trace/Kconfig              | 19 +++++++--
 kernel/trace/trace_preemptirq.c   | 68 +++++++++++++++++++++++++++++++
 8 files changed, 193 insertions(+), 11 deletions(-)

-- 
2.53.0

