On Thu, 2025-02-06 at 09:57 +0100, Peter Zijlstra wrote: > On Thu, Feb 06, 2025 at 09:36:41AM +0100, Gabriele Monaco wrote: > > > > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > > > > index 165c90ba64ea9..fb5f8aa61ef5d 100644 > > > > --- a/kernel/sched/core.c > > > > +++ b/kernel/sched/core.c > > > > @@ -491,6 +491,12 @@ sched_core_dequeue(struct rq *rq, struct > > > > task_struct *p, int flags) { } > > > > > > > > #endif /* CONFIG_SCHED_CORE */ > > > > > > > > +void trace_set_current_state(int state_value) > > > > +{ > > > > + trace_sched_set_state_tp(current, current->__state, > > > > state_value); > > > > +} > > > > +EXPORT_SYMBOL(trace_set_current_state); > > > > > > Urgh, why !?! > > > > What do you think would be better? > > So I would think having the tracepoint in-line would be better. > Because > as is, everything gets to have this pointless CALL to an empty > function. > > If this were x86_64 only, I would suggest using static_call(), but > barring that, the static_branch() already in the tracepoint is the > best > we can do. >
Ok, I see your point now.. Adding the trace_ call inline seems far from trivial to me, but we could indeed do what's suggested in tracepoint-defs.h and practically use a static branch to call this trace_set_current_state, not sure if this is already what you were suggesting, though. Ignore the inconsistent naming, but something like this should work: diff --git a/include/linux/sched.h b/include/linux/sched.h index af9fa18035c7..7b9d84dbc2f5 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -223,6 +223,12 @@ struct user_event_mm; * * Also see the comments of try_to_wake_up(). */ + +#define trace_set_current_state(state_value) do { \ + if (tracepoint_enabled(sched_set_state_tp)) \ + do_trace_set_current_state(state_value); \ +} while(0) + #define __set_current_state(state_value) \ do { \ debug_normal_state_change((state_value)); \ @@ -332,7 +338,9 @@ extern void io_schedule_finish(int token); extern long io_schedule_timeout(long timeout); extern void io_schedule(void); -extern void trace_set_current_state(int state_value); +#include <linux/tracepoint-defs.h> +DECLARE_TRACEPOINT(sched_set_state_tp); +extern void do_trace_set_current_state(int state_value); /** * struct prev_cputime - snapshot of system and user cputime diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 149d55195532..9fc2be079bb5 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -491,11 +491,12 @@ sched_core_dequeue(struct rq *rq, struct task_struct *p, int flags) { } #endif /* CONFIG_SCHED_CORE */ -void trace_set_current_state(int state_value) +void do_trace_set_current_state(int state_value) { trace_sched_set_state_tp(current, current->__state, state_value); } -EXPORT_SYMBOL(trace_set_current_state); +EXPORT_SYMBOL(do_trace_set_current_state); +EXPORT_TRACEPOINT_SYMBOL(sched_set_state_tp); /* * Serialization rules: