On Mon, Jul 28, 2025 at 03:50:17PM +0200, Gabriele Monaco wrote:
> DA monitor can be accessed from multiple cores simultaneously, this is
> likely, for instance when dealing with per-task monitors reacting on
> events that do not always occur on the CPU where the task is running.
> This can cause race conditions where two events change the next state
> and we see inconsistent values. E.g.:
>
> [62] event_srs: 27: sleepable x sched_wakeup -> running (final)
> [63] event_srs: 27: sleepable x sched_set_state_sleepable -> sleepable
> [63] error_srs: 27: event sched_switch_suspend not expected in the state
> running
>
> In this case the monitor fails because the event on CPU 62 wins against
> the one on CPU 63, although the correct state should have been
> sleepable, since the task get suspended.
>
> Detect if the current state was modified by using try_cmpxchg while
> storing the next value. If it was, try again reading the current state.
> After a maximum number of failed retries, react by calling a special
> tracepoint, print on the console and reset the monitor.
>
> Remove the functions da_monitor_curr_state() and da_monitor_set_state()
> as they only hide the underlying implementation in this case.
>
> Monitors where this type of condition can occur must be able to account
> for racing events in any possible order, as we cannot know the winner.
>
> Cc: Ingo Molnar <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Signed-off-by: Gabriele Monaco <[email protected]>

Reviewed-by: Nam Cao <[email protected]>
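
For readers following along, the retry scheme described in the quoted
commit message can be sketched in userspace C roughly as below. This is
not the patch's code: it uses C11 atomic_compare_exchange_strong() as a
stand-in for the kernel's try_cmpxchg(), and the state/event names, the
transition table, MAX_RETRIES and monitor_event() are all hypothetical,
chosen only to illustrate the idea.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical states and events, loosely named after the trace above. */
enum state { SLEEPABLE, RUNNING, STATE_MAX, INVALID_STATE = -1 };
enum event { WAKEUP, SET_STATE_SLEEPABLE, SWITCH_SUSPEND, EVENT_MAX };

/* Assumed retry budget; the real limit is whatever the patch defines. */
#define MAX_RETRIES 3

/* Illustrative transition table only: next_state[current][event]. */
static const int next_state[STATE_MAX][EVENT_MAX] = {
	[SLEEPABLE] = { RUNNING, SLEEPABLE, SLEEPABLE },
	[RUNNING]   = { RUNNING, SLEEPABLE, INVALID_STATE },
};

struct monitor {
	_Atomic int curr_state;
};

/*
 * Apply one event. Returns true if the transition was stored, false if
 * the event is not allowed in the observed state or the retry budget
 * was exhausted (in which case the monitor is reset).
 */
static bool monitor_event(struct monitor *m, enum event ev)
{
	int curr = atomic_load(&m->curr_state);

	for (int i = 0; i < MAX_RETRIES; i++) {
		int next = next_state[curr][ev];

		if (next == INVALID_STATE)
			return false;	/* event not expected in this state */

		/*
		 * Store next only if nobody changed the state since we read
		 * it. On failure, curr is refreshed with the value written by
		 * the racing event, so the next iteration recomputes next.
		 */
		if (atomic_compare_exchange_strong(&m->curr_state, &curr, next))
			return true;
	}

	/* Too many lost races: reset to the (assumed) initial state. */
	atomic_store(&m->curr_state, SLEEPABLE);
	return false;
}
```

The point of the compare-and-exchange is that a racing event can no
longer silently overwrite a state it never observed; it either wins
against the state it read or recomputes from the new one.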
