On Mon, 2026-06-01 at 09:52 +0200, Nam Cao wrote:
> > +/*
> > + * __ha_monitor_timer_callback - generic callback representation
> > + *
> > + * This callback runs in an RCU read-side critical section to
> > allow the
> > + * destruction sequence to easily synchronize_rcu() with all
> > pending timers
> > + * after asynchronously disabling them. The ha_mon_destroying
> > check ensures
> > + * any callback entering the RCU section after synchronize_rcu()
> > completes
> > + * will see the flag and bail out immediately.
> > + */
> >  static inline void __ha_monitor_timer_callback(struct ha_monitor
> > *ha_mon)
> >  {
> > -   enum states curr_state = READ_ONCE(ha_mon-
> > >da_mon.curr_state);
> >     DECLARE_SEQ_BUF(env_string, ENV_BUFFER_SIZE);
> > -   u64 time_ns = ha_get_ns();
> > -
> > +   enum states curr_state;
> > +   u64 time_ns;
> > +
> > +   guard(rcu)();
> > +   if (unlikely(READ_ONCE(ha_mon_destroying)))
> 
> Instead of this global variable, can we instead use da_mon-
> >monitoring?
> 

We already do in da_monitor_handling_event(), this is guarding for the
very unlikely scenario of:

1. timer callback firing (just before guard(rcu))
2. sync rcu, return before waiting for that one
3. reset and free memory
4. guard(rcu) now and check the state in da_mon (freed)

I'm not too confident saying this cannot happen under PREEMPT_RT.

I know the easy alternative would be to synchronously stop timers on
destruction, but that gets complicated with per-task monitors, where we
iterate over task grabbing a read lock.

Am I just too paranoid?

Thanks,
Gabriele

> > +           return;
> > +   /* Ensure consistent curr_state if we race with
> > da_monitor_reset */
> > +   curr_state = smp_load_acquire(&ha_mon->da_mon.curr_state);
> > +   if (unlikely(!da_monitor_handling_event(&ha_mon->da_mon)))
> > +           return;
> > +
> > +   time_ns = ha_get_ns();
> >     ha_get_env_string(&env_string, ha_mon, time_ns);
> >     ha_react(curr_state, EVENT_NONE, env_string.buffer);
> >     ha_trace_error_env(ha_mon,
> > model_get_state_name(curr_state),
> > -- 
> > 2.54.0


Reply via email to