On Tue, Feb 24, 2026 at 02:26:22PM +0100, Martin Wilck wrote:
> Hi Brian,
> 
> On Mon, 2026-02-23 at 17:00 -0800, Brian Bunker wrote:
> > Fix a race condition in libcheck_check() for both the TUR and ALUA
> > async
> > checkers where a completed result could be missed, returning
> > PATH_PENDING
> > unnecessarily.
> 
> First of all, you should have cc'd the dm-devel mailing list.
> 
> Second, you call this a race condition, but what negative effect does
> it actually have? AFAIU this "race condition" just delays returning the
> new checker state by a single tick (i.e. 1 second), which is a rather
> benign problem for a race, and generally not a big problem for
> multipath, where paths can be offline for much longer periods of time.
> 
> Finally, before going into any detail, I am highly reluctant to changes
> in this area. Ben and I have fought with deadlocks and races in corner
> cases for a long time, and we seem to have solved it for good; at least
> we haven't seen any more bugs in this area for quite some time. For a
> change like this, figuring out possible side effects for all possible
> cases by code inspection alone is extremely hard. I'd rather not risk a
> regression just to detect a status change 1 second earlier than we
> currently do, in rare cases.

I pretty much agree with Martin here. I do think this is likely safe,
but I don't think this is really a race. We don't return the status of
the thread until the thread completes, and the thread signals its
completion by clearing ct->running. That's the same definition of
successful thread completion that we use for deciding to kill the
thread, and I don't see why it's not a valid one.

Yes we can get the thread result slightly before the thread has
completed. But by trusting that the thread will complete once pp->msgid
has changed, we are giving up the ability to kill it if it does hang
later, for instance in that condlog() call. On systemd systems, this is
just some stdio print functions, but on non-systemd systems, this uses
its own pthread locking, and could legitimately hang (although the
argument certainly can be made that if this is hanging, multipathd has
bigger problems to worry about than an extra tur checker thread lying
around).  
 
> > 
> > Race window
> > -----------
> > 
> > In libcheck_thread() / alua_thread():
> > 
> >     pthread_mutex_lock(&ct->lock);
> >     ct->state = state;
> >     ct->msgid = msgid;
> >     pthread_cond_signal(&ct->active);
> >     pthread_mutex_unlock(&ct->lock);
> >     /* <-- race window here --> */
> >     uatomic_set(&ct->running, 0);
> > 
> 
> You seem to be looking at old code. The code I am seeing has this:
> 
>        running = uatomic_xchg(&ct->running, 0);
>        if (!running)
>                pause();
> 
> Anyway, I think we had very good reasons to set the running state after
> releasing the lock. I would need to dig deeper into the history in
> order to figure out exactly why we did it this way. It probably has
> something to do with the actual fatal races that we observed between
> checkers timing out / being cancelled and finishing.
> 
> 
> > When libcheck_check() observes ct->running != 0 during this window,
> > the
> > result is already available under lock, but the previous code could
> > fail
> > to collect it.
> > 
> > Why the old check was broken
> > ----------------------------
> > 
> > The ALUA checker used:
> > 
> >     if (ct->state != PATH_PENDING || ct->msgid != MSG_ALUA_RUNNING)
> > 
> > This conflated two different meanings of PATH_PENDING:
> > 1. Thread still in progress (no result yet)
> > 2. ALUA state is transitioning (valid completed result with
> >    MSG_ALUA_TRANSITIONING)
> 
> True, but what's the actual problem with this? In both cases, the
> checker is indeed not yet able to figure out if the path is usable,
> which is what CHECKER_PENDING means.

This is where I'm stuck too. If there is a legitimate problem with the
current code, then that would certainly weigh against our caution here.
But AFAICS, this is just multipathd checking the thread and seeing that
it hasn't completed yet (to multipathd's definition of completed), and
waiting a second to recheck.

-Ben 
 
> 
> > 
> > The TUR checker had similar issues since PATH_PENDING with
> > MSG_TUR_TRANSITIONING is a valid result.
> > 
> > The fix
> > -------
> > 
> > Check ct->msgid alone under lock. The msgid is set to MSG_*_RUNNING
> > only
> > at async check start and atomically updated with state when complete:
> > 
> >     if (ct->msgid != MSG_TUR_RUNNING) {
> >         tur_status = ct->state;
> >         c->msgid = ct->msgid;
> >         ct->thread = 0;
> >     }
> 
> You probably realize that with this change, you basically imply that
> the test for ct->running is superfluous. If this was true, we could
> actually just skip it and solely rely on ct->msgid. What you're
> basically saying is that ct->running is meaningless. But it isn't. It
> means that the thread is "dead", which it isn't as long as holds the
> mutex.
> 
> I'll wait for Ben's opinion, but I am inclined to nack this, sorry.
> 
> Martin


Reply via email to