Stefan, any thoughts on this? Markus Armbruster <arm...@redhat.com> writes:
> I've thought a bit more. > > A monitor can serve a series of clients. > > Back when all of the monitor ran in the main thread, we completely > finished serving the current client before we started serving the next > one (I think). In other words, sessions did not overlap. > > Since we moved parts of the monitor to the monitor I/O thread (merge > commit 4bdc24fa018), sessions can overlap, and this causes issues, as > you demonstrated. > > Possible fixes: > > 1. Go back to "don't overlap" with suitable synchronization. > > I'm afraid this won't cut it, because exec-oob would have wait in > line behind reconnect. > > It currently waits for other reasons, but that feels fixable. Going > back to "don't overlap" would make it unfixable. > > 2. Make the lingering session not affect / be affected by the new > session's state > > This is what your patch tries. Every access of session state needs > to be guarded like > > if (conn_nr_before == qatomic_read(&mon->connection_nr)) { > access session state > } else { > ??? > } > > Issues: > > * We have to find and guard all existing accesses. That's work. > > * We have to guard all future accesses. More work, and easy to > forget, which makes this fix rather brittle. > > * The fix is spread over many places. > > * We may run into cases where the ??? part gets complicated. > Consider file descriptors. The command in flight will have its > monitor_get_fd() fail after disconnect. Probably okay, because it > can also fail for other reasons. But we might run into cases where > the ??? part creates new failure modes for us to handle. > > 3. Per-session state > > Move per-session state from monitor state into a separate object. > > Use reference counts to keep this object alive until both threads are > done with the session. > > Monitor I/O thread executes monitor core and the out-of-band > commands; its stops using the old session on disconnect, and starts > using the new session on connect. > > Main thread executes in-band commands, and these use the session that > submitted them. > > Do I make sense, or should I explain my idea in more detail?