On Fri, May 21, 2021 at 01:09:17PM -0400, Peter Xu wrote: > On Fri, May 21, 2021 at 05:56:14PM +0100, Daniel P. Berrangé wrote: > > On Fri, May 21, 2021 at 05:33:46PM +0100, Daniel P. Berrangé wrote: > > > On Fri, May 21, 2021 at 10:43:36AM -0400, Peter Xu wrote: > > > > > > > > I think the original problem was that if qemu_chr_fe_set_handlers() is > > > > called > > > > in main thread, it can start to race somehow within execution of the > > > > function > > > > qemu_chr_fe_set_handlers() right after we switch context at: > > > > > > > > qemu_chr_be_update_read_handlers(s, context); > > > > > > > > Then the rest code in qemu_chr_fe_set_handlers() will continue to run > > > > in main > > > > thread for sure, but the should be running with the new iothread > > > > context, which > > > > introduce a race condition. > > > > > > > > Running qemu_chr_be_update_read_handlers() in BH resolves that because > > > > then all > > > > things run in the monitor iothread only and natually serialized. > > > > > > The first message in this thread, however, claims that it is *not* > > > in fact serialized, when using the BH. > > > > > > > So the new comment looks indeed not fully right, as the chr device > > > > should be > > > > indeed within main thread context before qemu_chr_fe_set_handlers(), > > > > it's just > > > > that the race may start right away if without BH when context switch > > > > happens > > > > for the chr. > > > > > > It sounds like both the comment and the code are potentially wrong. > > > > > > I feel like our root cause problem that the original code was trying to > > workaround, is that the chardev is "active" from the very moment it is > > created, regardless of whether the frontend is ready to use it. > > > > IIUC, in this case the socket chardev is already listen()ing and > > accept()ing incoming clients off the network, before the monitor > > has finished configuring its hooks into the chardev. This means > > that the initial listen()/accept() I/O watches are using the > > default GMainContext, and the monitor *has* to remove them and > > put in new watches on the thread private GMainContext. > > > > To eliminate any risk of races, we need to make it possible for the > > monitor to configure the GMainContext on the chardevs *before* any > > I/O watches are configured. > > > > This in turn suggests that we need to split the chardev initialization > > into two phases. First we have the basic chardev creation, with object > > creation, option parsing/sanity checking, socket creation, and then > > second we have the actual activation where the I/O watches are added. > > When we are still running the monitor_init() code IIUC the main thread is > still > occupied so it won't be able to handle any listen()/accept() even if there's > event already, so IIUC race can only happen when main thread started the event > loop then run concurrently with the BH in the other iothread. > > So, besides above split of chardev init (which sounds like a bigger > surgery)... would it also work if we let the main thread wait until that BH > executed in monitor iothread? Then all pending listen()/accept() event will > directly be done in the monitor thread.
My concern is what happens when we add support for monitor hotplug in the future. A client wanting to add a second monitor to a running QEMU will run "chardev-add" followed by a hypothetical "monitor-add" command. Both of these will be processed in the monitor thread, so the main thread will have already started handling I/O events for the chardev before "monitor-add" gets processed. Thus I think the only long term safe solution is to have a design that guarantees no I/O events are registered by *any* chardev impl, until the frontend connects to the chardev. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|