Hi Marc, 在 2021/5/22 0:59, Marc-André Lureau 写道: > Hi > > On Fri, May 21, 2021 at 8:56 PM Daniel P. Berrangé <berra...@redhat.com > <mailto:berra...@redhat.com>> wrote: > > On Fri, May 21, 2021 at 05:33:46PM +0100, Daniel P. Berrangé wrote: > > On Fri, May 21, 2021 at 10:43:36AM -0400, Peter Xu wrote: > > > > > > I think the original problem was that if qemu_chr_fe_set_handlers() is > called > > > in main thread, it can start to race somehow within execution of the > function > > > qemu_chr_fe_set_handlers() right after we switch context at: > > > > > > qemu_chr_be_update_read_handlers(s, context); > > > > > > Then the rest code in qemu_chr_fe_set_handlers() will continue to run > in > main > > > thread for sure, but the should be running with the new iothread > context, which > > > introduce a race condition. > > > > > > Running qemu_chr_be_update_read_handlers() in BH resolves that because > then all > > > things run in the monitor iothread only and natually serialized. > > > > The first message in this thread, however, claims that it is *not* > > in fact serialized, when using the BH. > > > > > So the new comment looks indeed not fully right, as the chr device > should be > > > indeed within main thread context before qemu_chr_fe_set_handlers(), > it's just > > > that the race may start right away if without BH when context switch > happens > > > for the chr. > > > > It sounds like both the comment and the code are potentially wrong. > > > I feel like our root cause problem that the original code was trying to > workaround, is that the chardev is "active" from the very moment it is > created, regardless of whether the frontend is ready to use it. > > IIUC, in this case the socket chardev is already listen()ing and > accept()ing incoming clients off the network, before the monitor > has finished configuring its hooks into the chardev. This means > that the initial listen()/accept() I/O watches are using the > default GMainContext, and the monitor *has* to remove them and > put in new watches on the thread private GMainContext. > > To eliminate any risk of races, we need to make it possible for the > monitor to configure the GMainContext on the chardevs *before* any > I/O watches are configured. > > This in turn suggests that we need to split the chardev initialization > into two phases. First we have the basic chardev creation, with object > creation, option parsing/sanity checking, socket creation, and then > second we have the actual activation where the I/O watches are added. > > IOW, qemu_chr_new() is the former and gets run from generic code in > the main() method, or in QMP chardev_add. A new 'qemu_chr_activate' > method would be called by whatever frontend is using the chardev, > after registering a custom GMainContext. > > This would involve updating every single existing user of chardevs > to add a call to qemu_chr_activate, but that's worth it to eliminate > the race by design, rather than workaround it. > > > > What about my earlier suggestion to add a new "qemu_chr_be_disable_handlers()" > (until update_read_handlers is called again to enable them and the set a > different context)? >
In this case, the BH calls the update_read_handlers, so the new added "qemu_chr_be_disable_handlers" will be called in the monitor iothread BH ? If so, I'm not sure whether it is safe enough, because the Chardev may still be accessed in parallel by main loop and iothread for a while. How about call "qemu_chr_be_disable_handlers" before set the monitor_qmp_setup_handlers_bh ? I think Daniel's soluation is perfect, but it's beyond my ability, I'm not expert in Chardev/QMP, it's difficult to guarantee no other bugs will be introduced, so we prefer to take the simplest and safest way to fix the bug in our production. > > -- > Marc-André Lureau