On Tue, 5 Jun 2018 15:30:01 -0300 Eduardo Habkost <ehabk...@redhat.com> wrote:
> On Tue, Jun 05, 2018 at 04:00:43PM +0200, Igor Mammedov wrote: > > When using --daemonize, the initial lead process will fork a child and > > then wait to be notified that setup is complete via a pipe, before it > > exits. When using --preconfig there is an extra call to main_loop() > > before the notification is done from os_setup_post(). Thus the parent > > process won't exit until the mgmt application connects to the monitor > > and tells QEMU to leave the RUN_STATE_PRECONFIG. The mgmt application > > won't connect to the monitor until daemonizing has completed though. > > > > This is a chicken and egg problem, leading to deadlock at startup. > > > > The only viable way to fix this is to call os_setup_post() before > > the early main_loop_wait() call when in RUN_STATE_PRECONFIG. This has > > the downside that any errors from this point onwards won't be handled > > well by the mgmt application, because it will think QEMU has started > > successfully, so not be expecting an abrupt exit. The only way to > > deal with that is to move as much user input validation as possible > > to before the main_loop() call. This is left as an exercise for > > future interested developers. > > > > Based on: > > From: Daniel P. Berrangé <berra...@redhat.com> > > Subject: [PATCH v2 2/2] vl: fix use of --daemonize with --preconfig > > Message-Id: <20180604120345.12955-3-berra...@redhat.com> > > > > Signed-off-by: Igor Mammedov <imamm...@redhat.com> > > --- > > v3: > > - rewrite to apply on top of 1/2 > > --- > > os-posix.c | 6 ++++++ > > vl.c | 2 +- > > 2 files changed, 7 insertions(+), 1 deletion(-) > > > > diff --git a/os-posix.c b/os-posix.c > > index 9ce6f74..ee06a8d 100644 > > --- a/os-posix.c > > +++ b/os-posix.c > > @@ -309,8 +309,14 @@ void os_daemonize(void) > > > > void os_setup_post(void) > > { > > + static bool os_setup_post_done = false; > > int fd = 0; > > > > + if (os_setup_post_done) { > > + return; > > + } > > + os_setup_post_done = true; > > + > > This part is nice because it allows the os_setup_post() call in > the second main loop to be unconditional, but: > > > if (daemonize) { > > if (chdir("/")) { > > error_report("not able to chdir to /: %s", strerror(errno)); > > diff --git a/vl.c b/vl.c > > index fa44138..d6fa67f 100644 > > --- a/vl.c > > +++ b/vl.c > > @@ -1960,6 +1960,7 @@ static void main_loop(void) > > #ifdef CONFIG_PROFILER > > ti = profile_getclock(); > > #endif > > + os_setup_post(); > > main_loop_wait(false); > > Ensuring the correctness of this os_setup_post() call depends on > reading the whole body of main_loop_should_exit(), which is a > complex and large function. I think this is too fragile for my > taste. Fragility was the reason why I moved it into main_loop(), as os_setup_post() was already overlooked once, since one would have to make very non-obvious connection with libvirt requirement to call it before main_loop_wait() This way call to os_setup_post() will not be forgotten, and would get an attention whenever main_loop() is concerned. > I prefer Daniel's approach where we have two > os_setup_post()/main_loop() call sites, and the first one is > conditional on --preconfig. Considering we are unlikely to add one more invocation of main_loop(). I'll post here Daniel's version that applies on top of 1/2 with a comment so we won't forget about libvirt's requirement (not the best way to write something robust but better then nothing). So pick whatever variant would seem the best. > > #ifdef CONFIG_PROFILER > > dev_time += profile_getclock() - ti; > > @@ -4707,7 +4708,6 @@ int main(int argc, char **argv, char **envp) > > } > > > > accel_setup_post(current_machine); > > - os_setup_post(); > > > > main_loop(); > > > > -- > > 2.7.4 > > >