Eric Blake <ebl...@redhat.com> writes: > In kill_qemu() we have an assert that checks that the QEMU process > didn't dump core: > assert(!WCOREDUMP(wstatus)); > > Unfortunately the WCOREDUMP macro here means the resulting message > is not very easy to comprehend on at least some systems: > > ahci-test: tests/libqtest.c:113: kill_qemu: Assertion `!(((__extension__ > (((union { __typeof(wstatus) __in; int __i; }) { .__in = (wstatus) }).__i))) > & 0x80)' failed. > > and it doesn't identify what signal the process took. > > Furthermore, we are NOT detecting EINTR (while EINTR shouldn't be > happening if we didn't install signal handlers, it's still better > to always be robust), and also want to log unexpected non-zero status > that was not accompanied by a core dump. > > Instead of using a raw assert, print the information in an > easier to understand way: > > /i386/ahci/sanity: tests/libqtest.c:119: kill_qemu() detected QEMU death with > core dump from signal 11 (Segmentation fault) > Aborted (core dumped) > > (Of course, the really useful information would be why the QEMU > process dumped core in the first place, but we don't have that > by the time the test program has picked up the exit status.) > > Suggested-by: Peter Maydell <peter.mayd...@linaro.org> > Signed-off-by: Eric Blake <ebl...@redhat.com> > --- > > I've taken the ideas from Peter's patch: > https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg04430.html > as well as fixing a related issue brought up last time this was touched: > https://lists.gnu.org/archive/html/qemu-devel/2018-05/msg05710.html > > tests/libqtest.c | 39 +++++++++++++++++++++++++++++++++++++-- > 1 file changed, 37 insertions(+), 2 deletions(-) > > diff --git a/tests/libqtest.c b/tests/libqtest.c > index 098af6aec44..f3dabfadd78 100644 > --- a/tests/libqtest.c > +++ b/tests/libqtest.c > @@ -105,12 +105,47 @@ static void kill_qemu(QTestState *s) > if (s->qemu_pid != -1) { > int wstatus = 0; > pid_t pid; > + bool die = false; > > kill(s->qemu_pid, SIGTERM); > + retry: > pid = waitpid(s->qemu_pid, &wstatus, 0); > + if (pid == -1 && errno == EINTR) { > + goto retry; > + } > > - if (pid == s->qemu_pid && WIFSIGNALED(wstatus)) { > - assert(!WCOREDUMP(wstatus)); > + assert(pid == s->qemu_pid); > + /* > + * We expect qemu to exit with status 0; anything else is > + * fishy and should be logged. Abort except when death by > + * signal is not accompanied by a coredump (as that's the only > + * time it was likely that the user is trying to kill the > + * testsuite early). > + */ > + if (wstatus) { > + die = true; > + if (WIFEXITED(wstatus)) { > + fprintf(stderr, "%s:%d: kill_qemu() tried to terminate QEMU " > + "process but encountered exit status %d\n", > + __FILE__, __LINE__, WEXITSTATUS(wstatus)); > + } else if (WIFSIGNALED(wstatus)) { > + int sig = WTERMSIG(wstatus); > + const char *signame = strsignal(sig) ?: "unknown ???"; > + > + if (!WCOREDUMP(wstatus)) { > + die = false;
Does WCOREDUMP(wstatus) depend on the user's ulimit -c? What higher level kind of QEMU termination are you trying to detect here? > + fprintf(stderr, "%s:%d: kill_qemu() ignoring QEMU death " > + "by signal %d (%s)\n", > + __FILE__, __LINE__, sig, signame); > + } else { > + fprintf(stderr, "%s:%d: kill_qemu() detected QEMU death " > + "with core dump from signal %d (%s)\n", > + __FILE__, __LINE__, sig, signame); > + } > + } > + } > + if (die) { > + abort(); > } > } > }