Hi,

More exciting news from the bdrv_drain() front!

I've noticed in the past that iotest 194 sometimes hangs. I usually run the tests on tmpfs, but I've just now verified that it happens on my SSD just as well.

So the reproducer is a plain:

    while ./check -raw 194; do :; done

(No difference between raw and qcow2, though.)

And then, after a couple of runs (or a couple dozen), it will just hang. The reason is that the source VM lingers around and doesn't quit voluntarily -- the test itself was successful, but it just can't exit.

If you force it to exit by killing the VM (e.g. through pkill -11 qemu), this is the backtrace:

#0  0x00007f7cfc297e06 in ppoll () at /lib64/libc.so.6
#1  0x0000563b846bcac9 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2  0x0000563b846bcac9 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:322
#3  0x0000563b846be711 in aio_poll (ctx=ctx@entry=0x563b856e3e80, blocking=<optimized out>) at util/aio-posix.c:629
#4  0x0000563b8463afa4 in bdrv_drain_recurse (bs=bs@entry=0x563b865568a0, begin=begin@entry=true) at block/io.c:201
#5  0x0000563b8463baff in bdrv_drain_all_begin () at block/io.c:381
#6  0x0000563b8463bc99 in bdrv_drain_all () at block/io.c:411
#7  0x0000563b8459888b in block_migration_cleanup (opaque=<optimized out>) at migration/block.c:714
#8  0x0000563b845883be in qemu_savevm_state_cleanup () at migration/savevm.c:1251
#9  0x0000563b845811fd in migration_thread (opaque=0x563b856f1da0) at migration/migration.c:2298
#10 0x00007f7cfc56f36d in start_thread () at /lib64/libpthread.so.0
#11 0x00007f7cfc2a3e1f in clone () at /lib64/libc.so.6

And when you make bdrv_drain_all_begin() print what we are trying to drain, you can see that it's the format node (managed by the "raw" driver in this case).

So I thought, before I put more time into this, let's ask whether the test author has any ideas. :-)

Max