On Tue, Mar 24, 2020 at 02:47:43PM +0100, Max Reitz wrote:
> Hi Dietmar,
>
> I assume this is with master and has popped up only recently?
>
> Maybe it has something to do with the recent mutex patches by Stefan, so
> I’m Cc-ing him.
>

Hi,

I was able to reproduce the issue with a build after the last batch of
AIO fixes and before Stefan's optimizations. This seems to be a new
issue related to { "completion-mode": "grouped" }. Without that
property, the transaction finishes without a crash.

I'm going to take a look at this.

Sergio.

>
> On 24.03.20 14:33, Dietmar Maurer wrote:
> > spoke too soon - the error is still there, sigh
> >
> >> This is fixed with this patch:
> >>
> >> https://lists.gnu.org/archive/html/qemu-devel/2020-03/msg07249.html
> >>
> >> thanks!
> >>
> >>> On March 24, 2020 12:13 PM Dietmar Maurer <diet...@proxmox.com> wrote:
> >>>
> >>>
> >>> I get a core dump with backup transactions when using io-threads.
> >>>
> >>> To reproduce, create and start a VM with:
> >>>
> >>> # qemu-img create disk1.raw 100M
> >>> # qemu-img create disk2.raw 100M
> >>> #./x86_64-softmmu/qemu-system-x86_64 -chardev
> >>> 'socket,id=qmp,path=/var/run/qemu-test.qmp,server,nowait' -mon
> >>> 'chardev=qmp,mode=control' -pidfile /var/run/qemu-server/108.pid -m 512
> >>> -object 'iothread,id=iothread-virtioscsi0' -object
> >>> 'iothread,id=iothread-virtioscsi1' -device
> >>> 'virtio-scsi-pci,id=virtioscsi0,iothread=iothread-virtioscsi0' -drive
> >>> 'file=disk1.raw,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on'
> >>> -device
> >>> 'scsi-hd,bus=virtioscsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0'
> >>> -device 'virtio-scsi-pci,id=virtioscsi1,iothread=iothread-virtioscsi1'
> >>> -drive
> >>> 'file=disk2.raw,if=none,id=drive-scsi1,format=raw,cache=none,aio=native,detect-zeroes=on'
> >>> -device
> >>> 'scsi-hd,bus=virtioscsi1.0,channel=0,scsi-id=0,lun=1,drive=drive-scsi1,id=scsi1'
> >>>
> >>> Then open socat to the qmp socket
> >>> # socat /var/run/qemu-test.qmp -
> >>>
> >>> And run the following qmp command:
> >>>
> >>> { "execute": "qmp_capabilities", "arguments": {} }
> >>> { "execute": "transaction", "arguments": { "actions": [{ "type":
> >>> "drive-backup", "data": { "device": "drive-scsi0", "sync": "full",
> >>> "target": "backup-sysi0.raw" }}, { "type": "drive-backup", "data": {
> >>> "device": "drive-scsi1", "sync": "full", "target": "backup-scsi1.raw"
> >>> }}], "properties": { "completion-mode": "grouped" } } }
> >>>
> >>> The VM will core dump:
> >>>
> >>> qemu: qemu_mutex_unlock_impl: Operation not permitted
> >>> Aborted (core dumped)
> >>> (gdb) bt
> >>> #0 0x00007f099d5037bb in __GI_raise (sig=sig@entry=6) at
> >>> ../sysdeps/unix/sysv/linux/raise.c:50
> >>> #1 0x00007f099d4ee535 in __GI_abort () at abort.c:79
> >>> #2 0x000055c04e39525e in error_exit (err=<optimized out>,
> >>> msg=msg@entry=0x55c04e5122e0 <__func__.16544> "qemu_mutex_unlock_impl")
> >>> at util/qemu-thread-posix.c:36
> >>> #3 0x000055c04e395813 in qemu_mutex_unlock_impl
> >>> (mutex=mutex@entry=0x7f09903154e0, file=file@entry=0x55c04e51129f
> >>> "util/async.c", line=line@entry=601)
> >>> at util/qemu-thread-posix.c:108
> >>> #4 0x000055c04e38f8e5 in aio_context_release
> >>> (ctx=ctx@entry=0x7f0990315480) at util/async.c:601
> >>> #5 0x000055c04e299073 in bdrv_set_aio_context_ignore (bs=0x7f0929a76500,
> >>> new_context=new_context@entry=0x7f0990315000,
> >>> ignore=ignore@entry=0x7ffe08fa7400)
> >>> at block.c:6238
> >>> #6 0x000055c04e2990cc in bdrv_set_aio_context_ignore
> >>> (bs=bs@entry=0x7f092af47900,
> >>> new_context=new_context@entry=0x7f0990315000,
> >>> ignore=ignore@entry=0x7ffe08fa7400)
> >>> at block.c:6211
> >>> #7 0x000055c04e299443 in bdrv_child_try_set_aio_context
> >>> (bs=bs@entry=0x7f092af47900, ctx=0x7f0990315000,
> >>> ignore_child=ignore_child@entry=0x0, errp=errp@entry=0x0)
> >>> at block.c:6324
> >>> #8 0x000055c04e299576 in bdrv_try_set_aio_context (errp=0x0,
> >>> ctx=<optimized out>, bs=0x7f092af47900) at block.c:6333
> >>> #9 0x000055c04e299576 in bdrv_replace_child
> >>> (child=child@entry=0x7f09902ef5e0, new_bs=new_bs@entry=0x0) at
> >>> block.c:2551
> >>> #10 0x000055c04e2995ff in bdrv_detach_child (child=0x7f09902ef5e0) at
> >>> block.c:2666
> >>> #11 0x000055c04e299ec9 in bdrv_root_unref_child (child=<optimized out>)
> >>> at block.c:2677
> >>> #12 0x000055c04e29f3fe in block_job_remove_all_bdrv
> >>> (job=job@entry=0x7f0927c18800) at blockjob.c:191
> >>> #13 0x000055c04e29f429 in block_job_free (job=0x7f0927c18800) at
> >>> blockjob.c:88
> >>> #14 0x000055c04e2a0909 in job_unref (job=0x7f0927c18800) at job.c:359
> >>> #15 0x000055c04e2a0909 in job_unref (job=0x7f0927c18800) at job.c:351
> >>> #16 0x000055c04e2a0b68 in job_conclude (job=0x7f0927c18800) at job.c:620
> >>> #17 0x000055c04e2a0b68 in job_finalize_single (job=0x7f0927c18800) at
> >>> job.c:688
> >>> #18 0x000055c04e2a0b68 in job_finalize_single (job=0x7f0927c18800) at
> >>> job.c:660
> >>> #19 0x000055c04e2a14fc in job_txn_apply (txn=<optimized out>,
> >>> fn=0x55c04e2a0a50 <job_finalize_single>) at job.c:145
> >>> #20 0x000055c04e2a14fc in job_do_finalize (job=0x7f0927c1c200) at
> >>> job.c:781
> >>> #21 0x000055c04e2a1751 in job_completed_txn_success (job=0x7f0927c1c200)
> >>> at job.c:831
> >>> #22 0x000055c04e2a1751 in job_completed (job=0x7f0927c1c200) at job.c:844
> >>> #23 0x000055c04e2a1751 in job_completed (job=0x7f0927c1c200) at job.c:835
> >>> #24 0x000055c04e2a17b0 in job_exit (opaque=0x7f0927c1c200) at job.c:863
> >>> #25 0x000055c04e38ee75 in aio_bh_call (bh=0x7f098ec52000) at
> >>> util/async.c:164
> >>> #26 0x000055c04e38ee75 in aio_bh_poll (ctx=ctx@entry=0x7f0990315000) at
> >>> util/async.c:164
> >>> #27 0x000055c04e3924fe in aio_dispatch (ctx=0x7f0990315000) at
> >>> util/aio-posix.c:380
> >>> #28 0x000055c04e38ed5e in aio_ctx_dispatch (source=<optimized out>,
> >>> callback=<optimized out>, user_data=<optimized out>) at util/async.c:298
> >>> #29 0x00007f099f020f2e in g_main_context_dispatch () at
> >>> /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
> >>> #30 0x000055c04e391768 in glib_pollfds_poll () at util/main-loop.c:219
> >>> #31 0x000055c04e391768 in os_host_main_loop_wait (timeout=<optimized
> >>> out>) at util/main-loop.c:242
> >>> #32 0x000055c04e391768 in main_loop_wait
> >>> (nonblocking=nonblocking@entry=0) at util/main-loop.c:518
> >>> #33 0x000055c04e032329 in qemu_main_loop () at
> >>> /home/dietmar/pve5-devel/mirror_qemu/softmmu/vl.c:1665
> >>> #34 0x000055c04df36a8e in main (argc=<optimized out>, argv=<optimized
> >>> out>, envp=<optimized out>) at
> >>> /home/dietmar/pve5-devel/mirror_qemu/softmmu/main.c:49
> >
> >
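
For anyone who wants to repeat the comparison with and without the
"completion-mode" property, the socat step of the quoted reproducer could
be scripted roughly as follows. This is only a sketch under assumptions:
the socket path, device names and target names are copied verbatim from
the report, while the helper names (build_transaction, read_reply,
send_qmp) are made up for illustration and are not part of QEMU.

```python
import json
import socket

# Socket path and drive/target names are copied verbatim from the report.
QMP_SOCKET = "/var/run/qemu-test.qmp"
BACKUPS = [
    ("drive-scsi0", "backup-sysi0.raw"),
    ("drive-scsi1", "backup-scsi1.raw"),
]


def build_transaction(backups, completion_mode="grouped"):
    """Build the QMP 'transaction' command from the report.

    Passing completion_mode=None omits the "properties" member, which is
    the variant reported to finish without a crash.
    """
    actions = [
        {
            "type": "drive-backup",
            "data": {"device": device, "sync": "full", "target": target},
        }
        for device, target in backups
    ]
    arguments = {"actions": actions}
    if completion_mode is not None:
        arguments["properties"] = {"completion-mode": completion_mode}
    return {"execute": "transaction", "arguments": arguments}


def read_reply(stream):
    """Return the next command reply, skipping asynchronous QMP events."""
    while True:
        line = stream.readline()
        if not line:
            raise ConnectionError("QMP socket closed")
        message = json.loads(line)
        if "return" in message or "error" in message:
            return message


def send_qmp(path, commands):
    """Send commands over the QMP Unix socket and collect their replies."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        stream = sock.makefile("rw")
        stream.readline()  # discard the QMP greeting banner
        replies = []
        for command in commands:
            stream.write(json.dumps(command) + "\n")
            stream.flush()
            replies.append(read_reply(stream))
        return replies


if __name__ == "__main__":
    commands = [
        {"execute": "qmp_capabilities", "arguments": {}},
        build_transaction(BACKUPS),
    ]
    for reply in send_qmp(QMP_SOCKET, commands):
        print(reply)
```

Calling build_transaction(BACKUPS, completion_mode=None) gives the same
transaction without the property, which makes it easy to switch between
the two cases.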