On 02/07/2014 13:57, ChenLiang wrote:
>>> Hmm, dbs->in_cancel will be true always. Although this will avoid freeing
>>> dbs by dma_complete.
>>> But it may be a mistake.
>>
>> This was on purpose; I'm doing the free myself in dma_aio_cancel, so I
>> wanted to avoid the qemu_aio_release from dma_complete. This was in case of
>> a recursive call to dma_complete. But I don't see how that recursive call
>> could happen outside the "if (dbs->acb)"; and inside the "if" the protection
>> is there already.
>>
>> Can you gather the backtraces for _both_ calls to qemu_aio_release, rather
>> than just the second?
>
> (gdb) bt
> #0 qemu_aio_release (p=0x7f44788d1290) at block.c:4260
> #1 0x00007f4477494e5e in dma_complete (dbs=0x7f44788d1290, ret=0) at dma-helpers.c:135
> #2 0x00007f44774952c2 in dma_aio_cancel (acb=0x7f44788d1290) at dma-helpers.c:195
> #3 0x00007f447744825b in bdrv_aio_cancel (acb=0x7f44788d1290) at block.c:3848
> #4 0x00007f4477513911 in ide_bus_reset (bus=0x7f44785f1bd8) at hw/ide/core.c:1957
> #5 0x00007f4477516b3c in piix3_reset (opaque=0x7f44785f1530) at hw/ide/piix.c:113
> #6 0x00007f4477647b9f in qemu_devices_reset () at vl.c:2131
> #7 0x00007f4477647c0f in qemu_system_reset (report=true) at vl.c:2140
> #8 0x00007f4477648127 in main_loop_should_exit () at vl.c:2274
> #9 0x00007f447764823a in main_loop () at vl.c:2323
> #10 0x00007f447764f6da in main (argc=57, argv=0x7fff5d194378, envp=0x7fff5d194548) at vl.c:4803

And the second is:

#7 0x00007f3cb525de5e in dma_complete (dbs=0x7f3cb63f3220, ret=0) at dma-helpers.c:135
#8 0x00007f3cb525df3d in dma_bdrv_cb (opaque=0x7f3cb63f3220, ret=0) at dma-helpers.c:152
#9 0x00007f3cb5212102 in bdrv_co_em_bh (opaque=0x7f3cb6398980) at block.c:4127
#10 0x00007f3cb51f6cef in aio_bh_poll (ctx=0x7f3cb622a8f0) at async.c:70
#11 0x00007f3cb51f695a in aio_poll (ctx=0x7f3cb622a8f0, blocking=false) at aio-posix.c:185
#12 0x00007f3cb51f7056 in aio_ctx_dispatch (source=0x7f3cb622a8f0, callback=0x0, user_data=0x0) at async.c:167
#13 0x00007f3cb48b969a in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0

This explains why my patch "fixes" the bug. It turns a double free into a
dangling pointer: the second call now sees in_cancel == true and skips the
free.

The second call should have happened within dma_aio_cancel's call to
bdrv_aio_cancel. This is the real bug.

What is your version of QEMU? I cannot find any version where bdrv_co_em_bh
is at line 4127 or bdrv_aio_cancel is at line 3848. Can you reproduce it
with qemu.git master?

Paolo
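
The effect described above (the in_cancel guard trading a double free for a
dangling pointer) can be illustrated with a small standalone model. This is
only a sketch under stated assumptions: ModelReq and the model_* functions are
hypothetical stand-ins, not QEMU code, and "release" is counted rather than
performed so that the model itself has no undefined behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the DMA request; not QEMU's real structure. */
typedef struct {
    bool in_cancel;  /* set by the cancel path, never cleared in this model */
    int releases;    /* how many times the object was "released" */
    int stale_uses;  /* accesses made after the first release */
} ModelReq;

/* Stand-in for qemu_aio_release(): records the release instead of freeing. */
static void model_release(ModelReq *r)
{
    assert(r != NULL);
    r->releases++;
}

/* Stand-in for dma_complete(): releases the request unless the guard applies. */
static void model_complete(ModelReq *r, bool use_guard)
{
    if (r->releases > 0) {
        r->stale_uses++;  /* real code would be touching freed memory here */
    }
    if (use_guard && r->in_cancel) {
        return;           /* guarded: the cancel path owns the release */
    }
    model_release(r);
}

/* Stand-in for dma_aio_cancel(). */
static void model_cancel(ModelReq *r, bool use_guard)
{
    r->in_cancel = true;
    model_complete(r, use_guard);
    if (use_guard) {
        model_release(r);  /* with the guard, cancel does the free itself */
    }
}

/* Replays the two backtraces: cancel on reset, then a late completion callback. */
static ModelReq model_run(bool use_guard)
{
    ModelReq r = { false, 0, 0 };
    model_cancel(&r, use_guard);    /* first backtrace: reset -> cancel */
    model_complete(&r, use_guard);  /* second backtrace: late callback */
    return r;
}
```

In this model, model_run(false) ends with two releases (the double free),
while model_run(true) ends with one release but still one access after the
release (the dangling pointer). Either way the late callback is the problem.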