Hi all (yes, that's my new address, I hope for a long time).
I have a question about how aio_wait_bh_oneshot() works. Specifically, I see that data->done is not accessed atomically and doesn't have any barrier protecting it. Is the following possible?

      main-loop                            iothread
                                 |
  aio_wait_bh_oneshot()          |
    aio_bh_schedule_oneshot()    |
                                 |  handle bh:
                                 |    1. set data->done = true
                                 |    2. call aio_wait_kick(), inserting the
                                 |       dummy bh into the main context
  ...                            |
  in AIO_WAIT_WHILE():           |
    handle the dummy bh, go to   |
    the next iteration, but      |
    still read data->done=false  |
    due to some processor data   |
    reordering, then go to the   |
    next iteration of polling    |
    and hang                     |

I've seen the following deadlock on a 2.12-based QEMU, but failed to find whether (and how) it is fixed in master:

1. The main() thread is stuck in qemu_mutex_lock_iothread().

2. The global mutex is taken by migration_thread(), which has the following stack:

  aio_poll ( ctx=qemu_aio_context, blocking=true )
  aio_wait_bh_oneshot ( ctx=context_of_iothread, cb=virtio_blk_data_plane_stop_bh )
  virtio_blk_data_plane_stop
  virtio_bus_stop_ioeventfd
  virtio_vmstate_change
  vm_state_notify
  do_vm_stop
  migration_completion

The iothread itself is in qemu_poll_ns() -> ppoll().

data->done of the BH is true, so I assume the iothread completely handled the BH. Also, there is no dummy bh in the main QEMU aio context's bh list, so I assume it was either already handled, or aio_wait_kick() was called even before entering AIO_WAIT_WHILE. But still, AIO_WAIT_WHILE somehow went into a blocking aio_poll(), as if data->done was false.

--
Best regards,
Vladimir
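
P.S. To make the question concrete, here is a tiny standalone C11 sketch of the pattern I have in mind (this is NOT the QEMU code; fake_bh, done and kicks are names invented just for the example). It shows the ordering I would expect: a release store when publishing the flag from the "iothread" side, and an acquire load in the polling loop on the "main-loop" side. My question is whether aio_wait_bh_oneshot()/AIO_WAIT_WHILE() guarantee something equivalent, or whether the reordering scenario above is really possible.

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Stand-ins for data->done and the aio_wait_kick() notification. */
static atomic_bool done;
static atomic_int kicks;

/* "iothread" side: run the bh, publish done, then kick the waiter. */
static void *fake_bh(void *arg)
{
    /* ... the real callback (e.g. virtio_blk_data_plane_stop_bh) ... */

    /* store-release: the work above must be visible before the flag */
    atomic_store_explicit(&done, true, memory_order_release);

    /* stand-in for aio_wait_kick() scheduling the dummy bh */
    atomic_fetch_add_explicit(&kicks, 1, memory_order_release);
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, fake_bh, NULL);

    /* "main-loop" side: poll the flag, like AIO_WAIT_WHILE(ctx, !data.done) */
    while (!atomic_load_explicit(&done, memory_order_acquire)) {
        usleep(1000); /* here the real code would aio_poll() */
    }

    pthread_join(t, NULL);
    printf("done=%d, kicks=%d\n",
           atomic_load(&done), atomic_load(&kicks));
    return 0;
}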