On 10/27/2015 09:12 PM, Paolo Bonzini wrote:
On 27/10/2015 15:09, Denis V. Lunev wrote:
aio_context should be locked in the same way as is done for QMP
snapshot creation. Otherwise there are many possible problems
if native AIO mode is enabled for the disk.
- the command can hang (HMP thread) with a missed wakeup (the operation
has actually completed)
io_submit
ioq_submit
laio_submit
raw_aio_submit
raw_aio_readv
bdrv_co_io_em
bdrv_co_readv_em
bdrv_aligned_preadv
bdrv_co_do_preadv
bdrv_co_do_readv
bdrv_co_readv
qcow2_co_readv
bdrv_aligned_preadv
bdrv_co_do_pwritev
bdrv_rw_co_entry
- QEMU can assert in coroutine re-enter
__GI_abort
qemu_coroutine_enter
bdrv_co_io_em_complete
qemu_laio_process_completion
qemu_laio_completion_bh
aio_bh_poll
aio_dispatch
aio_poll
iothread_run
The AioContext lock is recursive, so nested locking should not be a problem.
Signed-off-by: Denis V. Lunev <d...@openvz.org>
CC: Stefan Hajnoczi <stefa...@redhat.com>
CC: Paolo Bonzini <pbonz...@redhat.com>
CC: Juan Quintela <quint...@redhat.com>
CC: Amit Shah <amit.s...@redhat.com>
---
block/snapshot.c | 5 +++++
migration/savevm.c | 7 +++++++
2 files changed, 12 insertions(+)
diff --git a/block/snapshot.c b/block/snapshot.c
index 89500f2..f6fa17a 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -259,6 +259,9 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
{
int ret;
Error *local_err = NULL;
+ AioContext *aio_context = bdrv_get_aio_context(bs);
+
+ aio_context_acquire(aio_context);
ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
if (ret == -ENOENT || ret == -EINVAL) {
@@ -267,6 +270,8 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
}
+ aio_context_release(aio_context);
Why here and not in hmp_delvm, for consistency?
The call from hmp_savevm is already protected.
Thanks for fixing the bug!
Paolo
The situation is more complicated. There are several disks in a VM.
One disk is used for saving the state (protected in savevm),
and several other disks are touched via
    static int del_existing_snapshots(Monitor *mon, const char *name)
    {
        /* ... */
        while ((bs = bdrv_next(bs))) {
            if (bdrv_can_snapshot(bs) &&
                bdrv_snapshot_find(bs, snapshot, name) >= 0) {
                bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
            }
        }
        /* ... */
    }
in savevm, and by similar-looking code in delvm, where a similar loop
is implemented differently.
To me this patchset looks like the minimal kludge that is good enough
for the situation. The true fix would be to drop this code in favour of
blockdev transactions. At least that is my opinion. I cannot do that
at this stage, though, as it would take a lot of time.
Den