[RFC PATCH v1 3/7] migration/snapshot: Move RAM_SAVE_FLAG_xxx defines to migration/ram.h

2021-05-12 Thread Andrey Gruzdev
Move RAM_SAVE_FLAG_xxx defines from migration/ram.c to migration/ram.h Signed-off-by: Andrey Gruzdev --- migration/ram.c | 16 migration/ram.h | 16 2 files changed, 16 insertions(+), 16 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index

[RFC PATCH v1 4/7] migration/snapshot: Block layer AIO support in qemu-snapshot

2021-05-12 Thread Andrey Gruzdev
This commit enables asynchronous block layer I/O for qemu-snapshot tool. Implementation provides in-order request completion delivery to simplify migration code. Several file utility routines are introduced as well. Signed-off-by: Andrey Gruzdev --- include/qemu-snapshot.h | 30

[PATCH 1/3] migration: Fix missing qemu_fflush() on buffer file in bg_migration_thread

2021-03-18 Thread Andrey Gruzdev
Added missing qemu_fflush() on buffer file holding precopy device state. Increased initial QIOChannelBuffer allocation to 512KB to avoid reallocs. Typical configurations often require >200KB for device state and VMDESC. Signed-off-by: Andrey Gruzdev --- migration/migration.c | 4 +++- 1 f

[PATCH 0/3] migration: Fixes to the 'background-snapshot' code

2021-03-18 Thread Andrey Gruzdev
state Andrey Gruzdev (3): migration: Fix missing qemu_fflush() on buffer file in bg_migration_thread migration: Inhibit virtio-balloon for the duration of background snapshot migration: Pre-fault memory before starting background snasphot hw/virtio/virtio-balloon.c | 8 -- include

[PATCH 3/3] migration: Pre-fault memory before starting background snasphot

2021-03-18 Thread Andrey Gruzdev
block to be protected before making a userfault_fd wr-protect ioctl(). Signed-off-by: Andrey Gruzdev --- migration/migration.c | 6 + migration/ram.c | 51 +++ migration/ram.h | 1 + 3 files changed, 58 insertions(+) diff --git a/migration

[PATCH 2/3] migration: Inhibit virtio-balloon for the duration of background snapshot

2021-03-18 Thread Andrey Gruzdev
The same thing as for incoming postcopy - we cannot deal with concurrent RAM discards when using background snapshot feature in outgoing migration. Signed-off-by: Andrey Gruzdev --- hw/virtio/virtio-balloon.c | 8 ++-- include/migration/misc.h | 2 ++ migration/migration.c | 8

Re: [PATCH 2/3] migration: Inhibit virtio-balloon for the duration of background snapshot

2021-03-19 Thread Andrey Gruzdev
On 18.03.2021 21:16, David Hildenbrand wrote: On 18.03.21 18:46, Andrey Gruzdev wrote: The same thing as for incoming postcopy - we cannot deal with concurrent RAM discards when using background snapshot feature in outgoing migration. Signed-off-by: Andrey Gruzdev ---   hw/virtio/virtio

Re: [PATCH 3/3] migration: Pre-fault memory before starting background snasphot

2021-03-19 Thread Andrey Gruzdev
the RamDiscardManager patches are still stuck waiting for acks ... and now we're in soft-freeze. RamDiscardManager patches - do they also modify migration code? I mean which part is responsible for not migrating discarded ranges. -- Andrey Gruzdev, Principal Engineer Virtuozzo GmbH +7-903-247-6397 virtuzzo.com

Re: [PATCH 3/3] migration: Pre-fault memory before starting background snasphot

2021-03-19 Thread Andrey Gruzdev
char tmp = *(volatile char *)(ptr + offset) I wanted to do a "= *(ptr + offset)" here. Yep /* Don't optimize the read out. */ asm volatile ("" : "+r" (tmp)); So this is the only volatile thing that the compiler must guarantee to not optimize awa
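The read discussed in this reply can be sketched as a standalone loop. This is a minimal illustration, not the patch's ram_block_populate_pages(): the function name populate_pages, its parameters, and its return value are assumptions made for the example, and the empty inline asm relies on GCC/Clang syntax.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not QEMU code): read one byte from the start of
 * every page in [ptr, ptr + size) so that the kernel populates the page
 * before write protection is applied. page_size is assumed to be the
 * host page size. Returns the number of pages touched.
 */
static size_t populate_pages(const char *ptr, size_t size, size_t page_size)
{
    size_t touched = 0;

    if (page_size == 0) {
        return 0; /* avoid an endless loop on a bogus page size */
    }

    for (size_t offset = 0; offset < size; offset += page_size) {
        /* The volatile access forces a real load from memory. */
        char tmp = *(volatile const char *)(ptr + offset);

        /* Don't optimize the read out (GCC/Clang inline asm). */
        __asm__ volatile ("" : "+r" (tmp));
        touched++;
    }
    return touched;
}
```

A read is enough to populate the page, so the loop never writes to guest memory and cannot change its contents.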

Re: [PATCH 3/3] migration: Pre-fault memory before starting background snasphot

2021-03-19 Thread Andrey Gruzdev
On 19.03.2021 14:27, David Hildenbrand wrote: On 19.03.21 12:05, Andrey Gruzdev wrote: On 19.03.2021 12:28, David Hildenbrand wrote: +/* + * ram_block_populate_pages: populate memory in the RAM block by reading + *   an integer from the beginning of each page. + * + * Since it's solely

Re: [PATCH 1/3] migration: Fix missing qemu_fflush() on buffer file in bg_migration_thread

2021-03-19 Thread Andrey Gruzdev
On 19.03.2021 15:39, David Hildenbrand wrote: On 18.03.21 18:46, Andrey Gruzdev wrote: Added missing qemu_fflush() on buffer file holding precopy device state. Increased initial QIOChannelBuffer allocation to 512KB to avoid reallocs. Typical configurations often require >200KB for dev

[PATCH v1 1/3] migration: Fix missing qemu_fflush() on buffer file in bg_migration_thread

2021-03-19 Thread Andrey Gruzdev
Added missing qemu_fflush() on buffer file holding precopy device state. Increased initial QIOChannelBuffer allocation to 512KB to avoid reallocs. Typical configurations often require >200KB for device state and VMDESC. Signed-off-by: Andrey Gruzdev --- migration/migration.c | 4 +++- 1 f

[PATCH v1 3/3] migration: Pre-fault memory before starting background snasphot

2021-03-19 Thread Andrey Gruzdev
block to be protected before making a userfault_fd wr-protect ioctl(). Signed-off-by: Andrey Gruzdev --- migration/migration.c | 6 ++ migration/ram.c | 48 +++ migration/ram.h | 1 + 3 files changed, 55 insertions(+) diff --git a/migration

[PATCH v1 2/3] migration: Inhibit virtio-balloon for the duration of background snapshot

2021-03-19 Thread Andrey Gruzdev
The same thing as for incoming postcopy - we cannot deal with concurrent RAM discards when using background snapshot feature in outgoing migration. Signed-off-by: Andrey Gruzdev Reviewed-by: David Hildenbrand --- hw/virtio/virtio-balloon.c | 8 ++-- include/migration/misc.h | 2

[PATCH v1 0/3] migration: Fixes to the 'background-snapshot' code

2021-03-19 Thread Andrey Gruzdev
ate * Solution to compatibility issues with virtio-balloon device * Fix to the issue when discarded or never populated pages miss UFFD write protection and get into migration stream in dirty state Andrey Gruzdev (3): migration: Fix missing qemu_fflush() on buffer file in bg_migration_thr

Re: [PATCH v1 1/3] migration: Fix missing qemu_fflush() on buffer file in bg_migration_thread

2021-03-23 Thread Andrey Gruzdev
On 22.03.2021 23:17, Peter Xu wrote: On Fri, Mar 19, 2021 at 05:52:47PM +0300, Andrey Gruzdev wrote: Added missing qemu_fflush() on buffer file holding precopy device state. Increased initial QIOChannelBuffer allocation to 512KB to avoid reallocs. Typical configurations often require >200KB

Re: [PATCH v1 1/3] migration: Fix missing qemu_fflush() on buffer file in bg_migration_thread

2021-03-23 Thread Andrey Gruzdev
On 23.03.2021 17:54, Peter Xu wrote: On Tue, Mar 23, 2021 at 10:51:57AM +0300, Andrey Gruzdev wrote: On 22.03.2021 23:17, Peter Xu wrote: On Fri, Mar 19, 2021 at 05:52:47PM +0300, Andrey Gruzdev wrote: Added missing qemu_fflush() on buffer file holding precopy device state. Increased initial

Re: [PATCH v1 1/3] migration: Fix missing qemu_fflush() on buffer file in bg_migration_thread

2021-03-24 Thread Andrey Gruzdev
On 23.03.2021 21:35, Peter Xu wrote: On Tue, Mar 23, 2021 at 08:21:43PM +0300, Andrey Gruzdev wrote: For the long term I think we'd better have a helper: qemu_put_qio_channel_buffer(QEMUFile *file, QIOChannelBuffer *bioc) So as to hide this flush operation, which is tricky.

Re: [PATCH v1 0/3] migration: Fixes to the 'background-snapshot' code

2021-03-24 Thread Andrey Gruzdev
On 24.03.2021 01:21, Peter Xu wrote: On Fri, Mar 19, 2021 at 05:52:46PM +0300, Andrey Gruzdev wrote: Changes v0->v1: * Using qemu_real_host_page_size instead of TARGET_PAGE_SIZE for host page size in ram_block_populate_pages() * More elegant implementation of ram_block_populate_pa

Re: [PATCH v1 0/3] migration: Fixes to the 'background-snapshot' code

2021-03-25 Thread Andrey Gruzdev
On 24.03.2021 18:41, Peter Xu wrote: On Wed, Mar 24, 2021 at 11:09:27AM +0300, Andrey Gruzdev wrote: I'm also looking into introducing UFFD_FEATURE_WP_UNALLOCATED so as to wr-protect page holes too for a uffd-wp region when the feature bit is set. With that feature we should be able to

Re: [RFC PATCH 0/9] migration/snap-tool: External snapshot utility

2021-03-29 Thread Andrey Gruzdev
Ping On 17.03.2021 19:32, Andrey Gruzdev wrote: This series is a kind of PoC for asynchronous snapshot reverting. This is about external snapshots only and doesn't involve block devices. Thus, it's mainly intended to be used with the new 'background-snapshot' migr

[PATCH v2 2/3] migration: Inhibit virtio-balloon for the duration of background snapshot

2021-03-31 Thread Andrey Gruzdev
The same thing as for incoming postcopy - we cannot deal with concurrent RAM discards when using background snapshot feature in outgoing migration. Signed-off-by: Andrey Gruzdev Reviewed-by: David Hildenbrand --- hw/virtio/virtio-balloon.c | 8 ++-- include/migration/misc.h | 2

[PATCH v2 1/3] migration: Fix missing qemu_fflush() on buffer file in bg_migration_thread

2021-03-31 Thread Andrey Gruzdev
Added missing qemu_fflush() on buffer file holding precopy device state. Increased initial QIOChannelBuffer allocation to 512KB to avoid reallocs. Typical configurations often require >200KB for device state and VMDESC. Signed-off-by: Andrey Gruzdev --- migration/migration.c | 8 +++-

[PATCH v2 0/3] migration: Fixes to the 'background-snapshot' code

2021-03-31 Thread Andrey Gruzdev
patch series contains: * Fix to the issue with occasionally truncated non-iterable device state * Solution to compatibility issues with virtio-balloon device * Fix to the issue when discarded or never populated pages miss UFFD write protection and get into migration stream in dirty state Andrey G

[PATCH v2 3/3] migration: Pre-fault memory before starting background snasphot

2021-03-31 Thread Andrey Gruzdev
block to be protected before making a userfault_fd wr-protect ioctl(). Signed-off-by: Andrey Gruzdev --- migration/migration.c | 6 ++ migration/ram.c | 48 +++ migration/ram.h | 1 + 3 files changed, 55 insertions(+) diff --git a/migration

Re: [PATCH v2 0/3] migration: Fixes to the 'background-snapshot' code

2021-03-31 Thread Andrey Gruzdev
On 31.03.2021 19:02, Peter Xu wrote: On Wed, Mar 31, 2021 at 06:48:06PM +0300, Andrey Gruzdev wrote: Changes v1->v2: * Added comment over the overlooked qemu_fflush() in bg_migration_thread Changes v0->v1: * Using qemu_real_host_page_size instead of TARGET_PAGE_SIZE for host pag

[PATCH for-6.0 2/3] migration: Inhibit virtio-balloon for the duration of background snapshot

2021-03-31 Thread Andrey Gruzdev
The same thing as for incoming postcopy - we cannot deal with concurrent RAM discards when using background snapshot feature in outgoing migration. Signed-off-by: Andrey Gruzdev Reviewed-by: David Hildenbrand --- hw/virtio/virtio-balloon.c | 8 ++-- include/migration/misc.h | 2

[PATCH for-6.0 3/3] migration: Pre-fault memory before starting background snasphot

2021-03-31 Thread Andrey Gruzdev
block to be protected before making a userfault_fd wr-protect ioctl(). Signed-off-by: Andrey Gruzdev --- migration/migration.c | 6 ++ migration/ram.c | 48 +++ migration/ram.h | 1 + 3 files changed, 55 insertions(+) diff --git a/migration

[PATCH for-6.0 0/3] migration: Fixes to the 'background-snapshot' code

2021-03-31 Thread Andrey Gruzdev
state Andrey Gruzdev (3): migration: Fix missing qemu_fflush() on buffer file in bg_migration_thread migration: Inhibit virtio-balloon for the duration of background snapshot migration: Pre-fault memory before starting background snasphot hw/virtio/virtio-balloon.c | 8 +-- include

[PATCH for-6.0 1/3] migration: Fix missing qemu_fflush() on buffer file in bg_migration_thread

2021-03-31 Thread Andrey Gruzdev
Added missing qemu_fflush() on buffer file holding precopy device state. Increased initial QIOChannelBuffer allocation to 512KB to avoid reallocs. Typical configurations often require >200KB for device state and VMDESC. Signed-off-by: Andrey Gruzdev --- migration/migration.c | 8 +++-

Re: [PATCH for-6.0 3/3] migration: Pre-fault memory before starting background snasphot

2021-03-31 Thread Andrey Gruzdev
On 31.03.2021 20:33, David Hildenbrand wrote: On 31.03.21 19:28, Andrey Gruzdev wrote: This commit solves the issue with userfault_fd WP feature that background snapshot is based on. For any never populated or discarded memory page, the UFFDIO_WRITEPROTECT ioctl() would skip updating PTE for

Re: [PATCH for-6.0 3/3] migration: Pre-fault memory before starting background snasphot

2021-03-31 Thread Andrey Gruzdev
On 31.03.2021 20:37, David Hildenbrand wrote: On 31.03.21 19:33, David Hildenbrand wrote: On 31.03.21 19:28, Andrey Gruzdev wrote: This commit solves the issue with userfault_fd WP feature that background snapshot is based on. For any never populated or discarded memory page, the

[PATCH for-6.0 v1 2/4] migration: Inhibit virtio-balloon for the duration of background snapshot

2021-04-01 Thread Andrey Gruzdev
The same thing as for incoming postcopy - we cannot deal with concurrent RAM discards when using background snapshot feature in outgoing migration. Fixes: 8518278a6af589ccc401f06e35f171b1e6fae800 (migration: implementation of background snapshot thread) Signed-off-by: Andrey Gruzdev Reported

[PATCH for-6.0 v1 1/4] migration: Fix missing qemu_fflush() on buffer file in bg_migration_thread

2021-04-01 Thread Andrey Gruzdev
ion of background snapshot thread) Signed-off-by: Andrey Gruzdev --- migration/migration.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/migration/migration.c b/migration/migration.c index ca8b97baa5..00e13f9d58 100644 --- a/migration/migration.c +++ b/migrat

[PATCH for-6.0 v1 4/4] migration: Rename 'bs' to 'block' in background snapshot code

2021-04-01 Thread Andrey Gruzdev
Rename 'bs' to commonly used 'block' in migration/ram.c background snapshot code. Signed-off-by: Andrey Gruzdev Reported-by: David Hildenbrand --- migration/ram.c | 86 + 1 file changed, 44 insertions(+), 42 deletions(-) di

[PATCH for-6.0 v1 3/4] migration: Pre-fault memory before starting background snasphot

2021-04-01 Thread Andrey Gruzdev
block to be protected before making a userfault_fd wr-protect ioctl(). Fixes: 278e2f551a095b234de74dca9c214d5502a1f72c (migration: support UFFD write fault processing in ram_save_iterate()) Signed-off-by: Andrey Gruzdev Reported-by: David Hildenbrand Reviewed-by: David Hildenbrand --- migration

[PATCH for-6.0 v1 0/4] migration: Fixes to the 'background-snapshot' code

2021-04-01 Thread Andrey Gruzdev
h virtio-balloon device * Fix to the issue when discarded or never populated pages miss UFFD write protection and get into migration stream in dirty state * Renaming of 'bs' to commonly used 'block' in migration/ram.c background snapshot code Andrey Gruzdev (4): mig

Re: [PATCH for-6.0 v1 1/4] migration: Fix missing qemu_fflush() on buffer file in bg_migration_thread

2021-04-06 Thread Andrey Gruzdev
On 06.04.2021 15:29, Dr. David Alan Gilbert wrote: * Andrey Gruzdev (andrey.gruz...@virtuozzo.com) wrote: Added missing qemu_fflush() on buffer file holding precopy device state. Increased initial QIOChannelBuffer allocation to 512KB to avoid reallocs. Typical configurations often require

Re: [PATCH for-6.0 v1 0/4] migration: Fixes to the 'background-snapshot' code

2021-04-06 Thread Andrey Gruzdev
On 06.04.2021 19:53, Dr. David Alan Gilbert wrote: * Andrey Gruzdev (andrey.gruz...@virtuozzo.com) wrote: Changes v0->v1: * Fixes to coding style and commit messages * Renamed 'bs' to 'block' in migration/ram.c background snapshot code This patch series contains:

Re: [PULL 0/6] migration + virtiofsd queue

2021-04-08 Thread Andrey Gruzdev
4 The definition of ram_write_tracking_prepare() is inside an #if defined(__linux__), but the callsite is not, I think. OK, reproduced here. Let me see. Dave Seems that non-linux stub is missing, I'll respin. thanks -- PMM

Re: [PULL 0/6] migration + virtiofsd queue

2021-04-08 Thread Andrey Gruzdev
On 08.04.2021 13:50, Dr. David Alan Gilbert wrote: * Andrey Gruzdev (andrey.gruz...@virtuozzo.com) wrote: On 07.04.2021 19:50, Dr. David Alan Gilbert wrote: * Peter Maydell (peter.mayd...@linaro.org) wrote: On Wed, 7 Apr 2021 at 11:22, Dr. David Alan Gilbert (git) wrote: From: "Dr.

[RFC PATCH 1/9] migration/snap-tool: Introduce qemu-snap tool

2021-03-17 Thread Andrey Gruzdev
Initial commit with code to set up execution environment, parse command-line arguments, show usage/version info and so on. Signed-off-by: Andrey Gruzdev --- include/qemu-snap.h | 35 meson.build | 2 + qemu-snap.c | 414 3

[RFC PATCH 6/9] migration/snap-tool: Move RAM_SAVE_FLAG_xxx defines to migration/ram.h

2021-03-17 Thread Andrey Gruzdev
Move RAM_SAVE_FLAG_xxx defines from migration/ram.c to migration/ram.h Signed-off-by: Andrey Gruzdev --- migration/ram.c | 16 migration/ram.h | 16 2 files changed, 16 insertions(+), 16 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index

[RFC PATCH 0/9] migration/snap-tool: External snapshot utility

2021-03-17 Thread Andrey Gruzdev
asynchronous revert works well only with SSD, not with rotational disk.. Some performance stats: * SATA SSD drive with ~500/450 MB/s sequential read/write and ~60K IOPS max. * 220 MB/s average save rate (depends on workload) * 440 MB/s average load rate in precopy * 260 MB/s average load rate in p

[RFC PATCH 4/9] migration/snap-tool: Introduce qemu_ftell2() routine to qemu-file.c

2021-03-17 Thread Andrey Gruzdev
: Andrey Gruzdev --- migration/qemu-file.c | 6 ++ migration/qemu-file.h | 1 + 2 files changed, 7 insertions(+) diff --git a/migration/qemu-file.c b/migration/qemu-file.c index d6e03dbc0e..66be5e6460 100644 --- a/migration/qemu-file.c +++ b/migration/qemu-file.c @@ -657,6 +657,12 @@ int64_t

[RFC PATCH 3/9] migration/snap-tool: Preparations to run code in main loop context

2021-03-17 Thread Andrey Gruzdev
The major part of the code uses QEMUFile and block layer routines, so to take advantage of concurrent I/O operations we need to use coroutines and run in the main loop context. Signed-off-by: Andrey Gruzdev --- include/qemu-snap.h | 3 +++ meson.build | 2 +- qemu-snap

[RFC PATCH 5/9] migration/snap-tool: Block layer AIO support and file utility routines

2021-03-17 Thread Andrey Gruzdev
ev/null +++ b/qemu-snap-io.c @@ -0,0 +1,325 @@ +/* + * QEMU External Snapshot Utility + * + * Copyright Virtuozzo GmbH, 2021 + * + * Authors: + * Andrey Gruzdev + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * later. See the COPYING file in the top-level directory.

[RFC PATCH 9/9] migration/snap-tool: Implementation of snapshot loading in postcopy

2021-03-17 Thread Andrey Gruzdev
Implementation of asynchronous snapshot loading using standard postcopy migration mechanism on destination VM. The point of switchover to postcopy is trivially selected based on percentage of non-zero pages loaded in precopy. Signed-off-by: Andrey Gruzdev --- include/qemu-snap.h | 11 + qemu
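The switchover rule described in this commit message can be sketched as a simple threshold check. This is only an illustration under stated assumptions: the function name should_switch_to_postcopy, its parameters, and the exact comparison are hypothetical and are not taken from the patch, which is not shown in full here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check (not QEMU code): switch from precopy to postcopy
 * once at least `percent` percent of the non-zero pages have been loaded.
 * Page counts are assumed small enough that multiplying by 100 does not
 * overflow a 64-bit integer.
 */
static bool should_switch_to_postcopy(uint64_t loaded_nonzero_pages,
                                      uint64_t total_nonzero_pages,
                                      unsigned percent)
{
    if (total_nonzero_pages == 0) {
        return false; /* nothing to load, so nothing is left for postcopy */
    }

    /* Integer form of: loaded / total >= percent / 100 */
    return loaded_nonzero_pages * 100 >= total_nonzero_pages * percent;
}
```

Comparing cross-multiplied integers avoids floating point and keeps the boundary case exact.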

[RFC PATCH 2/9] migration/snap-tool: Snapshot image create/open routines for qemu-snap tool

2021-03-17 Thread Andrey Gruzdev
: Andrey Gruzdev --- qemu-snap.c | 94 ++--- 1 file changed, 90 insertions(+), 4 deletions(-) diff --git a/qemu-snap.c b/qemu-snap.c index c7118927f7..c9f8d7166a 100644 --- a/qemu-snap.c +++ b/qemu-snap.c @@ -31,6 +31,16 @@ #include "migration

[RFC PATCH 7/9] migration/snap-tool: Complete implementation of snapshot saving

2021-03-17 Thread Andrey Gruzdev
Includes code to parse incoming migration stream, dispatch data to section handlers and deal with complications of open-coded migration format without introducing strong dependencies on QEMU migration code. Signed-off-by: Andrey Gruzdev --- include/qemu-snap.h | 42 +++ qemu-snap-handlers.c

[RFC PATCH 8/9] migration/snap-tool: Implementation of snapshot loading in precopy

2021-03-17 Thread Andrey Gruzdev
This part implements snapshot loading in precopy mode. Signed-off-by: Andrey Gruzdev --- include/qemu-snap.h | 24 ++ qemu-snap-handlers.c | 586 ++- qemu-snap.c | 44 +++- 3 files changed, 649 insertions(+), 5 deletions(-) diff --git a

[RFC PATCH v1 2/7] migration/snapshot: Introduce qemu_ftell2() routine

2021-05-12 Thread Andrey Gruzdev
In qemu-snapshot it is needed to retrieve current QEMUFile offset as a number of bytes read by qemu_get_byte()/qemu_get_buffer(). The existing qemu_ftell() routine would give read position as a number of bytes fetched from underlying IOChannel which is not the same. Signed-off-by: Andrey Gruzdev
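The difference between the two positions can be sketched with a toy buffered reader. The BufferedReader struct and both function names below are hypothetical stand-ins chosen for this example; they do not reproduce the real QEMUFile layout or the body of qemu_ftell2().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a buffered input file; not the real QEMUFile. */
typedef struct BufferedReader {
    int64_t fetched;   /* total bytes pulled from the underlying channel */
    size_t buf_index;  /* offset of the next unread byte in the buffer */
    size_t buf_size;   /* number of valid bytes currently in the buffer */
} BufferedReader;

/* Position counted in bytes fetched from the channel. */
static int64_t reader_tell_fetched(const BufferedReader *r)
{
    return r->fetched;
}

/* Position counted in bytes the caller has actually consumed. */
static int64_t reader_tell_consumed(const BufferedReader *r)
{
    /* Bytes still sitting unread in the buffer have not been consumed. */
    return r->fetched - (int64_t)(r->buf_size - r->buf_index);
}
```

With a 32 KB read-ahead buffer of which only 100 bytes have been parsed, the first function reports 32768 while the second reports 100, which is the distinction the commit message draws.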

[RFC PATCH v1 0/7] migration/snapshot: External snapshot utility

2021-05-12 Thread Andrey Gruzdev
ility postcopy-ram on * qemu> migrate_incoming "exec:qemu-snapshot --revert --postcopy=60 ,cache.direct=on,file.aio=native" And yes, asynchronous revert works well only with SSD, not with rotational disk.. Some performance stats: * SATA SSD drive with ~500/450 MB/s sequan

[RFC PATCH v1 7/7] migration/snapshot: Implementation of qemu-snapshot load path in postcopy mode

2021-05-12 Thread Andrey Gruzdev
The commit enables asynchronous snapshot loading using standard postcopy migration mechanism on destination VM. The point of switchover to postcopy is trivially selected based on percentage of non-zero pages loaded in precopy. Signed-off-by: Andrey Gruzdev --- include/qemu-snapshot.h | 12

[RFC PATCH v1 1/7] migration/snapshot: Introduce qemu-snapshot tool

2021-05-12 Thread Andrey Gruzdev
Execution environment, command-line argument parsing, usage/version info etc. Signed-off-by: Andrey Gruzdev --- include/qemu-snapshot.h | 59 ++ meson.build | 2 + qemu-snapshot-vm.c | 57 ++ qemu-snapshot.c | 439 4

[RFC PATCH v1 6/7] migration/snapshot: Implementation of qemu-snapshot load path

2021-05-12 Thread Andrey Gruzdev
This part implements snapshot loading in precopy mode. Signed-off-by: Andrey Gruzdev --- include/qemu-snapshot.h | 24 +- qemu-snapshot-vm.c | 588 +++- qemu-snapshot.c | 47 +++- 3 files changed, 654 insertions(+), 5 deletions(-) diff --git a

[RFC PATCH v1 5/7] migration/snapshot: Implementation of qemu-snapshot save path

2021-05-12 Thread Andrey Gruzdev
Includes code to parse incoming migration stream, dispatch data to section handlers and deal with complications of open-coded migration format without introducing strong dependencies on QEMU migration code. Signed-off-by: Andrey Gruzdev --- include/qemu-snapshot.h | 34 +- qemu-snapshot-vm.c

[RFC PATCH v2 0/7] migration/snapshot: External snapshot utility

2021-05-12 Thread Andrey Gruzdev
fer' * qemu> migrate_set_capability postcopy-ram on * qemu> migrate_incoming "exec:qemu-snapshot --revert --postcopy=60 ,cache.direct=on,file.aio=native" And yes, asynchronous revert works well only with SSD, not with rotational disk.. Some performance stats: * SA

[RFC PATCH v2 2/7] migration/snapshot: Introduce qemu_ftell2() routine

2021-05-12 Thread Andrey Gruzdev
In qemu-snapshot it is needed to retrieve current QEMUFile offset as a number of bytes read by qemu_get_byte()/qemu_get_buffer(). The existing qemu_ftell() routine would give read position as a number of bytes fetched from underlying IOChannel which is not the same. Signed-off-by: Andrey Gruzdev

[RFC PATCH v2 6/7] migration/snapshot: Implementation of qemu-snapshot load path

2021-05-12 Thread Andrey Gruzdev
This part implements snapshot loading in precopy mode. Signed-off-by: Andrey Gruzdev --- include/qemu-snapshot.h | 24 +- qemu-snapshot-vm.c | 588 +++- qemu-snapshot.c | 47 +++- 3 files changed, 654 insertions(+), 5 deletions(-) diff --git a

[RFC PATCH v2 4/7] migration/snapshot: Block layer AIO support in qemu-snapshot

2021-05-12 Thread Andrey Gruzdev
This commit enables asynchronous block layer I/O for qemu-snapshot tool. Implementation provides in-order request completion delivery to simplify migration code. Several file utility routines are introduced as well. Signed-off-by: Andrey Gruzdev --- include/qemu-snapshot.h | 30

[RFC PATCH v2 1/7] migration/snapshot: Introduce qemu-snapshot tool

2021-05-12 Thread Andrey Gruzdev
Execution environment, command-line argument parsing, usage/version info etc. Signed-off-by: Andrey Gruzdev --- include/qemu-snapshot.h | 59 ++ meson.build | 2 + qemu-snapshot-vm.c | 57 ++ qemu-snapshot.c | 439 4

[RFC PATCH v2 7/7] migration/snapshot: Implementation of qemu-snapshot load path in postcopy mode

2021-05-12 Thread Andrey Gruzdev
The commit enables asynchronous snapshot loading using standard postcopy migration mechanism on destination VM. The point of switchover to postcopy is trivially selected based on percentage of non-zero pages loaded in precopy. Signed-off-by: Andrey Gruzdev --- include/qemu-snapshot.h | 12

[RFC PATCH v2 5/7] migration/snapshot: Implementation of qemu-snapshot save path

2021-05-12 Thread Andrey Gruzdev
Includes code to parse incoming migration stream, dispatch data to section handlers and deal with complications of open-coded migration format without introducing strong dependencies on QEMU migration code. Signed-off-by: Andrey Gruzdev --- include/qemu-snapshot.h | 34 +- qemu-snapshot-vm.c

[RFC PATCH v2 3/7] migration/snapshot: Move RAM_SAVE_FLAG_xxx defines to migration/ram.h

2021-05-12 Thread Andrey Gruzdev
Move RAM_SAVE_FLAG_xxx defines from migration/ram.c to migration/ram.h Signed-off-by: Andrey Gruzdev --- migration/ram.c | 16 migration/ram.h | 16 2 files changed, 16 insertions(+), 16 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index

Re: [RFC PATCH 0/9] migration/snap-tool: External snapshot utility

2021-04-16 Thread Andrey Gruzdev
On 16.04.2021 02:50, Peter Xu wrote: On Wed, Mar 17, 2021 at 07:32:13PM +0300, Andrey Gruzdev wrote: This series is a kind of PoC for asynchronous snapshot reverting. This is about external snapshots only and doesn't involve block devices. Thus, it's mainly intended to be used wi

Re: [PATCH v13 0/5] UFFD write-tracking migration/snapshots

2021-02-24 Thread Andrey Gruzdev
having to go over the dirty bitmap to cross off "discarded" parts and later having to find bits to migrate. At least find_next_bit() can skip whole longs (8 bytes) and is fairly efficient. There is certainly room for improvement (the current guest free page hinting API is certainly a

Re: [PATCH v13 0/5] UFFD write-tracking migration/snapshots

2021-02-24 Thread Andrey Gruzdev
be as you said UFFDIO_ZEROCOPY is not the only route. Thanks, Just to add: one of the good options is to keep track of virtio-balloon discarded pages and pre-fault them before migration starts. What do you think? Just pre-fault everything and inhibit the balloon. That should work. Yep.

Re: [PATCH v13 0/5] UFFD write-tracking migration/snapshots

2021-02-24 Thread Andrey Gruzdev
On 19.02.2021 23:50, Peter Xu wrote: Andrey, On Fri, Feb 19, 2021 at 09:57:37AM +0300, Andrey Gruzdev wrote: For the discards that happen before snapshot is started, I need to dig into Linux and QEMU virtio-balloon code more to get clear on it. Yes it's very tricky on how the error

Re: [PATCH v13 0/5] UFFD write-tracking migration/snapshots

2021-02-24 Thread Andrey Gruzdev
On 24.02.2021 20:01, David Hildenbrand wrote: On 24.02.21 17:56, Andrey Gruzdev wrote: On 22.02.2021 21:11, David Hildenbrand wrote: On 22.02.21 18:54, Peter Xu wrote: On Mon, Feb 22, 2021 at 06:33:27PM +0100, David Hildenbrand wrote: On 22.02.21 18:29, Peter Xu wrote: On Sat, Feb 20, 2021

Re: [PATCH v11 5/5] migration: introduce 'userfaultfd-wrlat.py' script

2021-01-21 Thread Andrey Gruzdev
On 20.01.2021 00:01, Peter Xu wrote: On Wed, Jan 06, 2021 at 06:21:20PM +0300, Andrey Gruzdev wrote: Add BCC/eBPF script to analyze userfaultfd write fault latency distribution. Signed-off-by: Andrey Gruzdev Acked-by: Peter Xu (This seems to be the last patch that lacks a r-b ... Let'

Re: [PATCH v11 5/5] migration: introduce 'userfaultfd-wrlat.py' script

2021-01-21 Thread Andrey Gruzdev
On 21.01.2021 18:37, Peter Xu wrote: On Thu, Jan 21, 2021 at 04:12:23PM +0300, Andrey Gruzdev wrote: +/* KRETPROBE for handle_userfault(). */ +int retprobe_handle_userfault(struct pt_regs *ctx) +{ +u64 pid = (u32) bpf_get_current_pid_tgid(); +u64 *addr_p; + +/* + * Here we just

Re: [PATCH v11 4/5] migration: implementation of background snapshot thread

2021-01-21 Thread Andrey Gruzdev
On 21.01.2021 19:11, Dr. David Alan Gilbert wrote: * Andrey Gruzdev (andrey.gruz...@virtuozzo.com) wrote: On 21.01.2021 12:56, Dr. David Alan Gilbert wrote: * Andrey Gruzdev (andrey.gruz...@virtuozzo.com) wrote: On 19.01.2021 21:49, Dr. David Alan Gilbert wrote: * Andrey Gruzdev (andrey.gruz

Re: [PATCH v11 4/5] migration: implementation of background snapshot thread

2021-01-21 Thread Andrey Gruzdev
On 21.01.2021 20:48, Dr. David Alan Gilbert wrote: * Andrey Gruzdev (andrey.gruz...@virtuozzo.com) wrote: On 21.01.2021 19:11, Dr. David Alan Gilbert wrote: * Andrey Gruzdev (andrey.gruz...@virtuozzo.com) wrote: On 21.01.2021 12:56, Dr. David Alan Gilbert wrote: * Andrey Gruzdev (andrey.gruz

Re: [PATCH v13 4/5] migration: implementation of background snapshot thread

2021-01-29 Thread Andrey Gruzdev
On 28.01.2021 21:29, Dr. David Alan Gilbert wrote: * Andrey Gruzdev (andrey.gruz...@virtuozzo.com) wrote: Introducing implementation of 'background' snapshot thread which overall follows the logic of precopy migration while internally utilizing a completely different mechanism

Re: [PATCH v14 0/5] UFFD write-tracking migration/snapshots

2021-02-04 Thread Andrey Gruzdev
On 04.02.2021 18:01, Dr. David Alan Gilbert wrote: * Andrey Gruzdev (andrey.gruz...@virtuozzo.com) wrote: This patch series is a kind of 'rethinking' of Denis Plotnikov's ideas he's implemented in his series '[PATCH v0 0/4] migration: add background snapshot'.

Re: [PATCH v14 0/5] UFFD write-tracking migration/snapshots

2021-02-04 Thread Andrey Gruzdev
On 04.02.2021 19:53, Dr. David Alan Gilbert wrote: * Dr. David Alan Gilbert (dgilb...@redhat.com) wrote: * Andrey Gruzdev (andrey.gruz...@virtuozzo.com) wrote: This patch series is a kind of 'rethinking' of Denis Plotnikov's ideas he's implemented in his series '[PAT

Re: [PATCH v14 0/5] UFFD write-tracking migration/snapshots

2021-02-08 Thread Andrey Gruzdev
On 04.02.2021 19:53, Dr. David Alan Gilbert wrote: * Dr. David Alan Gilbert (dgilb...@redhat.com) wrote: * Andrey Gruzdev (andrey.gruz...@virtuozzo.com) wrote: This patch series is a kind of 'rethinking' of Denis Plotnikov's ideas he's implemented in his series '[PAT

Re: [PATCH v13 0/5] UFFD write-tracking migration/snapshots

2021-02-09 Thread Andrey Gruzdev
a problem exists. If we are talking about a write to an unpopulated page, we should first get a page fault on the non-present page and populate it with protection bits from the respective VMA. For UFFD_WP VMAs the page will be populated non-writable. So we'll get another page fault on present but re

Re: [PATCH v5 0/4] migration: UFFD write-tracking migration/snapshots

2020-12-08 Thread Andrey Gruzdev
On 08.12.2020 21:24, Peter Xu wrote: On Fri, Dec 04, 2020 at 12:30:59PM +0300, Andrey Gruzdev wrote: This patch series is a kind of 'rethinking' of Denis Plotnikov's ideas he's implemented in his series '[PATCH v0 0/4] migration: add background snapshot'.

Re: [PATCH v5 1/4] migration: introduce 'background-snapshot' migration capability

2020-12-08 Thread Andrey Gruzdev
On 08.12.2020 18:47, Peter Xu wrote: On Fri, Dec 04, 2020 at 12:31:00PM +0300, Andrey Gruzdev wrote: +static +WriteTrackingSupport migrate_query_write_tracking(void) +{ +static WriteTrackingSupport wt_support = WT_SUPPORT_UNKNOWN; Better to be non-static - consider incompatible memory can

Re: [PATCH v6 0/4] migration: UFFD write-tracking migration/snapshots

2020-12-11 Thread Andrey Gruzdev
On 09.12.2020 13:08, Andrey Gruzdev wrote: This patch series is a kind of 'rethinking' of Denis Plotnikov's ideas he's implemented in his series '[PATCH v0 0/4] migration: add background snapshot'. Currently the only way to make (external) live VM snapshot is usi

Re: [PATCH v13 0/5] UFFD write-tracking migration/snapshots

2021-02-11 Thread Andrey Gruzdev
apshot we've got - Guest: (should still be in the state of waiting for cmd) this time we enter "check" Thanks, Hi David, Peter, A little unexpected behavior, from my point of view, for UFFD write-protection. So, that means that UFFD_WP protection/e

Re: [PATCH v13 0/5] UFFD write-tracking migration/snapshots

2021-02-11 Thread Andrey Gruzdev
page, so we'll have a lot of additional UFFD events, many more MISSING events than WP-faults. And the main problem is that adding a MISSING handler is impossible in the current single-threaded snapshot code. We'll get an immediate deadlock on iterative page read. -- Andrey Gruz

Re: [PATCH v13 0/5] UFFD write-tracking migration/snapshots

2021-02-11 Thread Andrey Gruzdev
On 11.02.2021 20:18, Peter Xu wrote: On Thu, Feb 11, 2021 at 12:21:51PM +0300, Andrey Gruzdev wrote: On 09.02.2021 23:31, Peter Xu wrote: On Tue, Feb 09, 2021 at 03:09:28PM -0500, Peter Xu wrote: Hi, David, Andrey, On Tue, Feb 09, 2021 at 08:06:58PM +0100, David Hildenbrand wrote: Hi, just

Re: [PATCH v13 0/5] UFFD write-tracking migration/snapshots

2021-02-11 Thread Andrey Gruzdev
On 11.02.2021 20:32, Peter Xu wrote: On Thu, Feb 11, 2021 at 07:19:47PM +0300, Andrey Gruzdev wrote: On 09.02.2021 22:06, David Hildenbrand wrote: Hi, just stumbled over this, quick question: I recently played with UFFD_WP and noticed that write protection is only effective on pages/ranges

Re: [PATCH v13 0/5] UFFD write-tracking migration/snapshots

2021-02-13 Thread Andrey Gruzdev
ocument this specific behaviour but also clarify that the saved state remains consistent and secure, of course, if you agree with my arguments. -- Andrey Gruzdev, Principal Engineer Virtuozzo GmbH +7-903-247-6397 virtuzzo.com

Re: [PATCH v13 0/5] UFFD write-tracking migration/snapshots

2021-02-18 Thread Andrey Gruzdev
On 17.02.2021 02:35, Peter Xu wrote: Hi, Andrey, On Sat, Feb 13, 2021 at 12:34:07PM +0300, Andrey Gruzdev wrote: On 12.02.2021 19:11, Peter Xu wrote: On Fri, Feb 12, 2021 at 09:52:52AM +0100, David Hildenbrand wrote: On 12.02.21 04:06, Peter Xu wrote: On Thu, Feb 11, 2021 at 10:09:58PM

Re: [PATCH v3 3/7] support UFFD write fault processing in ram_save_iterate()

2020-11-20 Thread Andrey Gruzdev
On 19.11.2020 21:25, Peter Xu wrote: On Thu, Nov 19, 2020 at 03:59:36PM +0300, Andrey Gruzdev via wrote: [...] +/** + * ram_find_block_by_host_address: find RAM block containing host page + * + * Returns true if RAM block is found and pss->block/page are + * pointing to the given host p

Re: [PATCH v3 2/7] introduce UFFD-WP low-level interface helpers

2020-11-20 Thread Andrey Gruzdev
On 19.11.2020 21:39, Peter Xu wrote: On Thu, Nov 19, 2020 at 03:59:35PM +0300, Andrey Gruzdev via wrote: +/** + * uffd_register_memory: register memory range with UFFD + * + * Returns 0 in case of success, negative value on error + * + * @uffd: UFFD file descriptor + * @start: starting virtual

Re: [PATCH v3 5/7] implementation of vm_start() BH

2020-11-20 Thread Andrey Gruzdev
On 19.11.2020 21:46, Peter Xu wrote: On Thu, Nov 19, 2020 at 03:59:38PM +0300, Andrey Gruzdev wrote: To avoid saving updated versions of memory pages we need to start tracking RAM writes before we resume operation of vCPUs. This sequence is especially critical for virtio device backends whos

Re: [PATCH v3 1/7] introduce 'track-writes-ram' migration capability

2020-11-20 Thread Andrey Gruzdev
On 19.11.2020 21:51, Peter Xu wrote: On Thu, Nov 19, 2020 at 03:59:34PM +0300, Andrey Gruzdev via wrote: Signed-off-by: Andrey Gruzdev --- migration/migration.c | 96 +++ migration/migration.h | 1 + qapi/migration.json | 7 +++- 3 files changed

Re: [PATCH v3 1/7] introduce 'track-writes-ram' migration capability

2020-11-20 Thread Andrey Gruzdev
On 19.11.2020 22:07, Peter Xu wrote: On Thu, Nov 19, 2020 at 01:51:50PM -0500, Peter Xu wrote: On Thu, Nov 19, 2020 at 03:59:34PM +0300, Andrey Gruzdev via wrote: Signed-off-by: Andrey Gruzdev --- migration/migration.c | 96 +++ migration/migration.h

Re: [PATCH v3 4/7] implementation of write-tracking migration thread

2020-11-20 Thread Andrey Gruzdev
On 19.11.2020 21:47, Peter Xu wrote: On Thu, Nov 19, 2020 at 03:59:37PM +0300, Andrey Gruzdev via wrote: Signed-off-by: Andrey Gruzdev Some commit message would always be appreciated... Thanks, Yep, missed it.. -- Andrey Gruzdev, Principal Engineer Virtuozzo GmbH +7-903-247-6397

Re: [PATCH v3 7/7] introduce simple linear scan rate limiting mechanism

2020-11-20 Thread Andrey Gruzdev
On 19.11.2020 23:02, Peter Xu wrote: On Thu, Nov 19, 2020 at 03:59:40PM +0300, Andrey Gruzdev wrote: Since reading UFFD events and saving paged data are performed from the same thread, write fault latencies are sensitive to migration stream stalls. Limiting total page saving rate is a method to

Re: [PATCH v3 3/7] support UFFD write fault processing in ram_save_iterate()

2020-11-20 Thread Andrey Gruzdev
On 20.11.2020 18:07, Peter Xu wrote: On Fri, Nov 20, 2020 at 01:44:53PM +0300, Andrey Gruzdev wrote: On 19.11.2020 21:25, Peter Xu wrote: On Thu, Nov 19, 2020 at 03:59:36PM +0300, Andrey Gruzdev via wrote: [...] +/** + * ram_find_block_by_host_address: find RAM block containing host page

Re: [PATCH v3 2/7] introduce UFFD-WP low-level interface helpers

2020-11-20 Thread Andrey Gruzdev
On 20.11.2020 18:01, Peter Xu wrote: On Fri, Nov 20, 2020 at 02:04:46PM +0300, Andrey Gruzdev wrote: +RAMBLOCK_FOREACH_NOT_IGNORED(bs) { +/* Nothing to do with read-only and MMIO-writable regions */ +if (bs->mr->readonly || bs->mr->rom_device) { +

Re: [PATCH v3 3/7] support UFFD write fault processing in ram_save_iterate()

2020-11-20 Thread Andrey Gruzdev
On 20.11.2020 19:43, Peter Xu wrote: On Fri, Nov 20, 2020 at 07:15:07PM +0300, Andrey Gruzdev wrote: Yeah, I think we can re-use the postcopy queue code for faulting pages. I'm worrying a little about some additional overhead dealing with the urgent request semaphore. Also, the code won'

Re: [PATCH v3 3/7] support UFFD write fault processing in ram_save_iterate()

2020-11-24 Thread Andrey Gruzdev
On 24.11.2020 00:34, Peter Xu wrote: On Fri, Nov 20, 2020 at 07:53:34PM +0300, Andrey Gruzdev wrote: On 20.11.2020 19:43, Peter Xu wrote: On Fri, Nov 20, 2020 at 07:15:07PM +0300, Andrey Gruzdev wrote: Yeah, I think we can re-use the postcopy queue code for faulting pages. I'm worr

Re: [PATCH v3 1/7] introduce 'track-writes-ram' migration capability

2020-11-24 Thread Andrey Gruzdev
On 24.11.2020 19:55, Dr. David Alan Gilbert wrote: * Peter Xu (pet...@redhat.com) wrote: On Thu, Nov 19, 2020 at 01:51:50PM -0500, Peter Xu wrote: On Thu, Nov 19, 2020 at 03:59:34PM +0300, Andrey Gruzdev via wrote: Signed-off-by: Andrey Gruzdev --- migration/migration.c | 96
