[PULL 00/25] Migration next patches

2024-02-27 Thread peterx
gitlab.com/peterx/qemu.git tags/migration-next-pull-request for you to fetch changes up to 9425ef3f990a42b98329d5059362f40714e70442: migration: Use migrate_has_error() in close_return_path_on_source() (2024-02-28 11:31:28 +0800) Migr

[PULL 04/25] migration/multifd: Remove p->quit from recv side

2024-02-27 Thread peterx
From: Fabiano Rosas Like we did on the sending side, replace the p->quit per-channel flag with a global atomic 'exiting' flag. Signed-off-by: Fabiano Rosas Reviewed-by: Peter Xu Link: https://lore.kernel.org/r/20240220224138.24759-5-faro...@suse.de Signed-off-by: Peter Xu --- migration/multi

[PULL 02/25] tests/qtest/migration: Rename fd_proto test

2024-02-27 Thread peterx
From: Fabiano Rosas Next patch adds another fd test. Rename the existing one closer to what's used on other tests, with the 'precopy' prefix. Signed-off-by: Fabiano Rosas Reviewed-by: Peter Xu Link: https://lore.kernel.org/r/20240220224138.24759-3-faro...@suse.de Signed-off-by: Peter Xu ---

[PULL 03/25] tests/qtest/migration: Add a fd + file test

2024-02-27 Thread peterx
From: Fabiano Rosas The fd URI supports an fd that is backed by a file. The code should select between QIOChannelFile and QIOChannelSocket, depending on the type of the fd. Add a test for that. Signed-off-by: Fabiano Rosas Reviewed-by: Peter Xu Link: https://lore.kernel.org/r/20240220224138.24

[PULL 01/25] docs/devel/migration.rst: Document the file transport

2024-02-27 Thread peterx
From: Fabiano Rosas When adding the support for file migration with the file: transport, we missed adding documentation for it. Signed-off-by: Fabiano Rosas Reviewed-by: Peter Xu Link: https://lore.kernel.org/r/20240220224138.24759-2-faro...@suse.de Signed-off-by: Peter Xu --- docs/devel/mig

[PULL 08/25] migration/multifd: Make multifd_channel_connect() return void

2024-02-27 Thread peterx
From: Peter Xu It never fails, drop the retval and also the Error**. Suggested-by: Avihai Horon Reviewed-by: Fabiano Rosas Link: https://lore.kernel.org/r/20240222095301.171137-4-pet...@redhat.com Signed-off-by: Peter Xu --- migration/multifd.c | 8 +++- 1 file changed, 3 insertions(+),

[PULL 07/25] migration/multifd: Drop registered_yank

2024-02-27 Thread peterx
From: Peter Xu With a clear definition of p->c protocol, where we only set it up if the channel is fully established (TLS or non-TLS), registered_yank boolean will have equal meaning of "p->c != NULL". Drop registered_yank by checking p->c instead. Reviewed-by: Fabiano Rosas Link: https://lore

[PULL 09/25] migration/multifd: Cleanup outgoing_args in state destroy

2024-02-27 Thread peterx
From: Peter Xu outgoing_args is a global cache of the socket address to be reused in multifd. Freeing the cache in the per-channel destructor is more or less a hack. Move it to multifd_send_cleanup_state() so it only gets checked once. Use a small helper to do so because it's internal to socket.c. Revi

[PULL 06/25] migration/multifd: Cleanup TLS iochannel referencing

2024-02-27 Thread peterx
From: Peter Xu Commit a1af605bd5 ("migration/multifd: fix hangup with TLS-Multifd due to blocking handshake") introduced a thread for TLS channels, which will resolve the issue on blocking the main thread. However in the same commit p->c is slightly abused just to be able to pass over the pointe

[PULL 10/25] migration/multifd: Drop unnecessary helper to destroy IOC

2024-02-27 Thread peterx
From: Peter Xu Both socket_send_channel_destroy() and multifd_send_channel_destroy() are unnecessary wrappers to destroy an IOC, as the only thing to do is to release the final IOC reference. We have plenty of code that destroys an IOC using direct unref() already; keep that style. Reviewed-by:

[PULL 11/25] notify: pass error to notifier with return

2024-02-27 Thread peterx
From: Steve Sistare Pass an error object as the third parameter to "notifier with return" notifiers, so clients no longer need to bundle an error object in the opaque data. The new parameter is used in a later patch. Signed-off-by: Steve Sistare Reviewed-by: Peter Xu Reviewed-by: David Hilden

[PULL 14/25] migration: MigrationEvent for notifiers

2024-02-27 Thread peterx
From: Steve Sistare Passing MigrationState to notifiers is unsound because they could access unstable migration state internals or even modify the state. Instead, pass the minimal info needed in a new MigrationEvent struct, which could be extended in the future if needed. Suggested-by: Peter Xu

[PULL 05/25] migration/multifd: Release recv sem_sync earlier

2024-02-27 Thread peterx
From: Fabiano Rosas Now that multifd_recv_terminate_threads() is called only once, release the recv side sem_sync earlier like we do for the send side. Signed-off-by: Fabiano Rosas Reviewed-by: Peter Xu Link: https://lore.kernel.org/r/20240220224138.24759-6-faro...@suse.de Signed-off-by: Peter

[PULL 12/25] migration: remove error from notifier data

2024-02-27 Thread peterx
From: Steve Sistare Remove the error object from opaque data passed to notifiers. Use the new error parameter passed to the notifier instead. Signed-off-by: Steve Sistare Reviewed-by: Peter Xu Reviewed-by: David Hildenbrand Link: https://lore.kernel.org/r/1708622920-68779-3-git-send-email-st

[PULL 18/25] migration: refactor migrate_fd_connect failures

2024-02-27 Thread peterx
From: Steve Sistare Move common code for the error path in migrate_fd_connect to a shared fail label. No functional change. Signed-off-by: Steve Sistare Reviewed-by: Peter Xu Reviewed-by: David Hildenbrand Link: https://lore.kernel.org/r/1708622920-68779-9-git-send-email-steven.sist...@orac

[PULL 23/25] migration: Fix qmp_query_migrate mbps value

2024-02-27 Thread peterx
From: Fabiano Rosas The QMP command query_migrate might see incorrect throughput numbers if it runs after we've set the migration completion status but before migration_calculate_complete() has updated s->total_time and s->mbps. The migration status would show COMPLETED, but the throughput value

[PULL 20/25] migration: stop vm for cpr

2024-02-27 Thread peterx
From: Steve Sistare When migration for cpr is initiated, stop the vm and set state RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the possibility of ram and device state being out of sync, and guarantees that a guest in the suspended state remains suspended, because qmp_cont rejec

[PULL 15/25] migration: remove postcopy_after_devices

2024-02-27 Thread peterx
From: Steve Sistare postcopy_after_devices and migration_in_postcopy_after_devices are no longer used, so delete them. Signed-off-by: Steve Sistare Reviewed-by: Peter Xu Link: https://lore.kernel.org/r/1708622920-68779-6-git-send-email-steven.sist...@oracle.com Signed-off-by: Peter Xu --- i

[PULL 24/25] migration: Join the return path thread before releasing to_dst_file

2024-02-27 Thread peterx
From: Fabiano Rosas The return path thread might hang at a blocking system call. Before joining the thread we might need to issue a shutdown() on the socket file descriptor to release it. To determine whether the shutdown() is necessary we look at the QEMUFile error. Make sure we only clean up t

[PULL 19/25] migration: notifier error checking

2024-02-27 Thread peterx
From: Steve Sistare Check the status returned by migration notifiers for event type MIG_EVENT_PRECOPY_SETUP, and report errors. None of the notifiers return an error status at this time. Signed-off-by: Steve Sistare Reviewed-by: Peter Xu Link: https://lore.kernel.org/r/1708622920-68779-10-gi

[PULL 22/25] migration: options incompatible with cpr

2024-02-27 Thread peterx
From: Steve Sistare Fail the migration request if options are set that are incompatible with cpr. Signed-off-by: Steve Sistare Reviewed-by: Peter Xu Link: https://lore.kernel.org/r/1708622920-68779-15-git-send-email-steven.sist...@oracle.com Signed-off-by: Peter Xu --- qapi/migration.json

[PULL 16/25] migration: MigrationNotifyFunc

2024-02-27 Thread peterx
From: Steve Sistare Define MigrationNotifyFunc to improve type safety and simplify migration notifiers. Signed-off-by: Steve Sistare Reviewed-by: Peter Xu Reviewed-by: David Hildenbrand Link: https://lore.kernel.org/r/1708622920-68779-7-git-send-email-steven.sist...@oracle.com Signed-off-by:

[PULL 25/25] migration: Use migrate_has_error() in close_return_path_on_source()

2024-02-27 Thread peterx
From: Cédric Le Goater close_return_path_on_source() retrieves the migration error from the QEMUFile '->to_dst_file' to know if a shutdown is required. This shutdown is required to exit the return-path thread. Avoid relying on '->to_dst_file' and use migrate_has_error() instead. (using to_d

[PULL 17/25] migration: per-mode notifiers

2024-02-27 Thread peterx
From: Steve Sistare Keep a separate list of migration notifiers for each migration mode. Suggested-by: Peter Xu Signed-off-by: Steve Sistare Reviewed-by: Peter Xu Reviewed-by: David Hildenbrand Link: https://lore.kernel.org/r/1708622920-68779-8-git-send-email-steven.sist...@oracle.com Signe

[PULL 13/25] migration: convert to NotifierWithReturn

2024-02-27 Thread peterx
: David Hildenbrand Link: https://lore.kernel.org/r/1708622920-68779-4-git-send-email-steven.sist...@oracle.com [peterx: dropped unexpected update to roms/seabios-hppa] Signed-off-by: Peter Xu --- include/hw/vfio/vfio-common.h | 2 +- include/hw/virtio/virtio-net.h | 2 +- include/migration

[PULL 21/25] migration: update cpr-reboot description

2024-02-27 Thread peterx
From: Steve Sistare Clarify qapi for cpr-reboot migration mode, and add vfio support. Signed-off-by: Steve Sistare Reviewed-by: Peter Xu Link: https://lore.kernel.org/r/1708622920-68779-14-git-send-email-steven.sist...@oracle.com Signed-off-by: Peter Xu --- qapi/migration.json | 35

[PATCH] migration/multifd: Document two places for mapped-ram

2024-03-01 Thread peterx
From: Peter Xu Add documentation for mapped-ram migration in two spots that may not be extremely clear. Signed-off-by: Peter Xu --- Based-on: <20240229153017.2221-1-faro...@suse.de> --- migration/multifd.c | 12 migration/ram.c | 8 +++- 2 files changed, 19 insertion

[PULL 01/27] migration: massage cpr-reboot documentation

2024-03-03 Thread peterx
send-email-steven.sist...@oracle.com [peterx: s/qemu/QEMU per Markus's suggestion] Reviewed-by: Markus Armbruster Signed-off-by: Peter Xu --- qapi/migration.json | 46 +++-- 1 file changed, 24 insertions(+), 22 deletions(-) diff --git a/qapi/migrat

[PULL 05/27] io: add and implement QIO_CHANNEL_FEATURE_SEEKABLE for channel file

2024-03-03 Thread peterx
From: Nikolay Borisov Add a generic QIOChannel feature SEEKABLE which would be used by the qemu_file* apis. For the time being this will be only implemented for file channels. Signed-off-by: Nikolay Borisov Reviewed-by: "Daniel P. Berrangé" Reviewed-by: Peter Xu Signed-off-by: Fabiano Rosas

[PULL 09/27] migration/qemu-file: add utility methods for working with seekable channels

2024-03-03 Thread peterx
From: Fabiano Rosas Add utility methods that will be needed when implementing 'mapped-ram' migration capability. Signed-off-by: Fabiano Rosas Reviewed-by: "Daniel P. Berrangé" Link: https://lore.kernel.org/r/20240229153017.2221-7-faro...@suse.de Signed-off-by: Peter Xu --- include/migration/

[PULL 11/27] migration: Add mapped-ram URI compatibility check

2024-03-03 Thread peterx
From: Fabiano Rosas The mapped-ram migration format needs a channel that supports seeking to be able to write each page to an arbitrary offset in the migration stream. Reviewed-by: "Daniel P. Berrangé" Reviewed-by: Peter Xu Signed-off-by: Fabiano Rosas Link: https://lore.kernel.org/r/20240229

[PULL 06/27] io: Add generic pwritev/preadv interface

2024-03-03 Thread peterx
From: Nikolay Borisov Introduce basic pwritev/preadv support in the generic channel layer. Specific implementation will follow for the file channel as this is required in order to support migration streams with fixed location of each ram page. Signed-off-by: Nikolay Borisov Reviewed-by: "Daniel

[PULL 02/27] migration: Properly apply migration compression level parameters

2024-03-03 Thread peterx
From: Bryan Zhang Some glue code was missing, so that using `qmp_migrate_set_parameters` to set `multifd-zstd-level` or `multifd-zlib-level` did not work. This commit adds the glue code to fix that. Signed-off-by: Bryan Zhang Link: https://lore.kernel.org/r/20240301035901.4006936-2-bryan.zh...

[PULL 00/27] Migration next patches

2024-03-03 Thread peterx
From: Peter Xu The following changes since commit c0c6a0e3528b88aaad0b9d333e295707a195587b: Merge tag 'migration-next-pull-request' of https://gitlab.com/peterx/qemu into staging (2024-02-28 17:27:10 +) are available in the Git repository at: https://gitlab.com/peterx/qem

[PULL 08/27] io: fsync before closing a file channel

2024-03-03 Thread peterx
From: Fabiano Rosas Make sure the data is flushed to disk before closing file channels. This is to ensure data is on disk and not lost in the event of a host crash. This is currently being implemented to affect the migration code when migrating to a file, but all QIOChannelFile users should bene

[PULL 16/27] migration/multifd: Decouple recv method from pages

2024-03-03 Thread peterx
From: Fabiano Rosas Next patches will abstract the type of data being received by the channels, so do some cleanup now to remove references to pages and dependency on 'normal_num'. Reviewed-by: Peter Xu Signed-off-by: Fabiano Rosas Link: https://lore.kernel.org/r/20240229153017.2221-14-faro...

[PULL 15/27] migration/multifd: Rename MultiFDSend|RecvParams::data to compress_data

2024-03-03 Thread peterx
From: Fabiano Rosas Use a more specific name for the compression data so we can use the generic for the multifd core code. Reviewed-by: Peter Xu Signed-off-by: Fabiano Rosas Link: https://lore.kernel.org/r/20240229153017.2221-13-faro...@suse.de Signed-off-by: Peter Xu --- migration/multifd.h

[PULL 04/27] migration/multifd: Cleanup multifd_recv_sync_main

2024-03-03 Thread peterx
From: Fabiano Rosas Some minor cleanups and documentation for multifd_recv_sync_main. Use thread_count as done in other parts of the code. Remove p->id from the multifd_recv_state sync, since that is global and not tied to a channel. Add documentation for the sync steps. Reviewed-by: Peter Xu

[PULL 14/27] tests/qtest/migration: Add tests for mapped-ram file-based migration

2024-03-03 Thread peterx
From: Fabiano Rosas Reviewed-by: Peter Xu Signed-off-by: Fabiano Rosas Link: https://lore.kernel.org/r/20240229153017.2221-12-faro...@suse.de Signed-off-by: Peter Xu --- tests/qtest/migration-test.c | 59 1 file changed, 59 insertions(+) diff --git a/test

[PULL 19/27] migration/multifd: Add a wrapper for channels_created

2024-03-03 Thread peterx
From: Fabiano Rosas We'll need to access multifd_send_state->channels_created from outside multifd.c, so introduce a helper for that. Reviewed-by: Peter Xu Signed-off-by: Fabiano Rosas Link: https://lore.kernel.org/r/20240229153017.2221-17-faro...@suse.de Signed-off-by: Peter Xu --- migratio

[PULL 26/27] tests/qtest/migration: Add a multifd + mapped-ram migration test

2024-03-03 Thread peterx
From: Fabiano Rosas Reviewed-by: Peter Xu Signed-off-by: Fabiano Rosas Link: https://lore.kernel.org/r/20240229153017.2221-24-faro...@suse.de Signed-off-by: Peter Xu --- tests/qtest/migration-test.c | 68 1 file changed, 68 insertions(+) diff --git a/test

[PULL 20/27] migration/multifd: Add outgoing QIOChannelFile support

2024-03-03 Thread peterx
From: Fabiano Rosas Allow multifd to open file-backed channels. This will be used when enabling the mapped-ram migration stream format which expects a seekable transport. The QIOChannel read and write methods will use the preadv/pwritev versions which don't update the file offset at each call so

[PULL 17/27] migration/multifd: Allow multifd without packets

2024-03-03 Thread peterx
From: Fabiano Rosas For the upcoming support to the new 'mapped-ram' migration stream format, we cannot use multifd packets because each write into the ramblock section in the migration file is expected to contain only the guest pages. They are written at their respective offsets relative to the

[PULL 25/27] migration/multifd: Add mapped-ram support to fd: URI

2024-03-03 Thread peterx
From: Fabiano Rosas If we receive a file descriptor that points to a regular file, there's nothing stopping us from doing multifd migration with mapped-ram to that file. Enable the fd: URI to work with multifd + mapped-ram. Note that the fds passed into multifd are duplicated because we want to

[PULL 10/27] migration/ram: Introduce 'mapped-ram' migration capability

2024-03-03 Thread peterx
From: Fabiano Rosas Add a new migration capability 'mapped-ram'. The core of the feature is to ensure that RAM pages are mapped directly to offsets in the resulting migration file instead of being streamed at arbitrary points. The reasons why we'd want such behavior are: - The resulting file

[PULL 23/27] migration/multifd: Support outgoing mapped-ram stream format

2024-03-03 Thread peterx
From: Fabiano Rosas The new mapped-ram stream format uses a file transport and puts ram pages in the migration file at their respective offsets; the writes can be done in parallel by using the pwritev system call, which takes iovecs and an offset. Add support for enabling the new format along with multifd

[PULL 22/27] migration/multifd: Prepare multifd sync for mapped-ram migration

2024-03-03 Thread peterx
From: Fabiano Rosas The mapped-ram migration can be performed live or non-live, but it is always asynchronous, i.e. the source machine and the destination machine are not migrating at the same time. We only need some pieces of the multifd sync operations. multifd_send_sync_main() ---

[PULL 27/27] migration/multifd: Document two places for mapped-ram

2024-03-03 Thread peterx
From: Peter Xu Add two documentations for mapped-ram migration on two spots that may not be extremely clear. Reviewed-by: Fabiano Rosas Link: https://lore.kernel.org/r/20240301091524.39900-1-pet...@redhat.com Cc: Prasad Pandit [peterx: fix two English errors per Prasad] Signed-off-by: Peter

[PULL 21/27] migration/multifd: Add incoming QIOChannelFile support

2024-03-03 Thread peterx
From: Fabiano Rosas On the receiving side we don't need to differentiate between main channel and threads, so whichever channel is defined first gets to be the main one. And since there are no packets, use the atomic channel count to index into the params array. Reviewed-by: Peter Xu Signed-off

[PULL 12/27] migration/ram: Add outgoing 'mapped-ram' migration

2024-03-03 Thread peterx
From: Fabiano Rosas Implement the outgoing migration side for the 'mapped-ram' capability. A bitmap is introduced to track which pages have been written in the migration file. Pages are written at a fixed location for every ramblock. Zero pages are ignored as they'd be zero in the destination mi

[PULL 03/27] tests/migration: Set compression level in migration tests

2024-03-03 Thread peterx
From: Bryan Zhang Adds calls to set compression level for `zstd` and `zlib` migration tests, just to make sure that the calls work. Signed-off-by: Bryan Zhang Link: https://lore.kernel.org/r/20240301035901.4006936-3-bryan.zh...@bytedance.com Signed-off-by: Peter Xu --- tests/qtest/migration-

[PULL 13/27] migration/ram: Add incoming 'mapped-ram' migration

2024-03-03 Thread peterx
From: Fabiano Rosas Add the necessary code to parse the format changes for the 'mapped-ram' capability. One of the more notable changes in behavior is that in the 'mapped-ram' case ram pages are restored in one go rather than constantly looping through the migration stream. Signed-off-by: Nikol

[PULL 07/27] io: implement io_pwritev/preadv for QIOChannelFile

2024-03-03 Thread peterx
From: Nikolay Borisov The upcoming 'mapped-ram' feature will require qemu to write data to (and restore from) specific offsets of the migration file. Add a minimal implementation of pwritev/preadv and expose them via the io_pwritev and io_preadv interfaces. Signed-off-by: Nikolay Borisov Revie
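
A rough sketch of the positional I/O this message relies on, assuming a plain POSIX file descriptor rather than a QIOChannelFile. The helper names (write_page_at(), read_page_at(), demo_positional_io()) and the 4096-byte page size are hypothetical and used only for illustration; pwritev() and preadv() are the real system calls. It shows that pages written out of order land at their requested offsets and that the descriptor's own file offset is left untouched:

```c
#define _GNU_SOURCE             /* exposes pwritev()/preadv() on glibc */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

enum { SKETCH_PAGE_SIZE = 4096 };   /* assumed page size, illustration only */

/* Write one page at an absolute file offset; the fd's own offset is unchanged. */
static int write_page_at(int fd, const void *page, off_t offset)
{
    struct iovec iov = { .iov_base = (void *)page, .iov_len = SKETCH_PAGE_SIZE };

    assert(offset >= 0);
    return pwritev(fd, &iov, 1, offset) == SKETCH_PAGE_SIZE ? 0 : -1;
}

/* Read one page back from an absolute file offset. */
static int read_page_at(int fd, void *page, off_t offset)
{
    struct iovec iov = { .iov_base = page, .iov_len = SKETCH_PAGE_SIZE };

    assert(offset >= 0);
    return preadv(fd, &iov, 1, offset) == SKETCH_PAGE_SIZE ? 0 : -1;
}

/* Returns 0 if two pages written out of order can be read back intact. */
static int demo_positional_io(void)
{
    char path[] = "/tmp/pio-sketch-XXXXXX";
    static char a[SKETCH_PAGE_SIZE], b[SKETCH_PAGE_SIZE], out[SKETCH_PAGE_SIZE];
    int fd = mkstemp(path);
    int ok;

    if (fd < 0) {
        return -1;
    }
    unlink(path);               /* the file disappears once fd is closed */
    memset(a, 'A', sizeof(a));
    memset(b, 'B', sizeof(b));

    ok = write_page_at(fd, b, 3 * SKETCH_PAGE_SIZE) == 0 &&
         write_page_at(fd, a, 1 * SKETCH_PAGE_SIZE) == 0 &&
         lseek(fd, 0, SEEK_CUR) == 0 &&      /* offset never moved */
         read_page_at(fd, out, 3 * SKETCH_PAGE_SIZE) == 0 &&
         memcmp(out, b, sizeof(b)) == 0 &&
         read_page_at(fd, out, 1 * SKETCH_PAGE_SIZE) == 0 &&
         memcmp(out, a, sizeof(a)) == 0;

    close(fd);
    return ok ? 0 : -1;
}
```

Because each call carries its own offset, several threads could in principle issue such writes on the same descriptor without coordinating a shared file position.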

[PULL 24/27] migration/multifd: Support incoming mapped-ram stream format

2024-03-03 Thread peterx
From: Fabiano Rosas For the incoming mapped-ram migration we need to read the ramblock headers, get the pages bitmap and send the host address of each non-zero page to the multifd channel thread for writing. Usage on HMP is: (qemu) migrate_set_capability multifd on (qemu) migrate_set_capability

[PULL 18/27] migration/multifd: Allow receiving pages without packets

2024-03-03 Thread peterx
From: Fabiano Rosas Currently multifd does not need to have knowledge of pages on the receiving side because all the information needed is within the packets that come in the stream. We're about to add support to mapped-ram migration, which cannot use packets because it expects the ramblock sect

[PATCH v2 1/5] migration/multifd: Cleanup TLS iochannel referencing

2024-02-22 Thread peterx
From: Peter Xu Commit a1af605bd5 ("migration/multifd: fix hangup with TLS-Multifd due to blocking handshake") introduced a thread for TLS channels, which will resolve the issue on blocking the main thread. However in the same commit p->c is slightly abused just to be able to pass over the pointe

[PATCH v2 0/5] migration: cleanup TLS channel referencing

2024-02-22 Thread peterx
From: Peter Xu v2: - add patches - migration/multifd: Make multifd_channel_connect() return void - migration/multifd: Cleanup outgoing_args in state destroy - migration/multifd: Drop unnecessary helper to destroy IOC - fix spelling This is a small cleanup patchset to firstly cleanup tls io

[PATCH v2 4/5] migration/multifd: Cleanup outgoing_args in state destroy

2024-02-22 Thread peterx
From: Peter Xu outgoing_args is a global cache of the socket address to be reused in multifd. Freeing the cache in the per-channel destructor is more or less a hack. Move it to multifd_send_cleanup_state() so it only gets checked once. Use a small helper to do so because it's internal to socket.c. Sign

[PATCH v2 5/5] migration/multifd: Drop unnecessary helper to destroy IOC

2024-02-22 Thread peterx
From: Peter Xu Both socket_send_channel_destroy() and multifd_send_channel_destroy() are unnecessary wrappers to destroy an IOC, as the only thing to do is to release the final IOC reference. We have plenty of code that destroys an IOC using direct unref() already; keep that style. Signed-off-b

[PATCH v2 3/5] migration/multifd: Make multifd_channel_connect() return void

2024-02-22 Thread peterx
From: Peter Xu It never fails, drop the retval and also the Error**. Suggested-by: Avihai Horon Signed-off-by: Peter Xu --- migration/multifd.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index da2e7c1db1..f52f01ca85 10

[PATCH v2 2/5] migration/multifd: Drop registered_yank

2024-02-22 Thread peterx
From: Peter Xu With a clear definition of p->c protocol, where we only set it up if the channel is fully established (TLS or non-TLS), registered_yank boolean will have equal meaning of "p->c != NULL". Drop registered_yank by checking p->c instead. Reviewed-by: Fabiano Rosas Signed-off-by: Pet

[PULL 00/14] Migration 20240126 patches

2024-01-28 Thread peterx
From: Peter Xu The following changes since commit 7a1dc45af581d2b643cdbf33c01fd96271616fbd: Merge tag 'pull-target-arm-20240126' of https://git.linaro.org/people/pmaydell/qemu-arm into staging (2024-01-26 18:16:35 +) are available in the Git repository at: https://gitlab.

[PULL 13/14] migration: Centralize BH creation and dispatch

2024-01-28 Thread peterx
From: Fabiano Rosas Now that the migration state reference counting is correct, further wrap the bottom half dispatch process to avoid future issues. Move BH creation and scheduling together and wrap the dispatch with an intermediary function that will ensure we always keep the ref/unref balance
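
The ref/unref pairing described here can be sketched as follows. This is an illustration under stated assumptions, not the QEMU implementation: State, PendingBh, bh_schedule_with_ref() and bh_dispatch() are hypothetical names, and the "freed" flag stands in for actually freeing the object so the effect can be observed. The reference is taken at the moment the deferred call is created and dropped only after the callback has run:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct State {
    int refcount;
    bool freed;     /* set instead of calling free(), so the sketch can check it */
} State;

typedef void (*BhFunc)(State *s);

typedef struct PendingBh {
    State *state;
    BhFunc fn;
} PendingBh;

static void state_ref(State *s)
{
    s->refcount++;
}

static void state_unref(State *s)
{
    assert(s->refcount > 0);
    if (--s->refcount == 0) {
        s->freed = true;
    }
}

/* Creation and scheduling happen together: the reference is taken here... */
static PendingBh *bh_schedule_with_ref(State *s, BhFunc fn)
{
    PendingBh *bh = malloc(sizeof(*bh));

    if (!bh) {
        return NULL;
    }
    state_ref(s);
    bh->state = s;
    bh->fn = fn;
    return bh;
}

/* ...and dropped here, only after the callback has finished using the state. */
static void bh_dispatch(PendingBh *bh)
{
    bh->fn(bh->state);
    state_unref(bh->state);
    free(bh);
}

static int calls;

/* Example callback: it must never observe an already-released state. */
static void count_call(State *s)
{
    assert(!s->freed);
    calls++;
}
```

With this shape, a shutdown path that drops its own reference before the deferred call runs no longer releases the object out from under the callback.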

[PULL 02/14] migration: Plug memory leak on HMP migrate error path

2024-01-28 Thread peterx
/20240117140722.3979657-1-arm...@redhat.com [peterx: fix CID number as reported by Peter Maydell] Signed-off-by: Peter Xu --- migration/migration-hmp-cmds.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index

[PULL 03/14] migration: Make threshold_size an uint64_t

2024-01-28 Thread peterx
From: Peter Xu It's always used to compare against another uint64_t. Make it always clear that it's never a negative. Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Fabiano Rosas Link: https://lore.kernel.org/r/20240117075848.139045-2-pet...@redhat.com Signed-off-by: Peter Xu --- migratio

[PULL 12/14] migration: Add a wrapper to qemu_bh_schedule

2024-01-28 Thread peterx
From: Fabiano Rosas Wrap qemu_bh_schedule() to ensure we always hold a reference to the current_migration object. Signed-off-by: Fabiano Rosas Link: https://lore.kernel.org/r/20240119233922.32588-5-faro...@suse.de Signed-off-by: Peter Xu --- migration/migration.c | 31 ++--

[PULL 08/14] migration/yank: Use channel features

2024-01-28 Thread peterx
From: Fabiano Rosas Stop using outside knowledge about the io channels when registering yank functions. Query for features instead. The yank method for all channels used with migration code currently is to call the qio_channel_shutdown() function, so query for QIO_CHANNEL_FEATURE_SHUTDOWN. We co

[PULL 11/14] migration: Reference migration state around loadvm_postcopy_handle_run_bh

2024-01-28 Thread peterx
From: Fabiano Rosas We need to hold a reference to the current_migration object around async calls to avoid it being freed while still in use. Even on this load-side function, we might still use the MigrationState, e.g. to check for capabilities. Signed-off-by: Fabiano Rosas Link: https://lore.ke

[PULL 10/14] migration: Take reference to migration state around bg_migration_vm_start_bh

2024-01-28 Thread peterx
From: Fabiano Rosas We need to hold a reference to the current_migration object around async calls to avoid it being freed while still in use. Signed-off-by: Fabiano Rosas Link: https://lore.kernel.org/r/20240119233922.32588-3-faro...@suse.de Signed-off-by: Peter Xu --- migration/migration.c |

[PULL 14/14] Make 'uri' optional for migrate QAPI

2024-01-28 Thread peterx
From: Het Gala The 'uri' argument should be optional, as the 'uri' and 'channels' arguments are mutually exclusive. Fixes: 074dbce5fcce (migration: New migrate and migrate-incoming argument 'channels') Signed-off-by: Het Gala Link: https://lore.kernel.org/r/20240123064219.40514-1-het.g...@nut

[PULL 07/14] ci: Disable migration compatibility tests for aarch64

2024-01-28 Thread peterx
patch when 9.0 releases. Signed-off-by: Fabiano Rosas Link: https://lore.kernel.org/r/20240118164951.30350-4-faro...@suse.de [peterx: use _SKIPPED rather than _OPTIONAL] Signed-off-by: Peter Xu --- .gitlab-ci.d/buildtest.yml | 4 1 file changed, 4 insertions(+) diff --git a/.gitlab-ci.d

[PULL 09/14] migration: Fix use-after-free of migration state object

2024-01-28 Thread peterx
From: Fabiano Rosas We're currently allowing the process_incoming_migration_bh bottom-half to run without holding a reference to the 'current_migration' object, which leads to a segmentation fault if the BH is still live after migration_shutdown() has dropped the last reference to current_migrati

[PULL 01/14] userfaultfd: use 1ULL to build ioctl masks

2024-01-28 Thread peterx
From: Paolo Bonzini There is no need to use the Linux-internal __u64 type, 1ULL is guaranteed to be wide enough. Signed-off-by: Paolo Bonzini Reviewed-by: Philippe Mathieu-Daudé Link: https://lore.kernel.org/r/20240117160313.175609-1-pbonz...@redhat.com Signed-off-by: Peter Xu --- migration/
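
A minimal sketch of the point made in this message, assuming a generic feature-bit mask rather than the actual userfaultfd ioctl numbers, which the truncated snippet does not show. feature_bit() and has_all_bits() are hypothetical names used only for illustration. The C standard guarantees that unsigned long long is at least 64 bits wide, so `1ULL << nr` is well defined for nr in 0..63 without any kernel-specific type:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a 64-bit mask with only bit `nr` set. */
static uint64_t feature_bit(unsigned int nr)
{
    assert(nr < 64);        /* shifting by 64 or more would be undefined */
    return 1ULL << nr;      /* unsigned long long is at least 64 bits wide */
}

/* Hypothetical helper: true if every bit in `required` is set in `supported`. */
static int has_all_bits(uint64_t supported, uint64_t required)
{
    return (supported & required) == required;
}
```

A plain `1 << nr` would be an int shift and would overflow for high bit numbers, which is why the literal needs a 64-bit suffix.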

[PULL 05/14] analyze-migration.py: Remove trick on parsing ramblocks

2024-01-28 Thread peterx
From: Peter Xu RAM_SAVE_FLAG_MEM_SIZE contains the total length of ramblock idstr to know whether scanning of ramblocks is complete. Drop the trick. Reviewed-by: Fabiano Rosas Link: https://lore.kernel.org/r/20240117075848.139045-4-pet...@redhat.com Signed-off-by: Peter Xu --- scripts/analyz

[PULL 04/14] migration: Drop unnecessary check in ram's pending_exact()

2024-01-28 Thread peterx
From: Peter Xu When the migration framework fetches the exact pending sizes, it means this check: remaining_size < s->threshold_size Must have been done already, actually at migration_iteration_run(): if (must_precopy <= s->threshold_size) { qemu_savevm_state_pending_exact(&must

[PULL 06/14] ci: Add a migration compatibility test job

2024-01-28 Thread peterx
From: Fabiano Rosas The migration tests have support for being passed two QEMU binaries to test migration compatibility. Add a CI job that builds the latest release of QEMU and another job that uses that version plus an already present build of the current version and run the migration tests wi

[PATCH 03/14] migration/multifd: Drop MultiFDSendParams.quit, cleanup error paths

2024-01-31 Thread peterx
From: Peter Xu Multifd send side has two fields to indicate error quits: - MultiFDSendParams.quit - &multifd_send_state->exiting Merge them into the global one. The replacement is done by changing all p->quit checks into the global var check. The global check doesn't need any lock. A few
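
The idea of replacing per-channel quit flags with one global atomic flag can be sketched roughly as follows. This is not the multifd code: `exiting` here is a stand-alone variable, sketch_should_exit() and sketch_request_exit() are hypothetical names, and C11 `<stdatomic.h>` is assumed in place of QEMU's own atomic helpers. Every channel thread polls the same flag without taking a lock, and the exchange lets the first requester be told apart from later ones:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical global flag shared by all channels; starts out false. */
static atomic_bool exiting;

/* Lock-free check that a channel thread could poll in its main loop. */
static bool sketch_should_exit(void)
{
    return atomic_load(&exiting);
}

/*
 * Ask all channels to stop. Returns true only for the first caller, so
 * one-time teardown work is not repeated by later callers.
 */
static bool sketch_request_exit(void)
{
    return !atomic_exchange(&exiting, true);
}
```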

[PATCH 00/14] migration/multifd: Refactor ->send_prepare() and cleanups

2024-01-31 Thread peterx
From: Peter Xu This patchset contains quite a few refactorings to current multifd: - It picked up some patches from an old series of mine [0] (the last patches were dropped, though; I did the cleanup slightly differently): I still managed to include one patch to split pending_job, but

[PATCH 01/14] migration/multifd: Drop stale comment for multifd zero copy

2024-01-31 Thread peterx
From: Peter Xu We've already done that with multifd_flush_after_each_section, for multifd in general. Drop the stale "TODO-like" comment. Reviewed-by: Fabiano Rosas Signed-off-by: Peter Xu --- migration/multifd.c | 11 --- 1 file changed, 11 deletions(-) diff --git a/migration/multi

[PATCH 08/14] migration/multifd: Drop pages->num check in sender thread

2024-01-31 Thread peterx
From: Peter Xu Now with a split SYNC handler, we always have pages->num set for pending_job==true. Assert it instead. Signed-off-by: Peter Xu --- migration/multifd.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c ind

[PATCH 12/14] migration/multifd: multifd_send_prepare_header()

2024-01-31 Thread peterx
From: Peter Xu Introduce a helper multifd_send_prepare_header() to set up the header packet for the multifd sender. It's fine to set up the IOV[0] _before_ send_prepare() because the packet buffer is already ready, even if the content is to be filled in. With this helper, we can already slightly clea

[PATCH 14/14] migration/multifd: Forbid spurious wakeups

2024-01-31 Thread peterx
From: Peter Xu Now multifd's logic is designed to have no spurious wakeup. I still remember a talk to Juan and he seems to agree we should drop it now, and if my memory was right it was there because multifd used to hit that when still debugging. Let's drop it and see what can explode; as long

[PATCH 04/14] migration/multifd: Postpone reset of MultiFDPages_t

2024-01-31 Thread peterx
From: Peter Xu Now we reset MultiFDPages_t object in the multifd sender thread in the middle of the sending job. That's not necessary, because the "*pages" struct will not be reused anyway until pending_job is cleared. Move that to the end after the job is completed, provide a helper to reset a

[PATCH 09/14] migration/multifd: Rename p->num_packets and clean it up

2024-01-31 Thread peterx
From: Peter Xu This field, no matter whether on src or dest, is only used for debugging purposes. It could even be removed already, unless it still more or less provides some accounting on "how many packets are sent/recved for this thread". The other more important one is called packet_num, which

[PATCH 05/14] migration/multifd: Drop MultiFDSendParams.normal[] array

2024-01-31 Thread peterx
From: Peter Xu This array is redundant when p->pages exists. Now that we have extended the lifetime of p->pages to the whole period where pending_job is set, it should be safe to always use p->pages->offset[] rather than p->normal[]. Drop the array. Alongside, normal_num is also redundant, which is the

[PATCH 02/14] migration/multifd: multifd_send_kick_main()

2024-01-31 Thread peterx
From: Peter Xu When a multifd sender thread hits an error, it always needs to kick the main thread by kicking all the semaphores that the main thread may be waiting upon. Provide a helper for it and deduplicate the code. Reviewed-by: Fabiano Rosas Signed-off-by: Peter Xu --- migration/multifd.c | 21 +++

[PATCH 13/14] migration/multifd: Move header prepare/fill into send_prepare()

2024-01-31 Thread peterx
From: Peter Xu This patch redefines the interface of ->send_prepare(). It further simplifies multifd_send_thread(), especially for zero copy. With the new interface, we require the hook to do all the work of preparing the IOVs to send. After it's completed, the IOVs should be ready to be

[PATCH 11/14] migration/multifd: Move trace_multifd_send|recv()

2024-01-31 Thread peterx
From: Peter Xu Move them into fill/unfill of packets. With that, we can further cleanup the send/recv thread procedure, and remove one more temp var. Signed-off-by: Peter Xu --- migration/multifd.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/migration/multi

[PATCH 06/14] migration/multifd: Separate SYNC request with normal jobs

2024-01-31 Thread peterx
From: Peter Xu Multifd provides a threaded model for processing jobs. On the sender side, there can be two kinds of job: (1) a list of pages to send, or (2) a sync request. The sync request is a very special kind of job. It never contains a page array, but only a multifd packet telling the dest sid
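The two job kinds described in this preview can be illustrated with a minimal sketch. It is not QEMU's code: the struct, the enum, and the `pending_sync` and `num_pages` fields are hypothetical names chosen for illustration, and only `pending_job` appears in the messages listed here. The assertion mirrors the idea in patch 08 that a page job always carries at least one page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; these are not QEMU's definitions. */
enum job_kind { JOB_NONE, JOB_PAGES, JOB_SYNC };

struct channel {
    bool pending_job;   /* a list of pages is queued for sending */
    bool pending_sync;  /* a sync request is queued (assumed field) */
    size_t num_pages;   /* number of pages in the queued page job */
};

/* Decide what the sender thread should handle next. */
static enum job_kind next_job(const struct channel *c)
{
    if (c->pending_job) {
        /* A page job is assumed to never be empty. */
        assert(c->num_pages > 0);
        return JOB_PAGES;
    }
    if (c->pending_sync) {
        /* A sync request carries no page array, only a packet. */
        return JOB_SYNC;
    }
    return JOB_NONE;
}
```

Keeping the sync request in its own flag means the page path never has to special-case an empty page list.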

[PATCH 07/14] migration/multifd: Simplify locking in sender thread

2024-01-31 Thread peterx
From: Peter Xu The sender thread will yield the p->mutex before IO starts, trying to not block the requester thread. This may be an unnecessary lock optimization, because the requester can already read pending_job safely even without the lock, because the requester is currently the only one who ca

[PATCH 10/14] migration/multifd: Move total_normal_pages accounting

2024-01-31 Thread peterx
From: Peter Xu Just like the previous patch, move the accounting for total_normal_pages on both src/dst sides into the packet fill/unfill procedures. Signed-off-by: Peter Xu --- migration/multifd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/migration/multifd.c b/mi

[PATCH v2 03/23] migration/multifd: Drop MultiFDSendParams.quit, cleanup error paths

2024-02-02 Thread peterx
From: Peter Xu Multifd send side has two fields to indicate error quits: - MultiFDSendParams.quit - &multifd_send_state->exiting Merge them into the global one. The replacement is done by changing all p->quit checks into the global var check. The global check doesn't need any lock. A few
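The idea of replacing a per-channel quit flag with one global atomic flag can be sketched as follows. This is a simplified illustration using C11 atomics, not the patch itself: `struct send_state` and both helper names are hypothetical stand-ins, and only the `exiting` field name comes from the message above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the global send state; only the flag is modelled. */
struct send_state {
    atomic_int exiting;  /* 0 = running, 1 = shutting down */
};

static struct send_state send_state;  /* zero-initialised: not exiting */

/* Stands in for a per-channel quit check; the read needs no mutex. */
static bool send_should_exit(void)
{
    return atomic_load(&send_state.exiting) != 0;
}

/* Returns true only for the first caller, so teardown can run exactly once. */
static bool send_request_exit(void)
{
    return atomic_exchange(&send_state.exiting, 1) == 0;
}
```

Because every channel reads the same flag, an error on one channel becomes visible to all of them without taking any per-channel lock.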

[PATCH v2 07/23] migration/multifd: Simplify locking in sender thread

2024-02-02 Thread peterx
From: Peter Xu The sender thread will yield the p->mutex before IO starts, trying to not block the requester thread. This may be an unnecessary lock optimization, because the requester can already read pending_job safely even without the lock, because the requester is currently the only one who ca

[PATCH v2 02/23] migration/multifd: multifd_send_kick_main()

2024-02-02 Thread peterx
From: Peter Xu When a multifd sender thread hits an error, it always needs to kick the main thread by kicking all the semaphores that the main thread may be waiting upon. Provide a helper for it and deduplicate the code. Reviewed-by: Fabiano Rosas Signed-off-by: Peter Xu --- migration/multifd.c | 21 +++

[PATCH v2 05/23] migration/multifd: Drop MultiFDSendParams.normal[] array

2024-02-02 Thread peterx
From: Peter Xu This array is redundant when p->pages exists. Now that we have extended the lifetime of p->pages to the whole period where pending_job is set, it should be safe to always use p->pages->offset[] rather than p->normal[]. Drop the array. Alongside, normal_num is also redundant, which is the

[PATCH v2 13/23] migration/multifd: Move header prepare/fill into send_prepare()

2024-02-02 Thread peterx
From: Peter Xu This patch redefines the interface of ->send_prepare(). It further simplifies multifd_send_thread(), especially for zero copy. With the new interface, we require the hook to do all the work of preparing the IOVs to send. After it's completed, the IOVs should be ready to be

[PATCH v2 11/23] migration/multifd: Move trace_multifd_send|recv()

2024-02-02 Thread peterx
From: Peter Xu Move them into fill/unfill of packets. With that, we can further cleanup the send/recv thread procedure, and remove one more temp var. Reviewed-by: Fabiano Rosas Signed-off-by: Peter Xu --- migration/multifd.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-)

[PATCH v2 01/23] migration/multifd: Drop stale comment for multifd zero copy

2024-02-02 Thread peterx
From: Peter Xu We've already done that with multifd_flush_after_each_section, for multifd in general. Drop the stale "TODO-like" comment. Reviewed-by: Fabiano Rosas Signed-off-by: Peter Xu --- migration/multifd.c | 11 --- 1 file changed, 11 deletions(-) diff --git a/migration/multi

[PATCH v2 19/23] migration/multifd: Cleanup multifd_save_cleanup()

2024-02-02 Thread peterx
From: Peter Xu Shrink the function by moving the relevant work into helpers: move the thread join()s into multifd_send_terminate_threads(), then create two more helpers to cover channel/state cleanups. Add a TODO entry for the thread termination process because p->running is still buggy. We need to

[PATCH v2 04/23] migration/multifd: Postpone reset of MultiFDPages_t

2024-02-02 Thread peterx
From: Peter Xu Currently we reset the MultiFDPages_t object in the multifd sender thread in the middle of the sending job. That's not necessary, because the "*pages" struct will not be reused anyway until pending_job is cleared. Move that to the end after the job is completed, provide a helper to reset a
