Re: [PATCH v3 2/2] vhost-net: Fix the virtio features negotiation flaw
在 2022/11/11 3:17, Michael S. Tsirkin 写道: On Sun, Oct 30, 2022 at 09:52:39PM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Save the acked_features once it be configured by guest virtio driver so it can't miss any features. Note that this patch also change the features saving logic in chr_closed_bh, which originally backup features no matter whether the features are 0 or not, but now do it only if features aren't 0. As to reset acked_features to 0 if needed, Qemu always keeping the backup acked_features up-to-date, and save the acked_features after virtio_net_set_features in advance, including reset acked_features to 0, so the behavior is also covered. Signed-off-by: Hyman Huang(黄勇) Signed-off-by: Guoyi Tu --- hw/net/vhost_net.c | 9 + hw/net/virtio-net.c | 5 + include/net/vhost_net.h | 2 ++ net/vhost-user.c| 6 +- 4 files changed, 17 insertions(+), 5 deletions(-) diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index d28f8b9..2bffc27 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -141,6 +141,15 @@ uint64_t vhost_net_get_acked_features(VHostNetState *net) return net->dev.acked_features; } +void vhost_net_save_acked_features(NetClientState *nc) +{ +if (nc->info->type != NET_CLIENT_DRIVER_VHOST_USER) { +return; +} + +vhost_user_save_acked_features(nc, false); +} + static int vhost_net_get_fd(NetClientState *backend) { switch (backend->info->type) { diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index e9f696b..5f8f788 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -924,6 +924,11 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint64_t features) continue; } vhost_net_ack_features(get_vhost_net(nc->peer), features); +/* + * keep acked_features in NetVhostUserState up-to-date so it + * can't miss any features configured by guest virtio driver. + */ +vhost_net_save_acked_features(nc->peer); } if (virtio_has_feature(features, VIRTIO_NET_F_CTRL_VLAN)) { So when do you want to ack features but *not* save them? When openvswitch restart and reconnect and Qemu start the vhost_dev, acked_features in vhost_dev Qemu need to be initialized and the initialized value be fetched from acked_features int NetVhostUserState. At this time, acked_features may not be up-to-date but we want it. Is the effect of this patch, fundamentally, that guest features from virtio are always copied to vhost-user? Do we even need an extra copy in vhost user then? I'm trying to explain this from my view, please point out the mistake if i failed. :) When socket used by vhost-user device disconnectted from openvswitch, Qemu will stop the vhost-user and clean up the whole struct of vhost_dev(include vm's memory region and acked_features), once socket is reconnected from openvswitch, Qemu will collect vm's memory region dynamically but as to acked_features, IMHO, Qemu can not fetch it from guest features of virtio-net, because acked_features are kind of different from guest features(bit 30 is different at least),so Qemu need an extra copy. all this came in with: commit a463215b087c41d7ca94e51aa347cde523831873 Author: Marc-André Lureau Date: Mon Jun 6 18:45:05 2016 +0200 vhost-net: save & restore vhost-user acked features Marc-André do you remember why we have a copy of features in vhost-user and not just reuse the features from virtio? 
diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h index 387e913..3a5579b 100644 --- a/include/net/vhost_net.h +++ b/include/net/vhost_net.h @@ -46,6 +46,8 @@ int vhost_set_vring_enable(NetClientState * nc, int enable); uint64_t vhost_net_get_acked_features(VHostNetState *net); +void vhost_net_save_acked_features(NetClientState *nc); + int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu); #endif diff --git a/net/vhost-user.c b/net/vhost-user.c index 74f349c..c512cc9 100644 --- a/net/vhost-user.c +++ b/net/vhost-user.c @@ -258,11 +258,7 @@ static void chr_closed_bh(void *opaque) s = DO_UPCAST(NetVhostUserState, nc, ncs[0]); for (i = queues -1; i >= 0; i--) { -s = DO_UPCAST(NetVhostUserState, nc, ncs[i]); - -if (s->vhost_net) { -s->acked_features = vhost_net_get_acked_features(s->vhost_net); -} +vhost_user_save_acked_features(ncs[i], false); } qmp_set_link(name, false, &err); -- 1.8.3.1
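Editorial note: patch 1/2 of this series (which introduces vhost_user_save_acked_features) is not quoted in this thread. As a minimal sketch, assuming the helper mirrors the logic removed from chr_closed_bh() above and that the bool argument controls whether the backing vhost_net is also torn down, it plausibly looks like this (the cleanup branch is an assumption, not the actual patch text):

```c
/*
 * Hypothetical shape of the helper added by patch 1/2 (not quoted here).
 * It mirrors the code removed from chr_closed_bh(): copy the currently
 * acked features out of the vhost_dev into NetVhostUserState, so the
 * backup survives a backend disconnect.  The 'cleanup' flag is assumed
 * to additionally release the vhost_net instance.
 */
void vhost_user_save_acked_features(NetClientState *nc, bool cleanup)
{
    NetVhostUserState *s = DO_UPCAST(NetVhostUserState, nc, nc);

    if (s->vhost_net) {
        /* keep the backup in sync with what the guest actually acked */
        s->acked_features = vhost_net_get_acked_features(s->vhost_net);

        if (cleanup) {
            vhost_net_cleanup(s->vhost_net);
            g_free(s->vhost_net);
            s->vhost_net = NULL;
        }
    }
}
```

With such a helper, the reconnect path can re-seed vhost_dev.acked_features from the NetVhostUserState copy, which is why the copy cannot simply be replaced by the virtio-net guest features (bit 30, VHOST_USER_F_PROTOCOL_FEATURES, differs between the two).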
Re: [PATCH v2 07/11] migration: Implement dirty-limit convergence algo
在 2022/11/30 7:17, Peter Xu 写道: On Mon, Nov 21, 2022 at 11:26:39AM -0500, huang...@chinatelecom.cn wrote: diff --git a/migration/migration.c b/migration/migration.c index 86950a1..096b61a 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -240,6 +240,7 @@ void migration_cancel(const Error *error) if (error) { migrate_set_error(current_migration, error); } +qmp_cancel_vcpu_dirty_limit(false, -1, NULL); Disable it only if migrate_dirty_limit() is true? It seems okay if the admin wants to use dirtylimit separately from migration. Ok. migrate_fd_cancel(current_migration); } [...] @@ -1148,22 +1175,31 @@ static void migration_trigger_throttle(RAMState *rs) uint64_t bytes_dirty_period = rs->num_dirty_pages_period * TARGET_PAGE_SIZE; uint64_t bytes_dirty_threshold = bytes_xfer_period * threshold / 100; -/* During block migration the auto-converge logic incorrectly detects - * that ram migration makes no progress. Avoid this by disabling the - * throttling logic during the bulk phase of block migration. */ -if (migrate_auto_converge() && !blk_mig_bulk_active()) { -/* The following detection logic can be refined later. For now: - Check to see if the ratio between dirtied bytes and the approx. - amount of bytes that just got transferred since the last time - we were in this routine reaches the threshold. If that happens - twice, start or increase throttling. */ - -if ((bytes_dirty_period > bytes_dirty_threshold) && -(++rs->dirty_rate_high_cnt >= 2)) { +/* + * The following detection logic can be refined later. For now: + * Check to see if the ratio between dirtied bytes and the approx. + * amount of bytes that just got transferred since the last time + * we were in this routine reaches the threshold. If that happens + * twice, start or increase throttling. + */ + +if ((bytes_dirty_period > bytes_dirty_threshold) && +(++rs->dirty_rate_high_cnt >= 2)) { +rs->dirty_rate_high_cnt = 0; +/* + * During block migration the auto-converge logic incorrectly detects + * that ram migration makes no progress. Avoid this by disabling the + * throttling logic during the bulk phase of block migration + */ + +if (migrate_auto_converge() && !blk_mig_bulk_active()) { Does dirtylimit cap needs to check blk_mig_bulk_active() too? I assume that check was used to ignore the bulk block migration phase where major bandwidth will be consumed by block migrations so the measured bandwidth is not accurate. IIUC it applies to dirtylimit too.Indeed, i'll add this next version. trace_migration_throttle(); -rs->dirty_rate_high_cnt = 0; mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); +} else if (migrate_dirty_limit() && + kvm_dirty_ring_enabled() && + migration_is_active(s)) { Is "kvm_dirty_ring_enabled()" and "migration_is_active(s)" check helpful? Can we only rely on migrate_dirty_limit() alone? In qmp_set_vcpu_dirty_limit, it checks if kvm enabled and dirty ring size set. When "dirty-limit" capability set, we also check this in migrate_caps_check, so kvm_dirty_ring_enabled can be dropped indeed. As for migration_is_active, dirty-limit can be set anytime and migration is active already in the path. It also can be dropped. I'll fix this next version.
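Editorial note: folding in both review comments (apply the blk_mig_bulk_active() check to dirty-limit as well, and drop the redundant kvm_dirty_ring_enabled()/migration_is_active() checks) gives roughly the shape below. This is only a condensed sketch; it matches the form the code takes in the v9 patches later in this digest, where migration_dirty_limit_guest() is introduced.

```c
static void migration_trigger_throttle(RAMState *rs)
{
    /* ... bytes_dirty_period / bytes_dirty_threshold computed as above ... */

    /*
     * Bulk block migration consumes most of the bandwidth, so the
     * dirtied-vs-transferred ratio is meaningless there; skip throttling
     * for auto-converge and dirty-limit alike.
     */
    if (blk_mig_bulk_active()) {
        return;
    }

    if ((bytes_dirty_period > bytes_dirty_threshold) &&
        (++rs->dirty_rate_high_cnt >= 2)) {
        rs->dirty_rate_high_cnt = 0;
        if (migrate_auto_converge()) {
            trace_migration_throttle();
            mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold);
        } else if (migrate_dirty_limit()) {
            /* throttle via per-vCPU dirty page rate limit instead */
            migration_dirty_limit_guest();
        }
    }
}
```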
Re: [PATCH v2 08/11] migration: Export dirty-limit time info
在 2022/11/30 8:09, Peter Xu 写道: On Mon, Nov 21, 2022 at 11:26:40AM -0500, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Export dirty limit throttle time and estimated ring full time, through which we can observe the process of dirty limit during live migration. Signed-off-by: Hyman Huang(黄勇) --- include/sysemu/dirtylimit.h | 2 ++ migration/migration.c | 10 ++ monitor/hmp-cmds.c | 10 ++ qapi/migration.json | 10 +- softmmu/dirtylimit.c| 31 +++ 5 files changed, 62 insertions(+), 1 deletion(-) diff --git a/include/sysemu/dirtylimit.h b/include/sysemu/dirtylimit.h index 8d2c1f3..98cc4a6 100644 --- a/include/sysemu/dirtylimit.h +++ b/include/sysemu/dirtylimit.h @@ -34,4 +34,6 @@ void dirtylimit_set_vcpu(int cpu_index, void dirtylimit_set_all(uint64_t quota, bool enable); void dirtylimit_vcpu_execute(CPUState *cpu); +int64_t dirtylimit_throttle_us_per_full(void); +int64_t dirtylimit_us_ring_full(void); #endif diff --git a/migration/migration.c b/migration/migration.c index 096b61a..886c25d 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -62,6 +62,7 @@ #include "yank_functions.h" #include "sysemu/qtest.h" #include "sysemu/kvm.h" +#include "sysemu/dirtylimit.h" #define MAX_THROTTLE (128 << 20) /* Migration transfer speed throttling */ @@ -1112,6 +1113,15 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s) info->ram->remaining = ram_bytes_remaining(); info->ram->dirty_pages_rate = ram_counters.dirty_pages_rate; } + +if (migrate_dirty_limit() && dirtylimit_in_service()) { +info->has_dirty_limit_throttle_us_per_full = true; +info->dirty_limit_throttle_us_per_full = +dirtylimit_throttle_us_per_full(); + +info->has_dirty_limit_us_ring_full = true; +info->dirty_limit_us_ring_full = dirtylimit_us_ring_full(); +} } static void populate_disk_info(MigrationInfo *info) diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c index 9ad6ee5..9d02baf 100644 --- a/monitor/hmp-cmds.c +++ b/monitor/hmp-cmds.c @@ -339,6 +339,16 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict) info->cpu_throttle_percentage); } +if (info->has_dirty_limit_throttle_us_per_full) { +monitor_printf(mon, "dirty-limit throttle time: %" PRIi64 " us\n", + info->dirty_limit_throttle_us_per_full); +} + +if (info->has_dirty_limit_us_ring_full) { +monitor_printf(mon, "dirty-limit ring full time: %" PRIi64 " us\n", + info->dirty_limit_us_ring_full); +} + if (info->has_postcopy_blocktime) { monitor_printf(mon, "postcopy blocktime: %u\n", info->postcopy_blocktime); diff --git a/qapi/migration.json b/qapi/migration.json index af6b2da..62db5cb 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -242,6 +242,12 @@ # Present and non-empty when migration is blocked. # (since 6.0) # +# @dirty-limit-throttle-us-per-full: Throttle time (us) during the period of +#dirty ring full (since 7.1) +# +# @dirty-limit-us-ring-full: Estimated periodic time (us) of dirty ring full. +#(since 7.1) s/7.1/7.3/ Could you enrich the document for the new fields? For example, currently you only report throttle time for vcpu0 on the 1st field, while for the latter it's an average of all vcpus. These need to be mentioned. > OTOH, how do you normally use these values? Maybe that can also be added into the documents too. Of course yes. 
I'll do that next version +# # Since: 0.14 ## { 'struct': 'MigrationInfo', @@ -259,7 +265,9 @@ '*postcopy-blocktime' : 'uint32', '*postcopy-vcpu-blocktime': ['uint32'], '*compression': 'CompressionStats', - '*socket-address': ['SocketAddress'] } } + '*socket-address': ['SocketAddress'], + '*dirty-limit-throttle-us-per-full': 'int64', + '*dirty-limit-us-ring-full': 'int64'} } ## # @query-migrate: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 3f3c405..9d1df9b 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -573,6 +573,37 @@ static struct DirtyLimitInfo *dirtylimit_query_vcpu(int cpu_index) return info; } +/* Pick up first vcpu throttle time by default */ +int64_t dirtylimit_throttle_us_per_full(void) +{ +CPUState *cpu = first_cpu
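Editorial note: the dirtylimit.c hunk above is cut off. Based on the "Pick up first vcpu throttle time by default" comment and the throttle_us_per_full field that the "softmmu/dirtylimit: Implement virtual CPU throttle" patch adds to struct CPUState, the v2 getter presumably amounts to the sketch below; this is an assumption, not the actual patch body.

```c
/* Pick up first vcpu throttle time by default (v2 semantics; the v9
 * series later switches this to the per-round maximum over all vCPUs). */
int64_t dirtylimit_throttle_us_per_full(void)
{
    CPUState *cpu = first_cpu;

    return cpu ? cpu->throttle_us_per_full : 0;
}
```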
Re: [PATCH v2 08/11] migration: Export dirty-limit time info
在 2022/11/22 0:26, huang...@chinatelecom.cn 写道: From: Hyman Huang(黄勇) Export dirty limit throttle time and estimated ring full time, through which we can observe the process of dirty limit during live migration. Signed-off-by: Hyman Huang(黄勇) --- include/sysemu/dirtylimit.h | 2 ++ migration/migration.c | 10 ++ monitor/hmp-cmds.c | 10 ++ qapi/migration.json | 10 +- softmmu/dirtylimit.c| 31 +++ 5 files changed, 62 insertions(+), 1 deletion(-) diff --git a/include/sysemu/dirtylimit.h b/include/sysemu/dirtylimit.h index 8d2c1f3..98cc4a6 100644 --- a/include/sysemu/dirtylimit.h +++ b/include/sysemu/dirtylimit.h @@ -34,4 +34,6 @@ void dirtylimit_set_vcpu(int cpu_index, void dirtylimit_set_all(uint64_t quota, bool enable); void dirtylimit_vcpu_execute(CPUState *cpu); +int64_t dirtylimit_throttle_us_per_full(void); +int64_t dirtylimit_us_ring_full(void); #endif diff --git a/migration/migration.c b/migration/migration.c index 096b61a..886c25d 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -62,6 +62,7 @@ #include "yank_functions.h" #include "sysemu/qtest.h" #include "sysemu/kvm.h" +#include "sysemu/dirtylimit.h" #define MAX_THROTTLE (128 << 20) /* Migration transfer speed throttling */ @@ -1112,6 +1113,15 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s) info->ram->remaining = ram_bytes_remaining(); info->ram->dirty_pages_rate = ram_counters.dirty_pages_rate; } + +if (migrate_dirty_limit() && dirtylimit_in_service()) { +info->has_dirty_limit_throttle_us_per_full = true; +info->dirty_limit_throttle_us_per_full = +dirtylimit_throttle_us_per_full(); + +info->has_dirty_limit_us_ring_full = true; +info->dirty_limit_us_ring_full = dirtylimit_us_ring_full(); +} } static void populate_disk_info(MigrationInfo *info) diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c index 9ad6ee5..9d02baf 100644 --- a/monitor/hmp-cmds.c +++ b/monitor/hmp-cmds.c @@ -339,6 +339,16 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict) info->cpu_throttle_percentage); } +if (info->has_dirty_limit_throttle_us_per_full) { +monitor_printf(mon, "dirty-limit throttle time: %" PRIi64 " us\n", + info->dirty_limit_throttle_us_per_full); +} + +if (info->has_dirty_limit_us_ring_full) { +monitor_printf(mon, "dirty-limit ring full time: %" PRIi64 " us\n", + info->dirty_limit_us_ring_full); +} + if (info->has_postcopy_blocktime) { monitor_printf(mon, "postcopy blocktime: %u\n", info->postcopy_blocktime); diff --git a/qapi/migration.json b/qapi/migration.json index af6b2da..62db5cb 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -242,6 +242,12 @@ # Present and non-empty when migration is blocked. # (since 6.0) # +# @dirty-limit-throttle-us-per-full: Throttle time (us) during the period of +#dirty ring full (since 7.1) +# > +# @dirty-limit-us-ring-full: Estimated periodic time (us) of dirty ring full. +#(since 7.1) How about the following documents: # @dirty-limit-throttletime-each-round: Max throttle time (us) of all virtual CPUs each dirty ring # full round, used to observe if dirty-limit take effect # during live migration. (since 7.3) # # @dirty-limit-ring-full-time: Estimated average dirty ring full time (us) each round, note that # the value equals dirty ring memory size divided by average dirty # page rate of virtual CPU, which can be used to observe the average # memory load of virtual CPU indirectly. (since 7.3) Is it more easy-understanding ? 
+# # Since: 0.14 ## { 'struct': 'MigrationInfo', @@ -259,7 +265,9 @@ '*postcopy-blocktime' : 'uint32', '*postcopy-vcpu-blocktime': ['uint32'], '*compression': 'CompressionStats', - '*socket-address': ['SocketAddress'] } } + '*socket-address': ['SocketAddress'], + '*dirty-limit-throttle-us-per-full': 'int64', + '*dirty-limit-us-ring-full': 'int64'} } > ## # @query-migrate: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 3f3c405..9d1df9b 100644 ---
Re: [PULL 06/30] softmmu/dirtylimit: Implement virtual CPU throttle
On 2022/7/29 22:14, Richard Henderson wrote: On 7/29/22 06:31, Peter Maydell wrote: On Wed, 20 Jul 2022 at 12:30, Dr. David Alan Gilbert (git) wrote: From: Hyman Huang(黄勇) Set up a negative feedback system for when the vCPU thread handles a KVM_EXIT_DIRTY_RING_FULL exit, by introducing a throttle_us_per_full field in struct CPUState. Sleep throttle_us_per_full microseconds to throttle the vCPU if dirtylimit is in service. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Peter Xu Message-Id: <977e808e03a1cef5151cae75984658b6821be618.1656177590.git.huang...@chinatelecom.cn> Signed-off-by: Dr. David Alan Gilbert

Hi; Coverity points out a problem with this code (CID 1490787):

Thanks for pointing out this bug. I'm requesting access to https://scan.coverity.com so that Coverity problems can be caught once a new series is posted. Hoping this bug doesn't appear anymore. :)

+static inline int64_t dirtylimit_dirty_ring_full_time(uint64_t dirtyrate) +{ + static uint64_t max_dirtyrate; + uint32_t dirty_ring_size = kvm_dirty_ring_size(); + uint64_t dirty_ring_size_meory_MB = + dirty_ring_size * TARGET_PAGE_SIZE >> 20;

Because dirty_ring_size and TARGET_PAGE_SIZE are both 32 bits, this multiplication will be done as a 32-bit operation, which could overflow. You should cast one of the operands to uint64_t to ensure that the operation is done as a 64-bit multiplication.

To compute MB, you don't need multiplication: dirty_ring_size >> (20 - TARGET_PAGE_BITS) In addition, why the mismatch in type? dirty_ring_size_memory_MB can never be larger than dirty_ring_size. r~

I'll post a bugfix patch later following your suggestion; please review, thanks. Side note: typo in the variable name: should be 'memory'.

+ if (max_dirtyrate < dirtyrate) { + max_dirtyrate = dirtyrate; + } + + return dirty_ring_size_meory_MB * 100 / max_dirtyrate; +} thanks -- PMM
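Editorial note: a minimal illustration of the Coverity finding and of both suggested fixes, using only the names that appear in the quoted code:

```c
uint32_t dirty_ring_size = kvm_dirty_ring_size();

/* buggy: both operands are 32-bit, so the multiply happens in 32 bits
 * and can overflow before the result is widened to uint64_t */
uint64_t mb_overflowing = dirty_ring_size * TARGET_PAGE_SIZE >> 20;

/* fix (a): force a 64-bit multiply by casting one operand */
uint64_t mb_cast = (uint64_t)dirty_ring_size * TARGET_PAGE_SIZE >> 20;

/* fix (b), as suggested above: converting a page count to megabytes
 * needs no multiplication at all */
uint64_t mb_shift = dirty_ring_size >> (20 - TARGET_PAGE_BITS);
```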
Re: [PATCH 0/8] migration: introduce dirtylimit capability
Ping. How about this series? Hoping to get comments if anyone has played with it. Thanks! Hyman

On 2022/7/23 15:49, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇)

Abstract

This series adds a new migration capability called "dirtylimit". It can be enabled when the dirty ring is enabled, and it improves vCPU performance during migration. It is based on the previous patchset: https://lore.kernel.org/qemu-devel/cover.1656177590.git.huang...@chinatelecom.cn/

As mentioned in the patchset "support dirty restraint on vCPU", the dirtylimit way of migration can avoid penalizing the read process. This series wires up the vCPU dirty limit and wraps it as the dirtylimit capability of migration. I introduce two parameters, vcpu-dirtylimit-period and vcpu-dirtylimit, to implement the setup of dirtylimit during live migration.

To validate the implementation, I tested live migration of a 32-vCPU VM with the following model: only dirty vcpu0 and vcpu1 with a heavy memory workload, leave the rest of the vCPUs untouched, and run unixbench on vcpu8-vcpu15 by setting the cpu affinity with the following command: taskset -c 8-15 ./Run -i 2 -c 8 {unixbench test item}

The following are results:

host cpu: Intel(R) Xeon(R) Platinum 8378A
host interface speed: 1000Mb/s

| UnixBench test item | Normal | Dirtylimit | Auto-converge |
|---------------------+--------+------------+---------------|
| dhry2reg            | 32800  | 32786      | 25292         |
| whetstone-double    | 10326  | 10315      | 9847          |
| pipe                | 15442  | 15271      | 14506         |
| context1            | 7260   | 6235       | 4514          |
| spawn               | 3663   | 3317       | 3249          |
| syscall             | 4669   | 4667       | 3841          |

From the data above we can conclude that vCPUs that do not dirty memory in the VM are almost unaffected during dirtylimit migration, whereas they are penalized with auto-converge.

I also tested the total time of dirtylimit migration with variable dirty memory size in the VM.

scenario 1:
host cpu: Intel(R) Xeon(R) Platinum 8378A
host interface speed: 1000Mb/s

| dirty memory size(MB) | Dirtylimit(ms) | Auto-converge(ms) |
|-----------------------+----------------+-------------------|
| 60                    | 2014           | 2131              |
| 70                    | 5381           | 12590             |
| 90                    | 6037           | 33545             |
| 110                   | 7660           | [*]               |

[*]: This case means migration is not convergent.

scenario 2:
host cpu: Intel(R) Xeon(R) CPU E5-2650
host interface speed: 1Mb/s

| dirty memory size(MB) | Dirtylimit(ms) | Auto-converge(ms) |
|-----------------------+----------------+-------------------|
| 1600                  | 15842          | 27548             |
| 2000                  | 19026          | 38447             |
| 2400                  | 19897          | 46381             |
| 2800                  | 22338          | 57149             |

The data above shows that the dirtylimit way of migration can also reduce the total migration time and achieves convergence more easily in some cases.

In addition to implementing the dirtylimit capability itself, this series adds 3 tests for migration, so developers can easily play around with it: 1. qtest for dirty limit migration 2. support dirty ring way of migration for guestperf tool 3. support dirty limit migration for guestperf tool

Please review, thanks!
Hyman Huang (8): qapi/migration: Introduce x-vcpu-dirty-limit-period parameter qapi/migration: Introduce vcpu-dirty-limit parameters migration: Introduce dirty-limit capability migration: Implement dirty-limit convergence algo migration: Export dirty-limit time info tests: Add migration dirty-limit capability test tests/migration: Introduce dirty-ring-size option into guestperf tests/migration: Introduce dirty-limit into guestperf include/sysemu/dirtylimit.h | 2 + migration/migration.c | 50 ++ migration/migration.h | 1 + migration/ram.c | 53 ++- migration/trace-events | 1 + monitor/hmp-cmds.c | 26 ++ qapi/migration.json | 57 softmmu/dirtylimit.c| 33 +++- tests/migration/guestperf/comparison.py | 14 + tests/migration/guestperf/engine.py
[PATCH QEMU v9 0/9] migration: introduce dirtylimit capability
Markus, thanks for crafting the comments; please review the latest version. Yong

v7~v9:
Rebase on master, fix conflicts and craft the docs as suggested by Markus

v6:
1. Rebase on master
2. Split the commit "Implement dirty-limit convergence algo" into two as Juan suggested, as follows:
   a. Put the detection logic before auto-converge checking
   b. Implement dirty-limit convergence algo
3. Put the detection logic before auto-converge checking
4. Sort the migrate_dirty_limit function in commit "Introduce dirty-limit capability" as suggested by Juan
5. Substitute int64_t with uint64_t in the last 2 commits
6. Fix spelling mistakes in the comments
7. Add a helper function in the commit "Implement dirty-limit convergence algo" as suggested by Juan

v5:
1. Rebase on master and enrich the comment for the "dirty-limit" capability, as suggested by Markus.
2. Drop commits that have already been merged.

v4:
1. Polish the docs and update the release version as suggested by Markus
2. Rename the migrate exported info "dirty-limit-throttle-time-per-round" to "dirty-limit-throttle-time-per-full".

v3(resend):
- fix the syntax error of the topic.

v3:
This version makes some modifications inspired by Peter and Markus, as follows:
1. Do the code clean up in [PATCH v2 02/11] suggested by Markus
2. Replace [PATCH v2 03/11] with a much simpler patch posted by Peter to fix the following bug: https://bugzilla.redhat.com/show_bug.cgi?id=2124756
3. Fix the error path of migrate_params_check in [PATCH v2 04/11] pointed out by Markus. Enrich the commit message to explain why x-vcpu-dirty-limit-period is an unstable parameter.
4. Refactor the dirty-limit convergence algo in [PATCH v2 07/11] as suggested by Peter:
   a. apply the blk_mig_bulk_active check before enabling dirty-limit
   b. drop the unhelpful check function before enabling dirty-limit
   c. change the migration_cancel logic, cancel dirty-limit only if the dirty-limit capability is turned on.
   d. abstract a code clean commit [PATCH v3 07/10] to adjust the check order before enabling auto-converge
5. Change the names of the observed indexes during dirty-limit live migration to make them easier to understand. Use the maximum throttle time of the vcpus as "dirty-limit-throttle-time-per-full"
6. Fix some grammatical and spelling errors pointed out by Markus and enrich the documentation of the dirty-limit live migration observed indexes "dirty-limit-ring-full-time" and "dirty-limit-throttle-time-per-full"
7. Change the default value of x-vcpu-dirty-limit-period to 1000ms, which is the optimal value pointed out in the cover letter for that testing environment.
8. Drop the 2 guestperf test commits [PATCH v2 10/11], [PATCH v2 11/11] and post them in a standalone series in the future.

v2:
This version makes a few modifications compared with version 1, as follows:
1. fix the overflow issue reported by Peter Maydell
2. add parameter check for the hmp "set_vcpu_dirty_limit" command
3. fix the racing issue between the dirty ring reaper thread and the Qemu main thread.
4. add migrate parameter checks for x-vcpu-dirty-limit-period and vcpu-dirty-limit.
5. add the logic to forbid the hmp/qmp commands set_vcpu_dirty_limit, cancel_vcpu_dirty_limit during dirty-limit live migration when implementing the dirty-limit convergence algo.
6. add capability check to ensure auto-converge and dirty-limit are mutually exclusive.
7.
pre-check if kvm dirty ring size is configured before setting dirty-limit migrate parameter Hyman Huang(黄勇) (9): softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit" qapi/migration: Introduce x-vcpu-dirty-limit-period parameter qapi/migration: Introduce vcpu-dirty-limit parameters migration: Introduce dirty-limit capability migration: Refactor auto-converge capability logic migration: Put the detection logic before auto-converge checking migration: Implement dirty-limit convergence algorithm migration: Extend query-migrate to provide dirty-limit info tests: Add migration dirty-limit capability test include/sysemu/dirtylimit.h| 2 + migration/migration-hmp-cmds.c | 26 ++ migration/migration.c | 13 +++ migration/options.c| 73 migration/options.h| 1 + migration/ram.c| 61 ++--- migration/trace-events | 1 + qapi/migration.json| 72 +-- softmmu/dirtylimit.c | 91 +-- tests/qtest/migration-test.c | 155 + 10 files changed, 470 insertions(+), 25 deletions(-) -- 2.38.5
[PATCH QEMU v9 2/9] qapi/migration: Introduce x-vcpu-dirty-limit-period parameter
From: Hyman Huang(黄勇) Introduce "x-vcpu-dirty-limit-period" migration experimental parameter, which is in the range of 1 to 1000ms and used to make dirty page rate calculation period configurable. Currently with the "x-vcpu-dirty-limit-period" varies, the total time of live migration changes, test results show the optimal value of "x-vcpu-dirty-limit-period" ranges from 500ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made stable once it proves best value can not be determined with developer's experiments. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/migration-hmp-cmds.c | 8 migration/options.c| 28 +++ qapi/migration.json| 35 +++--- 3 files changed, 64 insertions(+), 7 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 9885d7c9f7..352e9ec716 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -364,6 +364,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) } } } + +monitor_printf(mon, "%s: %" PRIu64 " ms\n", +MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD), +params->x_vcpu_dirty_limit_period); } qapi_free_MigrationParameters(params); @@ -620,6 +624,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) error_setg(&err, "The block-bitmap-mapping parameter can only be set " "through QMP"); break; +case MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD: +p->has_x_vcpu_dirty_limit_period = true; +visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err); +break; default: assert(0); } diff --git a/migration/options.c b/migration/options.c index 5a9505adf7..1de63ba775 100644 --- a/migration/options.c +++ b/migration/options.c @@ -80,6 +80,8 @@ #define DEFINE_PROP_MIG_CAP(name, x) \ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 1000/* milliseconds */ + Property migration_properties[] = { DEFINE_PROP_BOOL("store-global-state", MigrationState, store_global_state, true), @@ -163,6 +165,9 @@ Property migration_properties[] = { DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds), DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname), DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz), +DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState, + parameters.x_vcpu_dirty_limit_period, + DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -908,6 +913,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) s->parameters.block_bitmap_mapping); } +params->has_x_vcpu_dirty_limit_period = true; +params->x_vcpu_dirty_limit_period = s->parameters.x_vcpu_dirty_limit_period; + return params; } @@ -940,6 +948,7 @@ void migrate_params_init(MigrationParameters *params) params->has_announce_max = true; params->has_announce_rounds = true; params->has_announce_step = true; +params->has_x_vcpu_dirty_limit_period = true; } /* @@ -1100,6 +1109,15 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) } #endif +if (params->has_x_vcpu_dirty_limit_period && +(params->x_vcpu_dirty_limit_period < 1 || + params->x_vcpu_dirty_limit_period > 1000)) { +error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "x-vcpu-dirty-limit-period", + "a value between 1 and 1000"); +return false; +} + return true; } @@ -1199,6 +1217,11 @@ static void migrate_params_test_apply(MigrateSetParameters *params, 
dest->has_block_bitmap_mapping = true; dest->block_bitmap_mapping = params->block_bitmap_mapping; } + +if (params->has_x_vcpu_dirty_limit_period) { +dest->x_vcpu_dirty_limit_period = +params->x_vcpu_dirty_limit_period; +} } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1317,6 +1340,11 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) QAPI_CLONE(BitmapMigrationNodeAliasList, params->block_bitmap_mapping); } + +if (params->has_x_vcpu_dirty_limit_period) { +
[PATCH QEMU v9 6/9] migration: Put the detection logic before auto-converge checking
From: Hyman Huang(黄勇) This commit is prepared for the implementation of dirty-limit convergence algo. The detection logic of throttling condition can apply to both auto-converge and dirty-limit algo, putting it's position before the checking logic for auto-converge feature. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Juan Quintela --- migration/ram.c | 21 +++-- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index f31de47a47..1d9300f4c5 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -999,17 +999,18 @@ static void migration_trigger_throttle(RAMState *rs) return; } -if (migrate_auto_converge()) { -/* The following detection logic can be refined later. For now: - Check to see if the ratio between dirtied bytes and the approx. - amount of bytes that just got transferred since the last time - we were in this routine reaches the threshold. If that happens - twice, start or increase throttling. */ - -if ((bytes_dirty_period > bytes_dirty_threshold) && -(++rs->dirty_rate_high_cnt >= 2)) { +/* + * The following detection logic can be refined later. For now: + * Check to see if the ratio between dirtied bytes and the approx. + * amount of bytes that just got transferred since the last time + * we were in this routine reaches the threshold. If that happens + * twice, start or increase throttling. + */ +if ((bytes_dirty_period > bytes_dirty_threshold) && +(++rs->dirty_rate_high_cnt >= 2)) { +rs->dirty_rate_high_cnt = 0; +if (migrate_auto_converge()) { trace_migration_throttle(); -rs->dirty_rate_high_cnt = 0; mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); } -- 2.38.5
[PATCH QEMU v9 1/9] softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit"
From: Hyman Huang(黄勇) The dirty_rate parameter of the hmp command "set_vcpu_dirty_limit" is invalid if it is less than 0, so add a parameter check for it. Note that this patch also deletes the unsolicited help message and cleans up the code. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Peter Xu Reviewed-by: Juan Quintela --- softmmu/dirtylimit.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 015a9038d1..5c12d26d49 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -515,14 +515,15 @@ void hmp_set_vcpu_dirty_limit(Monitor *mon, const QDict *qdict) int64_t cpu_index = qdict_get_try_int(qdict, "cpu_index", -1); Error *err = NULL; -qmp_set_vcpu_dirty_limit(!!(cpu_index != -1), cpu_index, dirty_rate, &err); -if (err) { -hmp_handle_error(mon, err); -return; +if (dirty_rate < 0) { +error_setg(&err, "invalid dirty page limit %ld", dirty_rate); +goto out; } -monitor_printf(mon, "[Please use 'info vcpu_dirty_limit' to query " - "dirty limit for virtual CPU]\n"); +qmp_set_vcpu_dirty_limit(!!(cpu_index != -1), cpu_index, dirty_rate, &err); + +out: +hmp_handle_error(mon, err); } static struct DirtyLimitInfo *dirtylimit_query_vcpu(int cpu_index) -- 2.38.5
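Editorial note: qmp_set_vcpu_dirty_limit() is the QAPI-generated handler for the set-vcpu-dirty-limit command; optional QAPI arguments arrive as a has_*/value pair, which is why the HMP wrapper passes `!!(cpu_index != -1)`. The assumed generated prototype, consistent with the calls quoted in this series:

```c
/* Assumed prototype of the QAPI-generated command handler: the optional
 * cpu-index argument becomes a (has_cpu_index, cpu_index) pair, and a
 * missing cpu-index means "apply the limit to all vCPUs". */
void qmp_set_vcpu_dirty_limit(bool has_cpu_index, int64_t cpu_index,
                              uint64_t dirty_rate, Error **errp);
```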
[PATCH QEMU v9 3/9] qapi/migration: Introduce vcpu-dirty-limit parameters
From: Hyman Huang(黄勇) Introduce "vcpu-dirty-limit" migration parameter used to limit dirty page rate during live migration. "vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are two dirty-limit-related migration parameters, which can be set before and during live migration by qmp migrate-set-parameters. This two parameters are used to help implement the dirty page rate limit algo of migration. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/migration-hmp-cmds.c | 8 migration/options.c| 21 + qapi/migration.json| 18 +++--- 3 files changed, 44 insertions(+), 3 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 352e9ec716..35e8020bbf 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -368,6 +368,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) monitor_printf(mon, "%s: %" PRIu64 " ms\n", MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD), params->x_vcpu_dirty_limit_period); + +monitor_printf(mon, "%s: %" PRIu64 " MB/s\n", +MigrationParameter_str(MIGRATION_PARAMETER_VCPU_DIRTY_LIMIT), +params->vcpu_dirty_limit); } qapi_free_MigrationParameters(params); @@ -628,6 +632,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) p->has_x_vcpu_dirty_limit_period = true; visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err); break; +case MIGRATION_PARAMETER_VCPU_DIRTY_LIMIT: +p->has_vcpu_dirty_limit = true; +visit_type_size(v, param, &p->vcpu_dirty_limit, &err); +break; default: assert(0); } diff --git a/migration/options.c b/migration/options.c index 1de63ba775..7d2d98830e 100644 --- a/migration/options.c +++ b/migration/options.c @@ -81,6 +81,7 @@ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) #define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 1000/* milliseconds */ +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT1 /* MB/s */ Property migration_properties[] = { DEFINE_PROP_BOOL("store-global-state", MigrationState, @@ -168,6 +169,9 @@ Property migration_properties[] = { DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState, parameters.x_vcpu_dirty_limit_period, DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD), +DEFINE_PROP_UINT64("vcpu-dirty-limit", MigrationState, + parameters.vcpu_dirty_limit, + DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -915,6 +919,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->has_x_vcpu_dirty_limit_period = true; params->x_vcpu_dirty_limit_period = s->parameters.x_vcpu_dirty_limit_period; +params->has_vcpu_dirty_limit = true; +params->vcpu_dirty_limit = s->parameters.vcpu_dirty_limit; return params; } @@ -949,6 +955,7 @@ void migrate_params_init(MigrationParameters *params) params->has_announce_rounds = true; params->has_announce_step = true; params->has_x_vcpu_dirty_limit_period = true; +params->has_vcpu_dirty_limit = true; } /* @@ -1118,6 +1125,14 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) return false; } +if (params->has_vcpu_dirty_limit && +(params->vcpu_dirty_limit < 1)) { +error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "vcpu_dirty_limit", + "is invalid, it must greater then 1 MB/s"); +return false; +} + return true; } @@ -1222,6 +1237,9 @@ static void migrate_params_test_apply(MigrateSetParameters *params, dest->x_vcpu_dirty_limit_period = params->x_vcpu_dirty_limit_period; } +if (params->has_vcpu_dirty_limit) { +dest->vcpu_dirty_limit = 
params->vcpu_dirty_limit; +} } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1345,6 +1363,9 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) s->parameters.x_vcpu_dirty_limit_period = params->x_vcpu_dirty_limit_period; } +if (params->has_vcpu_dirty_limit) { +s->parameters.vcpu_dirty_limit = params->vcpu_dirty_limit; +} } void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp) diff --git a/qapi/migration.json b/qapi/migration.json index 363055d252..7e92dfa045 100644 --- a/qapi/migration.json
[PATCH QEMU v9 7/9] migration: Implement dirty-limit convergence algorithm
From: Hyman Huang(黄勇) Implement dirty-limit convergence algorithm for live migration, which is kind of like auto-converge algo but using dirty-limit instead of cpu throttle to make migration convergent. Enable dirty page limit if dirty_rate_high_cnt greater than 2 when dirty-limit capability enabled, Disable dirty-limit if migration be canceled. Note that "set_vcpu_dirty_limit", "cancel_vcpu_dirty_limit" commands are not allowed during dirty-limit live migration. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/migration.c | 3 +++ migration/ram.c| 36 migration/trace-events | 1 + softmmu/dirtylimit.c | 29 + 4 files changed, 69 insertions(+) diff --git a/migration/migration.c b/migration/migration.c index 91bba630a8..619af62461 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -166,6 +166,9 @@ void migration_cancel(const Error *error) if (error) { migrate_set_error(current_migration, error); } +if (migrate_dirty_limit()) { +qmp_cancel_vcpu_dirty_limit(false, -1, NULL); +} migrate_fd_cancel(current_migration); } diff --git a/migration/ram.c b/migration/ram.c index 1d9300f4c5..9040d66e61 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -46,6 +46,7 @@ #include "qapi/error.h" #include "qapi/qapi-types-migration.h" #include "qapi/qapi-events-migration.h" +#include "qapi/qapi-commands-migration.h" #include "qapi/qmp/qerror.h" #include "trace.h" #include "exec/ram_addr.h" @@ -59,6 +60,8 @@ #include "multifd.h" #include "sysemu/runstate.h" #include "options.h" +#include "sysemu/dirtylimit.h" +#include "sysemu/kvm.h" #include "hw/boards.h" /* for machine_dump_guest_core() */ @@ -984,6 +987,37 @@ static void migration_update_rates(RAMState *rs, int64_t end_time) } } +/* + * Enable dirty-limit to throttle down the guest + */ +static void migration_dirty_limit_guest(void) +{ +/* + * dirty page rate quota for all vCPUs fetched from + * migration parameter 'vcpu_dirty_limit' + */ +static int64_t quota_dirtyrate; +MigrationState *s = migrate_get_current(); + +/* + * If dirty limit already enabled and migration parameter + * vcpu-dirty-limit untouched. 
+ */ +if (dirtylimit_in_service() && +quota_dirtyrate == s->parameters.vcpu_dirty_limit) { +return; +} + +quota_dirtyrate = s->parameters.vcpu_dirty_limit; + +/* + * Set all vCPU a quota dirtyrate, note that the second + * parameter will be ignored if setting all vCPU for the vm + */ +qmp_set_vcpu_dirty_limit(false, -1, quota_dirtyrate, NULL); +trace_migration_dirty_limit_guest(quota_dirtyrate); +} + static void migration_trigger_throttle(RAMState *rs) { uint64_t threshold = migrate_throttle_trigger_threshold(); @@ -1013,6 +1047,8 @@ static void migration_trigger_throttle(RAMState *rs) trace_migration_throttle(); mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); +} else if (migrate_dirty_limit()) { +migration_dirty_limit_guest(); } } } diff --git a/migration/trace-events b/migration/trace-events index 5259c1044b..580895e86e 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -93,6 +93,7 @@ migration_bitmap_sync_start(void) "" migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64 migration_bitmap_clear_dirty(char *str, uint64_t start, uint64_t size, unsigned long page) "rb %s start 0x%"PRIx64" size 0x%"PRIx64" page 0x%lx" migration_throttle(void) "" +migration_dirty_limit_guest(int64_t dirtyrate) "guest dirty page rate limit %" PRIi64 " MB/s" ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %" PRIx64 " %zx" ram_load_loop(const char *rbname, uint64_t addr, int flags, void *host) "%s: addr: 0x%" PRIx64 " flags: 0x%x host: %p" ram_load_postcopy_loop(int channel, uint64_t addr, int flags) "chan=%d addr=0x%" PRIx64 " flags=0x%x" diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 953ef934bc..5134296667 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -436,6 +436,23 @@ static void dirtylimit_cleanup(void) dirtylimit_state_finalize(); } +/* + * dirty page rate limit is not allowed to set if migration + * is running with dirty-limit capability enabled. + */ +static bool dirtylimit_is_allowed(void) +{ +MigrationState *ms = migrate_get_current(); + +if (migration_is_running(ms->state) && +(!qemu_thread_is_
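Editorial note: the hunk above is cut off inside dirtylimit_is_allowed(). Judging from its comment and from the commit message ("set_vcpu_dirty_limit", "cancel_vcpu_dirty_limit" commands are not allowed during dirty-limit live migration), the check plausibly continues along these lines; this is a sketch under those assumptions, not the actual patch text.

```c
static bool dirtylimit_is_allowed(void)
{
    MigrationState *ms = migrate_get_current();

    /* Refuse manual dirty-limit commands while a dirty-limit-driven
     * migration owns the throttle; the migration thread itself is
     * still allowed to adjust it. */
    if (migration_is_running(ms->state) &&
        (!qemu_thread_is_self(&ms->thread)) &&
        migrate_dirty_limit() &&
        dirtylimit_in_service()) {
        return false;
    }
    return true;
}
```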
[PATCH QEMU v9 4/9] migration: Introduce dirty-limit capability
From: Hyman Huang(黄勇) Introduce migration dirty-limit capability, which can be turned on before live migration and limit dirty page rate durty live migration. Introduce migrate_dirty_limit function to help check if dirty-limit capability enabled during live migration. Meanwhile, refactor vcpu_dirty_rate_stat_collect so that period can be configured instead of hardcoded. dirty-limit capability is kind of like auto-converge but using dirty limit instead of traditional cpu-throttle to throttle guest down. To enable this feature, turn on the dirty-limit capability before live migration using migrate-set-capabilities, and set the parameters "x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably to speed up convergence. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/options.c | 24 migration/options.h | 1 + qapi/migration.json | 9 - softmmu/dirtylimit.c | 12 +++- 4 files changed, 44 insertions(+), 2 deletions(-) diff --git a/migration/options.c b/migration/options.c index 7d2d98830e..631c12cf32 100644 --- a/migration/options.c +++ b/migration/options.c @@ -27,6 +27,7 @@ #include "qemu-file.h" #include "ram.h" #include "options.h" +#include "sysemu/kvm.h" /* Maximum migrate downtime set to 2000 seconds */ #define MAX_MIGRATE_DOWNTIME_SECONDS 2000 @@ -196,6 +197,8 @@ Property migration_properties[] = { #endif DEFINE_PROP_MIG_CAP("x-switchover-ack", MIGRATION_CAPABILITY_SWITCHOVER_ACK), +DEFINE_PROP_MIG_CAP("x-dirty-limit", +MIGRATION_CAPABILITY_DIRTY_LIMIT), DEFINE_PROP_END_OF_LIST(), }; @@ -242,6 +245,13 @@ bool migrate_dirty_bitmaps(void) return s->capabilities[MIGRATION_CAPABILITY_DIRTY_BITMAPS]; } +bool migrate_dirty_limit(void) +{ +MigrationState *s = migrate_get_current(); + +return s->capabilities[MIGRATION_CAPABILITY_DIRTY_LIMIT]; +} + bool migrate_events(void) { MigrationState *s = migrate_get_current(); @@ -573,6 +583,20 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) } } +if (new_caps[MIGRATION_CAPABILITY_DIRTY_LIMIT]) { +if (new_caps[MIGRATION_CAPABILITY_AUTO_CONVERGE]) { +error_setg(errp, "dirty-limit conflicts with auto-converge" + " either of then available currently"); +return false; +} + +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +error_setg(errp, "dirty-limit requires KVM with accelerator" + " property 'dirty-ring-size' set"); +return false; +} +} + return true; } diff --git a/migration/options.h b/migration/options.h index 9aaf363322..b5a950d4e4 100644 --- a/migration/options.h +++ b/migration/options.h @@ -24,6 +24,7 @@ extern Property migration_properties[]; /* capabilities */ bool migrate_auto_converge(void); +bool migrate_dirty_limit(void); bool migrate_background_snapshot(void); bool migrate_block(void); bool migrate_colo(void); diff --git a/qapi/migration.json b/qapi/migration.json index 7e92dfa045..1c289e6658 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -497,6 +497,12 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # +# @dirty-limit: If enabled, migration will throttle vCPUs as needed to +# keep their dirty page rate within @vcpu-dirty-limit. This can +# improve responsiveness of large guests during live migration, +# and can result in more stable read performance. Requires KVM +# with accelerator property "dirty-ring-size" set. (Since 8.1) +# # Features: # # @unstable: Members @x-colo and @x-ignore-shared are experimental. 
@@ -512,7 +518,8 @@ 'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate', { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] }, 'validate-uuid', 'background-snapshot', - 'zero-copy-send', 'postcopy-preempt', 'switchover-ack'] } + 'zero-copy-send', 'postcopy-preempt', 'switchover-ack', + 'dirty-limit'] } ## # @MigrationCapabilityStatus: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 5c12d26d49..953ef934bc 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -24,6 +24,9 @@ #include "hw/boards.h" #include "sysemu/kvm.h" #include "trace.h" +#include "migration/misc.h" +#include "migration/migration.h" +#include "migration/options.h" /* * Dirtylimit stop working if dirty p
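Editorial note: the dirtylimit.c hunk is truncated above. The commit message says vcpu_dirty_rate_stat_collect now takes its sampling period from the x-vcpu-dirty-limit-period parameter instead of a hardcoded value, so the refactor presumably looks roughly like the sketch below; the default constant and exact structure are assumptions.

```c
static void vcpu_dirty_rate_stat_collect(void)
{
    MigrationState *s = migrate_get_current();
    /* default sampling period when dirty-limit migration isn't driving it;
     * the actual constant name in the patch is not visible here */
    int64_t period = 1000; /* ms */

    if (migrate_dirty_limit() && migration_is_active(s)) {
        period = s->parameters.x_vcpu_dirty_limit_period;
    }

    /* ... sample per-vCPU dirty page rates over 'period' milliseconds ... */
}
```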
[PATCH QEMU v9 8/9] migration: Extend query-migrate to provide dirty-limit info
From: Hyman Huang(黄勇) Extend query-migrate to provide throttle time and estimated ring full time with dirty-limit capability enabled, through which we can observe if dirty limit take effect during live migration. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- include/sysemu/dirtylimit.h| 2 ++ migration/migration-hmp-cmds.c | 10 + migration/migration.c | 10 + qapi/migration.json| 16 +- softmmu/dirtylimit.c | 39 ++ 5 files changed, 76 insertions(+), 1 deletion(-) diff --git a/include/sysemu/dirtylimit.h b/include/sysemu/dirtylimit.h index 8d2c1f3a6b..d11edb 100644 --- a/include/sysemu/dirtylimit.h +++ b/include/sysemu/dirtylimit.h @@ -34,4 +34,6 @@ void dirtylimit_set_vcpu(int cpu_index, void dirtylimit_set_all(uint64_t quota, bool enable); void dirtylimit_vcpu_execute(CPUState *cpu); +uint64_t dirtylimit_throttle_time_per_round(void); +uint64_t dirtylimit_ring_full_time(void); #endif diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 35e8020bbf..c115ef2d23 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -190,6 +190,16 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict) info->cpu_throttle_percentage); } +if (info->has_dirty_limit_throttle_time_per_round) { +monitor_printf(mon, "dirty-limit throttle time: %" PRIu64 " us\n", + info->dirty_limit_throttle_time_per_round); +} + +if (info->has_dirty_limit_ring_full_time) { +monitor_printf(mon, "dirty-limit ring full time: %" PRIu64 " us\n", + info->dirty_limit_ring_full_time); +} + if (info->has_postcopy_blocktime) { monitor_printf(mon, "postcopy blocktime: %u\n", info->postcopy_blocktime); diff --git a/migration/migration.c b/migration/migration.c index 619af62461..3b8587c4ae 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -64,6 +64,7 @@ #include "yank_functions.h" #include "sysemu/qtest.h" #include "options.h" +#include "sysemu/dirtylimit.h" static NotifierList migration_state_notifiers = NOTIFIER_LIST_INITIALIZER(migration_state_notifiers); @@ -974,6 +975,15 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s) info->ram->dirty_pages_rate = stat64_get(&mig_stats.dirty_pages_rate); } + +if (migrate_dirty_limit() && dirtylimit_in_service()) { +info->has_dirty_limit_throttle_time_per_round = true; +info->dirty_limit_throttle_time_per_round = +dirtylimit_throttle_time_per_round(); + +info->has_dirty_limit_ring_full_time = true; +info->dirty_limit_ring_full_time = dirtylimit_ring_full_time(); +} } static void populate_disk_info(MigrationInfo *info) diff --git a/qapi/migration.json b/qapi/migration.json index 1c289e6658..8740ce268c 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -250,6 +250,18 @@ # blocked. Present and non-empty when migration is blocked. # (since 6.0) # +# @dirty-limit-throttle-time-per-round: Maximum throttle time +# (in microseconds) of virtual CPUs each dirty ring full round, +# which shows how MigrationCapability dirty-limit affects the +# guest during live migration. (since 8.1) +# +# @dirty-limit-ring-full-time: Estimated average dirty ring full +# time (in microseconds) each dirty ring full round. The value +# equals dirty ring memory size divided by average dirty page +# rate of the virtual CPU, which can be used to observe the +# average memory load of the virtual CPU indirectly. Note that +# zero means guest doesn't dirty memory. 
(since 8.1) +# # Since: 0.14 ## { 'struct': 'MigrationInfo', @@ -267,7 +279,9 @@ '*postcopy-blocktime' : 'uint32', '*postcopy-vcpu-blocktime': ['uint32'], '*compression': 'CompressionStats', - '*socket-address': ['SocketAddress'] } } + '*socket-address': ['SocketAddress'], + '*dirty-limit-throttle-time-per-round': 'uint64', + '*dirty-limit-ring-full-time': 'uint64'} } ## # @query-migrate: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 5134296667..a0686323e5 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -565,6 +565,45 @@ out: hmp_handle_error(mon, err); } +/* Return the max throttle time of each virtual CPU */ +uint64_t dirtylimit_throttle_time_per_round(void) +{ +CPUState *cpu; +
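Editorial note: the quoted helper is cut off after its declaration. Based on the "Return the max throttle time of each virtual CPU" comment, the QAPI documentation in this same patch, and the throttle_us_per_full CPUState field introduced earlier in this series, a plausible completion is sketched below; the exact body is an assumption.

```c
/* Report the largest per-vCPU throttle time of the last
 * dirty-ring-full round (max-over-vCPUs semantics per the QAPI docs). */
uint64_t dirtylimit_throttle_time_per_round(void)
{
    CPUState *cpu;
    uint64_t max = 0;

    CPU_FOREACH(cpu) {
        if (cpu->throttle_us_per_full > max) {
            max = cpu->throttle_us_per_full;
        }
    }

    return max;
}
```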
[PATCH QEMU v9 5/9] migration: Refactor auto-converge capability logic
From: Hyman Huang(黄勇) Check if block migration is running before throttling guest down in auto-converge way. Note that this modification is kind of like code clean, because block migration does not depend on auto-converge capability, so the order of checks can be adjusted. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/ram.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/migration/ram.c b/migration/ram.c index 0ada6477e8..f31de47a47 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -995,7 +995,11 @@ static void migration_trigger_throttle(RAMState *rs) /* During block migration the auto-converge logic incorrectly detects * that ram migration makes no progress. Avoid this by disabling the * throttling logic during the bulk phase of block migration. */ -if (migrate_auto_converge() && !blk_mig_bulk_active()) { +if (blk_mig_bulk_active()) { +return; +} + +if (migrate_auto_converge()) { /* The following detection logic can be refined later. For now: Check to see if the ratio between dirtied bytes and the approx. amount of bytes that just got transferred since the last time -- 2.38.5
[PATCH QEMU v9 9/9] tests: Add migration dirty-limit capability test
From: Hyman Huang(黄勇) Add migration dirty-limit capability test if kernel support dirty ring. Migration dirty-limit capability introduce dirty limit capability, two parameters: x-vcpu-dirty-limit-period and vcpu-dirty-limit are introduced to implement the live migration with dirty limit. The test case does the following things: 1. start src, dst vm and enable dirty-limit capability 2. start migrate and set cancel it to check if dirty limit stop working. 3. restart dst vm 4. start migrate and enable dirty-limit capability 5. check if migration satisfy the convergence condition during pre-switchover phase. Signed-off-by: Hyman Huang(黄勇) --- tests/qtest/migration-test.c | 155 +++ 1 file changed, 155 insertions(+) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index e256da1216..e6f77d176c 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -2743,6 +2743,159 @@ static void test_vcpu_dirty_limit(void) dirtylimit_stop_vm(vm); } +static void migrate_dirty_limit_wait_showup(QTestState *from, +const int64_t period, +const int64_t value) +{ +/* Enable dirty limit capability */ +migrate_set_capability(from, "dirty-limit", true); + +/* Set dirty limit parameters */ +migrate_set_parameter_int(from, "x-vcpu-dirty-limit-period", period); +migrate_set_parameter_int(from, "vcpu-dirty-limit", value); + +/* Make sure migrate can't converge */ +migrate_ensure_non_converge(from); + +/* To check limit rate after precopy */ +migrate_set_capability(from, "pause-before-switchover", true); + +/* Wait for the serial output from the source */ +wait_for_serial("src_serial"); +} + +/* + * This test does: + * source target + * migrate_incoming + * migrate + * migrate_cancel + * restart target + * migrate + * + * And see that if dirty limit works correctly + */ +static void test_migrate_dirty_limit(void) +{ +g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); +QTestState *from, *to; +int64_t remaining; +uint64_t throttle_us_per_full; +/* + * We want the test to be stable and as fast as possible. + * E.g., with 1Gb/s bandwith migration may pass without dirty limit, + * so we need to decrease a bandwidth. + */ +const int64_t dirtylimit_period = 1000, dirtylimit_value = 50; +const int64_t max_bandwidth = 4; /* ~400Mb/s */ +const int64_t downtime_limit = 250; /* 250ms */ +/* + * We migrate through unix-socket (> 500Mb/s). + * Thus, expected migration speed ~= bandwidth limit (< 500Mb/s). 
+ * So, we can predict expected_threshold + */ +const int64_t expected_threshold = max_bandwidth * downtime_limit / 1000; +int max_try_count = 10; +MigrateCommon args = { +.start = { +.hide_stderr = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Start src, dst vm */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Prepare for dirty limit migration and wait src vm show up */ +migrate_dirty_limit_wait_showup(from, dirtylimit_period, dirtylimit_value); + +/* Start migrate */ +migrate_qmp(from, uri, "{}"); + +/* Wait for dirty limit throttle begin */ +throttle_us_per_full = 0; +while (throttle_us_per_full == 0) { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} + +/* Now cancel migrate and wait for dirty limit throttle switch off */ +migrate_cancel(from); +wait_for_migration_status(from, "cancelled", NULL); + +/* Check if dirty limit throttle switched off, set timeout 1ms */ +do { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} while (throttle_us_per_full != 0 && --max_try_count); + +/* Assert dirty limit is not in service */ +g_assert_cmpint(throttle_us_per_full, ==, 0); + +args = (MigrateCommon) { +.start = { +.only_target = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Restart dst vm, src vm already show up so we needn't wait anymore */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Start migrate */ +migrate_qmp(from, uri, "{}&
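Editorial note: the test body is cut off after the second migrate_qmp(). Based on the declared remaining/expected_threshold variables and the commit message's "check if migration satisfy the convergence condition during pre-switchover phase", the tail of the test plausibly looks like the sketch below; helper names such as read_ram_property_int and migrate_continue follow the conventions of migration-test.c and are assumptions here.

```c
    /* Bound bandwidth and downtime so the convergence check is meaningful */
    migrate_set_parameter_int(from, "max-bandwidth", max_bandwidth);
    migrate_set_parameter_int(from, "downtime-limit", downtime_limit);

    /* Wait for dirty-limit throttling to kick in again */
    throttle_us_per_full = 0;
    while (throttle_us_per_full == 0) {
        throttle_us_per_full =
            read_migrate_property_int(from, "dirty-limit-throttle-time-per-round");
        usleep(100);
        g_assert_false(got_src_stop);
    }

    /* Once pre-switchover is reached, the remaining RAM should fit into
     * max_bandwidth * downtime_limit, i.e. expected_threshold */
    wait_for_migration_status(from, "pre-switchover", NULL);
    remaining = read_ram_property_int(from, "remaining");
    g_assert_cmpint(remaining, <,
                    (expected_threshold + expected_threshold / 100));

    migrate_continue(from, "pre-switchover");
    wait_for_serial("dest_serial");
    wait_for_migration_complete(from);
    test_migrate_end(from, to, true);
}
```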
[PATCH QEMU v10 5/9] migration: Refactor auto-converge capability logic
From: Hyman Huang(黄勇) Check if block migration is running before throttling guest down in auto-converge way. Note that this modification is kind of like code clean, because block migration does not depend on auto-converge capability, so the order of checks can be adjusted. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/ram.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/migration/ram.c b/migration/ram.c index 0ada6477e8..f31de47a47 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -995,7 +995,11 @@ static void migration_trigger_throttle(RAMState *rs) /* During block migration the auto-converge logic incorrectly detects * that ram migration makes no progress. Avoid this by disabling the * throttling logic during the bulk phase of block migration. */ -if (migrate_auto_converge() && !blk_mig_bulk_active()) { +if (blk_mig_bulk_active()) { +return; +} + +if (migrate_auto_converge()) { /* The following detection logic can be refined later. For now: Check to see if the ratio between dirtied bytes and the approx. amount of bytes that just got transferred since the last time -- 2.38.5
[PATCH QEMU v10 4/9] migration: Introduce dirty-limit capability
From: Hyman Huang(黄勇) Introduce migration dirty-limit capability, which can be turned on before live migration and limit dirty page rate durty live migration. Introduce migrate_dirty_limit function to help check if dirty-limit capability enabled during live migration. Meanwhile, refactor vcpu_dirty_rate_stat_collect so that period can be configured instead of hardcoded. dirty-limit capability is kind of like auto-converge but using dirty limit instead of traditional cpu-throttle to throttle guest down. To enable this feature, turn on the dirty-limit capability before live migration using migrate-set-capabilities, and set the parameters "x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably to speed up convergence. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/options.c | 24 migration/options.h | 1 + qapi/migration.json | 9 - softmmu/dirtylimit.c | 12 +++- 4 files changed, 44 insertions(+), 2 deletions(-) diff --git a/migration/options.c b/migration/options.c index 7d2d98830e..631c12cf32 100644 --- a/migration/options.c +++ b/migration/options.c @@ -27,6 +27,7 @@ #include "qemu-file.h" #include "ram.h" #include "options.h" +#include "sysemu/kvm.h" /* Maximum migrate downtime set to 2000 seconds */ #define MAX_MIGRATE_DOWNTIME_SECONDS 2000 @@ -196,6 +197,8 @@ Property migration_properties[] = { #endif DEFINE_PROP_MIG_CAP("x-switchover-ack", MIGRATION_CAPABILITY_SWITCHOVER_ACK), +DEFINE_PROP_MIG_CAP("x-dirty-limit", +MIGRATION_CAPABILITY_DIRTY_LIMIT), DEFINE_PROP_END_OF_LIST(), }; @@ -242,6 +245,13 @@ bool migrate_dirty_bitmaps(void) return s->capabilities[MIGRATION_CAPABILITY_DIRTY_BITMAPS]; } +bool migrate_dirty_limit(void) +{ +MigrationState *s = migrate_get_current(); + +return s->capabilities[MIGRATION_CAPABILITY_DIRTY_LIMIT]; +} + bool migrate_events(void) { MigrationState *s = migrate_get_current(); @@ -573,6 +583,20 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) } } +if (new_caps[MIGRATION_CAPABILITY_DIRTY_LIMIT]) { +if (new_caps[MIGRATION_CAPABILITY_AUTO_CONVERGE]) { +error_setg(errp, "dirty-limit conflicts with auto-converge" + " either of then available currently"); +return false; +} + +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +error_setg(errp, "dirty-limit requires KVM with accelerator" + " property 'dirty-ring-size' set"); +return false; +} +} + return true; } diff --git a/migration/options.h b/migration/options.h index 9aaf363322..b5a950d4e4 100644 --- a/migration/options.h +++ b/migration/options.h @@ -24,6 +24,7 @@ extern Property migration_properties[]; /* capabilities */ bool migrate_auto_converge(void); +bool migrate_dirty_limit(void); bool migrate_background_snapshot(void); bool migrate_block(void); bool migrate_colo(void); diff --git a/qapi/migration.json b/qapi/migration.json index 535fc27403..b4d9100ef3 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -497,6 +497,12 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # +# @dirty-limit: If enabled, migration will throttle vCPUs as needed to +# keep their dirty page rate within @vcpu-dirty-limit. This can +# improve responsiveness of large guests during live migration, +# and can result in more stable read performance. Requires KVM +# with accelerator property "dirty-ring-size" set. (Since 8.2) +# # Features: # # @unstable: Members @x-colo and @x-ignore-shared are experimental. 
@@ -512,7 +518,8 @@ 'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate', { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] }, 'validate-uuid', 'background-snapshot', - 'zero-copy-send', 'postcopy-preempt', 'switchover-ack'] } + 'zero-copy-send', 'postcopy-preempt', 'switchover-ack', + 'dirty-limit'] } ## # @MigrationCapabilityStatus: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 5c12d26d49..953ef934bc 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -24,6 +24,9 @@ #include "hw/boards.h" #include "sysemu/kvm.h" #include "trace.h" +#include "migration/misc.h" +#include "migration/migration.h" +#include "migration/options.h" /* * Dirtylimit stop working if dirty p
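For illustration only (not part of the patch), a minimal QMP sequence to use the new capability could look like the following; the 500 ms period and 50 MB/s limit are made-up example values, and the guest must have been started with the KVM dirty ring enabled (e.g. "-accel kvm,dirty-ring-size=4096"), otherwise the capability check added above rejects the request:

    {"execute": "migrate-set-capabilities",
     "arguments": {"capabilities": [
         {"capability": "dirty-limit", "state": true}]}}
    {"execute": "migrate-set-parameters",
     "arguments": {"x-vcpu-dirty-limit-period": 500,
                   "vcpu-dirty-limit": 50}}
    {"execute": "migrate", "arguments": {"uri": "unix:/tmp/migsocket"}}

Enabling "auto-converge" together with "dirty-limit" is likewise rejected by the migrate_caps_check() hunk above, since the two throttling methods are mutually exclusive.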
[PATCH QEMU v10 9/9] tests: Add migration dirty-limit capability test
From: Hyman Huang(黄勇) Add migration dirty-limit capability test if kernel support dirty ring. Migration dirty-limit capability introduce dirty limit capability, two parameters: x-vcpu-dirty-limit-period and vcpu-dirty-limit are introduced to implement the live migration with dirty limit. The test case does the following things: 1. start src, dst vm and enable dirty-limit capability 2. start migrate and set cancel it to check if dirty limit stop working. 3. restart dst vm 4. start migrate and enable dirty-limit capability 5. check if migration satisfy the convergence condition during pre-switchover phase. Signed-off-by: Hyman Huang(黄勇) --- tests/qtest/migration-test.c | 155 +++ 1 file changed, 155 insertions(+) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index e256da1216..e6f77d176c 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -2743,6 +2743,159 @@ static void test_vcpu_dirty_limit(void) dirtylimit_stop_vm(vm); } +static void migrate_dirty_limit_wait_showup(QTestState *from, +const int64_t period, +const int64_t value) +{ +/* Enable dirty limit capability */ +migrate_set_capability(from, "dirty-limit", true); + +/* Set dirty limit parameters */ +migrate_set_parameter_int(from, "x-vcpu-dirty-limit-period", period); +migrate_set_parameter_int(from, "vcpu-dirty-limit", value); + +/* Make sure migrate can't converge */ +migrate_ensure_non_converge(from); + +/* To check limit rate after precopy */ +migrate_set_capability(from, "pause-before-switchover", true); + +/* Wait for the serial output from the source */ +wait_for_serial("src_serial"); +} + +/* + * This test does: + * source target + * migrate_incoming + * migrate + * migrate_cancel + * restart target + * migrate + * + * And see that if dirty limit works correctly + */ +static void test_migrate_dirty_limit(void) +{ +g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); +QTestState *from, *to; +int64_t remaining; +uint64_t throttle_us_per_full; +/* + * We want the test to be stable and as fast as possible. + * E.g., with 1Gb/s bandwith migration may pass without dirty limit, + * so we need to decrease a bandwidth. + */ +const int64_t dirtylimit_period = 1000, dirtylimit_value = 50; +const int64_t max_bandwidth = 4; /* ~400Mb/s */ +const int64_t downtime_limit = 250; /* 250ms */ +/* + * We migrate through unix-socket (> 500Mb/s). + * Thus, expected migration speed ~= bandwidth limit (< 500Mb/s). 
+ * So, we can predict expected_threshold + */ +const int64_t expected_threshold = max_bandwidth * downtime_limit / 1000; +int max_try_count = 10; +MigrateCommon args = { +.start = { +.hide_stderr = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Start src, dst vm */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Prepare for dirty limit migration and wait src vm show up */ +migrate_dirty_limit_wait_showup(from, dirtylimit_period, dirtylimit_value); + +/* Start migrate */ +migrate_qmp(from, uri, "{}"); + +/* Wait for dirty limit throttle begin */ +throttle_us_per_full = 0; +while (throttle_us_per_full == 0) { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} + +/* Now cancel migrate and wait for dirty limit throttle switch off */ +migrate_cancel(from); +wait_for_migration_status(from, "cancelled", NULL); + +/* Check if dirty limit throttle switched off, set timeout 1ms */ +do { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} while (throttle_us_per_full != 0 && --max_try_count); + +/* Assert dirty limit is not in service */ +g_assert_cmpint(throttle_us_per_full, ==, 0); + +args = (MigrateCommon) { +.start = { +.only_target = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Restart dst vm, src vm already show up so we needn't wait anymore */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Start migrate */ +migrate_qmp(from, uri, "{}&
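As a rough usage sketch (assuming an in-tree build directory and a KVM host with dirty ring support; as the commit message notes, the case is only registered when the kernel supports dirty ring), the new case runs like any other migration qtest:

    $ cd build
    $ QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test

The usual glib test flags (e.g. -l to list the registered test paths, -p to run a single one) apply as well.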
[PATCH QEMU v10 8/9] migration: Extend query-migrate to provide dirty-limit info
From: Hyman Huang(黄勇) Extend query-migrate to provide throttle time and estimated ring full time with dirty-limit capability enabled, through which we can observe if dirty limit take effect during live migration. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- include/sysemu/dirtylimit.h| 2 ++ migration/migration-hmp-cmds.c | 10 + migration/migration.c | 10 + qapi/migration.json| 16 +- softmmu/dirtylimit.c | 39 ++ 5 files changed, 76 insertions(+), 1 deletion(-) diff --git a/include/sysemu/dirtylimit.h b/include/sysemu/dirtylimit.h index 8d2c1f3a6b..d11edb 100644 --- a/include/sysemu/dirtylimit.h +++ b/include/sysemu/dirtylimit.h @@ -34,4 +34,6 @@ void dirtylimit_set_vcpu(int cpu_index, void dirtylimit_set_all(uint64_t quota, bool enable); void dirtylimit_vcpu_execute(CPUState *cpu); +uint64_t dirtylimit_throttle_time_per_round(void); +uint64_t dirtylimit_ring_full_time(void); #endif diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 35e8020bbf..c115ef2d23 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -190,6 +190,16 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict) info->cpu_throttle_percentage); } +if (info->has_dirty_limit_throttle_time_per_round) { +monitor_printf(mon, "dirty-limit throttle time: %" PRIu64 " us\n", + info->dirty_limit_throttle_time_per_round); +} + +if (info->has_dirty_limit_ring_full_time) { +monitor_printf(mon, "dirty-limit ring full time: %" PRIu64 " us\n", + info->dirty_limit_ring_full_time); +} + if (info->has_postcopy_blocktime) { monitor_printf(mon, "postcopy blocktime: %u\n", info->postcopy_blocktime); diff --git a/migration/migration.c b/migration/migration.c index 619af62461..3b8587c4ae 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -64,6 +64,7 @@ #include "yank_functions.h" #include "sysemu/qtest.h" #include "options.h" +#include "sysemu/dirtylimit.h" static NotifierList migration_state_notifiers = NOTIFIER_LIST_INITIALIZER(migration_state_notifiers); @@ -974,6 +975,15 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s) info->ram->dirty_pages_rate = stat64_get(&mig_stats.dirty_pages_rate); } + +if (migrate_dirty_limit() && dirtylimit_in_service()) { +info->has_dirty_limit_throttle_time_per_round = true; +info->dirty_limit_throttle_time_per_round = +dirtylimit_throttle_time_per_round(); + +info->has_dirty_limit_ring_full_time = true; +info->dirty_limit_ring_full_time = dirtylimit_ring_full_time(); +} } static void populate_disk_info(MigrationInfo *info) diff --git a/qapi/migration.json b/qapi/migration.json index b4d9100ef3..7bf4b30614 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -250,6 +250,18 @@ # blocked. Present and non-empty when migration is blocked. # (since 6.0) # +# @dirty-limit-throttle-time-per-round: Maximum throttle time +# (in microseconds) of virtual CPUs each dirty ring full round, +# which shows how MigrationCapability dirty-limit affects the +# guest during live migration. (Since 8.2) +# +# @dirty-limit-ring-full-time: Estimated average dirty ring full +# time (in microseconds) each dirty ring full round. The value +# equals dirty ring memory size divided by average dirty page +# rate of the virtual CPU, which can be used to observe the +# average memory load of the virtual CPU indirectly. Note that +# zero means guest doesn't dirty memory. 
(Since 8.2) +# # Since: 0.14 ## { 'struct': 'MigrationInfo', @@ -267,7 +279,9 @@ '*postcopy-blocktime' : 'uint32', '*postcopy-vcpu-blocktime': ['uint32'], '*compression': 'CompressionStats', - '*socket-address': ['SocketAddress'] } } + '*socket-address': ['SocketAddress'], + '*dirty-limit-throttle-time-per-round': 'uint64', + '*dirty-limit-ring-full-time': 'uint64'} } ## # @query-migrate: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 5134296667..a0686323e5 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -565,6 +565,45 @@ out: hmp_handle_error(mon, err); } +/* Return the max throttle time of each virtual CPU */ +uint64_t dirtylimit_throttle_time_per_round(void) +{ +CPUState *cpu; +
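To make the new output concrete (all numbers below are invented for the example), a query-migrate reply during a dirty-limit migration gains the two new members, and "info migrate" prints the corresponding lines added to migration-hmp-cmds.c above:

    {"execute": "query-migrate"}
    {"return": {"status": "active",
                "dirty-limit-throttle-time-per-round": 12000,
                "dirty-limit-ring-full-time": 80000,
                ...}}

    (qemu) info migrate
    dirty-limit throttle time: 12000 us
    dirty-limit ring full time: 80000 us

As a worked illustration of the second value: with a 4096-entry per-vCPU dirty ring (4 KiB pages, so roughly 16 MiB tracked per round) and an average dirty page rate around 200 MB/s, a full round takes about 16/200 s, i.e. on the order of 80000 us.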
[PATCH QEMU v10 1/9] softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit"
From: Hyman Huang(黄勇) The dirty_rate parameter of the hmp command "set_vcpu_dirty_limit" is invalid if less than 0, so add a parameter check for it. Note that this patch also deletes the unsolicited help message and cleans up the code. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Peter Xu Reviewed-by: Juan Quintela --- softmmu/dirtylimit.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 015a9038d1..5c12d26d49 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -515,14 +515,15 @@ void hmp_set_vcpu_dirty_limit(Monitor *mon, const QDict *qdict) int64_t cpu_index = qdict_get_try_int(qdict, "cpu_index", -1); Error *err = NULL; -qmp_set_vcpu_dirty_limit(!!(cpu_index != -1), cpu_index, dirty_rate, &err); -if (err) { -hmp_handle_error(mon, err); -return; +if (dirty_rate < 0) { +error_setg(&err, "invalid dirty page limit %ld", dirty_rate); +goto out; } -monitor_printf(mon, "[Please use 'info vcpu_dirty_limit' to query " - "dirty limit for virtual CPU]\n"); +qmp_set_vcpu_dirty_limit(!!(cpu_index != -1), cpu_index, dirty_rate, &err); + +out: +hmp_handle_error(mon, err); } static struct DirtyLimitInfo *dirtylimit_query_vcpu(int cpu_index) -- 2.38.5
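A short HMP session showing the new check, assuming the usual "set_vcpu_dirty_limit <dirty_rate> [cpu_index]" syntax; the 200 MB/s figure is only an example:

    (qemu) set_vcpu_dirty_limit -1
    Error: invalid dirty page limit -1
    (qemu) set_vcpu_dirty_limit 200
    (qemu) set_vcpu_dirty_limit 200 0

The second command limits every vCPU to 200 MB/s, the third only vCPU 0; with the patch applied a negative rate is rejected up front, and the unconditional hint about 'info vcpu_dirty_limit' is no longer printed on success.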
[PATCH QEMU v10 7/9] migration: Implement dirty-limit convergence algorithm
From: Hyman Huang(黄勇) Implement dirty-limit convergence algorithm for live migration, which is kind of like auto-converge algo but using dirty-limit instead of cpu throttle to make migration convergent. Enable dirty page limit if dirty_rate_high_cnt greater than 2 when dirty-limit capability enabled, Disable dirty-limit if migration be canceled. Note that "set_vcpu_dirty_limit", "cancel_vcpu_dirty_limit" commands are not allowed during dirty-limit live migration. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/migration.c | 3 +++ migration/ram.c| 36 migration/trace-events | 1 + softmmu/dirtylimit.c | 29 + 4 files changed, 69 insertions(+) diff --git a/migration/migration.c b/migration/migration.c index 91bba630a8..619af62461 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -166,6 +166,9 @@ void migration_cancel(const Error *error) if (error) { migrate_set_error(current_migration, error); } +if (migrate_dirty_limit()) { +qmp_cancel_vcpu_dirty_limit(false, -1, NULL); +} migrate_fd_cancel(current_migration); } diff --git a/migration/ram.c b/migration/ram.c index 1d9300f4c5..9040d66e61 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -46,6 +46,7 @@ #include "qapi/error.h" #include "qapi/qapi-types-migration.h" #include "qapi/qapi-events-migration.h" +#include "qapi/qapi-commands-migration.h" #include "qapi/qmp/qerror.h" #include "trace.h" #include "exec/ram_addr.h" @@ -59,6 +60,8 @@ #include "multifd.h" #include "sysemu/runstate.h" #include "options.h" +#include "sysemu/dirtylimit.h" +#include "sysemu/kvm.h" #include "hw/boards.h" /* for machine_dump_guest_core() */ @@ -984,6 +987,37 @@ static void migration_update_rates(RAMState *rs, int64_t end_time) } } +/* + * Enable dirty-limit to throttle down the guest + */ +static void migration_dirty_limit_guest(void) +{ +/* + * dirty page rate quota for all vCPUs fetched from + * migration parameter 'vcpu_dirty_limit' + */ +static int64_t quota_dirtyrate; +MigrationState *s = migrate_get_current(); + +/* + * If dirty limit already enabled and migration parameter + * vcpu-dirty-limit untouched. 
+ */ +if (dirtylimit_in_service() && +quota_dirtyrate == s->parameters.vcpu_dirty_limit) { +return; +} + +quota_dirtyrate = s->parameters.vcpu_dirty_limit; + +/* + * Set all vCPU a quota dirtyrate, note that the second + * parameter will be ignored if setting all vCPU for the vm + */ +qmp_set_vcpu_dirty_limit(false, -1, quota_dirtyrate, NULL); +trace_migration_dirty_limit_guest(quota_dirtyrate); +} + static void migration_trigger_throttle(RAMState *rs) { uint64_t threshold = migrate_throttle_trigger_threshold(); @@ -1013,6 +1047,8 @@ static void migration_trigger_throttle(RAMState *rs) trace_migration_throttle(); mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); +} else if (migrate_dirty_limit()) { +migration_dirty_limit_guest(); } } } diff --git a/migration/trace-events b/migration/trace-events index 5259c1044b..580895e86e 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -93,6 +93,7 @@ migration_bitmap_sync_start(void) "" migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64 migration_bitmap_clear_dirty(char *str, uint64_t start, uint64_t size, unsigned long page) "rb %s start 0x%"PRIx64" size 0x%"PRIx64" page 0x%lx" migration_throttle(void) "" +migration_dirty_limit_guest(int64_t dirtyrate) "guest dirty page rate limit %" PRIi64 " MB/s" ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %" PRIx64 " %zx" ram_load_loop(const char *rbname, uint64_t addr, int flags, void *host) "%s: addr: 0x%" PRIx64 " flags: 0x%x host: %p" ram_load_postcopy_loop(int channel, uint64_t addr, int flags) "chan=%d addr=0x%" PRIx64 " flags=0x%x" diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 953ef934bc..5134296667 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -436,6 +436,23 @@ static void dirtylimit_cleanup(void) dirtylimit_state_finalize(); } +/* + * dirty page rate limit is not allowed to set if migration + * is running with dirty-limit capability enabled. + */ +static bool dirtylimit_is_allowed(void) +{ +MigrationState *ms = migrate_get_current(); + +if (migration_is_running(ms->state) && +(!qemu_thread_is_
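One practical consequence of the dirtylimit_is_allowed() helper above is that manual dirty-limit manipulation is refused while a dirty-limit migration is in flight; a hedged QMP sketch (the error description below is a placeholder, the real text comes from dirtylimit.c):

    {"execute": "cancel-vcpu-dirty-limit"}
    {"error": {"class": "GenericError",
               "desc": "<rejected because dirty-limit migration is running>"}}

When the migration is cancelled, migration_cancel() itself calls qmp_cancel_vcpu_dirty_limit() (see the migration.c hunk above), so the per-vCPU limit does not outlive the cancelled migration.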
[PATCH QEMU v10 3/9] qapi/migration: Introduce vcpu-dirty-limit parameters
From: Hyman Huang(黄勇) Introduce "vcpu-dirty-limit" migration parameter used to limit dirty page rate during live migration. "vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are two dirty-limit-related migration parameters, which can be set before and during live migration by qmp migrate-set-parameters. This two parameters are used to help implement the dirty page rate limit algo of migration. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/migration-hmp-cmds.c | 8 migration/options.c| 21 + qapi/migration.json| 18 +++--- 3 files changed, 44 insertions(+), 3 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 352e9ec716..35e8020bbf 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -368,6 +368,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) monitor_printf(mon, "%s: %" PRIu64 " ms\n", MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD), params->x_vcpu_dirty_limit_period); + +monitor_printf(mon, "%s: %" PRIu64 " MB/s\n", +MigrationParameter_str(MIGRATION_PARAMETER_VCPU_DIRTY_LIMIT), +params->vcpu_dirty_limit); } qapi_free_MigrationParameters(params); @@ -628,6 +632,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) p->has_x_vcpu_dirty_limit_period = true; visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err); break; +case MIGRATION_PARAMETER_VCPU_DIRTY_LIMIT: +p->has_vcpu_dirty_limit = true; +visit_type_size(v, param, &p->vcpu_dirty_limit, &err); +break; default: assert(0); } diff --git a/migration/options.c b/migration/options.c index 1de63ba775..7d2d98830e 100644 --- a/migration/options.c +++ b/migration/options.c @@ -81,6 +81,7 @@ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) #define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 1000/* milliseconds */ +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT1 /* MB/s */ Property migration_properties[] = { DEFINE_PROP_BOOL("store-global-state", MigrationState, @@ -168,6 +169,9 @@ Property migration_properties[] = { DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState, parameters.x_vcpu_dirty_limit_period, DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD), +DEFINE_PROP_UINT64("vcpu-dirty-limit", MigrationState, + parameters.vcpu_dirty_limit, + DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -915,6 +919,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->has_x_vcpu_dirty_limit_period = true; params->x_vcpu_dirty_limit_period = s->parameters.x_vcpu_dirty_limit_period; +params->has_vcpu_dirty_limit = true; +params->vcpu_dirty_limit = s->parameters.vcpu_dirty_limit; return params; } @@ -949,6 +955,7 @@ void migrate_params_init(MigrationParameters *params) params->has_announce_rounds = true; params->has_announce_step = true; params->has_x_vcpu_dirty_limit_period = true; +params->has_vcpu_dirty_limit = true; } /* @@ -1118,6 +1125,14 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) return false; } +if (params->has_vcpu_dirty_limit && +(params->vcpu_dirty_limit < 1)) { +error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "vcpu_dirty_limit", + "is invalid, it must greater then 1 MB/s"); +return false; +} + return true; } @@ -1222,6 +1237,9 @@ static void migrate_params_test_apply(MigrateSetParameters *params, dest->x_vcpu_dirty_limit_period = params->x_vcpu_dirty_limit_period; } +if (params->has_vcpu_dirty_limit) { +dest->vcpu_dirty_limit = 
params->vcpu_dirty_limit; +} } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1345,6 +1363,9 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) s->parameters.x_vcpu_dirty_limit_period = params->x_vcpu_dirty_limit_period; } +if (params->has_vcpu_dirty_limit) { +s->parameters.vcpu_dirty_limit = params->vcpu_dirty_limit; +} } void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp) diff --git a/qapi/migration.json b/qapi/migration.json index 16ba4e78df..535fc27403 100644 --- a/qapi/migration.json
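With the HMP plumbing added above, the new parameter can be set and inspected like any other migration parameter; the 50 MB/s value is purely illustrative:

    (qemu) migrate_set_parameter vcpu-dirty-limit 50
    (qemu) info migrate_parameters
    vcpu-dirty-limit: 50 MB/s

Values below 1 MB/s are rejected by the check added to migrate_params_check(), and the default of 1 MB/s comes from DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT.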
[PATCH QEMU v10 2/9] qapi/migration: Introduce x-vcpu-dirty-limit-period parameter
From: Hyman Huang(黄勇) Introduce "x-vcpu-dirty-limit-period" migration experimental parameter, which is in the range of 1 to 1000ms and used to make dirty page rate calculation period configurable. Currently, as the "x-vcpu-dirty-limit-period" varies, the total time of live migration changes. Test results show the optimal value of "x-vcpu-dirty-limit-period" ranges from 500ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made stable once it proves best value can not be determined with developer's experiments. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/migration-hmp-cmds.c | 8 migration/options.c| 28 +++ qapi/migration.json| 35 +++--- 3 files changed, 64 insertions(+), 7 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 9885d7c9f7..352e9ec716 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -364,6 +364,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) } } } + +monitor_printf(mon, "%s: %" PRIu64 " ms\n", +MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD), +params->x_vcpu_dirty_limit_period); } qapi_free_MigrationParameters(params); @@ -620,6 +624,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) error_setg(&err, "The block-bitmap-mapping parameter can only be set " "through QMP"); break; +case MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD: +p->has_x_vcpu_dirty_limit_period = true; +visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err); +break; default: assert(0); } diff --git a/migration/options.c b/migration/options.c index 5a9505adf7..1de63ba775 100644 --- a/migration/options.c +++ b/migration/options.c @@ -80,6 +80,8 @@ #define DEFINE_PROP_MIG_CAP(name, x) \ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 1000/* milliseconds */ + Property migration_properties[] = { DEFINE_PROP_BOOL("store-global-state", MigrationState, store_global_state, true), @@ -163,6 +165,9 @@ Property migration_properties[] = { DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds), DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname), DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz), +DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState, + parameters.x_vcpu_dirty_limit_period, + DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -908,6 +913,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) s->parameters.block_bitmap_mapping); } +params->has_x_vcpu_dirty_limit_period = true; +params->x_vcpu_dirty_limit_period = s->parameters.x_vcpu_dirty_limit_period; + return params; } @@ -940,6 +948,7 @@ void migrate_params_init(MigrationParameters *params) params->has_announce_max = true; params->has_announce_rounds = true; params->has_announce_step = true; +params->has_x_vcpu_dirty_limit_period = true; } /* @@ -1100,6 +1109,15 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) } #endif +if (params->has_x_vcpu_dirty_limit_period && +(params->x_vcpu_dirty_limit_period < 1 || + params->x_vcpu_dirty_limit_period > 1000)) { +error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "x-vcpu-dirty-limit-period", + "a value between 1 and 1000"); +return false; +} + return true; } @@ -1199,6 +1217,11 @@ static void migrate_params_test_apply(MigrateSetParameters *params, 
dest->has_block_bitmap_mapping = true; dest->block_bitmap_mapping = params->block_bitmap_mapping; } + +if (params->has_x_vcpu_dirty_limit_period) { +dest->x_vcpu_dirty_limit_period = +params->x_vcpu_dirty_limit_period; +} } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1317,6 +1340,11 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) QAPI_CLONE(BitmapMigrationNodeAliasList, params->block_bitmap_mapping); } + +if (params->has_x_vcpu_dirty_limit_period) { +s->
[PATCH QEMU v10 0/9] migration: introduce dirtylimit capability
Hi, Juan, Markus and i has crafted docs for the series, please use the latest version to make a pull request if it is convenient to you. No functional changes since v6. Thanks. Yong v7~v10: Rebase on master, update "Since" tags to 8.2, fix conflicts and craft the docs suggested by Markus v6: 1. Rebase on master 2. Split the commit "Implement dirty-limit convergence algo" into two as Juan suggested as the following: a. Put the detection logic before auto-converge checking b. Implement dirty-limit convergence algo 3. Put the detection logic before auto-converge checking 4. Sort the migrate_dirty_limit function in commit "Introduce dirty-limit capability" suggested by Juan 5. Substitute the the int64_t to uint64_t in the last 2 commits 6. Fix the comments spell mistake 7. Add helper function in the commit "Implement dirty-limit convergence algo" suggested by Juan v5: 1. Rebase on master and enrich the comment for "dirty-limit" capability, suggesting by Markus. 2. Drop commits that have already been merged. v4: 1. Polish the docs and update the release version suggested by Markus 2. Rename the migrate exported info "dirty-limit-throttle-time-per- round" to "dirty-limit-throttle-time-per-full". v3(resend): - fix the syntax error of the topic. v3: This version make some modifications inspired by Peter and Markus as following: 1. Do the code clean up in [PATCH v2 02/11] suggested by Markus 2. Replace the [PATCH v2 03/11] with a much simpler patch posted by Peter to fix the following bug: https://bugzilla.redhat.com/show_bug.cgi?id=2124756 3. Fix the error path of migrate_params_check in [PATCH v2 04/11] pointed out by Markus. Enrich the commit message to explain why x-vcpu-dirty-limit-period an unstable parameter. 4. Refactor the dirty-limit convergence algo in [PATCH v2 07/11] suggested by Peter: a. apply blk_mig_bulk_active check before enable dirty-limit b. drop the unhelpful check function before enable dirty-limit c. change the migration_cancel logic, just cancel dirty-limit only if dirty-limit capability turned on. d. abstract a code clean commit [PATCH v3 07/10] to adjust the check order before enable auto-converge 5. Change the name of observing indexes during dirty-limit live migration to make them more easy-understanding. Use the maximum throttle time of vpus as "dirty-limit-throttle-time-per-full" 6. Fix some grammatical and spelling errors pointed out by Markus and enrich the document about the dirty-limit live migration observing indexes "dirty-limit-ring-full-time" and "dirty-limit-throttle-time-per-full" 7. Change the default value of x-vcpu-dirty-limit-period to 1000ms, which is optimal value pointed out in cover letter in that testing environment. 8. Drop the 2 guestperf test commits [PATCH v2 10/11], [PATCH v2 11/11] and post them with a standalone series in the future. v2: This version make a little bit modifications comparing with version 1 as following: 1. fix the overflow issue reported by Peter Maydell 2. add parameter check for hmp "set_vcpu_dirty_limit" command 3. fix the racing issue between dirty ring reaper thread and Qemu main thread. 4. add migrate parameter check for x-vcpu-dirty-limit-period and vcpu-dirty-limit. 5. add the logic to forbid hmp/qmp commands set_vcpu_dirty_limit, cancel_vcpu_dirty_limit during dirty-limit live migration when implement dirty-limit convergence algo. 6. add capability check to ensure auto-converge and dirty-limit are mutually exclusive. 7. 
pre-check if kvm dirty ring size is configured before setting dirty-limit migrate parameter Hyman Huang(黄勇) (9): softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit" qapi/migration: Introduce x-vcpu-dirty-limit-period parameter qapi/migration: Introduce vcpu-dirty-limit parameters migration: Introduce dirty-limit capability migration: Refactor auto-converge capability logic migration: Put the detection logic before auto-converge checking migration: Implement dirty-limit convergence algorithm migration: Extend query-migrate to provide dirty-limit info tests: Add migration dirty-limit capability test include/sysemu/dirtylimit.h| 2 + migration/migration-hmp-cmds.c | 26 ++ migration/migration.c | 13 +++ migration/options.c| 73 migration/options.h| 1 + migration/ram.c| 61 ++--- migration/trace-events | 1 + qapi/migration.json| 72 +-- softmmu/dirtylimit.c | 91 +-- tests/qtest/migration-test.c | 155 + 10 files changed, 470 insertions(+), 25 deletions(-) -- 2.38.5
[PATCH QEMU v10 6/9] migration: Put the detection logic before auto-converge checking
From: Hyman Huang(黄勇) This commit prepares for the implementation of the dirty-limit convergence algorithm. The detection logic for the throttling condition applies to both the auto-converge and dirty-limit algorithms, so move it before the check for the auto-converge feature. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Juan Quintela --- migration/ram.c | 21 +++-- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index f31de47a47..1d9300f4c5 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -999,17 +999,18 @@ static void migration_trigger_throttle(RAMState *rs) return; } -if (migrate_auto_converge()) { -/* The following detection logic can be refined later. For now: - Check to see if the ratio between dirtied bytes and the approx. - amount of bytes that just got transferred since the last time - we were in this routine reaches the threshold. If that happens - twice, start or increase throttling. */ - -if ((bytes_dirty_period > bytes_dirty_threshold) && -(++rs->dirty_rate_high_cnt >= 2)) { +/* + * The following detection logic can be refined later. For now: + * Check to see if the ratio between dirtied bytes and the approx. + * amount of bytes that just got transferred since the last time + * we were in this routine reaches the threshold. If that happens + * twice, start or increase throttling. + */ +if ((bytes_dirty_period > bytes_dirty_threshold) && +(++rs->dirty_rate_high_cnt >= 2)) { +rs->dirty_rate_high_cnt = 0; +if (migrate_auto_converge()) { trace_migration_throttle(); -rs->dirty_rate_high_cnt = 0; mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); } -- 2.38.5
[PATCH QEMU 0/2] migration: craft the doc comments
Hi, Markus, This patchset aims to reformat migration doc comments as commit a937b6aa739. Meanwhile, add myself to the dirty-limit feature maintainer list. Please review, Thanks. Yong Hyman Huang(黄勇) (2): qapi: Reformat and craft the migration doc comments MAINTAINERS: Add Hyman Huang to dirty-limit feature MAINTAINERS | 6 + qapi/migration.json | 66 + 2 files changed, 37 insertions(+), 35 deletions(-) -- 2.38.5
[PATCH QEMU 1/2] qapi: Reformat and craft the migration doc comments
From: Hyman Huang(黄勇) Reformat migration doc comments to conform to current conventions as commit a937b6aa739 (qapi: Reformat doc comments to conform to current conventions). Also, craft the dirty-limit capability comment. Signed-off-by: Hyman Huang(黄勇) --- qapi/migration.json | 66 + 1 file changed, 31 insertions(+), 35 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index 6b49593d2f..5d5649c885 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -258,17 +258,17 @@ # blocked. Present and non-empty when migration is blocked. # (since 6.0) # -# @dirty-limit-throttle-time-per-round: Maximum throttle time (in microseconds) of virtual -# CPUs each dirty ring full round, which shows how -# MigrationCapability dirty-limit affects the guest -# during live migration. (since 8.1) -# -# @dirty-limit-ring-full-time: Estimated average dirty ring full time (in microseconds) -# each dirty ring full round, note that the value equals -# dirty ring memory size divided by average dirty page rate -# of virtual CPU, which can be used to observe the average -# memory load of virtual CPU indirectly. Note that zero -# means guest doesn't dirty memory (since 8.1) +# @dirty-limit-throttle-time-per-round: Maximum throttle time +# (in microseconds) of virtual CPUs each dirty ring full round, +# which shows how MigrationCapability dirty-limit affects the +# guest during live migration. (Since 8.1) +# +# @dirty-limit-ring-full-time: Estimated average dirty ring full +# time (in microseconds) each dirty ring full round. The value +# equals dirty ring memory size divided by average dirty page +# rate of the virtual CPU, which can be used to observe the +# average memory load of the virtual CPU indirectly. Note that +# zero means guest doesn't dirty memory. (Since 8.1) # # Since: 0.14 ## @@ -519,15 +519,11 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # -# @dirty-limit: If enabled, migration will use the dirty-limit algo to -# throttle down guest instead of auto-converge algo. -# Throttle algo only works when vCPU's dirtyrate greater -# than 'vcpu-dirty-limit', read processes in guest os -# aren't penalized any more, so this algo can improve -# performance of vCPU during live migration. This is an -# optional performance feature and should not affect the -# correctness of the existing auto-converge algo. -# (since 8.1) +# @dirty-limit: If enabled, migration will throttle vCPUs as needed to +# keep their dirty page rate within @vcpu-dirty-limit. This can +# improve responsiveness of large guests during live migration, +# and can result in more stable read performance. Requires KVM +# with accelerator property "dirty-ring-size" set. (Since 8.1) # # Features: # @@ -822,17 +818,17 @@ # Nodes are mapped to their block device name if there is one, and # to their node name otherwise. (Since 5.2) # -# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during -# live migration. Should be in the range 1 to 1000ms, -# defaults to 1000ms. (Since 8.1) +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty +# limit during live migration. Should be in the range 1 to 1000ms, +# defaults to 1000ms. (Since 8.1) # # @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration. -#Defaults to 1. (Since 8.1) +# Defaults to 1. (Since 8.1) # # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period -#are experimental. +# are experimental. 
# # Since: 2.4 ## @@ -988,17 +984,17 @@ # Nodes are mapped to their block device name if there is one, and # to their node name otherwise. (Since 5.2) # -# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during -# live migration. Should be in the range 1 to 1000ms, -# defaults to 1000ms. (Since 8.1) +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty +# limit during live migration. Should be in the range 1 to 1000ms, +# defaults to 1000ms. (Since 8.1) # # @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration. -#Defaults to 1. (Since 8.1) +# Defaults to 1. (Since 8.1) # # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu
[PATCH QEMU 2/2] MAINTAINERS: Add Hyman Huang to dirty-limit feature
From: Hyman Huang(黄勇) Signed-off-by: Hyman Huang(黄勇) --- MAINTAINERS | 6 ++ 1 file changed, 6 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 12e59b6b27..d72fd63a8e 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3437,6 +3437,12 @@ F: hw/core/clock-vmstate.c F: hw/core/qdev-clock.c F: docs/devel/clocks.rst +Dirty-limit feature +M: Hyman Huang +S: Maintained +F: softmmu/dirtylimit.c +F: include/sysemu/dirtylimit.h + Usermode Emulation -- Overall usermode emulation -- 2.38.5
[PATCH QEMU 2/3] qapi: Craft the dirty-limit capability comment
From: Hyman Huang(黄勇) Signed-off-by: Markus Armbruster Signed-off-by: Hyman Huang(黄勇) --- qapi/migration.json | 13 + 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index a74ade4d72..62ab151da2 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -519,14 +519,11 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # -# @dirty-limit: If enabled, migration will use the dirty-limit -# algorithim to throttle down guest instead of auto-converge -# algorithim. Throttle algorithim only works when vCPU's dirtyrate -# greater than 'vcpu-dirty-limit', read processes in guest os -# aren't penalized any more, so this algorithim can improve -# performance of vCPU during live migration. This is an optional -# performance feature and should not affect the correctness of the -# existing auto-converge algorithim. (Since 8.1) +# @dirty-limit: If enabled, migration will throttle vCPUs as needed to +# keep their dirty page rate within @vcpu-dirty-limit. This can +# improve responsiveness of large guests during live migration, +# and can result in more stable read performance. Requires KVM +# with accelerator property "dirty-ring-size" set. (Since 8.1) # # Features: # -- 2.38.5
[PATCH QEMU 3/3] MAINTAINERS: Add Hyman Huang as maintainer
From: Hyman Huang(黄勇) I've developed an interest in the dirty-limit and dirty page rate features and have also been working on projects related to this subsystem. I recommend myself as a maintainer for this subsystem so that I can help improve the dirty-limit algorithm and review the patches about dirty page rate. Signed-off-by: Hyman Huang(黄勇) --- MAINTAINERS | 9 + 1 file changed, 9 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 12e59b6b27..d4b1c91096 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3437,6 +3437,15 @@ F: hw/core/clock-vmstate.c F: hw/core/qdev-clock.c F: docs/devel/clocks.rst +Dirty-limit and dirty page rate feature +M: Hyman Huang +S: Maintained +F: softmmu/dirtylimit.c +F: include/sysemu/dirtylimit.h +F: migration/dirtyrate.c +F: migration/dirtyrate.h +F: include/sysemu/dirtyrate.h + Usermode Emulation -- Overall usermode emulation -- 2.38.5
[PATCH QEMU 0/3] migration: craft the doc comments
Hi, Markus, Juan. Please review the version 2, thanks. v2: - split the first commit in v1 into 2 - add commit message of commit: MAINTAINERS: Add Hyman Huang as maintainer Yong Hyman Huang(黄勇) (3): qapi: Reformat the dirty-limit migration doc comments qapi: Craft the dirty-limit capability comment MAINTAINERS: Add Hyman Huang as maintainer MAINTAINERS | 9 +++ qapi/migration.json | 66 + 2 files changed, 40 insertions(+), 35 deletions(-) -- 2.38.5
[PATCH QEMU 1/3] qapi: Reformat the dirty-limit migration doc comments
From: Hyman Huang(黄勇) Reformat the dirty-limit migration doc comments to conform to current conventions as commit a937b6aa739 (qapi: Reformat doc comments to conform to current conventions). Signed-off-by: Markus Armbruster Signed-off-by: Hyman Huang(黄勇) --- qapi/migration.json | 69 ++--- 1 file changed, 34 insertions(+), 35 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index 6b49593d2f..a74ade4d72 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -258,17 +258,17 @@ # blocked. Present and non-empty when migration is blocked. # (since 6.0) # -# @dirty-limit-throttle-time-per-round: Maximum throttle time (in microseconds) of virtual -# CPUs each dirty ring full round, which shows how -# MigrationCapability dirty-limit affects the guest -# during live migration. (since 8.1) -# -# @dirty-limit-ring-full-time: Estimated average dirty ring full time (in microseconds) -# each dirty ring full round, note that the value equals -# dirty ring memory size divided by average dirty page rate -# of virtual CPU, which can be used to observe the average -# memory load of virtual CPU indirectly. Note that zero -# means guest doesn't dirty memory (since 8.1) +# @dirty-limit-throttle-time-per-round: Maximum throttle time +# (in microseconds) of virtual CPUs each dirty ring full round, +# which shows how MigrationCapability dirty-limit affects the +# guest during live migration. (Since 8.1) +# +# @dirty-limit-ring-full-time: Estimated average dirty ring full +# time (in microseconds) for each dirty ring full round. The +# value equals the dirty ring memory size divided by the average +# dirty page rate of the virtual CPU, which can be used to +# observe the average memory load of the virtual CPU indirectly. +# Note that zero means guest doesn't dirty memory. (Since 8.1) # # Since: 0.14 ## @@ -519,15 +519,14 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # -# @dirty-limit: If enabled, migration will use the dirty-limit algo to -# throttle down guest instead of auto-converge algo. -# Throttle algo only works when vCPU's dirtyrate greater -# than 'vcpu-dirty-limit', read processes in guest os -# aren't penalized any more, so this algo can improve -# performance of vCPU during live migration. This is an -# optional performance feature and should not affect the -# correctness of the existing auto-converge algo. -# (since 8.1) +# @dirty-limit: If enabled, migration will use the dirty-limit +# algorithim to throttle down guest instead of auto-converge +# algorithim. Throttle algorithim only works when vCPU's dirtyrate +# greater than 'vcpu-dirty-limit', read processes in guest os +# aren't penalized any more, so this algorithim can improve +# performance of vCPU during live migration. This is an optional +# performance feature and should not affect the correctness of the +# existing auto-converge algorithim. (Since 8.1) # # Features: # @@ -822,17 +821,17 @@ # Nodes are mapped to their block device name if there is one, and # to their node name otherwise. (Since 5.2) # -# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during -# live migration. Should be in the range 1 to 1000ms, -# defaults to 1000ms. (Since 8.1) +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty +# limit during live migration. Should be in the range 1 to 1000ms. +# Defaults to 1000ms. (Since 8.1) # # @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration. -#Defaults to 1. (Since 8.1) +# Defaults to 1. 
(Since 8.1) # # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period -#are experimental. +# are experimental. # # Since: 2.4 ## @@ -988,17 +987,17 @@ # Nodes are mapped to their block device name if there is one, and # to their node name otherwise. (Since 5.2) # -# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during -# live migration. Should be in the range 1 to 1000ms, -# defaults to 1000ms. (Since 8.1) +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty +# limit during live migration. Should be in the range 1 to 1000ms. +# Defaults to 1000ms. (Since 8.1) # # @vc
[PATCH QEMU 3/3] tests/migration: Introduce dirty-limit into guestperf
From: Hyman Huang(黄勇) Currently, guestperf does not cover the dirty-limit migration, support this feature. Note that dirty-limit requires 'dirty-ring-size' set. To enable dirty-limit, setting x-vcpu-dirty-limit-period as 500ms and x-vcpu-dirty-limit as 10MB/s: $ ./tests/migration/guestperf.py \ --dirty-ring-size 4096 \ --dirty-limit --x-vcpu-dirty-limit-period 500 \ --vcpu-dirty-limit 10 --output output.json \ To run the entire standardized set of dirty-limit-enabled comparisons, with unix migration: $ ./tests/migration/guestperf-batch.py \ --dirty-ring-size 4096 \ --dst-host localhost --transport unix \ --filter compr-dirty-limit* --output outputdir Signed-off-by: Hyman Huang(黄勇) --- tests/migration/guestperf/comparison.py | 23 +++ tests/migration/guestperf/engine.py | 17 + tests/migration/guestperf/progress.py | 16 ++-- tests/migration/guestperf/scenario.py | 11 ++- tests/migration/guestperf/shell.py | 18 +- 5 files changed, 81 insertions(+), 4 deletions(-) diff --git a/tests/migration/guestperf/comparison.py b/tests/migration/guestperf/comparison.py index c03b3f6d7e..42cc0372d1 100644 --- a/tests/migration/guestperf/comparison.py +++ b/tests/migration/guestperf/comparison.py @@ -135,4 +135,27 @@ COMPARISONS = [ Scenario("compr-multifd-channels-64", multifd=True, multifd_channels=64), ]), + +# Looking at effect of dirty-limit with +# varying x_vcpu_dirty_limit_period +Comparison("compr-dirty-limit-period", scenarios = [ +Scenario("compr-dirty-limit-period-500", + dirty_limit=True, x_vcpu_dirty_limit_period=500), +Scenario("compr-dirty-limit-period-800", + dirty_limit=True, x_vcpu_dirty_limit_period=800), +Scenario("compr-dirty-limit-period-1000", + dirty_limit=True, x_vcpu_dirty_limit_period=1000), +]), + + +# Looking at effect of dirty-limit with +# varying vcpu_dirty_limit +Comparison("compr-dirty-limit", scenarios = [ +Scenario("compr-dirty-limit-10MB", + dirty_limit=True, vcpu_dirty_limit=10), +Scenario("compr-dirty-limit-20MB", + dirty_limit=True, vcpu_dirty_limit=20), +Scenario("compr-dirty-limit-50MB", + dirty_limit=True, vcpu_dirty_limit=50), +]), ] diff --git a/tests/migration/guestperf/engine.py b/tests/migration/guestperf/engine.py index 29ebb5011b..93a6f78e46 100644 --- a/tests/migration/guestperf/engine.py +++ b/tests/migration/guestperf/engine.py @@ -102,6 +102,8 @@ class Engine(object): info.get("expected-downtime", 0), info.get("setup-time", 0), info.get("cpu-throttle-percentage", 0), +info.get("dirty-limit-throttle-time-per-round", 0), +info.get("dirty-limit-ring-full-time", 0), ) def _migrate(self, hardware, scenario, src, dst, connect_uri): @@ -203,6 +205,21 @@ class Engine(object): resp = dst.command("migrate-set-parameters", multifd_channels=scenario._multifd_channels) +if scenario._dirty_limit: +if not hardware._dirty_ring_size: +raise Exception("dirty ring size must be configured when " +"testing dirty limit migration") + +resp = src.command("migrate-set-capabilities", + capabilities = [ + { "capability": "dirty-limit", + "state": True } + ]) +resp = src.command("migrate-set-parameters", +x_vcpu_dirty_limit_period=scenario._x_vcpu_dirty_limit_period) +resp = src.command("migrate-set-parameters", + vcpu_dirty_limit=scenario._vcpu_dirty_limit) + resp = src.command("migrate", uri=connect_uri) post_copy = False diff --git a/tests/migration/guestperf/progress.py b/tests/migration/guestperf/progress.py index ab1ee57273..d490584217 100644 --- a/tests/migration/guestperf/progress.py +++ b/tests/migration/guestperf/progress.py @@ -81,7 +81,9 @@ class Progress(object): 
downtime, downtime_expected, setup_time, - throttle_pcent): + throttle_pcent, + dirty_limit_throttle_time_per_round, + dirty_limit_ring_full_time): self._status = status self._ram = ram @@ -91,6 +93,10 @@ class Progress(object): self._downtime_expected =
[PATCH QEMU 0/3] migration: enrich the dirty-limit test case
The dirty-limit feature was introduced in 8.1, and the test cases can be enriched to make sure the behavior and the performance of dirty-limit are exactly what we want. This series adds 2 test cases: the first commit aims at the functional test and the others aim at the performance test. Please review, thanks. Yong. Hyman Huang(黄勇) (3): tests: Add migration dirty-limit capability test tests/migration: Introduce dirty-ring-size option into guestperf tests/migration: Introduce dirty-limit into guestperf tests/migration/guestperf/comparison.py | 23 tests/migration/guestperf/engine.py | 23 +++- tests/migration/guestperf/hardware.py | 8 +- tests/migration/guestperf/progress.py | 16 ++- tests/migration/guestperf/scenario.py | 11 +- tests/migration/guestperf/shell.py | 24 +++- tests/qtest/migration-test.c| 155 7 files changed, 252 insertions(+), 8 deletions(-) -- 2.38.5
[PATCH QEMU 1/3] tests: Add migration dirty-limit capability test
From: Hyman Huang(黄勇) Add migration dirty-limit capability test if kernel support dirty ring. Migration dirty-limit capability introduce dirty limit capability, two parameters: x-vcpu-dirty-limit-period and vcpu-dirty-limit are introduced to implement the live migration with dirty limit. The test case does the following things: 1. start src, dst vm and enable dirty-limit capability 2. start migrate and set cancel it to check if dirty limit stop working. 3. restart dst vm 4. start migrate and enable dirty-limit capability 5. check if migration satisfy the convergence condition during pre-switchover phase. Signed-off-by: Hyman Huang(黄勇) --- tests/qtest/migration-test.c | 155 +++ 1 file changed, 155 insertions(+) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index 62d3f37021..52b1973afb 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -2739,6 +2739,159 @@ static void test_vcpu_dirty_limit(void) dirtylimit_stop_vm(vm); } +static void migrate_dirty_limit_wait_showup(QTestState *from, +const int64_t period, +const int64_t value) +{ +/* Enable dirty limit capability */ +migrate_set_capability(from, "dirty-limit", true); + +/* Set dirty limit parameters */ +migrate_set_parameter_int(from, "x-vcpu-dirty-limit-period", period); +migrate_set_parameter_int(from, "vcpu-dirty-limit", value); + +/* Make sure migrate can't converge */ +migrate_ensure_non_converge(from); + +/* To check limit rate after precopy */ +migrate_set_capability(from, "pause-before-switchover", true); + +/* Wait for the serial output from the source */ +wait_for_serial("src_serial"); +} + +/* + * This test does: + * source target + * migrate_incoming + * migrate + * migrate_cancel + * restart target + * migrate + * + * And see that if dirty limit works correctly + */ +static void test_migrate_dirty_limit(void) +{ +g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); +QTestState *from, *to; +int64_t remaining; +uint64_t throttle_us_per_full; +/* + * We want the test to be stable and as fast as possible. + * E.g., with 1Gb/s bandwith migration may pass without dirty limit, + * so we need to decrease a bandwidth. + */ +const int64_t dirtylimit_period = 1000, dirtylimit_value = 50; +const int64_t max_bandwidth = 4; /* ~400Mb/s */ +const int64_t downtime_limit = 250; /* 250ms */ +/* + * We migrate through unix-socket (> 500Mb/s). + * Thus, expected migration speed ~= bandwidth limit (< 500Mb/s). 
+ * So, we can predict expected_threshold + */ +const int64_t expected_threshold = max_bandwidth * downtime_limit / 1000; +int max_try_count = 10; +MigrateCommon args = { +.start = { +.hide_stderr = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Start src, dst vm */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Prepare for dirty limit migration and wait src vm show up */ +migrate_dirty_limit_wait_showup(from, dirtylimit_period, dirtylimit_value); + +/* Start migrate */ +migrate_qmp(from, uri, "{}"); + +/* Wait for dirty limit throttle begin */ +throttle_us_per_full = 0; +while (throttle_us_per_full == 0) { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} + +/* Now cancel migrate and wait for dirty limit throttle switch off */ +migrate_cancel(from); +wait_for_migration_status(from, "cancelled", NULL); + +/* Check if dirty limit throttle switched off, set timeout 1ms */ +do { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} while (throttle_us_per_full != 0 && --max_try_count); + +/* Assert dirty limit is not in service */ +g_assert_cmpint(throttle_us_per_full, ==, 0); + +args = (MigrateCommon) { +.start = { +.only_target = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Restart dst vm, src vm already show up so we needn't wait anymore */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Start migrate */ +migrate_qmp(from, uri, "{}&
[PATCH QEMU 2/3] tests/migration: Introduce dirty-ring-size option into guestperf
From: Hyman Huang(黄勇) Dirty ring size configuration is not supported by guestperf tool. Introduce dirty-ring-size (ranges in [1024, 65536]) option so developers can play with dirty-ring and dirty-limit feature easier. To set dirty ring size with 4096 during migration test: $ ./tests/migration/guestperf.py --dirty-ring-size 4096 xxx Signed-off-by: Hyman Huang(黄勇) --- tests/migration/guestperf/engine.py | 6 +- tests/migration/guestperf/hardware.py | 8 ++-- tests/migration/guestperf/shell.py| 6 +- 3 files changed, 16 insertions(+), 4 deletions(-) diff --git a/tests/migration/guestperf/engine.py b/tests/migration/guestperf/engine.py index e69d16a62c..29ebb5011b 100644 --- a/tests/migration/guestperf/engine.py +++ b/tests/migration/guestperf/engine.py @@ -325,7 +325,6 @@ class Engine(object): cmdline = "'" + cmdline + "'" argv = [ -"-accel", "kvm", "-cpu", "host", "-kernel", self._kernel, "-initrd", self._initrd, @@ -333,6 +332,11 @@ class Engine(object): "-m", str((hardware._mem * 1024) + 512), "-smp", str(hardware._cpus), ] +if hardware._dirty_ring_size: +argv.extend(["-accel", "kvm,dirty-ring-size=%s" % + hardware._dirty_ring_size]) +else: +argv.extend(["-accel", "kvm"]) argv.extend(self._get_qemu_serial_args()) diff --git a/tests/migration/guestperf/hardware.py b/tests/migration/guestperf/hardware.py index 3145785ffd..f779cc050b 100644 --- a/tests/migration/guestperf/hardware.py +++ b/tests/migration/guestperf/hardware.py @@ -23,7 +23,8 @@ class Hardware(object): src_cpu_bind=None, src_mem_bind=None, dst_cpu_bind=None, dst_mem_bind=None, prealloc_pages = False, - huge_pages=False, locked_pages=False): + huge_pages=False, locked_pages=False, + dirty_ring_size=0): self._cpus = cpus self._mem = mem # GiB self._src_mem_bind = src_mem_bind # List of NUMA nodes @@ -33,6 +34,7 @@ class Hardware(object): self._prealloc_pages = prealloc_pages self._huge_pages = huge_pages self._locked_pages = locked_pages +self._dirty_ring_size = dirty_ring_size def serialize(self): @@ -46,6 +48,7 @@ class Hardware(object): "prealloc_pages": self._prealloc_pages, "huge_pages": self._huge_pages, "locked_pages": self._locked_pages, +"dirty_ring_size": self._dirty_ring_size, } @classmethod @@ -59,4 +62,5 @@ class Hardware(object): data["dst_mem_bind"], data["prealloc_pages"], data["huge_pages"], -data["locked_pages"]) +data["locked_pages"], +data["dirty_ring_size"]) diff --git a/tests/migration/guestperf/shell.py b/tests/migration/guestperf/shell.py index 8a809e3dda..7d6b8cd7cf 100644 --- a/tests/migration/guestperf/shell.py +++ b/tests/migration/guestperf/shell.py @@ -60,6 +60,8 @@ class BaseShell(object): parser.add_argument("--prealloc-pages", dest="prealloc_pages", default=False) parser.add_argument("--huge-pages", dest="huge_pages", default=False) parser.add_argument("--locked-pages", dest="locked_pages", default=False) +parser.add_argument("--dirty-ring-size", dest="dirty_ring_size", +default=0, type=int) self._parser = parser @@ -89,7 +91,9 @@ class BaseShell(object): locked_pages=args.locked_pages, huge_pages=args.huge_pages, -prealloc_pages=args.prealloc_pages) +prealloc_pages=args.prealloc_pages, + +dirty_ring_size=args.dirty_ring_size) class Shell(BaseShell): -- 2.38.5
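With this change, passing --dirty-ring-size makes engine.py build a different accelerator argument, so the guest command line ends up containing something along the lines of (other options elided, value illustrative):

    qemu-system-x86_64 -accel kvm,dirty-ring-size=4096 -cpu host -m ... -smp ...

whereas without the option the plain "-accel kvm" form is kept, matching the previous behaviour.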
[PATCH QEMU v3 2/3] qapi: Craft the dirty-limit capability comment
From: Hyman Huang(黄勇) Signed-off-by: Markus Armbruster Signed-off-by: Hyman Huang(黄勇) --- qapi/migration.json | 13 + 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index a74ade4d72..62ab151da2 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -519,14 +519,11 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # -# @dirty-limit: If enabled, migration will use the dirty-limit -# algorithim to throttle down guest instead of auto-converge -# algorithim. Throttle algorithim only works when vCPU's dirtyrate -# greater than 'vcpu-dirty-limit', read processes in guest os -# aren't penalized any more, so this algorithim can improve -# performance of vCPU during live migration. This is an optional -# performance feature and should not affect the correctness of the -# existing auto-converge algorithim. (Since 8.1) +# @dirty-limit: If enabled, migration will throttle vCPUs as needed to +# keep their dirty page rate within @vcpu-dirty-limit. This can +# improve responsiveness of large guests during live migration, +# and can result in more stable read performance. Requires KVM +# with accelerator property "dirty-ring-size" set. (Since 8.1) # # Features: # -- 2.38.5
[PATCH QEMU v3 0/3] migration: craft the doc comments
Hi, please review the version 3 of the series, thanks. V3: - craft the commit message of "Add section for migration dirty limit and dirty page rate", and put the section after section "Migration", suggested by Markus. V2: - split the first commit in v1 into 2 - add commit message of commit: MAINTAINERS: Add Hyman Huang as maintainer Yong Hyman Huang(黄勇) (3): qapi: Reformat the dirty-limit migration doc comments qapi: Craft the dirty-limit capability comment MAINTAINERS: Add section "Migration dirty limit and dirty page rate" MAINTAINERS | 9 +++ qapi/migration.json | 66 + 2 files changed, 40 insertions(+), 35 deletions(-) -- 2.38.5
[PATCH QEMU v3 3/3] MAINTAINERS: Add section "Migration dirty limit and dirty page rate"
From: Hyman Huang(黄勇) I've built interests in dirty limit and dirty page rate features and also have been working on projects related to this subsystem. Add a section to the MAINTAINERS file for migration dirty limit and dirty page rate. Add myself as a maintainer for this subsystem so that I can help to improve the dirty limit algorithm and review the patches about dirty page rate. Signed-off-by: Hyman Huang(黄勇) Signed-off-by: Markus Armbruster Acked-by: Peter Xu --- MAINTAINERS | 9 + 1 file changed, 9 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 12e59b6b27..6111b6b4d9 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3209,6 +3209,15 @@ F: qapi/migration.json F: tests/migration/ F: util/userfaultfd.c +Migration dirty limit and dirty page rate +M: Hyman Huang +S: Maintained +F: softmmu/dirtylimit.c +F: include/sysemu/dirtylimit.h +F: migration/dirtyrate.c +F: migration/dirtyrate.h +F: include/sysemu/dirtyrate.h + D-Bus M: Marc-André Lureau S: Maintained -- 2.38.5
[PATCH QEMU v3 1/3] qapi: Reformat the dirty-limit migration doc comments
From: Hyman Huang(黄勇) Reformat the dirty-limit migration doc comments to conform to current conventions as commit a937b6aa739 (qapi: Reformat doc comments to conform to current conventions). Signed-off-by: Markus Armbruster Signed-off-by: Hyman Huang(黄勇) --- qapi/migration.json | 69 ++--- 1 file changed, 34 insertions(+), 35 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index 6b49593d2f..a74ade4d72 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -258,17 +258,17 @@ # blocked. Present and non-empty when migration is blocked. # (since 6.0) # -# @dirty-limit-throttle-time-per-round: Maximum throttle time (in microseconds) of virtual -# CPUs each dirty ring full round, which shows how -# MigrationCapability dirty-limit affects the guest -# during live migration. (since 8.1) -# -# @dirty-limit-ring-full-time: Estimated average dirty ring full time (in microseconds) -# each dirty ring full round, note that the value equals -# dirty ring memory size divided by average dirty page rate -# of virtual CPU, which can be used to observe the average -# memory load of virtual CPU indirectly. Note that zero -# means guest doesn't dirty memory (since 8.1) +# @dirty-limit-throttle-time-per-round: Maximum throttle time +# (in microseconds) of virtual CPUs each dirty ring full round, +# which shows how MigrationCapability dirty-limit affects the +# guest during live migration. (Since 8.1) +# +# @dirty-limit-ring-full-time: Estimated average dirty ring full +# time (in microseconds) for each dirty ring full round. The +# value equals the dirty ring memory size divided by the average +# dirty page rate of the virtual CPU, which can be used to +# observe the average memory load of the virtual CPU indirectly. +# Note that zero means guest doesn't dirty memory. (Since 8.1) # # Since: 0.14 ## @@ -519,15 +519,14 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # -# @dirty-limit: If enabled, migration will use the dirty-limit algo to -# throttle down guest instead of auto-converge algo. -# Throttle algo only works when vCPU's dirtyrate greater -# than 'vcpu-dirty-limit', read processes in guest os -# aren't penalized any more, so this algo can improve -# performance of vCPU during live migration. This is an -# optional performance feature and should not affect the -# correctness of the existing auto-converge algo. -# (since 8.1) +# @dirty-limit: If enabled, migration will use the dirty-limit +# algorithim to throttle down guest instead of auto-converge +# algorithim. Throttle algorithim only works when vCPU's dirtyrate +# greater than 'vcpu-dirty-limit', read processes in guest os +# aren't penalized any more, so this algorithim can improve +# performance of vCPU during live migration. This is an optional +# performance feature and should not affect the correctness of the +# existing auto-converge algorithim. (Since 8.1) # # Features: # @@ -822,17 +821,17 @@ # Nodes are mapped to their block device name if there is one, and # to their node name otherwise. (Since 5.2) # -# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during -# live migration. Should be in the range 1 to 1000ms, -# defaults to 1000ms. (Since 8.1) +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty +# limit during live migration. Should be in the range 1 to 1000ms. +# Defaults to 1000ms. (Since 8.1) # # @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration. -#Defaults to 1. (Since 8.1) +# Defaults to 1. 
(Since 8.1) # # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period -#are experimental. +# are experimental. # # Since: 2.4 ## @@ -988,17 +987,17 @@ # Nodes are mapped to their block device name if there is one, and # to their node name otherwise. (Since 5.2) # -# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during -# live migration. Should be in the range 1 to 1000ms, -# defaults to 1000ms. (Since 8.1) +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty +# limit during live migration. Should be in the range 1 to 1000ms. +# Defaults to 1000ms. (Since 8.1) # # @vc
[PATCH QEMU v2 1/3] tests: Add migration dirty-limit capability test
From: Hyman Huang(黄勇) Add migration dirty-limit capability test if kernel support dirty ring. Migration dirty-limit capability introduce dirty limit capability, two parameters: x-vcpu-dirty-limit-period and vcpu-dirty-limit are introduced to implement the live migration with dirty limit. The test case does the following things: 1. start src, dst vm and enable dirty-limit capability 2. start migrate and set cancel it to check if dirty limit stop working. 3. restart dst vm 4. start migrate and enable dirty-limit capability 5. check if migration satisfy the convergence condition during pre-switchover phase. Note that this test case involves many passes, so it runs in slow mode only. Signed-off-by: Hyman Huang(黄勇) --- tests/qtest/migration-test.c | 164 +++ 1 file changed, 164 insertions(+) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index 62d3f37021..0be2d17c42 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -2739,6 +2739,166 @@ static void test_vcpu_dirty_limit(void) dirtylimit_stop_vm(vm); } +static void migrate_dirty_limit_wait_showup(QTestState *from, +const int64_t period, +const int64_t value) +{ +/* Enable dirty limit capability */ +migrate_set_capability(from, "dirty-limit", true); + +/* Set dirty limit parameters */ +migrate_set_parameter_int(from, "x-vcpu-dirty-limit-period", period); +migrate_set_parameter_int(from, "vcpu-dirty-limit", value); + +/* Make sure migrate can't converge */ +migrate_ensure_non_converge(from); + +/* To check limit rate after precopy */ +migrate_set_capability(from, "pause-before-switchover", true); + +/* Wait for the serial output from the source */ +wait_for_serial("src_serial"); +} + +/* + * This test does: + * source destination + * start vm + * start incoming vm + * migrate + * wait dirty limit to begin + * cancel migrate + * cancellation check + * restart incoming vm + * migrate + * wait dirty limit to begin + * wait pre-switchover event + * convergence condition check + * + * And see if dirty limit migration works correctly. + * This test case involves many passes, so it runs in slow mode only. + */ +static void test_migrate_dirty_limit(void) +{ +g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); +QTestState *from, *to; +int64_t remaining; +uint64_t throttle_us_per_full; +/* + * We want the test to be stable and as fast as possible. + * E.g., with 1Gb/s bandwith migration may pass without dirty limit, + * so we need to decrease a bandwidth. + */ +const int64_t dirtylimit_period = 1000, dirtylimit_value = 50; +const int64_t max_bandwidth = 4; /* ~400Mb/s */ +const int64_t downtime_limit = 250; /* 250ms */ +/* + * We migrate through unix-socket (> 500Mb/s). + * Thus, expected migration speed ~= bandwidth limit (< 500Mb/s). 
+ * So, we can predict expected_threshold + */ +const int64_t expected_threshold = max_bandwidth * downtime_limit / 1000; +int max_try_count = 10; +MigrateCommon args = { +.start = { +.hide_stderr = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Start src, dst vm */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Prepare for dirty limit migration and wait src vm show up */ +migrate_dirty_limit_wait_showup(from, dirtylimit_period, dirtylimit_value); + +/* Start migrate */ +migrate_qmp(from, uri, "{}"); + +/* Wait for dirty limit throttle begin */ +throttle_us_per_full = 0; +while (throttle_us_per_full == 0) { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} + +/* Now cancel migrate and wait for dirty limit throttle switch off */ +migrate_cancel(from); +wait_for_migration_status(from, "cancelled", NULL); + +/* Check if dirty limit throttle switched off, set timeout 1ms */ +do { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} while (throttle_us_per_full != 0 && --max_try_count); + +/* Assert dirty limit is not in service */ +g_assert_cmpint(throttle_us_per_full, ==, 0); + +args = (MigrateCommon) { +.start = { +.only_target = true, +.use_dirty_ring = true, +}, +
[PATCH QEMU v2 0/3] migration: enrich the dirty-limit test case
The dirty-limit migration test involves many passes and takes about 1 minute on average, so it is placed in the slow mode of migration-test. Inspired by Peter. V2: - put the dirty-limit migration test in slow mode and enrich the test case comment The dirty-limit feature was introduced in 8.1, and the test cases can be enriched to make sure the behavior and the performance of dirty-limit are exactly what we want. This series adds two test cases: the first commit aims at the functional test and the others aim at the performance test. Please review, thanks. Yong. Hyman Huang(黄勇) (3): tests: Add migration dirty-limit capability test tests/migration: Introduce dirty-ring-size option into guestperf tests/migration: Introduce dirty-limit into guestperf tests/migration/guestperf/comparison.py | 23 tests/migration/guestperf/engine.py | 23 +++- tests/migration/guestperf/hardware.py | 8 +- tests/migration/guestperf/progress.py | 16 ++- tests/migration/guestperf/scenario.py | 11 +- tests/migration/guestperf/shell.py | 24 +++- tests/qtest/migration-test.c| 164 7 files changed, 261 insertions(+), 8 deletions(-) -- 2.38.5
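For reference, the slow-mode run described above can be exercised roughly as follows once the tree is built; the binary paths are assumptions that depend on the build directory layout:

$ cd build
$ QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test -m slow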
[PATCH QEMU v2 2/3] tests/migration: Introduce dirty-ring-size option into guestperf
From: Hyman Huang(黄勇) Dirty ring size configuration is not supported by guestperf tool. Introduce dirty-ring-size (ranges in [1024, 65536]) option so developers can play with dirty-ring and dirty-limit feature easier. To set dirty ring size with 4096 during migration test: $ ./tests/migration/guestperf.py --dirty-ring-size 4096 xxx Signed-off-by: Hyman Huang(黄勇) --- tests/migration/guestperf/engine.py | 6 +- tests/migration/guestperf/hardware.py | 8 ++-- tests/migration/guestperf/shell.py| 6 +- 3 files changed, 16 insertions(+), 4 deletions(-) diff --git a/tests/migration/guestperf/engine.py b/tests/migration/guestperf/engine.py index e69d16a62c..29ebb5011b 100644 --- a/tests/migration/guestperf/engine.py +++ b/tests/migration/guestperf/engine.py @@ -325,7 +325,6 @@ class Engine(object): cmdline = "'" + cmdline + "'" argv = [ -"-accel", "kvm", "-cpu", "host", "-kernel", self._kernel, "-initrd", self._initrd, @@ -333,6 +332,11 @@ class Engine(object): "-m", str((hardware._mem * 1024) + 512), "-smp", str(hardware._cpus), ] +if hardware._dirty_ring_size: +argv.extend(["-accel", "kvm,dirty-ring-size=%s" % + hardware._dirty_ring_size]) +else: +argv.extend(["-accel", "kvm"]) argv.extend(self._get_qemu_serial_args()) diff --git a/tests/migration/guestperf/hardware.py b/tests/migration/guestperf/hardware.py index 3145785ffd..f779cc050b 100644 --- a/tests/migration/guestperf/hardware.py +++ b/tests/migration/guestperf/hardware.py @@ -23,7 +23,8 @@ class Hardware(object): src_cpu_bind=None, src_mem_bind=None, dst_cpu_bind=None, dst_mem_bind=None, prealloc_pages = False, - huge_pages=False, locked_pages=False): + huge_pages=False, locked_pages=False, + dirty_ring_size=0): self._cpus = cpus self._mem = mem # GiB self._src_mem_bind = src_mem_bind # List of NUMA nodes @@ -33,6 +34,7 @@ class Hardware(object): self._prealloc_pages = prealloc_pages self._huge_pages = huge_pages self._locked_pages = locked_pages +self._dirty_ring_size = dirty_ring_size def serialize(self): @@ -46,6 +48,7 @@ class Hardware(object): "prealloc_pages": self._prealloc_pages, "huge_pages": self._huge_pages, "locked_pages": self._locked_pages, +"dirty_ring_size": self._dirty_ring_size, } @classmethod @@ -59,4 +62,5 @@ class Hardware(object): data["dst_mem_bind"], data["prealloc_pages"], data["huge_pages"], -data["locked_pages"]) +data["locked_pages"], +data["dirty_ring_size"]) diff --git a/tests/migration/guestperf/shell.py b/tests/migration/guestperf/shell.py index 8a809e3dda..7d6b8cd7cf 100644 --- a/tests/migration/guestperf/shell.py +++ b/tests/migration/guestperf/shell.py @@ -60,6 +60,8 @@ class BaseShell(object): parser.add_argument("--prealloc-pages", dest="prealloc_pages", default=False) parser.add_argument("--huge-pages", dest="huge_pages", default=False) parser.add_argument("--locked-pages", dest="locked_pages", default=False) +parser.add_argument("--dirty-ring-size", dest="dirty_ring_size", +default=0, type=int) self._parser = parser @@ -89,7 +91,9 @@ class BaseShell(object): locked_pages=args.locked_pages, huge_pages=args.huge_pages, -prealloc_pages=args.prealloc_pages) +prealloc_pages=args.prealloc_pages, + +dirty_ring_size=args.dirty_ring_size) class Shell(BaseShell): -- 2.38.5
[PATCH QEMU v2 3/3] tests/migration: Introduce dirty-limit into guestperf
From: Hyman Huang(黄勇) Currently, guestperf does not cover the dirty-limit migration, support this feature. Note that dirty-limit requires 'dirty-ring-size' set. To enable dirty-limit, setting x-vcpu-dirty-limit-period as 500ms and x-vcpu-dirty-limit as 10MB/s: $ ./tests/migration/guestperf.py \ --dirty-ring-size 4096 \ --dirty-limit --x-vcpu-dirty-limit-period 500 \ --vcpu-dirty-limit 10 --output output.json \ To run the entire standardized set of dirty-limit-enabled comparisons, with unix migration: $ ./tests/migration/guestperf-batch.py \ --dirty-ring-size 4096 \ --dst-host localhost --transport unix \ --filter compr-dirty-limit* --output outputdir Signed-off-by: Hyman Huang(黄勇) --- tests/migration/guestperf/comparison.py | 23 +++ tests/migration/guestperf/engine.py | 17 + tests/migration/guestperf/progress.py | 16 ++-- tests/migration/guestperf/scenario.py | 11 ++- tests/migration/guestperf/shell.py | 18 +- 5 files changed, 81 insertions(+), 4 deletions(-) diff --git a/tests/migration/guestperf/comparison.py b/tests/migration/guestperf/comparison.py index c03b3f6d7e..42cc0372d1 100644 --- a/tests/migration/guestperf/comparison.py +++ b/tests/migration/guestperf/comparison.py @@ -135,4 +135,27 @@ COMPARISONS = [ Scenario("compr-multifd-channels-64", multifd=True, multifd_channels=64), ]), + +# Looking at effect of dirty-limit with +# varying x_vcpu_dirty_limit_period +Comparison("compr-dirty-limit-period", scenarios = [ +Scenario("compr-dirty-limit-period-500", + dirty_limit=True, x_vcpu_dirty_limit_period=500), +Scenario("compr-dirty-limit-period-800", + dirty_limit=True, x_vcpu_dirty_limit_period=800), +Scenario("compr-dirty-limit-period-1000", + dirty_limit=True, x_vcpu_dirty_limit_period=1000), +]), + + +# Looking at effect of dirty-limit with +# varying vcpu_dirty_limit +Comparison("compr-dirty-limit", scenarios = [ +Scenario("compr-dirty-limit-10MB", + dirty_limit=True, vcpu_dirty_limit=10), +Scenario("compr-dirty-limit-20MB", + dirty_limit=True, vcpu_dirty_limit=20), +Scenario("compr-dirty-limit-50MB", + dirty_limit=True, vcpu_dirty_limit=50), +]), ] diff --git a/tests/migration/guestperf/engine.py b/tests/migration/guestperf/engine.py index 29ebb5011b..93a6f78e46 100644 --- a/tests/migration/guestperf/engine.py +++ b/tests/migration/guestperf/engine.py @@ -102,6 +102,8 @@ class Engine(object): info.get("expected-downtime", 0), info.get("setup-time", 0), info.get("cpu-throttle-percentage", 0), +info.get("dirty-limit-throttle-time-per-round", 0), +info.get("dirty-limit-ring-full-time", 0), ) def _migrate(self, hardware, scenario, src, dst, connect_uri): @@ -203,6 +205,21 @@ class Engine(object): resp = dst.command("migrate-set-parameters", multifd_channels=scenario._multifd_channels) +if scenario._dirty_limit: +if not hardware._dirty_ring_size: +raise Exception("dirty ring size must be configured when " +"testing dirty limit migration") + +resp = src.command("migrate-set-capabilities", + capabilities = [ + { "capability": "dirty-limit", + "state": True } + ]) +resp = src.command("migrate-set-parameters", +x_vcpu_dirty_limit_period=scenario._x_vcpu_dirty_limit_period) +resp = src.command("migrate-set-parameters", + vcpu_dirty_limit=scenario._vcpu_dirty_limit) + resp = src.command("migrate", uri=connect_uri) post_copy = False diff --git a/tests/migration/guestperf/progress.py b/tests/migration/guestperf/progress.py index ab1ee57273..d490584217 100644 --- a/tests/migration/guestperf/progress.py +++ b/tests/migration/guestperf/progress.py @@ -81,7 +81,9 @@ class Progress(object): 
downtime, downtime_expected, setup_time, - throttle_pcent): + throttle_pcent, + dirty_limit_throttle_time_per_round, + dirty_limit_ring_full_time): self._status = status self._ram = ram @@ -91,6 +93,10 @@ class Progress(object): self._downtime_expected =
[PATCH QEMU 3/3] vhost-user-blk-pci: introduce auto-num-queues property
From: Hyman Huang(黄勇) Commit "a4eef0711b vhost-user-blk-pci: default num_queues to -smp N" implment sizing the number of vhost-user-blk-pci request virtqueues to match the number of vCPUs automatically. Which improves IO preformance remarkably. To enable this feature for the existing VMs, the cloud platform may migrate VMs from the source hypervisor (num_queues is set to 1 by default) to the destination hypervisor (num_queues is set to -smp N) lively. The different num-queues for vhost-user-blk-pci devices between the source side and the destination side will result in migration failure due to loading vmstate incorrectly on the destination side. To provide a smooth upgrade solution, introduce the auto-num-queues property for the vhost-user-blk-pci device. This allows upper APPs, e.g., libvirt, to recognize the hypervisor's capability of allocating the virtqueues automatically by probing the vhost-user-blk-pci.auto-num-queues property. Basing on which, upper APPs can ensure to allocate the same num-queues on the destination side in case of migration failure. Signed-off-by: Hyman Huang(黄勇) --- hw/block/vhost-user-blk.c | 1 + hw/virtio/vhost-user-blk-pci.c | 9 - include/hw/virtio/vhost-user-blk.h | 5 + 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c index eecf3f7a81..34e23b1727 100644 --- a/hw/block/vhost-user-blk.c +++ b/hw/block/vhost-user-blk.c @@ -566,6 +566,7 @@ static const VMStateDescription vmstate_vhost_user_blk = { static Property vhost_user_blk_properties[] = { DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev), +DEFINE_PROP_BOOL("auto-num-queues", VHostUserBlk, auto_num_queues, true), DEFINE_PROP_UINT16("num-queues", VHostUserBlk, num_queues, VHOST_USER_BLK_AUTO_NUM_QUEUES), DEFINE_PROP_UINT32("queue-size", VHostUserBlk, queue_size, 128), diff --git a/hw/virtio/vhost-user-blk-pci.c b/hw/virtio/vhost-user-blk-pci.c index eef8641a98..f7776e928a 100644 --- a/hw/virtio/vhost-user-blk-pci.c +++ b/hw/virtio/vhost-user-blk-pci.c @@ -56,7 +56,14 @@ static void vhost_user_blk_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp) DeviceState *vdev = DEVICE(&dev->vdev); if (dev->vdev.num_queues == VHOST_USER_BLK_AUTO_NUM_QUEUES) { -dev->vdev.num_queues = virtio_pci_optimal_num_queues(0); +/* + * Allocate virtqueues automatically only if auto_num_queues + * property set true. + */ +if (dev->vdev.auto_num_queues) +dev->vdev.num_queues = virtio_pci_optimal_num_queues(0); +else +dev->vdev.num_queues = 1; } if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) { diff --git a/include/hw/virtio/vhost-user-blk.h b/include/hw/virtio/vhost-user-blk.h index ea085ee1ed..e6f0515bc6 100644 --- a/include/hw/virtio/vhost-user-blk.h +++ b/include/hw/virtio/vhost-user-blk.h @@ -50,6 +50,11 @@ struct VHostUserBlk { bool connected; /* vhost_user_blk_start/vhost_user_blk_stop */ bool started_vu; +/* + * Set to true if virtqueues allow to be allocated to + * match the number of virtual CPUs automatically. + */ +bool auto_num_queues; }; #endif -- 2.38.5
[PATCH QEMU 1/3] virtio-scsi-pci: introduce auto-num-queues property
From: Hyman Huang(黄勇) Commit "6a55882284 virtio-scsi-pci: default num_queues to -smp N" implment sizing the number of virtio-scsi-pci request virtqueues to match the number of vCPUs automatically. Which improves IO preformance remarkably. To enable this feature for the existing VMs, the cloud platform may migrate VMs from the source hypervisor (num_queues is set to 1 by default) to the destination hypervisor (num_queues is set to -smp N) lively. The different num-queues for virtio-scsi-pci devices between the source side and the destination side will result in migration failure due to loading vmstate incorrectly on the destination side. To provide a smooth upgrade solution, introduce the auto-num-queues property for the virtio-scsi-pci device. This allows upper APPs, e.g., libvirt, to recognize the hypervisor's capability of allocating the virtqueues automatically by probing the virtio-scsi-pci.auto-num-queues property. Basing on which, upper APPs can ensure to allocate the same num-queues on the destination side in case of migration failure. Signed-off-by: Hyman Huang(黄勇) --- hw/scsi/vhost-scsi.c| 2 ++ hw/scsi/vhost-user-scsi.c | 2 ++ hw/scsi/virtio-scsi.c | 2 ++ hw/virtio/vhost-scsi-pci.c | 11 +-- hw/virtio/vhost-user-scsi-pci.c | 11 +-- hw/virtio/virtio-scsi-pci.c | 11 +-- include/hw/virtio/virtio-scsi.h | 5 + 7 files changed, 38 insertions(+), 6 deletions(-) diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c index 443f67daa4..78a8929c49 100644 --- a/hw/scsi/vhost-scsi.c +++ b/hw/scsi/vhost-scsi.c @@ -284,6 +284,8 @@ static Property vhost_scsi_properties[] = { DEFINE_PROP_STRING("vhostfd", VirtIOSCSICommon, conf.vhostfd), DEFINE_PROP_STRING("wwpn", VirtIOSCSICommon, conf.wwpn), DEFINE_PROP_UINT32("boot_tpgt", VirtIOSCSICommon, conf.boot_tpgt, 0), +DEFINE_PROP_BOOL("auto_num_queues", VirtIOSCSICommon, auto_num_queues, + true), DEFINE_PROP_UINT32("num_queues", VirtIOSCSICommon, conf.num_queues, VIRTIO_SCSI_AUTO_NUM_QUEUES), DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSICommon, conf.virtqueue_size, diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c index ee99b19e7a..1b837f370a 100644 --- a/hw/scsi/vhost-user-scsi.c +++ b/hw/scsi/vhost-user-scsi.c @@ -161,6 +161,8 @@ static void vhost_user_scsi_unrealize(DeviceState *dev) static Property vhost_user_scsi_properties[] = { DEFINE_PROP_CHR("chardev", VirtIOSCSICommon, conf.chardev), DEFINE_PROP_UINT32("boot_tpgt", VirtIOSCSICommon, conf.boot_tpgt, 0), +DEFINE_PROP_BOOL("auto_num_queues", VirtIOSCSICommon, auto_num_queues, + true), DEFINE_PROP_UINT32("num_queues", VirtIOSCSICommon, conf.num_queues, VIRTIO_SCSI_AUTO_NUM_QUEUES), DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSICommon, conf.virtqueue_size, diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c index 45b95ea070..2ec13032aa 100644 --- a/hw/scsi/virtio-scsi.c +++ b/hw/scsi/virtio-scsi.c @@ -1279,6 +1279,8 @@ static void virtio_scsi_device_unrealize(DeviceState *dev) } static Property virtio_scsi_properties[] = { +DEFINE_PROP_BOOL("auto_num_queues", VirtIOSCSI, parent_obj.auto_num_queues, + true), DEFINE_PROP_UINT32("num_queues", VirtIOSCSI, parent_obj.conf.num_queues, VIRTIO_SCSI_AUTO_NUM_QUEUES), DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSI, diff --git a/hw/virtio/vhost-scsi-pci.c b/hw/virtio/vhost-scsi-pci.c index 08980bc23b..927c155278 100644 --- a/hw/virtio/vhost-scsi-pci.c +++ b/hw/virtio/vhost-scsi-pci.c @@ -51,8 +51,15 @@ static void vhost_scsi_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp) VirtIOSCSIConf *conf = 
&dev->vdev.parent_obj.parent_obj.conf; if (conf->num_queues == VIRTIO_SCSI_AUTO_NUM_QUEUES) { -conf->num_queues = -virtio_pci_optimal_num_queues(VIRTIO_SCSI_VQ_NUM_FIXED); +/* + * Allocate virtqueues automatically only if auto_num_queues + * property set true. + */ +if (dev->vdev.parent_obj.parent_obj.auto_num_queues) +conf->num_queues = +virtio_pci_optimal_num_queues(VIRTIO_SCSI_VQ_NUM_FIXED); +else +conf->num_queues = 1; } if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) { diff --git a/hw/virtio/vhost-user-scsi-pci.c b/hw/virtio/vhost-user-scsi-pci.c index 75882e3cf9..9c521a7f93 100644 --- a/hw/virtio/vhost-user-scsi-pci.c +++ b/hw/virtio/vhost-user-scsi-pci.c @@ -57,8 +57,15 @@ static void vhost_user_scsi_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp) VirtIOSCSIConf *conf = &dev->vdev.parent_obj.parent_obj.conf; if (conf-
[PATCH QEMU 0/3] provide a smooth upgrade solution for multi-queues disk
A 1:1 virtqueue:vCPU mapping implementation for virtio-*-pci disk introduced since qemu >= 5.2.0, which improves IO performance remarkably. To enjoy this feature for exiting running VMs without service interruption, the common solution is to migrate VMs from the lower version of the hypervisor to the upgraded hypervisor, then wait for the next cold reboot of the VM to enable this feature. That's the way "discard" and "write-zeroes" features work. As to multi-queues disk allocation automatically, it's a little different because the destination will allocate queues to match the number of vCPUs automatically by default in the case of live migration, and the VMs on the source side remain 1 queue by default, which results in migration failure due to loading disk VMState incorrectly on the destination side. This issue requires Qemu to provide a hint that shows multi-queues disk allocation is automatically supported, and this allows upper APPs, e.g., libvirt, to recognize the hypervisor's capability of this. And upper APPs can ensure to allocate the same num-queues on the destination side in case of migration failure. To fix the issue, we introduce the auto-num-queues property for virtio-*-pci as a solution, which would be probed by APPs, e.g., libvirt by querying the device properties of QEMU. When launching live migration, libvirt will send the auto-num-queues property as a migration cookie to the destination, and thus the destination knows if the source side supports auto-num-queues. If not, the destination would switch off by building the command line with "auto-num-queues=off" when preparing the incoming VM process. The following patches of libvirt show how it roughly works: https://github.com/newfriday/libvirt/commit/ce2bae2e1a6821afeb80756dc01f3680f525e506 https://github.com/newfriday/libvirt/commit/f546972b009458c88148fe079544db7e9e1f43c3 https://github.com/newfriday/libvirt/commit/5ee19c8646fdb4d87ab8b93f287c20925268ce83 The smooth upgrade solution requires the introduction of the auto-num- queues property on the QEMU side, which is what the patch set does. I'm hoping for comments about the series. Please review, thanks. Yong Hyman Huang(黄勇) (3): virtio-scsi-pci: introduce auto-num-queues property virtio-blk-pci: introduce auto-num-queues property vhost-user-blk-pci: introduce auto-num-queues property hw/block/vhost-user-blk.c | 1 + hw/block/virtio-blk.c | 1 + hw/scsi/vhost-scsi.c | 2 ++ hw/scsi/vhost-user-scsi.c | 2 ++ hw/scsi/virtio-scsi.c | 2 ++ hw/virtio/vhost-scsi-pci.c | 11 +-- hw/virtio/vhost-user-blk-pci.c | 9 - hw/virtio/vhost-user-scsi-pci.c| 11 +-- hw/virtio/virtio-blk-pci.c | 9 - hw/virtio/virtio-scsi-pci.c| 11 +-- include/hw/virtio/vhost-user-blk.h | 5 + include/hw/virtio/virtio-blk.h | 5 + include/hw/virtio/virtio-scsi.h| 5 + 13 files changed, 66 insertions(+), 8 deletions(-) -- 2.38.5
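To make the probing step concrete, a management application could query the device properties over QMP before deciding how to build the destination command line. A minimal sketch (the reply is abbreviated, and the exact flow is an assumption about how libvirt might use it):

-> { "execute": "device-list-properties",
     "arguments": { "typename": "virtio-blk-pci" } }
<- { "return": [ ..., { "name": "auto-num-queues", "type": "bool" }, ... ] }

If the property is missing on the source hypervisor, the destination would be launched with auto-num-queues=off so the device keeps a single request queue.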
[PATCH QEMU 2/3] virtio-blk-pci: introduce auto-num-queues property
From: Hyman Huang(黄勇) Commit "9445e1e15 virtio-blk-pci: default num_queues to -smp N" implment sizing the number of virtio-blk-pci request virtqueues to match the number of vCPUs automatically. Which improves IO preformance remarkably. To enable this feature for the existing VMs, the cloud platform may migrate VMs from the source hypervisor (num_queues is set to 1 by default) to the destination hypervisor (num_queues is set to -smp N) lively. The different num-queues for virtio-blk-pci devices between the source side and the destination side will result in migration failure due to loading vmstate incorrectly on the destination side. To provide a smooth upgrade solution, introduce the auto-num-queues property for the virtio-blk-pci device. This allows upper APPs, e.g., libvirt, to recognize the hypervisor's capability of allocating the virtqueues automatically by probing the virtio-blk-pci.auto-num-queues property. Basing on which, upper APPs can ensure to allocate the same num-queues on the destination side in case of migration failure. Signed-off-by: Hyman Huang(黄勇) --- hw/block/virtio-blk.c | 1 + hw/virtio/virtio-blk-pci.c | 9 - include/hw/virtio/virtio-blk.h | 5 + 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index 39e7f23fab..9e498ca64a 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -1716,6 +1716,7 @@ static Property virtio_blk_properties[] = { #endif DEFINE_PROP_BIT("request-merging", VirtIOBlock, conf.request_merging, 0, true), +DEFINE_PROP_BOOL("auto-num-queues", VirtIOBlock, auto_num_queues, true), DEFINE_PROP_UINT16("num-queues", VirtIOBlock, conf.num_queues, VIRTIO_BLK_AUTO_NUM_QUEUES), DEFINE_PROP_UINT16("queue-size", VirtIOBlock, conf.queue_size, 256), diff --git a/hw/virtio/virtio-blk-pci.c b/hw/virtio/virtio-blk-pci.c index 9743bee965..4b6b4c4933 100644 --- a/hw/virtio/virtio-blk-pci.c +++ b/hw/virtio/virtio-blk-pci.c @@ -54,7 +54,14 @@ static void virtio_blk_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp) VirtIOBlkConf *conf = &dev->vdev.conf; if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) { -conf->num_queues = virtio_pci_optimal_num_queues(0); +/* + * Allocate virtqueues automatically only if auto_num_queues + * property set true. + */ +if (dev->vdev.auto_num_queues) +conf->num_queues = virtio_pci_optimal_num_queues(0); +else +conf->num_queues = 1; } if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) { diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h index dafec432ce..dab6d7c70c 100644 --- a/include/hw/virtio/virtio-blk.h +++ b/include/hw/virtio/virtio-blk.h @@ -65,6 +65,11 @@ struct VirtIOBlock { uint64_t host_features; size_t config_size; BlockRAMRegistrar blk_ram_registrar; +/* + * Set to true if virtqueues allow to be allocated to + * match the number of virtual CPUs automatically. + */ +bool auto_num_queues; }; typedef struct VirtIOBlockReq { -- 2.38.5
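As an illustration of the property introduced above, a destination that must stay compatible with a pre-automatic-sizing source could be started along these lines (the drive id and the rest of the command line are placeholders):

qemu-system-x86_64 ... -device virtio-blk-pci,drive=drive0,auto-num-queues=off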
Re: [PATCH v3 3/3] cpus-common: implement dirty limit on vCPU
在 2021/11/22 19:26, Markus Armbruster 写道: Hyman Huang writes: 在 2021/11/22 17:10, Markus Armbruster 写道: Hyman Huang writes: =E5=9C=A8 2021/11/22 15:35, Markus Armbruster =E5=86=99=E9=81=93: huang...@chinatelecom.cn writes: From: Hyman Huang(=E9=BB=84=E5=8B=87) implement dirtyrate calculation periodically basing on dirty-ring and throttle vCPU until it reachs the quota dirtyrate given by user. introduce qmp commands set-dirty-limit/cancel-dirty-limit to set/cancel dirty limit on vCPU. Please start sentences with a capital letter. Ok,i'll check the syntax problem next version. Signed-off-by: Hyman Huang(黄勇) [...] diff --git a/qapi/misc.json b/qapi/misc.json index 358548a..98e6001 100644 --- a/qapi/misc.json +++ b/qapi/misc.json @@ -527,3 +527,42 @@ 'data': { '*option': 'str' }, 'returns': ['CommandLineOptionInfo'], 'allow-preconfig': true } + +## +# @set-dirty-limit: +# +# This command could be used to cap the vCPU memory load, which is also +# refered as dirtyrate. One should use "calc-dirty-rate" with "dirty-ring" +# and to calculate vCPU dirtyrate and query it with "query-dirty-rate". +# Once getting the vCPU current dirtyrate, "set-dirty-limit" can be used +# to set the upper limit of dirtyrate for the interested vCPU. "dirtyrate" is not a word. Let's spell it "dirty page rate", for consistency with the documentation in migration.json. Ok, sounds good. Regarding "One should use ...": sounds like you have to run calc-dirty-rate with argument @mode set to @dirty-ring before this command. Correct? What happens when you don't? set-dirty-limit fails? You didn't answer this question. set-dirty-limit doesn't do any pre-check about if calc-dirty-rate has executed, so it doesn't fail. Peeking at qmp_set_dirty_limit()... it fails when !kvm_dirty_ring_enabled(). kvm_dirty_ring_enabled() returns true when kvm_state->kvm_dirty_ring_size is non-zero. How can it become non-zero? If we enable dirty-ring with qemu commandline "-accel kvm,dirty-ring-size=xxx",qemu will parse the dirty-ring-size and set it. So we check if dirty-ring is enabled by the kvm_dirty_ring_size. Since only executing calc-dirty-rate with dirty-ring mode can we get the vCPU dirty page rate currently(while the dirty-bitmap only get the vm dirty page rate), "One should use ..." maybe misleading, what i actually want to say is "One should use the dirty-ring mode to calculate the vCPU dirty page rate". I'm still confused on what exactly users must do for the page dirty rate limiting to work as intended, and at least as importantly, what happens when they get it wrong. User can set-dirty-limit unconditionally and the dirtylimit will work. "One should use ..." just emphasize if users want to know which vCPU is in high memory load and want to limit it's dirty page rate, they can use calc-dirty-rate but it is not prerequisite for set-dirty-limit. Umm, I think "One should use ..." explanation make things complicated. I'll reconsider the comment next version. [...]
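To make the prerequisite discussed above concrete, the intended flow would roughly be: start QEMU with the KVM dirty ring enabled, measure the per-vCPU dirty page rate, then pick a limit. A sketch, where the ring size and calc-time values are only examples:

$ qemu-system-x86_64 -accel kvm,dirty-ring-size=4096 ...

-> { "execute": "calc-dirty-rate",
     "arguments": { "calc-time": 1, "mode": "dirty-ring" } }
-> { "execute": "query-dirty-rate" }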
Re: [PATCH v6 3/3] cpus-common: implement dirty page limit on vCPU
在 2021/11/26 15:03, Markus Armbruster 写道: huang...@chinatelecom.cn writes: From: Hyman Huang(黄勇) Implement dirtyrate calculation periodically basing on dirty-ring and throttle vCPU until it reachs the quota dirty page rate given by user. Introduce qmp commands set-dirty-limit/cancel-dirty-limit to set/cancel dirty page limit on vCPU. Signed-off-by: Hyman Huang(黄勇) --- cpus-common.c | 41 + include/hw/core/cpu.h | 9 + qapi/migration.json | 47 +++ softmmu/vl.c | 1 + 4 files changed, 98 insertions(+) diff --git a/cpus-common.c b/cpus-common.c index 6e73d3e..3c156b3 100644 --- a/cpus-common.c +++ b/cpus-common.c @@ -23,6 +23,11 @@ #include "hw/core/cpu.h" #include "sysemu/cpus.h" #include "qemu/lockable.h" +#include "sysemu/dirtylimit.h" +#include "sysemu/cpu-throttle.h" +#include "sysemu/kvm.h" +#include "qapi/error.h" +#include "qapi/qapi-commands-migration.h" static QemuMutex qemu_cpu_list_lock; static QemuCond exclusive_cond; @@ -352,3 +357,39 @@ void process_queued_cpu_work(CPUState *cpu) qemu_mutex_unlock(&cpu->work_mutex); qemu_cond_broadcast(&qemu_work_cond); } + +void qmp_set_dirty_limit(int64_t idx, + uint64_t dirtyrate, + Error **errp) +{ +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +error_setg(errp, "setting a dirty page limit requires support from dirty ring"); Can we phrase the message in a way that gives the user a chance to guess what he needs to do to avoid it? > Perhaps: "setting a dirty page limit requires KVM with accelerator property 'dirty-ring-size' set". Sound good, this make things more clear. +return; +} + +dirtylimit_calc(); +dirtylimit_vcpu(idx, dirtyrate); +} + +void qmp_cancel_dirty_limit(int64_t idx, +Error **errp) +{ Three cases: Case 1: enable is impossible, so nothing to do. Case 2: enable is possible and we actually enabled. Case 3: enable is possible, but we didn't. Nothing to do. +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +error_setg(errp, "no need to cancel a dirty page limit as dirty ring not enabled"); +return; This is case 1. We error out. +} + +if (unlikely(!dirtylimit_cancel_vcpu(idx))) { I don't think unlikely() matters here. +dirtylimit_calc_quit(); +} In case 2, dirtylimit_calc_quit() returns zero if this was the last limit, else non-zero. If the former, we request the thread to stop.I am wildly guessing you misunderstood the function dirtylimit_cancel_vcpu, see below. In case 3, dirtylimit_calc_quit() returns zero, and we do nothing. In this case, we cancel the "dirtylimit thread" in function dirtylimit_cancel_vcpu actually, if it was the last limit thread of the whole vm, dirtylimit_cancel_vcpu return zero and we request the dirtyrate calculation thread to stop, so we call the function dirtylimit_calc_quit , which stop the "dirtyrate calculation thread" internally. Why is case 1 and error, but case 3 isn't? Both could silently do nothing, like case 3 does now. Both could error out, like case 1 does now. A possible common error message: "there is no dirty page limit to cancel". I'd be okay with consistently doing nothing, and with consistently erroring out. 
+} + +void dirtylimit_setup(int max_cpus) +{ +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +return; +} + +dirtylimit_calc_state_init(max_cpus); +dirtylimit_state_init(max_cpus); +} diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h index e948e81..11df012 100644 --- a/include/hw/core/cpu.h +++ b/include/hw/core/cpu.h @@ -881,6 +881,15 @@ void end_exclusive(void); */ void qemu_init_vcpu(CPUState *cpu); +/** + * dirtylimit_setup: + * + * Initializes the global state of dirtylimit calculation and + * dirtylimit itself. This is prepared for vCPU dirtylimit which + * could be triggered during vm lifecycle. + */ +void dirtylimit_setup(int max_cpus); + #define SSTEP_ENABLE 0x1 /* Enable simulated HW single stepping */ #define SSTEP_NOIRQ 0x2 /* Do not use IRQ while single stepping */ #define SSTEP_NOTIMER 0x4 /* Do not Timers while single stepping */ diff --git a/qapi/migration.json b/qapi/migration.json index bbfd48c..2b0fe19 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -1850,6 +1850,53 @@ { 'command': 'query-dirty-rate', 'returns': 'DirtyRateInfo' } ## +# @set-dirty-limit: +# +# Set the upper limit of dirty page rate for a vCPU. +# +# This command could be used to cap the vCPU memory load, which is also "Could be used" suggests there
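For reference, a sketch of how the two commands in this version of the series would be driven over QMP; the argument names are inferred from the C prototypes quoted above and the values are arbitrary:

-> { "execute": "set-dirty-limit", "arguments": { "idx": 0, "dirtyrate": 200 } }
<- { "return": {} }
-> { "execute": "cancel-dirty-limit", "arguments": { "idx": 0 } }
<- { "return": {} }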
Re: [PATCH v16 0/7] support dirty restraint on vCPU
在 2022/3/3 0:53, Dr. David Alan Gilbert 写道: * Dr. David Alan Gilbert (dgilb...@redhat.com) wrote: * huang...@chinatelecom.cn (huang...@chinatelecom.cn) wrote: From: Hyman Huang(黄勇) Queued via my migration/hmp/etc tree Hi, Unfortunately I've had to unqueue this - it breaks the qmp-cmd-test: # starting QEMU: exec ./x86_64-softmmu/qemu-system-x86_64 -qtest unix:/tmp/qtest-142136.sock -qtest-log /dev/fd/2 -chardev socket,path=/tmp/qtest-142136.qmp,id=char0 -mon chardev=char0,mode=control -display none -nodefaults -machine none -accel qtest [I 1646239093.713627] OPENED [R +0.000190] endianness [S +0.000196] OK little {"QMP": {"version": {"qemu": {"micro": 50, "minor": 2, "major": 6}, "package": "v6.2.0-1867-g817703d65a"}, "capabilities": ["oob"]}}{"execute": "qmp_capabilities"} {"return": {}}{"execute": "query-vcpu-dirty-limit"} {"error": {"class": "GenericError", "desc": "dirty page limit not enabled"}}** ERROR:../tests/qtest/qmp-cmd-test.c:84:test_query: assertion failed: (qdict_haskey(resp, "return")) Bail out! ERROR:../tests/qtest/qmp-cmd-test.c:84:test_query: assertion failed: (qdict_haskey(resp, "return")) [I +0.195433] CLOSED Aborted (core dumped) qmp-cmd-test tries to run every query command; so either you need to: a) Add it to the list of skipped command in qmp-cmd-test query-vcpu-dirty-limit sucess only if dirty ring feature enabled. So i prefer to add this command to the list of kipped command. I'll fix it next version and run the qtests before i post the patchset. Thinks Yong b) Make it not actually error when the limit isn't enabled. Dave v16 - rebase on master - drop the unused typedef syntax in [PATCH v15 6/7] - add the Reviewed-by and Acked-by tags by the way v15 - rebase on master - drop the 'init_time_ms' parameter in function vcpu_calculate_dirtyrate - drop the 'setup' field in dirtylimit_state and call dirtylimit_process directly, which makes code cleaner. - code clean in dirtylimit_adjust_throttle - fix miss dirtylimit_state_unlock() in dirtylimit_process and dirtylimit_query_all - add some comment Please review. Thanks, Regards Yong v14 - v13 sent by accident, resend patchset. v13 - rebase on master - passing NULL to kvm_dirty_ring_reap in commit "refactor per-vcpu dirty ring reaping" to keep the logic unchanged. In other word, we still try the best to reap as much PFNs as possible if dirtylimit not in service. - move the cpu list gen id changes into a separate patch. - release the lock before sleep during dirty page rate calculation. - move the dirty ring size fetch logic into a separate patch. - drop the DIRTYLIMIT_LINEAR_ADJUSTMENT_WATERMARK MACRO . - substitute bh with function pointer when implement dirtylimit. - merge the dirtylimit_start/stop into dirtylimit_change. - fix "cpu-index" parameter type with "int" to keep consistency. - fix some syntax error in documents. Please review. Thanks, Yong v12 - rebase on master - add a new commmit to refactor per-vcpu dirty ring reaping, which can resolve the "vcpu miss the chances to sleep" problem - remove the dirtylimit_thread and implemtment throttle in bottom half instead. - let the dirty ring reaper thread keep sleeping when dirtylimit is in service - introduce cpu_list_generation_id to identify cpu_list changing. 
- keep taking the cpu_list_lock during dirty_stat_wait to prevent vcpu plug/unplug when calculating the dirty page rate - move the dirtylimit global initializations out of dirtylimit_set_vcpu and do some code clean - add DIRTYLIMIT_LINEAR_ADJUSTMENT_WATERMARK in case of oscillation when throttling - remove the unmatched count field in dirtylimit_state - add stub to fix build on non-x86 - refactor the documents Thanks Peter and Markus for reviewing the previous versions, please review. Thanks, Yong v11 - rebase on master - add a commit " refactor dirty page rate calculation" so that dirty page rate limit can reuse the calculation logic. - handle the cpu hotplug/unplug case in the dirty page rate calculation logic. - modify the qmp commands according to Markus's advice. - introduce a standalone file dirtylimit.c to implement dirty page rate limit - check if dirty limit in service by dirtylimit_state pointer instead of global variable - introduce dirtylimit_mutex to protect dirtylimit_state - do some code clean and docs See the commit for more detail, thanks Markus and Peter very mush for the code review and give the experienced and insightful advices, most modifications are based on these advices. v10: - rebase on master - make the following modifications on patch [1/3]: 1. Make "dirtylimit-calc" thread joinable and join it af
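Option (a) mentioned above would roughly amount to a hunk like the following in tests/qtest/qmp-cmd-test.c; the surrounding context lines are assumptions about the current shape of its ignore list:

--- a/tests/qtest/qmp-cmd-test.c
+++ b/tests/qtest/qmp-cmd-test.c
@@ static bool query_is_ignored(const char *cmd)
     const char *ignored[] = {
+        /* Success depends on the KVM dirty ring being enabled: */
+        "query-vcpu-dirty-limit",
         NULL
     };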
Re: [PATCH V13 0/7] support dirty restraint on vCPU
"Sent by accident, please ignore, I'll send v14 when ready." 在 2022/2/11 0:06, huang...@chinatelecom.cn 写道: From: Hyman Huang(黄勇) v13 - rebase on master - passing NULL to kvm_dirty_ring_reap in commit "refactor per-vcpu dirty ring reaping" to keep the logic unchanged. In other word, we still try the best to reap as much PFNs as possible if dirtylimit not in service. - move the cpu list gen id changes into a separate patch. - release the lock before sleep during dirty page rate calculation. - move the dirty ring size fetch logic into a separate patch. - drop the DIRTYLIMIT_LINEAR_ADJUSTMENT_WATERMARK MACRO . - substitute bh with function pointer when implement dirtylimit. - merge the dirtylimit_start/stop into dirtylimit_change. - fix "cpu-index" parameter type with "int" to keep consistency. - fix some syntax error in documents. Please review. Thanks, Yong v12 - rebase on master - add a new commmit to refactor per-vcpu dirty ring reaping, which can resolve the "vcpu miss the chances to sleep" problem - remove the dirtylimit_thread and implemtment throttle in bottom half instead. - let the dirty ring reaper thread keep sleeping when dirtylimit is in service - introduce cpu_list_generation_id to identify cpu_list changing. - keep taking the cpu_list_lock during dirty_stat_wait to prevent vcpu plug/unplug when calculating the dirty page rate - move the dirtylimit global initializations out of dirtylimit_set_vcpu and do some code clean - add DIRTYLIMIT_LINEAR_ADJUSTMENT_WATERMARK in case of oscillation when throttling - remove the unmatched count field in dirtylimit_state - add stub to fix build on non-x86 - refactor the documents Thanks Peter and Markus for reviewing the previous versions, please review. Thanks, Yong v11 - rebase on master - add a commit " refactor dirty page rate calculation" so that dirty page rate limit can reuse the calculation logic. - handle the cpu hotplug/unplug case in the dirty page rate calculation logic. - modify the qmp commands according to Markus's advice. - introduce a standalone file dirtylimit.c to implement dirty page rate limit - check if dirty limit in service by dirtylimit_state pointer instead of global variable - introduce dirtylimit_mutex to protect dirtylimit_state - do some code clean and docs See the commit for more detail, thanks Markus and Peter very mush for the code review and give the experienced and insightful advices, most modifications are based on these advices. v10: - rebase on master - make the following modifications on patch [1/3]: 1. Make "dirtylimit-calc" thread joinable and join it after quitting. 2. Add finalize function to free dirtylimit_calc_state 3. Do some code clean work - make the following modifications on patch [2/3]: 1. Remove the original implementation of throttle according to Peter's advice. 2. Introduce a negative feedback system and implement the throttle on all vcpu in one thread named "dirtylimit". 3. Simplify the algo when calculation the throttle_us_per_full: increase/decrease linearly when there exists a wide difference between quota and current dirty page rate, increase/decrease a fixed time slice when the difference is narrow. This makes throttle responds faster and reach the quota smoothly. 4. Introduce a unfit_cnt in algo to make sure throttle really takes effect. 5. Set the max sleep time 99 times more than "ring_full_time_us". 6. Make "dirtylimit" thread joinable and join it after quitting. - make the following modifications on patch [3/3]: 1. Remove the unplug cpu handling logic. 2. 
"query-vcpu-dirty-limit" only return dirtylimit information of vcpus that enable dirtylimit
Re: [PATCH 2/8] qapi/migration: Introduce vcpu-dirty-limit parameters
On 2022/8/18 6:07, Peter Xu wrote: On Sat, Jul 23, 2022 at 03:49:14PM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Introduce the "vcpu-dirty-limit" migration parameter, used to limit the dirty page rate during live migration. "vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are two dirty-limit-related migration parameters, which can be set before and during live migration via qmp migrate-set-parameters. These two parameters are used to help implement the dirty page rate limit algorithm of migration. Signed-off-by: Hyman Huang(黄勇) --- migration/migration.c | 14 ++ monitor/hmp-cmds.c| 8 qapi/migration.json | 18 +++--- 3 files changed, 37 insertions(+), 3 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 7b19f85..ed1a47b 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -117,6 +117,7 @@ #define DEFAULT_MIGRATE_ANNOUNCE_STEP 100 #define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 500 /* ms */ +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT 1 /* MB/s */ This default value also looks a bit weird.. why 1MB/s? Thanks, Indeed, it does seem a bit odd. The reason for setting the default dirty limit to 1MB/s is that we want the dirty limit to keep working until the vCPU dirty page rate drops to 1MB/s once the dirty-limit capability is enabled during migration. That way, migration has the best chance of converging before the vCPU dirty page rate drops to 1MB/s. If we set the default dirty limit to something greater than 1MB/s, the probability of a successful migration may be reduced, and the default behavior of migration is to try its best to succeed.
Re: [PATCH 1/8] qapi/migration: Introduce x-vcpu-dirty-limit-period parameter
On 2022/8/18 6:06, Peter Xu wrote: On Sat, Jul 23, 2022 at 03:49:13PM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Introduce the experimental migration parameter "x-vcpu-dirty-limit-period", which is used to make the dirty page rate calculation period configurable. Signed-off-by: Hyman Huang(黄勇) --- migration/migration.c | 16 monitor/hmp-cmds.c| 8 qapi/migration.json | 31 --- 3 files changed, 48 insertions(+), 7 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index e03f698..7b19f85 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -116,6 +116,8 @@ #define DEFAULT_MIGRATE_ANNOUNCE_ROUNDS 5 #define DEFAULT_MIGRATE_ANNOUNCE_STEP 100 +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 500 /* ms */ Why 500 but not DIRTYLIMIT_CALC_TIME_MS? This is actually an empirical value: the iteration time of migration is normally less than 1000ms. In my tests it varies from 200ms to 500ms. If we assume the iteration time is 500ms and the calculation period is 1000ms, two iterations pass for every dirty page rate that gets calculated. We want the calculation period to be as close to the iteration time as possible, so that each iteration gets one newly calculated dirty page rate to compare against, hopefully making the dirty limit work more precisely. But as the "x-" prefix implies, I'm a little unsure whether this approach works. Is it intended to make this parameter experimental, but the other one not? Since I'm not very sure whether vcpu-dirty-limit-period has an impact on migration (as described above), it is made experimental. As for vcpu-dirty-limit, it does have an impact on migration in theory, so it is not made experimental. From another point of view, though, both parameters are being introduced for the first time and neither has seen a lot of testing, so it would also be reasonable to make both experimental; I don't insist either way. Yong Thanks,
Re: [PATCH 4/8] migration: Implement dirty-limit convergence algo
在 2022/8/18 6:09, Peter Xu 写道: On Sat, Jul 23, 2022 at 03:49:16PM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Implement dirty-limit convergence algo for live migration, which is kind of like auto-converge algo but using dirty-limit instead of cpu throttle to make migration convergent. Signed-off-by: Hyman Huang(黄勇) --- migration/ram.c| 53 +- migration/trace-events | 1 + 2 files changed, 41 insertions(+), 13 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index b94669b..2a5cd23 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -45,6 +45,7 @@ #include "qapi/error.h" #include "qapi/qapi-types-migration.h" #include "qapi/qapi-events-migration.h" +#include "qapi/qapi-commands-migration.h" #include "qapi/qmp/qerror.h" #include "trace.h" #include "exec/ram_addr.h" @@ -57,6 +58,8 @@ #include "qemu/iov.h" #include "multifd.h" #include "sysemu/runstate.h" +#include "sysemu/dirtylimit.h" +#include "sysemu/kvm.h" #include "hw/boards.h" /* for machine_dump_guest_core() */ @@ -1139,6 +1142,21 @@ static void migration_update_rates(RAMState *rs, int64_t end_time) } } +/* + * Enable dirty-limit to throttle down the guest + */ +static void migration_dirty_limit_guest(void) +{ +if (!dirtylimit_in_service()) { +MigrationState *s = migrate_get_current(); +int64_t quota_dirtyrate = s->parameters.vcpu_dirty_limit; + +/* Set quota dirtyrate if dirty limit not in service */ +qmp_set_vcpu_dirty_limit(false, -1, quota_dirtyrate, NULL); +trace_migration_dirty_limit_guest(quota_dirtyrate); +} +} What if migration is cancelled? Do we have logic to stop the dirty limit, or should we? Yes, we should have logic to stop dirty limit, i'll add that. Thanks for your suggestion. :) Yong
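A minimal sketch of that follow-up, reusing the helpers visible in the quoted patch; the function name and the exact call sites (migration completion, failure and cancel paths) are assumptions:

/*
 * Disable the per-vCPU dirty limit once migration finishes or is
 * cancelled, so the guest is not left throttled afterwards.
 */
static void migration_dirty_limit_stop(void)
{
    if (dirtylimit_in_service()) {
        qmp_cancel_vcpu_dirty_limit(false, -1, NULL);
    }
}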
Re: [PATCH RFC 1/4] net: Introduce qmp cmd "query-netdev"
在 2022/11/2 13:42, Jason Wang 写道: On Tue, Nov 1, 2022 at 12:19 AM wrote: From: Hyman Huang(黄勇) For netdev device that can offload virtio-net dataplane to slave, such as vhost-net, vhost-user and vhost-vdpa, exporting it's capability information and acked features would be more friendly for developers. These infomation can be analyzed and compare to slave capability provided by, eg dpdk or other slaves directly, helping to draw conclusions about if vm network interface works normally, if it vm can be migrated to another feature-compatible destination or whatever else. For developers who devote to offload virtio-net dataplane to DPU and make efforts to migrate vm lively from software-based source host to DPU-offload destination host smoothly, virtio-net feature compatibility is an serious issue, exporting the key capability and acked_features of netdev could also help to debug greatly. So we export out the key capabilities of netdev, which may affect the final negotiated virtio-net features, meanwhile, backed-up acked_features also exported, which is used to initialize or restore features negotiated between qemu and vhost slave when starting vhost_dev device. Signed-off-by: Hyman Huang(黄勇) --- net/net.c | 44 +++ qapi/net.json | 66 +++ 2 files changed, 110 insertions(+) diff --git a/net/net.c b/net/net.c index 2db160e..5d11674 100644 --- a/net/net.c +++ b/net/net.c @@ -53,6 +53,7 @@ #include "sysemu/runstate.h" #include "net/colo-compare.h" #include "net/filter.h" +#include "net/vhost-user.h" #include "qapi/string-output-visitor.h" /* Net bridge is currently not supported for W32. */ @@ -1224,6 +1225,49 @@ void qmp_netdev_del(const char *id, Error **errp) } } +static NetDevInfo *query_netdev(NetClientState *nc) +{ +NetDevInfo *info = NULL; + +if (!nc || !nc->is_netdev) { +return NULL; +} + +info = g_malloc0(sizeof(*info)); +info->name = g_strdup(nc->name); +info->type = nc->info->type; +info->ufo = nc->info->has_ufo; +info->vnet_hdr = nc->info->has_vnet_hdr; +info->vnet_hdr_len = nc->info->has_vnet_hdr_len; So all the fields are virtio specific, I wonder if it's better to rename the command as query-vhost or query-virtio? Indeed, i'm also a little struggling about the naming, i prefer Thomas's suggestion: 'x-query-virtio-netdev' and 'info virtio-netdev', since we may add or del some capabilities about the *netdev* , so adding a "x-" prefix seems to reasonable, as to '-netdev' suffix, it implies the *backend*. Thanks, Yong Thanks + +if (nc->info->type == NET_CLIENT_DRIVER_VHOST_USER) { +info->has_acked_features = true; +info->acked_features = vhost_user_get_acked_features(nc); +} + +return info; +} + +NetDevInfoList *qmp_query_netdev(Error **errp) +{ +NetClientState *nc; +NetDevInfo *info = NULL; +NetDevInfoList *head = NULL, **tail = &head; + +QTAILQ_FOREACH(nc, &net_clients, next) { +if (nc->info->type == NET_CLIENT_DRIVER_NIC) { +continue; +} + +info = query_netdev(nc); +if (info) { +QAPI_LIST_APPEND(tail, info); +} +} + +return head; +} + static void netfilter_print_info(Monitor *mon, NetFilterState *nf) { char *str; diff --git a/qapi/net.json b/qapi/net.json index dd088c0..76a6513 100644 --- a/qapi/net.json +++ b/qapi/net.json @@ -631,6 +631,72 @@ 'if': 'CONFIG_VMNET' } } } ## +# @NetDevInfo: +# +# NetDev information. This structure describes a NetDev information, including +# capabilities and negotiated features. +# +# @name: The NetDev name. +# +# @type: Type of NetDev. +# +# @ufo: True if NetDev has ufo capability. +# +# @vnet-hdr: True if NetDev has vnet_hdr. 
+# +# @vnet-hdr-len: True if given length can be assigned to NetDev. +# +# @acked-features: Negotiated features with vhost slave device if device support +# dataplane offload. +# +# Since: 7.1 +## +{'struct': 'NetDevInfo', + 'data': { +'name': 'str', +'type': 'NetClientDriver', +'ufo':'bool', +'vnet-hdr':'bool', +'vnet-hdr-len':'bool', +'*acked-features': 'uint64' } } + +## +# @query-netdev: +# +# Get a list of NetDevInfo for all virtual netdev peer devices. +# +# Returns: a list of @NetDevInfo describing each virtual netdev peer device. +# +# Since: 7.1 +# +# Example: +# +# -> { "execute": "query-netdev" } +# <- { +# "return":[ +# { +# &
Re: [PATCH RFC 1/4] net: Introduce qmp cmd "query-netdev"
在 2022/11/2 14:41, Michael S. Tsirkin 写道: On Wed, Nov 02, 2022 at 01:42:39PM +0800, Jason Wang wrote: On Tue, Nov 1, 2022 at 12:19 AM wrote: From: Hyman Huang(黄勇) For netdev device that can offload virtio-net dataplane to slave, such as vhost-net, vhost-user and vhost-vdpa, exporting it's capability information and acked features would be more friendly for developers. These infomation can be analyzed and compare to slave capability provided by, eg dpdk or other slaves directly, helping to draw conclusions about if vm network interface works normally, if it vm can be migrated to another feature-compatible destination or whatever else. For developers who devote to offload virtio-net dataplane to DPU and make efforts to migrate vm lively from software-based source host to DPU-offload destination host smoothly, virtio-net feature compatibility is an serious issue, exporting the key capability and acked_features of netdev could also help to debug greatly. So we export out the key capabilities of netdev, which may affect the final negotiated virtio-net features, meanwhile, backed-up acked_features also exported, which is used to initialize or restore features negotiated between qemu and vhost slave when starting vhost_dev device. Signed-off-by: Hyman Huang(黄勇) --- net/net.c | 44 +++ qapi/net.json | 66 +++ 2 files changed, 110 insertions(+) diff --git a/net/net.c b/net/net.c index 2db160e..5d11674 100644 --- a/net/net.c +++ b/net/net.c @@ -53,6 +53,7 @@ #include "sysemu/runstate.h" #include "net/colo-compare.h" #include "net/filter.h" +#include "net/vhost-user.h" #include "qapi/string-output-visitor.h" /* Net bridge is currently not supported for W32. */ @@ -1224,6 +1225,49 @@ void qmp_netdev_del(const char *id, Error **errp) } } +static NetDevInfo *query_netdev(NetClientState *nc) +{ +NetDevInfo *info = NULL; + +if (!nc || !nc->is_netdev) { +return NULL; +} + +info = g_malloc0(sizeof(*info)); +info->name = g_strdup(nc->name); +info->type = nc->info->type; +info->ufo = nc->info->has_ufo; +info->vnet_hdr = nc->info->has_vnet_hdr; +info->vnet_hdr_len = nc->info->has_vnet_hdr_len; So all the fields are virtio specific, I wonder if it's better to rename the command as query-vhost or query-virtio? Thanks We have info virtio already. Seems to fit there logically. Ok, it seems that 'x-query-virtio-netdev' is a good option. + +if (nc->info->type == NET_CLIENT_DRIVER_VHOST_USER) { +info->has_acked_features = true; +info->acked_features = vhost_user_get_acked_features(nc); +} + +return info; +} + +NetDevInfoList *qmp_query_netdev(Error **errp) +{ +NetClientState *nc; +NetDevInfo *info = NULL; +NetDevInfoList *head = NULL, **tail = &head; + +QTAILQ_FOREACH(nc, &net_clients, next) { +if (nc->info->type == NET_CLIENT_DRIVER_NIC) { +continue; +} + +info = query_netdev(nc); +if (info) { +QAPI_LIST_APPEND(tail, info); +} +} + +return head; +} + static void netfilter_print_info(Monitor *mon, NetFilterState *nf) { char *str; diff --git a/qapi/net.json b/qapi/net.json index dd088c0..76a6513 100644 --- a/qapi/net.json +++ b/qapi/net.json @@ -631,6 +631,72 @@ 'if': 'CONFIG_VMNET' } } } ## +# @NetDevInfo: +# +# NetDev information. This structure describes a NetDev information, including +# capabilities and negotiated features. +# +# @name: The NetDev name. +# +# @type: Type of NetDev. +# +# @ufo: True if NetDev has ufo capability. +# +# @vnet-hdr: True if NetDev has vnet_hdr. +# +# @vnet-hdr-len: True if given length can be assigned to NetDev. 
+#
+# @acked-features: Negotiated features with vhost slave device if device support
+#                  dataplane offload.
+#
+# Since: 7.1
+##
+{'struct': 'NetDevInfo',
+ 'data': {
+    'name': 'str',
+    'type': 'NetClientDriver',
+    'ufo': 'bool',
+    'vnet-hdr': 'bool',
+    'vnet-hdr-len': 'bool',
+    '*acked-features': 'uint64' } }
+
+##
+# @query-netdev:
+#
+# Get a list of NetDevInfo for all virtual netdev peer devices.
+#
+# Returns: a list of @NetDevInfo describing each virtual netdev peer device.
+#
+# Since: 7.1
+#
+# Example:
+#
+# -> { "execute": "query-netdev" }
+# <- {
+#      "return":[
+#         {
+#            "name":"hostnet0",
+#            "type":"vhost-user",
+#            "ufo":true,
+#            "vnet-hdr"
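For illustration only: once @acked-features is exposed by query-netdev, a client can compare the value against the feature bit numbers defined by the virtio specification. The snippet below is a hypothetical sketch, not part of this patch; dump_acked_features() is a made-up helper, and the bit macros mirror the standard virtio/virtio-net definitions.

    #include <stdint.h>
    #include <stdio.h>

    /* Feature bit numbers as defined by the virtio specification. */
    #define VIRTIO_NET_F_CSUM       0
    #define VIRTIO_NET_F_MRG_RXBUF  15
    #define VIRTIO_F_VERSION_1      32

    /* Print a few interesting bits of an exported acked-features value. */
    static void dump_acked_features(uint64_t features)
    {
        printf("csum:      %d\n", !!(features & (1ULL << VIRTIO_NET_F_CSUM)));
        printf("mrg_rxbuf: %d\n", !!(features & (1ULL << VIRTIO_NET_F_MRG_RXBUF)));
        printf("version_1: %d\n", !!(features & (1ULL << VIRTIO_F_VERSION_1)));
    }

    int main(void)
    {
        dump_acked_features(0x7060a782ULL);   /* example value seen in this thread */
        return 0;
    }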
Re: [PATCH RFC 1/4] net: Introduce qmp cmd "query-netdev"
在 2022/11/2 15:10, Thomas Huth 写道: On 02/11/2022 06.42, Jason Wang wrote: On Tue, Nov 1, 2022 at 12:19 AM wrote: From: Hyman Huang(黄勇) For netdev device that can offload virtio-net dataplane to slave, such as vhost-net, vhost-user and vhost-vdpa, exporting it's capability information and acked features would be more friendly for developers. These infomation can be analyzed and compare to slave capability provided by, eg dpdk or other slaves directly, helping to draw conclusions about if vm network interface works normally, if it vm can be migrated to another feature-compatible destination or whatever else. For developers who devote to offload virtio-net dataplane to DPU and make efforts to migrate vm lively from software-based source host to DPU-offload destination host smoothly, virtio-net feature compatibility is an serious issue, exporting the key capability and acked_features of netdev could also help to debug greatly. So we export out the key capabilities of netdev, which may affect the final negotiated virtio-net features, meanwhile, backed-up acked_features also exported, which is used to initialize or restore features negotiated between qemu and vhost slave when starting vhost_dev device. Signed-off-by: Hyman Huang(黄勇) --- net/net.c | 44 +++ qapi/net.json | 66 +++ 2 files changed, 110 insertions(+) diff --git a/net/net.c b/net/net.c index 2db160e..5d11674 100644 --- a/net/net.c +++ b/net/net.c @@ -53,6 +53,7 @@ #include "sysemu/runstate.h" #include "net/colo-compare.h" #include "net/filter.h" +#include "net/vhost-user.h" #include "qapi/string-output-visitor.h" /* Net bridge is currently not supported for W32. */ @@ -1224,6 +1225,49 @@ void qmp_netdev_del(const char *id, Error **errp) } } +static NetDevInfo *query_netdev(NetClientState *nc) +{ + NetDevInfo *info = NULL; + + if (!nc || !nc->is_netdev) { + return NULL; + } + + info = g_malloc0(sizeof(*info)); + info->name = g_strdup(nc->name); + info->type = nc->info->type; + info->ufo = nc->info->has_ufo; + info->vnet_hdr = nc->info->has_vnet_hdr; + info->vnet_hdr_len = nc->info->has_vnet_hdr_len; So all the fields are virtio specific, I wonder if it's better to rename the command as query-vhost or query-virtio? And add a "x-" prefix (and a "-netdev" suffix) as long as we don't feel confident about this yet? "x-query-virtio-netdev" ? Agree with that, thanks for the comment. Yong. Thomas
Re: [PATCH v3 2/2] vhost-net: Fix the virtio features negotiation flaw
The previous reply email has an text format error, please ignore and 在 2022/11/11 3:00, Michael S. Tsirkin 写道: On Sun, Oct 30, 2022 at 09:52:39PM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Save the acked_features once it be configured by guest virtio driver so it can't miss any features. Note that this patch also change the features saving logic in chr_closed_bh, which originally backup features no matter whether the features are 0 or not, but now do it only if features aren't 0. I'm not sure how is this change even related to what we are trying to do (fix a bug). Explain here? For this series, all we want to do is to making sure acked_features in the NetVhostUserState is credible and uptodate in the scenario that virtio features negotiation and openvswitch service restart happens simultaneously. To make sure that happens, we save the acked_features to NetVhostUserState right after guest setting virtio-net features. Assume that we do not save acked_features to NetVhostUserState just as it is, the acked_features in NetVhostUserState has chance to be assigned only when chr_closed_bh/vhost_user_stop happen. Note that openvswitch service stop will cause chr_closed_bh happens and acked_features in vhost_dev will be stored into NetVhostUserState, if the acked_features in vhost_dev are out-of-date(may be updated in the next few seconds), so does the acked_features in NetVhostUserState after doing the assignment, this is the bug. Let's refine the scenario and derive the bug: qemu threaddpdk | | vhost_net_init() | | | assign acked_features in vhost_dev | with 0x4000 | | openvswitch.service stop chr_closed_bh| | | assign acked_features in | NetVhostUserState with 0x4000 | | | virtio_net_set_features()| | | assign acked_features in vhost_dev | with 0x7060a782 | | openvswitch.service start | | vhost_user_start | | | assign acked_features in vhost_dev | with 0x4000 | | | As the step shows, if we do not keep the acked_features in NetVhostUserState up-to-date, the acked_features in vhost_dev may be reloaded with the wrong value(eg, 0x4000) when vhost_user_start happens. As to reset acked_features to 0 if needed, Qemu always keeping the backup acked_features up-to-date, and save the acked_features after virtio_net_set_features in advance, including reset acked_features to 0, so the behavior is also covered. Signed-off-by: Hyman Huang(黄勇) Signed-off-by: Guoyi Tu --- hw/net/vhost_net.c | 9 + hw/net/virtio-net.c | 5 + include/net/vhost_net.h | 2 ++ net/vhost-user.c| 6 +- 4 files changed, 17 insertions(+), 5 deletions(-) diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index d28f8b9..2bffc27 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -141,6 +141,15 @@ uint64_t vhost_net_get_acked_features(VHostNetState *net) return net->dev.acked_features; } +void vhost_net_save_acked_features(NetClientState *nc) +{ +if (nc->info->type != NET_CLIENT_DRIVER_VHOST_USER) { +return; +} + +vhost_user_save_acked_features(nc, false); +} + static int vhost_net_get_fd(NetClientState *backend) { switch (backend->info->type) { diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index e9f696b..5f8f788 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -924,6 +924,11 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint64_t features) continue; } vhost_net_ack_features(get_vhost_net(nc->peer), features); +/* + * keep acked_features in NetVhostUserState up-to-date so it + * can't miss any features configured by guest virtio driver. 
+ */ +vhost_net_save_acked_features(nc->peer); } if (virtio_has_feature(features, VIRTIO_NET_F_CTRL_VLAN)) { diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h index 387e913..3a5579b 100644 --- a/include/net/vhost_net.h +++ b/include/net/vhost_net.h @@ -46,6 +46,
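For readers following along, here is a plausible sketch of the helper used by the hunks above (it is introduced by patch 1/2, which is not quoted in this thread). The body is inferred from the commit message and the call sites: copy the vhost_dev's acked_features into NetVhostUserState, but only when they are non-zero, so a stale zero value cannot clobber a previously negotiated set. The name is simplified and the real helper takes a second boolean argument that is not modeled here.

    /* Sketch only; inferred from the callers shown in this thread. */
    static void save_acked_features_sketch(NetClientState *nc)
    {
        NetVhostUserState *s = DO_UPCAST(NetVhostUserState, nc, nc);

        if (s->vhost_net) {
            uint64_t features = vhost_net_get_acked_features(s->vhost_net);

            /* Only back up a real (non-zero) negotiation result. */
            if (features) {
                s->acked_features = features;
            }
        }
    }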
Re: [PATCH RESEND v3 00/10] migration: introduce dirtylimit capability
Ping ? 在 2022/12/4 1:09, huang...@chinatelecom.cn 写道: From: Hyman Huang(黄勇) v3(resend): - fix the syntax error of the topic. v3: This version make some modifications inspired by Peter and Markus as following: 1. Do the code clean up in [PATCH v2 02/11] suggested by Markus 2. Replace the [PATCH v2 03/11] with a much simpler patch posted by Peter to fix the following bug: https://bugzilla.redhat.com/show_bug.cgi?id=2124756 3. Fix the error path of migrate_params_check in [PATCH v2 04/11] pointed out by Markus. Enrich the commit message to explain why x-vcpu-dirty-limit-period an unstable parameter. 4. Refactor the dirty-limit convergence algo in [PATCH v2 07/11] suggested by Peter: a. apply blk_mig_bulk_active check before enable dirty-limit b. drop the unhelpful check function before enable dirty-limit c. change the migration_cancel logic, just cancel dirty-limit only if dirty-limit capability turned on. d. abstract a code clean commit [PATCH v3 07/10] to adjust the check order before enable auto-converge 5. Change the name of observing indexes during dirty-limit live migration to make them more easy-understanding. Use the maximum throttle time of vpus as "dirty-limit-throttle-time-per-full" 6. Fix some grammatical and spelling errors pointed out by Markus and enrich the document about the dirty-limit live migration observing indexes "dirty-limit-ring-full-time" and "dirty-limit-throttle-time-per-full" 7. Change the default value of x-vcpu-dirty-limit-period to 1000ms, which is optimal value pointed out in cover letter in that testing environment. 8. Drop the 2 guestperf test commits [PATCH v2 10/11], [PATCH v2 11/11] and post them with a standalone series in the future. Thanks Peter and Markus sincerely for the passionate, efficient and careful comments and suggestions. Please review. Yong v2: This version make a little bit modifications comparing with version 1 as following: 1. fix the overflow issue reported by Peter Maydell 2. add parameter check for hmp "set_vcpu_dirty_limit" command 3. fix the racing issue between dirty ring reaper thread and Qemu main thread. 4. add migrate parameter check for x-vcpu-dirty-limit-period and vcpu-dirty-limit. 5. add the logic to forbid hmp/qmp commands set_vcpu_dirty_limit, cancel_vcpu_dirty_limit during dirty-limit live migration when implement dirty-limit convergence algo. 6. add capability check to ensure auto-converge and dirty-limit are mutually exclusive. 7. pre-check if kvm dirty ring size is configured before setting dirty-limit migrate parameter A more comprehensive test was done comparing with version 1. The following are test environment: - a. Host hardware info: CPU: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz CPU(s): 64 On-line CPU(s) list: 0-63 Thread(s) per core: 2 Core(s) per socket: 16 Socket(s): 2 NUMA node(s):2 NUMA node0 CPU(s): 0-15,32-47 NUMA node1 CPU(s): 16-31,48-63 Memory: Hynix 503Gi Interface: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09) Speed: 1000Mb/s b. Host software info: OS: ctyunos release 2 Kernel: 4.19.90-2102.2.0.0066.ctl2.x86_64 Libvirt baseline version: libvirt-6.9.0 Qemu baseline version: qemu-5.0 c. vm scale CPU: 4 Memory: 4G - All the supplementary test data shown as follows are basing on above test environment. 
In version 1, we posted test data from unixbench as follows:

$ taskset -c 8-15 ./Run -i 2 -c 8 {unixbench test item}

host cpu: Intel(R) Xeon(R) Platinum 8378A
host interface speed: 1000Mb/s

|---------------------+--------+------------+---------------|
| UnixBench test item | Normal | Dirtylimit | Auto-converge |
|---------------------+--------+------------+---------------|
| dhry2reg            | 32800  | 32786      | 25292         |
| whetstone-double    | 10326  | 10315      | 9847          |
| pipe                | 15442  | 15271      | 14506         |
| context1            | 7260   | 6235       | 4514          |
| spawn               | 3663   | 3317       | 3249          |
| syscall             | 4669   | 4667       | 3841          |
|---------------------+--------+------------+---------------|

In version 2, we posted supplementary test data that does not use taskset, making the scenario more general:

$ ./Run

per-vcpu data:

|---------------------+--------+------------+---------------|
| UnixBench test item | Normal | Dirtylimit | Auto-converge |
|---------------------+--------+------------+---------------|
| dhry2reg            | 2991   | 2902       | 1722          |
| whetstone-double    | 1018   | 1006       | 627
Re: [RFC PATCH 2/2] tests: Add dirty page rate limit test
在 2022/3/10 16:29, Peter Xu 写道: On Wed, Mar 09, 2022 at 11:58:01PM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Add dirty page rate limit test if kernel support dirty ring, create a standalone file to implement the test case. Thanks for writting this test case. Signed-off-by: Hyman Huang(黄勇) --- tests/qtest/dirtylimit-test.c | 288 ++ tests/qtest/meson.build | 2 + 2 files changed, 290 insertions(+) create mode 100644 tests/qtest/dirtylimit-test.c diff --git a/tests/qtest/dirtylimit-test.c b/tests/qtest/dirtylimit-test.c new file mode 100644 index 000..07eac2c --- /dev/null +++ b/tests/qtest/dirtylimit-test.c @@ -0,0 +1,288 @@ +/* + * QTest testcase for Dirty Page Rate Limit + * + * Copyright (c) 2022 CHINA TELECOM CO.,LTD. + * + * Authors: + * Hyman Huang(黄勇) + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "libqos/libqtest.h" +#include "qapi/qmp/qdict.h" +#include "qapi/qmp/qlist.h" +#include "qapi/qobject-input-visitor.h" +#include "qapi/qobject-output-visitor.h" + +#include "migration-helpers.h" +#include "tests/migration/i386/a-b-bootblock.h" + +/* + * Dirtylimit stop working if dirty page rate error + * value less than DIRTYLIMIT_TOLERANCE_RANGE + */ +#define DIRTYLIMIT_TOLERANCE_RANGE 25 /* MB/s */ + +static QDict *qmp_command(QTestState *who, const char *command, ...) +{ +va_list ap; +QDict *resp, *ret; + +va_start(ap, command); +resp = qtest_vqmp(who, command, ap); +va_end(ap); + +g_assert(!qdict_haskey(resp, "error")); +g_assert(qdict_haskey(resp, "return")); + +ret = qdict_get_qdict(resp, "return"); +qobject_ref(ret); +qobject_unref(resp); + +return ret; +} + +static void calc_dirty_rate(QTestState *who, uint64_t calc_time) +{ +qobject_unref(qmp_command(who, +"{ 'execute': 'calc-dirty-rate'," +"'arguments': { " +"'calc-time': %ld," +"'mode': 'dirty-ring' }}", +calc_time)); +} + +static QDict *query_dirty_rate(QTestState *who) +{ +return qmp_command(who, "{ 'execute': 'query-dirty-rate' }"); +} + +static void dirtylimit_set_all(QTestState *who, uint64_t dirtyrate) +{ +qobject_unref(qmp_command(who, +"{ 'execute': 'set-vcpu-dirty-limit'," +"'arguments': { " +"'dirty-rate': %ld } }", +dirtyrate)); +} + +static void cancel_vcpu_dirty_limit(QTestState *who) +{ +qobject_unref(qmp_command(who, +"{ 'execute': 'cancel-vcpu-dirty-limit' }")); +} + +static QDict *query_vcpu_dirty_limit(QTestState *who) +{ +QDict *rsp; + +rsp = qtest_qmp(who, "{ 'execute': 'query-vcpu-dirty-limit' }"); +g_assert(!qdict_haskey(rsp, "error")); +g_assert(qdict_haskey(rsp, "return")); + +return rsp; +} + +static int64_t get_dirty_rate(QTestState *who) +{ +QDict *rsp_return; +gchar *status; +QList *rates; +const QListEntry *entry; +QDict *rate; +int64_t dirtyrate; + +rsp_return = query_dirty_rate(who); +g_assert(rsp_return); + +status = g_strdup(qdict_get_str(rsp_return, "status")); +g_assert(status); +g_assert_cmpstr(status, ==, "measured"); + +rates = qdict_get_qlist(rsp_return, "vcpu-dirty-rate"); +g_assert(rates && !qlist_empty(rates)); + +entry = qlist_first(rates); +g_assert(entry); + +rate = qobject_to(QDict, qlist_entry_obj(entry)); +g_assert(rate); + +dirtyrate = qdict_get_try_int(rate, "dirty-rate", -1); + +qobject_unref(rsp_return); +return dirtyrate; +} + +static int64_t get_limit_rate(QTestState *who) +{ +QDict *rsp_return; +QList *rates; +const QListEntry *entry; +QDict *rate; +int64_t dirtyrate; + +rsp_return = query_vcpu_dirty_limit(who); 
+g_assert(rsp_return); + +rates = qdict_get_qlist(rsp_return, "return"); +g_assert(rates && !qlist_empty(rates)); + +entry = qlist_first(rates); +g_assert(entry); + +rate = qobject_to(QDict, qlist_entry_obj(entry)); +g_assert(rate); + +dirtyrate = qdict_get_try_int(rate, "limit-rate", -1); + +qobject_unref(rsp_return); +return dirtyrate; +} + +static QTestState *start_vm(void) +{ +QTestState *vm = NULL; +g_autofree gchar *cmd = NULL; +const char *arch = qtest_get_ar
Re: [PATCH v21 8/9] migration-test: Export migration-test util funtions
On 2022/3/30 2:54, Peter Xu wrote: On Wed, Mar 16, 2022 at 09:07:20PM +0800, huang...@chinatelecom.cn wrote: +void wait_for_serial(const char *tmpfs, const char *side) Passing tmpfs around over and over (even if it's mostly a constant) doesn't sound appealing to me. I hope there's still a way that we could avoid doing that when splitting the file. Or, how about you just add a new test into migration-test? After all, all migration tests (including auto-converge) are there, and I don't strongly feel that we need a separate file urgently. OK, I separated the file just for code readability. I don't insist on it if we think it's OK to add the dirtylimit test to migration-test. Thanks for the comment. :) Yong
Re: [PATCH v21 9/9] tests: Add dirty page rate limit test
在 2022/3/30 3:54, Peter Xu 写道: On Wed, Mar 16, 2022 at 09:07:21PM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Add dirty page rate limit test if kernel support dirty ring, create a standalone file to implement the test case. The following qmp commands are covered by this test case: "calc-dirty-rate", "query-dirty-rate", "set-vcpu-dirty-limit", "cancel-vcpu-dirty-limit" and "query-vcpu-dirty-limit". Signed-off-by: Hyman Huang(黄勇) --- tests/qtest/dirtylimit-test.c | 327 ++ tests/qtest/meson.build | 2 + 2 files changed, 329 insertions(+) create mode 100644 tests/qtest/dirtylimit-test.c diff --git a/tests/qtest/dirtylimit-test.c b/tests/qtest/dirtylimit-test.c new file mode 100644 index 000..b8d9960 --- /dev/null +++ b/tests/qtest/dirtylimit-test.c @@ -0,0 +1,327 @@ +/* + * QTest testcase for Dirty Page Rate Limit + * + * Copyright (c) 2022 CHINA TELECOM CO.,LTD. + * + * Authors: + * Hyman Huang(黄勇) + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "libqos/libqtest.h" +#include "qapi/qmp/qdict.h" +#include "qapi/qmp/qlist.h" +#include "qapi/qobject-input-visitor.h" +#include "qapi/qobject-output-visitor.h" + +#include "migration-helpers.h" +#include "tests/migration/i386/a-b-bootblock.h" + +/* + * Dirtylimit stop working if dirty page rate error + * value less than DIRTYLIMIT_TOLERANCE_RANGE + */ +#define DIRTYLIMIT_TOLERANCE_RANGE 25 /* MB/s */ + +static const char *tmpfs; + +static QDict *qmp_command(QTestState *who, const char *command, ...) +{ +va_list ap; +QDict *resp, *ret; + +va_start(ap, command); +resp = qtest_vqmp(who, command, ap); +va_end(ap); + +g_assert(!qdict_haskey(resp, "error")); +g_assert(qdict_haskey(resp, "return")); + +ret = qdict_get_qdict(resp, "return"); +qobject_ref(ret); +qobject_unref(resp); + +return ret; +} + +static void calc_dirty_rate(QTestState *who, uint64_t calc_time) +{ +qobject_unref(qmp_command(who, + "{ 'execute': 'calc-dirty-rate'," + "'arguments': { " + "'calc-time': %ld," + "'mode': 'dirty-ring' }}", + calc_time)); +} + +static QDict *query_dirty_rate(QTestState *who) +{ +return qmp_command(who, "{ 'execute': 'query-dirty-rate' }"); +} + +static void dirtylimit_set_all(QTestState *who, uint64_t dirtyrate) +{ +qobject_unref(qmp_command(who, + "{ 'execute': 'set-vcpu-dirty-limit'," + "'arguments': { " + "'dirty-rate': %ld } }", + dirtyrate)); +} + +static void cancel_vcpu_dirty_limit(QTestState *who) +{ +qobject_unref(qmp_command(who, + "{ 'execute': 'cancel-vcpu-dirty-limit' }")); +} + +static QDict *query_vcpu_dirty_limit(QTestState *who) +{ +QDict *rsp; + +rsp = qtest_qmp(who, "{ 'execute': 'query-vcpu-dirty-limit' }"); +g_assert(!qdict_haskey(rsp, "error")); +g_assert(qdict_haskey(rsp, "return")); + +return rsp; +} + +static bool calc_dirtyrate_ready(QTestState *who) +{ +QDict *rsp_return; +gchar *status; + +rsp_return = query_dirty_rate(who); +g_assert(rsp_return); + +status = g_strdup(qdict_get_str(rsp_return, "status")); +g_assert(status); + +return g_strcmp0(status, "measuring"); +} + +static void wait_for_calc_dirtyrate_complete(QTestState *who, + int64_t calc_time) +{ +int max_try_count = 200; +usleep(calc_time); + +while (!calc_dirtyrate_ready(who) && max_try_count--) { +usleep(1000); +} + +/* + * Set the timeout with 200 ms(max_try_count * 1000us), + * if dirtyrate measurement not complete, test failed. + */ +g_assert_cmpint(max_try_count, !=, 0); 200ms might be still too challenging for busy systems? 
How about make it in seconds (e.g. 10 seconds)? +} + +static int64_t get_dirty_rate(QTestState *who) +{ +QDict *rsp_return; +gchar *status; +QList *rates; +const QListEntry *entry; +QDict *rate; +int64_t dirtyrate; + +rsp_return = query_dirty_rate(who); +g_assert(rsp_return); + +status = g_strdup(qdict_get_str(rsp_return, "status")); +g_assert(status); +g_assert_cmpstr(status, ==, "measured"); + +rates = qdict_get_qlist(rsp_return, "vcpu-dirty-ra
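To illustrate Peter's suggestion of a seconds-scale timeout, one possible reworking of the polling helper is sketched below. This is a sketch only: it assumes calc_time is given in seconds, and the ten-second retry budget and one-second poll interval are arbitrary choices, not values from the series.

    /* Wait up to ~10 seconds after the measurement window for the
     * dirty rate calculation to finish, polling once per second. */
    static void wait_for_calc_dirtyrate_complete(QTestState *who,
                                                 int64_t calc_time)
    {
        int timeout_s = 10;

        /* Let the requested measurement window elapse first. */
        g_usleep(calc_time * 1000 * 1000);

        while (timeout_s-- && !calc_dirtyrate_ready(who)) {
            g_usleep(1000 * 1000);
        }

        /* Fail the test if the measurement still hasn't completed. */
        g_assert_true(calc_dirtyrate_ready(who));
    }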
Re: [PATCH v1 0/8] migration: introduce dirtylimit capability
在 2022/9/7 4:46, Peter Xu 写道: On Fri, Sep 02, 2022 at 01:22:28AM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) v1: - make parameter vcpu-dirty-limit experimental - switch dirty limit off when cancel migrate - add cancel logic in migration test Please review, thanks, Yong Abstract This series added a new migration capability called "dirtylimit". It can be enabled when dirty ring is enabled, and it'll improve the vCPU performance during the process of migration. It is based on the previous patchset: https://lore.kernel.org/qemu-devel/cover.1656177590.git.huang...@chinatelecom.cn/ As mentioned in patchset "support dirty restraint on vCPU", dirtylimit way of migration can make the read-process not be penalized. This series wires up the vcpu dirty limit and wrappers as dirtylimit capability of migration. I introduce two parameters vcpu-dirtylimit-period and vcpu-dirtylimit to implement the setup of dirtylimit during live migration. To validate the implementation, i tested a 32 vCPU vm live migration with such model: Only dirty vcpu0, vcpu1 with heavy memory workoad and leave the rest vcpus untouched, running unixbench on the vpcu8-vcpu15 by setup the cpu affinity as the following command: taskset -c 8-15 ./Run -i 2 -c 8 {unixbench test item} The following are results: host cpu: Intel(R) Xeon(R) Platinum 8378A host interface speed: 1000Mb/s |-+++---| | UnixBench test item | Normal | Dirtylimit | Auto-converge | |-+++---| | dhry2reg| 32800 | 32786 | 25292 | | whetstone-double| 10326 | 10315 | 9847 | | pipe| 15442 | 15271 | 14506 | | context1| 7260 | 6235 | 4514 | | spawn | 3663 | 3317 | 3249 | | syscall | 4669 | 4667 | 3841 | |-+++---| From the data above we can draw a conclusion that vcpus that do not dirty memory in vm are almost unaffected during the dirtylimit migration, but the auto converge way does. I also tested the total time of dirtylimit migration with variable dirty memory size in vm. senario 1: host cpu: Intel(R) Xeon(R) Platinum 8378A host interface speed: 1000Mb/s |---++---| | dirty memory size(MB) | Dirtylimit(ms) | Auto-converge(ms) | |---++---| | 60| 2014 | 2131 | | 70| 5381 | 12590 | | 90| 6037 | 33545 | | 110 | 7660 | [*] | |---++---| [*]: This case means migration is not convergent. senario 2: host cpu: Intel(R) Xeon(R) CPU E5-2650 host interface speed: 1Mb/s |---++---| | dirty memory size(MB) | Dirtylimit(ms) | Auto-converge(ms) | |---++---| | 1600 | 15842 | 27548 | | 2000 | 19026 | 38447 | | 2400 | 19897 | 46381 | | 2800 | 22338 | 57149 | |---++---| Above data shows that dirtylimit way of migration can also reduce the total time of migration and it achieves convergence more easily in some case. In addition to implement dirtylimit capability itself, this series add 3 tests for migration, aiming at playing around for developer simply: 1. qtest for dirty limit migration 2. support dirty ring way of migration for guestperf tool 3. support dirty limit migration for guestperf tool Yong, I should have asked even earlier - just curious whether you have started using this in production systems? It's definitely not required for any patchset to be merged, but it'll be very useful (and supportive) information to have if there's proper testing beds applied already. Actually no when i posted the cover letter above, the qemu version in our production is much lower than upstream, and the patchset is different from here, i built test mode and did the test on my own in the first time. 
But this feature is currently being tested by another professional test team, so once the report is ready, I'll post it. :) Thanks,
Re: [PATCH v1 4/8] migration: Implement dirty-limit convergence algo
在 2022/9/7 4:37, Peter Xu 写道: On Fri, Sep 02, 2022 at 01:22:32AM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Implement dirty-limit convergence algo for live migration, which is kind of like auto-converge algo but using dirty-limit instead of cpu throttle to make migration convergent. Signed-off-by: Hyman Huang(黄勇) --- migration/migration.c | 1 + migration/ram.c| 53 +- migration/trace-events | 1 + 3 files changed, 42 insertions(+), 13 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index d117bb4..64696de 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -239,6 +239,7 @@ void migration_cancel(const Error *error) if (error) { migrate_set_error(current_migration, error); } +qmp_cancel_vcpu_dirty_limit(false, -1, NULL); migrate_fd_cancel(current_migration); } diff --git a/migration/ram.c b/migration/ram.c index dc1de9d..cc19c5e 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -45,6 +45,7 @@ #include "qapi/error.h" #include "qapi/qapi-types-migration.h" #include "qapi/qapi-events-migration.h" +#include "qapi/qapi-commands-migration.h" #include "qapi/qmp/qerror.h" #include "trace.h" #include "exec/ram_addr.h" @@ -57,6 +58,8 @@ #include "qemu/iov.h" #include "multifd.h" #include "sysemu/runstate.h" +#include "sysemu/dirtylimit.h" +#include "sysemu/kvm.h" #include "hw/boards.h" /* for machine_dump_guest_core() */ @@ -1139,6 +1142,21 @@ static void migration_update_rates(RAMState *rs, int64_t end_time) } } +/* + * Enable dirty-limit to throttle down the guest + */ +static void migration_dirty_limit_guest(void) +{ +if (!dirtylimit_in_service()) { +MigrationState *s = migrate_get_current(); +int64_t quota_dirtyrate = s->parameters.x_vcpu_dirty_limit; + +/* Set quota dirtyrate if dirty limit not in service */ +qmp_set_vcpu_dirty_limit(false, -1, quota_dirtyrate, NULL); +trace_migration_dirty_limit_guest(quota_dirtyrate); +} +} + static void migration_trigger_throttle(RAMState *rs) { MigrationState *s = migrate_get_current(); @@ -1148,22 +1166,31 @@ static void migration_trigger_throttle(RAMState *rs) uint64_t bytes_dirty_period = rs->num_dirty_pages_period * TARGET_PAGE_SIZE; uint64_t bytes_dirty_threshold = bytes_xfer_period * threshold / 100; -/* During block migration the auto-converge logic incorrectly detects - * that ram migration makes no progress. Avoid this by disabling the - * throttling logic during the bulk phase of block migration. */ -if (migrate_auto_converge() && !blk_mig_bulk_active()) { -/* The following detection logic can be refined later. For now: - Check to see if the ratio between dirtied bytes and the approx. - amount of bytes that just got transferred since the last time - we were in this routine reaches the threshold. If that happens - twice, start or increase throttling. */ - -if ((bytes_dirty_period > bytes_dirty_threshold) && -(++rs->dirty_rate_high_cnt >= 2)) { +/* + * The following detection logic can be refined later. For now: + * Check to see if the ratio between dirtied bytes and the approx. + * amount of bytes that just got transferred since the last time + * we were in this routine reaches the threshold. If that happens + * twice, start or increase throttling. + */ + +if ((bytes_dirty_period > bytes_dirty_threshold) && +(++rs->dirty_rate_high_cnt >= 2)) { +rs->dirty_rate_high_cnt = 0; +/* + * During block migration the auto-converge logic incorrectly detects + * that ram migration makes no progress. 
Avoid this by disabling the + * throttling logic during the bulk phase of block migration + */ + +if (migrate_auto_converge() && !blk_mig_bulk_active()) { trace_migration_throttle(); -rs->dirty_rate_high_cnt = 0; mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); +} else if (migrate_dirty_limit() && + kvm_dirty_ring_enabled() && + migration_is_active(s)) { +migration_dirty_limit_guest(); We'll call this multiple time, but only the 1st call will make sense, right? Yes. Can we call it once somewhere? E.g. at the start of migration?It make sense indeed, if dirtylimit run once migration start, the behavior of dirtylimit migration would be kind of different from auto-converge, i mean, dirtylimit will make guest write vCPU slow no matter if dirty_rate_high_cnt ex
Re: [PATCH v22 0/8] support dirty restraint on vCPU
Ping. Hi, David and Peter, how do you think this patchset? Is it suitable for queueing ? or is there still something need to be done ? Yong 在 2022/4/1 1:49, huang...@chinatelecom.cn 写道: From: Hyman Huang(黄勇) This is v22 of dirtylimit series. The following is the history of the patchset, since v22 kind of different from the original version, i made abstracts of changelog: RFC and v1: https://lore.kernel.org/qemu-devel/cover.1637214721.git.huang...@chinatelecom.cn/ v2: https://lore.kernel.org/qemu-devel/cover.1637256224.git.huang...@chinatelecom.cn/ v1->v2 changelog: - rename some function and variables. refactor the original algo of dirtylimit. Thanks for the comments given by Juan Quintela. v3: https://lore.kernel.org/qemu-devel/cover.1637403404.git.huang...@chinatelecom.cn/ v4: https://lore.kernel.org/qemu-devel/cover.1637653303.git.huang...@chinatelecom.cn/ v5: https://lore.kernel.org/qemu-devel/cover.1637759139.git.huang...@chinatelecom.cn/ v6: https://lore.kernel.org/qemu-devel/cover.1637856472.git.huang...@chinatelecom.cn/ v7: https://lore.kernel.org/qemu-devel/cover.1638202004.git.huang...@chinatelecom.cn/ v2->v7 changelog: - refactor the docs, annotation and fix bugs of the original algo of dirtylimit. Thanks for the review given by Markus Armbruster. v8: https://lore.kernel.org/qemu-devel/cover.1638463260.git.huang...@chinatelecom.cn/ v9: https://lore.kernel.org/qemu-devel/cover.1638495274.git.huang...@chinatelecom.cn/ v10: https://lore.kernel.org/qemu-devel/cover.1639479557.git.huang...@chinatelecom.cn/ v7->v10 changelog: - introduce a simpler but more efficient algo of dirtylimit inspired by Peter Xu. - keep polishing the annotation suggested by Markus Armbruster. v11: https://lore.kernel.org/qemu-devel/cover.1641315745.git.huang...@chinatelecom.cn/ v12: https://lore.kernel.org/qemu-devel/cover.1642774952.git.huang...@chinatelecom.cn/ v13: https://lore.kernel.org/qemu-devel/cover.1644506963.git.huang...@chinatelecom.cn/ v10->v13 changelog: - handle the hotplug/unplug scenario. - refactor the new algo, split the commit and make the code more clean. v14: https://lore.kernel.org/qemu-devel/cover.1644509582.git.huang...@chinatelecom.cn/ v13->v14 changelog: - sent by accident. v15: https://lore.kernel.org/qemu-devel/cover.1644976045.git.huang...@chinatelecom.cn/ v16: https://lore.kernel.org/qemu-devel/cover.1645067452.git.huang...@chinatelecom.cn/ v17: https://lore.kernel.org/qemu-devel/cover.1646243252.git.huang...@chinatelecom.cn/ v14->v17 changelog: - do some code clean and fix test bug reported by Dr. David Alan Gilbert. v18: https://lore.kernel.org/qemu-devel/cover.1646247968.git.huang...@chinatelecom.cn/ v19: https://lore.kernel.org/qemu-devel/cover.1647390160.git.huang...@chinatelecom.cn/ v20: https://lore.kernel.org/qemu-devel/cover.1647396907.git.huang...@chinatelecom.cn/ v21: https://lore.kernel.org/qemu-devel/cover.1647435820.git.huang...@chinatelecom.cn/ v17->v21 changelog: - add qtest, fix bug and do code clean. v21->v22 changelog: - move the vcpu dirty limit test into migration-test and do some modification suggested by Peter. Please review. Yong. Abstract This patchset introduce a mechanism to impose dirty restraint on vCPU, aiming to keep the vCPU running in a certain dirtyrate given by user. dirty restraint on vCPU maybe an alternative method to implement convergence logic for live migration, which could improve guest memory performance during migration compared with traditional method in theory. 
For the current live migration implementation, the convergence logic throttles all vCPUs of the VM, which has some side effects. -'read processes' on vCPU will be unnecessarily penalized - throttle increase percentage step by step, which seems struggling to find the optimal throttle percentage when dirtyrate is high. - hard to predict the remaining time of migration if the throttling percentage reachs 99% to a certain extent, the dirty restraint machnism can fix these effects by throttling at vCPU granularity during migration. the implementation is rather straightforward, we calculate vCPU dirtyrate via the Dirty Ring mechanism periodically as the commit 0e21bf246 "implement dirty-ring dirtyrate calculation" does, for vCPU that be specified to impose dirty restraint, we throttle it periodically as the auto-converge does, once after throttling, we compare the quota dirtyrate with current dirtyrate, if current dirtyrate is not under the quota, increase the throttling percentage until current dirtyrate is under the quota. this patchset is the basis of implmenting a new auto-converge method for live migration, we introduce two qmp commands for impose/cancel the dirty restraint on specified vCPU, so it also can be an independent api to supply the upper app such as libvirt, which can use it to implement the convergence logic during live migration, supplemented with the qmp 'ca
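As a rough illustration of the per-vCPU control loop described above (compare the measured dirty rate against the quota, then step the throttle up or down), here is a minimal sketch. It assumes two hypothetical helpers, vcpu_dirty_rate() and vcpu_set_throttle_pct(), standing in for the series' dirty-ring sampling and per-vCPU throttle; the step size and period are placeholders, and the series' real algorithm is more elaborate.

    #include <stdint.h>
    #include <unistd.h>

    /* Hypothetical helpers, not part of the series. */
    extern uint64_t vcpu_dirty_rate(int cpu_index);                 /* MB/s */
    extern void vcpu_set_throttle_pct(int cpu_index, unsigned int pct);

    #define THROTTLE_STEP_PCT  5
    #define THROTTLE_MAX_PCT   99
    #define CHECK_PERIOD_S     1

    static void dirty_restraint_loop(int cpu_index, uint64_t quota_rate)
    {
        unsigned int pct = 0;

        for (;;) {
            uint64_t current = vcpu_dirty_rate(cpu_index);

            if (current > quota_rate) {
                pct += THROTTLE_STEP_PCT;          /* over quota: throttle harder */
                if (pct > THROTTLE_MAX_PCT) {
                    pct = THROTTLE_MAX_PCT;
                }
            } else if (current < quota_rate && pct > 0) {
                pct = pct >= THROTTLE_STEP_PCT ?   /* under quota: relax */
                      pct - THROTTLE_STEP_PCT : 0;
            }

            vcpu_set_throttle_pct(cpu_index, pct);
            sleep(CHECK_PERIOD_S);
        }
    }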
Re: [PATCH v9 3/3] cpus-common: implement dirty page limit on vCPU
在 2021/12/6 16:28, Peter Xu 写道: On Sat, Dec 04, 2021 at 08:00:19PM +0800, Hyman Huang wrote: 在 2021/12/3 20:34, Markus Armbruster 写道: huang...@chinatelecom.cn writes: From: Hyman Huang(黄勇) Implement dirtyrate calculation periodically basing on dirty-ring and throttle vCPU until it reachs the quota dirty page rate given by user. Introduce qmp commands "vcpu-dirty-limit", "query-vcpu-dirty-limit" to enable, disable, query dirty page limit for virtual CPU. Meanwhile, introduce corresponding hmp commands "vcpu_dirty_limit", "info vcpu_dirty_limit" so developers can play with them easier. Signed-off-by: Hyman Huang(黄勇) [...] I see you replaced the interface. Back to square one... diff --git a/qapi/migration.json b/qapi/migration.json index 3da8fdf..dc15b3f 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -1872,6 +1872,54 @@ 'current-rate': 'int64' } } ## +# @vcpu-dirty-limit: +# +# Set or cancel the upper limit of dirty page rate for a virtual CPU. +# +# Requires KVM with accelerator property "dirty-ring-size" set. +# A virtual CPU's dirty page rate is a measure of its memory load. +# To observe dirty page rates, use @calc-dirty-rate. +# +# @cpu-index: index of virtual CPU. +# +# @enable: true to enable, false to disable. +# +# @dirty-rate: upper limit of dirty page rate for virtual CPU. +# +# Since: 7.0 +# +# Example: +# {"execute": "vcpu-dirty-limit"} +#"arguments": { "cpu-index": 0, +# "enable": true, +# "dirty-rate": 200 } } +# +## +{ 'command': 'vcpu-dirty-limit', + 'data': { 'cpu-index': 'int', +'enable': 'bool', +'dirty-rate': 'uint64'} } When @enable is false, @dirty-rate makes no sense and is not used (I checked the code), but users have to specify it anyway. That's bad design. Better: drop @enable, make @dirty-rate optional, present means enable, absent means disable. Uh, if we drop @enable, enabling dirty limit should be like: vcpu-dirty-limit cpu-index=0 dirty-rate=1000 And disabling dirty limit like: vcpu-dirty-limit cpu-index=0 For disabling case, there is no hint of disabling in command "vcpu-dirty-limit". How about make @dirty-rate optional, when enable dirty limit, it should present, ignored otherwise? Sounds good, I think we can make both "enable" and "dirty-rate" optional. To turn it on we either use "enable=true,dirty-rate=XXX" or "dirty-rate=XXX" > To turn it off we use "enable=false". Indeed, this make things more convenient. >> + +## +# @query-vcpu-dirty-limit: +# +# Returns information about the virtual CPU dirty limit status. +# +# @cpu-index: index of the virtual CPU to query, if not specified, all +# virtual CPUs will be queried. +# +# Since: 7.0 +# +# Example: +# {"execute": "query-vcpu-dirty-limit"} +#"arguments": { "cpu-index": 0 } } +# +## +{ 'command': 'query-vcpu-dirty-limit', + 'data': { '*cpu-index': 'int' }, + 'returns': [ 'DirtyLimitInfo' ] } Why would anyone ever want to specify @cpu-index? Output isn't that large even if you have a few hundred CPUs. Let's keep things simple and drop the parameter. Ok, this make things simple. I found that it'll be challenging for any human being to identify "whether he/she has turned throttle off for all vcpus".. I think that could be useful when we finally decided to cancel current migration. That's question, how about adding an optional argument "global" and making "cpu-index", "enable", "dirty-rate" all optional in "vcpu-dirty-limit", keeping the "cpu-index" and "global" options mutually exclusive? 
{ 'command': 'vcpu-dirty-limit', 'data': { '*cpu-index': 'int', '*global': 'bool' '*enable': 'bool', '*dirty-rate': 'uint64'} } In the case of enabling all vcpu throttle: Either use "global=true,enable=true,dirty-rate=XXX" or "global=true,dirty-rate=XXX" In the case of disabling all vcpu throttle: use "global=true,enable=false,dirty-rate=XXX" In other case, we pass the same option just like what we did for specified vcpu throttle before. I thought about adding a "global=on/off" flag, but instead can we just return the vcpu info for the ones that enabled the per-vcpu throttling? For anyone who wants to read all vcpu dirty information he/she can use calc-dirty-rate. Ok, I'll pick up this advice next version. Thanks,
Re: [PATCH v9 3/3] cpus-common: implement dirty page limit on vCPU
在 2021/12/6 16:36, Peter Xu 写道: On Fri, Dec 03, 2021 at 09:39:47AM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Implement dirtyrate calculation periodically basing on dirty-ring and throttle vCPU until it reachs the quota dirty page rate given by user. Introduce qmp commands "vcpu-dirty-limit", "query-vcpu-dirty-limit" to enable, disable, query dirty page limit for virtual CPU. Meanwhile, introduce corresponding hmp commands "vcpu_dirty_limit", "info vcpu_dirty_limit" so developers can play with them easier. Thanks. Even if I start to use qmp-shell more recently but still in some case where only hmp is specified this could still be handy. +void qmp_vcpu_dirty_limit(int64_t cpu_index, + bool enable, + uint64_t dirty_rate, + Error **errp) +{ +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +error_setg(errp, "dirty page limit feature requires KVM with" + " accelerator property 'dirty-ring-size' set'"); +return; +} + +if (!dirtylimit_is_vcpu_index_valid(cpu_index)) { +error_setg(errp, "cpu index out of range"); +return; +} + +if (enable) { +dirtylimit_calc(); +dirtylimit_vcpu(cpu_index, dirty_rate); +} else { +if (!dirtylimit_enabled(cpu_index)) { +error_setg(errp, "dirty page limit for CPU %ld not set", + cpu_index); +return; +} We don't need to fail the user for enable=off even if vcpu is not throttled, imho. Ok. + +if (!dirtylimit_cancel_vcpu(cpu_index)) { +dirtylimit_calc_quit(); +} +} +} [...] +struct DirtyLimitInfoList *qmp_query_vcpu_dirty_limit(bool has_cpu_index, + int64_t cpu_index, + Error **errp) +{ +DirtyLimitInfo *info = NULL; +DirtyLimitInfoList *head = NULL, **tail = &head; + +if (has_cpu_index && +(!dirtylimit_is_vcpu_index_valid(cpu_index))) { +error_setg(errp, "cpu index out of range"); +return NULL; +} + +if (has_cpu_index) { +info = dirtylimit_query_vcpu(cpu_index); +QAPI_LIST_APPEND(tail, info); +} else { +CPUState *cpu; +CPU_FOREACH(cpu) { +if (!cpu->unplug) { +info = dirtylimit_query_vcpu(cpu->cpu_index); +QAPI_LIST_APPEND(tail, info); +} There're special handling for unplug in a few places. Could you explain why? E.g. if the vcpu is unplugged then dirty rate is zero, then it seems fine to even keep it there? The dirty limit logic only allow plugged vcpu to be enabled throttle, so that the "dirtylimit-{cpu-index}" thread don't need to be forked and we can save the overhead. So in query logic we just filter the unplugged vcpu. Another reason is that i thought it could make user confused when we return the unplugged vcpu dirtylimit info. Uh, in most time of vm lifecycle, hotplugging vcpu may never happen. +} +} + +return head; +}
Re: [PATCH v9 3/3] cpus-common: implement dirty page limit on vCPU
在 2021/12/6 16:39, Peter Xu 写道: On Fri, Dec 03, 2021 at 09:39:47AM +0800, huang...@chinatelecom.cn wrote: +void dirtylimit_setup(int max_cpus) +{ +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +return; +} + +dirtylimit_calc_state_init(max_cpus); +dirtylimit_state_init(max_cpus); +} [...] diff --git a/softmmu/vl.c b/softmmu/vl.c index 620a1f1..0f83ce3 100644 --- a/softmmu/vl.c +++ b/softmmu/vl.c @@ -3777,5 +3777,6 @@ void qemu_init(int argc, char **argv, char **envp) qemu_init_displays(); accel_setup_post(current_machine); os_setup_post(); +dirtylimit_setup(current_machine->smp.max_cpus); resume_mux_open(); Can we do the init only when someone enables it? We could also do proper free() for the structs when it's globally turned off. Yes, i'll try this next version
Re: [PATCH v9 1/3] migration/dirtyrate: implement vCPU dirtyrate calculation periodically
On 2021/12/6 18:18, Peter Xu wrote: On Fri, Dec 03, 2021 at 09:39:45AM +0800, huang...@chinatelecom.cn wrote:

+static void dirtylimit_calc_func(void)
+{
+    CPUState *cpu;
+    DirtyPageRecord *dirty_pages;
+    int64_t start_time, end_time, calc_time;
+    DirtyRateVcpu rate;
+    int i = 0;
+
+    dirty_pages = g_malloc0(sizeof(*dirty_pages) *
+                            dirtylimit_calc_state->data.nvcpu);
+
+    CPU_FOREACH(cpu) {
+        record_dirtypages(dirty_pages, cpu, true);
+    }
+
+    start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    g_usleep(DIRTYLIMIT_CALC_TIME_MS * 1000);
+    end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    calc_time = end_time - start_time;
+
+    qemu_mutex_lock_iothread();
+    memory_global_dirty_log_sync();
+    qemu_mutex_unlock_iothread();
+
+    CPU_FOREACH(cpu) {
+        record_dirtypages(dirty_pages, cpu, false);
+    }
+
+    for (i = 0; i < dirtylimit_calc_state->data.nvcpu; i++) {
+        uint64_t increased_dirty_pages =
+            dirty_pages[i].end_pages - dirty_pages[i].start_pages;
+        uint64_t memory_size_MB =
+            (increased_dirty_pages * TARGET_PAGE_SIZE) >> 20;
+        int64_t dirtyrate = (memory_size_MB * 1000) / calc_time;
+
+        rate.id = i;
+        rate.dirty_rate = dirtyrate;
+        dirtylimit_calc_state->data.rates[i] = rate;
+
+        trace_dirtyrate_do_calculate_vcpu(i,
+            dirtylimit_calc_state->data.rates[i].dirty_rate);
+    }
+}

This looks very much like the calc-dirty-rate code already. I think adding a new reason (GLOBAL_DIRTY_LIMIT) is fine. Ok. However still, any chance to merge the code? I'm not sure about merging, but I'll try it. :)
Re: [PATCH v9 3/3] cpus-common: implement dirty page limit on vCPU
On 2021/12/7 10:24, Peter Xu wrote: On Mon, Dec 06, 2021 at 10:56:00PM +0800, Hyman wrote: I found that it'll be challenging for any human being to identify "whether he/she has turned throttle off for all vcpus". I think that could be useful when we finally decide to cancel the current migration. That's a good question; how about adding an optional argument "global" and making "cpu-index", "enable" and "dirty-rate" all optional in "vcpu-dirty-limit", keeping the "cpu-index" and "global" options mutually exclusive?

{ 'command': 'vcpu-dirty-limit',
  'data': { '*cpu-index': 'int',
            '*global': 'bool',
            '*enable': 'bool',
            '*dirty-rate': 'uint64'} }

In the case of enabling throttle for all vCPUs: use either "global=true,enable=true,dirty-rate=XXX" or "global=true,dirty-rate=XXX". In the case of disabling throttle for all vCPUs: use "global=true,enable=false,dirty-rate=XXX". In the other cases, we pass the same options just as we did for per-vCPU throttle before. Could we merge "cpu-index" and "global" somehow? They're mutually exclusive. For example, merge them into one "vcpu" parameter: "vcpu=all" means global, "vcpu=1" means vCPU 1. But then we'll need to make it a string. OK, sounds good.
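For illustration, a tiny sketch of how a merged "vcpu" string argument could be parsed ("all" selects every vCPU, otherwise a numeric index). The helper name and signature are made up here, not taken from the series.

    #include <errno.h>
    #include <stdbool.h>
    #include <stdlib.h>
    #include <string.h>

    /* Returns true on success; *all is set for "vcpu=all", otherwise
     * *index holds the parsed vCPU index. */
    static bool parse_vcpu_arg(const char *arg, bool *all, long *index)
    {
        char *end;
        long val;

        if (!strcmp(arg, "all")) {
            *all = true;
            return true;
        }

        errno = 0;
        val = strtol(arg, &end, 10);
        if (errno || end == arg || *end != '\0' || val < 0) {
            return false;   /* neither "all" nor a valid non-negative index */
        }

        *all = false;
        *index = val;
        return true;
    }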
Re: [PATCH v9 3/3] cpus-common: implement dirty page limit on vCPU
在 2021/12/7 10:57, Peter Xu 写道: On Mon, Dec 06, 2021 at 11:19:21PM +0800, Hyman wrote: +if (has_cpu_index) { +info = dirtylimit_query_vcpu(cpu_index); +QAPI_LIST_APPEND(tail, info); +} else { +CPUState *cpu; +CPU_FOREACH(cpu) { +if (!cpu->unplug) { +info = dirtylimit_query_vcpu(cpu->cpu_index); +QAPI_LIST_APPEND(tail, info); +} There're special handling for unplug in a few places. Could you explain why? E.g. if the vcpu is unplugged then dirty rate is zero, then it seems fine to even keep it there? The dirty limit logic only allow plugged vcpu to be enabled throttle, so that the "dirtylimit-{cpu-index}" thread don't need to be forked and we can save the overhead. So in query logic we just filter the unplugged vcpu. I've commented similarly in the other thread - please consider not using NVCPU threads only for vcpu throttling, irrelevant of vcpu hot plug/unplug. Per-vcpu throttle is totally not a cpu intensive workload, 1 thread should be enough globally, imho. A guest with hundreds of vcpus are becoming more common, we shouldn't waste OS thread resources just for this. Ok, i'll try this out next version Another reason is that i thought it could make user confused when we return the unplugged vcpu dirtylimit info. Uh, in most time of vm lifecycle, hotplugging vcpu may never happen. I just think if plug/unplug does not affect the throttle logic then we should treat them the same, it avoids unnecessary special care on those vcpus too. Indeed, i'm struggling too :), i'll remove the plug/unplug logic the next version.
Re: [PATCH v9 2/3] cpu-throttle: implement vCPU throttle
在 2021/12/6 18:10, Peter Xu 写道: On Fri, Dec 03, 2021 at 09:39:46AM +0800, huang...@chinatelecom.cn wrote: +static uint64_t dirtylimit_pct(unsigned int last_pct, + uint64_t quota, + uint64_t current) +{ +uint64_t limit_pct = 0; +RestrainPolicy policy; +bool mitigate = (quota > current) ? true : false; + +if (mitigate && ((current == 0) || +(last_pct <= DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE))) { +return 0; +} + +policy = dirtylimit_policy(last_pct, quota, current); +switch (policy) { +case RESTRAIN_SLIGHT: +/* [90, 99] */ +if (mitigate) { +limit_pct = +last_pct - DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE; +} else { +limit_pct = +last_pct + DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE; + +limit_pct = MIN(limit_pct, CPU_THROTTLE_PCT_MAX); +} + break; +case RESTRAIN_HEAVY: +/* [75, 90) */ +if (mitigate) { +limit_pct = +last_pct - DIRTYLIMIT_THROTTLE_HEAVY_STEP_SIZE; +} else { +limit_pct = +last_pct + DIRTYLIMIT_THROTTLE_HEAVY_STEP_SIZE; + +limit_pct = MIN(limit_pct, +DIRTYLIMIT_THROTTLE_SLIGHT_WATERMARK); +} + break; +case RESTRAIN_RATIO: +/* [0, 75) */ +if (mitigate) { +if (last_pct <= (((quota - current) * 100 / quota))) { +limit_pct = 0; +} else { +limit_pct = last_pct - +((quota - current) * 100 / quota); +limit_pct = MAX(limit_pct, CPU_THROTTLE_PCT_MIN); +} +} else { +limit_pct = last_pct + +((current - quota) * 100 / current); + +limit_pct = MIN(limit_pct, +DIRTYLIMIT_THROTTLE_HEAVY_WATERMARK); +} + break; +case RESTRAIN_KEEP: +default: + limit_pct = last_pct; + break; +} + +return limit_pct; +} + +static void *dirtylimit_thread(void *opaque) +{ +int cpu_index = *(int *)opaque; +uint64_t quota_dirtyrate, current_dirtyrate; +unsigned int last_pct = 0; +unsigned int pct = 0; + +rcu_register_thread(); + +quota_dirtyrate = dirtylimit_quota(cpu_index); +current_dirtyrate = dirtylimit_current(cpu_index); + +pct = dirtylimit_init_pct(quota_dirtyrate, current_dirtyrate); + +do { +trace_dirtylimit_impose(cpu_index, +quota_dirtyrate, current_dirtyrate, pct); + +last_pct = pct; +if (pct == 0) { +sleep(DIRTYLIMIT_CALC_PERIOD_TIME_S); +} else { +dirtylimit_check(cpu_index, pct); +} + +quota_dirtyrate = dirtylimit_quota(cpu_index); +current_dirtyrate = dirtylimit_current(cpu_index); + +pct = dirtylimit_pct(last_pct, quota_dirtyrate, current_dirtyrate); So what I had in mind is we can start with an extremely simple version of negative feedback system. Say, firstly each vcpu will have a simple number to sleep for some interval (this is ugly code, but just show what I meant..): === diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index eecd8031cf..c320fd190f 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -2932,6 +2932,8 @@ int kvm_cpu_exec(CPUState *cpu) trace_kvm_dirty_ring_full(cpu->cpu_index); qemu_mutex_lock_iothread(); kvm_dirty_ring_reap(kvm_state); +if (dirtylimit_enabled(cpu->cpu_index) && cpu->throttle_us_per_full) +usleep(cpu->throttle_us_per_full); qemu_mutex_unlock_iothread(); ret = 0; break; === I think this will have finer granularity when throttle (for 4096 ring size, that's per-16MB operation) than current way where we inject per-vcpu async task to sleep, like auto-converge. 
Then we have the "black box" to tune this value with below input/output: - Input: dirty rate information, same as current algo - Output: increase/decrease of per-vcpu throttle_us_per_full above, and that's all We can do the sampling per-second, then we keep doing it: we can have 1 thread doing per-second task collecting dirty rate information for all the vcpus, then tune that throttle_us_per_full for each of them. The simplest linear algorithm would be as simple as (for each vcpu): if (quota < current) throttle_us_per_full += SOMETHING; if (throttle_us_per_full > MAX) throttle_us_per_full = MAX; else throttle_us_per_full -= SOMETHING; if (throttle_us_per_full < 0) throttle_us_per_full = 0; I think your algorithm is fine, but thoroughly review every single bit of it in one shot will be challenging, and it's also hard to prove every bit of the algorithm is helpful, as there're a lot of hand-made macros and state changes. I actually tes
Re: [PATCH v9 2/3] cpu-throttle: implement vCPU throttle
在 2021/12/8 23:36, Hyman 写道: 在 2021/12/6 18:10, Peter Xu 写道: On Fri, Dec 03, 2021 at 09:39:46AM +0800, huang...@chinatelecom.cn wrote: +static uint64_t dirtylimit_pct(unsigned int last_pct, + uint64_t quota, + uint64_t current) +{ + uint64_t limit_pct = 0; + RestrainPolicy policy; + bool mitigate = (quota > current) ? true : false; + + if (mitigate && ((current == 0) || + (last_pct <= DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE))) { + return 0; + } + + policy = dirtylimit_policy(last_pct, quota, current); + switch (policy) { + case RESTRAIN_SLIGHT: + /* [90, 99] */ + if (mitigate) { + limit_pct = + last_pct - DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE; + } else { + limit_pct = + last_pct + DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE; + + limit_pct = MIN(limit_pct, CPU_THROTTLE_PCT_MAX); + } + break; + case RESTRAIN_HEAVY: + /* [75, 90) */ + if (mitigate) { + limit_pct = + last_pct - DIRTYLIMIT_THROTTLE_HEAVY_STEP_SIZE; + } else { + limit_pct = + last_pct + DIRTYLIMIT_THROTTLE_HEAVY_STEP_SIZE; + + limit_pct = MIN(limit_pct, + DIRTYLIMIT_THROTTLE_SLIGHT_WATERMARK); + } + break; + case RESTRAIN_RATIO: + /* [0, 75) */ + if (mitigate) { + if (last_pct <= (((quota - current) * 100 / quota))) { + limit_pct = 0; + } else { + limit_pct = last_pct - + ((quota - current) * 100 / quota); + limit_pct = MAX(limit_pct, CPU_THROTTLE_PCT_MIN); + } + } else { + limit_pct = last_pct + + ((current - quota) * 100 / current); + + limit_pct = MIN(limit_pct, + DIRTYLIMIT_THROTTLE_HEAVY_WATERMARK); + } + break; + case RESTRAIN_KEEP: + default: + limit_pct = last_pct; + break; + } + + return limit_pct; +} + +static void *dirtylimit_thread(void *opaque) +{ + int cpu_index = *(int *)opaque; + uint64_t quota_dirtyrate, current_dirtyrate; + unsigned int last_pct = 0; + unsigned int pct = 0; + + rcu_register_thread(); + + quota_dirtyrate = dirtylimit_quota(cpu_index); + current_dirtyrate = dirtylimit_current(cpu_index); + + pct = dirtylimit_init_pct(quota_dirtyrate, current_dirtyrate); + + do { + trace_dirtylimit_impose(cpu_index, + quota_dirtyrate, current_dirtyrate, pct); + + last_pct = pct; + if (pct == 0) { + sleep(DIRTYLIMIT_CALC_PERIOD_TIME_S); + } else { + dirtylimit_check(cpu_index, pct); + } + + quota_dirtyrate = dirtylimit_quota(cpu_index); + current_dirtyrate = dirtylimit_current(cpu_index); + + pct = dirtylimit_pct(last_pct, quota_dirtyrate, current_dirtyrate); So what I had in mind is we can start with an extremely simple version of negative feedback system. Say, firstly each vcpu will have a simple number to sleep for some interval (this is ugly code, but just show what I meant..): === diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index eecd8031cf..c320fd190f 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -2932,6 +2932,8 @@ int kvm_cpu_exec(CPUState *cpu) trace_kvm_dirty_ring_full(cpu->cpu_index); qemu_mutex_lock_iothread(); kvm_dirty_ring_reap(kvm_state); + if (dirtylimit_enabled(cpu->cpu_index) && cpu->throttle_us_per_full) + usleep(cpu->throttle_us_per_full); qemu_mutex_unlock_iothread(); ret = 0; break; === I think this will have finer granularity when throttle (for 4096 ring size, that's per-16MB operation) than current way where we inject per-vcpu async task to sleep, like auto-converge. 
Then we have the "black box" to tune this value with below input/output: - Input: dirty rate information, same as current algo - Output: increase/decrease of per-vcpu throttle_us_per_full above, and that's all We can do the sampling per-second, then we keep doing it: we can have 1 thread doing per-second task collecting dirty rate information for all the vcpus, then tune that throttle_us_per_full for each of them. The simplest linear algorithm would be as simple as (for each vcpu): if (quota < current) throttle_us_per_full += SOMETHING; if (throttle_us_per_full > MAX) throttle_us_per_full = MAX; else throttle_us_per_full -= SOMETHING; if (throttle_us_per_full < 0) throttle_us_per_full = 0; I think your algorithm is fine, but thoroughly review every single bit of it in one shot will be challenging, and it's also har
[PATCH QEMU v3 1/3] tests: Add migration dirty-limit capability test
From: Hyman Huang(黄勇) Add migration dirty-limit capability test if kernel support dirty ring. Migration dirty-limit capability introduce dirty limit capability, two parameters: x-vcpu-dirty-limit-period and vcpu-dirty-limit are introduced to implement the live migration with dirty limit. The test case does the following things: 1. start src, dst vm and enable dirty-limit capability 2. start migrate and set cancel it to check if dirty limit stop working. 3. restart dst vm 4. start migrate and enable dirty-limit capability 5. check if migration satisfy the convergence condition during pre-switchover phase. Note that this test case involves many passes, so it runs in slow mode only. Signed-off-by: Hyman Huang(黄勇) Message-Id: <169073391195.19893.61067537833811032...@git.sr.ht> --- tests/qtest/migration-test.c | 164 +++ 1 file changed, 164 insertions(+) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index 62d3f37021..0be2d17c42 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -2739,6 +2739,166 @@ static void test_vcpu_dirty_limit(void) dirtylimit_stop_vm(vm); } +static void migrate_dirty_limit_wait_showup(QTestState *from, +const int64_t period, +const int64_t value) +{ +/* Enable dirty limit capability */ +migrate_set_capability(from, "dirty-limit", true); + +/* Set dirty limit parameters */ +migrate_set_parameter_int(from, "x-vcpu-dirty-limit-period", period); +migrate_set_parameter_int(from, "vcpu-dirty-limit", value); + +/* Make sure migrate can't converge */ +migrate_ensure_non_converge(from); + +/* To check limit rate after precopy */ +migrate_set_capability(from, "pause-before-switchover", true); + +/* Wait for the serial output from the source */ +wait_for_serial("src_serial"); +} + +/* + * This test does: + * source destination + * start vm + * start incoming vm + * migrate + * wait dirty limit to begin + * cancel migrate + * cancellation check + * restart incoming vm + * migrate + * wait dirty limit to begin + * wait pre-switchover event + * convergence condition check + * + * And see if dirty limit migration works correctly. + * This test case involves many passes, so it runs in slow mode only. + */ +static void test_migrate_dirty_limit(void) +{ +g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); +QTestState *from, *to; +int64_t remaining; +uint64_t throttle_us_per_full; +/* + * We want the test to be stable and as fast as possible. + * E.g., with 1Gb/s bandwith migration may pass without dirty limit, + * so we need to decrease a bandwidth. + */ +const int64_t dirtylimit_period = 1000, dirtylimit_value = 50; +const int64_t max_bandwidth = 4; /* ~400Mb/s */ +const int64_t downtime_limit = 250; /* 250ms */ +/* + * We migrate through unix-socket (> 500Mb/s). + * Thus, expected migration speed ~= bandwidth limit (< 500Mb/s). 
+ * So, we can predict expected_threshold + */ +const int64_t expected_threshold = max_bandwidth * downtime_limit / 1000; +int max_try_count = 10; +MigrateCommon args = { +.start = { +.hide_stderr = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Start src, dst vm */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Prepare for dirty limit migration and wait src vm show up */ +migrate_dirty_limit_wait_showup(from, dirtylimit_period, dirtylimit_value); + +/* Start migrate */ +migrate_qmp(from, uri, "{}"); + +/* Wait for dirty limit throttle begin */ +throttle_us_per_full = 0; +while (throttle_us_per_full == 0) { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} + +/* Now cancel migrate and wait for dirty limit throttle switch off */ +migrate_cancel(from); +wait_for_migration_status(from, "cancelled", NULL); + +/* Check if dirty limit throttle switched off, set timeout 1ms */ +do { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} while (throttle_us_per_full != 0 && --max_try_count); + +/* Assert dirty limit is not in service */ +g_assert_cmpint(throttle_us_per_full, ==, 0); + +args = (MigrateCommon) { +.start = { +
[PATCH QEMU v3 2/3] tests/migration: Introduce dirty-ring-size option into guestperf
From: Hyman Huang(黄勇) Dirty ring size configuration is not supported by guestperf tool. Introduce dirty-ring-size (ranges in [1024, 65536]) option so developers can play with dirty-ring and dirty-limit feature easier. To set dirty ring size with 4096 during migration test: $ ./tests/migration/guestperf.py --dirty-ring-size 4096 xxx Signed-off-by: Hyman Huang(黄勇) Message-Id: <169073391195.19893.61067537833811032...@git.sr.ht> --- tests/migration/guestperf/engine.py | 6 +- tests/migration/guestperf/hardware.py | 8 ++-- tests/migration/guestperf/shell.py| 6 +- 3 files changed, 16 insertions(+), 4 deletions(-) diff --git a/tests/migration/guestperf/engine.py b/tests/migration/guestperf/engine.py index e69d16a62c..29ebb5011b 100644 --- a/tests/migration/guestperf/engine.py +++ b/tests/migration/guestperf/engine.py @@ -325,7 +325,6 @@ class Engine(object): cmdline = "'" + cmdline + "'" argv = [ -"-accel", "kvm", "-cpu", "host", "-kernel", self._kernel, "-initrd", self._initrd, @@ -333,6 +332,11 @@ class Engine(object): "-m", str((hardware._mem * 1024) + 512), "-smp", str(hardware._cpus), ] +if hardware._dirty_ring_size: +argv.extend(["-accel", "kvm,dirty-ring-size=%s" % + hardware._dirty_ring_size]) +else: +argv.extend(["-accel", "kvm"]) argv.extend(self._get_qemu_serial_args()) diff --git a/tests/migration/guestperf/hardware.py b/tests/migration/guestperf/hardware.py index 3145785ffd..f779cc050b 100644 --- a/tests/migration/guestperf/hardware.py +++ b/tests/migration/guestperf/hardware.py @@ -23,7 +23,8 @@ class Hardware(object): src_cpu_bind=None, src_mem_bind=None, dst_cpu_bind=None, dst_mem_bind=None, prealloc_pages = False, - huge_pages=False, locked_pages=False): + huge_pages=False, locked_pages=False, + dirty_ring_size=0): self._cpus = cpus self._mem = mem # GiB self._src_mem_bind = src_mem_bind # List of NUMA nodes @@ -33,6 +34,7 @@ class Hardware(object): self._prealloc_pages = prealloc_pages self._huge_pages = huge_pages self._locked_pages = locked_pages +self._dirty_ring_size = dirty_ring_size def serialize(self): @@ -46,6 +48,7 @@ class Hardware(object): "prealloc_pages": self._prealloc_pages, "huge_pages": self._huge_pages, "locked_pages": self._locked_pages, +"dirty_ring_size": self._dirty_ring_size, } @classmethod @@ -59,4 +62,5 @@ class Hardware(object): data["dst_mem_bind"], data["prealloc_pages"], data["huge_pages"], -data["locked_pages"]) +data["locked_pages"], +data["dirty_ring_size"]) diff --git a/tests/migration/guestperf/shell.py b/tests/migration/guestperf/shell.py index 8a809e3dda..7d6b8cd7cf 100644 --- a/tests/migration/guestperf/shell.py +++ b/tests/migration/guestperf/shell.py @@ -60,6 +60,8 @@ class BaseShell(object): parser.add_argument("--prealloc-pages", dest="prealloc_pages", default=False) parser.add_argument("--huge-pages", dest="huge_pages", default=False) parser.add_argument("--locked-pages", dest="locked_pages", default=False) +parser.add_argument("--dirty-ring-size", dest="dirty_ring_size", +default=0, type=int) self._parser = parser @@ -89,7 +91,9 @@ class BaseShell(object): locked_pages=args.locked_pages, huge_pages=args.huge_pages, -prealloc_pages=args.prealloc_pages) +prealloc_pages=args.prealloc_pages, + +dirty_ring_size=args.dirty_ring_size) class Shell(BaseShell): -- 2.38.5
[PATCH QEMU v3 0/3] migration: enrich the dirty-limit test case
Ping This version is a copy of version 2 and is rebased on the master. No functional changes. The dirty-limit migration test involves many passes and takes about 1 minute on average, so put it in the slow mode of migration-test. Inspired by Peter. V2: - put the dirty-limit migration test in slow mode and enrich the test case comment Dirty-limit feature was introduced in 8.1, and the test case could be enriched to make sure the behavior and the performance of dirty-limit is exactly what we want. This series adds 2 test cases, the first commit aims for the functional test and the others aim for the performance test. Please review, thanks. Yong. Hyman Huang(黄勇) (3): tests: Add migration dirty-limit capability test tests/migration: Introduce dirty-ring-size option into guestperf tests/migration: Introduce dirty-limit into guestperf tests/migration/guestperf/comparison.py | 23 tests/migration/guestperf/engine.py | 23 +++- tests/migration/guestperf/hardware.py | 8 +- tests/migration/guestperf/progress.py | 16 ++- tests/migration/guestperf/scenario.py | 11 +- tests/migration/guestperf/shell.py | 24 +++- tests/qtest/migration-test.c| 164 7 files changed, 261 insertions(+), 8 deletions(-) -- 2.38.5
[PATCH QEMU v3 3/3] tests/migration: Introduce dirty-limit into guestperf
From: Hyman Huang(黄勇) Currently, guestperf does not cover the dirty-limit migration, support this feature. Note that dirty-limit requires 'dirty-ring-size' set. To enable dirty-limit, setting x-vcpu-dirty-limit-period as 500ms and x-vcpu-dirty-limit as 10MB/s: $ ./tests/migration/guestperf.py \ --dirty-ring-size 4096 \ --dirty-limit --x-vcpu-dirty-limit-period 500 \ --vcpu-dirty-limit 10 --output output.json \ To run the entire standardized set of dirty-limit-enabled comparisons, with unix migration: $ ./tests/migration/guestperf-batch.py \ --dirty-ring-size 4096 \ --dst-host localhost --transport unix \ --filter compr-dirty-limit* --output outputdir Signed-off-by: Hyman Huang(黄勇) Message-Id: <169073391195.19893.61067537833811032...@git.sr.ht> --- tests/migration/guestperf/comparison.py | 23 +++ tests/migration/guestperf/engine.py | 17 + tests/migration/guestperf/progress.py | 16 ++-- tests/migration/guestperf/scenario.py | 11 ++- tests/migration/guestperf/shell.py | 18 +- 5 files changed, 81 insertions(+), 4 deletions(-) diff --git a/tests/migration/guestperf/comparison.py b/tests/migration/guestperf/comparison.py index c03b3f6d7e..42cc0372d1 100644 --- a/tests/migration/guestperf/comparison.py +++ b/tests/migration/guestperf/comparison.py @@ -135,4 +135,27 @@ COMPARISONS = [ Scenario("compr-multifd-channels-64", multifd=True, multifd_channels=64), ]), + +# Looking at effect of dirty-limit with +# varying x_vcpu_dirty_limit_period +Comparison("compr-dirty-limit-period", scenarios = [ +Scenario("compr-dirty-limit-period-500", + dirty_limit=True, x_vcpu_dirty_limit_period=500), +Scenario("compr-dirty-limit-period-800", + dirty_limit=True, x_vcpu_dirty_limit_period=800), +Scenario("compr-dirty-limit-period-1000", + dirty_limit=True, x_vcpu_dirty_limit_period=1000), +]), + + +# Looking at effect of dirty-limit with +# varying vcpu_dirty_limit +Comparison("compr-dirty-limit", scenarios = [ +Scenario("compr-dirty-limit-10MB", + dirty_limit=True, vcpu_dirty_limit=10), +Scenario("compr-dirty-limit-20MB", + dirty_limit=True, vcpu_dirty_limit=20), +Scenario("compr-dirty-limit-50MB", + dirty_limit=True, vcpu_dirty_limit=50), +]), ] diff --git a/tests/migration/guestperf/engine.py b/tests/migration/guestperf/engine.py index 29ebb5011b..93a6f78e46 100644 --- a/tests/migration/guestperf/engine.py +++ b/tests/migration/guestperf/engine.py @@ -102,6 +102,8 @@ class Engine(object): info.get("expected-downtime", 0), info.get("setup-time", 0), info.get("cpu-throttle-percentage", 0), +info.get("dirty-limit-throttle-time-per-round", 0), +info.get("dirty-limit-ring-full-time", 0), ) def _migrate(self, hardware, scenario, src, dst, connect_uri): @@ -203,6 +205,21 @@ class Engine(object): resp = dst.command("migrate-set-parameters", multifd_channels=scenario._multifd_channels) +if scenario._dirty_limit: +if not hardware._dirty_ring_size: +raise Exception("dirty ring size must be configured when " +"testing dirty limit migration") + +resp = src.command("migrate-set-capabilities", + capabilities = [ + { "capability": "dirty-limit", + "state": True } + ]) +resp = src.command("migrate-set-parameters", +x_vcpu_dirty_limit_period=scenario._x_vcpu_dirty_limit_period) +resp = src.command("migrate-set-parameters", + vcpu_dirty_limit=scenario._vcpu_dirty_limit) + resp = src.command("migrate", uri=connect_uri) post_copy = False diff --git a/tests/migration/guestperf/progress.py b/tests/migration/guestperf/progress.py index ab1ee57273..d490584217 100644 --- a/tests/migration/guestperf/progress.py +++ 
b/tests/migration/guestperf/progress.py @@ -81,7 +81,9 @@ class Progress(object): downtime, downtime_expected, setup_time, - throttle_pcent): + throttle_pcent, + dirty_limit_throttle_time_per_round, + dirty_limit_ring_full_time): self._status = status self._ram = ram @@ -91,6 +93,10 @@ class Progress(object):
[PATCH QEMU] docs/migration: Add the dirty limit section
From: Hyman Huang(黄勇) The dirty limit feature has been introduced since the 8.1 QEMU release but has not reflected in the document, add a section for that. Signed-off-by: Hyman Huang(黄勇) --- docs/devel/migration.rst | 70 1 file changed, 70 insertions(+) diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst index c3e1400c0c..4cc83adc8e 100644 --- a/docs/devel/migration.rst +++ b/docs/devel/migration.rst @@ -588,6 +588,76 @@ path. Return path - opened by main thread, written by main thread AND postcopy thread (protected by rp_mutex) +Dirty limit += +The dirty limit, short for dirty page rate upper limit, is a new capability +introduced in the 8.1 QEMU release that uses a new algorithm based on the KVM +dirty ring to throttle down the guest during live migration. + +The algorithm framework is as follows: + +:: + + --- + main --> throttle thread > PREPARE(1) < + thread \| | + \ | | +\ V | + -\CALCULATE(2) | + \ | | +\ | | + \ V | + \SET PENALTY(3) - + -\ | + \ | + \V + -> virtual CPU thread ---> ACCEPT PENALTY(4) + --- +When the qmp command qmp_set_vcpu_dirty_limit is called for the first time, +the QEMU main thread starts the throttle thread. The throttle thread, once +launched, executes the loop, which consists of three steps: + + - PREPARE (1) + + The entire work of PREPARE (1) is prepared for the second stage, + CALCULATE(2), as the name implies. It involves preparing the dirty + page rate value and the corresponding upper limit of the VM: + The dirty page rate is calculated via the KVM dirty ring mechanism, + which tells QEMU how many dirty pages a virtual CPU has had since the + last KVM_EXIT_DIRTY_RING_RULL exception; The dirty page rate upper + limit is specified by caller, therefore fetch it directly. + + - CALCULATE (2) + + Calculate a suitable sleep period for each virtual CPU, which will be + used to determine the penalty for the target virtual CPU. The + computation must be done carefully in order to reduce the dirty page + rate progressively down to the upper limit without oscillation. To + achieve this, two strategies are provided: the first is to add or + subtract sleep time based on the ratio of the current dirty page rate + to the limit, which is used when the current dirty page rate is far + from the limit; the second is to add or subtract a fixed time when + the current dirty page rate is close to the limit. + + - SET PENALTY (3) + + Set the sleep time for each virtual CPU that should be penalized based + on the results of the calculation supplied by step CALCULATE (2). + +After completing the three above stages, the throttle thread loops back +to step PREPARE (1) until the dirty limit is reached. + +On the other hand, each virtual CPU thread reads the sleep duration and +sleeps in the path of the KVM_EXIT_DIRTY_RING_RULL exception handler, that +is ACCEPT PENALTY (4). Virtual CPUs tied with writing processes will +obviously exit to the path and get penalized, whereas virtual CPUs involved +with read processes will not. + +In summary, thanks to the KVM dirty ring technology, the dirty limit +algorithm will restrict virtual CPUs as needed to keep their dirty page +rate inside the limit. This leads to more steady reading performance during +live migration and can aid in improving large guest responsiveness. + Postcopy -- 2.38.5
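As a rough model of why the sleep in ACCEPT PENALTY (4) bounds the dirty page rate: a vCPU exits with the ring-full event roughly once per dirty-ring-size dirtied pages, so adding sleep time to each round stretches the time it takes to dirty a fixed amount of memory. The numbers below (4 KiB pages, a 4096-entry ring filled in about 40 ms) are illustrative assumptions, not measurements from the series:
===
/* Back-of-the-envelope model of the dirty-rate cap. Assumptions: 4 KiB
 * pages, one exit per dirty-ring-size dirtied pages, and the only
 * throttling is the sleep added in the ring-full exit path. */
#include <stdio.h>
#include <stdint.h>

static double capped_rate_mbps(uint64_t ring_size, double ring_full_us,
                               double sleep_us)
{
    double bytes_per_round = ring_size * 4096.0;
    double round_us = ring_full_us + sleep_us;
    return bytes_per_round / round_us;   /* bytes per us == MB per second */
}

int main(void)
{
    /* e.g. a 4096-entry ring that a busy vCPU fills in ~40 ms */
    double unthrottled = capped_rate_mbps(4096, 40000.0, 0.0);
    double throttled   = capped_rate_mbps(4096, 40000.0, 120000.0);

    printf("unthrottled ~ %.0f MB/s, with 120 ms sleep ~ %.0f MB/s\n",
           unthrottled, throttled);
    return 0;
}
===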
[PATCH QEMU v2 2/3] virtio-blk-pci: introduce auto-num-queues property
From: Hyman Huang(黄勇) Commit "9445e1e15 virtio-blk-pci: default num_queues to -smp N" implment sizing the number of virtio-blk-pci request virtqueues to match the number of vCPUs automatically. Which improves IO preformance remarkably. To enable this feature for the existing VMs, the cloud platform may migrate VMs from the source hypervisor (num_queues is set to 1 by default) to the destination hypervisor (num_queues is set to -smp N) lively. The different num-queues for virtio-blk-pci devices between the source side and the destination side will result in migration failure due to loading vmstate incorrectly on the destination side. To provide a smooth upgrade solution, introduce the auto-num-queues property for the virtio-blk-pci device. This allows upper APPs, e.g., libvirt, to recognize the hypervisor's capability of allocating the virtqueues automatically by probing the virtio-blk-pci.auto-num-queues property. Basing on which, upper APPs can ensure to allocate the same num-queues on the destination side in case of migration failure. Signed-off-by: Hyman Huang(黄勇) --- hw/block/virtio-blk.c | 1 + hw/virtio/virtio-blk-pci.c | 9 - include/hw/virtio/virtio-blk.h | 5 + 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index 39e7f23fab..9e498ca64a 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -1716,6 +1716,7 @@ static Property virtio_blk_properties[] = { #endif DEFINE_PROP_BIT("request-merging", VirtIOBlock, conf.request_merging, 0, true), +DEFINE_PROP_BOOL("auto-num-queues", VirtIOBlock, auto_num_queues, true), DEFINE_PROP_UINT16("num-queues", VirtIOBlock, conf.num_queues, VIRTIO_BLK_AUTO_NUM_QUEUES), DEFINE_PROP_UINT16("queue-size", VirtIOBlock, conf.queue_size, 256), diff --git a/hw/virtio/virtio-blk-pci.c b/hw/virtio/virtio-blk-pci.c index 9743bee965..4b6b4c4933 100644 --- a/hw/virtio/virtio-blk-pci.c +++ b/hw/virtio/virtio-blk-pci.c @@ -54,7 +54,14 @@ static void virtio_blk_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp) VirtIOBlkConf *conf = &dev->vdev.conf; if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) { -conf->num_queues = virtio_pci_optimal_num_queues(0); +/* + * Allocate virtqueues automatically only if auto_num_queues + * property set true. + */ +if (dev->vdev.auto_num_queues) +conf->num_queues = virtio_pci_optimal_num_queues(0); +else +conf->num_queues = 1; } if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) { diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h index dafec432ce..dab6d7c70c 100644 --- a/include/hw/virtio/virtio-blk.h +++ b/include/hw/virtio/virtio-blk.h @@ -65,6 +65,11 @@ struct VirtIOBlock { uint64_t host_features; size_t config_size; BlockRAMRegistrar blk_ram_registrar; +/* + * Set to true if virtqueues allow to be allocated to + * match the number of virtual CPUs automatically. + */ +bool auto_num_queues; }; typedef struct VirtIOBlockReq { -- 2.38.5
[PATCH QEMU v2 0/3] provide a smooth upgrade solution for multi-queues disk
Ping, This version is a copy of version 1 and is rebased on the master. No functional changes. A 1:1 virtqueue:vCPU mapping implementation for virtio-*-pci disk introduced since qemu >= 5.2.0, which improves IO performance remarkably. To enjoy this feature for exiting running VMs without service interruption, the common solution is to migrate VMs from the lower version of the hypervisor to the upgraded hypervisor, then wait for the next cold reboot of the VM to enable this feature. That's the way "discard" and "write-zeroes" features work. As to multi-queues disk allocation automatically, it's a little different because the destination will allocate queues to match the number of vCPUs automatically by default in the case of live migration, and the VMs on the source side remain 1 queue by default, which results in migration failure due to loading disk VMState incorrectly on the destination side. This issue requires Qemu to provide a hint that shows multi-queues disk allocation is automatically supported, and this allows upper APPs, e.g., libvirt, to recognize the hypervisor's capability of this. And upper APPs can ensure to allocate the same num-queues on the destination side in case of migration failure. To fix the issue, we introduce the auto-num-queues property for virtio-*-pci as a solution, which would be probed by APPs, e.g., libvirt by querying the device properties of QEMU. When launching live migration, libvirt will send the auto-num-queues property as a migration cookie to the destination, and thus the destination knows if the source side supports auto-num-queues. If not, the destination would switch off by building the command line with "auto-num-queues=off" when preparing the incoming VM process. The following patches of libvirt show how it roughly works: https://github.com/newfriday/libvirt/commit/ce2bae2e1a6821afeb80756dc01f3680f525e506 https://github.com/newfriday/libvirt/commit/f546972b009458c88148fe079544db7e9e1f43c3 https://github.com/newfriday/libvirt/commit/5ee19c8646fdb4d87ab8b93f287c20925268ce83 The smooth upgrade solution requires the introduction of the auto-num- queues property on the QEMU side, which is what the patch set does. I'm hoping for comments about the series. Please review, thanks. Yong Hyman Huang(黄勇) (3): virtio-scsi-pci: introduce auto-num-queues property virtio-blk-pci: introduce auto-num-queues property vhost-user-blk-pci: introduce auto-num-queues property hw/block/vhost-user-blk.c | 1 + hw/block/virtio-blk.c | 1 + hw/scsi/vhost-scsi.c | 2 ++ hw/scsi/vhost-user-scsi.c | 2 ++ hw/scsi/virtio-scsi.c | 2 ++ hw/virtio/vhost-scsi-pci.c | 11 +-- hw/virtio/vhost-user-blk-pci.c | 9 - hw/virtio/vhost-user-scsi-pci.c| 11 +-- hw/virtio/virtio-blk-pci.c | 9 - hw/virtio/virtio-scsi-pci.c| 11 +-- include/hw/virtio/vhost-user-blk.h | 5 + include/hw/virtio/virtio-blk.h | 5 + include/hw/virtio/virtio-scsi.h| 5 + 13 files changed, 66 insertions(+), 8 deletions(-) -- 2.38.5
[PATCH QEMU v2 1/3] virtio-scsi-pci: introduce auto-num-queues property
From: Hyman Huang(黄勇) Commit "6a55882284 virtio-scsi-pci: default num_queues to -smp N" implment sizing the number of virtio-scsi-pci request virtqueues to match the number of vCPUs automatically. Which improves IO preformance remarkably. To enable this feature for the existing VMs, the cloud platform may migrate VMs from the source hypervisor (num_queues is set to 1 by default) to the destination hypervisor (num_queues is set to -smp N) lively. The different num-queues for virtio-scsi-pci devices between the source side and the destination side will result in migration failure due to loading vmstate incorrectly on the destination side. To provide a smooth upgrade solution, introduce the auto-num-queues property for the virtio-scsi-pci device. This allows upper APPs, e.g., libvirt, to recognize the hypervisor's capability of allocating the virtqueues automatically by probing the virtio-scsi-pci.auto-num-queues property. Basing on which, upper APPs can ensure to allocate the same num-queues on the destination side in case of migration failure. Signed-off-by: Hyman Huang(黄勇) --- hw/scsi/vhost-scsi.c| 2 ++ hw/scsi/vhost-user-scsi.c | 2 ++ hw/scsi/virtio-scsi.c | 2 ++ hw/virtio/vhost-scsi-pci.c | 11 +-- hw/virtio/vhost-user-scsi-pci.c | 11 +-- hw/virtio/virtio-scsi-pci.c | 11 +-- include/hw/virtio/virtio-scsi.h | 5 + 7 files changed, 38 insertions(+), 6 deletions(-) diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c index 443f67daa4..78a8929c49 100644 --- a/hw/scsi/vhost-scsi.c +++ b/hw/scsi/vhost-scsi.c @@ -284,6 +284,8 @@ static Property vhost_scsi_properties[] = { DEFINE_PROP_STRING("vhostfd", VirtIOSCSICommon, conf.vhostfd), DEFINE_PROP_STRING("wwpn", VirtIOSCSICommon, conf.wwpn), DEFINE_PROP_UINT32("boot_tpgt", VirtIOSCSICommon, conf.boot_tpgt, 0), +DEFINE_PROP_BOOL("auto_num_queues", VirtIOSCSICommon, auto_num_queues, + true), DEFINE_PROP_UINT32("num_queues", VirtIOSCSICommon, conf.num_queues, VIRTIO_SCSI_AUTO_NUM_QUEUES), DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSICommon, conf.virtqueue_size, diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c index ee99b19e7a..1b837f370a 100644 --- a/hw/scsi/vhost-user-scsi.c +++ b/hw/scsi/vhost-user-scsi.c @@ -161,6 +161,8 @@ static void vhost_user_scsi_unrealize(DeviceState *dev) static Property vhost_user_scsi_properties[] = { DEFINE_PROP_CHR("chardev", VirtIOSCSICommon, conf.chardev), DEFINE_PROP_UINT32("boot_tpgt", VirtIOSCSICommon, conf.boot_tpgt, 0), +DEFINE_PROP_BOOL("auto_num_queues", VirtIOSCSICommon, auto_num_queues, + true), DEFINE_PROP_UINT32("num_queues", VirtIOSCSICommon, conf.num_queues, VIRTIO_SCSI_AUTO_NUM_QUEUES), DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSICommon, conf.virtqueue_size, diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c index 45b95ea070..2ec13032aa 100644 --- a/hw/scsi/virtio-scsi.c +++ b/hw/scsi/virtio-scsi.c @@ -1279,6 +1279,8 @@ static void virtio_scsi_device_unrealize(DeviceState *dev) } static Property virtio_scsi_properties[] = { +DEFINE_PROP_BOOL("auto_num_queues", VirtIOSCSI, parent_obj.auto_num_queues, + true), DEFINE_PROP_UINT32("num_queues", VirtIOSCSI, parent_obj.conf.num_queues, VIRTIO_SCSI_AUTO_NUM_QUEUES), DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSI, diff --git a/hw/virtio/vhost-scsi-pci.c b/hw/virtio/vhost-scsi-pci.c index 08980bc23b..927c155278 100644 --- a/hw/virtio/vhost-scsi-pci.c +++ b/hw/virtio/vhost-scsi-pci.c @@ -51,8 +51,15 @@ static void vhost_scsi_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp) VirtIOSCSIConf *conf = 
&dev->vdev.parent_obj.parent_obj.conf; if (conf->num_queues == VIRTIO_SCSI_AUTO_NUM_QUEUES) { -conf->num_queues = -virtio_pci_optimal_num_queues(VIRTIO_SCSI_VQ_NUM_FIXED); +/* + * Allocate virtqueues automatically only if auto_num_queues + * property set true. + */ +if (dev->vdev.parent_obj.parent_obj.auto_num_queues) +conf->num_queues = +virtio_pci_optimal_num_queues(VIRTIO_SCSI_VQ_NUM_FIXED); +else +conf->num_queues = 1; } if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) { diff --git a/hw/virtio/vhost-user-scsi-pci.c b/hw/virtio/vhost-user-scsi-pci.c index 75882e3cf9..9c521a7f93 100644 --- a/hw/virtio/vhost-user-scsi-pci.c +++ b/hw/virtio/vhost-user-scsi-pci.c @@ -57,8 +57,15 @@ static void vhost_user_scsi_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp) VirtIOSCSIConf *conf = &dev->vdev.parent_obj.parent_obj.conf; if (conf-
[PATCH QEMU v2 3/3] vhost-user-blk-pci: introduce auto-num-queues property
From: Hyman Huang(黄勇) Commit "a4eef0711b vhost-user-blk-pci: default num_queues to -smp N" implment sizing the number of vhost-user-blk-pci request virtqueues to match the number of vCPUs automatically. Which improves IO preformance remarkably. To enable this feature for the existing VMs, the cloud platform may migrate VMs from the source hypervisor (num_queues is set to 1 by default) to the destination hypervisor (num_queues is set to -smp N) lively. The different num-queues for vhost-user-blk-pci devices between the source side and the destination side will result in migration failure due to loading vmstate incorrectly on the destination side. To provide a smooth upgrade solution, introduce the auto-num-queues property for the vhost-user-blk-pci device. This allows upper APPs, e.g., libvirt, to recognize the hypervisor's capability of allocating the virtqueues automatically by probing the vhost-user-blk-pci.auto-num-queues property. Basing on which, upper APPs can ensure to allocate the same num-queues on the destination side in case of migration failure. Signed-off-by: Hyman Huang(黄勇) --- hw/block/vhost-user-blk.c | 1 + hw/virtio/vhost-user-blk-pci.c | 9 - include/hw/virtio/vhost-user-blk.h | 5 + 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c index eecf3f7a81..34e23b1727 100644 --- a/hw/block/vhost-user-blk.c +++ b/hw/block/vhost-user-blk.c @@ -566,6 +566,7 @@ static const VMStateDescription vmstate_vhost_user_blk = { static Property vhost_user_blk_properties[] = { DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev), +DEFINE_PROP_BOOL("auto-num-queues", VHostUserBlk, auto_num_queues, true), DEFINE_PROP_UINT16("num-queues", VHostUserBlk, num_queues, VHOST_USER_BLK_AUTO_NUM_QUEUES), DEFINE_PROP_UINT32("queue-size", VHostUserBlk, queue_size, 128), diff --git a/hw/virtio/vhost-user-blk-pci.c b/hw/virtio/vhost-user-blk-pci.c index eef8641a98..f7776e928a 100644 --- a/hw/virtio/vhost-user-blk-pci.c +++ b/hw/virtio/vhost-user-blk-pci.c @@ -56,7 +56,14 @@ static void vhost_user_blk_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp) DeviceState *vdev = DEVICE(&dev->vdev); if (dev->vdev.num_queues == VHOST_USER_BLK_AUTO_NUM_QUEUES) { -dev->vdev.num_queues = virtio_pci_optimal_num_queues(0); +/* + * Allocate virtqueues automatically only if auto_num_queues + * property set true. + */ +if (dev->vdev.auto_num_queues) +dev->vdev.num_queues = virtio_pci_optimal_num_queues(0); +else +dev->vdev.num_queues = 1; } if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) { diff --git a/include/hw/virtio/vhost-user-blk.h b/include/hw/virtio/vhost-user-blk.h index ea085ee1ed..e6f0515bc6 100644 --- a/include/hw/virtio/vhost-user-blk.h +++ b/include/hw/virtio/vhost-user-blk.h @@ -50,6 +50,11 @@ struct VHostUserBlk { bool connected; /* vhost_user_blk_start/vhost_user_blk_stop */ bool started_vu; +/* + * Set to true if virtqueues allow to be allocated to + * match the number of virtual CPUs automatically. + */ +bool auto_num_queues; }; #endif -- 2.38.5
[PATCH QEMU v8 1/9] softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit"
From: Hyman Huang(黄勇) dirty_rate paraemter of hmp command "set_vcpu_dirty_limit" is invalid if less than 0, so add parameter check for it. Note that this patch also delete the unsolicited help message and clean up the code. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Peter Xu Reviewed-by: Juan Quintela --- softmmu/dirtylimit.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 015a9038d1..5c12d26d49 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -515,14 +515,15 @@ void hmp_set_vcpu_dirty_limit(Monitor *mon, const QDict *qdict) int64_t cpu_index = qdict_get_try_int(qdict, "cpu_index", -1); Error *err = NULL; -qmp_set_vcpu_dirty_limit(!!(cpu_index != -1), cpu_index, dirty_rate, &err); -if (err) { -hmp_handle_error(mon, err); -return; +if (dirty_rate < 0) { +error_setg(&err, "invalid dirty page limit %ld", dirty_rate); +goto out; } -monitor_printf(mon, "[Please use 'info vcpu_dirty_limit' to query " - "dirty limit for virtual CPU]\n"); +qmp_set_vcpu_dirty_limit(!!(cpu_index != -1), cpu_index, dirty_rate, &err); + +out: +hmp_handle_error(mon, err); } static struct DirtyLimitInfo *dirtylimit_query_vcpu(int cpu_index) -- 2.38.5
[PATCH QEMU v8 2/9] qapi/migration: Introduce x-vcpu-dirty-limit-period parameter
From: Hyman Huang(黄勇) Introduce "x-vcpu-dirty-limit-period" migration experimental parameter, which is in the range of 1 to 1000ms and used to make dirtyrate calculation period configurable. Currently with the "x-vcpu-dirty-limit-period" varies, the total time of live migration changes, test results show the optimal value of "x-vcpu-dirty-limit-period" ranges from 500ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made stable once it proves best value can not be determined with developer's experiments. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/migration-hmp-cmds.c | 8 migration/options.c| 28 +++ qapi/migration.json| 35 +++--- 3 files changed, 64 insertions(+), 7 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 9885d7c9f7..352e9ec716 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -364,6 +364,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) } } } + +monitor_printf(mon, "%s: %" PRIu64 " ms\n", +MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD), +params->x_vcpu_dirty_limit_period); } qapi_free_MigrationParameters(params); @@ -620,6 +624,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) error_setg(&err, "The block-bitmap-mapping parameter can only be set " "through QMP"); break; +case MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD: +p->has_x_vcpu_dirty_limit_period = true; +visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err); +break; default: assert(0); } diff --git a/migration/options.c b/migration/options.c index 5a9505adf7..1de63ba775 100644 --- a/migration/options.c +++ b/migration/options.c @@ -80,6 +80,8 @@ #define DEFINE_PROP_MIG_CAP(name, x) \ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 1000/* milliseconds */ + Property migration_properties[] = { DEFINE_PROP_BOOL("store-global-state", MigrationState, store_global_state, true), @@ -163,6 +165,9 @@ Property migration_properties[] = { DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds), DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname), DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz), +DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState, + parameters.x_vcpu_dirty_limit_period, + DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -908,6 +913,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) s->parameters.block_bitmap_mapping); } +params->has_x_vcpu_dirty_limit_period = true; +params->x_vcpu_dirty_limit_period = s->parameters.x_vcpu_dirty_limit_period; + return params; } @@ -940,6 +948,7 @@ void migrate_params_init(MigrationParameters *params) params->has_announce_max = true; params->has_announce_rounds = true; params->has_announce_step = true; +params->has_x_vcpu_dirty_limit_period = true; } /* @@ -1100,6 +1109,15 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) } #endif +if (params->has_x_vcpu_dirty_limit_period && +(params->x_vcpu_dirty_limit_period < 1 || + params->x_vcpu_dirty_limit_period > 1000)) { +error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "x-vcpu-dirty-limit-period", + "a value between 1 and 1000"); +return false; +} + return true; } @@ -1199,6 +1217,11 @@ static void migrate_params_test_apply(MigrateSetParameters *params, 
dest->has_block_bitmap_mapping = true; dest->block_bitmap_mapping = params->block_bitmap_mapping; } + +if (params->has_x_vcpu_dirty_limit_period) { +dest->x_vcpu_dirty_limit_period = +params->x_vcpu_dirty_limit_period; +} } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1317,6 +1340,11 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) QAPI_CLONE(BitmapMigrationNodeAliasList, params->block_bitmap_mapping); } + +if (params->has_x_vcpu_dirty_limit_period) { +s->
[PATCH QEMU v8 4/9] migration: Introduce dirty-limit capability
From: Hyman Huang(黄勇) Introduce migration dirty-limit capability, which can be turned on before live migration and limit dirty page rate durty live migration. Introduce migrate_dirty_limit function to help check if dirty-limit capability enabled during live migration. Meanwhile, refactor vcpu_dirty_rate_stat_collect so that period can be configured instead of hardcoded. dirty-limit capability is kind of like auto-converge but using dirty limit instead of traditional cpu-throttle to throttle guest down. To enable this feature, turn on the dirty-limit capability before live migration using migrate-set-capabilities, and set the parameters "x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably to speed up convergence. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/options.c | 24 migration/options.h | 1 + qapi/migration.json | 12 +++- softmmu/dirtylimit.c | 12 +++- 4 files changed, 47 insertions(+), 2 deletions(-) diff --git a/migration/options.c b/migration/options.c index 7d2d98830e..631c12cf32 100644 --- a/migration/options.c +++ b/migration/options.c @@ -27,6 +27,7 @@ #include "qemu-file.h" #include "ram.h" #include "options.h" +#include "sysemu/kvm.h" /* Maximum migrate downtime set to 2000 seconds */ #define MAX_MIGRATE_DOWNTIME_SECONDS 2000 @@ -196,6 +197,8 @@ Property migration_properties[] = { #endif DEFINE_PROP_MIG_CAP("x-switchover-ack", MIGRATION_CAPABILITY_SWITCHOVER_ACK), +DEFINE_PROP_MIG_CAP("x-dirty-limit", +MIGRATION_CAPABILITY_DIRTY_LIMIT), DEFINE_PROP_END_OF_LIST(), }; @@ -242,6 +245,13 @@ bool migrate_dirty_bitmaps(void) return s->capabilities[MIGRATION_CAPABILITY_DIRTY_BITMAPS]; } +bool migrate_dirty_limit(void) +{ +MigrationState *s = migrate_get_current(); + +return s->capabilities[MIGRATION_CAPABILITY_DIRTY_LIMIT]; +} + bool migrate_events(void) { MigrationState *s = migrate_get_current(); @@ -573,6 +583,20 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) } } +if (new_caps[MIGRATION_CAPABILITY_DIRTY_LIMIT]) { +if (new_caps[MIGRATION_CAPABILITY_AUTO_CONVERGE]) { +error_setg(errp, "dirty-limit conflicts with auto-converge" + " either of then available currently"); +return false; +} + +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +error_setg(errp, "dirty-limit requires KVM with accelerator" + " property 'dirty-ring-size' set"); +return false; +} +} + return true; } diff --git a/migration/options.h b/migration/options.h index 9aaf363322..b5a950d4e4 100644 --- a/migration/options.h +++ b/migration/options.h @@ -24,6 +24,7 @@ extern Property migration_properties[]; /* capabilities */ bool migrate_auto_converge(void); +bool migrate_dirty_limit(void); bool migrate_background_snapshot(void); bool migrate_block(void); bool migrate_colo(void); diff --git a/qapi/migration.json b/qapi/migration.json index e43371955a..031832cde5 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -497,6 +497,15 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # +# @dirty-limit: If enabled, migration will use the dirty-limit +# algorithm to throttle down guest instead of auto-converge +# algorithm. This algorithm only works when vCPU's dirtyrate +# greater than 'vcpu-dirty-limit', read processes in guest os +# aren't penalized any more, so the algorithm can improve +# performance of vCPU during live migration. This is an optional +# performance feature and should not affect the correctness of the +# existing auto-converge algorithm. 
(since 8.1) +# # Features: # # @unstable: Members @x-colo and @x-ignore-shared are experimental. @@ -512,7 +521,8 @@ 'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate', { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] }, 'validate-uuid', 'background-snapshot', - 'zero-copy-send', 'postcopy-preempt', 'switchover-ack'] } + 'zero-copy-send', 'postcopy-preempt', 'switchover-ack', + 'dirty-limit'] } ## # @MigrationCapabilityStatus: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 5c12d26d49..953ef934bc 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -24,6 +24,9 @@ #include "hw/boards.h" #include "sysemu/kvm.h" #include "trace.h" +#i
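The patch text above is cut off; for reference, the enable sequence it describes boils down to the fragment below, reusing the qtest helpers (migrate_set_capability, migrate_set_parameter_int, migrate_qmp) quoted in the test patches elsewhere in this thread, with illustrative values:
===
/* Fragment, not standalone: enable dirty-limit migration before starting
 * the migration. Helper functions are the qtest helpers from
 * tests/qtest/migration-test.c; the values are illustrative. */
migrate_set_capability(from, "dirty-limit", true);
migrate_set_parameter_int(from, "x-vcpu-dirty-limit-period", 1000); /* ms */
migrate_set_parameter_int(from, "vcpu-dirty-limit", 50);            /* MB/s */
migrate_qmp(from, uri, "{}");
===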
[PATCH QEMU v8 5/9] migration: Refactor auto-converge capability logic
From: Hyman Huang(黄勇) Check if block migration is running before throttling guest down in auto-converge way. Note that this modification is kind of like code clean, because block migration does not depend on auto-converge capability, so the order of checks can be adjusted. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/ram.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/migration/ram.c b/migration/ram.c index 5283a75f02..78746849b5 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -995,7 +995,11 @@ static void migration_trigger_throttle(RAMState *rs) /* During block migration the auto-converge logic incorrectly detects * that ram migration makes no progress. Avoid this by disabling the * throttling logic during the bulk phase of block migration. */ -if (migrate_auto_converge() && !blk_mig_bulk_active()) { +if (blk_mig_bulk_active()) { +return; +} + +if (migrate_auto_converge()) { /* The following detection logic can be refined later. For now: Check to see if the ratio between dirtied bytes and the approx. amount of bytes that just got transferred since the last time -- 2.38.5
[PATCH QEMU v8 6/9] migration: Put the detection logic before auto-converge checking
From: Hyman Huang(黄勇) This commit is prepared for the implementation of dirty-limit convergence algo. The detection logic of throttling condition can apply to both auto-converge and dirty-limit algo, putting it's position before the checking logic for auto-converge feature. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Juan Quintela --- migration/ram.c | 21 +++-- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 78746849b5..b6559f9312 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -999,17 +999,18 @@ static void migration_trigger_throttle(RAMState *rs) return; } -if (migrate_auto_converge()) { -/* The following detection logic can be refined later. For now: - Check to see if the ratio between dirtied bytes and the approx. - amount of bytes that just got transferred since the last time - we were in this routine reaches the threshold. If that happens - twice, start or increase throttling. */ - -if ((bytes_dirty_period > bytes_dirty_threshold) && -(++rs->dirty_rate_high_cnt >= 2)) { +/* + * The following detection logic can be refined later. For now: + * Check to see if the ratio between dirtied bytes and the approx. + * amount of bytes that just got transferred since the last time + * we were in this routine reaches the threshold. If that happens + * twice, start or increase throttling. + */ +if ((bytes_dirty_period > bytes_dirty_threshold) && +(++rs->dirty_rate_high_cnt >= 2)) { +rs->dirty_rate_high_cnt = 0; +if (migrate_auto_converge()) { trace_migration_throttle(); -rs->dirty_rate_high_cnt = 0; mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); } -- 2.38.5
[PATCH QEMU v8 0/9] migration: introduce dirtylimit capability
Hi, Juan and Markus, thanks for reviewing the previous versions and please review the latest version if you have time :) Yong v8: 1. Rebase on master and refactor the docs suggested by Markus v7: 1. Rebase on master and fix conflicts v6: 1. Rebase on master 2. Split the commit "Implement dirty-limit convergence algo" into two as Juan suggested as the following: a. Put the detection logic before auto-converge checking b. Implement dirty-limit convergence algo 3. Put the detection logic before auto-converge checking 4. Sort the migrate_dirty_limit function in commit "Introduce dirty-limit capability" suggested by Juan 5. Substitute the the int64_t to uint64_t in the last 2 commits 6. Fix the comments spell mistake 7. Add helper function in the commit "Implement dirty-limit convergence algo" suggested by Juan v5: 1. Rebase on master and enrich the comment for "dirty-limit" capability, suggesting by Markus. 2. Drop commits that have already been merged. v4: 1. Polish the docs and update the release version suggested by Markus 2. Rename the migrate exported info "dirty-limit-throttle-time-per- round" to "dirty-limit-throttle-time-per-full". v3(resend): - fix the syntax error of the topic. v3: This version make some modifications inspired by Peter and Markus as following: 1. Do the code clean up in [PATCH v2 02/11] suggested by Markus 2. Replace the [PATCH v2 03/11] with a much simpler patch posted by Peter to fix the following bug: https://bugzilla.redhat.com/show_bug.cgi?id=2124756 3. Fix the error path of migrate_params_check in [PATCH v2 04/11] pointed out by Markus. Enrich the commit message to explain why x-vcpu-dirty-limit-period an unstable parameter. 4. Refactor the dirty-limit convergence algo in [PATCH v2 07/11] suggested by Peter: a. apply blk_mig_bulk_active check before enable dirty-limit b. drop the unhelpful check function before enable dirty-limit c. change the migration_cancel logic, just cancel dirty-limit only if dirty-limit capability turned on. d. abstract a code clean commit [PATCH v3 07/10] to adjust the check order before enable auto-converge 5. Change the name of observing indexes during dirty-limit live migration to make them more easy-understanding. Use the maximum throttle time of vpus as "dirty-limit-throttle-time-per-full" 6. Fix some grammatical and spelling errors pointed out by Markus and enrich the document about the dirty-limit live migration observing indexes "dirty-limit-ring-full-time" and "dirty-limit-throttle-time-per-full" 7. Change the default value of x-vcpu-dirty-limit-period to 1000ms, which is optimal value pointed out in cover letter in that testing environment. 8. Drop the 2 guestperf test commits [PATCH v2 10/11], [PATCH v2 11/11] and post them with a standalone series in the future. v2: This version make a little bit modifications comparing with version 1 as following: 1. fix the overflow issue reported by Peter Maydell 2. add parameter check for hmp "set_vcpu_dirty_limit" command 3. fix the racing issue between dirty ring reaper thread and Qemu main thread. 4. add migrate parameter check for x-vcpu-dirty-limit-period and vcpu-dirty-limit. 5. add the logic to forbid hmp/qmp commands set_vcpu_dirty_limit, cancel_vcpu_dirty_limit during dirty-limit live migration when implement dirty-limit convergence algo. 6. add capability check to ensure auto-converge and dirty-limit are mutually exclusive. 7. 
pre-check if kvm dirty ring size is configured before setting dirty-limit migrate parameter Hyman Huang(黄勇) (9): softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit" qapi/migration: Introduce x-vcpu-dirty-limit-period parameter qapi/migration: Introduce vcpu-dirty-limit parameters migration: Introduce dirty-limit capability migration: Refactor auto-converge capability logic migration: Put the detection logic before auto-converge checking migration: Implement dirty-limit convergence algo migration: Extend query-migrate to provide dirty page limit info tests: Add migration dirty-limit capability test include/sysemu/dirtylimit.h| 2 + migration/migration-hmp-cmds.c | 26 ++ migration/migration.c | 13 +++ migration/options.c| 73 migration/options.h| 1 + migration/ram.c| 61 ++--- migration/trace-events | 1 + qapi/migration.json| 75 ++-- softmmu/dirtylimit.c | 91 +-- tests/qtest/migration-test.c | 155 + 10 files changed, 473 insertions(+), 25 deletions(-) -- 2.38.5
[PATCH QEMU v8 3/9] qapi/migration: Introduce vcpu-dirty-limit parameters
From: Hyman Huang(黄勇) Introduce "vcpu-dirty-limit" migration parameter used to limit dirty page rate during live migration. "vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are two dirty-limit-related migration parameters, which can be set before and during live migration by qmp migrate-set-parameters. This two parameters are used to help implement the dirty page rate limit algo of migration. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/migration-hmp-cmds.c | 8 migration/options.c| 21 + qapi/migration.json| 18 +++--- 3 files changed, 44 insertions(+), 3 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 352e9ec716..35e8020bbf 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -368,6 +368,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) monitor_printf(mon, "%s: %" PRIu64 " ms\n", MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD), params->x_vcpu_dirty_limit_period); + +monitor_printf(mon, "%s: %" PRIu64 " MB/s\n", +MigrationParameter_str(MIGRATION_PARAMETER_VCPU_DIRTY_LIMIT), +params->vcpu_dirty_limit); } qapi_free_MigrationParameters(params); @@ -628,6 +632,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) p->has_x_vcpu_dirty_limit_period = true; visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err); break; +case MIGRATION_PARAMETER_VCPU_DIRTY_LIMIT: +p->has_vcpu_dirty_limit = true; +visit_type_size(v, param, &p->vcpu_dirty_limit, &err); +break; default: assert(0); } diff --git a/migration/options.c b/migration/options.c index 1de63ba775..7d2d98830e 100644 --- a/migration/options.c +++ b/migration/options.c @@ -81,6 +81,7 @@ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) #define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 1000/* milliseconds */ +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT1 /* MB/s */ Property migration_properties[] = { DEFINE_PROP_BOOL("store-global-state", MigrationState, @@ -168,6 +169,9 @@ Property migration_properties[] = { DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState, parameters.x_vcpu_dirty_limit_period, DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD), +DEFINE_PROP_UINT64("vcpu-dirty-limit", MigrationState, + parameters.vcpu_dirty_limit, + DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -915,6 +919,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->has_x_vcpu_dirty_limit_period = true; params->x_vcpu_dirty_limit_period = s->parameters.x_vcpu_dirty_limit_period; +params->has_vcpu_dirty_limit = true; +params->vcpu_dirty_limit = s->parameters.vcpu_dirty_limit; return params; } @@ -949,6 +955,7 @@ void migrate_params_init(MigrationParameters *params) params->has_announce_rounds = true; params->has_announce_step = true; params->has_x_vcpu_dirty_limit_period = true; +params->has_vcpu_dirty_limit = true; } /* @@ -1118,6 +1125,14 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) return false; } +if (params->has_vcpu_dirty_limit && +(params->vcpu_dirty_limit < 1)) { +error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "vcpu_dirty_limit", + "is invalid, it must greater then 1 MB/s"); +return false; +} + return true; } @@ -1222,6 +1237,9 @@ static void migrate_params_test_apply(MigrateSetParameters *params, dest->x_vcpu_dirty_limit_period = params->x_vcpu_dirty_limit_period; } +if (params->has_vcpu_dirty_limit) { +dest->vcpu_dirty_limit = 
params->vcpu_dirty_limit; +} } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1345,6 +1363,9 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) s->parameters.x_vcpu_dirty_limit_period = params->x_vcpu_dirty_limit_period; } +if (params->has_vcpu_dirty_limit) { +s->parameters.vcpu_dirty_limit = params->vcpu_dirty_limit; +} } void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp) diff --git a/qapi/migration.json b/qapi/migration.json index 2041d336d5..e43371955a 100644 --- a/qapi/migration.jso
[PATCH QEMU v8 9/9] tests: Add migration dirty-limit capability test
From: Hyman Huang(黄勇) Add migration dirty-limit capability test if kernel support dirty ring. Migration dirty-limit capability introduce dirty limit capability, two parameters: x-vcpu-dirty-limit-period and vcpu-dirty-limit are introduced to implement the live migration with dirty limit. The test case does the following things: 1. start src, dst vm and enable dirty-limit capability 2. start migrate and set cancel it to check if dirty limit stop working. 3. restart dst vm 4. start migrate and enable dirty-limit capability 5. check if migration satisfy the convergence condition during pre-switchover phase. Signed-off-by: Hyman Huang(黄勇) --- tests/qtest/migration-test.c | 155 +++ 1 file changed, 155 insertions(+) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index b9cc194100..f55f95c9b0 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -2636,6 +2636,159 @@ static void test_vcpu_dirty_limit(void) dirtylimit_stop_vm(vm); } +static void migrate_dirty_limit_wait_showup(QTestState *from, +const int64_t period, +const int64_t value) +{ +/* Enable dirty limit capability */ +migrate_set_capability(from, "dirty-limit", true); + +/* Set dirty limit parameters */ +migrate_set_parameter_int(from, "x-vcpu-dirty-limit-period", period); +migrate_set_parameter_int(from, "vcpu-dirty-limit", value); + +/* Make sure migrate can't converge */ +migrate_ensure_non_converge(from); + +/* To check limit rate after precopy */ +migrate_set_capability(from, "pause-before-switchover", true); + +/* Wait for the serial output from the source */ +wait_for_serial("src_serial"); +} + +/* + * This test does: + * source target + * migrate_incoming + * migrate + * migrate_cancel + * restart target + * migrate + * + * And see that if dirty limit works correctly + */ +static void test_migrate_dirty_limit(void) +{ +g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); +QTestState *from, *to; +int64_t remaining; +uint64_t throttle_us_per_full; +/* + * We want the test to be stable and as fast as possible. + * E.g., with 1Gb/s bandwith migration may pass without dirty limit, + * so we need to decrease a bandwidth. + */ +const int64_t dirtylimit_period = 1000, dirtylimit_value = 50; +const int64_t max_bandwidth = 4; /* ~400Mb/s */ +const int64_t downtime_limit = 250; /* 250ms */ +/* + * We migrate through unix-socket (> 500Mb/s). + * Thus, expected migration speed ~= bandwidth limit (< 500Mb/s). 
+ * So, we can predict expected_threshold + */ +const int64_t expected_threshold = max_bandwidth * downtime_limit / 1000; +int max_try_count = 10; +MigrateCommon args = { +.start = { +.hide_stderr = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Start src, dst vm */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Prepare for dirty limit migration and wait src vm show up */ +migrate_dirty_limit_wait_showup(from, dirtylimit_period, dirtylimit_value); + +/* Start migrate */ +migrate_qmp(from, uri, "{}"); + +/* Wait for dirty limit throttle begin */ +throttle_us_per_full = 0; +while (throttle_us_per_full == 0) { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} + +/* Now cancel migrate and wait for dirty limit throttle switch off */ +migrate_cancel(from); +wait_for_migration_status(from, "cancelled", NULL); + +/* Check if dirty limit throttle switched off, set timeout 1ms */ +do { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} while (throttle_us_per_full != 0 && --max_try_count); + +/* Assert dirty limit is not in service */ +g_assert_cmpint(throttle_us_per_full, ==, 0); + +args = (MigrateCommon) { +.start = { +.only_target = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Restart dst vm, src vm already show up so we needn't wait anymore */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Start migrate */ +migrate_qmp(from, uri, "{}&
[PATCH QEMU v8 8/9] migration: Extend query-migrate to provide dirty page limit info
From: Hyman Huang(黄勇) Extend query-migrate to provide throttle time and estimated ring full time with dirty-limit capability enabled, through which we can observe if dirty limit take effect during live migration. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- include/sysemu/dirtylimit.h| 2 ++ migration/migration-hmp-cmds.c | 10 + migration/migration.c | 10 + qapi/migration.json| 16 +- softmmu/dirtylimit.c | 39 ++ 5 files changed, 76 insertions(+), 1 deletion(-) diff --git a/include/sysemu/dirtylimit.h b/include/sysemu/dirtylimit.h index 8d2c1f3a6b..d11edb 100644 --- a/include/sysemu/dirtylimit.h +++ b/include/sysemu/dirtylimit.h @@ -34,4 +34,6 @@ void dirtylimit_set_vcpu(int cpu_index, void dirtylimit_set_all(uint64_t quota, bool enable); void dirtylimit_vcpu_execute(CPUState *cpu); +uint64_t dirtylimit_throttle_time_per_round(void); +uint64_t dirtylimit_ring_full_time(void); #endif diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 35e8020bbf..c115ef2d23 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -190,6 +190,16 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict) info->cpu_throttle_percentage); } +if (info->has_dirty_limit_throttle_time_per_round) { +monitor_printf(mon, "dirty-limit throttle time: %" PRIu64 " us\n", + info->dirty_limit_throttle_time_per_round); +} + +if (info->has_dirty_limit_ring_full_time) { +monitor_printf(mon, "dirty-limit ring full time: %" PRIu64 " us\n", + info->dirty_limit_ring_full_time); +} + if (info->has_postcopy_blocktime) { monitor_printf(mon, "postcopy blocktime: %u\n", info->postcopy_blocktime); diff --git a/migration/migration.c b/migration/migration.c index a3791900fd..a4dcaa3c91 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -64,6 +64,7 @@ #include "yank_functions.h" #include "sysemu/qtest.h" #include "options.h" +#include "sysemu/dirtylimit.h" static NotifierList migration_state_notifiers = NOTIFIER_LIST_INITIALIZER(migration_state_notifiers); @@ -974,6 +975,15 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s) info->ram->dirty_pages_rate = stat64_get(&mig_stats.dirty_pages_rate); } + +if (migrate_dirty_limit() && dirtylimit_in_service()) { +info->has_dirty_limit_throttle_time_per_round = true; +info->dirty_limit_throttle_time_per_round = +dirtylimit_throttle_time_per_round(); + +info->has_dirty_limit_ring_full_time = true; +info->dirty_limit_ring_full_time = dirtylimit_ring_full_time(); +} } static void populate_disk_info(MigrationInfo *info) diff --git a/qapi/migration.json b/qapi/migration.json index 031832cde5..97f7d0fd3c 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -250,6 +250,18 @@ # blocked. Present and non-empty when migration is blocked. # (since 6.0) # +# @dirty-limit-throttle-time-per-round: Maximum throttle time +# (in microseconds) of virtual CPUs each dirty ring full round, +# which shows how MigrationCapability dirty-limit affects the +# guest during live migration. (since 8.1) +# +# @dirty-limit-ring-full-time: Estimated average dirty ring full +# time (in microseconds) each dirty ring full round. The value +# equals dirty ring memory size divided by average dirty page +# rate of the virtual CPU, which can be used to observe the +# average memory load of the virtual CPU indirectly. 
Note that +# zero means guest doesn't dirty memory (since 8.1) +# # Since: 0.14 ## { 'struct': 'MigrationInfo', @@ -267,7 +279,9 @@ '*postcopy-blocktime' : 'uint32', '*postcopy-vcpu-blocktime': ['uint32'], '*compression': 'CompressionStats', - '*socket-address': ['SocketAddress'] } } + '*socket-address': ['SocketAddress'], + '*dirty-limit-throttle-time-per-round': 'uint64', + '*dirty-limit-ring-full-time': 'uint64'} } ## # @query-migrate: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 5134296667..a0686323e5 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -565,6 +565,45 @@ out: hmp_handle_error(mon, err); } +/* Return the max throttle time of each virtual CPU */ +uint64_t dirtylimit_throttle_time_per_round(void) +{ +CPUState *cpu; +
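The dirtylimit_throttle_time_per_round() body is cut off above. Given its comment ("Return the max throttle time of each virtual CPU"), a minimal sketch of how such a function can be completed is shown below; it assumes the per-vCPU throttle_us_per_full field that the dirty-limit throttle already keeps in CPUState, and is not the patch's verbatim code:

uint64_t dirtylimit_throttle_time_per_round(void)
{
    CPUState *cpu;
    int64_t max = 0;

    /* Report the largest per-vCPU sleep time of the current round */
    CPU_FOREACH(cpu) {
        if (cpu->throttle_us_per_full > max) {
            max = cpu->throttle_us_per_full;
        }
    }

    return max;
}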
[PATCH QEMU v8 7/9] migration: Implement dirty-limit convergence algo
From: Hyman Huang(黄勇) Implement the dirty-limit convergence algorithm for live migration. It is similar to the auto-converge algorithm but uses dirty-limit instead of CPU throttling to make migration converge. Enable the dirty page limit when dirty_rate_high_cnt is greater than 2 and the dirty-limit capability is enabled; disable dirty-limit if migration is cancelled. Note that the "set_vcpu_dirty_limit" and "cancel_vcpu_dirty_limit" commands are not allowed during dirty-limit live migration. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/migration.c | 3 +++ migration/ram.c| 36 migration/trace-events | 1 + softmmu/dirtylimit.c | 29 + 4 files changed, 69 insertions(+) diff --git a/migration/migration.c b/migration/migration.c index 096e8191d1..a3791900fd 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -166,6 +166,9 @@ void migration_cancel(const Error *error) if (error) { migrate_set_error(current_migration, error); } +if (migrate_dirty_limit()) { +qmp_cancel_vcpu_dirty_limit(false, -1, NULL); +} migrate_fd_cancel(current_migration); } diff --git a/migration/ram.c b/migration/ram.c index b6559f9312..8a86363216 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -46,6 +46,7 @@ #include "qapi/error.h" #include "qapi/qapi-types-migration.h" #include "qapi/qapi-events-migration.h" +#include "qapi/qapi-commands-migration.h" #include "qapi/qmp/qerror.h" #include "trace.h" #include "exec/ram_addr.h" @@ -59,6 +60,8 @@ #include "multifd.h" #include "sysemu/runstate.h" #include "options.h" +#include "sysemu/dirtylimit.h" +#include "sysemu/kvm.h" #include "hw/boards.h" /* for machine_dump_guest_core() */ @@ -984,6 +987,37 @@ static void migration_update_rates(RAMState *rs, int64_t end_time) } } +/* + * Enable dirty-limit to throttle down the guest + */ +static void migration_dirty_limit_guest(void) +{ +/* + * dirty page rate quota for all vCPUs fetched from + * migration parameter 'vcpu_dirty_limit' + */ +static int64_t quota_dirtyrate; +MigrationState *s = migrate_get_current(); + +/* + * If dirty limit already enabled and migration parameter + * vcpu-dirty-limit untouched. 
+ */ +if (dirtylimit_in_service() && +quota_dirtyrate == s->parameters.vcpu_dirty_limit) { +return; +} + +quota_dirtyrate = s->parameters.vcpu_dirty_limit; + +/* + * Set all vCPU a quota dirtyrate, note that the second + * parameter will be ignored if setting all vCPU for the vm + */ +qmp_set_vcpu_dirty_limit(false, -1, quota_dirtyrate, NULL); +trace_migration_dirty_limit_guest(quota_dirtyrate); +} + static void migration_trigger_throttle(RAMState *rs) { uint64_t threshold = migrate_throttle_trigger_threshold(); @@ -1013,6 +1047,8 @@ static void migration_trigger_throttle(RAMState *rs) trace_migration_throttle(); mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); +} else if (migrate_dirty_limit()) { +migration_dirty_limit_guest(); } } } diff --git a/migration/trace-events b/migration/trace-events index 5259c1044b..580895e86e 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -93,6 +93,7 @@ migration_bitmap_sync_start(void) "" migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64 migration_bitmap_clear_dirty(char *str, uint64_t start, uint64_t size, unsigned long page) "rb %s start 0x%"PRIx64" size 0x%"PRIx64" page 0x%lx" migration_throttle(void) "" +migration_dirty_limit_guest(int64_t dirtyrate) "guest dirty page rate limit %" PRIi64 " MB/s" ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %" PRIx64 " %zx" ram_load_loop(const char *rbname, uint64_t addr, int flags, void *host) "%s: addr: 0x%" PRIx64 " flags: 0x%x host: %p" ram_load_postcopy_loop(int channel, uint64_t addr, int flags) "chan=%d addr=0x%" PRIx64 " flags=0x%x" diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 953ef934bc..5134296667 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -436,6 +436,23 @@ static void dirtylimit_cleanup(void) dirtylimit_state_finalize(); } +/* + * dirty page rate limit is not allowed to set if migration + * is running with dirty-limit capability enabled. + */ +static bool dirtylimit_is_allowed(void) +{ +MigrationState *ms = migrate_get_current(); + +if (migration_is_running(ms->state) && +(!qemu_thread_is_
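The dirtylimit_is_allowed() hunk is truncated above. Based on the commit message (the set/cancel commands are refused while a dirty-limit migration is in flight), a plausible completion is sketched below; the exact conditions in the merged patch may differ:

static bool dirtylimit_is_allowed(void)
{
    MigrationState *ms = migrate_get_current();

    if (migration_is_running(ms->state) &&
        (!qemu_thread_is_self(&ms->thread)) &&
        migrate_dirty_limit() &&
        dirtylimit_in_service()) {
        /* refuse manual set/cancel while a dirty-limit migration runs */
        return false;
    }
    return true;
}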
[PATCH QEMU v7 2/9] qapi/migration: Introduce x-vcpu-dirty-limit-period parameter
From: Hyman Huang(黄勇) Introduce "x-vcpu-dirty-limit-period" migration experimental parameter, which is in the range of 1 to 1000ms and used to make dirtyrate calculation period configurable. Currently with the "x-vcpu-dirty-limit-period" varies, the total time of live migration changes, test results show the optimal value of "x-vcpu-dirty-limit-period" ranges from 500ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made stable once it proves best value can not be determined with developer's experiments. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/migration-hmp-cmds.c | 8 migration/options.c| 28 qapi/migration.json| 34 +++--- 3 files changed, 63 insertions(+), 7 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 9885d7c9f7..352e9ec716 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -364,6 +364,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) } } } + +monitor_printf(mon, "%s: %" PRIu64 " ms\n", +MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD), +params->x_vcpu_dirty_limit_period); } qapi_free_MigrationParameters(params); @@ -620,6 +624,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) error_setg(&err, "The block-bitmap-mapping parameter can only be set " "through QMP"); break; +case MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD: +p->has_x_vcpu_dirty_limit_period = true; +visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err); +break; default: assert(0); } diff --git a/migration/options.c b/migration/options.c index 5a9505adf7..1de63ba775 100644 --- a/migration/options.c +++ b/migration/options.c @@ -80,6 +80,8 @@ #define DEFINE_PROP_MIG_CAP(name, x) \ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 1000/* milliseconds */ + Property migration_properties[] = { DEFINE_PROP_BOOL("store-global-state", MigrationState, store_global_state, true), @@ -163,6 +165,9 @@ Property migration_properties[] = { DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds), DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname), DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz), +DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState, + parameters.x_vcpu_dirty_limit_period, + DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -908,6 +913,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) s->parameters.block_bitmap_mapping); } +params->has_x_vcpu_dirty_limit_period = true; +params->x_vcpu_dirty_limit_period = s->parameters.x_vcpu_dirty_limit_period; + return params; } @@ -940,6 +948,7 @@ void migrate_params_init(MigrationParameters *params) params->has_announce_max = true; params->has_announce_rounds = true; params->has_announce_step = true; +params->has_x_vcpu_dirty_limit_period = true; } /* @@ -1100,6 +1109,15 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) } #endif +if (params->has_x_vcpu_dirty_limit_period && +(params->x_vcpu_dirty_limit_period < 1 || + params->x_vcpu_dirty_limit_period > 1000)) { +error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "x-vcpu-dirty-limit-period", + "a value between 1 and 1000"); +return false; +} + return true; } @@ -1199,6 +1217,11 @@ static void migrate_params_test_apply(MigrateSetParameters *params, 
dest->has_block_bitmap_mapping = true; dest->block_bitmap_mapping = params->block_bitmap_mapping; } + +if (params->has_x_vcpu_dirty_limit_period) { +dest->x_vcpu_dirty_limit_period = +params->x_vcpu_dirty_limit_period; +} } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1317,6 +1340,11 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) QAPI_CLONE(BitmapMigrationNodeAliasList, params->block_bitmap_mapping); } + +if (params->has_x_vcpu_dirty_limit_period) { +s->
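The migrate_params_apply() hunk is cut off at "s->" above. It almost certainly mirrors the migrate_params_test_apply() hunk quoted just before it; a sketch of the likely continuation (not the patch's verbatim text):

    if (params->has_x_vcpu_dirty_limit_period) {
        s->parameters.x_vcpu_dirty_limit_period =
            params->x_vcpu_dirty_limit_period;
    }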
[PATCH QEMU v7 6/9] migration: Put the detection logic before auto-converge checking
From: Hyman Huang(黄勇) This commit prepares for the implementation of the dirty-limit convergence algorithm. The detection logic for the throttling condition applies to both the auto-converge and dirty-limit algorithms, so move it in front of the auto-converge capability check. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Juan Quintela --- migration/ram.c | 21 +++-- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 78746849b5..b6559f9312 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -999,17 +999,18 @@ static void migration_trigger_throttle(RAMState *rs) return; } -if (migrate_auto_converge()) { -/* The following detection logic can be refined later. For now: - Check to see if the ratio between dirtied bytes and the approx. - amount of bytes that just got transferred since the last time - we were in this routine reaches the threshold. If that happens - twice, start or increase throttling. */ - -if ((bytes_dirty_period > bytes_dirty_threshold) && -(++rs->dirty_rate_high_cnt >= 2)) { +/* + * The following detection logic can be refined later. For now: + * Check to see if the ratio between dirtied bytes and the approx. + * amount of bytes that just got transferred since the last time + * we were in this routine reaches the threshold. If that happens + * twice, start or increase throttling. + */ +if ((bytes_dirty_period > bytes_dirty_threshold) && +(++rs->dirty_rate_high_cnt >= 2)) { +rs->dirty_rate_high_cnt = 0; +if (migrate_auto_converge()) { trace_migration_throttle(); -rs->dirty_rate_high_cnt = 0; mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); } -- 2.38.5
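This patch only moves the detection logic; the payoff comes in patch 7/9, which hangs dirty-limit off the same check. Combining the two quoted hunks, the resulting control flow inside migration_trigger_throttle() reads roughly as follows (an illustration assembled from the diffs in this series, not a verbatim excerpt):

    if ((bytes_dirty_period > bytes_dirty_threshold) &&
        (++rs->dirty_rate_high_cnt >= 2)) {
        rs->dirty_rate_high_cnt = 0;
        if (migrate_auto_converge()) {
            /* legacy cpu-throttle based convergence */
            trace_migration_throttle();
            mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold);
        } else if (migrate_dirty_limit()) {
            /* dirty-limit based convergence (added in 7/9) */
            migration_dirty_limit_guest();
        }
    }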
[PATCH QEMU v7 8/9] migration: Extend query-migrate to provide dirty page limit info
From: Hyman Huang(黄勇) Extend query-migrate to provide throttle time and estimated ring full time with dirty-limit capability enabled, through which we can observe if dirty limit take effect during live migration. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- include/sysemu/dirtylimit.h| 2 ++ migration/migration-hmp-cmds.c | 10 + migration/migration.c | 10 + qapi/migration.json| 16 +- softmmu/dirtylimit.c | 39 ++ 5 files changed, 76 insertions(+), 1 deletion(-) diff --git a/include/sysemu/dirtylimit.h b/include/sysemu/dirtylimit.h index 8d2c1f3a6b..d11edb 100644 --- a/include/sysemu/dirtylimit.h +++ b/include/sysemu/dirtylimit.h @@ -34,4 +34,6 @@ void dirtylimit_set_vcpu(int cpu_index, void dirtylimit_set_all(uint64_t quota, bool enable); void dirtylimit_vcpu_execute(CPUState *cpu); +uint64_t dirtylimit_throttle_time_per_round(void); +uint64_t dirtylimit_ring_full_time(void); #endif diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 35e8020bbf..c115ef2d23 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -190,6 +190,16 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict) info->cpu_throttle_percentage); } +if (info->has_dirty_limit_throttle_time_per_round) { +monitor_printf(mon, "dirty-limit throttle time: %" PRIu64 " us\n", + info->dirty_limit_throttle_time_per_round); +} + +if (info->has_dirty_limit_ring_full_time) { +monitor_printf(mon, "dirty-limit ring full time: %" PRIu64 " us\n", + info->dirty_limit_ring_full_time); +} + if (info->has_postcopy_blocktime) { monitor_printf(mon, "postcopy blocktime: %u\n", info->postcopy_blocktime); diff --git a/migration/migration.c b/migration/migration.c index a3791900fd..a4dcaa3c91 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -64,6 +64,7 @@ #include "yank_functions.h" #include "sysemu/qtest.h" #include "options.h" +#include "sysemu/dirtylimit.h" static NotifierList migration_state_notifiers = NOTIFIER_LIST_INITIALIZER(migration_state_notifiers); @@ -974,6 +975,15 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s) info->ram->dirty_pages_rate = stat64_get(&mig_stats.dirty_pages_rate); } + +if (migrate_dirty_limit() && dirtylimit_in_service()) { +info->has_dirty_limit_throttle_time_per_round = true; +info->dirty_limit_throttle_time_per_round = +dirtylimit_throttle_time_per_round(); + +info->has_dirty_limit_ring_full_time = true; +info->dirty_limit_ring_full_time = dirtylimit_ring_full_time(); +} } static void populate_disk_info(MigrationInfo *info) diff --git a/qapi/migration.json b/qapi/migration.json index cc51835cdd..ebc15e2782 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -250,6 +250,18 @@ # blocked. Present and non-empty when migration is blocked. # (since 6.0) # +# @dirty-limit-throttle-time-per-round: Maximum throttle time (in microseconds) of virtual +# CPUs each dirty ring full round, which shows how +# MigrationCapability dirty-limit affects the guest +# during live migration. (since 8.1) +# +# @dirty-limit-ring-full-time: Estimated average dirty ring full time (in microseconds) +# each dirty ring full round, note that the value equals +# dirty ring memory size divided by average dirty page rate +# of virtual CPU, which can be used to observe the average +# memory load of virtual CPU indirectly. 
Note that zero +# means guest doesn't dirty memory (since 8.1) +# # Since: 0.14 ## { 'struct': 'MigrationInfo', @@ -267,7 +279,9 @@ '*postcopy-blocktime' : 'uint32', '*postcopy-vcpu-blocktime': ['uint32'], '*compression': 'CompressionStats', - '*socket-address': ['SocketAddress'] } } + '*socket-address': ['SocketAddress'], + '*dirty-limit-throttle-time-per-round': 'uint64', + '*dirty-limit-ring-full-time': 'uint64'} } ## # @query-migrate: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 5134296667..a0686323e5 100644 --- a/
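The softmmu/dirtylimit.c hunk is cut off right after the diff header above. The QAPI documentation in this patch defines dirty-limit-ring-full-time as the dirty ring memory size divided by the average per-vCPU dirty page rate, so the arithmetic amounts to the sketch below; avg_vcpu_dirty_rate_mbps() and dirty_ring_size_mib() are hypothetical helper names used only for illustration, not functions from the patch:

uint64_t dirtylimit_ring_full_time(void)
{
    uint64_t rate_mbps = avg_vcpu_dirty_rate_mbps(); /* MB/s, hypothetical */
    uint64_t ring_mib = dirty_ring_size_mib();       /* MiB, hypothetical */

    if (!rate_mbps) {
        /* zero means the guest is not dirtying memory */
        return 0;
    }

    /* size / rate gives seconds; scale to microseconds */
    return ring_mib * 1000000 / rate_mbps;
}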
[PATCH QEMU v7 5/9] migration: Refactor auto-converge capability logic
From: Hyman Huang(黄勇) Check whether block migration is running before throttling the guest down via auto-converge. Note that this modification is essentially a code cleanup: block migration does not depend on the auto-converge capability, so the order of the checks can be adjusted. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/ram.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/migration/ram.c b/migration/ram.c index 5283a75f02..78746849b5 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -995,7 +995,11 @@ static void migration_trigger_throttle(RAMState *rs) /* During block migration the auto-converge logic incorrectly detects * that ram migration makes no progress. Avoid this by disabling the * throttling logic during the bulk phase of block migration. */ -if (migrate_auto_converge() && !blk_mig_bulk_active()) { +if (blk_mig_bulk_active()) { +return; +} + +if (migrate_auto_converge()) { /* The following detection logic can be refined later. For now: Check to see if the ratio between dirtied bytes and the approx. amount of bytes that just got transferred since the last time -- 2.38.5
[PATCH QEMU v7 4/9] migration: Introduce dirty-limit capability
From: Hyman Huang(黄勇) Introduce the migration dirty-limit capability, which can be turned on before live migration to limit the dirty page rate during live migration. Introduce the migrate_dirty_limit function to help check whether the dirty-limit capability is enabled during live migration. Meanwhile, refactor vcpu_dirty_rate_stat_collect so that the period can be configured instead of hardcoded. The dirty-limit capability is similar to auto-converge but uses the dirty limit instead of the traditional cpu-throttle to throttle the guest down. To enable this feature, turn on the dirty-limit capability before live migration using migrate-set-capabilities, and set the parameters "x-vcpu-dirty-limit-period" and "vcpu-dirty-limit" suitably to speed up convergence. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/options.c | 24 migration/options.h | 1 + qapi/migration.json | 13 - softmmu/dirtylimit.c | 12 +++- 4 files changed, 48 insertions(+), 2 deletions(-) diff --git a/migration/options.c b/migration/options.c index 7d2d98830e..8b4eb8c519 100644 --- a/migration/options.c +++ b/migration/options.c @@ -27,6 +27,7 @@ #include "qemu-file.h" #include "ram.h" #include "options.h" +#include "sysemu/kvm.h" /* Maximum migrate downtime set to 2000 seconds */ #define MAX_MIGRATE_DOWNTIME_SECONDS 2000 @@ -196,6 +197,8 @@ Property migration_properties[] = { #endif DEFINE_PROP_MIG_CAP("x-switchover-ack", MIGRATION_CAPABILITY_SWITCHOVER_ACK), +DEFINE_PROP_MIG_CAP("x-dirty-limit", +MIGRATION_CAPABILITY_DIRTY_LIMIT), DEFINE_PROP_END_OF_LIST(), }; @@ -242,6 +245,13 @@ bool migrate_dirty_bitmaps(void) return s->capabilities[MIGRATION_CAPABILITY_DIRTY_BITMAPS]; } +bool migrate_dirty_limit(void) +{ +MigrationState *s = migrate_get_current(); + +return s->capabilities[MIGRATION_CAPABILITY_DIRTY_LIMIT]; +} + bool migrate_events(void) { MigrationState *s = migrate_get_current(); @@ -573,6 +583,20 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) } } +if (new_caps[MIGRATION_CAPABILITY_DIRTY_LIMIT]) { +if (new_caps[MIGRATION_CAPABILITY_AUTO_CONVERGE]) { +error_setg(errp, "dirty-limit conflicts with auto-converge" + " either of then available currently"); +return false; +} + +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +error_setg(errp, "dirty-limit requires KVM with accelerator" + " property 'dirty-ring-size' set"); +return false; +} +} + return true; } diff --git a/migration/options.h b/migration/options.h index 9aaf363322..b5a950d4e4 100644 --- a/migration/options.h +++ b/migration/options.h @@ -24,6 +24,7 @@ extern Property migration_properties[]; /* capabilities */ bool migrate_auto_converge(void); +bool migrate_dirty_limit(void); bool migrate_background_snapshot(void); bool migrate_block(void); bool migrate_colo(void); diff --git a/qapi/migration.json b/qapi/migration.json index aa590dbf0e..cc51835cdd 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -497,6 +497,16 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # +# @dirty-limit: If enabled, migration will use the dirty-limit algo to +# throttle down guest instead of auto-converge algo. +# Throttle algo only works when vCPU's dirtyrate greater +# than 'vcpu-dirty-limit', read processes in guest os +# aren't penalized any more, so this algo can improve +# performance of vCPU during live migration. This is an +# optional performance feature and should not affect the +# correctness of the existing auto-converge algo. 
+# (since 8.1) +# # Features: # # @unstable: Members @x-colo and @x-ignore-shared are experimental. @@ -512,7 +522,8 @@ 'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate', { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] }, 'validate-uuid', 'background-snapshot', - 'zero-copy-send', 'postcopy-preempt', 'switchover-ack'] } + 'zero-copy-send', 'postcopy-preempt', 'switchover-ack', + 'dirty-limit'] } ## # @MigrationCapabilityStatus: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 5c12d26d49..953ef934bc 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -24,6 +24,9 @@ #include "hw/boards.h" #include
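The softmmu/dirtylimit.c hunk is truncated above, so the vcpu_dirty_rate_stat_collect() refactor mentioned in the commit message is not visible. The idea it describes, use x-vcpu-dirty-limit-period while a dirty-limit migration is active and fall back to the hardcoded default otherwise, can be sketched as below; dirty_rate_calc_period() is an illustrative helper name, not a function from the patch:

static int64_t dirty_rate_calc_period(void)
{
    MigrationState *s = migrate_get_current();
    int64_t period = DIRTYLIMIT_CALC_TIME_MS; /* existing hardcoded default */

    if (migrate_dirty_limit() && migration_is_active(s)) {
        /* use the configurable migration parameter instead */
        period = s->parameters.x_vcpu_dirty_limit_period;
    }

    return period;
}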