Re: [PATCH v3 2/2] vhost-net: Fix the virtio features negotiation flaw
在 2022/11/11 3:17, Michael S. Tsirkin 写道: On Sun, Oct 30, 2022 at 09:52:39PM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Save the acked_features once it be configured by guest virtio driver so it can't miss any features. Note that this patch also change the features saving logic in chr_closed_bh, which originally backup features no matter whether the features are 0 or not, but now do it only if features aren't 0. As to reset acked_features to 0 if needed, Qemu always keeping the backup acked_features up-to-date, and save the acked_features after virtio_net_set_features in advance, including reset acked_features to 0, so the behavior is also covered. Signed-off-by: Hyman Huang(黄勇) Signed-off-by: Guoyi Tu --- hw/net/vhost_net.c | 9 + hw/net/virtio-net.c | 5 + include/net/vhost_net.h | 2 ++ net/vhost-user.c| 6 +- 4 files changed, 17 insertions(+), 5 deletions(-) diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index d28f8b9..2bffc27 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -141,6 +141,15 @@ uint64_t vhost_net_get_acked_features(VHostNetState *net) return net->dev.acked_features; } +void vhost_net_save_acked_features(NetClientState *nc) +{ +if (nc->info->type != NET_CLIENT_DRIVER_VHOST_USER) { +return; +} + +vhost_user_save_acked_features(nc, false); +} + static int vhost_net_get_fd(NetClientState *backend) { switch (backend->info->type) { diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index e9f696b..5f8f788 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -924,6 +924,11 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint64_t features) continue; } vhost_net_ack_features(get_vhost_net(nc->peer), features); +/* + * keep acked_features in NetVhostUserState up-to-date so it + * can't miss any features configured by guest virtio driver. + */ +vhost_net_save_acked_features(nc->peer); } if (virtio_has_feature(features, VIRTIO_NET_F_CTRL_VLAN)) { So when do you want to ack features but *not* save them? When openvswitch restart and reconnect and Qemu start the vhost_dev, acked_features in vhost_dev Qemu need to be initialized and the initialized value be fetched from acked_features int NetVhostUserState. At this time, acked_features may not be up-to-date but we want it. Is the effect of this patch, fundamentally, that guest features from virtio are always copied to vhost-user? Do we even need an extra copy in vhost user then? I'm trying to explain this from my view, please point out the mistake if i failed. :) When socket used by vhost-user device disconnectted from openvswitch, Qemu will stop the vhost-user and clean up the whole struct of vhost_dev(include vm's memory region and acked_features), once socket is reconnected from openvswitch, Qemu will collect vm's memory region dynamically but as to acked_features, IMHO, Qemu can not fetch it from guest features of virtio-net, because acked_features are kind of different from guest features(bit 30 is different at least),so Qemu need an extra copy. all this came in with: commit a463215b087c41d7ca94e51aa347cde523831873 Author: Marc-André Lureau Date: Mon Jun 6 18:45:05 2016 +0200 vhost-net: save & restore vhost-user acked features Marc-André do you remember why we have a copy of features in vhost-user and not just reuse the features from virtio? 
diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h index 387e913..3a5579b 100644 --- a/include/net/vhost_net.h +++ b/include/net/vhost_net.h @@ -46,6 +46,8 @@ int vhost_set_vring_enable(NetClientState * nc, int enable); uint64_t vhost_net_get_acked_features(VHostNetState *net); +void vhost_net_save_acked_features(NetClientState *nc); + int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu); #endif diff --git a/net/vhost-user.c b/net/vhost-user.c index 74f349c..c512cc9 100644 --- a/net/vhost-user.c +++ b/net/vhost-user.c @@ -258,11 +258,7 @@ static void chr_closed_bh(void *opaque) s = DO_UPCAST(NetVhostUserState, nc, ncs[0]); for (i = queues -1; i >= 0; i--) { -s = DO_UPCAST(NetVhostUserState, nc, ncs[i]); - -if (s->vhost_net) { -s->acked_features = vhost_net_get_acked_features(s->vhost_net); -} +vhost_user_save_acked_features(ncs[i], false); } qmp_set_link(name, false, &err); -- 1.8.3.1
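Editorial note: patch 1/2 of this series (which introduces vhost_user_save_acked_features) is not quoted in this thread. As a minimal sketch, assuming the helper mirrors the logic removed from chr_closed_bh() above and that the bool argument controls whether the backing vhost_net is also torn down, it plausibly looks like this (the cleanup branch is an assumption, not the actual patch text):

```c
/*
 * Hypothetical shape of the helper added by patch 1/2 (not quoted here).
 * It mirrors the code removed from chr_closed_bh(): copy the currently
 * acked features out of the vhost_dev into NetVhostUserState, so the
 * backup survives a backend disconnect.  The 'cleanup' flag is assumed
 * to additionally release the vhost_net instance.
 */
void vhost_user_save_acked_features(NetClientState *nc, bool cleanup)
{
    NetVhostUserState *s = DO_UPCAST(NetVhostUserState, nc, nc);

    if (s->vhost_net) {
        /* keep the backup in sync with what the guest actually acked */
        s->acked_features = vhost_net_get_acked_features(s->vhost_net);

        if (cleanup) {
            vhost_net_cleanup(s->vhost_net);
            g_free(s->vhost_net);
            s->vhost_net = NULL;
        }
    }
}
```

With such a helper, the reconnect path can re-seed vhost_dev.acked_features from the NetVhostUserState copy, which is why the copy cannot simply be replaced by the virtio-net guest features (bit 30, VHOST_USER_F_PROTOCOL_FEATURES, differs between the two).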
Re: [PATCH v2 07/11] migration: Implement dirty-limit convergence algo
在 2022/11/30 7:17, Peter Xu 写道: On Mon, Nov 21, 2022 at 11:26:39AM -0500, huang...@chinatelecom.cn wrote: diff --git a/migration/migration.c b/migration/migration.c index 86950a1..096b61a 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -240,6 +240,7 @@ void migration_cancel(const Error *error) if (error) { migrate_set_error(current_migration, error); } +qmp_cancel_vcpu_dirty_limit(false, -1, NULL); Disable it only if migrate_dirty_limit() is true? It seems okay if the admin wants to use dirtylimit separately from migration. Ok. migrate_fd_cancel(current_migration); } [...] @@ -1148,22 +1175,31 @@ static void migration_trigger_throttle(RAMState *rs) uint64_t bytes_dirty_period = rs->num_dirty_pages_period * TARGET_PAGE_SIZE; uint64_t bytes_dirty_threshold = bytes_xfer_period * threshold / 100; -/* During block migration the auto-converge logic incorrectly detects - * that ram migration makes no progress. Avoid this by disabling the - * throttling logic during the bulk phase of block migration. */ -if (migrate_auto_converge() && !blk_mig_bulk_active()) { -/* The following detection logic can be refined later. For now: - Check to see if the ratio between dirtied bytes and the approx. - amount of bytes that just got transferred since the last time - we were in this routine reaches the threshold. If that happens - twice, start or increase throttling. */ - -if ((bytes_dirty_period > bytes_dirty_threshold) && -(++rs->dirty_rate_high_cnt >= 2)) { +/* + * The following detection logic can be refined later. For now: + * Check to see if the ratio between dirtied bytes and the approx. + * amount of bytes that just got transferred since the last time + * we were in this routine reaches the threshold. If that happens + * twice, start or increase throttling. + */ + +if ((bytes_dirty_period > bytes_dirty_threshold) && +(++rs->dirty_rate_high_cnt >= 2)) { +rs->dirty_rate_high_cnt = 0; +/* + * During block migration the auto-converge logic incorrectly detects + * that ram migration makes no progress. Avoid this by disabling the + * throttling logic during the bulk phase of block migration + */ + +if (migrate_auto_converge() && !blk_mig_bulk_active()) { Does dirtylimit cap needs to check blk_mig_bulk_active() too? I assume that check was used to ignore the bulk block migration phase where major bandwidth will be consumed by block migrations so the measured bandwidth is not accurate. IIUC it applies to dirtylimit too.Indeed, i'll add this next version. trace_migration_throttle(); -rs->dirty_rate_high_cnt = 0; mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); +} else if (migrate_dirty_limit() && + kvm_dirty_ring_enabled() && + migration_is_active(s)) { Is "kvm_dirty_ring_enabled()" and "migration_is_active(s)" check helpful? Can we only rely on migrate_dirty_limit() alone? In qmp_set_vcpu_dirty_limit, it checks if kvm enabled and dirty ring size set. When "dirty-limit" capability set, we also check this in migrate_caps_check, so kvm_dirty_ring_enabled can be dropped indeed. As for migration_is_active, dirty-limit can be set anytime and migration is active already in the path. It also can be dropped. I'll fix this next version.
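Editorial note: folding in both review comments (apply the blk_mig_bulk_active() check to dirty-limit as well, and drop the redundant kvm_dirty_ring_enabled()/migration_is_active() checks) gives roughly the shape below. This is only a condensed sketch; it matches the form the code takes in the v9 patches later in this digest, where migration_dirty_limit_guest() is introduced.

```c
static void migration_trigger_throttle(RAMState *rs)
{
    /* ... bytes_dirty_period / bytes_dirty_threshold computed as above ... */

    /*
     * Bulk block migration consumes most of the bandwidth, so the
     * dirtied-vs-transferred ratio is meaningless there; skip throttling
     * for auto-converge and dirty-limit alike.
     */
    if (blk_mig_bulk_active()) {
        return;
    }

    if ((bytes_dirty_period > bytes_dirty_threshold) &&
        (++rs->dirty_rate_high_cnt >= 2)) {
        rs->dirty_rate_high_cnt = 0;
        if (migrate_auto_converge()) {
            trace_migration_throttle();
            mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold);
        } else if (migrate_dirty_limit()) {
            /* throttle via per-vCPU dirty page rate limit instead */
            migration_dirty_limit_guest();
        }
    }
}
```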
Re: [PATCH v2 08/11] migration: Export dirty-limit time info
在 2022/11/30 8:09, Peter Xu 写道: On Mon, Nov 21, 2022 at 11:26:40AM -0500, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Export dirty limit throttle time and estimated ring full time, through which we can observe the process of dirty limit during live migration. Signed-off-by: Hyman Huang(黄勇) --- include/sysemu/dirtylimit.h | 2 ++ migration/migration.c | 10 ++ monitor/hmp-cmds.c | 10 ++ qapi/migration.json | 10 +- softmmu/dirtylimit.c| 31 +++ 5 files changed, 62 insertions(+), 1 deletion(-) diff --git a/include/sysemu/dirtylimit.h b/include/sysemu/dirtylimit.h index 8d2c1f3..98cc4a6 100644 --- a/include/sysemu/dirtylimit.h +++ b/include/sysemu/dirtylimit.h @@ -34,4 +34,6 @@ void dirtylimit_set_vcpu(int cpu_index, void dirtylimit_set_all(uint64_t quota, bool enable); void dirtylimit_vcpu_execute(CPUState *cpu); +int64_t dirtylimit_throttle_us_per_full(void); +int64_t dirtylimit_us_ring_full(void); #endif diff --git a/migration/migration.c b/migration/migration.c index 096b61a..886c25d 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -62,6 +62,7 @@ #include "yank_functions.h" #include "sysemu/qtest.h" #include "sysemu/kvm.h" +#include "sysemu/dirtylimit.h" #define MAX_THROTTLE (128 << 20) /* Migration transfer speed throttling */ @@ -1112,6 +1113,15 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s) info->ram->remaining = ram_bytes_remaining(); info->ram->dirty_pages_rate = ram_counters.dirty_pages_rate; } + +if (migrate_dirty_limit() && dirtylimit_in_service()) { +info->has_dirty_limit_throttle_us_per_full = true; +info->dirty_limit_throttle_us_per_full = +dirtylimit_throttle_us_per_full(); + +info->has_dirty_limit_us_ring_full = true; +info->dirty_limit_us_ring_full = dirtylimit_us_ring_full(); +} } static void populate_disk_info(MigrationInfo *info) diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c index 9ad6ee5..9d02baf 100644 --- a/monitor/hmp-cmds.c +++ b/monitor/hmp-cmds.c @@ -339,6 +339,16 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict) info->cpu_throttle_percentage); } +if (info->has_dirty_limit_throttle_us_per_full) { +monitor_printf(mon, "dirty-limit throttle time: %" PRIi64 " us\n", + info->dirty_limit_throttle_us_per_full); +} + +if (info->has_dirty_limit_us_ring_full) { +monitor_printf(mon, "dirty-limit ring full time: %" PRIi64 " us\n", + info->dirty_limit_us_ring_full); +} + if (info->has_postcopy_blocktime) { monitor_printf(mon, "postcopy blocktime: %u\n", info->postcopy_blocktime); diff --git a/qapi/migration.json b/qapi/migration.json index af6b2da..62db5cb 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -242,6 +242,12 @@ # Present and non-empty when migration is blocked. # (since 6.0) # +# @dirty-limit-throttle-us-per-full: Throttle time (us) during the period of +#dirty ring full (since 7.1) +# +# @dirty-limit-us-ring-full: Estimated periodic time (us) of dirty ring full. +#(since 7.1) s/7.1/7.3/ Could you enrich the document for the new fields? For example, currently you only report throttle time for vcpu0 on the 1st field, while for the latter it's an average of all vcpus. These need to be mentioned. > OTOH, how do you normally use these values? Maybe that can also be added into the documents too. Of course yes. 
I'll do that next version +# # Since: 0.14 ## { 'struct': 'MigrationInfo', @@ -259,7 +265,9 @@ '*postcopy-blocktime' : 'uint32', '*postcopy-vcpu-blocktime': ['uint32'], '*compression': 'CompressionStats', - '*socket-address': ['SocketAddress'] } } + '*socket-address': ['SocketAddress'], + '*dirty-limit-throttle-us-per-full': 'int64', + '*dirty-limit-us-ring-full': 'int64'} } ## # @query-migrate: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 3f3c405..9d1df9b 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -573,6 +573,37 @@ static struct DirtyLimitInfo *dirtylimit_query_vcpu(int cpu_index) return info; } +/* Pick up first vcpu throttle time by default */ +int64_t dirtylimit_throttle_us_per_full(void) +{ +CPUState *cpu = first_cpu
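Editorial note: the dirtylimit.c hunk above is cut off. Based on the "Pick up first vcpu throttle time by default" comment and the throttle_us_per_full field that the "softmmu/dirtylimit: Implement virtual CPU throttle" patch adds to struct CPUState, the v2 getter presumably amounts to the sketch below; this is an assumption, not the actual patch body.

```c
/* Pick up first vcpu throttle time by default (v2 semantics; the v9
 * series later switches this to the per-round maximum over all vCPUs). */
int64_t dirtylimit_throttle_us_per_full(void)
{
    CPUState *cpu = first_cpu;

    return cpu ? cpu->throttle_us_per_full : 0;
}
```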
Re: [PATCH v2 08/11] migration: Export dirty-limit time info
在 2022/11/22 0:26, huang...@chinatelecom.cn 写道: From: Hyman Huang(黄勇) Export dirty limit throttle time and estimated ring full time, through which we can observe the process of dirty limit during live migration. Signed-off-by: Hyman Huang(黄勇) --- include/sysemu/dirtylimit.h | 2 ++ migration/migration.c | 10 ++ monitor/hmp-cmds.c | 10 ++ qapi/migration.json | 10 +- softmmu/dirtylimit.c| 31 +++ 5 files changed, 62 insertions(+), 1 deletion(-) diff --git a/include/sysemu/dirtylimit.h b/include/sysemu/dirtylimit.h index 8d2c1f3..98cc4a6 100644 --- a/include/sysemu/dirtylimit.h +++ b/include/sysemu/dirtylimit.h @@ -34,4 +34,6 @@ void dirtylimit_set_vcpu(int cpu_index, void dirtylimit_set_all(uint64_t quota, bool enable); void dirtylimit_vcpu_execute(CPUState *cpu); +int64_t dirtylimit_throttle_us_per_full(void); +int64_t dirtylimit_us_ring_full(void); #endif diff --git a/migration/migration.c b/migration/migration.c index 096b61a..886c25d 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -62,6 +62,7 @@ #include "yank_functions.h" #include "sysemu/qtest.h" #include "sysemu/kvm.h" +#include "sysemu/dirtylimit.h" #define MAX_THROTTLE (128 << 20) /* Migration transfer speed throttling */ @@ -1112,6 +1113,15 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s) info->ram->remaining = ram_bytes_remaining(); info->ram->dirty_pages_rate = ram_counters.dirty_pages_rate; } + +if (migrate_dirty_limit() && dirtylimit_in_service()) { +info->has_dirty_limit_throttle_us_per_full = true; +info->dirty_limit_throttle_us_per_full = +dirtylimit_throttle_us_per_full(); + +info->has_dirty_limit_us_ring_full = true; +info->dirty_limit_us_ring_full = dirtylimit_us_ring_full(); +} } static void populate_disk_info(MigrationInfo *info) diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c index 9ad6ee5..9d02baf 100644 --- a/monitor/hmp-cmds.c +++ b/monitor/hmp-cmds.c @@ -339,6 +339,16 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict) info->cpu_throttle_percentage); } +if (info->has_dirty_limit_throttle_us_per_full) { +monitor_printf(mon, "dirty-limit throttle time: %" PRIi64 " us\n", + info->dirty_limit_throttle_us_per_full); +} + +if (info->has_dirty_limit_us_ring_full) { +monitor_printf(mon, "dirty-limit ring full time: %" PRIi64 " us\n", + info->dirty_limit_us_ring_full); +} + if (info->has_postcopy_blocktime) { monitor_printf(mon, "postcopy blocktime: %u\n", info->postcopy_blocktime); diff --git a/qapi/migration.json b/qapi/migration.json index af6b2da..62db5cb 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -242,6 +242,12 @@ # Present and non-empty when migration is blocked. # (since 6.0) # +# @dirty-limit-throttle-us-per-full: Throttle time (us) during the period of +#dirty ring full (since 7.1) +# > +# @dirty-limit-us-ring-full: Estimated periodic time (us) of dirty ring full. +#(since 7.1) How about the following documents: # @dirty-limit-throttletime-each-round: Max throttle time (us) of all virtual CPUs each dirty ring # full round, used to observe if dirty-limit take effect # during live migration. (since 7.3) # # @dirty-limit-ring-full-time: Estimated average dirty ring full time (us) each round, note that # the value equals dirty ring memory size divided by average dirty # page rate of virtual CPU, which can be used to observe the average # memory load of virtual CPU indirectly. (since 7.3) Is it more easy-understanding ? 
+# # Since: 0.14 ## { 'struct': 'MigrationInfo', @@ -259,7 +265,9 @@ '*postcopy-blocktime' : 'uint32', '*postcopy-vcpu-blocktime': ['uint32'], '*compression': 'CompressionStats', - '*socket-address': ['SocketAddress'] } } + '*socket-address': ['SocketAddress'], + '*dirty-limit-throttle-us-per-full': 'int64', + '*dirty-limit-us-ring-full': 'int64'} } > ## # @query-migrate: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 3f3c405..9d1df9b 100644 ---
Re: [PULL 06/30] softmmu/dirtylimit: Implement virtual CPU throttle
On 2022/7/29 22:14, Richard Henderson wrote: On 7/29/22 06:31, Peter Maydell wrote: On Wed, 20 Jul 2022 at 12:30, Dr. David Alan Gilbert (git) wrote: From: Hyman Huang(黄勇) Set up a negative feedback system for when the vCPU thread handles a KVM_EXIT_DIRTY_RING_FULL exit, by introducing a throttle_us_per_full field in struct CPUState. Sleep throttle_us_per_full microseconds to throttle the vCPU if dirtylimit is in service. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Peter Xu Message-Id: <977e808e03a1cef5151cae75984658b6821be618.1656177590.git.huang...@chinatelecom.cn> Signed-off-by: Dr. David Alan Gilbert

Hi; Coverity points out a problem with this code (CID 1490787):

Thanks for pointing out this bug. I'm requesting access to https://scan.coverity.com so that Coverity problems can be caught once a new series is posted. Hoping this bug doesn't appear anymore. :)

+static inline int64_t dirtylimit_dirty_ring_full_time(uint64_t dirtyrate) +{ + static uint64_t max_dirtyrate; + uint32_t dirty_ring_size = kvm_dirty_ring_size(); + uint64_t dirty_ring_size_meory_MB = + dirty_ring_size * TARGET_PAGE_SIZE >> 20;

Because dirty_ring_size and TARGET_PAGE_SIZE are both 32 bits, this multiplication will be done as a 32-bit operation, which could overflow. You should cast one of the operands to uint64_t to ensure that the operation is done as a 64-bit multiplication.

To compute MB, you don't need multiplication: dirty_ring_size >> (20 - TARGET_PAGE_BITS) In addition, why the mismatch in type? dirty_ring_size_memory_MB can never be larger than dirty_ring_size. r~

I'll post a bugfix patch later following your suggestion; please review, thanks. Side note: typo in the variable name: should be 'memory'.

+ if (max_dirtyrate < dirtyrate) { + max_dirtyrate = dirtyrate; + } + + return dirty_ring_size_meory_MB * 100 / max_dirtyrate; +} thanks -- PMM
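Editorial note: a minimal illustration of the Coverity finding and of both suggested fixes, using only the names that appear in the quoted code:

```c
uint32_t dirty_ring_size = kvm_dirty_ring_size();

/* buggy: both operands are 32-bit, so the multiply happens in 32 bits
 * and can overflow before the result is widened to uint64_t */
uint64_t mb_overflowing = dirty_ring_size * TARGET_PAGE_SIZE >> 20;

/* fix (a): force a 64-bit multiply by casting one operand */
uint64_t mb_cast = (uint64_t)dirty_ring_size * TARGET_PAGE_SIZE >> 20;

/* fix (b), as suggested above: converting a page count to megabytes
 * needs no multiplication at all */
uint64_t mb_shift = dirty_ring_size >> (20 - TARGET_PAGE_BITS);
```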
Re: [PATCH 0/8] migration: introduce dirtylimit capability
Ping. How about this series? Hoping to get comments if anyone has played with it. Thanks! Hyman

On 2022/7/23 15:49, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇)

Abstract

This series adds a new migration capability called "dirtylimit". It can be enabled when the dirty ring is enabled, and it improves vCPU performance during migration. It is based on the previous patchset: https://lore.kernel.org/qemu-devel/cover.1656177590.git.huang...@chinatelecom.cn/

As mentioned in the patchset "support dirty restraint on vCPU", the dirtylimit way of migration can avoid penalizing the read process. This series wires up the vCPU dirty limit and wraps it as the dirtylimit capability of migration. I introduce two parameters, vcpu-dirtylimit-period and vcpu-dirtylimit, to implement the setup of dirtylimit during live migration.

To validate the implementation, I tested live migration of a 32-vCPU VM with the following model: only dirty vcpu0 and vcpu1 with a heavy memory workload, leave the rest of the vCPUs untouched, and run unixbench on vcpu8-vcpu15 by setting the cpu affinity with the following command: taskset -c 8-15 ./Run -i 2 -c 8 {unixbench test item}

The following are results:

host cpu: Intel(R) Xeon(R) Platinum 8378A
host interface speed: 1000Mb/s

| UnixBench test item | Normal | Dirtylimit | Auto-converge |
|---------------------+--------+------------+---------------|
| dhry2reg            | 32800  | 32786      | 25292         |
| whetstone-double    | 10326  | 10315      | 9847          |
| pipe                | 15442  | 15271      | 14506         |
| context1            | 7260   | 6235       | 4514          |
| spawn               | 3663   | 3317       | 3249          |
| syscall             | 4669   | 4667       | 3841          |

From the data above we can conclude that vCPUs that do not dirty memory in the VM are almost unaffected during dirtylimit migration, whereas they are penalized with auto-converge.

I also tested the total time of dirtylimit migration with variable dirty memory size in the VM.

scenario 1:
host cpu: Intel(R) Xeon(R) Platinum 8378A
host interface speed: 1000Mb/s

| dirty memory size(MB) | Dirtylimit(ms) | Auto-converge(ms) |
|-----------------------+----------------+-------------------|
| 60                    | 2014           | 2131              |
| 70                    | 5381           | 12590             |
| 90                    | 6037           | 33545             |
| 110                   | 7660           | [*]               |

[*]: This case means migration is not convergent.

scenario 2:
host cpu: Intel(R) Xeon(R) CPU E5-2650
host interface speed: 1Mb/s

| dirty memory size(MB) | Dirtylimit(ms) | Auto-converge(ms) |
|-----------------------+----------------+-------------------|
| 1600                  | 15842          | 27548             |
| 2000                  | 19026          | 38447             |
| 2400                  | 19897          | 46381             |
| 2800                  | 22338          | 57149             |

The data above shows that the dirtylimit way of migration can also reduce the total migration time and achieves convergence more easily in some cases.

In addition to implementing the dirtylimit capability itself, this series adds 3 tests for migration, so developers can easily play around with it: 1. qtest for dirty limit migration 2. support dirty ring way of migration for guestperf tool 3. support dirty limit migration for guestperf tool

Please review, thanks!
Hyman Huang (8): qapi/migration: Introduce x-vcpu-dirty-limit-period parameter qapi/migration: Introduce vcpu-dirty-limit parameters migration: Introduce dirty-limit capability migration: Implement dirty-limit convergence algo migration: Export dirty-limit time info tests: Add migration dirty-limit capability test tests/migration: Introduce dirty-ring-size option into guestperf tests/migration: Introduce dirty-limit into guestperf include/sysemu/dirtylimit.h | 2 + migration/migration.c | 50 ++ migration/migration.h | 1 + migration/ram.c | 53 ++- migration/trace-events | 1 + monitor/hmp-cmds.c | 26 ++ qapi/migration.json | 57 softmmu/dirtylimit.c| 33 +++- tests/migration/guestperf/comparison.py | 14 + tests/migration/guestperf/engine.py
[PATCH QEMU v9 0/9] migration: introduce dirtylimit capability
Markus, thanks for crafting the comments; please review the latest version. Yong

v7~v9:
Rebase on master, fix conflicts and craft the docs as suggested by Markus

v6:
1. Rebase on master
2. Split the commit "Implement dirty-limit convergence algo" into two as Juan suggested, as follows:
   a. Put the detection logic before auto-converge checking
   b. Implement dirty-limit convergence algo
3. Put the detection logic before auto-converge checking
4. Sort the migrate_dirty_limit function in commit "Introduce dirty-limit capability" as suggested by Juan
5. Substitute int64_t with uint64_t in the last 2 commits
6. Fix spelling mistakes in the comments
7. Add a helper function in the commit "Implement dirty-limit convergence algo" as suggested by Juan

v5:
1. Rebase on master and enrich the comment for the "dirty-limit" capability, as suggested by Markus.
2. Drop commits that have already been merged.

v4:
1. Polish the docs and update the release version as suggested by Markus
2. Rename the migrate exported info "dirty-limit-throttle-time-per-round" to "dirty-limit-throttle-time-per-full".

v3(resend):
- fix the syntax error of the topic.

v3:
This version makes some modifications inspired by Peter and Markus, as follows:
1. Do the code clean up in [PATCH v2 02/11] suggested by Markus
2. Replace [PATCH v2 03/11] with a much simpler patch posted by Peter to fix the following bug: https://bugzilla.redhat.com/show_bug.cgi?id=2124756
3. Fix the error path of migrate_params_check in [PATCH v2 04/11] pointed out by Markus. Enrich the commit message to explain why x-vcpu-dirty-limit-period is an unstable parameter.
4. Refactor the dirty-limit convergence algo in [PATCH v2 07/11] as suggested by Peter:
   a. apply the blk_mig_bulk_active check before enabling dirty-limit
   b. drop the unhelpful check function before enabling dirty-limit
   c. change the migration_cancel logic, cancel dirty-limit only if the dirty-limit capability is turned on.
   d. abstract a code clean commit [PATCH v3 07/10] to adjust the check order before enabling auto-converge
5. Change the names of the observed indexes during dirty-limit live migration to make them easier to understand. Use the maximum throttle time of the vcpus as "dirty-limit-throttle-time-per-full"
6. Fix some grammatical and spelling errors pointed out by Markus and enrich the documentation of the dirty-limit live migration observed indexes "dirty-limit-ring-full-time" and "dirty-limit-throttle-time-per-full"
7. Change the default value of x-vcpu-dirty-limit-period to 1000ms, which is the optimal value pointed out in the cover letter for that testing environment.
8. Drop the 2 guestperf test commits [PATCH v2 10/11], [PATCH v2 11/11] and post them in a standalone series in the future.

v2:
This version makes a few modifications compared with version 1, as follows:
1. fix the overflow issue reported by Peter Maydell
2. add parameter check for the hmp "set_vcpu_dirty_limit" command
3. fix the racing issue between the dirty ring reaper thread and the Qemu main thread.
4. add migrate parameter checks for x-vcpu-dirty-limit-period and vcpu-dirty-limit.
5. add the logic to forbid the hmp/qmp commands set_vcpu_dirty_limit, cancel_vcpu_dirty_limit during dirty-limit live migration when implementing the dirty-limit convergence algo.
6. add capability check to ensure auto-converge and dirty-limit are mutually exclusive.
7.
pre-check if kvm dirty ring size is configured before setting dirty-limit migrate parameter Hyman Huang(黄勇) (9): softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit" qapi/migration: Introduce x-vcpu-dirty-limit-period parameter qapi/migration: Introduce vcpu-dirty-limit parameters migration: Introduce dirty-limit capability migration: Refactor auto-converge capability logic migration: Put the detection logic before auto-converge checking migration: Implement dirty-limit convergence algorithm migration: Extend query-migrate to provide dirty-limit info tests: Add migration dirty-limit capability test include/sysemu/dirtylimit.h| 2 + migration/migration-hmp-cmds.c | 26 ++ migration/migration.c | 13 +++ migration/options.c| 73 migration/options.h| 1 + migration/ram.c| 61 ++--- migration/trace-events | 1 + qapi/migration.json| 72 +-- softmmu/dirtylimit.c | 91 +-- tests/qtest/migration-test.c | 155 + 10 files changed, 470 insertions(+), 25 deletions(-) -- 2.38.5
[PATCH QEMU v9 2/9] qapi/migration: Introduce x-vcpu-dirty-limit-period parameter
From: Hyman Huang(黄勇) Introduce "x-vcpu-dirty-limit-period" migration experimental parameter, which is in the range of 1 to 1000ms and used to make dirty page rate calculation period configurable. Currently with the "x-vcpu-dirty-limit-period" varies, the total time of live migration changes, test results show the optimal value of "x-vcpu-dirty-limit-period" ranges from 500ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made stable once it proves best value can not be determined with developer's experiments. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/migration-hmp-cmds.c | 8 migration/options.c| 28 +++ qapi/migration.json| 35 +++--- 3 files changed, 64 insertions(+), 7 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 9885d7c9f7..352e9ec716 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -364,6 +364,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) } } } + +monitor_printf(mon, "%s: %" PRIu64 " ms\n", +MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD), +params->x_vcpu_dirty_limit_period); } qapi_free_MigrationParameters(params); @@ -620,6 +624,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) error_setg(&err, "The block-bitmap-mapping parameter can only be set " "through QMP"); break; +case MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD: +p->has_x_vcpu_dirty_limit_period = true; +visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err); +break; default: assert(0); } diff --git a/migration/options.c b/migration/options.c index 5a9505adf7..1de63ba775 100644 --- a/migration/options.c +++ b/migration/options.c @@ -80,6 +80,8 @@ #define DEFINE_PROP_MIG_CAP(name, x) \ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 1000/* milliseconds */ + Property migration_properties[] = { DEFINE_PROP_BOOL("store-global-state", MigrationState, store_global_state, true), @@ -163,6 +165,9 @@ Property migration_properties[] = { DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds), DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname), DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz), +DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState, + parameters.x_vcpu_dirty_limit_period, + DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -908,6 +913,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) s->parameters.block_bitmap_mapping); } +params->has_x_vcpu_dirty_limit_period = true; +params->x_vcpu_dirty_limit_period = s->parameters.x_vcpu_dirty_limit_period; + return params; } @@ -940,6 +948,7 @@ void migrate_params_init(MigrationParameters *params) params->has_announce_max = true; params->has_announce_rounds = true; params->has_announce_step = true; +params->has_x_vcpu_dirty_limit_period = true; } /* @@ -1100,6 +1109,15 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) } #endif +if (params->has_x_vcpu_dirty_limit_period && +(params->x_vcpu_dirty_limit_period < 1 || + params->x_vcpu_dirty_limit_period > 1000)) { +error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "x-vcpu-dirty-limit-period", + "a value between 1 and 1000"); +return false; +} + return true; } @@ -1199,6 +1217,11 @@ static void migrate_params_test_apply(MigrateSetParameters *params, 
dest->has_block_bitmap_mapping = true; dest->block_bitmap_mapping = params->block_bitmap_mapping; } + +if (params->has_x_vcpu_dirty_limit_period) { +dest->x_vcpu_dirty_limit_period = +params->x_vcpu_dirty_limit_period; +} } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1317,6 +1340,11 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) QAPI_CLONE(BitmapMigrationNodeAliasList, params->block_bitmap_mapping); } + +if (params->has_x_vcpu_dirty_limit_period) { +
[PATCH QEMU v9 6/9] migration: Put the detection logic before auto-converge checking
From: Hyman Huang(黄勇) This commit is prepared for the implementation of dirty-limit convergence algo. The detection logic of throttling condition can apply to both auto-converge and dirty-limit algo, putting it's position before the checking logic for auto-converge feature. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Juan Quintela --- migration/ram.c | 21 +++-- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index f31de47a47..1d9300f4c5 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -999,17 +999,18 @@ static void migration_trigger_throttle(RAMState *rs) return; } -if (migrate_auto_converge()) { -/* The following detection logic can be refined later. For now: - Check to see if the ratio between dirtied bytes and the approx. - amount of bytes that just got transferred since the last time - we were in this routine reaches the threshold. If that happens - twice, start or increase throttling. */ - -if ((bytes_dirty_period > bytes_dirty_threshold) && -(++rs->dirty_rate_high_cnt >= 2)) { +/* + * The following detection logic can be refined later. For now: + * Check to see if the ratio between dirtied bytes and the approx. + * amount of bytes that just got transferred since the last time + * we were in this routine reaches the threshold. If that happens + * twice, start or increase throttling. + */ +if ((bytes_dirty_period > bytes_dirty_threshold) && +(++rs->dirty_rate_high_cnt >= 2)) { +rs->dirty_rate_high_cnt = 0; +if (migrate_auto_converge()) { trace_migration_throttle(); -rs->dirty_rate_high_cnt = 0; mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); } -- 2.38.5
[PATCH QEMU v9 1/9] softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit"
From: Hyman Huang(黄勇) The dirty_rate parameter of the hmp command "set_vcpu_dirty_limit" is invalid if it is less than 0, so add a parameter check for it. Note that this patch also deletes the unsolicited help message and cleans up the code. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Peter Xu Reviewed-by: Juan Quintela --- softmmu/dirtylimit.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 015a9038d1..5c12d26d49 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -515,14 +515,15 @@ void hmp_set_vcpu_dirty_limit(Monitor *mon, const QDict *qdict) int64_t cpu_index = qdict_get_try_int(qdict, "cpu_index", -1); Error *err = NULL; -qmp_set_vcpu_dirty_limit(!!(cpu_index != -1), cpu_index, dirty_rate, &err); -if (err) { -hmp_handle_error(mon, err); -return; +if (dirty_rate < 0) { +error_setg(&err, "invalid dirty page limit %ld", dirty_rate); +goto out; } -monitor_printf(mon, "[Please use 'info vcpu_dirty_limit' to query " - "dirty limit for virtual CPU]\n"); +qmp_set_vcpu_dirty_limit(!!(cpu_index != -1), cpu_index, dirty_rate, &err); + +out: +hmp_handle_error(mon, err); } static struct DirtyLimitInfo *dirtylimit_query_vcpu(int cpu_index) -- 2.38.5
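Editorial note: qmp_set_vcpu_dirty_limit() is the QAPI-generated handler for the set-vcpu-dirty-limit command; optional QAPI arguments arrive as a has_*/value pair, which is why the HMP wrapper passes `!!(cpu_index != -1)`. The assumed generated prototype, consistent with the calls quoted in this series:

```c
/* Assumed prototype of the QAPI-generated command handler: the optional
 * cpu-index argument becomes a (has_cpu_index, cpu_index) pair, and a
 * missing cpu-index means "apply the limit to all vCPUs". */
void qmp_set_vcpu_dirty_limit(bool has_cpu_index, int64_t cpu_index,
                              uint64_t dirty_rate, Error **errp);
```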
[PATCH QEMU v9 3/9] qapi/migration: Introduce vcpu-dirty-limit parameters
From: Hyman Huang(黄勇) Introduce "vcpu-dirty-limit" migration parameter used to limit dirty page rate during live migration. "vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are two dirty-limit-related migration parameters, which can be set before and during live migration by qmp migrate-set-parameters. This two parameters are used to help implement the dirty page rate limit algo of migration. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/migration-hmp-cmds.c | 8 migration/options.c| 21 + qapi/migration.json| 18 +++--- 3 files changed, 44 insertions(+), 3 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 352e9ec716..35e8020bbf 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -368,6 +368,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) monitor_printf(mon, "%s: %" PRIu64 " ms\n", MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD), params->x_vcpu_dirty_limit_period); + +monitor_printf(mon, "%s: %" PRIu64 " MB/s\n", +MigrationParameter_str(MIGRATION_PARAMETER_VCPU_DIRTY_LIMIT), +params->vcpu_dirty_limit); } qapi_free_MigrationParameters(params); @@ -628,6 +632,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) p->has_x_vcpu_dirty_limit_period = true; visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err); break; +case MIGRATION_PARAMETER_VCPU_DIRTY_LIMIT: +p->has_vcpu_dirty_limit = true; +visit_type_size(v, param, &p->vcpu_dirty_limit, &err); +break; default: assert(0); } diff --git a/migration/options.c b/migration/options.c index 1de63ba775..7d2d98830e 100644 --- a/migration/options.c +++ b/migration/options.c @@ -81,6 +81,7 @@ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) #define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 1000/* milliseconds */ +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT1 /* MB/s */ Property migration_properties[] = { DEFINE_PROP_BOOL("store-global-state", MigrationState, @@ -168,6 +169,9 @@ Property migration_properties[] = { DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState, parameters.x_vcpu_dirty_limit_period, DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD), +DEFINE_PROP_UINT64("vcpu-dirty-limit", MigrationState, + parameters.vcpu_dirty_limit, + DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -915,6 +919,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->has_x_vcpu_dirty_limit_period = true; params->x_vcpu_dirty_limit_period = s->parameters.x_vcpu_dirty_limit_period; +params->has_vcpu_dirty_limit = true; +params->vcpu_dirty_limit = s->parameters.vcpu_dirty_limit; return params; } @@ -949,6 +955,7 @@ void migrate_params_init(MigrationParameters *params) params->has_announce_rounds = true; params->has_announce_step = true; params->has_x_vcpu_dirty_limit_period = true; +params->has_vcpu_dirty_limit = true; } /* @@ -1118,6 +1125,14 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) return false; } +if (params->has_vcpu_dirty_limit && +(params->vcpu_dirty_limit < 1)) { +error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "vcpu_dirty_limit", + "is invalid, it must greater then 1 MB/s"); +return false; +} + return true; } @@ -1222,6 +1237,9 @@ static void migrate_params_test_apply(MigrateSetParameters *params, dest->x_vcpu_dirty_limit_period = params->x_vcpu_dirty_limit_period; } +if (params->has_vcpu_dirty_limit) { +dest->vcpu_dirty_limit = 
params->vcpu_dirty_limit; +} } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1345,6 +1363,9 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) s->parameters.x_vcpu_dirty_limit_period = params->x_vcpu_dirty_limit_period; } +if (params->has_vcpu_dirty_limit) { +s->parameters.vcpu_dirty_limit = params->vcpu_dirty_limit; +} } void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp) diff --git a/qapi/migration.json b/qapi/migration.json index 363055d252..7e92dfa045 100644 --- a/qapi/migration.json
[PATCH QEMU v9 7/9] migration: Implement dirty-limit convergence algorithm
From: Hyman Huang(黄勇) Implement dirty-limit convergence algorithm for live migration, which is kind of like auto-converge algo but using dirty-limit instead of cpu throttle to make migration convergent. Enable dirty page limit if dirty_rate_high_cnt greater than 2 when dirty-limit capability enabled, Disable dirty-limit if migration be canceled. Note that "set_vcpu_dirty_limit", "cancel_vcpu_dirty_limit" commands are not allowed during dirty-limit live migration. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/migration.c | 3 +++ migration/ram.c| 36 migration/trace-events | 1 + softmmu/dirtylimit.c | 29 + 4 files changed, 69 insertions(+) diff --git a/migration/migration.c b/migration/migration.c index 91bba630a8..619af62461 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -166,6 +166,9 @@ void migration_cancel(const Error *error) if (error) { migrate_set_error(current_migration, error); } +if (migrate_dirty_limit()) { +qmp_cancel_vcpu_dirty_limit(false, -1, NULL); +} migrate_fd_cancel(current_migration); } diff --git a/migration/ram.c b/migration/ram.c index 1d9300f4c5..9040d66e61 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -46,6 +46,7 @@ #include "qapi/error.h" #include "qapi/qapi-types-migration.h" #include "qapi/qapi-events-migration.h" +#include "qapi/qapi-commands-migration.h" #include "qapi/qmp/qerror.h" #include "trace.h" #include "exec/ram_addr.h" @@ -59,6 +60,8 @@ #include "multifd.h" #include "sysemu/runstate.h" #include "options.h" +#include "sysemu/dirtylimit.h" +#include "sysemu/kvm.h" #include "hw/boards.h" /* for machine_dump_guest_core() */ @@ -984,6 +987,37 @@ static void migration_update_rates(RAMState *rs, int64_t end_time) } } +/* + * Enable dirty-limit to throttle down the guest + */ +static void migration_dirty_limit_guest(void) +{ +/* + * dirty page rate quota for all vCPUs fetched from + * migration parameter 'vcpu_dirty_limit' + */ +static int64_t quota_dirtyrate; +MigrationState *s = migrate_get_current(); + +/* + * If dirty limit already enabled and migration parameter + * vcpu-dirty-limit untouched. 
+ */ +if (dirtylimit_in_service() && +quota_dirtyrate == s->parameters.vcpu_dirty_limit) { +return; +} + +quota_dirtyrate = s->parameters.vcpu_dirty_limit; + +/* + * Set all vCPU a quota dirtyrate, note that the second + * parameter will be ignored if setting all vCPU for the vm + */ +qmp_set_vcpu_dirty_limit(false, -1, quota_dirtyrate, NULL); +trace_migration_dirty_limit_guest(quota_dirtyrate); +} + static void migration_trigger_throttle(RAMState *rs) { uint64_t threshold = migrate_throttle_trigger_threshold(); @@ -1013,6 +1047,8 @@ static void migration_trigger_throttle(RAMState *rs) trace_migration_throttle(); mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); +} else if (migrate_dirty_limit()) { +migration_dirty_limit_guest(); } } } diff --git a/migration/trace-events b/migration/trace-events index 5259c1044b..580895e86e 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -93,6 +93,7 @@ migration_bitmap_sync_start(void) "" migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64 migration_bitmap_clear_dirty(char *str, uint64_t start, uint64_t size, unsigned long page) "rb %s start 0x%"PRIx64" size 0x%"PRIx64" page 0x%lx" migration_throttle(void) "" +migration_dirty_limit_guest(int64_t dirtyrate) "guest dirty page rate limit %" PRIi64 " MB/s" ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %" PRIx64 " %zx" ram_load_loop(const char *rbname, uint64_t addr, int flags, void *host) "%s: addr: 0x%" PRIx64 " flags: 0x%x host: %p" ram_load_postcopy_loop(int channel, uint64_t addr, int flags) "chan=%d addr=0x%" PRIx64 " flags=0x%x" diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 953ef934bc..5134296667 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -436,6 +436,23 @@ static void dirtylimit_cleanup(void) dirtylimit_state_finalize(); } +/* + * dirty page rate limit is not allowed to set if migration + * is running with dirty-limit capability enabled. + */ +static bool dirtylimit_is_allowed(void) +{ +MigrationState *ms = migrate_get_current(); + +if (migration_is_running(ms->state) && +(!qemu_thread_is_
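Editorial note: the hunk above is cut off inside dirtylimit_is_allowed(). Judging from its comment and from the commit message ("set_vcpu_dirty_limit", "cancel_vcpu_dirty_limit" commands are not allowed during dirty-limit live migration), the check plausibly continues along these lines; this is a sketch under those assumptions, not the actual patch text.

```c
static bool dirtylimit_is_allowed(void)
{
    MigrationState *ms = migrate_get_current();

    /* Refuse manual dirty-limit commands while a dirty-limit-driven
     * migration owns the throttle; the migration thread itself is
     * still allowed to adjust it. */
    if (migration_is_running(ms->state) &&
        (!qemu_thread_is_self(&ms->thread)) &&
        migrate_dirty_limit() &&
        dirtylimit_in_service()) {
        return false;
    }
    return true;
}
```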
[PATCH QEMU v9 4/9] migration: Introduce dirty-limit capability
From: Hyman Huang(黄勇) Introduce migration dirty-limit capability, which can be turned on before live migration and limit dirty page rate durty live migration. Introduce migrate_dirty_limit function to help check if dirty-limit capability enabled during live migration. Meanwhile, refactor vcpu_dirty_rate_stat_collect so that period can be configured instead of hardcoded. dirty-limit capability is kind of like auto-converge but using dirty limit instead of traditional cpu-throttle to throttle guest down. To enable this feature, turn on the dirty-limit capability before live migration using migrate-set-capabilities, and set the parameters "x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably to speed up convergence. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/options.c | 24 migration/options.h | 1 + qapi/migration.json | 9 - softmmu/dirtylimit.c | 12 +++- 4 files changed, 44 insertions(+), 2 deletions(-) diff --git a/migration/options.c b/migration/options.c index 7d2d98830e..631c12cf32 100644 --- a/migration/options.c +++ b/migration/options.c @@ -27,6 +27,7 @@ #include "qemu-file.h" #include "ram.h" #include "options.h" +#include "sysemu/kvm.h" /* Maximum migrate downtime set to 2000 seconds */ #define MAX_MIGRATE_DOWNTIME_SECONDS 2000 @@ -196,6 +197,8 @@ Property migration_properties[] = { #endif DEFINE_PROP_MIG_CAP("x-switchover-ack", MIGRATION_CAPABILITY_SWITCHOVER_ACK), +DEFINE_PROP_MIG_CAP("x-dirty-limit", +MIGRATION_CAPABILITY_DIRTY_LIMIT), DEFINE_PROP_END_OF_LIST(), }; @@ -242,6 +245,13 @@ bool migrate_dirty_bitmaps(void) return s->capabilities[MIGRATION_CAPABILITY_DIRTY_BITMAPS]; } +bool migrate_dirty_limit(void) +{ +MigrationState *s = migrate_get_current(); + +return s->capabilities[MIGRATION_CAPABILITY_DIRTY_LIMIT]; +} + bool migrate_events(void) { MigrationState *s = migrate_get_current(); @@ -573,6 +583,20 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) } } +if (new_caps[MIGRATION_CAPABILITY_DIRTY_LIMIT]) { +if (new_caps[MIGRATION_CAPABILITY_AUTO_CONVERGE]) { +error_setg(errp, "dirty-limit conflicts with auto-converge" + " either of then available currently"); +return false; +} + +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +error_setg(errp, "dirty-limit requires KVM with accelerator" + " property 'dirty-ring-size' set"); +return false; +} +} + return true; } diff --git a/migration/options.h b/migration/options.h index 9aaf363322..b5a950d4e4 100644 --- a/migration/options.h +++ b/migration/options.h @@ -24,6 +24,7 @@ extern Property migration_properties[]; /* capabilities */ bool migrate_auto_converge(void); +bool migrate_dirty_limit(void); bool migrate_background_snapshot(void); bool migrate_block(void); bool migrate_colo(void); diff --git a/qapi/migration.json b/qapi/migration.json index 7e92dfa045..1c289e6658 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -497,6 +497,12 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # +# @dirty-limit: If enabled, migration will throttle vCPUs as needed to +# keep their dirty page rate within @vcpu-dirty-limit. This can +# improve responsiveness of large guests during live migration, +# and can result in more stable read performance. Requires KVM +# with accelerator property "dirty-ring-size" set. (Since 8.1) +# # Features: # # @unstable: Members @x-colo and @x-ignore-shared are experimental. 
@@ -512,7 +518,8 @@ 'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate', { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] }, 'validate-uuid', 'background-snapshot', - 'zero-copy-send', 'postcopy-preempt', 'switchover-ack'] } + 'zero-copy-send', 'postcopy-preempt', 'switchover-ack', + 'dirty-limit'] } ## # @MigrationCapabilityStatus: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 5c12d26d49..953ef934bc 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -24,6 +24,9 @@ #include "hw/boards.h" #include "sysemu/kvm.h" #include "trace.h" +#include "migration/misc.h" +#include "migration/migration.h" +#include "migration/options.h" /* * Dirtylimit stop working if dirty p
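Editorial note: the dirtylimit.c hunk is truncated above. The commit message says vcpu_dirty_rate_stat_collect now takes its sampling period from the x-vcpu-dirty-limit-period parameter instead of a hardcoded value, so the refactor presumably looks roughly like the sketch below; the default constant and exact structure are assumptions.

```c
static void vcpu_dirty_rate_stat_collect(void)
{
    MigrationState *s = migrate_get_current();
    /* default sampling period when dirty-limit migration isn't driving it;
     * the actual constant name in the patch is not visible here */
    int64_t period = 1000; /* ms */

    if (migrate_dirty_limit() && migration_is_active(s)) {
        period = s->parameters.x_vcpu_dirty_limit_period;
    }

    /* ... sample per-vCPU dirty page rates over 'period' milliseconds ... */
}
```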
[PATCH QEMU v9 8/9] migration: Extend query-migrate to provide dirty-limit info
From: Hyman Huang(黄勇) Extend query-migrate to provide throttle time and estimated ring full time with dirty-limit capability enabled, through which we can observe if dirty limit take effect during live migration. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- include/sysemu/dirtylimit.h| 2 ++ migration/migration-hmp-cmds.c | 10 + migration/migration.c | 10 + qapi/migration.json| 16 +- softmmu/dirtylimit.c | 39 ++ 5 files changed, 76 insertions(+), 1 deletion(-) diff --git a/include/sysemu/dirtylimit.h b/include/sysemu/dirtylimit.h index 8d2c1f3a6b..d11edb 100644 --- a/include/sysemu/dirtylimit.h +++ b/include/sysemu/dirtylimit.h @@ -34,4 +34,6 @@ void dirtylimit_set_vcpu(int cpu_index, void dirtylimit_set_all(uint64_t quota, bool enable); void dirtylimit_vcpu_execute(CPUState *cpu); +uint64_t dirtylimit_throttle_time_per_round(void); +uint64_t dirtylimit_ring_full_time(void); #endif diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 35e8020bbf..c115ef2d23 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -190,6 +190,16 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict) info->cpu_throttle_percentage); } +if (info->has_dirty_limit_throttle_time_per_round) { +monitor_printf(mon, "dirty-limit throttle time: %" PRIu64 " us\n", + info->dirty_limit_throttle_time_per_round); +} + +if (info->has_dirty_limit_ring_full_time) { +monitor_printf(mon, "dirty-limit ring full time: %" PRIu64 " us\n", + info->dirty_limit_ring_full_time); +} + if (info->has_postcopy_blocktime) { monitor_printf(mon, "postcopy blocktime: %u\n", info->postcopy_blocktime); diff --git a/migration/migration.c b/migration/migration.c index 619af62461..3b8587c4ae 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -64,6 +64,7 @@ #include "yank_functions.h" #include "sysemu/qtest.h" #include "options.h" +#include "sysemu/dirtylimit.h" static NotifierList migration_state_notifiers = NOTIFIER_LIST_INITIALIZER(migration_state_notifiers); @@ -974,6 +975,15 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s) info->ram->dirty_pages_rate = stat64_get(&mig_stats.dirty_pages_rate); } + +if (migrate_dirty_limit() && dirtylimit_in_service()) { +info->has_dirty_limit_throttle_time_per_round = true; +info->dirty_limit_throttle_time_per_round = +dirtylimit_throttle_time_per_round(); + +info->has_dirty_limit_ring_full_time = true; +info->dirty_limit_ring_full_time = dirtylimit_ring_full_time(); +} } static void populate_disk_info(MigrationInfo *info) diff --git a/qapi/migration.json b/qapi/migration.json index 1c289e6658..8740ce268c 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -250,6 +250,18 @@ # blocked. Present and non-empty when migration is blocked. # (since 6.0) # +# @dirty-limit-throttle-time-per-round: Maximum throttle time +# (in microseconds) of virtual CPUs each dirty ring full round, +# which shows how MigrationCapability dirty-limit affects the +# guest during live migration. (since 8.1) +# +# @dirty-limit-ring-full-time: Estimated average dirty ring full +# time (in microseconds) each dirty ring full round. The value +# equals dirty ring memory size divided by average dirty page +# rate of the virtual CPU, which can be used to observe the +# average memory load of the virtual CPU indirectly. Note that +# zero means guest doesn't dirty memory. 
(since 8.1) +# # Since: 0.14 ## { 'struct': 'MigrationInfo', @@ -267,7 +279,9 @@ '*postcopy-blocktime' : 'uint32', '*postcopy-vcpu-blocktime': ['uint32'], '*compression': 'CompressionStats', - '*socket-address': ['SocketAddress'] } } + '*socket-address': ['SocketAddress'], + '*dirty-limit-throttle-time-per-round': 'uint64', + '*dirty-limit-ring-full-time': 'uint64'} } ## # @query-migrate: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 5134296667..a0686323e5 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -565,6 +565,45 @@ out: hmp_handle_error(mon, err); } +/* Return the max throttle time of each virtual CPU */ +uint64_t dirtylimit_throttle_time_per_round(void) +{ +CPUState *cpu; +
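Editorial note: the quoted helper is cut off after its declaration. Based on the "Return the max throttle time of each virtual CPU" comment, the QAPI documentation in this same patch, and the throttle_us_per_full CPUState field introduced earlier in this series, a plausible completion is sketched below; the exact body is an assumption.

```c
/* Report the largest per-vCPU throttle time of the last
 * dirty-ring-full round (max-over-vCPUs semantics per the QAPI docs). */
uint64_t dirtylimit_throttle_time_per_round(void)
{
    CPUState *cpu;
    uint64_t max = 0;

    CPU_FOREACH(cpu) {
        if (cpu->throttle_us_per_full > max) {
            max = cpu->throttle_us_per_full;
        }
    }

    return max;
}
```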
[PATCH QEMU v9 5/9] migration: Refactor auto-converge capability logic
From: Hyman Huang(黄勇) Check if block migration is running before throttling guest down in auto-converge way. Note that this modification is kind of like code clean, because block migration does not depend on auto-converge capability, so the order of checks can be adjusted. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/ram.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/migration/ram.c b/migration/ram.c index 0ada6477e8..f31de47a47 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -995,7 +995,11 @@ static void migration_trigger_throttle(RAMState *rs) /* During block migration the auto-converge logic incorrectly detects * that ram migration makes no progress. Avoid this by disabling the * throttling logic during the bulk phase of block migration. */ -if (migrate_auto_converge() && !blk_mig_bulk_active()) { +if (blk_mig_bulk_active()) { +return; +} + +if (migrate_auto_converge()) { /* The following detection logic can be refined later. For now: Check to see if the ratio between dirtied bytes and the approx. amount of bytes that just got transferred since the last time -- 2.38.5
[PATCH QEMU v9 9/9] tests: Add migration dirty-limit capability test
From: Hyman Huang(黄勇) Add migration dirty-limit capability test if kernel support dirty ring. Migration dirty-limit capability introduce dirty limit capability, two parameters: x-vcpu-dirty-limit-period and vcpu-dirty-limit are introduced to implement the live migration with dirty limit. The test case does the following things: 1. start src, dst vm and enable dirty-limit capability 2. start migrate and set cancel it to check if dirty limit stop working. 3. restart dst vm 4. start migrate and enable dirty-limit capability 5. check if migration satisfy the convergence condition during pre-switchover phase. Signed-off-by: Hyman Huang(黄勇) --- tests/qtest/migration-test.c | 155 +++ 1 file changed, 155 insertions(+) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index e256da1216..e6f77d176c 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -2743,6 +2743,159 @@ static void test_vcpu_dirty_limit(void) dirtylimit_stop_vm(vm); } +static void migrate_dirty_limit_wait_showup(QTestState *from, +const int64_t period, +const int64_t value) +{ +/* Enable dirty limit capability */ +migrate_set_capability(from, "dirty-limit", true); + +/* Set dirty limit parameters */ +migrate_set_parameter_int(from, "x-vcpu-dirty-limit-period", period); +migrate_set_parameter_int(from, "vcpu-dirty-limit", value); + +/* Make sure migrate can't converge */ +migrate_ensure_non_converge(from); + +/* To check limit rate after precopy */ +migrate_set_capability(from, "pause-before-switchover", true); + +/* Wait for the serial output from the source */ +wait_for_serial("src_serial"); +} + +/* + * This test does: + * source target + * migrate_incoming + * migrate + * migrate_cancel + * restart target + * migrate + * + * And see that if dirty limit works correctly + */ +static void test_migrate_dirty_limit(void) +{ +g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); +QTestState *from, *to; +int64_t remaining; +uint64_t throttle_us_per_full; +/* + * We want the test to be stable and as fast as possible. + * E.g., with 1Gb/s bandwith migration may pass without dirty limit, + * so we need to decrease a bandwidth. + */ +const int64_t dirtylimit_period = 1000, dirtylimit_value = 50; +const int64_t max_bandwidth = 4; /* ~400Mb/s */ +const int64_t downtime_limit = 250; /* 250ms */ +/* + * We migrate through unix-socket (> 500Mb/s). + * Thus, expected migration speed ~= bandwidth limit (< 500Mb/s). 
+ * So, we can predict expected_threshold + */ +const int64_t expected_threshold = max_bandwidth * downtime_limit / 1000; +int max_try_count = 10; +MigrateCommon args = { +.start = { +.hide_stderr = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Start src, dst vm */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Prepare for dirty limit migration and wait src vm show up */ +migrate_dirty_limit_wait_showup(from, dirtylimit_period, dirtylimit_value); + +/* Start migrate */ +migrate_qmp(from, uri, "{}"); + +/* Wait for dirty limit throttle begin */ +throttle_us_per_full = 0; +while (throttle_us_per_full == 0) { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} + +/* Now cancel migrate and wait for dirty limit throttle switch off */ +migrate_cancel(from); +wait_for_migration_status(from, "cancelled", NULL); + +/* Check if dirty limit throttle switched off, set timeout 1ms */ +do { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} while (throttle_us_per_full != 0 && --max_try_count); + +/* Assert dirty limit is not in service */ +g_assert_cmpint(throttle_us_per_full, ==, 0); + +args = (MigrateCommon) { +.start = { +.only_target = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Restart dst vm, src vm already show up so we needn't wait anymore */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Start migrate */ +migrate_qmp(from, uri, "{}&
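Editorial note: the test body is cut off after the second migrate_qmp(). Based on the declared remaining/expected_threshold variables and the commit message's "check if migration satisfy the convergence condition during pre-switchover phase", the tail of the test plausibly looks like the sketch below; helper names such as read_ram_property_int and migrate_continue follow the conventions of migration-test.c and are assumptions here.

```c
    /* Bound bandwidth and downtime so the convergence check is meaningful */
    migrate_set_parameter_int(from, "max-bandwidth", max_bandwidth);
    migrate_set_parameter_int(from, "downtime-limit", downtime_limit);

    /* Wait for dirty-limit throttling to kick in again */
    throttle_us_per_full = 0;
    while (throttle_us_per_full == 0) {
        throttle_us_per_full =
            read_migrate_property_int(from, "dirty-limit-throttle-time-per-round");
        usleep(100);
        g_assert_false(got_src_stop);
    }

    /* Once pre-switchover is reached, the remaining RAM should fit into
     * max_bandwidth * downtime_limit, i.e. expected_threshold */
    wait_for_migration_status(from, "pre-switchover", NULL);
    remaining = read_ram_property_int(from, "remaining");
    g_assert_cmpint(remaining, <,
                    (expected_threshold + expected_threshold / 100));

    migrate_continue(from, "pre-switchover");
    wait_for_serial("dest_serial");
    wait_for_migration_complete(from);
    test_migrate_end(from, to, true);
}
```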
[PATCH QEMU v10 5/9] migration: Refactor auto-converge capability logic
From: Hyman Huang(黄勇) Check if block migration is running before throttling guest down in auto-converge way. Note that this modification is kind of like code clean, because block migration does not depend on auto-converge capability, so the order of checks can be adjusted. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/ram.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/migration/ram.c b/migration/ram.c index 0ada6477e8..f31de47a47 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -995,7 +995,11 @@ static void migration_trigger_throttle(RAMState *rs) /* During block migration the auto-converge logic incorrectly detects * that ram migration makes no progress. Avoid this by disabling the * throttling logic during the bulk phase of block migration. */ -if (migrate_auto_converge() && !blk_mig_bulk_active()) { +if (blk_mig_bulk_active()) { +return; +} + +if (migrate_auto_converge()) { /* The following detection logic can be refined later. For now: Check to see if the ratio between dirtied bytes and the approx. amount of bytes that just got transferred since the last time -- 2.38.5
[PATCH QEMU v10 4/9] migration: Introduce dirty-limit capability
From: Hyman Huang(黄勇) Introduce migration dirty-limit capability, which can be turned on before live migration and limit dirty page rate durty live migration. Introduce migrate_dirty_limit function to help check if dirty-limit capability enabled during live migration. Meanwhile, refactor vcpu_dirty_rate_stat_collect so that period can be configured instead of hardcoded. dirty-limit capability is kind of like auto-converge but using dirty limit instead of traditional cpu-throttle to throttle guest down. To enable this feature, turn on the dirty-limit capability before live migration using migrate-set-capabilities, and set the parameters "x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably to speed up convergence. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/options.c | 24 migration/options.h | 1 + qapi/migration.json | 9 - softmmu/dirtylimit.c | 12 +++- 4 files changed, 44 insertions(+), 2 deletions(-) diff --git a/migration/options.c b/migration/options.c index 7d2d98830e..631c12cf32 100644 --- a/migration/options.c +++ b/migration/options.c @@ -27,6 +27,7 @@ #include "qemu-file.h" #include "ram.h" #include "options.h" +#include "sysemu/kvm.h" /* Maximum migrate downtime set to 2000 seconds */ #define MAX_MIGRATE_DOWNTIME_SECONDS 2000 @@ -196,6 +197,8 @@ Property migration_properties[] = { #endif DEFINE_PROP_MIG_CAP("x-switchover-ack", MIGRATION_CAPABILITY_SWITCHOVER_ACK), +DEFINE_PROP_MIG_CAP("x-dirty-limit", +MIGRATION_CAPABILITY_DIRTY_LIMIT), DEFINE_PROP_END_OF_LIST(), }; @@ -242,6 +245,13 @@ bool migrate_dirty_bitmaps(void) return s->capabilities[MIGRATION_CAPABILITY_DIRTY_BITMAPS]; } +bool migrate_dirty_limit(void) +{ +MigrationState *s = migrate_get_current(); + +return s->capabilities[MIGRATION_CAPABILITY_DIRTY_LIMIT]; +} + bool migrate_events(void) { MigrationState *s = migrate_get_current(); @@ -573,6 +583,20 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) } } +if (new_caps[MIGRATION_CAPABILITY_DIRTY_LIMIT]) { +if (new_caps[MIGRATION_CAPABILITY_AUTO_CONVERGE]) { +error_setg(errp, "dirty-limit conflicts with auto-converge" + " either of then available currently"); +return false; +} + +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +error_setg(errp, "dirty-limit requires KVM with accelerator" + " property 'dirty-ring-size' set"); +return false; +} +} + return true; } diff --git a/migration/options.h b/migration/options.h index 9aaf363322..b5a950d4e4 100644 --- a/migration/options.h +++ b/migration/options.h @@ -24,6 +24,7 @@ extern Property migration_properties[]; /* capabilities */ bool migrate_auto_converge(void); +bool migrate_dirty_limit(void); bool migrate_background_snapshot(void); bool migrate_block(void); bool migrate_colo(void); diff --git a/qapi/migration.json b/qapi/migration.json index 535fc27403..b4d9100ef3 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -497,6 +497,12 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # +# @dirty-limit: If enabled, migration will throttle vCPUs as needed to +# keep their dirty page rate within @vcpu-dirty-limit. This can +# improve responsiveness of large guests during live migration, +# and can result in more stable read performance. Requires KVM +# with accelerator property "dirty-ring-size" set. (Since 8.2) +# # Features: # # @unstable: Members @x-colo and @x-ignore-shared are experimental. 
@@ -512,7 +518,8 @@ 'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate', { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] }, 'validate-uuid', 'background-snapshot', - 'zero-copy-send', 'postcopy-preempt', 'switchover-ack'] } + 'zero-copy-send', 'postcopy-preempt', 'switchover-ack', + 'dirty-limit'] } ## # @MigrationCapabilityStatus: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 5c12d26d49..953ef934bc 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -24,6 +24,9 @@ #include "hw/boards.h" #include "sysemu/kvm.h" #include "trace.h" +#include "migration/misc.h" +#include "migration/migration.h" +#include "migration/options.h" /* * Dirtylimit stop working if dirty p
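For illustration only (not part of the patch), a minimal QMP sequence to use the new capability could look like the following; the 500 ms period and 50 MB/s limit are made-up example values, and the guest must have been started with the KVM dirty ring enabled (e.g. "-accel kvm,dirty-ring-size=4096"), otherwise the capability check added above rejects the request:

    {"execute": "migrate-set-capabilities",
     "arguments": {"capabilities": [
         {"capability": "dirty-limit", "state": true}]}}
    {"execute": "migrate-set-parameters",
     "arguments": {"x-vcpu-dirty-limit-period": 500,
                   "vcpu-dirty-limit": 50}}
    {"execute": "migrate", "arguments": {"uri": "unix:/tmp/migsocket"}}

Enabling "auto-converge" together with "dirty-limit" is likewise rejected by the migrate_caps_check() hunk above, since the two throttling methods are mutually exclusive.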
[PATCH QEMU v10 9/9] tests: Add migration dirty-limit capability test
From: Hyman Huang(黄勇) Add migration dirty-limit capability test if kernel support dirty ring. Migration dirty-limit capability introduce dirty limit capability, two parameters: x-vcpu-dirty-limit-period and vcpu-dirty-limit are introduced to implement the live migration with dirty limit. The test case does the following things: 1. start src, dst vm and enable dirty-limit capability 2. start migrate and set cancel it to check if dirty limit stop working. 3. restart dst vm 4. start migrate and enable dirty-limit capability 5. check if migration satisfy the convergence condition during pre-switchover phase. Signed-off-by: Hyman Huang(黄勇) --- tests/qtest/migration-test.c | 155 +++ 1 file changed, 155 insertions(+) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index e256da1216..e6f77d176c 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -2743,6 +2743,159 @@ static void test_vcpu_dirty_limit(void) dirtylimit_stop_vm(vm); } +static void migrate_dirty_limit_wait_showup(QTestState *from, +const int64_t period, +const int64_t value) +{ +/* Enable dirty limit capability */ +migrate_set_capability(from, "dirty-limit", true); + +/* Set dirty limit parameters */ +migrate_set_parameter_int(from, "x-vcpu-dirty-limit-period", period); +migrate_set_parameter_int(from, "vcpu-dirty-limit", value); + +/* Make sure migrate can't converge */ +migrate_ensure_non_converge(from); + +/* To check limit rate after precopy */ +migrate_set_capability(from, "pause-before-switchover", true); + +/* Wait for the serial output from the source */ +wait_for_serial("src_serial"); +} + +/* + * This test does: + * source target + * migrate_incoming + * migrate + * migrate_cancel + * restart target + * migrate + * + * And see that if dirty limit works correctly + */ +static void test_migrate_dirty_limit(void) +{ +g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); +QTestState *from, *to; +int64_t remaining; +uint64_t throttle_us_per_full; +/* + * We want the test to be stable and as fast as possible. + * E.g., with 1Gb/s bandwith migration may pass without dirty limit, + * so we need to decrease a bandwidth. + */ +const int64_t dirtylimit_period = 1000, dirtylimit_value = 50; +const int64_t max_bandwidth = 4; /* ~400Mb/s */ +const int64_t downtime_limit = 250; /* 250ms */ +/* + * We migrate through unix-socket (> 500Mb/s). + * Thus, expected migration speed ~= bandwidth limit (< 500Mb/s). 
+ * So, we can predict expected_threshold + */ +const int64_t expected_threshold = max_bandwidth * downtime_limit / 1000; +int max_try_count = 10; +MigrateCommon args = { +.start = { +.hide_stderr = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Start src, dst vm */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Prepare for dirty limit migration and wait src vm show up */ +migrate_dirty_limit_wait_showup(from, dirtylimit_period, dirtylimit_value); + +/* Start migrate */ +migrate_qmp(from, uri, "{}"); + +/* Wait for dirty limit throttle begin */ +throttle_us_per_full = 0; +while (throttle_us_per_full == 0) { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} + +/* Now cancel migrate and wait for dirty limit throttle switch off */ +migrate_cancel(from); +wait_for_migration_status(from, "cancelled", NULL); + +/* Check if dirty limit throttle switched off, set timeout 1ms */ +do { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} while (throttle_us_per_full != 0 && --max_try_count); + +/* Assert dirty limit is not in service */ +g_assert_cmpint(throttle_us_per_full, ==, 0); + +args = (MigrateCommon) { +.start = { +.only_target = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Restart dst vm, src vm already show up so we needn't wait anymore */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Start migrate */ +migrate_qmp(from, uri, "{}&
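As a rough usage sketch (assuming an in-tree build directory and a KVM host with dirty ring support; as the commit message notes, the case is only registered when the kernel supports dirty ring), the new case runs like any other migration qtest:

    $ cd build
    $ QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test

The usual glib test flags (e.g. -l to list the registered test paths, -p to run a single one) apply as well.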
[PATCH QEMU v10 8/9] migration: Extend query-migrate to provide dirty-limit info
From: Hyman Huang(黄勇) Extend query-migrate to provide throttle time and estimated ring full time with dirty-limit capability enabled, through which we can observe if dirty limit take effect during live migration. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- include/sysemu/dirtylimit.h| 2 ++ migration/migration-hmp-cmds.c | 10 + migration/migration.c | 10 + qapi/migration.json| 16 +- softmmu/dirtylimit.c | 39 ++ 5 files changed, 76 insertions(+), 1 deletion(-) diff --git a/include/sysemu/dirtylimit.h b/include/sysemu/dirtylimit.h index 8d2c1f3a6b..d11edb 100644 --- a/include/sysemu/dirtylimit.h +++ b/include/sysemu/dirtylimit.h @@ -34,4 +34,6 @@ void dirtylimit_set_vcpu(int cpu_index, void dirtylimit_set_all(uint64_t quota, bool enable); void dirtylimit_vcpu_execute(CPUState *cpu); +uint64_t dirtylimit_throttle_time_per_round(void); +uint64_t dirtylimit_ring_full_time(void); #endif diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 35e8020bbf..c115ef2d23 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -190,6 +190,16 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict) info->cpu_throttle_percentage); } +if (info->has_dirty_limit_throttle_time_per_round) { +monitor_printf(mon, "dirty-limit throttle time: %" PRIu64 " us\n", + info->dirty_limit_throttle_time_per_round); +} + +if (info->has_dirty_limit_ring_full_time) { +monitor_printf(mon, "dirty-limit ring full time: %" PRIu64 " us\n", + info->dirty_limit_ring_full_time); +} + if (info->has_postcopy_blocktime) { monitor_printf(mon, "postcopy blocktime: %u\n", info->postcopy_blocktime); diff --git a/migration/migration.c b/migration/migration.c index 619af62461..3b8587c4ae 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -64,6 +64,7 @@ #include "yank_functions.h" #include "sysemu/qtest.h" #include "options.h" +#include "sysemu/dirtylimit.h" static NotifierList migration_state_notifiers = NOTIFIER_LIST_INITIALIZER(migration_state_notifiers); @@ -974,6 +975,15 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s) info->ram->dirty_pages_rate = stat64_get(&mig_stats.dirty_pages_rate); } + +if (migrate_dirty_limit() && dirtylimit_in_service()) { +info->has_dirty_limit_throttle_time_per_round = true; +info->dirty_limit_throttle_time_per_round = +dirtylimit_throttle_time_per_round(); + +info->has_dirty_limit_ring_full_time = true; +info->dirty_limit_ring_full_time = dirtylimit_ring_full_time(); +} } static void populate_disk_info(MigrationInfo *info) diff --git a/qapi/migration.json b/qapi/migration.json index b4d9100ef3..7bf4b30614 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -250,6 +250,18 @@ # blocked. Present and non-empty when migration is blocked. # (since 6.0) # +# @dirty-limit-throttle-time-per-round: Maximum throttle time +# (in microseconds) of virtual CPUs each dirty ring full round, +# which shows how MigrationCapability dirty-limit affects the +# guest during live migration. (Since 8.2) +# +# @dirty-limit-ring-full-time: Estimated average dirty ring full +# time (in microseconds) each dirty ring full round. The value +# equals dirty ring memory size divided by average dirty page +# rate of the virtual CPU, which can be used to observe the +# average memory load of the virtual CPU indirectly. Note that +# zero means guest doesn't dirty memory. 
(Since 8.2) +# # Since: 0.14 ## { 'struct': 'MigrationInfo', @@ -267,7 +279,9 @@ '*postcopy-blocktime' : 'uint32', '*postcopy-vcpu-blocktime': ['uint32'], '*compression': 'CompressionStats', - '*socket-address': ['SocketAddress'] } } + '*socket-address': ['SocketAddress'], + '*dirty-limit-throttle-time-per-round': 'uint64', + '*dirty-limit-ring-full-time': 'uint64'} } ## # @query-migrate: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 5134296667..a0686323e5 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -565,6 +565,45 @@ out: hmp_handle_error(mon, err); } +/* Return the max throttle time of each virtual CPU */ +uint64_t dirtylimit_throttle_time_per_round(void) +{ +CPUState *cpu; +
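To make the new output concrete (all numbers below are invented for the example), a query-migrate reply during a dirty-limit migration gains the two new members, and "info migrate" prints the corresponding lines added to migration-hmp-cmds.c above:

    {"execute": "query-migrate"}
    {"return": {"status": "active",
                "dirty-limit-throttle-time-per-round": 12000,
                "dirty-limit-ring-full-time": 80000,
                ...}}

    (qemu) info migrate
    dirty-limit throttle time: 12000 us
    dirty-limit ring full time: 80000 us

As a worked illustration of the second value: with a 4096-entry per-vCPU dirty ring (4 KiB pages, so roughly 16 MiB tracked per round) and an average dirty page rate around 200 MB/s, a full round takes about 16/200 s, i.e. on the order of 80000 us.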
[PATCH QEMU v10 1/9] softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit"
From: Hyman Huang(黄勇) The dirty_rate parameter of the hmp command "set_vcpu_dirty_limit" is invalid if less than 0, so add a parameter check for it. Note that this patch also deletes the unsolicited help message and cleans up the code. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Peter Xu Reviewed-by: Juan Quintela --- softmmu/dirtylimit.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 015a9038d1..5c12d26d49 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -515,14 +515,15 @@ void hmp_set_vcpu_dirty_limit(Monitor *mon, const QDict *qdict) int64_t cpu_index = qdict_get_try_int(qdict, "cpu_index", -1); Error *err = NULL; -qmp_set_vcpu_dirty_limit(!!(cpu_index != -1), cpu_index, dirty_rate, &err); -if (err) { -hmp_handle_error(mon, err); -return; +if (dirty_rate < 0) { +error_setg(&err, "invalid dirty page limit %ld", dirty_rate); +goto out; } -monitor_printf(mon, "[Please use 'info vcpu_dirty_limit' to query " - "dirty limit for virtual CPU]\n"); +qmp_set_vcpu_dirty_limit(!!(cpu_index != -1), cpu_index, dirty_rate, &err); + +out: +hmp_handle_error(mon, err); } static struct DirtyLimitInfo *dirtylimit_query_vcpu(int cpu_index) -- 2.38.5
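A short HMP session showing the new check, assuming the usual "set_vcpu_dirty_limit <dirty_rate> [cpu_index]" syntax; the 200 MB/s figure is only an example:

    (qemu) set_vcpu_dirty_limit -1
    Error: invalid dirty page limit -1
    (qemu) set_vcpu_dirty_limit 200
    (qemu) set_vcpu_dirty_limit 200 0

The second command limits every vCPU to 200 MB/s, the third only vCPU 0; with the patch applied a negative rate is rejected up front, and the unconditional hint about 'info vcpu_dirty_limit' is no longer printed on success.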
[PATCH QEMU v10 7/9] migration: Implement dirty-limit convergence algorithm
From: Hyman Huang(黄勇) Implement dirty-limit convergence algorithm for live migration, which is kind of like auto-converge algo but using dirty-limit instead of cpu throttle to make migration convergent. Enable dirty page limit if dirty_rate_high_cnt greater than 2 when dirty-limit capability enabled, Disable dirty-limit if migration be canceled. Note that "set_vcpu_dirty_limit", "cancel_vcpu_dirty_limit" commands are not allowed during dirty-limit live migration. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/migration.c | 3 +++ migration/ram.c| 36 migration/trace-events | 1 + softmmu/dirtylimit.c | 29 + 4 files changed, 69 insertions(+) diff --git a/migration/migration.c b/migration/migration.c index 91bba630a8..619af62461 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -166,6 +166,9 @@ void migration_cancel(const Error *error) if (error) { migrate_set_error(current_migration, error); } +if (migrate_dirty_limit()) { +qmp_cancel_vcpu_dirty_limit(false, -1, NULL); +} migrate_fd_cancel(current_migration); } diff --git a/migration/ram.c b/migration/ram.c index 1d9300f4c5..9040d66e61 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -46,6 +46,7 @@ #include "qapi/error.h" #include "qapi/qapi-types-migration.h" #include "qapi/qapi-events-migration.h" +#include "qapi/qapi-commands-migration.h" #include "qapi/qmp/qerror.h" #include "trace.h" #include "exec/ram_addr.h" @@ -59,6 +60,8 @@ #include "multifd.h" #include "sysemu/runstate.h" #include "options.h" +#include "sysemu/dirtylimit.h" +#include "sysemu/kvm.h" #include "hw/boards.h" /* for machine_dump_guest_core() */ @@ -984,6 +987,37 @@ static void migration_update_rates(RAMState *rs, int64_t end_time) } } +/* + * Enable dirty-limit to throttle down the guest + */ +static void migration_dirty_limit_guest(void) +{ +/* + * dirty page rate quota for all vCPUs fetched from + * migration parameter 'vcpu_dirty_limit' + */ +static int64_t quota_dirtyrate; +MigrationState *s = migrate_get_current(); + +/* + * If dirty limit already enabled and migration parameter + * vcpu-dirty-limit untouched. 
+ */ +if (dirtylimit_in_service() && +quota_dirtyrate == s->parameters.vcpu_dirty_limit) { +return; +} + +quota_dirtyrate = s->parameters.vcpu_dirty_limit; + +/* + * Set all vCPU a quota dirtyrate, note that the second + * parameter will be ignored if setting all vCPU for the vm + */ +qmp_set_vcpu_dirty_limit(false, -1, quota_dirtyrate, NULL); +trace_migration_dirty_limit_guest(quota_dirtyrate); +} + static void migration_trigger_throttle(RAMState *rs) { uint64_t threshold = migrate_throttle_trigger_threshold(); @@ -1013,6 +1047,8 @@ static void migration_trigger_throttle(RAMState *rs) trace_migration_throttle(); mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); +} else if (migrate_dirty_limit()) { +migration_dirty_limit_guest(); } } } diff --git a/migration/trace-events b/migration/trace-events index 5259c1044b..580895e86e 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -93,6 +93,7 @@ migration_bitmap_sync_start(void) "" migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64 migration_bitmap_clear_dirty(char *str, uint64_t start, uint64_t size, unsigned long page) "rb %s start 0x%"PRIx64" size 0x%"PRIx64" page 0x%lx" migration_throttle(void) "" +migration_dirty_limit_guest(int64_t dirtyrate) "guest dirty page rate limit %" PRIi64 " MB/s" ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %" PRIx64 " %zx" ram_load_loop(const char *rbname, uint64_t addr, int flags, void *host) "%s: addr: 0x%" PRIx64 " flags: 0x%x host: %p" ram_load_postcopy_loop(int channel, uint64_t addr, int flags) "chan=%d addr=0x%" PRIx64 " flags=0x%x" diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 953ef934bc..5134296667 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -436,6 +436,23 @@ static void dirtylimit_cleanup(void) dirtylimit_state_finalize(); } +/* + * dirty page rate limit is not allowed to set if migration + * is running with dirty-limit capability enabled. + */ +static bool dirtylimit_is_allowed(void) +{ +MigrationState *ms = migrate_get_current(); + +if (migration_is_running(ms->state) && +(!qemu_thread_is_
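One practical consequence of the dirtylimit_is_allowed() helper above is that manual dirty-limit manipulation is refused while a dirty-limit migration is in flight; a hedged QMP sketch (the error description below is a placeholder, the real text comes from dirtylimit.c):

    {"execute": "cancel-vcpu-dirty-limit"}
    {"error": {"class": "GenericError",
               "desc": "<rejected because dirty-limit migration is running>"}}

When the migration is cancelled, migration_cancel() itself calls qmp_cancel_vcpu_dirty_limit() (see the migration.c hunk above), so the per-vCPU limit does not outlive the cancelled migration.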
[PATCH QEMU v10 3/9] qapi/migration: Introduce vcpu-dirty-limit parameters
From: Hyman Huang(黄勇) Introduce "vcpu-dirty-limit" migration parameter used to limit dirty page rate during live migration. "vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are two dirty-limit-related migration parameters, which can be set before and during live migration by qmp migrate-set-parameters. This two parameters are used to help implement the dirty page rate limit algo of migration. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/migration-hmp-cmds.c | 8 migration/options.c| 21 + qapi/migration.json| 18 +++--- 3 files changed, 44 insertions(+), 3 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 352e9ec716..35e8020bbf 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -368,6 +368,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) monitor_printf(mon, "%s: %" PRIu64 " ms\n", MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD), params->x_vcpu_dirty_limit_period); + +monitor_printf(mon, "%s: %" PRIu64 " MB/s\n", +MigrationParameter_str(MIGRATION_PARAMETER_VCPU_DIRTY_LIMIT), +params->vcpu_dirty_limit); } qapi_free_MigrationParameters(params); @@ -628,6 +632,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) p->has_x_vcpu_dirty_limit_period = true; visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err); break; +case MIGRATION_PARAMETER_VCPU_DIRTY_LIMIT: +p->has_vcpu_dirty_limit = true; +visit_type_size(v, param, &p->vcpu_dirty_limit, &err); +break; default: assert(0); } diff --git a/migration/options.c b/migration/options.c index 1de63ba775..7d2d98830e 100644 --- a/migration/options.c +++ b/migration/options.c @@ -81,6 +81,7 @@ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) #define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 1000/* milliseconds */ +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT1 /* MB/s */ Property migration_properties[] = { DEFINE_PROP_BOOL("store-global-state", MigrationState, @@ -168,6 +169,9 @@ Property migration_properties[] = { DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState, parameters.x_vcpu_dirty_limit_period, DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD), +DEFINE_PROP_UINT64("vcpu-dirty-limit", MigrationState, + parameters.vcpu_dirty_limit, + DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -915,6 +919,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->has_x_vcpu_dirty_limit_period = true; params->x_vcpu_dirty_limit_period = s->parameters.x_vcpu_dirty_limit_period; +params->has_vcpu_dirty_limit = true; +params->vcpu_dirty_limit = s->parameters.vcpu_dirty_limit; return params; } @@ -949,6 +955,7 @@ void migrate_params_init(MigrationParameters *params) params->has_announce_rounds = true; params->has_announce_step = true; params->has_x_vcpu_dirty_limit_period = true; +params->has_vcpu_dirty_limit = true; } /* @@ -1118,6 +1125,14 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) return false; } +if (params->has_vcpu_dirty_limit && +(params->vcpu_dirty_limit < 1)) { +error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "vcpu_dirty_limit", + "is invalid, it must greater then 1 MB/s"); +return false; +} + return true; } @@ -1222,6 +1237,9 @@ static void migrate_params_test_apply(MigrateSetParameters *params, dest->x_vcpu_dirty_limit_period = params->x_vcpu_dirty_limit_period; } +if (params->has_vcpu_dirty_limit) { +dest->vcpu_dirty_limit = 
params->vcpu_dirty_limit; +} } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1345,6 +1363,9 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) s->parameters.x_vcpu_dirty_limit_period = params->x_vcpu_dirty_limit_period; } +if (params->has_vcpu_dirty_limit) { +s->parameters.vcpu_dirty_limit = params->vcpu_dirty_limit; +} } void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp) diff --git a/qapi/migration.json b/qapi/migration.json index 16ba4e78df..535fc27403 100644 --- a/qapi/migration.json
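With the HMP plumbing added above, the new parameter can be set and inspected like any other migration parameter; the 50 MB/s value is purely illustrative:

    (qemu) migrate_set_parameter vcpu-dirty-limit 50
    (qemu) info migrate_parameters
    vcpu-dirty-limit: 50 MB/s

Values below 1 MB/s are rejected by the check added to migrate_params_check(), and the default of 1 MB/s comes from DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT.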
[PATCH QEMU v10 2/9] qapi/migration: Introduce x-vcpu-dirty-limit-period parameter
From: Hyman Huang(黄勇) Introduce "x-vcpu-dirty-limit-period" migration experimental parameter, which is in the range of 1 to 1000ms and used to make dirty page rate calculation period configurable. Currently, as the "x-vcpu-dirty-limit-period" varies, the total time of live migration changes. Test results show the optimal value of "x-vcpu-dirty-limit-period" ranges from 500ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made stable once it proves best value can not be determined with developer's experiments. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/migration-hmp-cmds.c | 8 migration/options.c| 28 +++ qapi/migration.json| 35 +++--- 3 files changed, 64 insertions(+), 7 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 9885d7c9f7..352e9ec716 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -364,6 +364,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) } } } + +monitor_printf(mon, "%s: %" PRIu64 " ms\n", +MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD), +params->x_vcpu_dirty_limit_period); } qapi_free_MigrationParameters(params); @@ -620,6 +624,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) error_setg(&err, "The block-bitmap-mapping parameter can only be set " "through QMP"); break; +case MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD: +p->has_x_vcpu_dirty_limit_period = true; +visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err); +break; default: assert(0); } diff --git a/migration/options.c b/migration/options.c index 5a9505adf7..1de63ba775 100644 --- a/migration/options.c +++ b/migration/options.c @@ -80,6 +80,8 @@ #define DEFINE_PROP_MIG_CAP(name, x) \ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 1000/* milliseconds */ + Property migration_properties[] = { DEFINE_PROP_BOOL("store-global-state", MigrationState, store_global_state, true), @@ -163,6 +165,9 @@ Property migration_properties[] = { DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds), DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname), DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz), +DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState, + parameters.x_vcpu_dirty_limit_period, + DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -908,6 +913,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) s->parameters.block_bitmap_mapping); } +params->has_x_vcpu_dirty_limit_period = true; +params->x_vcpu_dirty_limit_period = s->parameters.x_vcpu_dirty_limit_period; + return params; } @@ -940,6 +948,7 @@ void migrate_params_init(MigrationParameters *params) params->has_announce_max = true; params->has_announce_rounds = true; params->has_announce_step = true; +params->has_x_vcpu_dirty_limit_period = true; } /* @@ -1100,6 +1109,15 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) } #endif +if (params->has_x_vcpu_dirty_limit_period && +(params->x_vcpu_dirty_limit_period < 1 || + params->x_vcpu_dirty_limit_period > 1000)) { +error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "x-vcpu-dirty-limit-period", + "a value between 1 and 1000"); +return false; +} + return true; } @@ -1199,6 +1217,11 @@ static void migrate_params_test_apply(MigrateSetParameters *params, 
dest->has_block_bitmap_mapping = true; dest->block_bitmap_mapping = params->block_bitmap_mapping; } + +if (params->has_x_vcpu_dirty_limit_period) { +dest->x_vcpu_dirty_limit_period = +params->x_vcpu_dirty_limit_period; +} } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1317,6 +1340,11 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) QAPI_CLONE(BitmapMigrationNodeAliasList, params->block_bitmap_mapping); } + +if (params->has_x_vcpu_dirty_limit_period) { +s->
[PATCH QEMU v10 0/9] migration: introduce dirtylimit capability
Hi, Juan, Markus and i has crafted docs for the series, please use the latest version to make a pull request if it is convenient to you. No functional changes since v6. Thanks. Yong v7~v10: Rebase on master, update "Since" tags to 8.2, fix conflicts and craft the docs suggested by Markus v6: 1. Rebase on master 2. Split the commit "Implement dirty-limit convergence algo" into two as Juan suggested as the following: a. Put the detection logic before auto-converge checking b. Implement dirty-limit convergence algo 3. Put the detection logic before auto-converge checking 4. Sort the migrate_dirty_limit function in commit "Introduce dirty-limit capability" suggested by Juan 5. Substitute the the int64_t to uint64_t in the last 2 commits 6. Fix the comments spell mistake 7. Add helper function in the commit "Implement dirty-limit convergence algo" suggested by Juan v5: 1. Rebase on master and enrich the comment for "dirty-limit" capability, suggesting by Markus. 2. Drop commits that have already been merged. v4: 1. Polish the docs and update the release version suggested by Markus 2. Rename the migrate exported info "dirty-limit-throttle-time-per- round" to "dirty-limit-throttle-time-per-full". v3(resend): - fix the syntax error of the topic. v3: This version make some modifications inspired by Peter and Markus as following: 1. Do the code clean up in [PATCH v2 02/11] suggested by Markus 2. Replace the [PATCH v2 03/11] with a much simpler patch posted by Peter to fix the following bug: https://bugzilla.redhat.com/show_bug.cgi?id=2124756 3. Fix the error path of migrate_params_check in [PATCH v2 04/11] pointed out by Markus. Enrich the commit message to explain why x-vcpu-dirty-limit-period an unstable parameter. 4. Refactor the dirty-limit convergence algo in [PATCH v2 07/11] suggested by Peter: a. apply blk_mig_bulk_active check before enable dirty-limit b. drop the unhelpful check function before enable dirty-limit c. change the migration_cancel logic, just cancel dirty-limit only if dirty-limit capability turned on. d. abstract a code clean commit [PATCH v3 07/10] to adjust the check order before enable auto-converge 5. Change the name of observing indexes during dirty-limit live migration to make them more easy-understanding. Use the maximum throttle time of vpus as "dirty-limit-throttle-time-per-full" 6. Fix some grammatical and spelling errors pointed out by Markus and enrich the document about the dirty-limit live migration observing indexes "dirty-limit-ring-full-time" and "dirty-limit-throttle-time-per-full" 7. Change the default value of x-vcpu-dirty-limit-period to 1000ms, which is optimal value pointed out in cover letter in that testing environment. 8. Drop the 2 guestperf test commits [PATCH v2 10/11], [PATCH v2 11/11] and post them with a standalone series in the future. v2: This version make a little bit modifications comparing with version 1 as following: 1. fix the overflow issue reported by Peter Maydell 2. add parameter check for hmp "set_vcpu_dirty_limit" command 3. fix the racing issue between dirty ring reaper thread and Qemu main thread. 4. add migrate parameter check for x-vcpu-dirty-limit-period and vcpu-dirty-limit. 5. add the logic to forbid hmp/qmp commands set_vcpu_dirty_limit, cancel_vcpu_dirty_limit during dirty-limit live migration when implement dirty-limit convergence algo. 6. add capability check to ensure auto-converge and dirty-limit are mutually exclusive. 7. 
pre-check if kvm dirty ring size is configured before setting dirty-limit migrate parameter Hyman Huang(黄勇) (9): softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit" qapi/migration: Introduce x-vcpu-dirty-limit-period parameter qapi/migration: Introduce vcpu-dirty-limit parameters migration: Introduce dirty-limit capability migration: Refactor auto-converge capability logic migration: Put the detection logic before auto-converge checking migration: Implement dirty-limit convergence algorithm migration: Extend query-migrate to provide dirty-limit info tests: Add migration dirty-limit capability test include/sysemu/dirtylimit.h| 2 + migration/migration-hmp-cmds.c | 26 ++ migration/migration.c | 13 +++ migration/options.c| 73 migration/options.h| 1 + migration/ram.c| 61 ++--- migration/trace-events | 1 + qapi/migration.json| 72 +-- softmmu/dirtylimit.c | 91 +-- tests/qtest/migration-test.c | 155 + 10 files changed, 470 insertions(+), 25 deletions(-) -- 2.38.5
[PATCH QEMU v10 6/9] migration: Put the detection logic before auto-converge checking
From: Hyman Huang(黄勇) This commit prepares for the implementation of the dirty-limit convergence algorithm. The detection logic for the throttling condition applies to both the auto-converge and dirty-limit algorithms, so move it before the check for the auto-converge feature. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Juan Quintela --- migration/ram.c | 21 +++-- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index f31de47a47..1d9300f4c5 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -999,17 +999,18 @@ static void migration_trigger_throttle(RAMState *rs) return; } -if (migrate_auto_converge()) { -/* The following detection logic can be refined later. For now: - Check to see if the ratio between dirtied bytes and the approx. - amount of bytes that just got transferred since the last time - we were in this routine reaches the threshold. If that happens - twice, start or increase throttling. */ - -if ((bytes_dirty_period > bytes_dirty_threshold) && -(++rs->dirty_rate_high_cnt >= 2)) { +/* + * The following detection logic can be refined later. For now: + * Check to see if the ratio between dirtied bytes and the approx. + * amount of bytes that just got transferred since the last time + * we were in this routine reaches the threshold. If that happens + * twice, start or increase throttling. + */ +if ((bytes_dirty_period > bytes_dirty_threshold) && +(++rs->dirty_rate_high_cnt >= 2)) { +rs->dirty_rate_high_cnt = 0; +if (migrate_auto_converge()) { trace_migration_throttle(); -rs->dirty_rate_high_cnt = 0; mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); } -- 2.38.5
[PATCH QEMU 0/2] migration: craft the doc comments
Hi, Markus, This patchset aims to reformat migration doc comments as commit a937b6aa739. Meanwhile, add myself to the dirty-limit feature maintainer list. Please review, Thanks. Yong Hyman Huang(黄勇) (2): qapi: Reformat and craft the migration doc comments MAINTAINERS: Add Hyman Huang to dirty-limit feature MAINTAINERS | 6 + qapi/migration.json | 66 + 2 files changed, 37 insertions(+), 35 deletions(-) -- 2.38.5
[PATCH QEMU 1/2] qapi: Reformat and craft the migration doc comments
From: Hyman Huang(黄勇) Reformat migration doc comments to conform to current conventions as commit a937b6aa739 (qapi: Reformat doc comments to conform to current conventions). Also, craft the dirty-limit capability comment. Signed-off-by: Hyman Huang(黄勇) --- qapi/migration.json | 66 + 1 file changed, 31 insertions(+), 35 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index 6b49593d2f..5d5649c885 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -258,17 +258,17 @@ # blocked. Present and non-empty when migration is blocked. # (since 6.0) # -# @dirty-limit-throttle-time-per-round: Maximum throttle time (in microseconds) of virtual -# CPUs each dirty ring full round, which shows how -# MigrationCapability dirty-limit affects the guest -# during live migration. (since 8.1) -# -# @dirty-limit-ring-full-time: Estimated average dirty ring full time (in microseconds) -# each dirty ring full round, note that the value equals -# dirty ring memory size divided by average dirty page rate -# of virtual CPU, which can be used to observe the average -# memory load of virtual CPU indirectly. Note that zero -# means guest doesn't dirty memory (since 8.1) +# @dirty-limit-throttle-time-per-round: Maximum throttle time +# (in microseconds) of virtual CPUs each dirty ring full round, +# which shows how MigrationCapability dirty-limit affects the +# guest during live migration. (Since 8.1) +# +# @dirty-limit-ring-full-time: Estimated average dirty ring full +# time (in microseconds) each dirty ring full round. The value +# equals dirty ring memory size divided by average dirty page +# rate of the virtual CPU, which can be used to observe the +# average memory load of the virtual CPU indirectly. Note that +# zero means guest doesn't dirty memory. (Since 8.1) # # Since: 0.14 ## @@ -519,15 +519,11 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # -# @dirty-limit: If enabled, migration will use the dirty-limit algo to -# throttle down guest instead of auto-converge algo. -# Throttle algo only works when vCPU's dirtyrate greater -# than 'vcpu-dirty-limit', read processes in guest os -# aren't penalized any more, so this algo can improve -# performance of vCPU during live migration. This is an -# optional performance feature and should not affect the -# correctness of the existing auto-converge algo. -# (since 8.1) +# @dirty-limit: If enabled, migration will throttle vCPUs as needed to +# keep their dirty page rate within @vcpu-dirty-limit. This can +# improve responsiveness of large guests during live migration, +# and can result in more stable read performance. Requires KVM +# with accelerator property "dirty-ring-size" set. (Since 8.1) # # Features: # @@ -822,17 +818,17 @@ # Nodes are mapped to their block device name if there is one, and # to their node name otherwise. (Since 5.2) # -# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during -# live migration. Should be in the range 1 to 1000ms, -# defaults to 1000ms. (Since 8.1) +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty +# limit during live migration. Should be in the range 1 to 1000ms, +# defaults to 1000ms. (Since 8.1) # # @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration. -#Defaults to 1. (Since 8.1) +# Defaults to 1. (Since 8.1) # # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period -#are experimental. +# are experimental. 
# # Since: 2.4 ## @@ -988,17 +984,17 @@ # Nodes are mapped to their block device name if there is one, and # to their node name otherwise. (Since 5.2) # -# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during -# live migration. Should be in the range 1 to 1000ms, -# defaults to 1000ms. (Since 8.1) +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty +# limit during live migration. Should be in the range 1 to 1000ms, +# defaults to 1000ms. (Since 8.1) # # @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration. -#Defaults to 1. (Since 8.1) +# Defaults to 1. (Since 8.1) # # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu
[PATCH QEMU 2/2] MAINTAINERS: Add Hyman Huang to dirty-limit feature
From: Hyman Huang(黄勇) Signed-off-by: Hyman Huang(黄勇) --- MAINTAINERS | 6 ++ 1 file changed, 6 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 12e59b6b27..d72fd63a8e 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3437,6 +3437,12 @@ F: hw/core/clock-vmstate.c F: hw/core/qdev-clock.c F: docs/devel/clocks.rst +Dirty-limit feature +M: Hyman Huang +S: Maintained +F: softmmu/dirtylimit.c +F: include/sysemu/dirtylimit.h + Usermode Emulation -- Overall usermode emulation -- 2.38.5
[PATCH QEMU 2/3] qapi: Craft the dirty-limit capability comment
From: Hyman Huang(黄勇) Signed-off-by: Markus Armbruster Signed-off-by: Hyman Huang(黄勇) --- qapi/migration.json | 13 + 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index a74ade4d72..62ab151da2 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -519,14 +519,11 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # -# @dirty-limit: If enabled, migration will use the dirty-limit -# algorithim to throttle down guest instead of auto-converge -# algorithim. Throttle algorithim only works when vCPU's dirtyrate -# greater than 'vcpu-dirty-limit', read processes in guest os -# aren't penalized any more, so this algorithim can improve -# performance of vCPU during live migration. This is an optional -# performance feature and should not affect the correctness of the -# existing auto-converge algorithim. (Since 8.1) +# @dirty-limit: If enabled, migration will throttle vCPUs as needed to +# keep their dirty page rate within @vcpu-dirty-limit. This can +# improve responsiveness of large guests during live migration, +# and can result in more stable read performance. Requires KVM +# with accelerator property "dirty-ring-size" set. (Since 8.1) # # Features: # -- 2.38.5
[PATCH QEMU 3/3] MAINTAINERS: Add Hyman Huang as maintainer
From: Hyman Huang(黄勇) I've developed an interest in the dirty-limit and dirty page rate features and have also been working on projects related to this subsystem. I recommend myself as a maintainer for this subsystem so that I can help improve the dirty-limit algorithm and review the patches about dirty page rate. Signed-off-by: Hyman Huang(黄勇) --- MAINTAINERS | 9 + 1 file changed, 9 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 12e59b6b27..d4b1c91096 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3437,6 +3437,15 @@ F: hw/core/clock-vmstate.c F: hw/core/qdev-clock.c F: docs/devel/clocks.rst +Dirty-limit and dirty page rate feature +M: Hyman Huang +S: Maintained +F: softmmu/dirtylimit.c +F: include/sysemu/dirtylimit.h +F: migration/dirtyrate.c +F: migration/dirtyrate.h +F: include/sysemu/dirtyrate.h + Usermode Emulation -- Overall usermode emulation -- 2.38.5
[PATCH QEMU 0/3] migration: craft the doc comments
Hi, Markus, Juan. Please review the version 2, thanks. v2: - split the first commit in v1 into 2 - add commit message of commit: MAINTAINERS: Add Hyman Huang as maintainer Yong Hyman Huang(黄勇) (3): qapi: Reformat the dirty-limit migration doc comments qapi: Craft the dirty-limit capability comment MAINTAINERS: Add Hyman Huang as maintainer MAINTAINERS | 9 +++ qapi/migration.json | 66 + 2 files changed, 40 insertions(+), 35 deletions(-) -- 2.38.5
[PATCH QEMU 1/3] qapi: Reformat the dirty-limit migration doc comments
From: Hyman Huang(黄勇) Reformat the dirty-limit migration doc comments to conform to current conventions as commit a937b6aa739 (qapi: Reformat doc comments to conform to current conventions). Signed-off-by: Markus Armbruster Signed-off-by: Hyman Huang(黄勇) --- qapi/migration.json | 69 ++--- 1 file changed, 34 insertions(+), 35 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index 6b49593d2f..a74ade4d72 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -258,17 +258,17 @@ # blocked. Present and non-empty when migration is blocked. # (since 6.0) # -# @dirty-limit-throttle-time-per-round: Maximum throttle time (in microseconds) of virtual -# CPUs each dirty ring full round, which shows how -# MigrationCapability dirty-limit affects the guest -# during live migration. (since 8.1) -# -# @dirty-limit-ring-full-time: Estimated average dirty ring full time (in microseconds) -# each dirty ring full round, note that the value equals -# dirty ring memory size divided by average dirty page rate -# of virtual CPU, which can be used to observe the average -# memory load of virtual CPU indirectly. Note that zero -# means guest doesn't dirty memory (since 8.1) +# @dirty-limit-throttle-time-per-round: Maximum throttle time +# (in microseconds) of virtual CPUs each dirty ring full round, +# which shows how MigrationCapability dirty-limit affects the +# guest during live migration. (Since 8.1) +# +# @dirty-limit-ring-full-time: Estimated average dirty ring full +# time (in microseconds) for each dirty ring full round. The +# value equals the dirty ring memory size divided by the average +# dirty page rate of the virtual CPU, which can be used to +# observe the average memory load of the virtual CPU indirectly. +# Note that zero means guest doesn't dirty memory. (Since 8.1) # # Since: 0.14 ## @@ -519,15 +519,14 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # -# @dirty-limit: If enabled, migration will use the dirty-limit algo to -# throttle down guest instead of auto-converge algo. -# Throttle algo only works when vCPU's dirtyrate greater -# than 'vcpu-dirty-limit', read processes in guest os -# aren't penalized any more, so this algo can improve -# performance of vCPU during live migration. This is an -# optional performance feature and should not affect the -# correctness of the existing auto-converge algo. -# (since 8.1) +# @dirty-limit: If enabled, migration will use the dirty-limit +# algorithim to throttle down guest instead of auto-converge +# algorithim. Throttle algorithim only works when vCPU's dirtyrate +# greater than 'vcpu-dirty-limit', read processes in guest os +# aren't penalized any more, so this algorithim can improve +# performance of vCPU during live migration. This is an optional +# performance feature and should not affect the correctness of the +# existing auto-converge algorithim. (Since 8.1) # # Features: # @@ -822,17 +821,17 @@ # Nodes are mapped to their block device name if there is one, and # to their node name otherwise. (Since 5.2) # -# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during -# live migration. Should be in the range 1 to 1000ms, -# defaults to 1000ms. (Since 8.1) +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty +# limit during live migration. Should be in the range 1 to 1000ms. +# Defaults to 1000ms. (Since 8.1) # # @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration. -#Defaults to 1. (Since 8.1) +# Defaults to 1. 
(Since 8.1) # # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period -#are experimental. +# are experimental. # # Since: 2.4 ## @@ -988,17 +987,17 @@ # Nodes are mapped to their block device name if there is one, and # to their node name otherwise. (Since 5.2) # -# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during -# live migration. Should be in the range 1 to 1000ms, -# defaults to 1000ms. (Since 8.1) +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty +# limit during live migration. Should be in the range 1 to 1000ms. +# Defaults to 1000ms. (Since 8.1) # # @vc
[PATCH QEMU 3/3] tests/migration: Introduce dirty-limit into guestperf
From: Hyman Huang(黄勇) Currently, guestperf does not cover the dirty-limit migration, support this feature. Note that dirty-limit requires 'dirty-ring-size' set. To enable dirty-limit, setting x-vcpu-dirty-limit-period as 500ms and x-vcpu-dirty-limit as 10MB/s: $ ./tests/migration/guestperf.py \ --dirty-ring-size 4096 \ --dirty-limit --x-vcpu-dirty-limit-period 500 \ --vcpu-dirty-limit 10 --output output.json \ To run the entire standardized set of dirty-limit-enabled comparisons, with unix migration: $ ./tests/migration/guestperf-batch.py \ --dirty-ring-size 4096 \ --dst-host localhost --transport unix \ --filter compr-dirty-limit* --output outputdir Signed-off-by: Hyman Huang(黄勇) --- tests/migration/guestperf/comparison.py | 23 +++ tests/migration/guestperf/engine.py | 17 + tests/migration/guestperf/progress.py | 16 ++-- tests/migration/guestperf/scenario.py | 11 ++- tests/migration/guestperf/shell.py | 18 +- 5 files changed, 81 insertions(+), 4 deletions(-) diff --git a/tests/migration/guestperf/comparison.py b/tests/migration/guestperf/comparison.py index c03b3f6d7e..42cc0372d1 100644 --- a/tests/migration/guestperf/comparison.py +++ b/tests/migration/guestperf/comparison.py @@ -135,4 +135,27 @@ COMPARISONS = [ Scenario("compr-multifd-channels-64", multifd=True, multifd_channels=64), ]), + +# Looking at effect of dirty-limit with +# varying x_vcpu_dirty_limit_period +Comparison("compr-dirty-limit-period", scenarios = [ +Scenario("compr-dirty-limit-period-500", + dirty_limit=True, x_vcpu_dirty_limit_period=500), +Scenario("compr-dirty-limit-period-800", + dirty_limit=True, x_vcpu_dirty_limit_period=800), +Scenario("compr-dirty-limit-period-1000", + dirty_limit=True, x_vcpu_dirty_limit_period=1000), +]), + + +# Looking at effect of dirty-limit with +# varying vcpu_dirty_limit +Comparison("compr-dirty-limit", scenarios = [ +Scenario("compr-dirty-limit-10MB", + dirty_limit=True, vcpu_dirty_limit=10), +Scenario("compr-dirty-limit-20MB", + dirty_limit=True, vcpu_dirty_limit=20), +Scenario("compr-dirty-limit-50MB", + dirty_limit=True, vcpu_dirty_limit=50), +]), ] diff --git a/tests/migration/guestperf/engine.py b/tests/migration/guestperf/engine.py index 29ebb5011b..93a6f78e46 100644 --- a/tests/migration/guestperf/engine.py +++ b/tests/migration/guestperf/engine.py @@ -102,6 +102,8 @@ class Engine(object): info.get("expected-downtime", 0), info.get("setup-time", 0), info.get("cpu-throttle-percentage", 0), +info.get("dirty-limit-throttle-time-per-round", 0), +info.get("dirty-limit-ring-full-time", 0), ) def _migrate(self, hardware, scenario, src, dst, connect_uri): @@ -203,6 +205,21 @@ class Engine(object): resp = dst.command("migrate-set-parameters", multifd_channels=scenario._multifd_channels) +if scenario._dirty_limit: +if not hardware._dirty_ring_size: +raise Exception("dirty ring size must be configured when " +"testing dirty limit migration") + +resp = src.command("migrate-set-capabilities", + capabilities = [ + { "capability": "dirty-limit", + "state": True } + ]) +resp = src.command("migrate-set-parameters", +x_vcpu_dirty_limit_period=scenario._x_vcpu_dirty_limit_period) +resp = src.command("migrate-set-parameters", + vcpu_dirty_limit=scenario._vcpu_dirty_limit) + resp = src.command("migrate", uri=connect_uri) post_copy = False diff --git a/tests/migration/guestperf/progress.py b/tests/migration/guestperf/progress.py index ab1ee57273..d490584217 100644 --- a/tests/migration/guestperf/progress.py +++ b/tests/migration/guestperf/progress.py @@ -81,7 +81,9 @@ class Progress(object): 
downtime, downtime_expected, setup_time, - throttle_pcent): + throttle_pcent, + dirty_limit_throttle_time_per_round, + dirty_limit_ring_full_time): self._status = status self._ram = ram @@ -91,6 +93,10 @@ class Progress(object): self._downtime_expected =
[PATCH QEMU 0/3] migration: enrich the dirty-limit test case
The dirty-limit feature was introduced in 8.1, and the test cases can be enriched to make sure the behavior and the performance of dirty-limit are exactly what we want. This series adds 2 test cases: the first commit aims at the functional test and the others aim at the performance test. Please review, thanks. Yong. Hyman Huang(黄勇) (3): tests: Add migration dirty-limit capability test tests/migration: Introduce dirty-ring-size option into guestperf tests/migration: Introduce dirty-limit into guestperf tests/migration/guestperf/comparison.py | 23 tests/migration/guestperf/engine.py | 23 +++- tests/migration/guestperf/hardware.py | 8 +- tests/migration/guestperf/progress.py | 16 ++- tests/migration/guestperf/scenario.py | 11 +- tests/migration/guestperf/shell.py | 24 +++- tests/qtest/migration-test.c| 155 7 files changed, 252 insertions(+), 8 deletions(-) -- 2.38.5
[PATCH QEMU 1/3] tests: Add migration dirty-limit capability test
From: Hyman Huang(黄勇) Add migration dirty-limit capability test if kernel support dirty ring. Migration dirty-limit capability introduce dirty limit capability, two parameters: x-vcpu-dirty-limit-period and vcpu-dirty-limit are introduced to implement the live migration with dirty limit. The test case does the following things: 1. start src, dst vm and enable dirty-limit capability 2. start migrate and set cancel it to check if dirty limit stop working. 3. restart dst vm 4. start migrate and enable dirty-limit capability 5. check if migration satisfy the convergence condition during pre-switchover phase. Signed-off-by: Hyman Huang(黄勇) --- tests/qtest/migration-test.c | 155 +++ 1 file changed, 155 insertions(+) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index 62d3f37021..52b1973afb 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -2739,6 +2739,159 @@ static void test_vcpu_dirty_limit(void) dirtylimit_stop_vm(vm); } +static void migrate_dirty_limit_wait_showup(QTestState *from, +const int64_t period, +const int64_t value) +{ +/* Enable dirty limit capability */ +migrate_set_capability(from, "dirty-limit", true); + +/* Set dirty limit parameters */ +migrate_set_parameter_int(from, "x-vcpu-dirty-limit-period", period); +migrate_set_parameter_int(from, "vcpu-dirty-limit", value); + +/* Make sure migrate can't converge */ +migrate_ensure_non_converge(from); + +/* To check limit rate after precopy */ +migrate_set_capability(from, "pause-before-switchover", true); + +/* Wait for the serial output from the source */ +wait_for_serial("src_serial"); +} + +/* + * This test does: + * source target + * migrate_incoming + * migrate + * migrate_cancel + * restart target + * migrate + * + * And see that if dirty limit works correctly + */ +static void test_migrate_dirty_limit(void) +{ +g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); +QTestState *from, *to; +int64_t remaining; +uint64_t throttle_us_per_full; +/* + * We want the test to be stable and as fast as possible. + * E.g., with 1Gb/s bandwith migration may pass without dirty limit, + * so we need to decrease a bandwidth. + */ +const int64_t dirtylimit_period = 1000, dirtylimit_value = 50; +const int64_t max_bandwidth = 4; /* ~400Mb/s */ +const int64_t downtime_limit = 250; /* 250ms */ +/* + * We migrate through unix-socket (> 500Mb/s). + * Thus, expected migration speed ~= bandwidth limit (< 500Mb/s). 
+ * So, we can predict expected_threshold + */ +const int64_t expected_threshold = max_bandwidth * downtime_limit / 1000; +int max_try_count = 10; +MigrateCommon args = { +.start = { +.hide_stderr = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Start src, dst vm */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Prepare for dirty limit migration and wait src vm show up */ +migrate_dirty_limit_wait_showup(from, dirtylimit_period, dirtylimit_value); + +/* Start migrate */ +migrate_qmp(from, uri, "{}"); + +/* Wait for dirty limit throttle begin */ +throttle_us_per_full = 0; +while (throttle_us_per_full == 0) { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} + +/* Now cancel migrate and wait for dirty limit throttle switch off */ +migrate_cancel(from); +wait_for_migration_status(from, "cancelled", NULL); + +/* Check if dirty limit throttle switched off, set timeout 1ms */ +do { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} while (throttle_us_per_full != 0 && --max_try_count); + +/* Assert dirty limit is not in service */ +g_assert_cmpint(throttle_us_per_full, ==, 0); + +args = (MigrateCommon) { +.start = { +.only_target = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Restart dst vm, src vm already show up so we needn't wait anymore */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Start migrate */ +migrate_qmp(from, uri, "{}&
[PATCH QEMU 2/3] tests/migration: Introduce dirty-ring-size option into guestperf
From: Hyman Huang(黄勇) Dirty ring size configuration is not supported by guestperf tool. Introduce dirty-ring-size (ranges in [1024, 65536]) option so developers can play with dirty-ring and dirty-limit feature easier. To set dirty ring size with 4096 during migration test: $ ./tests/migration/guestperf.py --dirty-ring-size 4096 xxx Signed-off-by: Hyman Huang(黄勇) --- tests/migration/guestperf/engine.py | 6 +- tests/migration/guestperf/hardware.py | 8 ++-- tests/migration/guestperf/shell.py| 6 +- 3 files changed, 16 insertions(+), 4 deletions(-) diff --git a/tests/migration/guestperf/engine.py b/tests/migration/guestperf/engine.py index e69d16a62c..29ebb5011b 100644 --- a/tests/migration/guestperf/engine.py +++ b/tests/migration/guestperf/engine.py @@ -325,7 +325,6 @@ class Engine(object): cmdline = "'" + cmdline + "'" argv = [ -"-accel", "kvm", "-cpu", "host", "-kernel", self._kernel, "-initrd", self._initrd, @@ -333,6 +332,11 @@ class Engine(object): "-m", str((hardware._mem * 1024) + 512), "-smp", str(hardware._cpus), ] +if hardware._dirty_ring_size: +argv.extend(["-accel", "kvm,dirty-ring-size=%s" % + hardware._dirty_ring_size]) +else: +argv.extend(["-accel", "kvm"]) argv.extend(self._get_qemu_serial_args()) diff --git a/tests/migration/guestperf/hardware.py b/tests/migration/guestperf/hardware.py index 3145785ffd..f779cc050b 100644 --- a/tests/migration/guestperf/hardware.py +++ b/tests/migration/guestperf/hardware.py @@ -23,7 +23,8 @@ class Hardware(object): src_cpu_bind=None, src_mem_bind=None, dst_cpu_bind=None, dst_mem_bind=None, prealloc_pages = False, - huge_pages=False, locked_pages=False): + huge_pages=False, locked_pages=False, + dirty_ring_size=0): self._cpus = cpus self._mem = mem # GiB self._src_mem_bind = src_mem_bind # List of NUMA nodes @@ -33,6 +34,7 @@ class Hardware(object): self._prealloc_pages = prealloc_pages self._huge_pages = huge_pages self._locked_pages = locked_pages +self._dirty_ring_size = dirty_ring_size def serialize(self): @@ -46,6 +48,7 @@ class Hardware(object): "prealloc_pages": self._prealloc_pages, "huge_pages": self._huge_pages, "locked_pages": self._locked_pages, +"dirty_ring_size": self._dirty_ring_size, } @classmethod @@ -59,4 +62,5 @@ class Hardware(object): data["dst_mem_bind"], data["prealloc_pages"], data["huge_pages"], -data["locked_pages"]) +data["locked_pages"], +data["dirty_ring_size"]) diff --git a/tests/migration/guestperf/shell.py b/tests/migration/guestperf/shell.py index 8a809e3dda..7d6b8cd7cf 100644 --- a/tests/migration/guestperf/shell.py +++ b/tests/migration/guestperf/shell.py @@ -60,6 +60,8 @@ class BaseShell(object): parser.add_argument("--prealloc-pages", dest="prealloc_pages", default=False) parser.add_argument("--huge-pages", dest="huge_pages", default=False) parser.add_argument("--locked-pages", dest="locked_pages", default=False) +parser.add_argument("--dirty-ring-size", dest="dirty_ring_size", +default=0, type=int) self._parser = parser @@ -89,7 +91,9 @@ class BaseShell(object): locked_pages=args.locked_pages, huge_pages=args.huge_pages, -prealloc_pages=args.prealloc_pages) +prealloc_pages=args.prealloc_pages, + +dirty_ring_size=args.dirty_ring_size) class Shell(BaseShell): -- 2.38.5
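With this change, passing --dirty-ring-size makes engine.py build a different accelerator argument, so the guest command line ends up containing something along the lines of (other options elided, value illustrative):

    qemu-system-x86_64 -accel kvm,dirty-ring-size=4096 -cpu host -m ... -smp ...

whereas without the option the plain "-accel kvm" form is kept, matching the previous behaviour.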
[PATCH QEMU v3 2/3] qapi: Craft the dirty-limit capability comment
From: Hyman Huang(黄勇) Signed-off-by: Markus Armbruster Signed-off-by: Hyman Huang(黄勇) --- qapi/migration.json | 13 + 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index a74ade4d72..62ab151da2 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -519,14 +519,11 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # -# @dirty-limit: If enabled, migration will use the dirty-limit -# algorithim to throttle down guest instead of auto-converge -# algorithim. Throttle algorithim only works when vCPU's dirtyrate -# greater than 'vcpu-dirty-limit', read processes in guest os -# aren't penalized any more, so this algorithim can improve -# performance of vCPU during live migration. This is an optional -# performance feature and should not affect the correctness of the -# existing auto-converge algorithim. (Since 8.1) +# @dirty-limit: If enabled, migration will throttle vCPUs as needed to +# keep their dirty page rate within @vcpu-dirty-limit. This can +# improve responsiveness of large guests during live migration, +# and can result in more stable read performance. Requires KVM +# with accelerator property "dirty-ring-size" set. (Since 8.1) # # Features: # -- 2.38.5
[PATCH QEMU v3 0/3] migration: craft the doc comments
Hi, please review the version 3 of the series, thanks. V3: - craft the commit message of "Add section for migration dirty limit and dirty page rate", and put the section after section "Migration", suggested by Markus. V2: - split the first commit in v1 into 2 - add commit message of commit: MAINTAINERS: Add Hyman Huang as maintainer Yong Hyman Huang(黄勇) (3): qapi: Reformat the dirty-limit migration doc comments qapi: Craft the dirty-limit capability comment MAINTAINERS: Add section "Migration dirty limit and dirty page rate" MAINTAINERS | 9 +++ qapi/migration.json | 66 + 2 files changed, 40 insertions(+), 35 deletions(-) -- 2.38.5
[PATCH QEMU v3 3/3] MAINTAINERS: Add section "Migration dirty limit and dirty page rate"
From: Hyman Huang(黄勇) I've built interests in dirty limit and dirty page rate features and also have been working on projects related to this subsystem. Add a section to the MAINTAINERS file for migration dirty limit and dirty page rate. Add myself as a maintainer for this subsystem so that I can help to improve the dirty limit algorithm and review the patches about dirty page rate. Signed-off-by: Hyman Huang(黄勇) Signed-off-by: Markus Armbruster Acked-by: Peter Xu --- MAINTAINERS | 9 + 1 file changed, 9 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 12e59b6b27..6111b6b4d9 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3209,6 +3209,15 @@ F: qapi/migration.json F: tests/migration/ F: util/userfaultfd.c +Migration dirty limit and dirty page rate +M: Hyman Huang +S: Maintained +F: softmmu/dirtylimit.c +F: include/sysemu/dirtylimit.h +F: migration/dirtyrate.c +F: migration/dirtyrate.h +F: include/sysemu/dirtyrate.h + D-Bus M: Marc-André Lureau S: Maintained -- 2.38.5
[PATCH QEMU v3 1/3] qapi: Reformat the dirty-limit migration doc comments
From: Hyman Huang(黄勇) Reformat the dirty-limit migration doc comments to conform to current conventions as commit a937b6aa739 (qapi: Reformat doc comments to conform to current conventions). Signed-off-by: Markus Armbruster Signed-off-by: Hyman Huang(黄勇) --- qapi/migration.json | 69 ++--- 1 file changed, 34 insertions(+), 35 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index 6b49593d2f..a74ade4d72 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -258,17 +258,17 @@ # blocked. Present and non-empty when migration is blocked. # (since 6.0) # -# @dirty-limit-throttle-time-per-round: Maximum throttle time (in microseconds) of virtual -# CPUs each dirty ring full round, which shows how -# MigrationCapability dirty-limit affects the guest -# during live migration. (since 8.1) -# -# @dirty-limit-ring-full-time: Estimated average dirty ring full time (in microseconds) -# each dirty ring full round, note that the value equals -# dirty ring memory size divided by average dirty page rate -# of virtual CPU, which can be used to observe the average -# memory load of virtual CPU indirectly. Note that zero -# means guest doesn't dirty memory (since 8.1) +# @dirty-limit-throttle-time-per-round: Maximum throttle time +# (in microseconds) of virtual CPUs each dirty ring full round, +# which shows how MigrationCapability dirty-limit affects the +# guest during live migration. (Since 8.1) +# +# @dirty-limit-ring-full-time: Estimated average dirty ring full +# time (in microseconds) for each dirty ring full round. The +# value equals the dirty ring memory size divided by the average +# dirty page rate of the virtual CPU, which can be used to +# observe the average memory load of the virtual CPU indirectly. +# Note that zero means guest doesn't dirty memory. (Since 8.1) # # Since: 0.14 ## @@ -519,15 +519,14 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # -# @dirty-limit: If enabled, migration will use the dirty-limit algo to -# throttle down guest instead of auto-converge algo. -# Throttle algo only works when vCPU's dirtyrate greater -# than 'vcpu-dirty-limit', read processes in guest os -# aren't penalized any more, so this algo can improve -# performance of vCPU during live migration. This is an -# optional performance feature and should not affect the -# correctness of the existing auto-converge algo. -# (since 8.1) +# @dirty-limit: If enabled, migration will use the dirty-limit +# algorithim to throttle down guest instead of auto-converge +# algorithim. Throttle algorithim only works when vCPU's dirtyrate +# greater than 'vcpu-dirty-limit', read processes in guest os +# aren't penalized any more, so this algorithim can improve +# performance of vCPU during live migration. This is an optional +# performance feature and should not affect the correctness of the +# existing auto-converge algorithim. (Since 8.1) # # Features: # @@ -822,17 +821,17 @@ # Nodes are mapped to their block device name if there is one, and # to their node name otherwise. (Since 5.2) # -# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during -# live migration. Should be in the range 1 to 1000ms, -# defaults to 1000ms. (Since 8.1) +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty +# limit during live migration. Should be in the range 1 to 1000ms. +# Defaults to 1000ms. (Since 8.1) # # @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration. -#Defaults to 1. (Since 8.1) +# Defaults to 1. 
(Since 8.1) # # Features: # # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period -#are experimental. +# are experimental. # # Since: 2.4 ## @@ -988,17 +987,17 @@ # Nodes are mapped to their block device name if there is one, and # to their node name otherwise. (Since 5.2) # -# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit during -# live migration. Should be in the range 1 to 1000ms, -# defaults to 1000ms. (Since 8.1) +# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty +# limit during live migration. Should be in the range 1 to 1000ms. +# Defaults to 1000ms. (Since 8.1) # # @vc
[PATCH QEMU v2 1/3] tests: Add migration dirty-limit capability test
From: Hyman Huang(黄勇) Add migration dirty-limit capability test if kernel support dirty ring. Migration dirty-limit capability introduce dirty limit capability, two parameters: x-vcpu-dirty-limit-period and vcpu-dirty-limit are introduced to implement the live migration with dirty limit. The test case does the following things: 1. start src, dst vm and enable dirty-limit capability 2. start migrate and set cancel it to check if dirty limit stop working. 3. restart dst vm 4. start migrate and enable dirty-limit capability 5. check if migration satisfy the convergence condition during pre-switchover phase. Note that this test case involves many passes, so it runs in slow mode only. Signed-off-by: Hyman Huang(黄勇) --- tests/qtest/migration-test.c | 164 +++ 1 file changed, 164 insertions(+) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index 62d3f37021..0be2d17c42 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -2739,6 +2739,166 @@ static void test_vcpu_dirty_limit(void) dirtylimit_stop_vm(vm); } +static void migrate_dirty_limit_wait_showup(QTestState *from, +const int64_t period, +const int64_t value) +{ +/* Enable dirty limit capability */ +migrate_set_capability(from, "dirty-limit", true); + +/* Set dirty limit parameters */ +migrate_set_parameter_int(from, "x-vcpu-dirty-limit-period", period); +migrate_set_parameter_int(from, "vcpu-dirty-limit", value); + +/* Make sure migrate can't converge */ +migrate_ensure_non_converge(from); + +/* To check limit rate after precopy */ +migrate_set_capability(from, "pause-before-switchover", true); + +/* Wait for the serial output from the source */ +wait_for_serial("src_serial"); +} + +/* + * This test does: + * source destination + * start vm + * start incoming vm + * migrate + * wait dirty limit to begin + * cancel migrate + * cancellation check + * restart incoming vm + * migrate + * wait dirty limit to begin + * wait pre-switchover event + * convergence condition check + * + * And see if dirty limit migration works correctly. + * This test case involves many passes, so it runs in slow mode only. + */ +static void test_migrate_dirty_limit(void) +{ +g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); +QTestState *from, *to; +int64_t remaining; +uint64_t throttle_us_per_full; +/* + * We want the test to be stable and as fast as possible. + * E.g., with 1Gb/s bandwith migration may pass without dirty limit, + * so we need to decrease a bandwidth. + */ +const int64_t dirtylimit_period = 1000, dirtylimit_value = 50; +const int64_t max_bandwidth = 4; /* ~400Mb/s */ +const int64_t downtime_limit = 250; /* 250ms */ +/* + * We migrate through unix-socket (> 500Mb/s). + * Thus, expected migration speed ~= bandwidth limit (< 500Mb/s). 
+ * So, we can predict expected_threshold + */ +const int64_t expected_threshold = max_bandwidth * downtime_limit / 1000; +int max_try_count = 10; +MigrateCommon args = { +.start = { +.hide_stderr = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Start src, dst vm */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Prepare for dirty limit migration and wait src vm show up */ +migrate_dirty_limit_wait_showup(from, dirtylimit_period, dirtylimit_value); + +/* Start migrate */ +migrate_qmp(from, uri, "{}"); + +/* Wait for dirty limit throttle begin */ +throttle_us_per_full = 0; +while (throttle_us_per_full == 0) { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} + +/* Now cancel migrate and wait for dirty limit throttle switch off */ +migrate_cancel(from); +wait_for_migration_status(from, "cancelled", NULL); + +/* Check if dirty limit throttle switched off, set timeout 1ms */ +do { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} while (throttle_us_per_full != 0 && --max_try_count); + +/* Assert dirty limit is not in service */ +g_assert_cmpint(throttle_us_per_full, ==, 0); + +args = (MigrateCommon) { +.start = { +.only_target = true, +.use_dirty_ring = true, +}, +
[PATCH QEMU v2 0/3] migration: enrich the dirty-limit test case
The dirty-limit migration test involves many passes and takes about 1 minute on average, so it is placed in the slow mode of migration-test. Inspired by Peter. V2: - put the dirty-limit migration test in slow mode and enrich the test case comment The dirty-limit feature was introduced in 8.1, and the test cases can be enriched to make sure the behavior and the performance of dirty-limit are exactly what we want. This series adds two test cases: the first commit aims at the functional test and the others aim at the performance test. Please review, thanks. Yong. Hyman Huang(黄勇) (3): tests: Add migration dirty-limit capability test tests/migration: Introduce dirty-ring-size option into guestperf tests/migration: Introduce dirty-limit into guestperf tests/migration/guestperf/comparison.py | 23 tests/migration/guestperf/engine.py | 23 +++- tests/migration/guestperf/hardware.py | 8 +- tests/migration/guestperf/progress.py | 16 ++- tests/migration/guestperf/scenario.py | 11 +- tests/migration/guestperf/shell.py | 24 +++- tests/qtest/migration-test.c| 164 7 files changed, 261 insertions(+), 8 deletions(-) -- 2.38.5
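For reference, the slow-mode run described above can be exercised roughly as follows once the tree is built; the binary paths are assumptions that depend on the build directory layout:

$ cd build
$ QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test -m slow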
[PATCH QEMU v2 2/3] tests/migration: Introduce dirty-ring-size option into guestperf
From: Hyman Huang(黄勇) Dirty ring size configuration is not supported by guestperf tool. Introduce dirty-ring-size (ranges in [1024, 65536]) option so developers can play with dirty-ring and dirty-limit feature easier. To set dirty ring size with 4096 during migration test: $ ./tests/migration/guestperf.py --dirty-ring-size 4096 xxx Signed-off-by: Hyman Huang(黄勇) --- tests/migration/guestperf/engine.py | 6 +- tests/migration/guestperf/hardware.py | 8 ++-- tests/migration/guestperf/shell.py| 6 +- 3 files changed, 16 insertions(+), 4 deletions(-) diff --git a/tests/migration/guestperf/engine.py b/tests/migration/guestperf/engine.py index e69d16a62c..29ebb5011b 100644 --- a/tests/migration/guestperf/engine.py +++ b/tests/migration/guestperf/engine.py @@ -325,7 +325,6 @@ class Engine(object): cmdline = "'" + cmdline + "'" argv = [ -"-accel", "kvm", "-cpu", "host", "-kernel", self._kernel, "-initrd", self._initrd, @@ -333,6 +332,11 @@ class Engine(object): "-m", str((hardware._mem * 1024) + 512), "-smp", str(hardware._cpus), ] +if hardware._dirty_ring_size: +argv.extend(["-accel", "kvm,dirty-ring-size=%s" % + hardware._dirty_ring_size]) +else: +argv.extend(["-accel", "kvm"]) argv.extend(self._get_qemu_serial_args()) diff --git a/tests/migration/guestperf/hardware.py b/tests/migration/guestperf/hardware.py index 3145785ffd..f779cc050b 100644 --- a/tests/migration/guestperf/hardware.py +++ b/tests/migration/guestperf/hardware.py @@ -23,7 +23,8 @@ class Hardware(object): src_cpu_bind=None, src_mem_bind=None, dst_cpu_bind=None, dst_mem_bind=None, prealloc_pages = False, - huge_pages=False, locked_pages=False): + huge_pages=False, locked_pages=False, + dirty_ring_size=0): self._cpus = cpus self._mem = mem # GiB self._src_mem_bind = src_mem_bind # List of NUMA nodes @@ -33,6 +34,7 @@ class Hardware(object): self._prealloc_pages = prealloc_pages self._huge_pages = huge_pages self._locked_pages = locked_pages +self._dirty_ring_size = dirty_ring_size def serialize(self): @@ -46,6 +48,7 @@ class Hardware(object): "prealloc_pages": self._prealloc_pages, "huge_pages": self._huge_pages, "locked_pages": self._locked_pages, +"dirty_ring_size": self._dirty_ring_size, } @classmethod @@ -59,4 +62,5 @@ class Hardware(object): data["dst_mem_bind"], data["prealloc_pages"], data["huge_pages"], -data["locked_pages"]) +data["locked_pages"], +data["dirty_ring_size"]) diff --git a/tests/migration/guestperf/shell.py b/tests/migration/guestperf/shell.py index 8a809e3dda..7d6b8cd7cf 100644 --- a/tests/migration/guestperf/shell.py +++ b/tests/migration/guestperf/shell.py @@ -60,6 +60,8 @@ class BaseShell(object): parser.add_argument("--prealloc-pages", dest="prealloc_pages", default=False) parser.add_argument("--huge-pages", dest="huge_pages", default=False) parser.add_argument("--locked-pages", dest="locked_pages", default=False) +parser.add_argument("--dirty-ring-size", dest="dirty_ring_size", +default=0, type=int) self._parser = parser @@ -89,7 +91,9 @@ class BaseShell(object): locked_pages=args.locked_pages, huge_pages=args.huge_pages, -prealloc_pages=args.prealloc_pages) +prealloc_pages=args.prealloc_pages, + +dirty_ring_size=args.dirty_ring_size) class Shell(BaseShell): -- 2.38.5
[PATCH QEMU v2 3/3] tests/migration: Introduce dirty-limit into guestperf
From: Hyman Huang(黄勇) Currently, guestperf does not cover the dirty-limit migration, support this feature. Note that dirty-limit requires 'dirty-ring-size' set. To enable dirty-limit, setting x-vcpu-dirty-limit-period as 500ms and x-vcpu-dirty-limit as 10MB/s: $ ./tests/migration/guestperf.py \ --dirty-ring-size 4096 \ --dirty-limit --x-vcpu-dirty-limit-period 500 \ --vcpu-dirty-limit 10 --output output.json \ To run the entire standardized set of dirty-limit-enabled comparisons, with unix migration: $ ./tests/migration/guestperf-batch.py \ --dirty-ring-size 4096 \ --dst-host localhost --transport unix \ --filter compr-dirty-limit* --output outputdir Signed-off-by: Hyman Huang(黄勇) --- tests/migration/guestperf/comparison.py | 23 +++ tests/migration/guestperf/engine.py | 17 + tests/migration/guestperf/progress.py | 16 ++-- tests/migration/guestperf/scenario.py | 11 ++- tests/migration/guestperf/shell.py | 18 +- 5 files changed, 81 insertions(+), 4 deletions(-) diff --git a/tests/migration/guestperf/comparison.py b/tests/migration/guestperf/comparison.py index c03b3f6d7e..42cc0372d1 100644 --- a/tests/migration/guestperf/comparison.py +++ b/tests/migration/guestperf/comparison.py @@ -135,4 +135,27 @@ COMPARISONS = [ Scenario("compr-multifd-channels-64", multifd=True, multifd_channels=64), ]), + +# Looking at effect of dirty-limit with +# varying x_vcpu_dirty_limit_period +Comparison("compr-dirty-limit-period", scenarios = [ +Scenario("compr-dirty-limit-period-500", + dirty_limit=True, x_vcpu_dirty_limit_period=500), +Scenario("compr-dirty-limit-period-800", + dirty_limit=True, x_vcpu_dirty_limit_period=800), +Scenario("compr-dirty-limit-period-1000", + dirty_limit=True, x_vcpu_dirty_limit_period=1000), +]), + + +# Looking at effect of dirty-limit with +# varying vcpu_dirty_limit +Comparison("compr-dirty-limit", scenarios = [ +Scenario("compr-dirty-limit-10MB", + dirty_limit=True, vcpu_dirty_limit=10), +Scenario("compr-dirty-limit-20MB", + dirty_limit=True, vcpu_dirty_limit=20), +Scenario("compr-dirty-limit-50MB", + dirty_limit=True, vcpu_dirty_limit=50), +]), ] diff --git a/tests/migration/guestperf/engine.py b/tests/migration/guestperf/engine.py index 29ebb5011b..93a6f78e46 100644 --- a/tests/migration/guestperf/engine.py +++ b/tests/migration/guestperf/engine.py @@ -102,6 +102,8 @@ class Engine(object): info.get("expected-downtime", 0), info.get("setup-time", 0), info.get("cpu-throttle-percentage", 0), +info.get("dirty-limit-throttle-time-per-round", 0), +info.get("dirty-limit-ring-full-time", 0), ) def _migrate(self, hardware, scenario, src, dst, connect_uri): @@ -203,6 +205,21 @@ class Engine(object): resp = dst.command("migrate-set-parameters", multifd_channels=scenario._multifd_channels) +if scenario._dirty_limit: +if not hardware._dirty_ring_size: +raise Exception("dirty ring size must be configured when " +"testing dirty limit migration") + +resp = src.command("migrate-set-capabilities", + capabilities = [ + { "capability": "dirty-limit", + "state": True } + ]) +resp = src.command("migrate-set-parameters", +x_vcpu_dirty_limit_period=scenario._x_vcpu_dirty_limit_period) +resp = src.command("migrate-set-parameters", + vcpu_dirty_limit=scenario._vcpu_dirty_limit) + resp = src.command("migrate", uri=connect_uri) post_copy = False diff --git a/tests/migration/guestperf/progress.py b/tests/migration/guestperf/progress.py index ab1ee57273..d490584217 100644 --- a/tests/migration/guestperf/progress.py +++ b/tests/migration/guestperf/progress.py @@ -81,7 +81,9 @@ class Progress(object): 
downtime, downtime_expected, setup_time, - throttle_pcent): + throttle_pcent, + dirty_limit_throttle_time_per_round, + dirty_limit_ring_full_time): self._status = status self._ram = ram @@ -91,6 +93,10 @@ class Progress(object): self._downtime_expected =
[PATCH QEMU 3/3] vhost-user-blk-pci: introduce auto-num-queues property
From: Hyman Huang(黄勇) Commit "a4eef0711b vhost-user-blk-pci: default num_queues to -smp N" implment sizing the number of vhost-user-blk-pci request virtqueues to match the number of vCPUs automatically. Which improves IO preformance remarkably. To enable this feature for the existing VMs, the cloud platform may migrate VMs from the source hypervisor (num_queues is set to 1 by default) to the destination hypervisor (num_queues is set to -smp N) lively. The different num-queues for vhost-user-blk-pci devices between the source side and the destination side will result in migration failure due to loading vmstate incorrectly on the destination side. To provide a smooth upgrade solution, introduce the auto-num-queues property for the vhost-user-blk-pci device. This allows upper APPs, e.g., libvirt, to recognize the hypervisor's capability of allocating the virtqueues automatically by probing the vhost-user-blk-pci.auto-num-queues property. Basing on which, upper APPs can ensure to allocate the same num-queues on the destination side in case of migration failure. Signed-off-by: Hyman Huang(黄勇) --- hw/block/vhost-user-blk.c | 1 + hw/virtio/vhost-user-blk-pci.c | 9 - include/hw/virtio/vhost-user-blk.h | 5 + 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c index eecf3f7a81..34e23b1727 100644 --- a/hw/block/vhost-user-blk.c +++ b/hw/block/vhost-user-blk.c @@ -566,6 +566,7 @@ static const VMStateDescription vmstate_vhost_user_blk = { static Property vhost_user_blk_properties[] = { DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev), +DEFINE_PROP_BOOL("auto-num-queues", VHostUserBlk, auto_num_queues, true), DEFINE_PROP_UINT16("num-queues", VHostUserBlk, num_queues, VHOST_USER_BLK_AUTO_NUM_QUEUES), DEFINE_PROP_UINT32("queue-size", VHostUserBlk, queue_size, 128), diff --git a/hw/virtio/vhost-user-blk-pci.c b/hw/virtio/vhost-user-blk-pci.c index eef8641a98..f7776e928a 100644 --- a/hw/virtio/vhost-user-blk-pci.c +++ b/hw/virtio/vhost-user-blk-pci.c @@ -56,7 +56,14 @@ static void vhost_user_blk_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp) DeviceState *vdev = DEVICE(&dev->vdev); if (dev->vdev.num_queues == VHOST_USER_BLK_AUTO_NUM_QUEUES) { -dev->vdev.num_queues = virtio_pci_optimal_num_queues(0); +/* + * Allocate virtqueues automatically only if auto_num_queues + * property set true. + */ +if (dev->vdev.auto_num_queues) +dev->vdev.num_queues = virtio_pci_optimal_num_queues(0); +else +dev->vdev.num_queues = 1; } if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) { diff --git a/include/hw/virtio/vhost-user-blk.h b/include/hw/virtio/vhost-user-blk.h index ea085ee1ed..e6f0515bc6 100644 --- a/include/hw/virtio/vhost-user-blk.h +++ b/include/hw/virtio/vhost-user-blk.h @@ -50,6 +50,11 @@ struct VHostUserBlk { bool connected; /* vhost_user_blk_start/vhost_user_blk_stop */ bool started_vu; +/* + * Set to true if virtqueues allow to be allocated to + * match the number of virtual CPUs automatically. + */ +bool auto_num_queues; }; #endif -- 2.38.5
[PATCH QEMU 1/3] virtio-scsi-pci: introduce auto-num-queues property
From: Hyman Huang(黄勇) Commit "6a55882284 virtio-scsi-pci: default num_queues to -smp N" implment sizing the number of virtio-scsi-pci request virtqueues to match the number of vCPUs automatically. Which improves IO preformance remarkably. To enable this feature for the existing VMs, the cloud platform may migrate VMs from the source hypervisor (num_queues is set to 1 by default) to the destination hypervisor (num_queues is set to -smp N) lively. The different num-queues for virtio-scsi-pci devices between the source side and the destination side will result in migration failure due to loading vmstate incorrectly on the destination side. To provide a smooth upgrade solution, introduce the auto-num-queues property for the virtio-scsi-pci device. This allows upper APPs, e.g., libvirt, to recognize the hypervisor's capability of allocating the virtqueues automatically by probing the virtio-scsi-pci.auto-num-queues property. Basing on which, upper APPs can ensure to allocate the same num-queues on the destination side in case of migration failure. Signed-off-by: Hyman Huang(黄勇) --- hw/scsi/vhost-scsi.c| 2 ++ hw/scsi/vhost-user-scsi.c | 2 ++ hw/scsi/virtio-scsi.c | 2 ++ hw/virtio/vhost-scsi-pci.c | 11 +-- hw/virtio/vhost-user-scsi-pci.c | 11 +-- hw/virtio/virtio-scsi-pci.c | 11 +-- include/hw/virtio/virtio-scsi.h | 5 + 7 files changed, 38 insertions(+), 6 deletions(-) diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c index 443f67daa4..78a8929c49 100644 --- a/hw/scsi/vhost-scsi.c +++ b/hw/scsi/vhost-scsi.c @@ -284,6 +284,8 @@ static Property vhost_scsi_properties[] = { DEFINE_PROP_STRING("vhostfd", VirtIOSCSICommon, conf.vhostfd), DEFINE_PROP_STRING("wwpn", VirtIOSCSICommon, conf.wwpn), DEFINE_PROP_UINT32("boot_tpgt", VirtIOSCSICommon, conf.boot_tpgt, 0), +DEFINE_PROP_BOOL("auto_num_queues", VirtIOSCSICommon, auto_num_queues, + true), DEFINE_PROP_UINT32("num_queues", VirtIOSCSICommon, conf.num_queues, VIRTIO_SCSI_AUTO_NUM_QUEUES), DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSICommon, conf.virtqueue_size, diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c index ee99b19e7a..1b837f370a 100644 --- a/hw/scsi/vhost-user-scsi.c +++ b/hw/scsi/vhost-user-scsi.c @@ -161,6 +161,8 @@ static void vhost_user_scsi_unrealize(DeviceState *dev) static Property vhost_user_scsi_properties[] = { DEFINE_PROP_CHR("chardev", VirtIOSCSICommon, conf.chardev), DEFINE_PROP_UINT32("boot_tpgt", VirtIOSCSICommon, conf.boot_tpgt, 0), +DEFINE_PROP_BOOL("auto_num_queues", VirtIOSCSICommon, auto_num_queues, + true), DEFINE_PROP_UINT32("num_queues", VirtIOSCSICommon, conf.num_queues, VIRTIO_SCSI_AUTO_NUM_QUEUES), DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSICommon, conf.virtqueue_size, diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c index 45b95ea070..2ec13032aa 100644 --- a/hw/scsi/virtio-scsi.c +++ b/hw/scsi/virtio-scsi.c @@ -1279,6 +1279,8 @@ static void virtio_scsi_device_unrealize(DeviceState *dev) } static Property virtio_scsi_properties[] = { +DEFINE_PROP_BOOL("auto_num_queues", VirtIOSCSI, parent_obj.auto_num_queues, + true), DEFINE_PROP_UINT32("num_queues", VirtIOSCSI, parent_obj.conf.num_queues, VIRTIO_SCSI_AUTO_NUM_QUEUES), DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSI, diff --git a/hw/virtio/vhost-scsi-pci.c b/hw/virtio/vhost-scsi-pci.c index 08980bc23b..927c155278 100644 --- a/hw/virtio/vhost-scsi-pci.c +++ b/hw/virtio/vhost-scsi-pci.c @@ -51,8 +51,15 @@ static void vhost_scsi_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp) VirtIOSCSIConf *conf = 
&dev->vdev.parent_obj.parent_obj.conf; if (conf->num_queues == VIRTIO_SCSI_AUTO_NUM_QUEUES) { -conf->num_queues = -virtio_pci_optimal_num_queues(VIRTIO_SCSI_VQ_NUM_FIXED); +/* + * Allocate virtqueues automatically only if auto_num_queues + * property set true. + */ +if (dev->vdev.parent_obj.parent_obj.auto_num_queues) +conf->num_queues = +virtio_pci_optimal_num_queues(VIRTIO_SCSI_VQ_NUM_FIXED); +else +conf->num_queues = 1; } if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) { diff --git a/hw/virtio/vhost-user-scsi-pci.c b/hw/virtio/vhost-user-scsi-pci.c index 75882e3cf9..9c521a7f93 100644 --- a/hw/virtio/vhost-user-scsi-pci.c +++ b/hw/virtio/vhost-user-scsi-pci.c @@ -57,8 +57,15 @@ static void vhost_user_scsi_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp) VirtIOSCSIConf *conf = &dev->vdev.parent_obj.parent_obj.conf; if (conf-
[PATCH QEMU 0/3] provide a smooth upgrade solution for multi-queues disk
A 1:1 virtqueue:vCPU mapping implementation for virtio-*-pci disk introduced since qemu >= 5.2.0, which improves IO performance remarkably. To enjoy this feature for exiting running VMs without service interruption, the common solution is to migrate VMs from the lower version of the hypervisor to the upgraded hypervisor, then wait for the next cold reboot of the VM to enable this feature. That's the way "discard" and "write-zeroes" features work. As to multi-queues disk allocation automatically, it's a little different because the destination will allocate queues to match the number of vCPUs automatically by default in the case of live migration, and the VMs on the source side remain 1 queue by default, which results in migration failure due to loading disk VMState incorrectly on the destination side. This issue requires Qemu to provide a hint that shows multi-queues disk allocation is automatically supported, and this allows upper APPs, e.g., libvirt, to recognize the hypervisor's capability of this. And upper APPs can ensure to allocate the same num-queues on the destination side in case of migration failure. To fix the issue, we introduce the auto-num-queues property for virtio-*-pci as a solution, which would be probed by APPs, e.g., libvirt by querying the device properties of QEMU. When launching live migration, libvirt will send the auto-num-queues property as a migration cookie to the destination, and thus the destination knows if the source side supports auto-num-queues. If not, the destination would switch off by building the command line with "auto-num-queues=off" when preparing the incoming VM process. The following patches of libvirt show how it roughly works: https://github.com/newfriday/libvirt/commit/ce2bae2e1a6821afeb80756dc01f3680f525e506 https://github.com/newfriday/libvirt/commit/f546972b009458c88148fe079544db7e9e1f43c3 https://github.com/newfriday/libvirt/commit/5ee19c8646fdb4d87ab8b93f287c20925268ce83 The smooth upgrade solution requires the introduction of the auto-num- queues property on the QEMU side, which is what the patch set does. I'm hoping for comments about the series. Please review, thanks. Yong Hyman Huang(黄勇) (3): virtio-scsi-pci: introduce auto-num-queues property virtio-blk-pci: introduce auto-num-queues property vhost-user-blk-pci: introduce auto-num-queues property hw/block/vhost-user-blk.c | 1 + hw/block/virtio-blk.c | 1 + hw/scsi/vhost-scsi.c | 2 ++ hw/scsi/vhost-user-scsi.c | 2 ++ hw/scsi/virtio-scsi.c | 2 ++ hw/virtio/vhost-scsi-pci.c | 11 +-- hw/virtio/vhost-user-blk-pci.c | 9 - hw/virtio/vhost-user-scsi-pci.c| 11 +-- hw/virtio/virtio-blk-pci.c | 9 - hw/virtio/virtio-scsi-pci.c| 11 +-- include/hw/virtio/vhost-user-blk.h | 5 + include/hw/virtio/virtio-blk.h | 5 + include/hw/virtio/virtio-scsi.h| 5 + 13 files changed, 66 insertions(+), 8 deletions(-) -- 2.38.5
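To make the probing step concrete, a management application could query the device properties over QMP before deciding how to build the destination command line. A minimal sketch (the reply is abbreviated, and the exact flow is an assumption about how libvirt might use it):

-> { "execute": "device-list-properties",
     "arguments": { "typename": "virtio-blk-pci" } }
<- { "return": [ ..., { "name": "auto-num-queues", "type": "bool" }, ... ] }

If the property is missing on the source hypervisor, the destination would be launched with auto-num-queues=off so the device keeps a single request queue.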
[PATCH QEMU 2/3] virtio-blk-pci: introduce auto-num-queues property
From: Hyman Huang(黄勇) Commit "9445e1e15 virtio-blk-pci: default num_queues to -smp N" implment sizing the number of virtio-blk-pci request virtqueues to match the number of vCPUs automatically. Which improves IO preformance remarkably. To enable this feature for the existing VMs, the cloud platform may migrate VMs from the source hypervisor (num_queues is set to 1 by default) to the destination hypervisor (num_queues is set to -smp N) lively. The different num-queues for virtio-blk-pci devices between the source side and the destination side will result in migration failure due to loading vmstate incorrectly on the destination side. To provide a smooth upgrade solution, introduce the auto-num-queues property for the virtio-blk-pci device. This allows upper APPs, e.g., libvirt, to recognize the hypervisor's capability of allocating the virtqueues automatically by probing the virtio-blk-pci.auto-num-queues property. Basing on which, upper APPs can ensure to allocate the same num-queues on the destination side in case of migration failure. Signed-off-by: Hyman Huang(黄勇) --- hw/block/virtio-blk.c | 1 + hw/virtio/virtio-blk-pci.c | 9 - include/hw/virtio/virtio-blk.h | 5 + 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index 39e7f23fab..9e498ca64a 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -1716,6 +1716,7 @@ static Property virtio_blk_properties[] = { #endif DEFINE_PROP_BIT("request-merging", VirtIOBlock, conf.request_merging, 0, true), +DEFINE_PROP_BOOL("auto-num-queues", VirtIOBlock, auto_num_queues, true), DEFINE_PROP_UINT16("num-queues", VirtIOBlock, conf.num_queues, VIRTIO_BLK_AUTO_NUM_QUEUES), DEFINE_PROP_UINT16("queue-size", VirtIOBlock, conf.queue_size, 256), diff --git a/hw/virtio/virtio-blk-pci.c b/hw/virtio/virtio-blk-pci.c index 9743bee965..4b6b4c4933 100644 --- a/hw/virtio/virtio-blk-pci.c +++ b/hw/virtio/virtio-blk-pci.c @@ -54,7 +54,14 @@ static void virtio_blk_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp) VirtIOBlkConf *conf = &dev->vdev.conf; if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) { -conf->num_queues = virtio_pci_optimal_num_queues(0); +/* + * Allocate virtqueues automatically only if auto_num_queues + * property set true. + */ +if (dev->vdev.auto_num_queues) +conf->num_queues = virtio_pci_optimal_num_queues(0); +else +conf->num_queues = 1; } if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) { diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h index dafec432ce..dab6d7c70c 100644 --- a/include/hw/virtio/virtio-blk.h +++ b/include/hw/virtio/virtio-blk.h @@ -65,6 +65,11 @@ struct VirtIOBlock { uint64_t host_features; size_t config_size; BlockRAMRegistrar blk_ram_registrar; +/* + * Set to true if virtqueues allow to be allocated to + * match the number of virtual CPUs automatically. + */ +bool auto_num_queues; }; typedef struct VirtIOBlockReq { -- 2.38.5
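As an illustration of the property introduced above, a destination that must stay compatible with a pre-automatic-sizing source could be started along these lines (the drive id and the rest of the command line are placeholders):

qemu-system-x86_64 ... -device virtio-blk-pci,drive=drive0,auto-num-queues=off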
Re: [PATCH v3 3/3] cpus-common: implement dirty limit on vCPU
在 2021/11/22 19:26, Markus Armbruster 写道: Hyman Huang writes: 在 2021/11/22 17:10, Markus Armbruster 写道: Hyman Huang writes: =E5=9C=A8 2021/11/22 15:35, Markus Armbruster =E5=86=99=E9=81=93: huang...@chinatelecom.cn writes: From: Hyman Huang(=E9=BB=84=E5=8B=87) implement dirtyrate calculation periodically basing on dirty-ring and throttle vCPU until it reachs the quota dirtyrate given by user. introduce qmp commands set-dirty-limit/cancel-dirty-limit to set/cancel dirty limit on vCPU. Please start sentences with a capital letter. Ok,i'll check the syntax problem next version. Signed-off-by: Hyman Huang(黄勇) [...] diff --git a/qapi/misc.json b/qapi/misc.json index 358548a..98e6001 100644 --- a/qapi/misc.json +++ b/qapi/misc.json @@ -527,3 +527,42 @@ 'data': { '*option': 'str' }, 'returns': ['CommandLineOptionInfo'], 'allow-preconfig': true } + +## +# @set-dirty-limit: +# +# This command could be used to cap the vCPU memory load, which is also +# refered as dirtyrate. One should use "calc-dirty-rate" with "dirty-ring" +# and to calculate vCPU dirtyrate and query it with "query-dirty-rate". +# Once getting the vCPU current dirtyrate, "set-dirty-limit" can be used +# to set the upper limit of dirtyrate for the interested vCPU. "dirtyrate" is not a word. Let's spell it "dirty page rate", for consistency with the documentation in migration.json. Ok, sounds good. Regarding "One should use ...": sounds like you have to run calc-dirty-rate with argument @mode set to @dirty-ring before this command. Correct? What happens when you don't? set-dirty-limit fails? You didn't answer this question. set-dirty-limit doesn't do any pre-check about if calc-dirty-rate has executed, so it doesn't fail. Peeking at qmp_set_dirty_limit()... it fails when !kvm_dirty_ring_enabled(). kvm_dirty_ring_enabled() returns true when kvm_state->kvm_dirty_ring_size is non-zero. How can it become non-zero? If we enable dirty-ring with qemu commandline "-accel kvm,dirty-ring-size=xxx",qemu will parse the dirty-ring-size and set it. So we check if dirty-ring is enabled by the kvm_dirty_ring_size. Since only executing calc-dirty-rate with dirty-ring mode can we get the vCPU dirty page rate currently(while the dirty-bitmap only get the vm dirty page rate), "One should use ..." maybe misleading, what i actually want to say is "One should use the dirty-ring mode to calculate the vCPU dirty page rate". I'm still confused on what exactly users must do for the page dirty rate limiting to work as intended, and at least as importantly, what happens when they get it wrong. User can set-dirty-limit unconditionally and the dirtylimit will work. "One should use ..." just emphasize if users want to know which vCPU is in high memory load and want to limit it's dirty page rate, they can use calc-dirty-rate but it is not prerequisite for set-dirty-limit. Umm, I think "One should use ..." explanation make things complicated. I'll reconsider the comment next version. [...]
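To make the prerequisite discussed above concrete, the intended flow would roughly be: start QEMU with the KVM dirty ring enabled, measure the per-vCPU dirty page rate, then pick a limit. A sketch, where the ring size and calc-time values are only examples:

$ qemu-system-x86_64 -accel kvm,dirty-ring-size=4096 ...

-> { "execute": "calc-dirty-rate",
     "arguments": { "calc-time": 1, "mode": "dirty-ring" } }
-> { "execute": "query-dirty-rate" }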
Re: [PATCH v6 3/3] cpus-common: implement dirty page limit on vCPU
在 2021/11/26 15:03, Markus Armbruster 写道: huang...@chinatelecom.cn writes: From: Hyman Huang(黄勇) Implement dirtyrate calculation periodically basing on dirty-ring and throttle vCPU until it reachs the quota dirty page rate given by user. Introduce qmp commands set-dirty-limit/cancel-dirty-limit to set/cancel dirty page limit on vCPU. Signed-off-by: Hyman Huang(黄勇) --- cpus-common.c | 41 + include/hw/core/cpu.h | 9 + qapi/migration.json | 47 +++ softmmu/vl.c | 1 + 4 files changed, 98 insertions(+) diff --git a/cpus-common.c b/cpus-common.c index 6e73d3e..3c156b3 100644 --- a/cpus-common.c +++ b/cpus-common.c @@ -23,6 +23,11 @@ #include "hw/core/cpu.h" #include "sysemu/cpus.h" #include "qemu/lockable.h" +#include "sysemu/dirtylimit.h" +#include "sysemu/cpu-throttle.h" +#include "sysemu/kvm.h" +#include "qapi/error.h" +#include "qapi/qapi-commands-migration.h" static QemuMutex qemu_cpu_list_lock; static QemuCond exclusive_cond; @@ -352,3 +357,39 @@ void process_queued_cpu_work(CPUState *cpu) qemu_mutex_unlock(&cpu->work_mutex); qemu_cond_broadcast(&qemu_work_cond); } + +void qmp_set_dirty_limit(int64_t idx, + uint64_t dirtyrate, + Error **errp) +{ +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +error_setg(errp, "setting a dirty page limit requires support from dirty ring"); Can we phrase the message in a way that gives the user a chance to guess what he needs to do to avoid it? > Perhaps: "setting a dirty page limit requires KVM with accelerator property 'dirty-ring-size' set". Sound good, this make things more clear. +return; +} + +dirtylimit_calc(); +dirtylimit_vcpu(idx, dirtyrate); +} + +void qmp_cancel_dirty_limit(int64_t idx, +Error **errp) +{ Three cases: Case 1: enable is impossible, so nothing to do. Case 2: enable is possible and we actually enabled. Case 3: enable is possible, but we didn't. Nothing to do. +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +error_setg(errp, "no need to cancel a dirty page limit as dirty ring not enabled"); +return; This is case 1. We error out. +} + +if (unlikely(!dirtylimit_cancel_vcpu(idx))) { I don't think unlikely() matters here. +dirtylimit_calc_quit(); +} In case 2, dirtylimit_calc_quit() returns zero if this was the last limit, else non-zero. If the former, we request the thread to stop.I am wildly guessing you misunderstood the function dirtylimit_cancel_vcpu, see below. In case 3, dirtylimit_calc_quit() returns zero, and we do nothing. In this case, we cancel the "dirtylimit thread" in function dirtylimit_cancel_vcpu actually, if it was the last limit thread of the whole vm, dirtylimit_cancel_vcpu return zero and we request the dirtyrate calculation thread to stop, so we call the function dirtylimit_calc_quit , which stop the "dirtyrate calculation thread" internally. Why is case 1 and error, but case 3 isn't? Both could silently do nothing, like case 3 does now. Both could error out, like case 1 does now. A possible common error message: "there is no dirty page limit to cancel". I'd be okay with consistently doing nothing, and with consistently erroring out. 
+} + +void dirtylimit_setup(int max_cpus) +{ +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +return; +} + +dirtylimit_calc_state_init(max_cpus); +dirtylimit_state_init(max_cpus); +} diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h index e948e81..11df012 100644 --- a/include/hw/core/cpu.h +++ b/include/hw/core/cpu.h @@ -881,6 +881,15 @@ void end_exclusive(void); */ void qemu_init_vcpu(CPUState *cpu); +/** + * dirtylimit_setup: + * + * Initializes the global state of dirtylimit calculation and + * dirtylimit itself. This is prepared for vCPU dirtylimit which + * could be triggered during vm lifecycle. + */ +void dirtylimit_setup(int max_cpus); + #define SSTEP_ENABLE 0x1 /* Enable simulated HW single stepping */ #define SSTEP_NOIRQ 0x2 /* Do not use IRQ while single stepping */ #define SSTEP_NOTIMER 0x4 /* Do not Timers while single stepping */ diff --git a/qapi/migration.json b/qapi/migration.json index bbfd48c..2b0fe19 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -1850,6 +1850,53 @@ { 'command': 'query-dirty-rate', 'returns': 'DirtyRateInfo' } ## +# @set-dirty-limit: +# +# Set the upper limit of dirty page rate for a vCPU. +# +# This command could be used to cap the vCPU memory load, which is also "Could be used" suggests there
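For reference, a sketch of how the two commands in this version of the series would be driven over QMP; the argument names are inferred from the C prototypes quoted above and the values are arbitrary:

-> { "execute": "set-dirty-limit", "arguments": { "idx": 0, "dirtyrate": 200 } }
<- { "return": {} }
-> { "execute": "cancel-dirty-limit", "arguments": { "idx": 0 } }
<- { "return": {} }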
Re: [PATCH v16 0/7] support dirty restraint on vCPU
在 2022/3/3 0:53, Dr. David Alan Gilbert 写道: * Dr. David Alan Gilbert (dgilb...@redhat.com) wrote: * huang...@chinatelecom.cn (huang...@chinatelecom.cn) wrote: From: Hyman Huang(黄勇) Queued via my migration/hmp/etc tree Hi, Unfortunately I've had to unqueue this - it breaks the qmp-cmd-test: # starting QEMU: exec ./x86_64-softmmu/qemu-system-x86_64 -qtest unix:/tmp/qtest-142136.sock -qtest-log /dev/fd/2 -chardev socket,path=/tmp/qtest-142136.qmp,id=char0 -mon chardev=char0,mode=control -display none -nodefaults -machine none -accel qtest [I 1646239093.713627] OPENED [R +0.000190] endianness [S +0.000196] OK little {"QMP": {"version": {"qemu": {"micro": 50, "minor": 2, "major": 6}, "package": "v6.2.0-1867-g817703d65a"}, "capabilities": ["oob"]}}{"execute": "qmp_capabilities"} {"return": {}}{"execute": "query-vcpu-dirty-limit"} {"error": {"class": "GenericError", "desc": "dirty page limit not enabled"}}** ERROR:../tests/qtest/qmp-cmd-test.c:84:test_query: assertion failed: (qdict_haskey(resp, "return")) Bail out! ERROR:../tests/qtest/qmp-cmd-test.c:84:test_query: assertion failed: (qdict_haskey(resp, "return")) [I +0.195433] CLOSED Aborted (core dumped) qmp-cmd-test tries to run every query command; so either you need to: a) Add it to the list of skipped command in qmp-cmd-test query-vcpu-dirty-limit sucess only if dirty ring feature enabled. So i prefer to add this command to the list of kipped command. I'll fix it next version and run the qtests before i post the patchset. Thinks Yong b) Make it not actually error when the limit isn't enabled. Dave v16 - rebase on master - drop the unused typedef syntax in [PATCH v15 6/7] - add the Reviewed-by and Acked-by tags by the way v15 - rebase on master - drop the 'init_time_ms' parameter in function vcpu_calculate_dirtyrate - drop the 'setup' field in dirtylimit_state and call dirtylimit_process directly, which makes code cleaner. - code clean in dirtylimit_adjust_throttle - fix miss dirtylimit_state_unlock() in dirtylimit_process and dirtylimit_query_all - add some comment Please review. Thanks, Regards Yong v14 - v13 sent by accident, resend patchset. v13 - rebase on master - passing NULL to kvm_dirty_ring_reap in commit "refactor per-vcpu dirty ring reaping" to keep the logic unchanged. In other word, we still try the best to reap as much PFNs as possible if dirtylimit not in service. - move the cpu list gen id changes into a separate patch. - release the lock before sleep during dirty page rate calculation. - move the dirty ring size fetch logic into a separate patch. - drop the DIRTYLIMIT_LINEAR_ADJUSTMENT_WATERMARK MACRO . - substitute bh with function pointer when implement dirtylimit. - merge the dirtylimit_start/stop into dirtylimit_change. - fix "cpu-index" parameter type with "int" to keep consistency. - fix some syntax error in documents. Please review. Thanks, Yong v12 - rebase on master - add a new commmit to refactor per-vcpu dirty ring reaping, which can resolve the "vcpu miss the chances to sleep" problem - remove the dirtylimit_thread and implemtment throttle in bottom half instead. - let the dirty ring reaper thread keep sleeping when dirtylimit is in service - introduce cpu_list_generation_id to identify cpu_list changing. 
- keep taking the cpu_list_lock during dirty_stat_wait to prevent vcpu plug/unplug when calculating the dirty page rate - move the dirtylimit global initializations out of dirtylimit_set_vcpu and do some code clean - add DIRTYLIMIT_LINEAR_ADJUSTMENT_WATERMARK in case of oscillation when throttling - remove the unmatched count field in dirtylimit_state - add stub to fix build on non-x86 - refactor the documents Thanks Peter and Markus for reviewing the previous versions, please review. Thanks, Yong v11 - rebase on master - add a commit " refactor dirty page rate calculation" so that dirty page rate limit can reuse the calculation logic. - handle the cpu hotplug/unplug case in the dirty page rate calculation logic. - modify the qmp commands according to Markus's advice. - introduce a standalone file dirtylimit.c to implement dirty page rate limit - check if dirty limit in service by dirtylimit_state pointer instead of global variable - introduce dirtylimit_mutex to protect dirtylimit_state - do some code clean and docs See the commit for more detail, thanks Markus and Peter very mush for the code review and give the experienced and insightful advices, most modifications are based on these advices. v10: - rebase on master - make the following modifications on patch [1/3]: 1. Make "dirtylimit-calc" thread joinable and join it af
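Option (a) mentioned above would roughly amount to a hunk like the following in tests/qtest/qmp-cmd-test.c; the surrounding context lines are assumptions about the current shape of its ignore list:

--- a/tests/qtest/qmp-cmd-test.c
+++ b/tests/qtest/qmp-cmd-test.c
@@ static bool query_is_ignored(const char *cmd)
     const char *ignored[] = {
+        /* Success depends on the KVM dirty ring being enabled: */
+        "query-vcpu-dirty-limit",
         NULL
     };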
Re: [PATCH V13 0/7] support dirty restraint on vCPU
"Sent by accident, please ignore, I'll send v14 when ready." 在 2022/2/11 0:06, huang...@chinatelecom.cn 写道: From: Hyman Huang(黄勇) v13 - rebase on master - passing NULL to kvm_dirty_ring_reap in commit "refactor per-vcpu dirty ring reaping" to keep the logic unchanged. In other word, we still try the best to reap as much PFNs as possible if dirtylimit not in service. - move the cpu list gen id changes into a separate patch. - release the lock before sleep during dirty page rate calculation. - move the dirty ring size fetch logic into a separate patch. - drop the DIRTYLIMIT_LINEAR_ADJUSTMENT_WATERMARK MACRO . - substitute bh with function pointer when implement dirtylimit. - merge the dirtylimit_start/stop into dirtylimit_change. - fix "cpu-index" parameter type with "int" to keep consistency. - fix some syntax error in documents. Please review. Thanks, Yong v12 - rebase on master - add a new commmit to refactor per-vcpu dirty ring reaping, which can resolve the "vcpu miss the chances to sleep" problem - remove the dirtylimit_thread and implemtment throttle in bottom half instead. - let the dirty ring reaper thread keep sleeping when dirtylimit is in service - introduce cpu_list_generation_id to identify cpu_list changing. - keep taking the cpu_list_lock during dirty_stat_wait to prevent vcpu plug/unplug when calculating the dirty page rate - move the dirtylimit global initializations out of dirtylimit_set_vcpu and do some code clean - add DIRTYLIMIT_LINEAR_ADJUSTMENT_WATERMARK in case of oscillation when throttling - remove the unmatched count field in dirtylimit_state - add stub to fix build on non-x86 - refactor the documents Thanks Peter and Markus for reviewing the previous versions, please review. Thanks, Yong v11 - rebase on master - add a commit " refactor dirty page rate calculation" so that dirty page rate limit can reuse the calculation logic. - handle the cpu hotplug/unplug case in the dirty page rate calculation logic. - modify the qmp commands according to Markus's advice. - introduce a standalone file dirtylimit.c to implement dirty page rate limit - check if dirty limit in service by dirtylimit_state pointer instead of global variable - introduce dirtylimit_mutex to protect dirtylimit_state - do some code clean and docs See the commit for more detail, thanks Markus and Peter very mush for the code review and give the experienced and insightful advices, most modifications are based on these advices. v10: - rebase on master - make the following modifications on patch [1/3]: 1. Make "dirtylimit-calc" thread joinable and join it after quitting. 2. Add finalize function to free dirtylimit_calc_state 3. Do some code clean work - make the following modifications on patch [2/3]: 1. Remove the original implementation of throttle according to Peter's advice. 2. Introduce a negative feedback system and implement the throttle on all vcpu in one thread named "dirtylimit". 3. Simplify the algo when calculation the throttle_us_per_full: increase/decrease linearly when there exists a wide difference between quota and current dirty page rate, increase/decrease a fixed time slice when the difference is narrow. This makes throttle responds faster and reach the quota smoothly. 4. Introduce a unfit_cnt in algo to make sure throttle really takes effect. 5. Set the max sleep time 99 times more than "ring_full_time_us". 6. Make "dirtylimit" thread joinable and join it after quitting. - make the following modifications on patch [3/3]: 1. Remove the unplug cpu handling logic. 2. 
"query-vcpu-dirty-limit" only return dirtylimit information of vcpus that enable dirtylimit
Re: [PATCH 2/8] qapi/migration: Introduce vcpu-dirty-limit parameters
On 2022/8/18 6:07, Peter Xu wrote: On Sat, Jul 23, 2022 at 03:49:14PM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Introduce the "vcpu-dirty-limit" migration parameter, used to limit the dirty page rate during live migration. "vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are two dirty-limit-related migration parameters, which can be set before and during live migration via qmp migrate-set-parameters. These two parameters are used to help implement the dirty page rate limit algorithm of migration. Signed-off-by: Hyman Huang(黄勇) --- migration/migration.c | 14 ++ monitor/hmp-cmds.c| 8 qapi/migration.json | 18 +++--- 3 files changed, 37 insertions(+), 3 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 7b19f85..ed1a47b 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -117,6 +117,7 @@ #define DEFAULT_MIGRATE_ANNOUNCE_STEP 100 #define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 500 /* ms */ +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT 1 /* MB/s */ This default value also looks a bit weird.. why 1MB/s? Thanks, Indeed, it does seem a bit odd. The reason for setting the default dirty limit to 1MB/s is that we want the dirty limit to keep working until the vCPU dirty page rate drops to 1MB/s once the dirty-limit capability is enabled during migration. That way, migration has the best chance of converging before the vCPU dirty page rate drops to 1MB/s. If we set the default dirty limit to something greater than 1MB/s, the probability of a successful migration may be reduced, and the default behavior of migration is to try its best to succeed.
Re: [PATCH 1/8] qapi/migration: Introduce x-vcpu-dirty-limit-period parameter
On 2022/8/18 6:06, Peter Xu wrote: On Sat, Jul 23, 2022 at 03:49:13PM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Introduce the experimental migration parameter "x-vcpu-dirty-limit-period", which is used to make the dirty page rate calculation period configurable. Signed-off-by: Hyman Huang(黄勇) --- migration/migration.c | 16 monitor/hmp-cmds.c| 8 qapi/migration.json | 31 --- 3 files changed, 48 insertions(+), 7 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index e03f698..7b19f85 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -116,6 +116,8 @@ #define DEFAULT_MIGRATE_ANNOUNCE_ROUNDS 5 #define DEFAULT_MIGRATE_ANNOUNCE_STEP 100 +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 500 /* ms */ Why 500 but not DIRTYLIMIT_CALC_TIME_MS? This is actually an empirical value: the iteration time of migration is normally less than 1000ms. In my tests it varies from 200ms to 500ms. If we assume the iteration time is 500ms and the calculation period is 1000ms, two iterations pass for every dirty page rate that gets calculated. We want the calculation period to be as close to the iteration time as possible, so that each iteration gets one newly calculated dirty page rate to compare against, hopefully making the dirty limit work more precisely. But as the "x-" prefix implies, I'm a little unsure whether this approach works. Is it intended to make this parameter experimental, but the other one not? Since I'm not very sure whether vcpu-dirty-limit-period has an impact on migration (as described above), it is made experimental. As for vcpu-dirty-limit, it does have an impact on migration in theory, so it is not made experimental. From another point of view, though, both parameters are being introduced for the first time and neither has seen a lot of testing, so it would also be reasonable to make both experimental; I don't insist either way. Yong Thanks,
Re: [PATCH 4/8] migration: Implement dirty-limit convergence algo
在 2022/8/18 6:09, Peter Xu 写道: On Sat, Jul 23, 2022 at 03:49:16PM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Implement dirty-limit convergence algo for live migration, which is kind of like auto-converge algo but using dirty-limit instead of cpu throttle to make migration convergent. Signed-off-by: Hyman Huang(黄勇) --- migration/ram.c| 53 +- migration/trace-events | 1 + 2 files changed, 41 insertions(+), 13 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index b94669b..2a5cd23 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -45,6 +45,7 @@ #include "qapi/error.h" #include "qapi/qapi-types-migration.h" #include "qapi/qapi-events-migration.h" +#include "qapi/qapi-commands-migration.h" #include "qapi/qmp/qerror.h" #include "trace.h" #include "exec/ram_addr.h" @@ -57,6 +58,8 @@ #include "qemu/iov.h" #include "multifd.h" #include "sysemu/runstate.h" +#include "sysemu/dirtylimit.h" +#include "sysemu/kvm.h" #include "hw/boards.h" /* for machine_dump_guest_core() */ @@ -1139,6 +1142,21 @@ static void migration_update_rates(RAMState *rs, int64_t end_time) } } +/* + * Enable dirty-limit to throttle down the guest + */ +static void migration_dirty_limit_guest(void) +{ +if (!dirtylimit_in_service()) { +MigrationState *s = migrate_get_current(); +int64_t quota_dirtyrate = s->parameters.vcpu_dirty_limit; + +/* Set quota dirtyrate if dirty limit not in service */ +qmp_set_vcpu_dirty_limit(false, -1, quota_dirtyrate, NULL); +trace_migration_dirty_limit_guest(quota_dirtyrate); +} +} What if migration is cancelled? Do we have logic to stop the dirty limit, or should we? Yes, we should have logic to stop dirty limit, i'll add that. Thanks for your suggestion. :) Yong
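A minimal sketch of that follow-up, reusing the helpers visible in the quoted patch; the function name and the exact call sites (migration completion, failure and cancel paths) are assumptions:

/*
 * Disable the per-vCPU dirty limit once migration finishes or is
 * cancelled, so the guest is not left throttled afterwards.
 */
static void migration_dirty_limit_stop(void)
{
    if (dirtylimit_in_service()) {
        qmp_cancel_vcpu_dirty_limit(false, -1, NULL);
    }
}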
Re: [PATCH RFC 1/4] net: Introduce qmp cmd "query-netdev"
在 2022/11/2 13:42, Jason Wang 写道: On Tue, Nov 1, 2022 at 12:19 AM wrote: From: Hyman Huang(黄勇) For netdev device that can offload virtio-net dataplane to slave, such as vhost-net, vhost-user and vhost-vdpa, exporting it's capability information and acked features would be more friendly for developers. These infomation can be analyzed and compare to slave capability provided by, eg dpdk or other slaves directly, helping to draw conclusions about if vm network interface works normally, if it vm can be migrated to another feature-compatible destination or whatever else. For developers who devote to offload virtio-net dataplane to DPU and make efforts to migrate vm lively from software-based source host to DPU-offload destination host smoothly, virtio-net feature compatibility is an serious issue, exporting the key capability and acked_features of netdev could also help to debug greatly. So we export out the key capabilities of netdev, which may affect the final negotiated virtio-net features, meanwhile, backed-up acked_features also exported, which is used to initialize or restore features negotiated between qemu and vhost slave when starting vhost_dev device. Signed-off-by: Hyman Huang(黄勇) --- net/net.c | 44 +++ qapi/net.json | 66 +++ 2 files changed, 110 insertions(+) diff --git a/net/net.c b/net/net.c index 2db160e..5d11674 100644 --- a/net/net.c +++ b/net/net.c @@ -53,6 +53,7 @@ #include "sysemu/runstate.h" #include "net/colo-compare.h" #include "net/filter.h" +#include "net/vhost-user.h" #include "qapi/string-output-visitor.h" /* Net bridge is currently not supported for W32. */ @@ -1224,6 +1225,49 @@ void qmp_netdev_del(const char *id, Error **errp) } } +static NetDevInfo *query_netdev(NetClientState *nc) +{ +NetDevInfo *info = NULL; + +if (!nc || !nc->is_netdev) { +return NULL; +} + +info = g_malloc0(sizeof(*info)); +info->name = g_strdup(nc->name); +info->type = nc->info->type; +info->ufo = nc->info->has_ufo; +info->vnet_hdr = nc->info->has_vnet_hdr; +info->vnet_hdr_len = nc->info->has_vnet_hdr_len; So all the fields are virtio specific, I wonder if it's better to rename the command as query-vhost or query-virtio? Indeed, i'm also a little struggling about the naming, i prefer Thomas's suggestion: 'x-query-virtio-netdev' and 'info virtio-netdev', since we may add or del some capabilities about the *netdev* , so adding a "x-" prefix seems to reasonable, as to '-netdev' suffix, it implies the *backend*. Thanks, Yong Thanks + +if (nc->info->type == NET_CLIENT_DRIVER_VHOST_USER) { +info->has_acked_features = true; +info->acked_features = vhost_user_get_acked_features(nc); +} + +return info; +} + +NetDevInfoList *qmp_query_netdev(Error **errp) +{ +NetClientState *nc; +NetDevInfo *info = NULL; +NetDevInfoList *head = NULL, **tail = &head; + +QTAILQ_FOREACH(nc, &net_clients, next) { +if (nc->info->type == NET_CLIENT_DRIVER_NIC) { +continue; +} + +info = query_netdev(nc); +if (info) { +QAPI_LIST_APPEND(tail, info); +} +} + +return head; +} + static void netfilter_print_info(Monitor *mon, NetFilterState *nf) { char *str; diff --git a/qapi/net.json b/qapi/net.json index dd088c0..76a6513 100644 --- a/qapi/net.json +++ b/qapi/net.json @@ -631,6 +631,72 @@ 'if': 'CONFIG_VMNET' } } } ## +# @NetDevInfo: +# +# NetDev information. This structure describes a NetDev information, including +# capabilities and negotiated features. +# +# @name: The NetDev name. +# +# @type: Type of NetDev. +# +# @ufo: True if NetDev has ufo capability. +# +# @vnet-hdr: True if NetDev has vnet_hdr. 
+# +# @vnet-hdr-len: True if given length can be assigned to NetDev. +# +# @acked-features: Negotiated features with vhost slave device if device support +# dataplane offload. +# +# Since: 7.1 +## +{'struct': 'NetDevInfo', + 'data': { +'name': 'str', +'type': 'NetClientDriver', +'ufo':'bool', +'vnet-hdr':'bool', +'vnet-hdr-len':'bool', +'*acked-features': 'uint64' } } + +## +# @query-netdev: +# +# Get a list of NetDevInfo for all virtual netdev peer devices. +# +# Returns: a list of @NetDevInfo describing each virtual netdev peer device. +# +# Since: 7.1 +# +# Example: +# +# -> { "execute": "query-netdev" } +# <- { +# "return":[ +# { +# &
Re: [PATCH RFC 1/4] net: Introduce qmp cmd "query-netdev"
在 2022/11/2 14:41, Michael S. Tsirkin 写道: On Wed, Nov 02, 2022 at 01:42:39PM +0800, Jason Wang wrote: On Tue, Nov 1, 2022 at 12:19 AM wrote: From: Hyman Huang(黄勇) For netdev device that can offload virtio-net dataplane to slave, such as vhost-net, vhost-user and vhost-vdpa, exporting it's capability information and acked features would be more friendly for developers. These infomation can be analyzed and compare to slave capability provided by, eg dpdk or other slaves directly, helping to draw conclusions about if vm network interface works normally, if it vm can be migrated to another feature-compatible destination or whatever else. For developers who devote to offload virtio-net dataplane to DPU and make efforts to migrate vm lively from software-based source host to DPU-offload destination host smoothly, virtio-net feature compatibility is an serious issue, exporting the key capability and acked_features of netdev could also help to debug greatly. So we export out the key capabilities of netdev, which may affect the final negotiated virtio-net features, meanwhile, backed-up acked_features also exported, which is used to initialize or restore features negotiated between qemu and vhost slave when starting vhost_dev device. Signed-off-by: Hyman Huang(黄勇) --- net/net.c | 44 +++ qapi/net.json | 66 +++ 2 files changed, 110 insertions(+) diff --git a/net/net.c b/net/net.c index 2db160e..5d11674 100644 --- a/net/net.c +++ b/net/net.c @@ -53,6 +53,7 @@ #include "sysemu/runstate.h" #include "net/colo-compare.h" #include "net/filter.h" +#include "net/vhost-user.h" #include "qapi/string-output-visitor.h" /* Net bridge is currently not supported for W32. */ @@ -1224,6 +1225,49 @@ void qmp_netdev_del(const char *id, Error **errp) } } +static NetDevInfo *query_netdev(NetClientState *nc) +{ +NetDevInfo *info = NULL; + +if (!nc || !nc->is_netdev) { +return NULL; +} + +info = g_malloc0(sizeof(*info)); +info->name = g_strdup(nc->name); +info->type = nc->info->type; +info->ufo = nc->info->has_ufo; +info->vnet_hdr = nc->info->has_vnet_hdr; +info->vnet_hdr_len = nc->info->has_vnet_hdr_len; So all the fields are virtio specific, I wonder if it's better to rename the command as query-vhost or query-virtio? Thanks We have info virtio already. Seems to fit there logically. Ok, it seems that 'x-query-virtio-netdev' is a good option. + +if (nc->info->type == NET_CLIENT_DRIVER_VHOST_USER) { +info->has_acked_features = true; +info->acked_features = vhost_user_get_acked_features(nc); +} + +return info; +} + +NetDevInfoList *qmp_query_netdev(Error **errp) +{ +NetClientState *nc; +NetDevInfo *info = NULL; +NetDevInfoList *head = NULL, **tail = &head; + +QTAILQ_FOREACH(nc, &net_clients, next) { +if (nc->info->type == NET_CLIENT_DRIVER_NIC) { +continue; +} + +info = query_netdev(nc); +if (info) { +QAPI_LIST_APPEND(tail, info); +} +} + +return head; +} + static void netfilter_print_info(Monitor *mon, NetFilterState *nf) { char *str; diff --git a/qapi/net.json b/qapi/net.json index dd088c0..76a6513 100644 --- a/qapi/net.json +++ b/qapi/net.json @@ -631,6 +631,72 @@ 'if': 'CONFIG_VMNET' } } } ## +# @NetDevInfo: +# +# NetDev information. This structure describes a NetDev information, including +# capabilities and negotiated features. +# +# @name: The NetDev name. +# +# @type: Type of NetDev. +# +# @ufo: True if NetDev has ufo capability. +# +# @vnet-hdr: True if NetDev has vnet_hdr. +# +# @vnet-hdr-len: True if given length can be assigned to NetDev. 
+#
+# @acked-features: Negotiated features with vhost slave device if device support
+#                  dataplane offload.
+#
+# Since: 7.1
+##
+{'struct': 'NetDevInfo',
+ 'data': {
+    'name': 'str',
+    'type': 'NetClientDriver',
+    'ufo': 'bool',
+    'vnet-hdr': 'bool',
+    'vnet-hdr-len': 'bool',
+    '*acked-features': 'uint64' } }
+
+##
+# @query-netdev:
+#
+# Get a list of NetDevInfo for all virtual netdev peer devices.
+#
+# Returns: a list of @NetDevInfo describing each virtual netdev peer device.
+#
+# Since: 7.1
+#
+# Example:
+#
+# -> { "execute": "query-netdev" }
+# <- {
+#      "return":[
+#         {
+#            "name":"hostnet0",
+#            "type":"vhost-user",
+#            "ufo":true,
+#            "vnet-hdr"
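For illustration only: once @acked-features is exposed by query-netdev, a client can compare the value against the feature bit numbers defined by the virtio specification. The snippet below is a hypothetical sketch, not part of this patch; dump_acked_features() is a made-up helper, and the bit macros mirror the standard virtio/virtio-net definitions.

    #include <stdint.h>
    #include <stdio.h>

    /* Feature bit numbers as defined by the virtio specification. */
    #define VIRTIO_NET_F_CSUM       0
    #define VIRTIO_NET_F_MRG_RXBUF  15
    #define VIRTIO_F_VERSION_1      32

    /* Print a few interesting bits of an exported acked-features value. */
    static void dump_acked_features(uint64_t features)
    {
        printf("csum:      %d\n", !!(features & (1ULL << VIRTIO_NET_F_CSUM)));
        printf("mrg_rxbuf: %d\n", !!(features & (1ULL << VIRTIO_NET_F_MRG_RXBUF)));
        printf("version_1: %d\n", !!(features & (1ULL << VIRTIO_F_VERSION_1)));
    }

    int main(void)
    {
        dump_acked_features(0x7060a782ULL);   /* example value seen in this thread */
        return 0;
    }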
Re: [PATCH RFC 1/4] net: Introduce qmp cmd "query-netdev"
在 2022/11/2 15:10, Thomas Huth 写道: On 02/11/2022 06.42, Jason Wang wrote: On Tue, Nov 1, 2022 at 12:19 AM wrote: From: Hyman Huang(黄勇) For netdev device that can offload virtio-net dataplane to slave, such as vhost-net, vhost-user and vhost-vdpa, exporting it's capability information and acked features would be more friendly for developers. These infomation can be analyzed and compare to slave capability provided by, eg dpdk or other slaves directly, helping to draw conclusions about if vm network interface works normally, if it vm can be migrated to another feature-compatible destination or whatever else. For developers who devote to offload virtio-net dataplane to DPU and make efforts to migrate vm lively from software-based source host to DPU-offload destination host smoothly, virtio-net feature compatibility is an serious issue, exporting the key capability and acked_features of netdev could also help to debug greatly. So we export out the key capabilities of netdev, which may affect the final negotiated virtio-net features, meanwhile, backed-up acked_features also exported, which is used to initialize or restore features negotiated between qemu and vhost slave when starting vhost_dev device. Signed-off-by: Hyman Huang(黄勇) --- net/net.c | 44 +++ qapi/net.json | 66 +++ 2 files changed, 110 insertions(+) diff --git a/net/net.c b/net/net.c index 2db160e..5d11674 100644 --- a/net/net.c +++ b/net/net.c @@ -53,6 +53,7 @@ #include "sysemu/runstate.h" #include "net/colo-compare.h" #include "net/filter.h" +#include "net/vhost-user.h" #include "qapi/string-output-visitor.h" /* Net bridge is currently not supported for W32. */ @@ -1224,6 +1225,49 @@ void qmp_netdev_del(const char *id, Error **errp) } } +static NetDevInfo *query_netdev(NetClientState *nc) +{ + NetDevInfo *info = NULL; + + if (!nc || !nc->is_netdev) { + return NULL; + } + + info = g_malloc0(sizeof(*info)); + info->name = g_strdup(nc->name); + info->type = nc->info->type; + info->ufo = nc->info->has_ufo; + info->vnet_hdr = nc->info->has_vnet_hdr; + info->vnet_hdr_len = nc->info->has_vnet_hdr_len; So all the fields are virtio specific, I wonder if it's better to rename the command as query-vhost or query-virtio? And add a "x-" prefix (and a "-netdev" suffix) as long as we don't feel confident about this yet? "x-query-virtio-netdev" ? Agree with that, thanks for the comment. Yong. Thomas
Re: [PATCH v3 2/2] vhost-net: Fix the virtio features negotiation flaw
The previous reply email has an text format error, please ignore and 在 2022/11/11 3:00, Michael S. Tsirkin 写道: On Sun, Oct 30, 2022 at 09:52:39PM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Save the acked_features once it be configured by guest virtio driver so it can't miss any features. Note that this patch also change the features saving logic in chr_closed_bh, which originally backup features no matter whether the features are 0 or not, but now do it only if features aren't 0. I'm not sure how is this change even related to what we are trying to do (fix a bug). Explain here? For this series, all we want to do is to making sure acked_features in the NetVhostUserState is credible and uptodate in the scenario that virtio features negotiation and openvswitch service restart happens simultaneously. To make sure that happens, we save the acked_features to NetVhostUserState right after guest setting virtio-net features. Assume that we do not save acked_features to NetVhostUserState just as it is, the acked_features in NetVhostUserState has chance to be assigned only when chr_closed_bh/vhost_user_stop happen. Note that openvswitch service stop will cause chr_closed_bh happens and acked_features in vhost_dev will be stored into NetVhostUserState, if the acked_features in vhost_dev are out-of-date(may be updated in the next few seconds), so does the acked_features in NetVhostUserState after doing the assignment, this is the bug. Let's refine the scenario and derive the bug: qemu threaddpdk | | vhost_net_init() | | | assign acked_features in vhost_dev | with 0x4000 | | openvswitch.service stop chr_closed_bh| | | assign acked_features in | NetVhostUserState with 0x4000 | | | virtio_net_set_features()| | | assign acked_features in vhost_dev | with 0x7060a782 | | openvswitch.service start | | vhost_user_start | | | assign acked_features in vhost_dev | with 0x4000 | | | As the step shows, if we do not keep the acked_features in NetVhostUserState up-to-date, the acked_features in vhost_dev may be reloaded with the wrong value(eg, 0x4000) when vhost_user_start happens. As to reset acked_features to 0 if needed, Qemu always keeping the backup acked_features up-to-date, and save the acked_features after virtio_net_set_features in advance, including reset acked_features to 0, so the behavior is also covered. Signed-off-by: Hyman Huang(黄勇) Signed-off-by: Guoyi Tu --- hw/net/vhost_net.c | 9 + hw/net/virtio-net.c | 5 + include/net/vhost_net.h | 2 ++ net/vhost-user.c| 6 +- 4 files changed, 17 insertions(+), 5 deletions(-) diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index d28f8b9..2bffc27 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -141,6 +141,15 @@ uint64_t vhost_net_get_acked_features(VHostNetState *net) return net->dev.acked_features; } +void vhost_net_save_acked_features(NetClientState *nc) +{ +if (nc->info->type != NET_CLIENT_DRIVER_VHOST_USER) { +return; +} + +vhost_user_save_acked_features(nc, false); +} + static int vhost_net_get_fd(NetClientState *backend) { switch (backend->info->type) { diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index e9f696b..5f8f788 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -924,6 +924,11 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint64_t features) continue; } vhost_net_ack_features(get_vhost_net(nc->peer), features); +/* + * keep acked_features in NetVhostUserState up-to-date so it + * can't miss any features configured by guest virtio driver. 
+ */ +vhost_net_save_acked_features(nc->peer); } if (virtio_has_feature(features, VIRTIO_NET_F_CTRL_VLAN)) { diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h index 387e913..3a5579b 100644 --- a/include/net/vhost_net.h +++ b/include/net/vhost_net.h @@ -46,6 +46,
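For readers following along, here is a plausible sketch of the helper used by the hunks above (it is introduced by patch 1/2, which is not quoted in this thread). The body is inferred from the commit message and the call sites: copy the vhost_dev's acked_features into NetVhostUserState, but only when they are non-zero, so a stale zero value cannot clobber a previously negotiated set. The name is simplified and the real helper takes a second boolean argument that is not modeled here.

    /* Sketch only; inferred from the callers shown in this thread. */
    static void save_acked_features_sketch(NetClientState *nc)
    {
        NetVhostUserState *s = DO_UPCAST(NetVhostUserState, nc, nc);

        if (s->vhost_net) {
            uint64_t features = vhost_net_get_acked_features(s->vhost_net);

            /* Only back up a real (non-zero) negotiation result. */
            if (features) {
                s->acked_features = features;
            }
        }
    }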
Re: [PATCH RESEND v3 00/10] migration: introduce dirtylimit capability
Ping ? 在 2022/12/4 1:09, huang...@chinatelecom.cn 写道: From: Hyman Huang(黄勇) v3(resend): - fix the syntax error of the topic. v3: This version make some modifications inspired by Peter and Markus as following: 1. Do the code clean up in [PATCH v2 02/11] suggested by Markus 2. Replace the [PATCH v2 03/11] with a much simpler patch posted by Peter to fix the following bug: https://bugzilla.redhat.com/show_bug.cgi?id=2124756 3. Fix the error path of migrate_params_check in [PATCH v2 04/11] pointed out by Markus. Enrich the commit message to explain why x-vcpu-dirty-limit-period an unstable parameter. 4. Refactor the dirty-limit convergence algo in [PATCH v2 07/11] suggested by Peter: a. apply blk_mig_bulk_active check before enable dirty-limit b. drop the unhelpful check function before enable dirty-limit c. change the migration_cancel logic, just cancel dirty-limit only if dirty-limit capability turned on. d. abstract a code clean commit [PATCH v3 07/10] to adjust the check order before enable auto-converge 5. Change the name of observing indexes during dirty-limit live migration to make them more easy-understanding. Use the maximum throttle time of vpus as "dirty-limit-throttle-time-per-full" 6. Fix some grammatical and spelling errors pointed out by Markus and enrich the document about the dirty-limit live migration observing indexes "dirty-limit-ring-full-time" and "dirty-limit-throttle-time-per-full" 7. Change the default value of x-vcpu-dirty-limit-period to 1000ms, which is optimal value pointed out in cover letter in that testing environment. 8. Drop the 2 guestperf test commits [PATCH v2 10/11], [PATCH v2 11/11] and post them with a standalone series in the future. Thanks Peter and Markus sincerely for the passionate, efficient and careful comments and suggestions. Please review. Yong v2: This version make a little bit modifications comparing with version 1 as following: 1. fix the overflow issue reported by Peter Maydell 2. add parameter check for hmp "set_vcpu_dirty_limit" command 3. fix the racing issue between dirty ring reaper thread and Qemu main thread. 4. add migrate parameter check for x-vcpu-dirty-limit-period and vcpu-dirty-limit. 5. add the logic to forbid hmp/qmp commands set_vcpu_dirty_limit, cancel_vcpu_dirty_limit during dirty-limit live migration when implement dirty-limit convergence algo. 6. add capability check to ensure auto-converge and dirty-limit are mutually exclusive. 7. pre-check if kvm dirty ring size is configured before setting dirty-limit migrate parameter A more comprehensive test was done comparing with version 1. The following are test environment: - a. Host hardware info: CPU: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz CPU(s): 64 On-line CPU(s) list: 0-63 Thread(s) per core: 2 Core(s) per socket: 16 Socket(s): 2 NUMA node(s):2 NUMA node0 CPU(s): 0-15,32-47 NUMA node1 CPU(s): 16-31,48-63 Memory: Hynix 503Gi Interface: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09) Speed: 1000Mb/s b. Host software info: OS: ctyunos release 2 Kernel: 4.19.90-2102.2.0.0066.ctl2.x86_64 Libvirt baseline version: libvirt-6.9.0 Qemu baseline version: qemu-5.0 c. vm scale CPU: 4 Memory: 4G - All the supplementary test data shown as follows are basing on above test environment. 
In version 1, we posted test data from unixbench as follows:

$ taskset -c 8-15 ./Run -i 2 -c 8 {unixbench test item}

host cpu: Intel(R) Xeon(R) Platinum 8378A
host interface speed: 1000Mb/s

|---------------------+--------+------------+---------------|
| UnixBench test item | Normal | Dirtylimit | Auto-converge |
|---------------------+--------+------------+---------------|
| dhry2reg            | 32800  | 32786      | 25292         |
| whetstone-double    | 10326  | 10315      | 9847          |
| pipe                | 15442  | 15271      | 14506         |
| context1            | 7260   | 6235       | 4514          |
| spawn               | 3663   | 3317       | 3249          |
| syscall             | 4669   | 4667       | 3841          |
|---------------------+--------+------------+---------------|

In version 2, we posted supplementary test data that does not use taskset, making the scenario more general:

$ ./Run

per-vcpu data:

|---------------------+--------+------------+---------------|
| UnixBench test item | Normal | Dirtylimit | Auto-converge |
|---------------------+--------+------------+---------------|
| dhry2reg            | 2991   | 2902       | 1722          |
| whetstone-double    | 1018   | 1006       | 627
Re: [RFC PATCH 2/2] tests: Add dirty page rate limit test
在 2022/3/10 16:29, Peter Xu 写道: On Wed, Mar 09, 2022 at 11:58:01PM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Add dirty page rate limit test if kernel support dirty ring, create a standalone file to implement the test case. Thanks for writting this test case. Signed-off-by: Hyman Huang(黄勇) --- tests/qtest/dirtylimit-test.c | 288 ++ tests/qtest/meson.build | 2 + 2 files changed, 290 insertions(+) create mode 100644 tests/qtest/dirtylimit-test.c diff --git a/tests/qtest/dirtylimit-test.c b/tests/qtest/dirtylimit-test.c new file mode 100644 index 000..07eac2c --- /dev/null +++ b/tests/qtest/dirtylimit-test.c @@ -0,0 +1,288 @@ +/* + * QTest testcase for Dirty Page Rate Limit + * + * Copyright (c) 2022 CHINA TELECOM CO.,LTD. + * + * Authors: + * Hyman Huang(黄勇) + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "libqos/libqtest.h" +#include "qapi/qmp/qdict.h" +#include "qapi/qmp/qlist.h" +#include "qapi/qobject-input-visitor.h" +#include "qapi/qobject-output-visitor.h" + +#include "migration-helpers.h" +#include "tests/migration/i386/a-b-bootblock.h" + +/* + * Dirtylimit stop working if dirty page rate error + * value less than DIRTYLIMIT_TOLERANCE_RANGE + */ +#define DIRTYLIMIT_TOLERANCE_RANGE 25 /* MB/s */ + +static QDict *qmp_command(QTestState *who, const char *command, ...) +{ +va_list ap; +QDict *resp, *ret; + +va_start(ap, command); +resp = qtest_vqmp(who, command, ap); +va_end(ap); + +g_assert(!qdict_haskey(resp, "error")); +g_assert(qdict_haskey(resp, "return")); + +ret = qdict_get_qdict(resp, "return"); +qobject_ref(ret); +qobject_unref(resp); + +return ret; +} + +static void calc_dirty_rate(QTestState *who, uint64_t calc_time) +{ +qobject_unref(qmp_command(who, +"{ 'execute': 'calc-dirty-rate'," +"'arguments': { " +"'calc-time': %ld," +"'mode': 'dirty-ring' }}", +calc_time)); +} + +static QDict *query_dirty_rate(QTestState *who) +{ +return qmp_command(who, "{ 'execute': 'query-dirty-rate' }"); +} + +static void dirtylimit_set_all(QTestState *who, uint64_t dirtyrate) +{ +qobject_unref(qmp_command(who, +"{ 'execute': 'set-vcpu-dirty-limit'," +"'arguments': { " +"'dirty-rate': %ld } }", +dirtyrate)); +} + +static void cancel_vcpu_dirty_limit(QTestState *who) +{ +qobject_unref(qmp_command(who, +"{ 'execute': 'cancel-vcpu-dirty-limit' }")); +} + +static QDict *query_vcpu_dirty_limit(QTestState *who) +{ +QDict *rsp; + +rsp = qtest_qmp(who, "{ 'execute': 'query-vcpu-dirty-limit' }"); +g_assert(!qdict_haskey(rsp, "error")); +g_assert(qdict_haskey(rsp, "return")); + +return rsp; +} + +static int64_t get_dirty_rate(QTestState *who) +{ +QDict *rsp_return; +gchar *status; +QList *rates; +const QListEntry *entry; +QDict *rate; +int64_t dirtyrate; + +rsp_return = query_dirty_rate(who); +g_assert(rsp_return); + +status = g_strdup(qdict_get_str(rsp_return, "status")); +g_assert(status); +g_assert_cmpstr(status, ==, "measured"); + +rates = qdict_get_qlist(rsp_return, "vcpu-dirty-rate"); +g_assert(rates && !qlist_empty(rates)); + +entry = qlist_first(rates); +g_assert(entry); + +rate = qobject_to(QDict, qlist_entry_obj(entry)); +g_assert(rate); + +dirtyrate = qdict_get_try_int(rate, "dirty-rate", -1); + +qobject_unref(rsp_return); +return dirtyrate; +} + +static int64_t get_limit_rate(QTestState *who) +{ +QDict *rsp_return; +QList *rates; +const QListEntry *entry; +QDict *rate; +int64_t dirtyrate; + +rsp_return = query_vcpu_dirty_limit(who); 
+g_assert(rsp_return); + +rates = qdict_get_qlist(rsp_return, "return"); +g_assert(rates && !qlist_empty(rates)); + +entry = qlist_first(rates); +g_assert(entry); + +rate = qobject_to(QDict, qlist_entry_obj(entry)); +g_assert(rate); + +dirtyrate = qdict_get_try_int(rate, "limit-rate", -1); + +qobject_unref(rsp_return); +return dirtyrate; +} + +static QTestState *start_vm(void) +{ +QTestState *vm = NULL; +g_autofree gchar *cmd = NULL; +const char *arch = qtest_get_ar
Re: [PATCH v21 8/9] migration-test: Export migration-test util funtions
On 2022/3/30 2:54, Peter Xu wrote: On Wed, Mar 16, 2022 at 09:07:20PM +0800, huang...@chinatelecom.cn wrote: +void wait_for_serial(const char *tmpfs, const char *side) Passing tmpfs around over and over (even if it's mostly a constant) doesn't sound appealing to me. I hope there's still a way that we could avoid doing that when splitting the file. Or, how about you just add a new test into migration-test? After all, all migration tests (including auto-converge) are there, and I don't strongly feel that we need a separate file urgently. OK, I separated the file just for code readability. I don't insist on it if we think it's OK to add the dirtylimit test to migration-test. Thanks for the comment. :) Yong
Re: [PATCH v21 9/9] tests: Add dirty page rate limit test
在 2022/3/30 3:54, Peter Xu 写道: On Wed, Mar 16, 2022 at 09:07:21PM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Add dirty page rate limit test if kernel support dirty ring, create a standalone file to implement the test case. The following qmp commands are covered by this test case: "calc-dirty-rate", "query-dirty-rate", "set-vcpu-dirty-limit", "cancel-vcpu-dirty-limit" and "query-vcpu-dirty-limit". Signed-off-by: Hyman Huang(黄勇) --- tests/qtest/dirtylimit-test.c | 327 ++ tests/qtest/meson.build | 2 + 2 files changed, 329 insertions(+) create mode 100644 tests/qtest/dirtylimit-test.c diff --git a/tests/qtest/dirtylimit-test.c b/tests/qtest/dirtylimit-test.c new file mode 100644 index 000..b8d9960 --- /dev/null +++ b/tests/qtest/dirtylimit-test.c @@ -0,0 +1,327 @@ +/* + * QTest testcase for Dirty Page Rate Limit + * + * Copyright (c) 2022 CHINA TELECOM CO.,LTD. + * + * Authors: + * Hyman Huang(黄勇) + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "libqos/libqtest.h" +#include "qapi/qmp/qdict.h" +#include "qapi/qmp/qlist.h" +#include "qapi/qobject-input-visitor.h" +#include "qapi/qobject-output-visitor.h" + +#include "migration-helpers.h" +#include "tests/migration/i386/a-b-bootblock.h" + +/* + * Dirtylimit stop working if dirty page rate error + * value less than DIRTYLIMIT_TOLERANCE_RANGE + */ +#define DIRTYLIMIT_TOLERANCE_RANGE 25 /* MB/s */ + +static const char *tmpfs; + +static QDict *qmp_command(QTestState *who, const char *command, ...) +{ +va_list ap; +QDict *resp, *ret; + +va_start(ap, command); +resp = qtest_vqmp(who, command, ap); +va_end(ap); + +g_assert(!qdict_haskey(resp, "error")); +g_assert(qdict_haskey(resp, "return")); + +ret = qdict_get_qdict(resp, "return"); +qobject_ref(ret); +qobject_unref(resp); + +return ret; +} + +static void calc_dirty_rate(QTestState *who, uint64_t calc_time) +{ +qobject_unref(qmp_command(who, + "{ 'execute': 'calc-dirty-rate'," + "'arguments': { " + "'calc-time': %ld," + "'mode': 'dirty-ring' }}", + calc_time)); +} + +static QDict *query_dirty_rate(QTestState *who) +{ +return qmp_command(who, "{ 'execute': 'query-dirty-rate' }"); +} + +static void dirtylimit_set_all(QTestState *who, uint64_t dirtyrate) +{ +qobject_unref(qmp_command(who, + "{ 'execute': 'set-vcpu-dirty-limit'," + "'arguments': { " + "'dirty-rate': %ld } }", + dirtyrate)); +} + +static void cancel_vcpu_dirty_limit(QTestState *who) +{ +qobject_unref(qmp_command(who, + "{ 'execute': 'cancel-vcpu-dirty-limit' }")); +} + +static QDict *query_vcpu_dirty_limit(QTestState *who) +{ +QDict *rsp; + +rsp = qtest_qmp(who, "{ 'execute': 'query-vcpu-dirty-limit' }"); +g_assert(!qdict_haskey(rsp, "error")); +g_assert(qdict_haskey(rsp, "return")); + +return rsp; +} + +static bool calc_dirtyrate_ready(QTestState *who) +{ +QDict *rsp_return; +gchar *status; + +rsp_return = query_dirty_rate(who); +g_assert(rsp_return); + +status = g_strdup(qdict_get_str(rsp_return, "status")); +g_assert(status); + +return g_strcmp0(status, "measuring"); +} + +static void wait_for_calc_dirtyrate_complete(QTestState *who, + int64_t calc_time) +{ +int max_try_count = 200; +usleep(calc_time); + +while (!calc_dirtyrate_ready(who) && max_try_count--) { +usleep(1000); +} + +/* + * Set the timeout with 200 ms(max_try_count * 1000us), + * if dirtyrate measurement not complete, test failed. + */ +g_assert_cmpint(max_try_count, !=, 0); 200ms might be still too challenging for busy systems? 
How about make it in seconds (e.g. 10 seconds)? +} + +static int64_t get_dirty_rate(QTestState *who) +{ +QDict *rsp_return; +gchar *status; +QList *rates; +const QListEntry *entry; +QDict *rate; +int64_t dirtyrate; + +rsp_return = query_dirty_rate(who); +g_assert(rsp_return); + +status = g_strdup(qdict_get_str(rsp_return, "status")); +g_assert(status); +g_assert_cmpstr(status, ==, "measured"); + +rates = qdict_get_qlist(rsp_return, "vcpu-dirty-ra
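To illustrate Peter's suggestion of a seconds-scale timeout, one possible reworking of the polling helper is sketched below. This is a sketch only: it assumes calc_time is given in seconds, and the ten-second retry budget and one-second poll interval are arbitrary choices, not values from the series.

    /* Wait up to ~10 seconds after the measurement window for the
     * dirty rate calculation to finish, polling once per second. */
    static void wait_for_calc_dirtyrate_complete(QTestState *who,
                                                 int64_t calc_time)
    {
        int timeout_s = 10;

        /* Let the requested measurement window elapse first. */
        g_usleep(calc_time * 1000 * 1000);

        while (timeout_s-- && !calc_dirtyrate_ready(who)) {
            g_usleep(1000 * 1000);
        }

        /* Fail the test if the measurement still hasn't completed. */
        g_assert_true(calc_dirtyrate_ready(who));
    }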
Re: [PATCH v1 0/8] migration: introduce dirtylimit capability
在 2022/9/7 4:46, Peter Xu 写道: On Fri, Sep 02, 2022 at 01:22:28AM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) v1: - make parameter vcpu-dirty-limit experimental - switch dirty limit off when cancel migrate - add cancel logic in migration test Please review, thanks, Yong Abstract This series added a new migration capability called "dirtylimit". It can be enabled when dirty ring is enabled, and it'll improve the vCPU performance during the process of migration. It is based on the previous patchset: https://lore.kernel.org/qemu-devel/cover.1656177590.git.huang...@chinatelecom.cn/ As mentioned in patchset "support dirty restraint on vCPU", dirtylimit way of migration can make the read-process not be penalized. This series wires up the vcpu dirty limit and wrappers as dirtylimit capability of migration. I introduce two parameters vcpu-dirtylimit-period and vcpu-dirtylimit to implement the setup of dirtylimit during live migration. To validate the implementation, i tested a 32 vCPU vm live migration with such model: Only dirty vcpu0, vcpu1 with heavy memory workoad and leave the rest vcpus untouched, running unixbench on the vpcu8-vcpu15 by setup the cpu affinity as the following command: taskset -c 8-15 ./Run -i 2 -c 8 {unixbench test item} The following are results: host cpu: Intel(R) Xeon(R) Platinum 8378A host interface speed: 1000Mb/s |-+++---| | UnixBench test item | Normal | Dirtylimit | Auto-converge | |-+++---| | dhry2reg| 32800 | 32786 | 25292 | | whetstone-double| 10326 | 10315 | 9847 | | pipe| 15442 | 15271 | 14506 | | context1| 7260 | 6235 | 4514 | | spawn | 3663 | 3317 | 3249 | | syscall | 4669 | 4667 | 3841 | |-+++---| From the data above we can draw a conclusion that vcpus that do not dirty memory in vm are almost unaffected during the dirtylimit migration, but the auto converge way does. I also tested the total time of dirtylimit migration with variable dirty memory size in vm. senario 1: host cpu: Intel(R) Xeon(R) Platinum 8378A host interface speed: 1000Mb/s |---++---| | dirty memory size(MB) | Dirtylimit(ms) | Auto-converge(ms) | |---++---| | 60| 2014 | 2131 | | 70| 5381 | 12590 | | 90| 6037 | 33545 | | 110 | 7660 | [*] | |---++---| [*]: This case means migration is not convergent. senario 2: host cpu: Intel(R) Xeon(R) CPU E5-2650 host interface speed: 1Mb/s |---++---| | dirty memory size(MB) | Dirtylimit(ms) | Auto-converge(ms) | |---++---| | 1600 | 15842 | 27548 | | 2000 | 19026 | 38447 | | 2400 | 19897 | 46381 | | 2800 | 22338 | 57149 | |---++---| Above data shows that dirtylimit way of migration can also reduce the total time of migration and it achieves convergence more easily in some case. In addition to implement dirtylimit capability itself, this series add 3 tests for migration, aiming at playing around for developer simply: 1. qtest for dirty limit migration 2. support dirty ring way of migration for guestperf tool 3. support dirty limit migration for guestperf tool Yong, I should have asked even earlier - just curious whether you have started using this in production systems? It's definitely not required for any patchset to be merged, but it'll be very useful (and supportive) information to have if there's proper testing beds applied already. Actually no when i posted the cover letter above, the qemu version in our production is much lower than upstream, and the patchset is different from here, i built test mode and did the test on my own in the first time. 
But this feature is currently being tested by another professional test team, so once the report is ready, I'll post it. :) Thanks,
Re: [PATCH v1 4/8] migration: Implement dirty-limit convergence algo
在 2022/9/7 4:37, Peter Xu 写道: On Fri, Sep 02, 2022 at 01:22:32AM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Implement dirty-limit convergence algo for live migration, which is kind of like auto-converge algo but using dirty-limit instead of cpu throttle to make migration convergent. Signed-off-by: Hyman Huang(黄勇) --- migration/migration.c | 1 + migration/ram.c| 53 +- migration/trace-events | 1 + 3 files changed, 42 insertions(+), 13 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index d117bb4..64696de 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -239,6 +239,7 @@ void migration_cancel(const Error *error) if (error) { migrate_set_error(current_migration, error); } +qmp_cancel_vcpu_dirty_limit(false, -1, NULL); migrate_fd_cancel(current_migration); } diff --git a/migration/ram.c b/migration/ram.c index dc1de9d..cc19c5e 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -45,6 +45,7 @@ #include "qapi/error.h" #include "qapi/qapi-types-migration.h" #include "qapi/qapi-events-migration.h" +#include "qapi/qapi-commands-migration.h" #include "qapi/qmp/qerror.h" #include "trace.h" #include "exec/ram_addr.h" @@ -57,6 +58,8 @@ #include "qemu/iov.h" #include "multifd.h" #include "sysemu/runstate.h" +#include "sysemu/dirtylimit.h" +#include "sysemu/kvm.h" #include "hw/boards.h" /* for machine_dump_guest_core() */ @@ -1139,6 +1142,21 @@ static void migration_update_rates(RAMState *rs, int64_t end_time) } } +/* + * Enable dirty-limit to throttle down the guest + */ +static void migration_dirty_limit_guest(void) +{ +if (!dirtylimit_in_service()) { +MigrationState *s = migrate_get_current(); +int64_t quota_dirtyrate = s->parameters.x_vcpu_dirty_limit; + +/* Set quota dirtyrate if dirty limit not in service */ +qmp_set_vcpu_dirty_limit(false, -1, quota_dirtyrate, NULL); +trace_migration_dirty_limit_guest(quota_dirtyrate); +} +} + static void migration_trigger_throttle(RAMState *rs) { MigrationState *s = migrate_get_current(); @@ -1148,22 +1166,31 @@ static void migration_trigger_throttle(RAMState *rs) uint64_t bytes_dirty_period = rs->num_dirty_pages_period * TARGET_PAGE_SIZE; uint64_t bytes_dirty_threshold = bytes_xfer_period * threshold / 100; -/* During block migration the auto-converge logic incorrectly detects - * that ram migration makes no progress. Avoid this by disabling the - * throttling logic during the bulk phase of block migration. */ -if (migrate_auto_converge() && !blk_mig_bulk_active()) { -/* The following detection logic can be refined later. For now: - Check to see if the ratio between dirtied bytes and the approx. - amount of bytes that just got transferred since the last time - we were in this routine reaches the threshold. If that happens - twice, start or increase throttling. */ - -if ((bytes_dirty_period > bytes_dirty_threshold) && -(++rs->dirty_rate_high_cnt >= 2)) { +/* + * The following detection logic can be refined later. For now: + * Check to see if the ratio between dirtied bytes and the approx. + * amount of bytes that just got transferred since the last time + * we were in this routine reaches the threshold. If that happens + * twice, start or increase throttling. + */ + +if ((bytes_dirty_period > bytes_dirty_threshold) && +(++rs->dirty_rate_high_cnt >= 2)) { +rs->dirty_rate_high_cnt = 0; +/* + * During block migration the auto-converge logic incorrectly detects + * that ram migration makes no progress. 
Avoid this by disabling the + * throttling logic during the bulk phase of block migration + */ + +if (migrate_auto_converge() && !blk_mig_bulk_active()) { trace_migration_throttle(); -rs->dirty_rate_high_cnt = 0; mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); +} else if (migrate_dirty_limit() && + kvm_dirty_ring_enabled() && + migration_is_active(s)) { +migration_dirty_limit_guest(); We'll call this multiple time, but only the 1st call will make sense, right? Yes. Can we call it once somewhere? E.g. at the start of migration?It make sense indeed, if dirtylimit run once migration start, the behavior of dirtylimit migration would be kind of different from auto-converge, i mean, dirtylimit will make guest write vCPU slow no matter if dirty_rate_high_cnt ex
Re: [PATCH v22 0/8] support dirty restraint on vCPU
Ping. Hi, David and Peter, how do you think this patchset? Is it suitable for queueing ? or is there still something need to be done ? Yong 在 2022/4/1 1:49, huang...@chinatelecom.cn 写道: From: Hyman Huang(黄勇) This is v22 of dirtylimit series. The following is the history of the patchset, since v22 kind of different from the original version, i made abstracts of changelog: RFC and v1: https://lore.kernel.org/qemu-devel/cover.1637214721.git.huang...@chinatelecom.cn/ v2: https://lore.kernel.org/qemu-devel/cover.1637256224.git.huang...@chinatelecom.cn/ v1->v2 changelog: - rename some function and variables. refactor the original algo of dirtylimit. Thanks for the comments given by Juan Quintela. v3: https://lore.kernel.org/qemu-devel/cover.1637403404.git.huang...@chinatelecom.cn/ v4: https://lore.kernel.org/qemu-devel/cover.1637653303.git.huang...@chinatelecom.cn/ v5: https://lore.kernel.org/qemu-devel/cover.1637759139.git.huang...@chinatelecom.cn/ v6: https://lore.kernel.org/qemu-devel/cover.1637856472.git.huang...@chinatelecom.cn/ v7: https://lore.kernel.org/qemu-devel/cover.1638202004.git.huang...@chinatelecom.cn/ v2->v7 changelog: - refactor the docs, annotation and fix bugs of the original algo of dirtylimit. Thanks for the review given by Markus Armbruster. v8: https://lore.kernel.org/qemu-devel/cover.1638463260.git.huang...@chinatelecom.cn/ v9: https://lore.kernel.org/qemu-devel/cover.1638495274.git.huang...@chinatelecom.cn/ v10: https://lore.kernel.org/qemu-devel/cover.1639479557.git.huang...@chinatelecom.cn/ v7->v10 changelog: - introduce a simpler but more efficient algo of dirtylimit inspired by Peter Xu. - keep polishing the annotation suggested by Markus Armbruster. v11: https://lore.kernel.org/qemu-devel/cover.1641315745.git.huang...@chinatelecom.cn/ v12: https://lore.kernel.org/qemu-devel/cover.1642774952.git.huang...@chinatelecom.cn/ v13: https://lore.kernel.org/qemu-devel/cover.1644506963.git.huang...@chinatelecom.cn/ v10->v13 changelog: - handle the hotplug/unplug scenario. - refactor the new algo, split the commit and make the code more clean. v14: https://lore.kernel.org/qemu-devel/cover.1644509582.git.huang...@chinatelecom.cn/ v13->v14 changelog: - sent by accident. v15: https://lore.kernel.org/qemu-devel/cover.1644976045.git.huang...@chinatelecom.cn/ v16: https://lore.kernel.org/qemu-devel/cover.1645067452.git.huang...@chinatelecom.cn/ v17: https://lore.kernel.org/qemu-devel/cover.1646243252.git.huang...@chinatelecom.cn/ v14->v17 changelog: - do some code clean and fix test bug reported by Dr. David Alan Gilbert. v18: https://lore.kernel.org/qemu-devel/cover.1646247968.git.huang...@chinatelecom.cn/ v19: https://lore.kernel.org/qemu-devel/cover.1647390160.git.huang...@chinatelecom.cn/ v20: https://lore.kernel.org/qemu-devel/cover.1647396907.git.huang...@chinatelecom.cn/ v21: https://lore.kernel.org/qemu-devel/cover.1647435820.git.huang...@chinatelecom.cn/ v17->v21 changelog: - add qtest, fix bug and do code clean. v21->v22 changelog: - move the vcpu dirty limit test into migration-test and do some modification suggested by Peter. Please review. Yong. Abstract This patchset introduce a mechanism to impose dirty restraint on vCPU, aiming to keep the vCPU running in a certain dirtyrate given by user. dirty restraint on vCPU maybe an alternative method to implement convergence logic for live migration, which could improve guest memory performance during migration compared with traditional method in theory. 
For the current live migration implementation, the convergence logic throttles all vCPUs of the VM, which has some side effects. -'read processes' on vCPU will be unnecessarily penalized - throttle increase percentage step by step, which seems struggling to find the optimal throttle percentage when dirtyrate is high. - hard to predict the remaining time of migration if the throttling percentage reachs 99% to a certain extent, the dirty restraint machnism can fix these effects by throttling at vCPU granularity during migration. the implementation is rather straightforward, we calculate vCPU dirtyrate via the Dirty Ring mechanism periodically as the commit 0e21bf246 "implement dirty-ring dirtyrate calculation" does, for vCPU that be specified to impose dirty restraint, we throttle it periodically as the auto-converge does, once after throttling, we compare the quota dirtyrate with current dirtyrate, if current dirtyrate is not under the quota, increase the throttling percentage until current dirtyrate is under the quota. this patchset is the basis of implmenting a new auto-converge method for live migration, we introduce two qmp commands for impose/cancel the dirty restraint on specified vCPU, so it also can be an independent api to supply the upper app such as libvirt, which can use it to implement the convergence logic during live migration, supplemented with the qmp 'ca
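As a rough illustration of the per-vCPU control loop described above (compare the measured dirty rate against the quota, then step the throttle up or down), here is a minimal sketch. It assumes two hypothetical helpers, vcpu_dirty_rate() and vcpu_set_throttle_pct(), standing in for the series' dirty-ring sampling and per-vCPU throttle; the step size and period are placeholders, and the series' real algorithm is more elaborate.

    #include <stdint.h>
    #include <unistd.h>

    /* Hypothetical helpers, not part of the series. */
    extern uint64_t vcpu_dirty_rate(int cpu_index);                 /* MB/s */
    extern void vcpu_set_throttle_pct(int cpu_index, unsigned int pct);

    #define THROTTLE_STEP_PCT  5
    #define THROTTLE_MAX_PCT   99
    #define CHECK_PERIOD_S     1

    static void dirty_restraint_loop(int cpu_index, uint64_t quota_rate)
    {
        unsigned int pct = 0;

        for (;;) {
            uint64_t current = vcpu_dirty_rate(cpu_index);

            if (current > quota_rate) {
                pct += THROTTLE_STEP_PCT;          /* over quota: throttle harder */
                if (pct > THROTTLE_MAX_PCT) {
                    pct = THROTTLE_MAX_PCT;
                }
            } else if (current < quota_rate && pct > 0) {
                pct = pct >= THROTTLE_STEP_PCT ?   /* under quota: relax */
                      pct - THROTTLE_STEP_PCT : 0;
            }

            vcpu_set_throttle_pct(cpu_index, pct);
            sleep(CHECK_PERIOD_S);
        }
    }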
Re: [PATCH v9 3/3] cpus-common: implement dirty page limit on vCPU
在 2021/12/6 16:28, Peter Xu 写道: On Sat, Dec 04, 2021 at 08:00:19PM +0800, Hyman Huang wrote: 在 2021/12/3 20:34, Markus Armbruster 写道: huang...@chinatelecom.cn writes: From: Hyman Huang(黄勇) Implement dirtyrate calculation periodically basing on dirty-ring and throttle vCPU until it reachs the quota dirty page rate given by user. Introduce qmp commands "vcpu-dirty-limit", "query-vcpu-dirty-limit" to enable, disable, query dirty page limit for virtual CPU. Meanwhile, introduce corresponding hmp commands "vcpu_dirty_limit", "info vcpu_dirty_limit" so developers can play with them easier. Signed-off-by: Hyman Huang(黄勇) [...] I see you replaced the interface. Back to square one... diff --git a/qapi/migration.json b/qapi/migration.json index 3da8fdf..dc15b3f 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -1872,6 +1872,54 @@ 'current-rate': 'int64' } } ## +# @vcpu-dirty-limit: +# +# Set or cancel the upper limit of dirty page rate for a virtual CPU. +# +# Requires KVM with accelerator property "dirty-ring-size" set. +# A virtual CPU's dirty page rate is a measure of its memory load. +# To observe dirty page rates, use @calc-dirty-rate. +# +# @cpu-index: index of virtual CPU. +# +# @enable: true to enable, false to disable. +# +# @dirty-rate: upper limit of dirty page rate for virtual CPU. +# +# Since: 7.0 +# +# Example: +# {"execute": "vcpu-dirty-limit"} +#"arguments": { "cpu-index": 0, +# "enable": true, +# "dirty-rate": 200 } } +# +## +{ 'command': 'vcpu-dirty-limit', + 'data': { 'cpu-index': 'int', +'enable': 'bool', +'dirty-rate': 'uint64'} } When @enable is false, @dirty-rate makes no sense and is not used (I checked the code), but users have to specify it anyway. That's bad design. Better: drop @enable, make @dirty-rate optional, present means enable, absent means disable. Uh, if we drop @enable, enabling dirty limit should be like: vcpu-dirty-limit cpu-index=0 dirty-rate=1000 And disabling dirty limit like: vcpu-dirty-limit cpu-index=0 For disabling case, there is no hint of disabling in command "vcpu-dirty-limit". How about make @dirty-rate optional, when enable dirty limit, it should present, ignored otherwise? Sounds good, I think we can make both "enable" and "dirty-rate" optional. To turn it on we either use "enable=true,dirty-rate=XXX" or "dirty-rate=XXX" > To turn it off we use "enable=false". Indeed, this make things more convenient. >> + +## +# @query-vcpu-dirty-limit: +# +# Returns information about the virtual CPU dirty limit status. +# +# @cpu-index: index of the virtual CPU to query, if not specified, all +# virtual CPUs will be queried. +# +# Since: 7.0 +# +# Example: +# {"execute": "query-vcpu-dirty-limit"} +#"arguments": { "cpu-index": 0 } } +# +## +{ 'command': 'query-vcpu-dirty-limit', + 'data': { '*cpu-index': 'int' }, + 'returns': [ 'DirtyLimitInfo' ] } Why would anyone ever want to specify @cpu-index? Output isn't that large even if you have a few hundred CPUs. Let's keep things simple and drop the parameter. Ok, this make things simple. I found that it'll be challenging for any human being to identify "whether he/she has turned throttle off for all vcpus".. I think that could be useful when we finally decided to cancel current migration. That's question, how about adding an optional argument "global" and making "cpu-index", "enable", "dirty-rate" all optional in "vcpu-dirty-limit", keeping the "cpu-index" and "global" options mutually exclusive? 
{ 'command': 'vcpu-dirty-limit', 'data': { '*cpu-index': 'int', '*global': 'bool' '*enable': 'bool', '*dirty-rate': 'uint64'} } In the case of enabling all vcpu throttle: Either use "global=true,enable=true,dirty-rate=XXX" or "global=true,dirty-rate=XXX" In the case of disabling all vcpu throttle: use "global=true,enable=false,dirty-rate=XXX" In other case, we pass the same option just like what we did for specified vcpu throttle before. I thought about adding a "global=on/off" flag, but instead can we just return the vcpu info for the ones that enabled the per-vcpu throttling? For anyone who wants to read all vcpu dirty information he/she can use calc-dirty-rate. Ok, I'll pick up this advice next version. Thanks,
Re: [PATCH v9 3/3] cpus-common: implement dirty page limit on vCPU
在 2021/12/6 16:36, Peter Xu 写道: On Fri, Dec 03, 2021 at 09:39:47AM +0800, huang...@chinatelecom.cn wrote: From: Hyman Huang(黄勇) Implement dirtyrate calculation periodically basing on dirty-ring and throttle vCPU until it reachs the quota dirty page rate given by user. Introduce qmp commands "vcpu-dirty-limit", "query-vcpu-dirty-limit" to enable, disable, query dirty page limit for virtual CPU. Meanwhile, introduce corresponding hmp commands "vcpu_dirty_limit", "info vcpu_dirty_limit" so developers can play with them easier. Thanks. Even if I start to use qmp-shell more recently but still in some case where only hmp is specified this could still be handy. +void qmp_vcpu_dirty_limit(int64_t cpu_index, + bool enable, + uint64_t dirty_rate, + Error **errp) +{ +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +error_setg(errp, "dirty page limit feature requires KVM with" + " accelerator property 'dirty-ring-size' set'"); +return; +} + +if (!dirtylimit_is_vcpu_index_valid(cpu_index)) { +error_setg(errp, "cpu index out of range"); +return; +} + +if (enable) { +dirtylimit_calc(); +dirtylimit_vcpu(cpu_index, dirty_rate); +} else { +if (!dirtylimit_enabled(cpu_index)) { +error_setg(errp, "dirty page limit for CPU %ld not set", + cpu_index); +return; +} We don't need to fail the user for enable=off even if vcpu is not throttled, imho. Ok. + +if (!dirtylimit_cancel_vcpu(cpu_index)) { +dirtylimit_calc_quit(); +} +} +} [...] +struct DirtyLimitInfoList *qmp_query_vcpu_dirty_limit(bool has_cpu_index, + int64_t cpu_index, + Error **errp) +{ +DirtyLimitInfo *info = NULL; +DirtyLimitInfoList *head = NULL, **tail = &head; + +if (has_cpu_index && +(!dirtylimit_is_vcpu_index_valid(cpu_index))) { +error_setg(errp, "cpu index out of range"); +return NULL; +} + +if (has_cpu_index) { +info = dirtylimit_query_vcpu(cpu_index); +QAPI_LIST_APPEND(tail, info); +} else { +CPUState *cpu; +CPU_FOREACH(cpu) { +if (!cpu->unplug) { +info = dirtylimit_query_vcpu(cpu->cpu_index); +QAPI_LIST_APPEND(tail, info); +} There're special handling for unplug in a few places. Could you explain why? E.g. if the vcpu is unplugged then dirty rate is zero, then it seems fine to even keep it there? The dirty limit logic only allow plugged vcpu to be enabled throttle, so that the "dirtylimit-{cpu-index}" thread don't need to be forked and we can save the overhead. So in query logic we just filter the unplugged vcpu. Another reason is that i thought it could make user confused when we return the unplugged vcpu dirtylimit info. Uh, in most time of vm lifecycle, hotplugging vcpu may never happen. +} +} + +return head; +}
Re: [PATCH v9 3/3] cpus-common: implement dirty page limit on vCPU
在 2021/12/6 16:39, Peter Xu 写道: On Fri, Dec 03, 2021 at 09:39:47AM +0800, huang...@chinatelecom.cn wrote: +void dirtylimit_setup(int max_cpus) +{ +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +return; +} + +dirtylimit_calc_state_init(max_cpus); +dirtylimit_state_init(max_cpus); +} [...] diff --git a/softmmu/vl.c b/softmmu/vl.c index 620a1f1..0f83ce3 100644 --- a/softmmu/vl.c +++ b/softmmu/vl.c @@ -3777,5 +3777,6 @@ void qemu_init(int argc, char **argv, char **envp) qemu_init_displays(); accel_setup_post(current_machine); os_setup_post(); +dirtylimit_setup(current_machine->smp.max_cpus); resume_mux_open(); Can we do the init only when someone enables it? We could also do proper free() for the structs when it's globally turned off. Yes, i'll try this next version
Re: [PATCH v9 1/3] migration/dirtyrate: implement vCPU dirtyrate calculation periodically
On 2021/12/6 18:18, Peter Xu wrote: On Fri, Dec 03, 2021 at 09:39:45AM +0800, huang...@chinatelecom.cn wrote:

+static void dirtylimit_calc_func(void)
+{
+    CPUState *cpu;
+    DirtyPageRecord *dirty_pages;
+    int64_t start_time, end_time, calc_time;
+    DirtyRateVcpu rate;
+    int i = 0;
+
+    dirty_pages = g_malloc0(sizeof(*dirty_pages) *
+                            dirtylimit_calc_state->data.nvcpu);
+
+    CPU_FOREACH(cpu) {
+        record_dirtypages(dirty_pages, cpu, true);
+    }
+
+    start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    g_usleep(DIRTYLIMIT_CALC_TIME_MS * 1000);
+    end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    calc_time = end_time - start_time;
+
+    qemu_mutex_lock_iothread();
+    memory_global_dirty_log_sync();
+    qemu_mutex_unlock_iothread();
+
+    CPU_FOREACH(cpu) {
+        record_dirtypages(dirty_pages, cpu, false);
+    }
+
+    for (i = 0; i < dirtylimit_calc_state->data.nvcpu; i++) {
+        uint64_t increased_dirty_pages =
+            dirty_pages[i].end_pages - dirty_pages[i].start_pages;
+        uint64_t memory_size_MB =
+            (increased_dirty_pages * TARGET_PAGE_SIZE) >> 20;
+        int64_t dirtyrate = (memory_size_MB * 1000) / calc_time;
+
+        rate.id = i;
+        rate.dirty_rate = dirtyrate;
+        dirtylimit_calc_state->data.rates[i] = rate;
+
+        trace_dirtyrate_do_calculate_vcpu(i,
+            dirtylimit_calc_state->data.rates[i].dirty_rate);
+    }
+}

This looks very much like the calc-dirty-rate code already. I think adding a new reason (GLOBAL_DIRTY_LIMIT) is fine. Ok. However still, any chance to merge the code? I'm not sure about merging, but I'll try it. :)
Re: [PATCH v9 3/3] cpus-common: implement dirty page limit on vCPU
On 2021/12/7 10:24, Peter Xu wrote: On Mon, Dec 06, 2021 at 10:56:00PM +0800, Hyman wrote: I found that it'll be challenging for any human being to identify "whether he/she has turned throttle off for all vcpus". I think that could be useful when we finally decide to cancel the current migration. That's a good question; how about adding an optional argument "global" and making "cpu-index", "enable" and "dirty-rate" all optional in "vcpu-dirty-limit", keeping the "cpu-index" and "global" options mutually exclusive?

{ 'command': 'vcpu-dirty-limit',
  'data': { '*cpu-index': 'int',
            '*global': 'bool',
            '*enable': 'bool',
            '*dirty-rate': 'uint64'} }

In the case of enabling throttle for all vCPUs: use either "global=true,enable=true,dirty-rate=XXX" or "global=true,dirty-rate=XXX". In the case of disabling throttle for all vCPUs: use "global=true,enable=false,dirty-rate=XXX". In the other cases, we pass the same options just as we did for per-vCPU throttle before. Could we merge "cpu-index" and "global" somehow? They're mutually exclusive. For example, merge them into one "vcpu" parameter: "vcpu=all" means global, "vcpu=1" means vCPU 1. But then we'll need to make it a string. OK, sounds good.
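For illustration, a tiny sketch of how a merged "vcpu" string argument could be parsed ("all" selects every vCPU, otherwise a numeric index). The helper name and signature are made up here, not taken from the series.

    #include <errno.h>
    #include <stdbool.h>
    #include <stdlib.h>
    #include <string.h>

    /* Returns true on success; *all is set for "vcpu=all", otherwise
     * *index holds the parsed vCPU index. */
    static bool parse_vcpu_arg(const char *arg, bool *all, long *index)
    {
        char *end;
        long val;

        if (!strcmp(arg, "all")) {
            *all = true;
            return true;
        }

        errno = 0;
        val = strtol(arg, &end, 10);
        if (errno || end == arg || *end != '\0' || val < 0) {
            return false;   /* neither "all" nor a valid non-negative index */
        }

        *all = false;
        *index = val;
        return true;
    }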
Re: [PATCH v9 3/3] cpus-common: implement dirty page limit on vCPU
在 2021/12/7 10:57, Peter Xu 写道: On Mon, Dec 06, 2021 at 11:19:21PM +0800, Hyman wrote: +if (has_cpu_index) { +info = dirtylimit_query_vcpu(cpu_index); +QAPI_LIST_APPEND(tail, info); +} else { +CPUState *cpu; +CPU_FOREACH(cpu) { +if (!cpu->unplug) { +info = dirtylimit_query_vcpu(cpu->cpu_index); +QAPI_LIST_APPEND(tail, info); +} There're special handling for unplug in a few places. Could you explain why? E.g. if the vcpu is unplugged then dirty rate is zero, then it seems fine to even keep it there? The dirty limit logic only allow plugged vcpu to be enabled throttle, so that the "dirtylimit-{cpu-index}" thread don't need to be forked and we can save the overhead. So in query logic we just filter the unplugged vcpu. I've commented similarly in the other thread - please consider not using NVCPU threads only for vcpu throttling, irrelevant of vcpu hot plug/unplug. Per-vcpu throttle is totally not a cpu intensive workload, 1 thread should be enough globally, imho. A guest with hundreds of vcpus are becoming more common, we shouldn't waste OS thread resources just for this. Ok, i'll try this out next version Another reason is that i thought it could make user confused when we return the unplugged vcpu dirtylimit info. Uh, in most time of vm lifecycle, hotplugging vcpu may never happen. I just think if plug/unplug does not affect the throttle logic then we should treat them the same, it avoids unnecessary special care on those vcpus too. Indeed, i'm struggling too :), i'll remove the plug/unplug logic the next version.
Re: [PATCH v9 2/3] cpu-throttle: implement vCPU throttle
在 2021/12/6 18:10, Peter Xu 写道: On Fri, Dec 03, 2021 at 09:39:46AM +0800, huang...@chinatelecom.cn wrote: +static uint64_t dirtylimit_pct(unsigned int last_pct, + uint64_t quota, + uint64_t current) +{ +uint64_t limit_pct = 0; +RestrainPolicy policy; +bool mitigate = (quota > current) ? true : false; + +if (mitigate && ((current == 0) || +(last_pct <= DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE))) { +return 0; +} + +policy = dirtylimit_policy(last_pct, quota, current); +switch (policy) { +case RESTRAIN_SLIGHT: +/* [90, 99] */ +if (mitigate) { +limit_pct = +last_pct - DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE; +} else { +limit_pct = +last_pct + DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE; + +limit_pct = MIN(limit_pct, CPU_THROTTLE_PCT_MAX); +} + break; +case RESTRAIN_HEAVY: +/* [75, 90) */ +if (mitigate) { +limit_pct = +last_pct - DIRTYLIMIT_THROTTLE_HEAVY_STEP_SIZE; +} else { +limit_pct = +last_pct + DIRTYLIMIT_THROTTLE_HEAVY_STEP_SIZE; + +limit_pct = MIN(limit_pct, +DIRTYLIMIT_THROTTLE_SLIGHT_WATERMARK); +} + break; +case RESTRAIN_RATIO: +/* [0, 75) */ +if (mitigate) { +if (last_pct <= (((quota - current) * 100 / quota))) { +limit_pct = 0; +} else { +limit_pct = last_pct - +((quota - current) * 100 / quota); +limit_pct = MAX(limit_pct, CPU_THROTTLE_PCT_MIN); +} +} else { +limit_pct = last_pct + +((current - quota) * 100 / current); + +limit_pct = MIN(limit_pct, +DIRTYLIMIT_THROTTLE_HEAVY_WATERMARK); +} + break; +case RESTRAIN_KEEP: +default: + limit_pct = last_pct; + break; +} + +return limit_pct; +} + +static void *dirtylimit_thread(void *opaque) +{ +int cpu_index = *(int *)opaque; +uint64_t quota_dirtyrate, current_dirtyrate; +unsigned int last_pct = 0; +unsigned int pct = 0; + +rcu_register_thread(); + +quota_dirtyrate = dirtylimit_quota(cpu_index); +current_dirtyrate = dirtylimit_current(cpu_index); + +pct = dirtylimit_init_pct(quota_dirtyrate, current_dirtyrate); + +do { +trace_dirtylimit_impose(cpu_index, +quota_dirtyrate, current_dirtyrate, pct); + +last_pct = pct; +if (pct == 0) { +sleep(DIRTYLIMIT_CALC_PERIOD_TIME_S); +} else { +dirtylimit_check(cpu_index, pct); +} + +quota_dirtyrate = dirtylimit_quota(cpu_index); +current_dirtyrate = dirtylimit_current(cpu_index); + +pct = dirtylimit_pct(last_pct, quota_dirtyrate, current_dirtyrate); So what I had in mind is we can start with an extremely simple version of negative feedback system. Say, firstly each vcpu will have a simple number to sleep for some interval (this is ugly code, but just show what I meant..): === diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index eecd8031cf..c320fd190f 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -2932,6 +2932,8 @@ int kvm_cpu_exec(CPUState *cpu) trace_kvm_dirty_ring_full(cpu->cpu_index); qemu_mutex_lock_iothread(); kvm_dirty_ring_reap(kvm_state); +if (dirtylimit_enabled(cpu->cpu_index) && cpu->throttle_us_per_full) +usleep(cpu->throttle_us_per_full); qemu_mutex_unlock_iothread(); ret = 0; break; === I think this will have finer granularity when throttle (for 4096 ring size, that's per-16MB operation) than current way where we inject per-vcpu async task to sleep, like auto-converge. 
Then we have the "black box" to tune this value with below input/output: - Input: dirty rate information, same as current algo - Output: increase/decrease of per-vcpu throttle_us_per_full above, and that's all We can do the sampling per-second, then we keep doing it: we can have 1 thread doing per-second task collecting dirty rate information for all the vcpus, then tune that throttle_us_per_full for each of them. The simplest linear algorithm would be as simple as (for each vcpu): if (quota < current) throttle_us_per_full += SOMETHING; if (throttle_us_per_full > MAX) throttle_us_per_full = MAX; else throttle_us_per_full -= SOMETHING; if (throttle_us_per_full < 0) throttle_us_per_full = 0; I think your algorithm is fine, but thoroughly review every single bit of it in one shot will be challenging, and it's also hard to prove every bit of the algorithm is helpful, as there're a lot of hand-made macros and state changes. I actually tes
Re: [PATCH v9 2/3] cpu-throttle: implement vCPU throttle
在 2021/12/8 23:36, Hyman 写道: 在 2021/12/6 18:10, Peter Xu 写道: On Fri, Dec 03, 2021 at 09:39:46AM +0800, huang...@chinatelecom.cn wrote: +static uint64_t dirtylimit_pct(unsigned int last_pct, + uint64_t quota, + uint64_t current) +{ + uint64_t limit_pct = 0; + RestrainPolicy policy; + bool mitigate = (quota > current) ? true : false; + + if (mitigate && ((current == 0) || + (last_pct <= DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE))) { + return 0; + } + + policy = dirtylimit_policy(last_pct, quota, current); + switch (policy) { + case RESTRAIN_SLIGHT: + /* [90, 99] */ + if (mitigate) { + limit_pct = + last_pct - DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE; + } else { + limit_pct = + last_pct + DIRTYLIMIT_THROTTLE_SLIGHT_STEP_SIZE; + + limit_pct = MIN(limit_pct, CPU_THROTTLE_PCT_MAX); + } + break; + case RESTRAIN_HEAVY: + /* [75, 90) */ + if (mitigate) { + limit_pct = + last_pct - DIRTYLIMIT_THROTTLE_HEAVY_STEP_SIZE; + } else { + limit_pct = + last_pct + DIRTYLIMIT_THROTTLE_HEAVY_STEP_SIZE; + + limit_pct = MIN(limit_pct, + DIRTYLIMIT_THROTTLE_SLIGHT_WATERMARK); + } + break; + case RESTRAIN_RATIO: + /* [0, 75) */ + if (mitigate) { + if (last_pct <= (((quota - current) * 100 / quota))) { + limit_pct = 0; + } else { + limit_pct = last_pct - + ((quota - current) * 100 / quota); + limit_pct = MAX(limit_pct, CPU_THROTTLE_PCT_MIN); + } + } else { + limit_pct = last_pct + + ((current - quota) * 100 / current); + + limit_pct = MIN(limit_pct, + DIRTYLIMIT_THROTTLE_HEAVY_WATERMARK); + } + break; + case RESTRAIN_KEEP: + default: + limit_pct = last_pct; + break; + } + + return limit_pct; +} + +static void *dirtylimit_thread(void *opaque) +{ + int cpu_index = *(int *)opaque; + uint64_t quota_dirtyrate, current_dirtyrate; + unsigned int last_pct = 0; + unsigned int pct = 0; + + rcu_register_thread(); + + quota_dirtyrate = dirtylimit_quota(cpu_index); + current_dirtyrate = dirtylimit_current(cpu_index); + + pct = dirtylimit_init_pct(quota_dirtyrate, current_dirtyrate); + + do { + trace_dirtylimit_impose(cpu_index, + quota_dirtyrate, current_dirtyrate, pct); + + last_pct = pct; + if (pct == 0) { + sleep(DIRTYLIMIT_CALC_PERIOD_TIME_S); + } else { + dirtylimit_check(cpu_index, pct); + } + + quota_dirtyrate = dirtylimit_quota(cpu_index); + current_dirtyrate = dirtylimit_current(cpu_index); + + pct = dirtylimit_pct(last_pct, quota_dirtyrate, current_dirtyrate); So what I had in mind is we can start with an extremely simple version of negative feedback system. Say, firstly each vcpu will have a simple number to sleep for some interval (this is ugly code, but just show what I meant..): === diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index eecd8031cf..c320fd190f 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -2932,6 +2932,8 @@ int kvm_cpu_exec(CPUState *cpu) trace_kvm_dirty_ring_full(cpu->cpu_index); qemu_mutex_lock_iothread(); kvm_dirty_ring_reap(kvm_state); + if (dirtylimit_enabled(cpu->cpu_index) && cpu->throttle_us_per_full) + usleep(cpu->throttle_us_per_full); qemu_mutex_unlock_iothread(); ret = 0; break; === I think this will have finer granularity when throttle (for 4096 ring size, that's per-16MB operation) than current way where we inject per-vcpu async task to sleep, like auto-converge. 
Then we have the "black box" to tune this value with below input/output: - Input: dirty rate information, same as current algo - Output: increase/decrease of per-vcpu throttle_us_per_full above, and that's all We can do the sampling per-second, then we keep doing it: we can have 1 thread doing per-second task collecting dirty rate information for all the vcpus, then tune that throttle_us_per_full for each of them. The simplest linear algorithm would be as simple as (for each vcpu): if (quota < current) throttle_us_per_full += SOMETHING; if (throttle_us_per_full > MAX) throttle_us_per_full = MAX; else throttle_us_per_full -= SOMETHING; if (throttle_us_per_full < 0) throttle_us_per_full = 0; I think your algorithm is fine, but thoroughly review every single bit of it in one shot will be challenging, and it's also har
[PATCH QEMU v3 1/3] tests: Add migration dirty-limit capability test
From: Hyman Huang(黄勇) Add migration dirty-limit capability test if kernel support dirty ring. Migration dirty-limit capability introduce dirty limit capability, two parameters: x-vcpu-dirty-limit-period and vcpu-dirty-limit are introduced to implement the live migration with dirty limit. The test case does the following things: 1. start src, dst vm and enable dirty-limit capability 2. start migrate and set cancel it to check if dirty limit stop working. 3. restart dst vm 4. start migrate and enable dirty-limit capability 5. check if migration satisfy the convergence condition during pre-switchover phase. Note that this test case involves many passes, so it runs in slow mode only. Signed-off-by: Hyman Huang(黄勇) Message-Id: <169073391195.19893.61067537833811032...@git.sr.ht> --- tests/qtest/migration-test.c | 164 +++ 1 file changed, 164 insertions(+) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index 62d3f37021..0be2d17c42 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -2739,6 +2739,166 @@ static void test_vcpu_dirty_limit(void) dirtylimit_stop_vm(vm); } +static void migrate_dirty_limit_wait_showup(QTestState *from, +const int64_t period, +const int64_t value) +{ +/* Enable dirty limit capability */ +migrate_set_capability(from, "dirty-limit", true); + +/* Set dirty limit parameters */ +migrate_set_parameter_int(from, "x-vcpu-dirty-limit-period", period); +migrate_set_parameter_int(from, "vcpu-dirty-limit", value); + +/* Make sure migrate can't converge */ +migrate_ensure_non_converge(from); + +/* To check limit rate after precopy */ +migrate_set_capability(from, "pause-before-switchover", true); + +/* Wait for the serial output from the source */ +wait_for_serial("src_serial"); +} + +/* + * This test does: + * source destination + * start vm + * start incoming vm + * migrate + * wait dirty limit to begin + * cancel migrate + * cancellation check + * restart incoming vm + * migrate + * wait dirty limit to begin + * wait pre-switchover event + * convergence condition check + * + * And see if dirty limit migration works correctly. + * This test case involves many passes, so it runs in slow mode only. + */ +static void test_migrate_dirty_limit(void) +{ +g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); +QTestState *from, *to; +int64_t remaining; +uint64_t throttle_us_per_full; +/* + * We want the test to be stable and as fast as possible. + * E.g., with 1Gb/s bandwith migration may pass without dirty limit, + * so we need to decrease a bandwidth. + */ +const int64_t dirtylimit_period = 1000, dirtylimit_value = 50; +const int64_t max_bandwidth = 4; /* ~400Mb/s */ +const int64_t downtime_limit = 250; /* 250ms */ +/* + * We migrate through unix-socket (> 500Mb/s). + * Thus, expected migration speed ~= bandwidth limit (< 500Mb/s). 
+ * So, we can predict expected_threshold + */ +const int64_t expected_threshold = max_bandwidth * downtime_limit / 1000; +int max_try_count = 10; +MigrateCommon args = { +.start = { +.hide_stderr = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Start src, dst vm */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Prepare for dirty limit migration and wait src vm show up */ +migrate_dirty_limit_wait_showup(from, dirtylimit_period, dirtylimit_value); + +/* Start migrate */ +migrate_qmp(from, uri, "{}"); + +/* Wait for dirty limit throttle begin */ +throttle_us_per_full = 0; +while (throttle_us_per_full == 0) { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} + +/* Now cancel migrate and wait for dirty limit throttle switch off */ +migrate_cancel(from); +wait_for_migration_status(from, "cancelled", NULL); + +/* Check if dirty limit throttle switched off, set timeout 1ms */ +do { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} while (throttle_us_per_full != 0 && --max_try_count); + +/* Assert dirty limit is not in service */ +g_assert_cmpint(throttle_us_per_full, ==, 0); + +args = (MigrateCommon) { +.start = { +
[PATCH QEMU v3 2/3] tests/migration: Introduce dirty-ring-size option into guestperf
From: Hyman Huang(黄勇) Dirty ring size configuration is not supported by guestperf tool. Introduce dirty-ring-size (ranges in [1024, 65536]) option so developers can play with dirty-ring and dirty-limit feature easier. To set dirty ring size with 4096 during migration test: $ ./tests/migration/guestperf.py --dirty-ring-size 4096 xxx Signed-off-by: Hyman Huang(黄勇) Message-Id: <169073391195.19893.61067537833811032...@git.sr.ht> --- tests/migration/guestperf/engine.py | 6 +- tests/migration/guestperf/hardware.py | 8 ++-- tests/migration/guestperf/shell.py| 6 +- 3 files changed, 16 insertions(+), 4 deletions(-) diff --git a/tests/migration/guestperf/engine.py b/tests/migration/guestperf/engine.py index e69d16a62c..29ebb5011b 100644 --- a/tests/migration/guestperf/engine.py +++ b/tests/migration/guestperf/engine.py @@ -325,7 +325,6 @@ class Engine(object): cmdline = "'" + cmdline + "'" argv = [ -"-accel", "kvm", "-cpu", "host", "-kernel", self._kernel, "-initrd", self._initrd, @@ -333,6 +332,11 @@ class Engine(object): "-m", str((hardware._mem * 1024) + 512), "-smp", str(hardware._cpus), ] +if hardware._dirty_ring_size: +argv.extend(["-accel", "kvm,dirty-ring-size=%s" % + hardware._dirty_ring_size]) +else: +argv.extend(["-accel", "kvm"]) argv.extend(self._get_qemu_serial_args()) diff --git a/tests/migration/guestperf/hardware.py b/tests/migration/guestperf/hardware.py index 3145785ffd..f779cc050b 100644 --- a/tests/migration/guestperf/hardware.py +++ b/tests/migration/guestperf/hardware.py @@ -23,7 +23,8 @@ class Hardware(object): src_cpu_bind=None, src_mem_bind=None, dst_cpu_bind=None, dst_mem_bind=None, prealloc_pages = False, - huge_pages=False, locked_pages=False): + huge_pages=False, locked_pages=False, + dirty_ring_size=0): self._cpus = cpus self._mem = mem # GiB self._src_mem_bind = src_mem_bind # List of NUMA nodes @@ -33,6 +34,7 @@ class Hardware(object): self._prealloc_pages = prealloc_pages self._huge_pages = huge_pages self._locked_pages = locked_pages +self._dirty_ring_size = dirty_ring_size def serialize(self): @@ -46,6 +48,7 @@ class Hardware(object): "prealloc_pages": self._prealloc_pages, "huge_pages": self._huge_pages, "locked_pages": self._locked_pages, +"dirty_ring_size": self._dirty_ring_size, } @classmethod @@ -59,4 +62,5 @@ class Hardware(object): data["dst_mem_bind"], data["prealloc_pages"], data["huge_pages"], -data["locked_pages"]) +data["locked_pages"], +data["dirty_ring_size"]) diff --git a/tests/migration/guestperf/shell.py b/tests/migration/guestperf/shell.py index 8a809e3dda..7d6b8cd7cf 100644 --- a/tests/migration/guestperf/shell.py +++ b/tests/migration/guestperf/shell.py @@ -60,6 +60,8 @@ class BaseShell(object): parser.add_argument("--prealloc-pages", dest="prealloc_pages", default=False) parser.add_argument("--huge-pages", dest="huge_pages", default=False) parser.add_argument("--locked-pages", dest="locked_pages", default=False) +parser.add_argument("--dirty-ring-size", dest="dirty_ring_size", +default=0, type=int) self._parser = parser @@ -89,7 +91,9 @@ class BaseShell(object): locked_pages=args.locked_pages, huge_pages=args.huge_pages, -prealloc_pages=args.prealloc_pages) +prealloc_pages=args.prealloc_pages, + +dirty_ring_size=args.dirty_ring_size) class Shell(BaseShell): -- 2.38.5
[PATCH QEMU v3 0/3] migration: enrich the dirty-limit test case
Ping This version is a copy of version 2 and is rebased on the master. No functional changes. The dirty-limit migration test involves many passes and takes about 1 minute on average, so put it in the slow mode of migration-test. Inspired by Peter. V2: - put the dirty-limit migration test in slow mode and enrich the test case comment Dirty-limit feature was introduced in 8.1, and the test case could be enriched to make sure the behavior and the performance of dirty-limit is exactly what we want. This series adds 2 test cases, the first commit aims for the functional test and the others aim for the performance test. Please review, thanks. Yong. Hyman Huang(黄勇) (3): tests: Add migration dirty-limit capability test tests/migration: Introduce dirty-ring-size option into guestperf tests/migration: Introduce dirty-limit into guestperf tests/migration/guestperf/comparison.py | 23 tests/migration/guestperf/engine.py | 23 +++- tests/migration/guestperf/hardware.py | 8 +- tests/migration/guestperf/progress.py | 16 ++- tests/migration/guestperf/scenario.py | 11 +- tests/migration/guestperf/shell.py | 24 +++- tests/qtest/migration-test.c| 164 7 files changed, 261 insertions(+), 8 deletions(-) -- 2.38.5
[PATCH QEMU v3 3/3] tests/migration: Introduce dirty-limit into guestperf
From: Hyman Huang(黄勇) Currently, guestperf does not cover the dirty-limit migration, support this feature. Note that dirty-limit requires 'dirty-ring-size' set. To enable dirty-limit, setting x-vcpu-dirty-limit-period as 500ms and x-vcpu-dirty-limit as 10MB/s: $ ./tests/migration/guestperf.py \ --dirty-ring-size 4096 \ --dirty-limit --x-vcpu-dirty-limit-period 500 \ --vcpu-dirty-limit 10 --output output.json \ To run the entire standardized set of dirty-limit-enabled comparisons, with unix migration: $ ./tests/migration/guestperf-batch.py \ --dirty-ring-size 4096 \ --dst-host localhost --transport unix \ --filter compr-dirty-limit* --output outputdir Signed-off-by: Hyman Huang(黄勇) Message-Id: <169073391195.19893.61067537833811032...@git.sr.ht> --- tests/migration/guestperf/comparison.py | 23 +++ tests/migration/guestperf/engine.py | 17 + tests/migration/guestperf/progress.py | 16 ++-- tests/migration/guestperf/scenario.py | 11 ++- tests/migration/guestperf/shell.py | 18 +- 5 files changed, 81 insertions(+), 4 deletions(-) diff --git a/tests/migration/guestperf/comparison.py b/tests/migration/guestperf/comparison.py index c03b3f6d7e..42cc0372d1 100644 --- a/tests/migration/guestperf/comparison.py +++ b/tests/migration/guestperf/comparison.py @@ -135,4 +135,27 @@ COMPARISONS = [ Scenario("compr-multifd-channels-64", multifd=True, multifd_channels=64), ]), + +# Looking at effect of dirty-limit with +# varying x_vcpu_dirty_limit_period +Comparison("compr-dirty-limit-period", scenarios = [ +Scenario("compr-dirty-limit-period-500", + dirty_limit=True, x_vcpu_dirty_limit_period=500), +Scenario("compr-dirty-limit-period-800", + dirty_limit=True, x_vcpu_dirty_limit_period=800), +Scenario("compr-dirty-limit-period-1000", + dirty_limit=True, x_vcpu_dirty_limit_period=1000), +]), + + +# Looking at effect of dirty-limit with +# varying vcpu_dirty_limit +Comparison("compr-dirty-limit", scenarios = [ +Scenario("compr-dirty-limit-10MB", + dirty_limit=True, vcpu_dirty_limit=10), +Scenario("compr-dirty-limit-20MB", + dirty_limit=True, vcpu_dirty_limit=20), +Scenario("compr-dirty-limit-50MB", + dirty_limit=True, vcpu_dirty_limit=50), +]), ] diff --git a/tests/migration/guestperf/engine.py b/tests/migration/guestperf/engine.py index 29ebb5011b..93a6f78e46 100644 --- a/tests/migration/guestperf/engine.py +++ b/tests/migration/guestperf/engine.py @@ -102,6 +102,8 @@ class Engine(object): info.get("expected-downtime", 0), info.get("setup-time", 0), info.get("cpu-throttle-percentage", 0), +info.get("dirty-limit-throttle-time-per-round", 0), +info.get("dirty-limit-ring-full-time", 0), ) def _migrate(self, hardware, scenario, src, dst, connect_uri): @@ -203,6 +205,21 @@ class Engine(object): resp = dst.command("migrate-set-parameters", multifd_channels=scenario._multifd_channels) +if scenario._dirty_limit: +if not hardware._dirty_ring_size: +raise Exception("dirty ring size must be configured when " +"testing dirty limit migration") + +resp = src.command("migrate-set-capabilities", + capabilities = [ + { "capability": "dirty-limit", + "state": True } + ]) +resp = src.command("migrate-set-parameters", +x_vcpu_dirty_limit_period=scenario._x_vcpu_dirty_limit_period) +resp = src.command("migrate-set-parameters", + vcpu_dirty_limit=scenario._vcpu_dirty_limit) + resp = src.command("migrate", uri=connect_uri) post_copy = False diff --git a/tests/migration/guestperf/progress.py b/tests/migration/guestperf/progress.py index ab1ee57273..d490584217 100644 --- a/tests/migration/guestperf/progress.py +++ 
b/tests/migration/guestperf/progress.py @@ -81,7 +81,9 @@ class Progress(object): downtime, downtime_expected, setup_time, - throttle_pcent): + throttle_pcent, + dirty_limit_throttle_time_per_round, + dirty_limit_ring_full_time): self._status = status self._ram = ram @@ -91,6 +93,10 @@ class Progress(object):
[PATCH QEMU] docs/migration: Add the dirty limit section
From: Hyman Huang(黄勇) The dirty limit feature has been introduced since the 8.1 QEMU release but has not reflected in the document, add a section for that. Signed-off-by: Hyman Huang(黄勇) --- docs/devel/migration.rst | 70 1 file changed, 70 insertions(+) diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst index c3e1400c0c..4cc83adc8e 100644 --- a/docs/devel/migration.rst +++ b/docs/devel/migration.rst @@ -588,6 +588,76 @@ path. Return path - opened by main thread, written by main thread AND postcopy thread (protected by rp_mutex) +Dirty limit += +The dirty limit, short for dirty page rate upper limit, is a new capability +introduced in the 8.1 QEMU release that uses a new algorithm based on the KVM +dirty ring to throttle down the guest during live migration. + +The algorithm framework is as follows: + +:: + + --- + main --> throttle thread > PREPARE(1) < + thread \| | + \ | | +\ V | + -\CALCULATE(2) | + \ | | +\ | | + \ V | + \SET PENALTY(3) - + -\ | + \ | + \V + -> virtual CPU thread ---> ACCEPT PENALTY(4) + --- +When the qmp command qmp_set_vcpu_dirty_limit is called for the first time, +the QEMU main thread starts the throttle thread. The throttle thread, once +launched, executes the loop, which consists of three steps: + + - PREPARE (1) + + The entire work of PREPARE (1) is prepared for the second stage, + CALCULATE(2), as the name implies. It involves preparing the dirty + page rate value and the corresponding upper limit of the VM: + The dirty page rate is calculated via the KVM dirty ring mechanism, + which tells QEMU how many dirty pages a virtual CPU has had since the + last KVM_EXIT_DIRTY_RING_RULL exception; The dirty page rate upper + limit is specified by caller, therefore fetch it directly. + + - CALCULATE (2) + + Calculate a suitable sleep period for each virtual CPU, which will be + used to determine the penalty for the target virtual CPU. The + computation must be done carefully in order to reduce the dirty page + rate progressively down to the upper limit without oscillation. To + achieve this, two strategies are provided: the first is to add or + subtract sleep time based on the ratio of the current dirty page rate + to the limit, which is used when the current dirty page rate is far + from the limit; the second is to add or subtract a fixed time when + the current dirty page rate is close to the limit. + + - SET PENALTY (3) + + Set the sleep time for each virtual CPU that should be penalized based + on the results of the calculation supplied by step CALCULATE (2). + +After completing the three above stages, the throttle thread loops back +to step PREPARE (1) until the dirty limit is reached. + +On the other hand, each virtual CPU thread reads the sleep duration and +sleeps in the path of the KVM_EXIT_DIRTY_RING_RULL exception handler, that +is ACCEPT PENALTY (4). Virtual CPUs tied with writing processes will +obviously exit to the path and get penalized, whereas virtual CPUs involved +with read processes will not. + +In summary, thanks to the KVM dirty ring technology, the dirty limit +algorithm will restrict virtual CPUs as needed to keep their dirty page +rate inside the limit. This leads to more steady reading performance during +live migration and can aid in improving large guest responsiveness. + Postcopy -- 2.38.5
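As a rough model of why the sleep in ACCEPT PENALTY (4) bounds the dirty page rate: a vCPU exits with the ring-full event roughly once per dirty-ring-size dirtied pages, so adding sleep time to each round stretches the time it takes to dirty a fixed amount of memory. The numbers below (4 KiB pages, a 4096-entry ring filled in about 40 ms) are illustrative assumptions, not measurements from the series:
===
/* Back-of-the-envelope model of the dirty-rate cap. Assumptions: 4 KiB
 * pages, one exit per dirty-ring-size dirtied pages, and the only
 * throttling is the sleep added in the ring-full exit path. */
#include <stdio.h>
#include <stdint.h>

static double capped_rate_mbps(uint64_t ring_size, double ring_full_us,
                               double sleep_us)
{
    double bytes_per_round = ring_size * 4096.0;
    double round_us = ring_full_us + sleep_us;
    return bytes_per_round / round_us;   /* bytes per us == MB per second */
}

int main(void)
{
    /* e.g. a 4096-entry ring that a busy vCPU fills in ~40 ms */
    double unthrottled = capped_rate_mbps(4096, 40000.0, 0.0);
    double throttled   = capped_rate_mbps(4096, 40000.0, 120000.0);

    printf("unthrottled ~ %.0f MB/s, with 120 ms sleep ~ %.0f MB/s\n",
           unthrottled, throttled);
    return 0;
}
===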
[PATCH QEMU v2 2/3] virtio-blk-pci: introduce auto-num-queues property
From: Hyman Huang(黄勇) Commit "9445e1e15 virtio-blk-pci: default num_queues to -smp N" implment sizing the number of virtio-blk-pci request virtqueues to match the number of vCPUs automatically. Which improves IO preformance remarkably. To enable this feature for the existing VMs, the cloud platform may migrate VMs from the source hypervisor (num_queues is set to 1 by default) to the destination hypervisor (num_queues is set to -smp N) lively. The different num-queues for virtio-blk-pci devices between the source side and the destination side will result in migration failure due to loading vmstate incorrectly on the destination side. To provide a smooth upgrade solution, introduce the auto-num-queues property for the virtio-blk-pci device. This allows upper APPs, e.g., libvirt, to recognize the hypervisor's capability of allocating the virtqueues automatically by probing the virtio-blk-pci.auto-num-queues property. Basing on which, upper APPs can ensure to allocate the same num-queues on the destination side in case of migration failure. Signed-off-by: Hyman Huang(黄勇) --- hw/block/virtio-blk.c | 1 + hw/virtio/virtio-blk-pci.c | 9 - include/hw/virtio/virtio-blk.h | 5 + 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index 39e7f23fab..9e498ca64a 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -1716,6 +1716,7 @@ static Property virtio_blk_properties[] = { #endif DEFINE_PROP_BIT("request-merging", VirtIOBlock, conf.request_merging, 0, true), +DEFINE_PROP_BOOL("auto-num-queues", VirtIOBlock, auto_num_queues, true), DEFINE_PROP_UINT16("num-queues", VirtIOBlock, conf.num_queues, VIRTIO_BLK_AUTO_NUM_QUEUES), DEFINE_PROP_UINT16("queue-size", VirtIOBlock, conf.queue_size, 256), diff --git a/hw/virtio/virtio-blk-pci.c b/hw/virtio/virtio-blk-pci.c index 9743bee965..4b6b4c4933 100644 --- a/hw/virtio/virtio-blk-pci.c +++ b/hw/virtio/virtio-blk-pci.c @@ -54,7 +54,14 @@ static void virtio_blk_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp) VirtIOBlkConf *conf = &dev->vdev.conf; if (conf->num_queues == VIRTIO_BLK_AUTO_NUM_QUEUES) { -conf->num_queues = virtio_pci_optimal_num_queues(0); +/* + * Allocate virtqueues automatically only if auto_num_queues + * property set true. + */ +if (dev->vdev.auto_num_queues) +conf->num_queues = virtio_pci_optimal_num_queues(0); +else +conf->num_queues = 1; } if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) { diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h index dafec432ce..dab6d7c70c 100644 --- a/include/hw/virtio/virtio-blk.h +++ b/include/hw/virtio/virtio-blk.h @@ -65,6 +65,11 @@ struct VirtIOBlock { uint64_t host_features; size_t config_size; BlockRAMRegistrar blk_ram_registrar; +/* + * Set to true if virtqueues allow to be allocated to + * match the number of virtual CPUs automatically. + */ +bool auto_num_queues; }; typedef struct VirtIOBlockReq { -- 2.38.5
[PATCH QEMU v2 0/3] provide a smooth upgrade solution for multi-queues disk
Ping, This version is a copy of version 1 and is rebased on the master. No functional changes. A 1:1 virtqueue:vCPU mapping implementation for virtio-*-pci disk introduced since qemu >= 5.2.0, which improves IO performance remarkably. To enjoy this feature for exiting running VMs without service interruption, the common solution is to migrate VMs from the lower version of the hypervisor to the upgraded hypervisor, then wait for the next cold reboot of the VM to enable this feature. That's the way "discard" and "write-zeroes" features work. As to multi-queues disk allocation automatically, it's a little different because the destination will allocate queues to match the number of vCPUs automatically by default in the case of live migration, and the VMs on the source side remain 1 queue by default, which results in migration failure due to loading disk VMState incorrectly on the destination side. This issue requires Qemu to provide a hint that shows multi-queues disk allocation is automatically supported, and this allows upper APPs, e.g., libvirt, to recognize the hypervisor's capability of this. And upper APPs can ensure to allocate the same num-queues on the destination side in case of migration failure. To fix the issue, we introduce the auto-num-queues property for virtio-*-pci as a solution, which would be probed by APPs, e.g., libvirt by querying the device properties of QEMU. When launching live migration, libvirt will send the auto-num-queues property as a migration cookie to the destination, and thus the destination knows if the source side supports auto-num-queues. If not, the destination would switch off by building the command line with "auto-num-queues=off" when preparing the incoming VM process. The following patches of libvirt show how it roughly works: https://github.com/newfriday/libvirt/commit/ce2bae2e1a6821afeb80756dc01f3680f525e506 https://github.com/newfriday/libvirt/commit/f546972b009458c88148fe079544db7e9e1f43c3 https://github.com/newfriday/libvirt/commit/5ee19c8646fdb4d87ab8b93f287c20925268ce83 The smooth upgrade solution requires the introduction of the auto-num- queues property on the QEMU side, which is what the patch set does. I'm hoping for comments about the series. Please review, thanks. Yong Hyman Huang(黄勇) (3): virtio-scsi-pci: introduce auto-num-queues property virtio-blk-pci: introduce auto-num-queues property vhost-user-blk-pci: introduce auto-num-queues property hw/block/vhost-user-blk.c | 1 + hw/block/virtio-blk.c | 1 + hw/scsi/vhost-scsi.c | 2 ++ hw/scsi/vhost-user-scsi.c | 2 ++ hw/scsi/virtio-scsi.c | 2 ++ hw/virtio/vhost-scsi-pci.c | 11 +-- hw/virtio/vhost-user-blk-pci.c | 9 - hw/virtio/vhost-user-scsi-pci.c| 11 +-- hw/virtio/virtio-blk-pci.c | 9 - hw/virtio/virtio-scsi-pci.c| 11 +-- include/hw/virtio/vhost-user-blk.h | 5 + include/hw/virtio/virtio-blk.h | 5 + include/hw/virtio/virtio-scsi.h| 5 + 13 files changed, 66 insertions(+), 8 deletions(-) -- 2.38.5
[PATCH QEMU v2 1/3] virtio-scsi-pci: introduce auto-num-queues property
From: Hyman Huang(黄勇) Commit "6a55882284 virtio-scsi-pci: default num_queues to -smp N" implment sizing the number of virtio-scsi-pci request virtqueues to match the number of vCPUs automatically. Which improves IO preformance remarkably. To enable this feature for the existing VMs, the cloud platform may migrate VMs from the source hypervisor (num_queues is set to 1 by default) to the destination hypervisor (num_queues is set to -smp N) lively. The different num-queues for virtio-scsi-pci devices between the source side and the destination side will result in migration failure due to loading vmstate incorrectly on the destination side. To provide a smooth upgrade solution, introduce the auto-num-queues property for the virtio-scsi-pci device. This allows upper APPs, e.g., libvirt, to recognize the hypervisor's capability of allocating the virtqueues automatically by probing the virtio-scsi-pci.auto-num-queues property. Basing on which, upper APPs can ensure to allocate the same num-queues on the destination side in case of migration failure. Signed-off-by: Hyman Huang(黄勇) --- hw/scsi/vhost-scsi.c| 2 ++ hw/scsi/vhost-user-scsi.c | 2 ++ hw/scsi/virtio-scsi.c | 2 ++ hw/virtio/vhost-scsi-pci.c | 11 +-- hw/virtio/vhost-user-scsi-pci.c | 11 +-- hw/virtio/virtio-scsi-pci.c | 11 +-- include/hw/virtio/virtio-scsi.h | 5 + 7 files changed, 38 insertions(+), 6 deletions(-) diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c index 443f67daa4..78a8929c49 100644 --- a/hw/scsi/vhost-scsi.c +++ b/hw/scsi/vhost-scsi.c @@ -284,6 +284,8 @@ static Property vhost_scsi_properties[] = { DEFINE_PROP_STRING("vhostfd", VirtIOSCSICommon, conf.vhostfd), DEFINE_PROP_STRING("wwpn", VirtIOSCSICommon, conf.wwpn), DEFINE_PROP_UINT32("boot_tpgt", VirtIOSCSICommon, conf.boot_tpgt, 0), +DEFINE_PROP_BOOL("auto_num_queues", VirtIOSCSICommon, auto_num_queues, + true), DEFINE_PROP_UINT32("num_queues", VirtIOSCSICommon, conf.num_queues, VIRTIO_SCSI_AUTO_NUM_QUEUES), DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSICommon, conf.virtqueue_size, diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c index ee99b19e7a..1b837f370a 100644 --- a/hw/scsi/vhost-user-scsi.c +++ b/hw/scsi/vhost-user-scsi.c @@ -161,6 +161,8 @@ static void vhost_user_scsi_unrealize(DeviceState *dev) static Property vhost_user_scsi_properties[] = { DEFINE_PROP_CHR("chardev", VirtIOSCSICommon, conf.chardev), DEFINE_PROP_UINT32("boot_tpgt", VirtIOSCSICommon, conf.boot_tpgt, 0), +DEFINE_PROP_BOOL("auto_num_queues", VirtIOSCSICommon, auto_num_queues, + true), DEFINE_PROP_UINT32("num_queues", VirtIOSCSICommon, conf.num_queues, VIRTIO_SCSI_AUTO_NUM_QUEUES), DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSICommon, conf.virtqueue_size, diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c index 45b95ea070..2ec13032aa 100644 --- a/hw/scsi/virtio-scsi.c +++ b/hw/scsi/virtio-scsi.c @@ -1279,6 +1279,8 @@ static void virtio_scsi_device_unrealize(DeviceState *dev) } static Property virtio_scsi_properties[] = { +DEFINE_PROP_BOOL("auto_num_queues", VirtIOSCSI, parent_obj.auto_num_queues, + true), DEFINE_PROP_UINT32("num_queues", VirtIOSCSI, parent_obj.conf.num_queues, VIRTIO_SCSI_AUTO_NUM_QUEUES), DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSI, diff --git a/hw/virtio/vhost-scsi-pci.c b/hw/virtio/vhost-scsi-pci.c index 08980bc23b..927c155278 100644 --- a/hw/virtio/vhost-scsi-pci.c +++ b/hw/virtio/vhost-scsi-pci.c @@ -51,8 +51,15 @@ static void vhost_scsi_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp) VirtIOSCSIConf *conf = 
&dev->vdev.parent_obj.parent_obj.conf; if (conf->num_queues == VIRTIO_SCSI_AUTO_NUM_QUEUES) { -conf->num_queues = -virtio_pci_optimal_num_queues(VIRTIO_SCSI_VQ_NUM_FIXED); +/* + * Allocate virtqueues automatically only if auto_num_queues + * property set true. + */ +if (dev->vdev.parent_obj.parent_obj.auto_num_queues) +conf->num_queues = +virtio_pci_optimal_num_queues(VIRTIO_SCSI_VQ_NUM_FIXED); +else +conf->num_queues = 1; } if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) { diff --git a/hw/virtio/vhost-user-scsi-pci.c b/hw/virtio/vhost-user-scsi-pci.c index 75882e3cf9..9c521a7f93 100644 --- a/hw/virtio/vhost-user-scsi-pci.c +++ b/hw/virtio/vhost-user-scsi-pci.c @@ -57,8 +57,15 @@ static void vhost_user_scsi_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp) VirtIOSCSIConf *conf = &dev->vdev.parent_obj.parent_obj.conf; if (conf-
[PATCH QEMU v2 3/3] vhost-user-blk-pci: introduce auto-num-queues property
From: Hyman Huang(黄勇) Commit "a4eef0711b vhost-user-blk-pci: default num_queues to -smp N" implment sizing the number of vhost-user-blk-pci request virtqueues to match the number of vCPUs automatically. Which improves IO preformance remarkably. To enable this feature for the existing VMs, the cloud platform may migrate VMs from the source hypervisor (num_queues is set to 1 by default) to the destination hypervisor (num_queues is set to -smp N) lively. The different num-queues for vhost-user-blk-pci devices between the source side and the destination side will result in migration failure due to loading vmstate incorrectly on the destination side. To provide a smooth upgrade solution, introduce the auto-num-queues property for the vhost-user-blk-pci device. This allows upper APPs, e.g., libvirt, to recognize the hypervisor's capability of allocating the virtqueues automatically by probing the vhost-user-blk-pci.auto-num-queues property. Basing on which, upper APPs can ensure to allocate the same num-queues on the destination side in case of migration failure. Signed-off-by: Hyman Huang(黄勇) --- hw/block/vhost-user-blk.c | 1 + hw/virtio/vhost-user-blk-pci.c | 9 - include/hw/virtio/vhost-user-blk.h | 5 + 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c index eecf3f7a81..34e23b1727 100644 --- a/hw/block/vhost-user-blk.c +++ b/hw/block/vhost-user-blk.c @@ -566,6 +566,7 @@ static const VMStateDescription vmstate_vhost_user_blk = { static Property vhost_user_blk_properties[] = { DEFINE_PROP_CHR("chardev", VHostUserBlk, chardev), +DEFINE_PROP_BOOL("auto-num-queues", VHostUserBlk, auto_num_queues, true), DEFINE_PROP_UINT16("num-queues", VHostUserBlk, num_queues, VHOST_USER_BLK_AUTO_NUM_QUEUES), DEFINE_PROP_UINT32("queue-size", VHostUserBlk, queue_size, 128), diff --git a/hw/virtio/vhost-user-blk-pci.c b/hw/virtio/vhost-user-blk-pci.c index eef8641a98..f7776e928a 100644 --- a/hw/virtio/vhost-user-blk-pci.c +++ b/hw/virtio/vhost-user-blk-pci.c @@ -56,7 +56,14 @@ static void vhost_user_blk_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp) DeviceState *vdev = DEVICE(&dev->vdev); if (dev->vdev.num_queues == VHOST_USER_BLK_AUTO_NUM_QUEUES) { -dev->vdev.num_queues = virtio_pci_optimal_num_queues(0); +/* + * Allocate virtqueues automatically only if auto_num_queues + * property set true. + */ +if (dev->vdev.auto_num_queues) +dev->vdev.num_queues = virtio_pci_optimal_num_queues(0); +else +dev->vdev.num_queues = 1; } if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) { diff --git a/include/hw/virtio/vhost-user-blk.h b/include/hw/virtio/vhost-user-blk.h index ea085ee1ed..e6f0515bc6 100644 --- a/include/hw/virtio/vhost-user-blk.h +++ b/include/hw/virtio/vhost-user-blk.h @@ -50,6 +50,11 @@ struct VHostUserBlk { bool connected; /* vhost_user_blk_start/vhost_user_blk_stop */ bool started_vu; +/* + * Set to true if virtqueues allow to be allocated to + * match the number of virtual CPUs automatically. + */ +bool auto_num_queues; }; #endif -- 2.38.5
[PATCH QEMU v8 1/9] softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit"
From: Hyman Huang(黄勇) dirty_rate paraemter of hmp command "set_vcpu_dirty_limit" is invalid if less than 0, so add parameter check for it. Note that this patch also delete the unsolicited help message and clean up the code. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Peter Xu Reviewed-by: Juan Quintela --- softmmu/dirtylimit.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 015a9038d1..5c12d26d49 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -515,14 +515,15 @@ void hmp_set_vcpu_dirty_limit(Monitor *mon, const QDict *qdict) int64_t cpu_index = qdict_get_try_int(qdict, "cpu_index", -1); Error *err = NULL; -qmp_set_vcpu_dirty_limit(!!(cpu_index != -1), cpu_index, dirty_rate, &err); -if (err) { -hmp_handle_error(mon, err); -return; +if (dirty_rate < 0) { +error_setg(&err, "invalid dirty page limit %ld", dirty_rate); +goto out; } -monitor_printf(mon, "[Please use 'info vcpu_dirty_limit' to query " - "dirty limit for virtual CPU]\n"); +qmp_set_vcpu_dirty_limit(!!(cpu_index != -1), cpu_index, dirty_rate, &err); + +out: +hmp_handle_error(mon, err); } static struct DirtyLimitInfo *dirtylimit_query_vcpu(int cpu_index) -- 2.38.5
[PATCH QEMU v8 2/9] qapi/migration: Introduce x-vcpu-dirty-limit-period parameter
From: Hyman Huang(黄勇) Introduce "x-vcpu-dirty-limit-period" migration experimental parameter, which is in the range of 1 to 1000ms and used to make dirtyrate calculation period configurable. Currently with the "x-vcpu-dirty-limit-period" varies, the total time of live migration changes, test results show the optimal value of "x-vcpu-dirty-limit-period" ranges from 500ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made stable once it proves best value can not be determined with developer's experiments. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/migration-hmp-cmds.c | 8 migration/options.c| 28 +++ qapi/migration.json| 35 +++--- 3 files changed, 64 insertions(+), 7 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 9885d7c9f7..352e9ec716 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -364,6 +364,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) } } } + +monitor_printf(mon, "%s: %" PRIu64 " ms\n", +MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD), +params->x_vcpu_dirty_limit_period); } qapi_free_MigrationParameters(params); @@ -620,6 +624,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) error_setg(&err, "The block-bitmap-mapping parameter can only be set " "through QMP"); break; +case MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD: +p->has_x_vcpu_dirty_limit_period = true; +visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err); +break; default: assert(0); } diff --git a/migration/options.c b/migration/options.c index 5a9505adf7..1de63ba775 100644 --- a/migration/options.c +++ b/migration/options.c @@ -80,6 +80,8 @@ #define DEFINE_PROP_MIG_CAP(name, x) \ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 1000/* milliseconds */ + Property migration_properties[] = { DEFINE_PROP_BOOL("store-global-state", MigrationState, store_global_state, true), @@ -163,6 +165,9 @@ Property migration_properties[] = { DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds), DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname), DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz), +DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState, + parameters.x_vcpu_dirty_limit_period, + DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -908,6 +913,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) s->parameters.block_bitmap_mapping); } +params->has_x_vcpu_dirty_limit_period = true; +params->x_vcpu_dirty_limit_period = s->parameters.x_vcpu_dirty_limit_period; + return params; } @@ -940,6 +948,7 @@ void migrate_params_init(MigrationParameters *params) params->has_announce_max = true; params->has_announce_rounds = true; params->has_announce_step = true; +params->has_x_vcpu_dirty_limit_period = true; } /* @@ -1100,6 +1109,15 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) } #endif +if (params->has_x_vcpu_dirty_limit_period && +(params->x_vcpu_dirty_limit_period < 1 || + params->x_vcpu_dirty_limit_period > 1000)) { +error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "x-vcpu-dirty-limit-period", + "a value between 1 and 1000"); +return false; +} + return true; } @@ -1199,6 +1217,11 @@ static void migrate_params_test_apply(MigrateSetParameters *params, 
dest->has_block_bitmap_mapping = true; dest->block_bitmap_mapping = params->block_bitmap_mapping; } + +if (params->has_x_vcpu_dirty_limit_period) { +dest->x_vcpu_dirty_limit_period = +params->x_vcpu_dirty_limit_period; +} } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1317,6 +1340,11 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) QAPI_CLONE(BitmapMigrationNodeAliasList, params->block_bitmap_mapping); } + +if (params->has_x_vcpu_dirty_limit_period) { +s->
[PATCH QEMU v8 4/9] migration: Introduce dirty-limit capability
From: Hyman Huang(黄勇) Introduce migration dirty-limit capability, which can be turned on before live migration and limit dirty page rate durty live migration. Introduce migrate_dirty_limit function to help check if dirty-limit capability enabled during live migration. Meanwhile, refactor vcpu_dirty_rate_stat_collect so that period can be configured instead of hardcoded. dirty-limit capability is kind of like auto-converge but using dirty limit instead of traditional cpu-throttle to throttle guest down. To enable this feature, turn on the dirty-limit capability before live migration using migrate-set-capabilities, and set the parameters "x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably to speed up convergence. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/options.c | 24 migration/options.h | 1 + qapi/migration.json | 12 +++- softmmu/dirtylimit.c | 12 +++- 4 files changed, 47 insertions(+), 2 deletions(-) diff --git a/migration/options.c b/migration/options.c index 7d2d98830e..631c12cf32 100644 --- a/migration/options.c +++ b/migration/options.c @@ -27,6 +27,7 @@ #include "qemu-file.h" #include "ram.h" #include "options.h" +#include "sysemu/kvm.h" /* Maximum migrate downtime set to 2000 seconds */ #define MAX_MIGRATE_DOWNTIME_SECONDS 2000 @@ -196,6 +197,8 @@ Property migration_properties[] = { #endif DEFINE_PROP_MIG_CAP("x-switchover-ack", MIGRATION_CAPABILITY_SWITCHOVER_ACK), +DEFINE_PROP_MIG_CAP("x-dirty-limit", +MIGRATION_CAPABILITY_DIRTY_LIMIT), DEFINE_PROP_END_OF_LIST(), }; @@ -242,6 +245,13 @@ bool migrate_dirty_bitmaps(void) return s->capabilities[MIGRATION_CAPABILITY_DIRTY_BITMAPS]; } +bool migrate_dirty_limit(void) +{ +MigrationState *s = migrate_get_current(); + +return s->capabilities[MIGRATION_CAPABILITY_DIRTY_LIMIT]; +} + bool migrate_events(void) { MigrationState *s = migrate_get_current(); @@ -573,6 +583,20 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) } } +if (new_caps[MIGRATION_CAPABILITY_DIRTY_LIMIT]) { +if (new_caps[MIGRATION_CAPABILITY_AUTO_CONVERGE]) { +error_setg(errp, "dirty-limit conflicts with auto-converge" + " either of then available currently"); +return false; +} + +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +error_setg(errp, "dirty-limit requires KVM with accelerator" + " property 'dirty-ring-size' set"); +return false; +} +} + return true; } diff --git a/migration/options.h b/migration/options.h index 9aaf363322..b5a950d4e4 100644 --- a/migration/options.h +++ b/migration/options.h @@ -24,6 +24,7 @@ extern Property migration_properties[]; /* capabilities */ bool migrate_auto_converge(void); +bool migrate_dirty_limit(void); bool migrate_background_snapshot(void); bool migrate_block(void); bool migrate_colo(void); diff --git a/qapi/migration.json b/qapi/migration.json index e43371955a..031832cde5 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -497,6 +497,15 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # +# @dirty-limit: If enabled, migration will use the dirty-limit +# algorithm to throttle down guest instead of auto-converge +# algorithm. This algorithm only works when vCPU's dirtyrate +# greater than 'vcpu-dirty-limit', read processes in guest os +# aren't penalized any more, so the algorithm can improve +# performance of vCPU during live migration. This is an optional +# performance feature and should not affect the correctness of the +# existing auto-converge algorithm. 
(since 8.1) +# # Features: # # @unstable: Members @x-colo and @x-ignore-shared are experimental. @@ -512,7 +521,8 @@ 'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate', { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] }, 'validate-uuid', 'background-snapshot', - 'zero-copy-send', 'postcopy-preempt', 'switchover-ack'] } + 'zero-copy-send', 'postcopy-preempt', 'switchover-ack', + 'dirty-limit'] } ## # @MigrationCapabilityStatus: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 5c12d26d49..953ef934bc 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -24,6 +24,9 @@ #include "hw/boards.h" #include "sysemu/kvm.h" #include "trace.h" +#i
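The patch text above is cut off; for reference, the enable sequence it describes boils down to the fragment below, reusing the qtest helpers (migrate_set_capability, migrate_set_parameter_int, migrate_qmp) quoted in the test patches elsewhere in this thread, with illustrative values:
===
/* Fragment, not standalone: enable dirty-limit migration before starting
 * the migration. Helper functions are the qtest helpers from
 * tests/qtest/migration-test.c; the values are illustrative. */
migrate_set_capability(from, "dirty-limit", true);
migrate_set_parameter_int(from, "x-vcpu-dirty-limit-period", 1000); /* ms */
migrate_set_parameter_int(from, "vcpu-dirty-limit", 50);            /* MB/s */
migrate_qmp(from, uri, "{}");
===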
[PATCH QEMU v8 5/9] migration: Refactor auto-converge capability logic
From: Hyman Huang(黄勇) Check if block migration is running before throttling guest down in auto-converge way. Note that this modification is kind of like code clean, because block migration does not depend on auto-converge capability, so the order of checks can be adjusted. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/ram.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/migration/ram.c b/migration/ram.c index 5283a75f02..78746849b5 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -995,7 +995,11 @@ static void migration_trigger_throttle(RAMState *rs) /* During block migration the auto-converge logic incorrectly detects * that ram migration makes no progress. Avoid this by disabling the * throttling logic during the bulk phase of block migration. */ -if (migrate_auto_converge() && !blk_mig_bulk_active()) { +if (blk_mig_bulk_active()) { +return; +} + +if (migrate_auto_converge()) { /* The following detection logic can be refined later. For now: Check to see if the ratio between dirtied bytes and the approx. amount of bytes that just got transferred since the last time -- 2.38.5
[PATCH QEMU v8 6/9] migration: Put the detection logic before auto-converge checking
From: Hyman Huang(黄勇) This commit is prepared for the implementation of dirty-limit convergence algo. The detection logic of throttling condition can apply to both auto-converge and dirty-limit algo, putting it's position before the checking logic for auto-converge feature. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Juan Quintela --- migration/ram.c | 21 +++-- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 78746849b5..b6559f9312 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -999,17 +999,18 @@ static void migration_trigger_throttle(RAMState *rs) return; } -if (migrate_auto_converge()) { -/* The following detection logic can be refined later. For now: - Check to see if the ratio between dirtied bytes and the approx. - amount of bytes that just got transferred since the last time - we were in this routine reaches the threshold. If that happens - twice, start or increase throttling. */ - -if ((bytes_dirty_period > bytes_dirty_threshold) && -(++rs->dirty_rate_high_cnt >= 2)) { +/* + * The following detection logic can be refined later. For now: + * Check to see if the ratio between dirtied bytes and the approx. + * amount of bytes that just got transferred since the last time + * we were in this routine reaches the threshold. If that happens + * twice, start or increase throttling. + */ +if ((bytes_dirty_period > bytes_dirty_threshold) && +(++rs->dirty_rate_high_cnt >= 2)) { +rs->dirty_rate_high_cnt = 0; +if (migrate_auto_converge()) { trace_migration_throttle(); -rs->dirty_rate_high_cnt = 0; mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); } -- 2.38.5
[PATCH QEMU v8 0/9] migration: introduce dirtylimit capability
Hi, Juan and Markus, thanks for reviewing the previous versions and please review the latest version if you have time :) Yong v8: 1. Rebase on master and refactor the docs suggested by Markus v7: 1. Rebase on master and fix conflicts v6: 1. Rebase on master 2. Split the commit "Implement dirty-limit convergence algo" into two as Juan suggested as the following: a. Put the detection logic before auto-converge checking b. Implement dirty-limit convergence algo 3. Put the detection logic before auto-converge checking 4. Sort the migrate_dirty_limit function in commit "Introduce dirty-limit capability" suggested by Juan 5. Substitute the the int64_t to uint64_t in the last 2 commits 6. Fix the comments spell mistake 7. Add helper function in the commit "Implement dirty-limit convergence algo" suggested by Juan v5: 1. Rebase on master and enrich the comment for "dirty-limit" capability, suggesting by Markus. 2. Drop commits that have already been merged. v4: 1. Polish the docs and update the release version suggested by Markus 2. Rename the migrate exported info "dirty-limit-throttle-time-per- round" to "dirty-limit-throttle-time-per-full". v3(resend): - fix the syntax error of the topic. v3: This version make some modifications inspired by Peter and Markus as following: 1. Do the code clean up in [PATCH v2 02/11] suggested by Markus 2. Replace the [PATCH v2 03/11] with a much simpler patch posted by Peter to fix the following bug: https://bugzilla.redhat.com/show_bug.cgi?id=2124756 3. Fix the error path of migrate_params_check in [PATCH v2 04/11] pointed out by Markus. Enrich the commit message to explain why x-vcpu-dirty-limit-period an unstable parameter. 4. Refactor the dirty-limit convergence algo in [PATCH v2 07/11] suggested by Peter: a. apply blk_mig_bulk_active check before enable dirty-limit b. drop the unhelpful check function before enable dirty-limit c. change the migration_cancel logic, just cancel dirty-limit only if dirty-limit capability turned on. d. abstract a code clean commit [PATCH v3 07/10] to adjust the check order before enable auto-converge 5. Change the name of observing indexes during dirty-limit live migration to make them more easy-understanding. Use the maximum throttle time of vpus as "dirty-limit-throttle-time-per-full" 6. Fix some grammatical and spelling errors pointed out by Markus and enrich the document about the dirty-limit live migration observing indexes "dirty-limit-ring-full-time" and "dirty-limit-throttle-time-per-full" 7. Change the default value of x-vcpu-dirty-limit-period to 1000ms, which is optimal value pointed out in cover letter in that testing environment. 8. Drop the 2 guestperf test commits [PATCH v2 10/11], [PATCH v2 11/11] and post them with a standalone series in the future. v2: This version make a little bit modifications comparing with version 1 as following: 1. fix the overflow issue reported by Peter Maydell 2. add parameter check for hmp "set_vcpu_dirty_limit" command 3. fix the racing issue between dirty ring reaper thread and Qemu main thread. 4. add migrate parameter check for x-vcpu-dirty-limit-period and vcpu-dirty-limit. 5. add the logic to forbid hmp/qmp commands set_vcpu_dirty_limit, cancel_vcpu_dirty_limit during dirty-limit live migration when implement dirty-limit convergence algo. 6. add capability check to ensure auto-converge and dirty-limit are mutually exclusive. 7. 
pre-check if kvm dirty ring size is configured before setting dirty-limit migrate parameter Hyman Huang(黄勇) (9): softmmu/dirtylimit: Add parameter check for hmp "set_vcpu_dirty_limit" qapi/migration: Introduce x-vcpu-dirty-limit-period parameter qapi/migration: Introduce vcpu-dirty-limit parameters migration: Introduce dirty-limit capability migration: Refactor auto-converge capability logic migration: Put the detection logic before auto-converge checking migration: Implement dirty-limit convergence algo migration: Extend query-migrate to provide dirty page limit info tests: Add migration dirty-limit capability test include/sysemu/dirtylimit.h| 2 + migration/migration-hmp-cmds.c | 26 ++ migration/migration.c | 13 +++ migration/options.c| 73 migration/options.h| 1 + migration/ram.c| 61 ++--- migration/trace-events | 1 + qapi/migration.json| 75 ++-- softmmu/dirtylimit.c | 91 +-- tests/qtest/migration-test.c | 155 + 10 files changed, 473 insertions(+), 25 deletions(-) -- 2.38.5
[PATCH QEMU v8 3/9] qapi/migration: Introduce vcpu-dirty-limit parameters
From: Hyman Huang(黄勇) Introduce "vcpu-dirty-limit" migration parameter used to limit dirty page rate during live migration. "vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are two dirty-limit-related migration parameters, which can be set before and during live migration by qmp migrate-set-parameters. This two parameters are used to help implement the dirty page rate limit algo of migration. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/migration-hmp-cmds.c | 8 migration/options.c| 21 + qapi/migration.json| 18 +++--- 3 files changed, 44 insertions(+), 3 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 352e9ec716..35e8020bbf 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -368,6 +368,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) monitor_printf(mon, "%s: %" PRIu64 " ms\n", MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD), params->x_vcpu_dirty_limit_period); + +monitor_printf(mon, "%s: %" PRIu64 " MB/s\n", +MigrationParameter_str(MIGRATION_PARAMETER_VCPU_DIRTY_LIMIT), +params->vcpu_dirty_limit); } qapi_free_MigrationParameters(params); @@ -628,6 +632,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) p->has_x_vcpu_dirty_limit_period = true; visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err); break; +case MIGRATION_PARAMETER_VCPU_DIRTY_LIMIT: +p->has_vcpu_dirty_limit = true; +visit_type_size(v, param, &p->vcpu_dirty_limit, &err); +break; default: assert(0); } diff --git a/migration/options.c b/migration/options.c index 1de63ba775..7d2d98830e 100644 --- a/migration/options.c +++ b/migration/options.c @@ -81,6 +81,7 @@ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) #define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 1000/* milliseconds */ +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT1 /* MB/s */ Property migration_properties[] = { DEFINE_PROP_BOOL("store-global-state", MigrationState, @@ -168,6 +169,9 @@ Property migration_properties[] = { DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState, parameters.x_vcpu_dirty_limit_period, DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD), +DEFINE_PROP_UINT64("vcpu-dirty-limit", MigrationState, + parameters.vcpu_dirty_limit, + DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -915,6 +919,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->has_x_vcpu_dirty_limit_period = true; params->x_vcpu_dirty_limit_period = s->parameters.x_vcpu_dirty_limit_period; +params->has_vcpu_dirty_limit = true; +params->vcpu_dirty_limit = s->parameters.vcpu_dirty_limit; return params; } @@ -949,6 +955,7 @@ void migrate_params_init(MigrationParameters *params) params->has_announce_rounds = true; params->has_announce_step = true; params->has_x_vcpu_dirty_limit_period = true; +params->has_vcpu_dirty_limit = true; } /* @@ -1118,6 +1125,14 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) return false; } +if (params->has_vcpu_dirty_limit && +(params->vcpu_dirty_limit < 1)) { +error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "vcpu_dirty_limit", + "is invalid, it must greater then 1 MB/s"); +return false; +} + return true; } @@ -1222,6 +1237,9 @@ static void migrate_params_test_apply(MigrateSetParameters *params, dest->x_vcpu_dirty_limit_period = params->x_vcpu_dirty_limit_period; } +if (params->has_vcpu_dirty_limit) { +dest->vcpu_dirty_limit = 
params->vcpu_dirty_limit; +} } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1345,6 +1363,9 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) s->parameters.x_vcpu_dirty_limit_period = params->x_vcpu_dirty_limit_period; } +if (params->has_vcpu_dirty_limit) { +s->parameters.vcpu_dirty_limit = params->vcpu_dirty_limit; +} } void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp) diff --git a/qapi/migration.json b/qapi/migration.json index 2041d336d5..e43371955a 100644 --- a/qapi/migration.jso
[PATCH QEMU v8 9/9] tests: Add migration dirty-limit capability test
From: Hyman Huang(黄勇) Add migration dirty-limit capability test if kernel support dirty ring. Migration dirty-limit capability introduce dirty limit capability, two parameters: x-vcpu-dirty-limit-period and vcpu-dirty-limit are introduced to implement the live migration with dirty limit. The test case does the following things: 1. start src, dst vm and enable dirty-limit capability 2. start migrate and set cancel it to check if dirty limit stop working. 3. restart dst vm 4. start migrate and enable dirty-limit capability 5. check if migration satisfy the convergence condition during pre-switchover phase. Signed-off-by: Hyman Huang(黄勇) --- tests/qtest/migration-test.c | 155 +++ 1 file changed, 155 insertions(+) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index b9cc194100..f55f95c9b0 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -2636,6 +2636,159 @@ static void test_vcpu_dirty_limit(void) dirtylimit_stop_vm(vm); } +static void migrate_dirty_limit_wait_showup(QTestState *from, +const int64_t period, +const int64_t value) +{ +/* Enable dirty limit capability */ +migrate_set_capability(from, "dirty-limit", true); + +/* Set dirty limit parameters */ +migrate_set_parameter_int(from, "x-vcpu-dirty-limit-period", period); +migrate_set_parameter_int(from, "vcpu-dirty-limit", value); + +/* Make sure migrate can't converge */ +migrate_ensure_non_converge(from); + +/* To check limit rate after precopy */ +migrate_set_capability(from, "pause-before-switchover", true); + +/* Wait for the serial output from the source */ +wait_for_serial("src_serial"); +} + +/* + * This test does: + * source target + * migrate_incoming + * migrate + * migrate_cancel + * restart target + * migrate + * + * And see that if dirty limit works correctly + */ +static void test_migrate_dirty_limit(void) +{ +g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); +QTestState *from, *to; +int64_t remaining; +uint64_t throttle_us_per_full; +/* + * We want the test to be stable and as fast as possible. + * E.g., with 1Gb/s bandwith migration may pass without dirty limit, + * so we need to decrease a bandwidth. + */ +const int64_t dirtylimit_period = 1000, dirtylimit_value = 50; +const int64_t max_bandwidth = 4; /* ~400Mb/s */ +const int64_t downtime_limit = 250; /* 250ms */ +/* + * We migrate through unix-socket (> 500Mb/s). + * Thus, expected migration speed ~= bandwidth limit (< 500Mb/s). 
+ * So, we can predict expected_threshold + */ +const int64_t expected_threshold = max_bandwidth * downtime_limit / 1000; +int max_try_count = 10; +MigrateCommon args = { +.start = { +.hide_stderr = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Start src, dst vm */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Prepare for dirty limit migration and wait src vm show up */ +migrate_dirty_limit_wait_showup(from, dirtylimit_period, dirtylimit_value); + +/* Start migrate */ +migrate_qmp(from, uri, "{}"); + +/* Wait for dirty limit throttle begin */ +throttle_us_per_full = 0; +while (throttle_us_per_full == 0) { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} + +/* Now cancel migrate and wait for dirty limit throttle switch off */ +migrate_cancel(from); +wait_for_migration_status(from, "cancelled", NULL); + +/* Check if dirty limit throttle switched off, set timeout 1ms */ +do { +throttle_us_per_full = +read_migrate_property_int(from, "dirty-limit-throttle-time-per-round"); +usleep(100); +g_assert_false(got_src_stop); +} while (throttle_us_per_full != 0 && --max_try_count); + +/* Assert dirty limit is not in service */ +g_assert_cmpint(throttle_us_per_full, ==, 0); + +args = (MigrateCommon) { +.start = { +.only_target = true, +.use_dirty_ring = true, +}, +.listen_uri = uri, +.connect_uri = uri, +}; + +/* Restart dst vm, src vm already show up so we needn't wait anymore */ +if (test_migrate_start(&from, &to, args.listen_uri, &args.start)) { +return; +} + +/* Start migrate */ +migrate_qmp(from, uri, "{}&
[PATCH QEMU v8 8/9] migration: Extend query-migrate to provide dirty page limit info
From: Hyman Huang(黄勇) Extend query-migrate to provide throttle time and estimated ring full time with dirty-limit capability enabled, through which we can observe if dirty limit take effect during live migration. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- include/sysemu/dirtylimit.h| 2 ++ migration/migration-hmp-cmds.c | 10 + migration/migration.c | 10 + qapi/migration.json| 16 +- softmmu/dirtylimit.c | 39 ++ 5 files changed, 76 insertions(+), 1 deletion(-) diff --git a/include/sysemu/dirtylimit.h b/include/sysemu/dirtylimit.h index 8d2c1f3a6b..d11edb 100644 --- a/include/sysemu/dirtylimit.h +++ b/include/sysemu/dirtylimit.h @@ -34,4 +34,6 @@ void dirtylimit_set_vcpu(int cpu_index, void dirtylimit_set_all(uint64_t quota, bool enable); void dirtylimit_vcpu_execute(CPUState *cpu); +uint64_t dirtylimit_throttle_time_per_round(void); +uint64_t dirtylimit_ring_full_time(void); #endif diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 35e8020bbf..c115ef2d23 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -190,6 +190,16 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict) info->cpu_throttle_percentage); } +if (info->has_dirty_limit_throttle_time_per_round) { +monitor_printf(mon, "dirty-limit throttle time: %" PRIu64 " us\n", + info->dirty_limit_throttle_time_per_round); +} + +if (info->has_dirty_limit_ring_full_time) { +monitor_printf(mon, "dirty-limit ring full time: %" PRIu64 " us\n", + info->dirty_limit_ring_full_time); +} + if (info->has_postcopy_blocktime) { monitor_printf(mon, "postcopy blocktime: %u\n", info->postcopy_blocktime); diff --git a/migration/migration.c b/migration/migration.c index a3791900fd..a4dcaa3c91 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -64,6 +64,7 @@ #include "yank_functions.h" #include "sysemu/qtest.h" #include "options.h" +#include "sysemu/dirtylimit.h" static NotifierList migration_state_notifiers = NOTIFIER_LIST_INITIALIZER(migration_state_notifiers); @@ -974,6 +975,15 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s) info->ram->dirty_pages_rate = stat64_get(&mig_stats.dirty_pages_rate); } + +if (migrate_dirty_limit() && dirtylimit_in_service()) { +info->has_dirty_limit_throttle_time_per_round = true; +info->dirty_limit_throttle_time_per_round = +dirtylimit_throttle_time_per_round(); + +info->has_dirty_limit_ring_full_time = true; +info->dirty_limit_ring_full_time = dirtylimit_ring_full_time(); +} } static void populate_disk_info(MigrationInfo *info) diff --git a/qapi/migration.json b/qapi/migration.json index 031832cde5..97f7d0fd3c 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -250,6 +250,18 @@ # blocked. Present and non-empty when migration is blocked. # (since 6.0) # +# @dirty-limit-throttle-time-per-round: Maximum throttle time +# (in microseconds) of virtual CPUs each dirty ring full round, +# which shows how MigrationCapability dirty-limit affects the +# guest during live migration. (since 8.1) +# +# @dirty-limit-ring-full-time: Estimated average dirty ring full +# time (in microseconds) each dirty ring full round. The value +# equals dirty ring memory size divided by average dirty page +# rate of the virtual CPU, which can be used to observe the +# average memory load of the virtual CPU indirectly. 
Note that +# zero means guest doesn't dirty memory (since 8.1) +# # Since: 0.14 ## { 'struct': 'MigrationInfo', @@ -267,7 +279,9 @@ '*postcopy-blocktime' : 'uint32', '*postcopy-vcpu-blocktime': ['uint32'], '*compression': 'CompressionStats', - '*socket-address': ['SocketAddress'] } } + '*socket-address': ['SocketAddress'], + '*dirty-limit-throttle-time-per-round': 'uint64', + '*dirty-limit-ring-full-time': 'uint64'} } ## # @query-migrate: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 5134296667..a0686323e5 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -565,6 +565,45 @@ out: hmp_handle_error(mon, err); } +/* Return the max throttle time of each virtual CPU */ +uint64_t dirtylimit_throttle_time_per_round(void) +{ +CPUState *cpu; +
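The dirtylimit_throttle_time_per_round() body is cut off above. Given its comment ("Return the max throttle time of each virtual CPU"), a minimal sketch of how such a function can be completed is shown below; it assumes the per-vCPU throttle_us_per_full field that the dirty-limit throttle already keeps in CPUState, and is not the patch's verbatim code:

uint64_t dirtylimit_throttle_time_per_round(void)
{
    CPUState *cpu;
    int64_t max = 0;

    /* Report the largest per-vCPU sleep time of the current round */
    CPU_FOREACH(cpu) {
        if (cpu->throttle_us_per_full > max) {
            max = cpu->throttle_us_per_full;
        }
    }

    return max;
}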
[PATCH QEMU v8 7/9] migration: Implement dirty-limit convergence algo
From: Hyman Huang(黄勇) Implement the dirty-limit convergence algorithm for live migration. It is similar to the auto-converge algorithm but uses dirty-limit instead of CPU throttling to make migration converge. Enable the dirty page limit when dirty_rate_high_cnt is greater than 2 and the dirty-limit capability is enabled; disable dirty-limit if migration is cancelled. Note that the "set_vcpu_dirty_limit" and "cancel_vcpu_dirty_limit" commands are not allowed during dirty-limit live migration. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/migration.c | 3 +++ migration/ram.c| 36 migration/trace-events | 1 + softmmu/dirtylimit.c | 29 + 4 files changed, 69 insertions(+) diff --git a/migration/migration.c b/migration/migration.c index 096e8191d1..a3791900fd 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -166,6 +166,9 @@ void migration_cancel(const Error *error) if (error) { migrate_set_error(current_migration, error); } +if (migrate_dirty_limit()) { +qmp_cancel_vcpu_dirty_limit(false, -1, NULL); +} migrate_fd_cancel(current_migration); } diff --git a/migration/ram.c b/migration/ram.c index b6559f9312..8a86363216 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -46,6 +46,7 @@ #include "qapi/error.h" #include "qapi/qapi-types-migration.h" #include "qapi/qapi-events-migration.h" +#include "qapi/qapi-commands-migration.h" #include "qapi/qmp/qerror.h" #include "trace.h" #include "exec/ram_addr.h" @@ -59,6 +60,8 @@ #include "multifd.h" #include "sysemu/runstate.h" #include "options.h" +#include "sysemu/dirtylimit.h" +#include "sysemu/kvm.h" #include "hw/boards.h" /* for machine_dump_guest_core() */ @@ -984,6 +987,37 @@ static void migration_update_rates(RAMState *rs, int64_t end_time) } } +/* + * Enable dirty-limit to throttle down the guest + */ +static void migration_dirty_limit_guest(void) +{ +/* + * dirty page rate quota for all vCPUs fetched from + * migration parameter 'vcpu_dirty_limit' + */ +static int64_t quota_dirtyrate; +MigrationState *s = migrate_get_current(); + +/* + * If dirty limit already enabled and migration parameter + * vcpu-dirty-limit untouched. 
+ */ +if (dirtylimit_in_service() && +quota_dirtyrate == s->parameters.vcpu_dirty_limit) { +return; +} + +quota_dirtyrate = s->parameters.vcpu_dirty_limit; + +/* + * Set all vCPU a quota dirtyrate, note that the second + * parameter will be ignored if setting all vCPU for the vm + */ +qmp_set_vcpu_dirty_limit(false, -1, quota_dirtyrate, NULL); +trace_migration_dirty_limit_guest(quota_dirtyrate); +} + static void migration_trigger_throttle(RAMState *rs) { uint64_t threshold = migrate_throttle_trigger_threshold(); @@ -1013,6 +1047,8 @@ static void migration_trigger_throttle(RAMState *rs) trace_migration_throttle(); mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); +} else if (migrate_dirty_limit()) { +migration_dirty_limit_guest(); } } } diff --git a/migration/trace-events b/migration/trace-events index 5259c1044b..580895e86e 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -93,6 +93,7 @@ migration_bitmap_sync_start(void) "" migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64 migration_bitmap_clear_dirty(char *str, uint64_t start, uint64_t size, unsigned long page) "rb %s start 0x%"PRIx64" size 0x%"PRIx64" page 0x%lx" migration_throttle(void) "" +migration_dirty_limit_guest(int64_t dirtyrate) "guest dirty page rate limit %" PRIi64 " MB/s" ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %" PRIx64 " %zx" ram_load_loop(const char *rbname, uint64_t addr, int flags, void *host) "%s: addr: 0x%" PRIx64 " flags: 0x%x host: %p" ram_load_postcopy_loop(int channel, uint64_t addr, int flags) "chan=%d addr=0x%" PRIx64 " flags=0x%x" diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 953ef934bc..5134296667 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -436,6 +436,23 @@ static void dirtylimit_cleanup(void) dirtylimit_state_finalize(); } +/* + * dirty page rate limit is not allowed to set if migration + * is running with dirty-limit capability enabled. + */ +static bool dirtylimit_is_allowed(void) +{ +MigrationState *ms = migrate_get_current(); + +if (migration_is_running(ms->state) && +(!qemu_thread_is_
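The dirtylimit_is_allowed() hunk is truncated above. Based on the commit message (the set/cancel commands are refused while a dirty-limit migration is in flight), a plausible completion is sketched below; the exact conditions in the merged patch may differ:

static bool dirtylimit_is_allowed(void)
{
    MigrationState *ms = migrate_get_current();

    if (migration_is_running(ms->state) &&
        (!qemu_thread_is_self(&ms->thread)) &&
        migrate_dirty_limit() &&
        dirtylimit_in_service()) {
        /* refuse manual set/cancel while a dirty-limit migration runs */
        return false;
    }
    return true;
}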
[PATCH QEMU v7 2/9] qapi/migration: Introduce x-vcpu-dirty-limit-period parameter
From: Hyman Huang(黄勇) Introduce "x-vcpu-dirty-limit-period" migration experimental parameter, which is in the range of 1 to 1000ms and used to make dirtyrate calculation period configurable. Currently with the "x-vcpu-dirty-limit-period" varies, the total time of live migration changes, test results show the optimal value of "x-vcpu-dirty-limit-period" ranges from 500ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made stable once it proves best value can not be determined with developer's experiments. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- migration/migration-hmp-cmds.c | 8 migration/options.c| 28 qapi/migration.json| 34 +++--- 3 files changed, 63 insertions(+), 7 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 9885d7c9f7..352e9ec716 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -364,6 +364,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) } } } + +monitor_printf(mon, "%s: %" PRIu64 " ms\n", +MigrationParameter_str(MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD), +params->x_vcpu_dirty_limit_period); } qapi_free_MigrationParameters(params); @@ -620,6 +624,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) error_setg(&err, "The block-bitmap-mapping parameter can only be set " "through QMP"); break; +case MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD: +p->has_x_vcpu_dirty_limit_period = true; +visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err); +break; default: assert(0); } diff --git a/migration/options.c b/migration/options.c index 5a9505adf7..1de63ba775 100644 --- a/migration/options.c +++ b/migration/options.c @@ -80,6 +80,8 @@ #define DEFINE_PROP_MIG_CAP(name, x) \ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) +#define DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD 1000/* milliseconds */ + Property migration_properties[] = { DEFINE_PROP_BOOL("store-global-state", MigrationState, store_global_state, true), @@ -163,6 +165,9 @@ Property migration_properties[] = { DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds), DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname), DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz), +DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState, + parameters.x_vcpu_dirty_limit_period, + DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -908,6 +913,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) s->parameters.block_bitmap_mapping); } +params->has_x_vcpu_dirty_limit_period = true; +params->x_vcpu_dirty_limit_period = s->parameters.x_vcpu_dirty_limit_period; + return params; } @@ -940,6 +948,7 @@ void migrate_params_init(MigrationParameters *params) params->has_announce_max = true; params->has_announce_rounds = true; params->has_announce_step = true; +params->has_x_vcpu_dirty_limit_period = true; } /* @@ -1100,6 +1109,15 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) } #endif +if (params->has_x_vcpu_dirty_limit_period && +(params->x_vcpu_dirty_limit_period < 1 || + params->x_vcpu_dirty_limit_period > 1000)) { +error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "x-vcpu-dirty-limit-period", + "a value between 1 and 1000"); +return false; +} + return true; } @@ -1199,6 +1217,11 @@ static void migrate_params_test_apply(MigrateSetParameters *params, 
dest->has_block_bitmap_mapping = true; dest->block_bitmap_mapping = params->block_bitmap_mapping; } + +if (params->has_x_vcpu_dirty_limit_period) { +dest->x_vcpu_dirty_limit_period = +params->x_vcpu_dirty_limit_period; +} } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1317,6 +1340,11 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) QAPI_CLONE(BitmapMigrationNodeAliasList, params->block_bitmap_mapping); } + +if (params->has_x_vcpu_dirty_limit_period) { +s->
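The migrate_params_apply() hunk is cut off at "s->" above. It almost certainly mirrors the migrate_params_test_apply() hunk quoted just before it; a sketch of the likely continuation (not the patch's verbatim text):

    if (params->has_x_vcpu_dirty_limit_period) {
        s->parameters.x_vcpu_dirty_limit_period =
            params->x_vcpu_dirty_limit_period;
    }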
[PATCH QEMU v7 6/9] migration: Put the detection logic before auto-converge checking
From: Hyman Huang(黄勇) This commit prepares for the implementation of the dirty-limit convergence algorithm. The detection logic for the throttling condition applies to both the auto-converge and dirty-limit algorithms, so move it in front of the auto-converge capability check. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Juan Quintela --- migration/ram.c | 21 +++-- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 78746849b5..b6559f9312 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -999,17 +999,18 @@ static void migration_trigger_throttle(RAMState *rs) return; } -if (migrate_auto_converge()) { -/* The following detection logic can be refined later. For now: - Check to see if the ratio between dirtied bytes and the approx. - amount of bytes that just got transferred since the last time - we were in this routine reaches the threshold. If that happens - twice, start or increase throttling. */ - -if ((bytes_dirty_period > bytes_dirty_threshold) && -(++rs->dirty_rate_high_cnt >= 2)) { +/* + * The following detection logic can be refined later. For now: + * Check to see if the ratio between dirtied bytes and the approx. + * amount of bytes that just got transferred since the last time + * we were in this routine reaches the threshold. If that happens + * twice, start or increase throttling. + */ +if ((bytes_dirty_period > bytes_dirty_threshold) && +(++rs->dirty_rate_high_cnt >= 2)) { +rs->dirty_rate_high_cnt = 0; +if (migrate_auto_converge()) { trace_migration_throttle(); -rs->dirty_rate_high_cnt = 0; mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold); } -- 2.38.5
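This patch only moves the detection logic; the payoff comes in patch 7/9, which hangs dirty-limit off the same check. Combining the two quoted hunks, the resulting control flow inside migration_trigger_throttle() reads roughly as follows (an illustration assembled from the diffs in this series, not a verbatim excerpt):

    if ((bytes_dirty_period > bytes_dirty_threshold) &&
        (++rs->dirty_rate_high_cnt >= 2)) {
        rs->dirty_rate_high_cnt = 0;
        if (migrate_auto_converge()) {
            /* legacy cpu-throttle based convergence */
            trace_migration_throttle();
            mig_throttle_guest_down(bytes_dirty_period, bytes_dirty_threshold);
        } else if (migrate_dirty_limit()) {
            /* dirty-limit based convergence (added in 7/9) */
            migration_dirty_limit_guest();
        }
    }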
[PATCH QEMU v7 8/9] migration: Extend query-migrate to provide dirty page limit info
From: Hyman Huang(黄勇) Extend query-migrate to provide throttle time and estimated ring full time with dirty-limit capability enabled, through which we can observe if dirty limit take effect during live migration. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Markus Armbruster Reviewed-by: Juan Quintela --- include/sysemu/dirtylimit.h| 2 ++ migration/migration-hmp-cmds.c | 10 + migration/migration.c | 10 + qapi/migration.json| 16 +- softmmu/dirtylimit.c | 39 ++ 5 files changed, 76 insertions(+), 1 deletion(-) diff --git a/include/sysemu/dirtylimit.h b/include/sysemu/dirtylimit.h index 8d2c1f3a6b..d11edb 100644 --- a/include/sysemu/dirtylimit.h +++ b/include/sysemu/dirtylimit.h @@ -34,4 +34,6 @@ void dirtylimit_set_vcpu(int cpu_index, void dirtylimit_set_all(uint64_t quota, bool enable); void dirtylimit_vcpu_execute(CPUState *cpu); +uint64_t dirtylimit_throttle_time_per_round(void); +uint64_t dirtylimit_ring_full_time(void); #endif diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 35e8020bbf..c115ef2d23 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -190,6 +190,16 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict) info->cpu_throttle_percentage); } +if (info->has_dirty_limit_throttle_time_per_round) { +monitor_printf(mon, "dirty-limit throttle time: %" PRIu64 " us\n", + info->dirty_limit_throttle_time_per_round); +} + +if (info->has_dirty_limit_ring_full_time) { +monitor_printf(mon, "dirty-limit ring full time: %" PRIu64 " us\n", + info->dirty_limit_ring_full_time); +} + if (info->has_postcopy_blocktime) { monitor_printf(mon, "postcopy blocktime: %u\n", info->postcopy_blocktime); diff --git a/migration/migration.c b/migration/migration.c index a3791900fd..a4dcaa3c91 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -64,6 +64,7 @@ #include "yank_functions.h" #include "sysemu/qtest.h" #include "options.h" +#include "sysemu/dirtylimit.h" static NotifierList migration_state_notifiers = NOTIFIER_LIST_INITIALIZER(migration_state_notifiers); @@ -974,6 +975,15 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s) info->ram->dirty_pages_rate = stat64_get(&mig_stats.dirty_pages_rate); } + +if (migrate_dirty_limit() && dirtylimit_in_service()) { +info->has_dirty_limit_throttle_time_per_round = true; +info->dirty_limit_throttle_time_per_round = +dirtylimit_throttle_time_per_round(); + +info->has_dirty_limit_ring_full_time = true; +info->dirty_limit_ring_full_time = dirtylimit_ring_full_time(); +} } static void populate_disk_info(MigrationInfo *info) diff --git a/qapi/migration.json b/qapi/migration.json index cc51835cdd..ebc15e2782 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -250,6 +250,18 @@ # blocked. Present and non-empty when migration is blocked. # (since 6.0) # +# @dirty-limit-throttle-time-per-round: Maximum throttle time (in microseconds) of virtual +# CPUs each dirty ring full round, which shows how +# MigrationCapability dirty-limit affects the guest +# during live migration. (since 8.1) +# +# @dirty-limit-ring-full-time: Estimated average dirty ring full time (in microseconds) +# each dirty ring full round, note that the value equals +# dirty ring memory size divided by average dirty page rate +# of virtual CPU, which can be used to observe the average +# memory load of virtual CPU indirectly. 
Note that zero +# means guest doesn't dirty memory (since 8.1) +# # Since: 0.14 ## { 'struct': 'MigrationInfo', @@ -267,7 +279,9 @@ '*postcopy-blocktime' : 'uint32', '*postcopy-vcpu-blocktime': ['uint32'], '*compression': 'CompressionStats', - '*socket-address': ['SocketAddress'] } } + '*socket-address': ['SocketAddress'], + '*dirty-limit-throttle-time-per-round': 'uint64', + '*dirty-limit-ring-full-time': 'uint64'} } ## # @query-migrate: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 5134296667..a0686323e5 100644 --- a/
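The softmmu/dirtylimit.c hunk is cut off right after the diff header above. The QAPI documentation in this patch defines dirty-limit-ring-full-time as the dirty ring memory size divided by the average per-vCPU dirty page rate, so the arithmetic amounts to the sketch below; avg_vcpu_dirty_rate_mbps() and dirty_ring_size_mib() are hypothetical helper names used only for illustration, not functions from the patch:

uint64_t dirtylimit_ring_full_time(void)
{
    uint64_t rate_mbps = avg_vcpu_dirty_rate_mbps(); /* MB/s, hypothetical */
    uint64_t ring_mib = dirty_ring_size_mib();       /* MiB, hypothetical */

    if (!rate_mbps) {
        /* zero means the guest is not dirtying memory */
        return 0;
    }

    /* size / rate gives seconds; scale to microseconds */
    return ring_mib * 1000000 / rate_mbps;
}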
[PATCH QEMU v7 5/9] migration: Refactor auto-converge capability logic
From: Hyman Huang(黄勇) Check whether block migration is running before throttling the guest down via auto-converge. Note that this modification is essentially a code cleanup: block migration does not depend on the auto-converge capability, so the order of the checks can be adjusted. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/ram.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/migration/ram.c b/migration/ram.c index 5283a75f02..78746849b5 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -995,7 +995,11 @@ static void migration_trigger_throttle(RAMState *rs) /* During block migration the auto-converge logic incorrectly detects * that ram migration makes no progress. Avoid this by disabling the * throttling logic during the bulk phase of block migration. */ -if (migrate_auto_converge() && !blk_mig_bulk_active()) { +if (blk_mig_bulk_active()) { +return; +} + +if (migrate_auto_converge()) { /* The following detection logic can be refined later. For now: Check to see if the ratio between dirtied bytes and the approx. amount of bytes that just got transferred since the last time -- 2.38.5
[PATCH QEMU v7 4/9] migration: Introduce dirty-limit capability
From: Hyman Huang(黄勇) Introduce the migration dirty-limit capability, which can be turned on before live migration to limit the dirty page rate during live migration. Introduce the migrate_dirty_limit function to help check whether the dirty-limit capability is enabled during live migration. Meanwhile, refactor vcpu_dirty_rate_stat_collect so that the period can be configured instead of hardcoded. The dirty-limit capability is similar to auto-converge but uses the dirty limit instead of the traditional cpu-throttle to throttle the guest down. To enable this feature, turn on the dirty-limit capability before live migration using migrate-set-capabilities, and set the parameters "x-vcpu-dirty-limit-period" and "vcpu-dirty-limit" suitably to speed up convergence. Signed-off-by: Hyman Huang(黄勇) Acked-by: Peter Xu Reviewed-by: Juan Quintela --- migration/options.c | 24 migration/options.h | 1 + qapi/migration.json | 13 - softmmu/dirtylimit.c | 12 +++- 4 files changed, 48 insertions(+), 2 deletions(-) diff --git a/migration/options.c b/migration/options.c index 7d2d98830e..8b4eb8c519 100644 --- a/migration/options.c +++ b/migration/options.c @@ -27,6 +27,7 @@ #include "qemu-file.h" #include "ram.h" #include "options.h" +#include "sysemu/kvm.h" /* Maximum migrate downtime set to 2000 seconds */ #define MAX_MIGRATE_DOWNTIME_SECONDS 2000 @@ -196,6 +197,8 @@ Property migration_properties[] = { #endif DEFINE_PROP_MIG_CAP("x-switchover-ack", MIGRATION_CAPABILITY_SWITCHOVER_ACK), +DEFINE_PROP_MIG_CAP("x-dirty-limit", +MIGRATION_CAPABILITY_DIRTY_LIMIT), DEFINE_PROP_END_OF_LIST(), }; @@ -242,6 +245,13 @@ bool migrate_dirty_bitmaps(void) return s->capabilities[MIGRATION_CAPABILITY_DIRTY_BITMAPS]; } +bool migrate_dirty_limit(void) +{ +MigrationState *s = migrate_get_current(); + +return s->capabilities[MIGRATION_CAPABILITY_DIRTY_LIMIT]; +} + bool migrate_events(void) { MigrationState *s = migrate_get_current(); @@ -573,6 +583,20 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) } } +if (new_caps[MIGRATION_CAPABILITY_DIRTY_LIMIT]) { +if (new_caps[MIGRATION_CAPABILITY_AUTO_CONVERGE]) { +error_setg(errp, "dirty-limit conflicts with auto-converge" + " either of then available currently"); +return false; +} + +if (!kvm_enabled() || !kvm_dirty_ring_enabled()) { +error_setg(errp, "dirty-limit requires KVM with accelerator" + " property 'dirty-ring-size' set"); +return false; +} +} + return true; } diff --git a/migration/options.h b/migration/options.h index 9aaf363322..b5a950d4e4 100644 --- a/migration/options.h +++ b/migration/options.h @@ -24,6 +24,7 @@ extern Property migration_properties[]; /* capabilities */ bool migrate_auto_converge(void); +bool migrate_dirty_limit(void); bool migrate_background_snapshot(void); bool migrate_block(void); bool migrate_colo(void); diff --git a/qapi/migration.json b/qapi/migration.json index aa590dbf0e..cc51835cdd 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -497,6 +497,16 @@ # are present. 'return-path' capability must be enabled to use # it. (since 8.1) # +# @dirty-limit: If enabled, migration will use the dirty-limit algo to +# throttle down guest instead of auto-converge algo. +# Throttle algo only works when vCPU's dirtyrate greater +# than 'vcpu-dirty-limit', read processes in guest os +# aren't penalized any more, so this algo can improve +# performance of vCPU during live migration. This is an +# optional performance feature and should not affect the +# correctness of the existing auto-converge algo. 
+# (since 8.1) +# # Features: # # @unstable: Members @x-colo and @x-ignore-shared are experimental. @@ -512,7 +522,8 @@ 'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate', { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] }, 'validate-uuid', 'background-snapshot', - 'zero-copy-send', 'postcopy-preempt', 'switchover-ack'] } + 'zero-copy-send', 'postcopy-preempt', 'switchover-ack', + 'dirty-limit'] } ## # @MigrationCapabilityStatus: diff --git a/softmmu/dirtylimit.c b/softmmu/dirtylimit.c index 5c12d26d49..953ef934bc 100644 --- a/softmmu/dirtylimit.c +++ b/softmmu/dirtylimit.c @@ -24,6 +24,9 @@ #include "hw/boards.h" #include
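The softmmu/dirtylimit.c hunk is truncated above, so the vcpu_dirty_rate_stat_collect() refactor mentioned in the commit message is not visible. The idea it describes, use x-vcpu-dirty-limit-period while a dirty-limit migration is active and fall back to the hardcoded default otherwise, can be sketched as below; dirty_rate_calc_period() is an illustrative helper name, not a function from the patch:

static int64_t dirty_rate_calc_period(void)
{
    MigrationState *s = migrate_get_current();
    int64_t period = DIRTYLIMIT_CALC_TIME_MS; /* existing hardcoded default */

    if (migrate_dirty_limit() && migration_is_active(s)) {
        /* use the configurable migration parameter instead */
        period = s->parameters.x_vcpu_dirty_limit_period;
    }

    return period;
}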