[PATCH v2] Fixed a QEMU hang when guest poweroff in COLO mode

2021-11-10 Thread Rao, Lei
From: "Rao, Lei" 

When the PVM guest powers off, the COLO thread may be waiting on a
semaphore in colo_process_checkpoint(). So, we should wake up the COLO
thread before migration shutdown.
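
For context, this is roughly where the COLO thread can block on the primary
side (a simplified sketch, not the exact migration/colo.c code; the field
names follow the diff below):

/*
 * Simplified sketch of the waits that colo_shutdown() has to break:
 * without qemu_event_set()/qemu_sem_post() from the main thread, these
 * calls never return once the guest has powered off.
 */
static void colo_primary_wait_sketch(MigrationState *s)
{
    /* Checkpoint loop: sleep until the next checkpoint is requested. */
    qemu_event_wait(&s->colo_checkpoint_event);
    qemu_event_reset(&s->colo_checkpoint_event);

    /* Exit path: wait until failover/cleanup on the other side is done. */
    qemu_sem_wait(&s->colo_exit_sem);
}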

Signed-off-by: Lei Rao 
---
 include/migration/colo.h |  1 +
 migration/colo.c         | 20 ++++++++++++++++++++
 migration/migration.c    |  6 ++++++
 3 files changed, 27 insertions(+)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 768e1f0..5fbe1a6 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -37,4 +37,5 @@ COLOMode get_colo_mode(void);
 void colo_do_failover(void);
 
 void colo_checkpoint_notify(void *opaque);
+void colo_shutdown(void);
 #endif
diff --git a/migration/colo.c b/migration/colo.c
index 2415325..0d3d98f 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -820,6 +820,26 @@ static void 
colo_wait_handle_message(MigrationIncomingState *mis,
 }
 }
 
+void colo_shutdown(void)
+{
+    MigrationIncomingState *mis = NULL;
+    MigrationState *s = NULL;
+
+    switch (get_colo_mode()) {
+    case COLO_MODE_PRIMARY:
+        s = migrate_get_current();
+        qemu_event_set(&s->colo_checkpoint_event);
+        qemu_sem_post(&s->colo_exit_sem);
+        break;
+    case COLO_MODE_SECONDARY:
+        mis = migration_incoming_get_current();
+        qemu_sem_post(&mis->colo_incoming_sem);
+        break;
+    default:
+        break;
+    }
+}
+
 void *colo_process_incoming_thread(void *opaque)
 {
 MigrationIncomingState *mis = opaque;
diff --git a/migration/migration.c b/migration/migration.c
index abaf6f9..c0ab86e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -226,6 +226,12 @@ void migration_cancel(const Error *error)
 void migration_shutdown(void)
 {
 /*
+ * When the QEMU main thread exits, the COLO thread
+ * may be waiting on a semaphore. So, we should wake up
+ * the COLO thread before migration shutdown.
+ */
+colo_shutdown();
+/*
  * Cancel the current migration - that will (eventually)
  * stop the migration using this structure
  */
-- 
1.8.3.1




[PATCH] docs/COLO-FT.txt: Drop deprecated 'props' from object-add in COLO docs

2021-11-17 Thread Rao, Lei
From: "Rao, Lei" 

With the removal of the deprecated 'props' from object-add in commit
50243407457a9fb0ed17b9a9ba9fc9aee09495b1, we should also update COLO's
documentation.

Signed-off-by: Lei Rao 
---
 docs/COLO-FT.txt | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt
index 8d6d53a..fd5ffcc 100644
--- a/docs/COLO-FT.txt
+++ b/docs/COLO-FT.txt
@@ -289,11 +289,11 @@ Wait until disk is synced, then:
 {'execute': 'human-monitor-command', 'arguments':{ 'command-line': 'drive_add 
-n buddy 
driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=,file.export=parent0,node-name=replication0'}}
 {'execute': 'x-blockdev-change', 'arguments':{ 'parent': 'colo-disk0', 'node': 
'replication0' } }
 
-{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-mirror', 'id': 
'm0', 'props': { 'netdev': 'hn0', 'queue': 'tx', 'outdev': 'mirror0' } } }
-{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 
'redire0', 'props': { 'netdev': 'hn0', 'queue': 'rx', 'indev': 'compare_out' } 
} }
-{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 
'redire1', 'props': { 'netdev': 'hn0', 'queue': 'rx', 'outdev': 'compare0' } } }
+{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-mirror', 'id': 
'm0', 'netdev': 'hn0', 'queue': 'tx', 'outdev': 'mirror0' } }
+{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 
'redire0', 'netdev': 'hn0', 'queue': 'rx', 'indev': 'compare_out' } }
+{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 
'redire1', 'netdev': 'hn0', 'queue': 'rx', 'outdev': 'compare0' } }
 {'execute': 'object-add', 'arguments':{ 'qom-type': 'iothread', 'id': 
'iothread1' } }
-{'execute': 'object-add', 'arguments':{ 'qom-type': 'colo-compare', 'id': 
'comp0', 'props': { 'primary_in': 'compare0-0', 'secondary_in': 'compare1', 
'outdev': 'compare_out0', 'iothread': 'iothread1' } } }
+{'execute': 'object-add', 'arguments':{ 'qom-type': 'colo-compare', 'id': 
'comp0', 'primary_in': 'compare0-0', 'secondary_in': 'compare1', 'outdev': 
'compare_out0', 'iothread': 'iothread1' } }
 
 {'execute': 'migrate-set-capabilities', 'arguments':{ 'capabilities': [ 
{'capability': 'x-colo', 'state': true } ] } }
 {'execute': 'migrate', 'arguments':{ 'uri': 'tcp:127.0.0.2:9998' } }
@@ -318,11 +318,11 @@ Wait until disk is synced, then:
 {'execute': 'human-monitor-command', 'arguments':{ 'command-line': 'drive_add 
-n buddy 
driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.1,file.port=,file.export=parent0,node-name=replication0'}}
 {'execute': 'x-blockdev-change', 'arguments':{ 'parent': 'colo-disk0', 'node': 
'replication0' } }
 
-{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-mirror', 'id': 
'm0', 'props': { 'insert': 'before', 'position': 'id=rew0', 'netdev': 'hn0', 
'queue': 'tx', 'outdev': 'mirror0' } } }
-{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 
'redire0', 'props': { 'insert': 'before', 'position': 'id=rew0', 'netdev': 
'hn0', 'queue': 'rx', 'indev': 'compare_out' } } }
-{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 
'redire1', 'props': { 'insert': 'before', 'position': 'id=rew0', 'netdev': 
'hn0', 'queue': 'rx', 'outdev': 'compare0' } } }
+{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-mirror', 'id': 
'm0', 'insert': 'before', 'position': 'id=rew0', 'netdev': 'hn0', 'queue': 
'tx', 'outdev': 'mirror0' } }
+{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 
'redire0', 'insert': 'before', 'position': 'id=rew0', 'netdev': 'hn0', 'queue': 
'rx', 'indev': 'compare_out' } }
+{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 
'redire1', 'insert': 'before', 'position': 'id=rew0', 'netdev': 'hn0', 'queue': 
'rx', 'outdev': 'compare0' } }
 {'execute': 'object-add', 'arguments':{ 'qom-type': 'iothread', 'id': 
'iothread1' } }
-{'execute': 'object-add', 'arguments':{ 'qom-type': 'colo-compare', 'id': 
'comp0', 'props': { 'primary_in': 'compare0-0', 'secondary_in': 'compare1', 
'outdev': 'compare_out0', 'iothread': 'iothread1' } } }
+{'execute': 'object-add', 'arguments':{ 'qom-type': 'colo-compare', 'id': 
'comp0', 'primary_in': 'compare0-0', 'secondary_in': 'compare1', 'outdev': 
'compare_out0', 'iothread': 'iothread1' } }
 
 {'execute': 'migrate-set-capabilities', 'arguments':{ 'capabilities': [ 
{'capability': 'x-colo', 'state': true } ] } }
 {'execute': 'migrate', 'arguments':{ 'uri': 'tcp:127.0.0.1:9998' } }
-- 
1.8.3.1




RE: [PATCH] docs/COLO-FT.txt: Drop deprecated 'props' from object-add in COLO docs

2021-11-18 Thread Rao, Lei
OK, all the files you mentioned will be modified in the next patch.

Thanks,
Lei.

-Original Message-
From: Markus Armbruster  
Sent: Thursday, November 18, 2021 2:50 PM
To: Rao, Lei 
Cc: Zhang, Chen ; zhang.zhanghaili...@huawei.com; 
quint...@redhat.com; dgilb...@redhat.com; qemu-triv...@nongnu.org; 
qemu-devel@nongnu.org
Subject: Re: [PATCH] docs/COLO-FT.txt: Drop deprecated 'props' from object-add 
in COLO docs

"Rao, Lei"  writes:

> From: "Rao, Lei" 
>
> With the removal of deprecated 'props' from object-add in the commit 
> of "50243407457a9fb0ed17b9a9ba9fc9aee09495b1", we also should update 
> COLO's documents.

We should've done this right when we deprecated it.  Not your fault, and better 
late than never.

Recommend to quote commits like

in commit 5024340745 "qapi/qom: Drop deprecated 'props' from
object-add" (v6.0.0)

I have

[alias]
whatis = "!f() { git --no-pager show -s --pretty=\"tformat:%h 
\"%s\" (`git dc $1 | sed 's/~.*//;s/%/%%/g'`, %cd)\" --date=short $1; }; f"

in my ~/.gitconfig, so I can run

$ git whatis 50243407457a9fb0ed17b9a9ba9fc9aee09495b1
5024340745 qapi/qom: Drop deprecated 'props' from object-add (v6.0.0-rc0, 
2021-03-19)

> Signed-off-by: Lei Rao 

Reviewed-by: Markus Armbruster 

Same issue in docs/system/authz.rst, docs/throttle.txt, and 
docs/tools/qemu-nbd.rst.  Care to fix it there as well?

Also...

> ---
>  docs/COLO-FT.txt | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt index 
> 8d6d53a..fd5ffcc 100644
> --- a/docs/COLO-FT.txt
> +++ b/docs/COLO-FT.txt
> @@ -289,11 +289,11 @@ Wait until disk is synced, then:
>  {'execute': 'human-monitor-command', 'arguments':{ 'command-line': 
> 'drive_add -n buddy 
> driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,fi
> le.port=,file.export=parent0,node-name=replication0'}}
>  {'execute': 'x-blockdev-change', 'arguments':{ 'parent': 
> 'colo-disk0', 'node': 'replication0' } }
>  

... I'd like us to use double quotes in examples, not single quotes.
QMP supports single quotes as an extension over JSON.  Best to avoid it.
Separate patch, of course.

Same in docs/block-replication.txt.

[...]




[PATCH 1/2] docs: Drop deprecated 'props' from object-add

2021-11-21 Thread Rao, Lei
From: "Rao, Lei" 

Commit 5024340745 "qapi/qom: Drop deprecated 'props' from
object-add" (v6.0.0) dropped the deprecated 'props' argument; we should
also update the documentation accordingly.

Signed-off-by: Lei Rao 
---
 docs/system/authz.rst   | 26 ++
 docs/throttle.txt   |  8 +++-
 docs/tools/qemu-nbd.rst |  2 +-
 3 files changed, 14 insertions(+), 22 deletions(-)

diff --git a/docs/system/authz.rst b/docs/system/authz.rst
index 942af39..55b7315 100644
--- a/docs/system/authz.rst
+++ b/docs/system/authz.rst
@@ -77,9 +77,7 @@ To create an instance of this driver via QMP:
  "arguments": {
"qom-type": "authz-simple",
"id": "authz0",
-   "props": {
- "identity": "fred"
-   }
+   "identity": "fred"
  }
}
 
@@ -110,15 +108,13 @@ To create an instance of this class via QMP:
  "arguments": {
"qom-type": "authz-list",
"id": "authz0",
-   "props": {
- "rules": [
-{ "match": "fred", "policy": "allow", "format": "exact" },
-{ "match": "bob", "policy": "allow", "format": "exact" },
-{ "match": "danb", "policy": "deny", "format": "exact" },
-{ "match": "dan*", "policy": "allow", "format": "glob" }
- ],
- "policy": "deny"
-   }
+   "rules": [
+  { "match": "fred", "policy": "allow", "format": "exact" },
+  { "match": "bob", "policy": "allow", "format": "exact" },
+  { "match": "danb", "policy": "deny", "format": "exact" },
+  { "match": "dan*", "policy": "allow", "format": "glob" }
+   ],
+   "policy": "deny"
  }
}
 
@@ -143,10 +139,8 @@ To create an instance of this class via QMP:
  "arguments": {
"qom-type": "authz-list-file",
"id": "authz0",
-   "props": {
- "filename": "/etc/qemu/myvm-vnc.acl",
- "refresh": true
-   }
+   "filename": "/etc/qemu/myvm-vnc.acl",
+   "refresh": true
  }
}
 
diff --git a/docs/throttle.txt b/docs/throttle.txt
index b5b78b7..0a0453a 100644
--- a/docs/throttle.txt
+++ b/docs/throttle.txt
@@ -273,11 +273,9 @@ A group can be created using the object-add QMP function:
  "arguments": {
"qom-type": "throttle-group",
"id": "group0",
-   "props": {
- "limits" : {
-   "iops-total": 1000
-   "bps-write": 2097152
- }
+   "limits" : {
+ "iops-total": 1000,
+ "bps-write": 2097152
}
  }
}
diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst
index 56e54cd..726cd189 100644
--- a/docs/tools/qemu-nbd.rst
+++ b/docs/tools/qemu-nbd.rst
@@ -31,7 +31,7 @@ driver options if ``--image-opts`` is specified.
 
 *dev* is an NBD device.
 
-.. option:: --object type,id=ID,...props...
+.. option:: --object type,id=ID,...
 
   Define a new instance of the *type* object class identified by *ID*.
   See the :manpage:`qemu(1)` manual page for full details of the properties
-- 
1.8.3.1




[PATCH 2/2] docs: Use double quotes instead of single quotes for COLO

2021-11-21 Thread Rao, Lei
From: "Rao, Lei" 

Signed-off-by: Lei Rao 
---
 docs/COLO-FT.txt   | 106 ++---
 docs/block-replication.txt |  52 +++---
 2 files changed, 79 insertions(+), 79 deletions(-)

diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt
index fd5ffcc..8ec653f 100644
--- a/docs/COLO-FT.txt
+++ b/docs/COLO-FT.txt
@@ -209,9 +209,9 @@ children.0=childs0 \
 
 
 3. On Secondary VM's QEMU monitor, issue command
-{'execute':'qmp_capabilities'}
-{'execute': 'nbd-server-start', 'arguments': {'addr': {'type': 'inet', 'data': 
{'host': '0.0.0.0', 'port': ''} } } }
-{'execute': 'nbd-server-add', 'arguments': {'device': 'parent0', 'writable': 
true } }
+{"execute":"qmp_capabilities"}
+{"execute": "nbd-server-start", "arguments": {"addr": {"type": "inet", "data": 
{"host": "0.0.0.0", "port": ""} } } }
+{"execute": "nbd-server-add", "arguments": {"device": "parent0", "writable": 
true } }
 
 Note:
   a. The qmp command nbd-server-start and nbd-server-add must be run
@@ -222,11 +222,11 @@ Note:
  will be merged into the parent disk on failover.
 
 4. On Primary VM's QEMU monitor, issue command:
-{'execute':'qmp_capabilities'}
-{'execute': 'human-monitor-command', 'arguments': {'command-line': 'drive_add 
-n buddy 
driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=,file.export=parent0,node-name=replication0'}}
-{'execute': 'x-blockdev-change', 'arguments':{'parent': 'colo-disk0', 'node': 
'replication0' } }
-{'execute': 'migrate-set-capabilities', 'arguments': {'capabilities': [ 
{'capability': 'x-colo', 'state': true } ] } }
-{'execute': 'migrate', 'arguments': {'uri': 'tcp:127.0.0.2:9998' } }
+{"execute":"qmp_capabilities"}
+{"execute": "human-monitor-command", "arguments": {"command-line": "drive_add 
-n buddy 
driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=,file.export=parent0,node-name=replication0"}}
+{"execute": "x-blockdev-change", "arguments":{"parent": "colo-disk0", "node": 
"replication0" } }
+{"execute": "migrate-set-capabilities", "arguments": {"capabilities": [ 
{"capability": "x-colo", "state": true } ] } }
+{"execute": "migrate", "arguments": {"uri": "tcp:127.0.0.2:9998" } }
 
   Note:
   a. There should be only one NBD Client for each primary disk.
@@ -249,59 +249,59 @@ if you want to resume the replication, follow "Secondary 
resume replication"
 == Primary Failover ==
 The Secondary died, resume on the Primary
 
-{'execute': 'x-blockdev-change', 'arguments':{ 'parent': 'colo-disk0', 
'child': 'children.1'} }
-{'execute': 'human-monitor-command', 'arguments':{ 'command-line': 'drive_del 
replication0' } }
-{'execute': 'object-del', 'arguments':{ 'id': 'comp0' } }
-{'execute': 'object-del', 'arguments':{ 'id': 'iothread1' } }
-{'execute': 'object-del', 'arguments':{ 'id': 'm0' } }
-{'execute': 'object-del', 'arguments':{ 'id': 'redire0' } }
-{'execute': 'object-del', 'arguments':{ 'id': 'redire1' } }
-{'execute': 'x-colo-lost-heartbeat' }
+{"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", 
"child": "children.1"} }
+{"execute": "human-monitor-command", "arguments":{ "command-line": "drive_del 
replication0" } }
+{"execute": "object-del", "arguments":{ "id": "comp0" } }
+{"execute": "object-del", "arguments":{ "id": "iothread1" } }
+{"execute": "object-del", "arguments":{ "id": "m0" } }
+{"execu

[PATCH v2 1/2] docs: Drop deprecated 'props' from object-add

2021-11-21 Thread Rao, Lei
From: "Rao, Lei" 

Commit 5024340745 "qapi/qom: Drop deprecated 'props' from
object-add" (v6.0.0) dropped the deprecated 'props' argument; we should
also update the documentation accordingly.

Signed-off-by: Lei Rao 
---
 docs/COLO-FT.txt| 16 
 docs/system/authz.rst   | 26 ++
 docs/throttle.txt   |  8 +++-
 docs/tools/qemu-nbd.rst |  2 +-
 4 files changed, 22 insertions(+), 30 deletions(-)

diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt
index 8d6d53a..fd5ffcc 100644
--- a/docs/COLO-FT.txt
+++ b/docs/COLO-FT.txt
@@ -289,11 +289,11 @@ Wait until disk is synced, then:
 {'execute': 'human-monitor-command', 'arguments':{ 'command-line': 'drive_add 
-n buddy 
driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=,file.export=parent0,node-name=replication0'}}
 {'execute': 'x-blockdev-change', 'arguments':{ 'parent': 'colo-disk0', 'node': 
'replication0' } }
 
-{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-mirror', 'id': 
'm0', 'props': { 'netdev': 'hn0', 'queue': 'tx', 'outdev': 'mirror0' } } }
-{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 
'redire0', 'props': { 'netdev': 'hn0', 'queue': 'rx', 'indev': 'compare_out' } 
} }
-{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 
'redire1', 'props': { 'netdev': 'hn0', 'queue': 'rx', 'outdev': 'compare0' } } }
+{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-mirror', 'id': 
'm0', 'netdev': 'hn0', 'queue': 'tx', 'outdev': 'mirror0' } }
+{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 
'redire0', 'netdev': 'hn0', 'queue': 'rx', 'indev': 'compare_out' } }
+{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 
'redire1', 'netdev': 'hn0', 'queue': 'rx', 'outdev': 'compare0' } }
 {'execute': 'object-add', 'arguments':{ 'qom-type': 'iothread', 'id': 
'iothread1' } }
-{'execute': 'object-add', 'arguments':{ 'qom-type': 'colo-compare', 'id': 
'comp0', 'props': { 'primary_in': 'compare0-0', 'secondary_in': 'compare1', 
'outdev': 'compare_out0', 'iothread': 'iothread1' } } }
+{'execute': 'object-add', 'arguments':{ 'qom-type': 'colo-compare', 'id': 
'comp0', 'primary_in': 'compare0-0', 'secondary_in': 'compare1', 'outdev': 
'compare_out0', 'iothread': 'iothread1' } }
 
 {'execute': 'migrate-set-capabilities', 'arguments':{ 'capabilities': [ 
{'capability': 'x-colo', 'state': true } ] } }
 {'execute': 'migrate', 'arguments':{ 'uri': 'tcp:127.0.0.2:9998' } }
@@ -318,11 +318,11 @@ Wait until disk is synced, then:
 {'execute': 'human-monitor-command', 'arguments':{ 'command-line': 'drive_add 
-n buddy 
driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.1,file.port=,file.export=parent0,node-name=replication0'}}
 {'execute': 'x-blockdev-change', 'arguments':{ 'parent': 'colo-disk0', 'node': 
'replication0' } }
 
-{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-mirror', 'id': 
'm0', 'props': { 'insert': 'before', 'position': 'id=rew0', 'netdev': 'hn0', 
'queue': 'tx', 'outdev': 'mirror0' } } }
-{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-redirector', 'id': 
'redire0', '

[PATCH v2 2/2] docs: Use double quotes instead of single quotes for COLO

2021-11-21 Thread Rao, Lei
From: "Rao, Lei" 

Signed-off-by: Lei Rao 
---
 docs/COLO-FT.txt   | 106 ++---
 docs/block-replication.txt |  52 +++---
 2 files changed, 79 insertions(+), 79 deletions(-)

diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt
index fd5ffcc..8ec653f 100644
--- a/docs/COLO-FT.txt
+++ b/docs/COLO-FT.txt
@@ -209,9 +209,9 @@ children.0=childs0 \
 
 
 3. On Secondary VM's QEMU monitor, issue command
-{'execute':'qmp_capabilities'}
-{'execute': 'nbd-server-start', 'arguments': {'addr': {'type': 'inet', 'data': 
{'host': '0.0.0.0', 'port': ''} } } }
-{'execute': 'nbd-server-add', 'arguments': {'device': 'parent0', 'writable': 
true } }
+{"execute":"qmp_capabilities"}
+{"execute": "nbd-server-start", "arguments": {"addr": {"type": "inet", "data": 
{"host": "0.0.0.0", "port": ""} } } }
+{"execute": "nbd-server-add", "arguments": {"device": "parent0", "writable": 
true } }
 
 Note:
   a. The qmp command nbd-server-start and nbd-server-add must be run
@@ -222,11 +222,11 @@ Note:
  will be merged into the parent disk on failover.
 
 4. On Primary VM's QEMU monitor, issue command:
-{'execute':'qmp_capabilities'}
-{'execute': 'human-monitor-command', 'arguments': {'command-line': 'drive_add 
-n buddy 
driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=,file.export=parent0,node-name=replication0'}}
-{'execute': 'x-blockdev-change', 'arguments':{'parent': 'colo-disk0', 'node': 
'replication0' } }
-{'execute': 'migrate-set-capabilities', 'arguments': {'capabilities': [ 
{'capability': 'x-colo', 'state': true } ] } }
-{'execute': 'migrate', 'arguments': {'uri': 'tcp:127.0.0.2:9998' } }
+{"execute":"qmp_capabilities"}
+{"execute": "human-monitor-command", "arguments": {"command-line": "drive_add 
-n buddy 
driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=,file.export=parent0,node-name=replication0"}}
+{"execute": "x-blockdev-change", "arguments":{"parent": "colo-disk0", "node": 
"replication0" } }
+{"execute": "migrate-set-capabilities", "arguments": {"capabilities": [ 
{"capability": "x-colo", "state": true } ] } }
+{"execute": "migrate", "arguments": {"uri": "tcp:127.0.0.2:9998" } }
 
   Note:
   a. There should be only one NBD Client for each primary disk.
@@ -249,59 +249,59 @@ if you want to resume the replication, follow "Secondary 
resume replication"
 == Primary Failover ==
 The Secondary died, resume on the Primary
 
-{'execute': 'x-blockdev-change', 'arguments':{ 'parent': 'colo-disk0', 
'child': 'children.1'} }
-{'execute': 'human-monitor-command', 'arguments':{ 'command-line': 'drive_del 
replication0' } }
-{'execute': 'object-del', 'arguments':{ 'id': 'comp0' } }
-{'execute': 'object-del', 'arguments':{ 'id': 'iothread1' } }
-{'execute': 'object-del', 'arguments':{ 'id': 'm0' } }
-{'execute': 'object-del', 'arguments':{ 'id': 'redire0' } }
-{'execute': 'object-del', 'arguments':{ 'id': 'redire1' } }
-{'execute': 'x-colo-lost-heartbeat' }
+{"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", 
"child": "children.1"} }
+{"execute": "human-monitor-command", "arguments":{ "command-line": "drive_del 
replication0" } }
+{"execute": "object-del", "arguments":{ "id": "comp0" } }
+{"execute": "object-del", "arguments":{ "id": "iothread1" } }
+{"execute": "object-del", "arguments":{ "id": "m0" } }
+{"execu

[PATCH] block/nbd.c: Fixed IO request coroutine not being wakeup when kill NBD server.

2022-03-02 Thread Rao Lei
rite_quorum_entry (opaque=0x7f2fddaf8c50) at 
../block/quorum.c:699
16 0x55575c2ee4db in coroutine_trampoline (i0=1578543808, i1=21847) at 
../util/coroutine-ucontext.c:173
17 0x7f2ff9855660 in __start_context () at 
../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91

When we do failover in COLO mode, QEMU will hang while it is waiting for the
in-flight IO.
From the call trace, we can see that the IO request coroutine waiting for
send_mutex has yielded in nbd_co_send_request(). When we kill the NBD server,
it will never be woken up.
So, it is necessary to wake up the coroutine in nbd_channel_error().
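
For reference, the waiting side looks roughly like this (a simplified sketch
of nbd_co_send_request(), not the exact block/nbd.c code):

/*
 * Simplified sketch: the request coroutine queues on s->free_sema while
 * the in-flight limit is reached.  If nbd_channel_error() never restarts
 * the queue, this coroutine sleeps forever once the server is killed.
 */
static coroutine_fn int nbd_co_send_request_sketch(BDRVNBDState *s)
{
    qemu_co_mutex_lock(&s->send_mutex);
    while (s->in_flight == MAX_NBD_REQUESTS) {
        qemu_co_queue_wait(&s->free_sema, &s->send_mutex);
    }
    s->in_flight++;
    qemu_co_mutex_unlock(&s->send_mutex);
    return 0;
}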

Signed-off-by: Rao Lei 
---
 block/nbd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/nbd.c b/block/nbd.c
index 5853d85d60..cf9dda537c 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -167,6 +167,7 @@ static void nbd_channel_error(BDRVNBDState *s, int ret)
 s->state = NBD_CLIENT_QUIT;
 }
 
+qemu_co_queue_restart_all(&s->free_sema);
 nbd_recv_coroutines_wake(s, true);
 }
 
-- 
2.32.0




Re: [PATCH] block/nbd.c: Fixed IO request coroutine not being wakeup when kill NBD server.

2022-03-03 Thread Rao, Lei




On 3/3/2022 5:25 PM, Vladimir Sementsov-Ogievskiy wrote:

03.03.2022 05:21, Rao Lei wrote:

During the stress test, the IO request coroutine has a probability that it
can't be awakened when the NBD server is killed.

The GDB statck is as follows:
(gdb) bt
0  0x7f2ff990cbf6 in __ppoll (fds=0x55575de85000, nfds=1, timeout=, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:44
1  0x55575c302e7c in qemu_poll_ns (fds=0x55575de85000, nfds=1, 
timeout=59603140) at ../util/qemu-timer.c:348
2  0x55575c2d3c34 in fdmon_poll_wait (ctx=0x55575dc480f0, 
ready_list=0x7ffd9dd1dae0, timeout=59603140) at ../util/fdmon-poll.c:80
3  0x55575c2d350d in aio_poll (ctx=0x55575dc480f0, blocking=true) at 
../util/aio-posix.c:655
4  0x55575c16eabd in bdrv_do_drained_begin (bs=0x55575dee7fe0, 
recursive=false, parent=0x0, ignore_bds_parents=false, poll=true) at 
../block/io.c:474
5  0x55575c16eba6 in bdrv_drained_begin (bs=0x55575dee7fe0) at 
../block/io.c:480
6  0x55575c1aff33 in quorum_del_child (bs=0x55575dee7fe0, 
child=0x55575dcea690, errp=0x7ffd9dd1dd08) at ../block/quorum.c:1130
7  0x55575c14239b in bdrv_del_child (parent_bs=0x55575dee7fe0, 
child=0x55575dcea690, errp=0x7ffd9dd1dd08) at ../block.c:7705
8  0x55575c12da28 in qmp_x_blockdev_change
 (parent=0x55575df404c0 "colo-disk0", has_child=true, child=0x55575de867f0 
"children.1", has_node=false, node=0x0, errp=0x7ffd9dd1dd08)
 at ../blockdev.c:3676
9  0x55575c258435 in qmp_marshal_x_blockdev_change (args=0x7f2fec008190, 
ret=0x7f2ff7b0bd98, errp=0x7f2ff7b0bd90) at qapi/qapi-commands-block-core.c:1675
10 0x55575c2c6201 in do_qmp_dispatch_bh (opaque=0x7f2ff7b0be30) at 
../qapi/qmp-dispatch.c:129
11 0x55575c2ebb1c in aio_bh_call (bh=0x55575dc429c0) at ../util/async.c:141
12 0x55575c2ebc2a in aio_bh_poll (ctx=0x55575dc480f0) at ../util/async.c:169
13 0x55575c2d2d96 in aio_dispatch (ctx=0x55575dc480f0) at 
../util/aio-posix.c:415
14 0x55575c2ec07f in aio_ctx_dispatch (source=0x55575dc480f0, callback=0x0, 
user_data=0x0) at ../util/async.c:311
15 0x7f2ff9e7cfbd in g_main_context_dispatch () at 
/lib/x86_64-linux-gnu/libglib-2.0.so.0
16 0x55575c2fd581 in glib_pollfds_poll () at ../util/main-loop.c:232
17 0x55575c2fd5ff in os_host_main_loop_wait (timeout=0) at 
../util/main-loop.c:255
18 0x55575c2fd710 in main_loop_wait (nonblocking=0) at 
../util/main-loop.c:531
19 0x55575bfa7588 in qemu_main_loop () at ../softmmu/runstate.c:726
20 0x55575bbee57a in main (argc=60, argv=0x7ffd9dd1e0e8, 
envp=0x7ffd9dd1e2d0) at ../softmmu/main.c:50

(gdb) qemu coroutine 0x55575e16aac0
0  0x55575c2ee7dc in qemu_coroutine_switch (from_=0x55575e16aac0, 
to_=0x7f2ff830fba0, action=COROUTINE_YIELD) at ../util/coroutine-ucontext.c:302
1  0x55575c2fe2a9 in qemu_coroutine_yield () at ../util/qemu-coroutine.c:195
2  0x55575c2fe93c in qemu_co_queue_wait_impl (queue=0x55575dc46170, 
lock=0x7f2b32ad9850) at ../util/qemu-coroutine-lock.c:56
3  0x55575c17ddfb in nbd_co_send_request (bs=0x55575ebfaf20, 
request=0x7f2b32ad9920, qiov=0x55575dfc15d8) at ../block/nbd.c:478
4  0x55575c17f931 in nbd_co_request (bs=0x55575ebfaf20, 
request=0x7f2b32ad9920, write_qiov=0x55575dfc15d8) at ../block/nbd.c:1182
5  0x55575c17fe14 in nbd_client_co_pwritev (bs=0x55575ebfaf20, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, flags=0) at 
../block/nbd.c:1284
6  0x55575c170d25 in bdrv_driver_pwritev (bs=0x55575ebfaf20, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, qiov_offset=0, flags=0)
 at ../block/io.c:1264
7  0x55575c1733b4 in bdrv_aligned_pwritev
 (child=0x55575dff6890, req=0x7f2b32ad9ad0, offset=403487858688, 
bytes=4538368, align=1, qiov=0x55575dfc15d8, qiov_offset=0, flags=0) at 
../block/io.c:2126
8  0x55575c173c67 in bdrv_co_pwritev_part (child=0x55575dff6890, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, qiov_offset=0, flags=0)
 at ../block/io.c:2314
9  0x55575c17391b in bdrv_co_pwritev (child=0x55575dff6890, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, flags=0) at 
../block/io.c:2233
10 0x55575c1ee506 in replication_co_writev (bs=0x55575e9824f0, 
sector_num=788062224, remaining_sectors=8864, qiov=0x55575dfc15d8, flags=0)
 at ../block/replication.c:270
11 0x55575c170eed in bdrv_driver_pwritev (bs=0x55575e9824f0, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, qiov_offset=0, flags=0)
 at ../block/io.c:1297
12 0x55575c1733b4 in bdrv_aligned_pwritev
 (child=0x55575dcea690, req=0x7f2b32ad9e00, offset=403487858688, 
bytes=4538368, align=512, qiov=0x55575dfc15d8, qiov_offset=0, flags=0)
 at ../block/io.c:2126
13 0x55575c173c67 in bdrv_co_pwritev_part (child=0x55575dcea690, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, qiov_offset=0, flags=0)
 at ../block/io.c:2314
14 0x55575c17391b in bdrv_co_pwritev (child=0x55575dcea690, 
offset=40348785

[PATCH v2 2/7] Fixed qemu crash when guest power off in COLO mode

2021-11-01 Thread Rao, Lei
From: "Rao, Lei" 

This patch fixes the following:
qemu-system-x86_64: invalid runstate transition: 'shutdown' -> 'running'
Aborted (core dumped)
The gdb backtrace is as follows:
0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
1  0x7faa3d613859 in __GI_abort () at abort.c:79
2  0x55c5a21268fd in runstate_set (new_state=RUN_STATE_RUNNING) at vl.c:723
3  0x55c5a1f8cae4 in vm_prepare_start () at 
/home/workspace/colo-qemu/cpus.c:2206
4  0x55c5a1f8cb1b in vm_start () at /home/workspace/colo-qemu/cpus.c:2213
5  0x55c5a2332bba in migration_iteration_finish (s=0x55c5a4658810) at 
migration/migration.c:3376
6  0x55c5a2332f3b in migration_thread (opaque=0x55c5a4658810) at 
migration/migration.c:3527
7  0x55c5a251d68a in qemu_thread_start (args=0x55c5a5491a70) at 
util/qemu-thread-posix.c:519
8  0x7faa3d7e9609 in start_thread (arg=) at 
pthread_create.c:477
9  0x7faa3d710293 in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Signed-off-by: Lei Rao 
---
 migration/migration.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index 9172686b89..e5b38a492f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3616,7 +3616,9 @@ static void migration_iteration_finish(MigrationState *s)
 case MIGRATION_STATUS_CANCELLED:
 case MIGRATION_STATUS_CANCELLING:
 if (s->vm_was_running) {
-vm_start();
+if (!runstate_check(RUN_STATE_SHUTDOWN)) {
+vm_start();
+}
 } else {
 if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
 runstate_set(RUN_STATE_POSTMIGRATE);
-- 
2.30.2




[PATCH v2 1/7] Some minor optimizations for COLO

2021-11-01 Thread Rao, Lei
From: "Rao, Lei" 

Signed-off-by: Lei Rao 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/colo.c   | 2 +-
 net/colo-compare.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 79fa1f6619..616dc00af7 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -152,7 +152,7 @@ static void primary_vm_do_failover(void)
  * kick COLO thread which might wait at
  * qemu_sem_wait(&s->colo_checkpoint_sem).
  */
-colo_checkpoint_notify(migrate_get_current());
+colo_checkpoint_notify(s);
 
 /*
  * Wake up COLO thread which may blocked in recv() or send(),
diff --git a/net/colo-compare.c b/net/colo-compare.c
index b100e7b51f..4a64a5d386 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -170,7 +170,7 @@ static bool packet_matches_str(const char *str,
 return false;
 }
 
-return !memcmp(str, buf, strlen(str));
+return !memcmp(str, buf, packet_len);
 }
 
 static void notify_remote_frame(CompareState *s)
-- 
2.30.2




[PATCH v2 0/7] Fixed some bugs and optimized some codes for COLO

2021-11-01 Thread Rao, Lei
Changes since v1:
--Move the s->rp_state.from_dst_file = NULL to after qemu_fclose().

The series of patches include:
Fixed some bugs of qemu crash and segment fault.
Optimized the function of fill_connection_key.
Remove some unnecessary code to improve COLO.

Rao, Lei (7):
  Some minor optimizations for COLO
  Fixed qemu crash when guest power off in COLO mode
  Fixed SVM hang when do failover before PVM crash
  colo: fixed 'Segmentation fault' when the simplex mode PVM poweroff
  Removed the qemu_fclose() in colo_process_incoming_thread
  Changed the last-mode to none of first start COLO
  Optimized the function of fill_connection_key.

 migration/colo.c  | 16 +---
 migration/migration.c |  6 +-
 net/colo-compare.c|  4 ++--
 net/colo.c| 31 ---
 net/colo.h|  6 +++---
 net/filter-rewriter.c | 10 +-
 6 files changed, 28 insertions(+), 45 deletions(-)

-- 
2.30.2




[PATCH v2 4/7] colo: fixed 'Segmentation fault' when the simplex mode PVM poweroff

2021-11-01 Thread Rao, Lei
From: "Rao, Lei" 

The GDB stack is as follows:
Program terminated with signal SIGSEGV, Segmentation fault.
0  object_class_dynamic_cast (class=0x55c8f5d2bf50, typename=0x55c8f2f7379e 
"qio-channel") at qom/object.c:832
 if (type->class->interfaces &&
[Current thread is 1 (Thread 0x7f756e97eb00 (LWP 1811577))]
(gdb) bt
0  object_class_dynamic_cast (class=0x55c8f5d2bf50, typename=0x55c8f2f7379e 
"qio-channel") at qom/object.c:832
1  0x55c8f2c3dd14 in object_dynamic_cast (obj=0x55c8f543ac00, 
typename=0x55c8f2f7379e "qio-channel") at qom/object.c:763
2  0x55c8f2c3ddce in object_dynamic_cast_assert (obj=0x55c8f543ac00, 
typename=0x55c8f2f7379e "qio-channel",
file=0x55c8f2f73780 "migration/qemu-file-channel.c", line=117, 
func=0x55c8f2f73800 <__func__.18724> "channel_shutdown") at qom/object.c:786
3  0x55c8f2bbc6ac in channel_shutdown (opaque=0x55c8f543ac00, rd=true, 
wr=true, errp=0x0) at migration/qemu-file-channel.c:117
4  0x55c8f2bba56e in qemu_file_shutdown (f=0x7f7558070f50) at 
migration/qemu-file.c:67
5  0x55c8f2ba5373 in migrate_fd_cancel (s=0x55c8f4ccf3f0) at 
migration/migration.c:1699
6  0x55c8f2ba1992 in migration_shutdown () at migration/migration.c:187
7  0x55c8f29a5b77 in main (argc=69, argv=0x7fff3e9e8c08, 
envp=0x7fff3e9e8e38) at vl.c:4512

The root cause is that we still try to shut down the from_dst_file in
migrate_fd_cancel() after it was closed by qemu_fclose() in
colo_process_checkpoint(). So, we should set s->rp_state.from_dst_file
= NULL after qemu_fclose().
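
For context, migrate_fd_cancel() does roughly the following (a simplified
sketch, not the exact migration.c code), which is why a stale pointer leads
to the use-after-free above:

static void migrate_fd_cancel_sketch(MigrationState *s)
{
    /*
     * If COLO already closed from_dst_file but left the pointer set,
     * this shuts down a freed QEMUFile and crashes.  Clearing the
     * pointer right after qemu_fclose() makes this check safe.
     */
    if (s->rp_state.from_dst_file) {
        qemu_file_shutdown(s->rp_state.from_dst_file);
    }
}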

Signed-off-by: Lei Rao 
---
 migration/colo.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/colo.c b/migration/colo.c
index 616dc00af7..907241ab5c 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -640,6 +640,7 @@ out:
  */
 if (s->rp_state.from_dst_file) {
 qemu_fclose(s->rp_state.from_dst_file);
+s->rp_state.from_dst_file = NULL;
 }
 }
 
-- 
2.30.2




[PATCH v2 5/7] Removed the qemu_fclose() in colo_process_incoming_thread

2021-11-01 Thread Rao, Lei
From: "Rao, Lei" 

After the live migration, the related fd will be cleaned up in
migration_incoming_state_destroy(). So, the qemu_fclose()
in colo_process_incoming_thread is not necessary.
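
For reference, the destination-side teardown already does roughly the
following (a simplified sketch of migration_incoming_state_destroy(), not
the exact code), which is why the extra qemu_fclose() in the COLO thread is
redundant:

static void migration_incoming_state_destroy_sketch(MigrationIncomingState *mis)
{
    if (mis->to_src_file) {
        qemu_fclose(mis->to_src_file);
        mis->to_src_file = NULL;
    }
    /* ... the rest of the incoming-state cleanup is elided ... */
}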

Signed-off-by: Lei Rao 
---
 migration/colo.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 907241ab5c..71fc82a040 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -919,11 +919,6 @@ out:
 /* Hope this not to be too long to loop here */
 qemu_sem_wait(&mis->colo_incoming_sem);
 qemu_sem_destroy(&mis->colo_incoming_sem);
-/* Must be called after failover BH is completed */
-if (mis->to_src_file) {
-qemu_fclose(mis->to_src_file);
-mis->to_src_file = NULL;
-}
 
 rcu_unregister_thread();
 return NULL;
-- 
2.30.2




[PATCH v2 3/7] Fixed SVM hang when do failover before PVM crash

2021-11-01 Thread Rao, Lei
From: "Rao, Lei" 

This patch fixes the following hang:
Thread 1 (Thread 0x7f34ee738d80 (LWP 11212)):
#0 __pthread_clockjoin_ex (threadid=139847152957184, 
thread_return=0x7f30b1febf30, clockid=, abstime=, 
block=) at pthread_join_common.c:145
#1 0x563401998e36 in qemu_thread_join (thread=0x563402d66610) at 
util/qemu-thread-posix.c:587
#2 0x5634017a79fa in process_incoming_migration_co (opaque=0x0) at 
migration/migration.c:502
#3 0x5634019b59c9 in coroutine_trampoline (i0=63395504, i1=22068) at 
util/coroutine-ucontext.c:115
#4 0x7f34ef860660 in ?? () at 
../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91 from 
/lib/x86_64-linux-gnu/libc.so.6
#5 0x7f30b21ee730 in ?? ()
#6 0x in ?? ()

Thread 13 (Thread 0x7f30b3dff700 (LWP 11747)):
#0  __lll_lock_wait (futex=futex@entry=0x56340218ffa0 , 
private=0) at lowlevellock.c:52
#1  0x7f34efa000a3 in _GI__pthread_mutex_lock (mutex=0x56340218ffa0 
) at ../nptl/pthread_mutex_lock.c:80
#2  0x563401997f99 in qemu_mutex_lock_impl (mutex=0x56340218ffa0 
, file=0x563401b7a80e "migration/colo.c", line=806) at 
util/qemu-thread-posix.c:78
#3  0x563401407144 in qemu_mutex_lock_iothread_impl 
(file=0x563401b7a80e "migration/colo.c", line=806) at 
/home/workspace/colo-qemu/cpus.c:1899
#4  0x5634017ba8e8 in colo_process_incoming_thread 
(opaque=0x563402d664c0) at migration/colo.c:806
#5  0x563401998b72 in qemu_thread_start (args=0x5634039f8370) at 
util/qemu-thread-posix.c:519
#6  0x7f34ef9fd609 in start_thread (arg=) at 
pthread_create.c:477
#7  0x7f34ef924293 in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The QEMU main thread is holding the lock:
(gdb) p qemu_global_mutex
$1 = {lock = {_data = {lock = 2, __count = 0, __owner = 11212, __nusers = 
9, __kind = 0, __spins = 0, __elision = 0, __list = {_prev = 0x0, __next = 
0x0}},
 __size = "\002\000\000\000\000\000\000\000\314+\000\000\t", '\000' 
, __align = 2}, file = 0x563401c07e4b "util/main-loop.c", 
line = 240,
initialized = true}

From the call trace, we can see this is a deadlock bug: the QEMU main thread
holds the global mutex and waits until the COLO thread ends, while the COLO
thread wants to acquire the global mutex, which causes a deadlock. So, we
should release the qemu_global_mutex before waiting for the COLO thread to end.
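
Reduced to its essentials, the problematic pattern looks like this (a
simplified sketch, not the exact process_incoming_migration_co() code):

/*
 * Deadlock sketch: the main thread holds the iothread lock (BQL) and
 * joins the COLO thread, while the COLO thread is blocked in
 * qemu_mutex_lock_iothread(), so the join never returns.
 */
static void join_colo_thread_sketch(MigrationIncomingState *mis)
{
    /* BQL is held here. */
    qemu_thread_join(&mis->colo_incoming_thread);
}

Dropping the lock around the join, as the patch below does, breaks the cycle.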

Signed-off-by: Lei Rao 
Reviewed-by: Li Zhijian 
---
 migration/migration.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index e5b38a492f..811097c423 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -587,8 +587,10 @@ static void process_incoming_migration_co(void *opaque)
 mis->have_colo_incoming_thread = true;
 qemu_coroutine_yield();
 
+qemu_mutex_unlock_iothread();
 /* Wait checkpoint incoming thread exit before free resource */
 qemu_thread_join(&mis->colo_incoming_thread);
+qemu_mutex_lock_iothread();
 /* We hold the global iothread lock, so it is safe here */
 colo_release_ram_cache();
 }
-- 
2.30.2




[PATCH v2 6/7] Changed the last-mode to none of first start COLO

2021-11-01 Thread Rao, Lei
From: "Rao, Lei" 

When we first start COLO, the last-mode is as follows:
{ "execute": "query-colo-status" }
{"return": {"last-mode": "primary", "mode": "primary", "reason": "none"}}

This last-mode is unreasonable. After this patch, it will be changed to the
following:
{ "execute": "query-colo-status" }
{"return": {"last-mode": "none", "mode": "primary", "reason": "none"}}

Signed-off-by: Lei Rao 
---
 migration/colo.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 71fc82a040..e3b1f136f4 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -205,7 +205,7 @@ void colo_do_failover(void)
 vm_stop_force_state(RUN_STATE_COLO);
 }
 
-switch (get_colo_mode()) {
+switch (last_colo_mode = get_colo_mode()) {
 case COLO_MODE_PRIMARY:
 primary_vm_do_failover();
 break;
@@ -530,8 +530,7 @@ static void colo_process_checkpoint(MigrationState *s)
 Error *local_err = NULL;
 int ret;
 
-last_colo_mode = get_colo_mode();
-if (last_colo_mode != COLO_MODE_PRIMARY) {
+if (get_colo_mode() != COLO_MODE_PRIMARY) {
 error_report("COLO mode must be COLO_MODE_PRIMARY");
 return;
 }
@@ -830,8 +829,7 @@ void *colo_process_incoming_thread(void *opaque)
 migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
   MIGRATION_STATUS_COLO);
 
-last_colo_mode = get_colo_mode();
-if (last_colo_mode != COLO_MODE_SECONDARY) {
+if (get_colo_mode() != COLO_MODE_SECONDARY) {
 error_report("COLO mode must be COLO_MODE_SECONDARY");
 return NULL;
 }
-- 
2.30.2




[PATCH v2 7/7] Optimized the function of fill_connection_key.

2021-11-01 Thread Rao, Lei
From: "Rao, Lei" 

Remove some unnecessary code to improve the performance of
the filter-rewriter module.

Signed-off-by: Lei Rao 
Reviewed-by: Zhang Chen 
---
 net/colo-compare.c|  2 +-
 net/colo.c| 31 ---
 net/colo.h|  6 +++---
 net/filter-rewriter.c | 10 +-
 4 files changed, 17 insertions(+), 32 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 4a64a5d386..6a1354dcb1 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -264,7 +264,7 @@ static int packet_enqueue(CompareState *s, int mode, 
Connection **con)
 pkt = NULL;
 return -1;
 }
-fill_connection_key(pkt, &key);
+fill_connection_key(pkt, &key, 0);
 
 conn = connection_get(s->connection_track_table,
   &key,
diff --git a/net/colo.c b/net/colo.c
index 3a3e6e89a0..5e7232ce47 100644
--- a/net/colo.c
+++ b/net/colo.c
@@ -83,19 +83,26 @@ int parse_packet_early(Packet *pkt)
 return 0;
 }
 
-void extract_ip_and_port(uint32_t tmp_ports, ConnectionKey *key, Packet *pkt)
+void extract_ip_and_port(uint32_t tmp_ports, ConnectionKey *key,
+ Packet *pkt, int reverse)
 {
+if (reverse) {
+key->src = pkt->ip->ip_dst;
+key->dst = pkt->ip->ip_src;
+key->src_port = ntohs(tmp_ports & 0xffff);
+key->dst_port = ntohs(tmp_ports >> 16);
+} else {
 key->src = pkt->ip->ip_src;
 key->dst = pkt->ip->ip_dst;
 key->src_port = ntohs(tmp_ports >> 16);
 key->dst_port = ntohs(tmp_ports & 0xffff);
+}
 }
 
-void fill_connection_key(Packet *pkt, ConnectionKey *key)
+void fill_connection_key(Packet *pkt, ConnectionKey *key, int reverse)
 {
-uint32_t tmp_ports;
+uint32_t tmp_ports = 0;
 
-memset(key, 0, sizeof(*key));
 key->ip_proto = pkt->ip->ip_p;
 
 switch (key->ip_proto) {
@@ -106,29 +113,15 @@ void fill_connection_key(Packet *pkt, ConnectionKey *key)
 case IPPROTO_SCTP:
 case IPPROTO_UDPLITE:
 tmp_ports = *(uint32_t *)(pkt->transport_header);
-extract_ip_and_port(tmp_ports, key, pkt);
 break;
 case IPPROTO_AH:
 tmp_ports = *(uint32_t *)(pkt->transport_header + 4);
-extract_ip_and_port(tmp_ports, key, pkt);
 break;
 default:
 break;
 }
-}
-
-void reverse_connection_key(ConnectionKey *key)
-{
-struct in_addr tmp_ip;
-uint16_t tmp_port;
-
-tmp_ip = key->src;
-key->src = key->dst;
-key->dst = tmp_ip;
 
-tmp_port = key->src_port;
-key->src_port = key->dst_port;
-key->dst_port = tmp_port;
+extract_ip_and_port(tmp_ports, key, pkt, reverse);
 }
 
 Connection *connection_new(ConnectionKey *key)
diff --git a/net/colo.h b/net/colo.h
index d91cd245c4..5f4d502df3 100644
--- a/net/colo.h
+++ b/net/colo.h
@@ -89,9 +89,9 @@ typedef struct Connection {
 uint32_t connection_key_hash(const void *opaque);
 int connection_key_equal(const void *opaque1, const void *opaque2);
 int parse_packet_early(Packet *pkt);
-void extract_ip_and_port(uint32_t tmp_ports, ConnectionKey *key, Packet *pkt);
-void fill_connection_key(Packet *pkt, ConnectionKey *key);
-void reverse_connection_key(ConnectionKey *key);
+void extract_ip_and_port(uint32_t tmp_ports, ConnectionKey *key,
+ Packet *pkt, int reverse);
+void fill_connection_key(Packet *pkt, ConnectionKey *key, int reverse);
 Connection *connection_new(ConnectionKey *key);
 void connection_destroy(void *opaque);
 Connection *connection_get(GHashTable *connection_track_table,
diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
index cb3a96cde1..bf05023dc3 100644
--- a/net/filter-rewriter.c
+++ b/net/filter-rewriter.c
@@ -279,15 +279,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState 
*nf,
  */
 if (pkt && is_tcp_packet(pkt)) {
 
-fill_connection_key(pkt, &key);
-
-if (sender == nf->netdev) {
-/*
- * We need make tcp TX and RX packet
- * into one connection.
- */
-reverse_connection_key(&key);
-}
+fill_connection_key(pkt, &key, sender == nf->netdev);
 
 /* After failover we needn't change new TCP packet */
 if (s->failover_mode &&
-- 
2.30.2




RE: [PATCH v2 7/7] Optimized the function of fill_connection_key.

2021-11-02 Thread Rao, Lei
Will be changed and sent separately.

Thanks,
Lei

-Original Message-
From: Juan Quintela  
Sent: Wednesday, November 3, 2021 12:23 AM
To: Rao, Lei 
Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; zhang.zhanghaili...@huawei.com; lukasstra...@web.de; 
dgilb...@redhat.com; qemu-devel@nongnu.org
Subject: Re: [PATCH v2 7/7] Optimized the function of fill_connection_key.

"Rao, Lei"  wrote:
> From: "Rao, Lei" 
>
> Remove some unnecessary code to improve the performance of the 
> filter-rewriter module.
>
> Signed-off-by: Lei Rao 
> Reviewed-by: Zhang Chen 

As Chen has already reviewed it:

Reviewed-by: Juan Quintela 

But I think that you should change in a following patch:

s/int reverse/bool reverse/

Later, Juan.




[PATCH v3] Optimized the function of fill_connection_key.

2021-11-02 Thread Rao, Lei
From: "Rao, Lei" 

Remove some unnecessary code to improve the performance of
the filter-rewriter module.

Signed-off-by: Lei Rao 
Reviewed-by: Zhang Chen 
Reviewed-by: Juan Quintela 
---
 net/colo-compare.c|  2 +-
 net/colo.c| 31 ---
 net/colo.h|  6 +++---
 net/filter-rewriter.c | 10 +-
 4 files changed, 17 insertions(+), 32 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 4a64a5d..b8876d7 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -264,7 +264,7 @@ static int packet_enqueue(CompareState *s, int mode, 
Connection **con)
 pkt = NULL;
 return -1;
 }
-fill_connection_key(pkt, &key);
+fill_connection_key(pkt, &key, false);
 
 conn = connection_get(s->connection_track_table,
   &key,
diff --git a/net/colo.c b/net/colo.c
index 3a3e6e8..1f8162f 100644
--- a/net/colo.c
+++ b/net/colo.c
@@ -83,19 +83,26 @@ int parse_packet_early(Packet *pkt)
 return 0;
 }
 
-void extract_ip_and_port(uint32_t tmp_ports, ConnectionKey *key, Packet *pkt)
+void extract_ip_and_port(uint32_t tmp_ports, ConnectionKey *key,
+ Packet *pkt, bool reverse)
 {
+if (reverse) {
+key->src = pkt->ip->ip_dst;
+key->dst = pkt->ip->ip_src;
+key->src_port = ntohs(tmp_ports & 0xffff);
+key->dst_port = ntohs(tmp_ports >> 16);
+} else {
 key->src = pkt->ip->ip_src;
 key->dst = pkt->ip->ip_dst;
 key->src_port = ntohs(tmp_ports >> 16);
 key->dst_port = ntohs(tmp_ports & 0xffff);
+}
 }
 
-void fill_connection_key(Packet *pkt, ConnectionKey *key)
+void fill_connection_key(Packet *pkt, ConnectionKey *key, bool reverse)
 {
-uint32_t tmp_ports;
+uint32_t tmp_ports = 0;
 
-memset(key, 0, sizeof(*key));
 key->ip_proto = pkt->ip->ip_p;
 
 switch (key->ip_proto) {
@@ -106,29 +113,15 @@ void fill_connection_key(Packet *pkt, ConnectionKey *key)
 case IPPROTO_SCTP:
 case IPPROTO_UDPLITE:
 tmp_ports = *(uint32_t *)(pkt->transport_header);
-extract_ip_and_port(tmp_ports, key, pkt);
 break;
 case IPPROTO_AH:
 tmp_ports = *(uint32_t *)(pkt->transport_header + 4);
-extract_ip_and_port(tmp_ports, key, pkt);
 break;
 default:
 break;
 }
-}
-
-void reverse_connection_key(ConnectionKey *key)
-{
-struct in_addr tmp_ip;
-uint16_t tmp_port;
-
-tmp_ip = key->src;
-key->src = key->dst;
-key->dst = tmp_ip;
 
-tmp_port = key->src_port;
-key->src_port = key->dst_port;
-key->dst_port = tmp_port;
+extract_ip_and_port(tmp_ports, key, pkt, reverse);
 }
 
 Connection *connection_new(ConnectionKey *key)
diff --git a/net/colo.h b/net/colo.h
index d91cd24..8b3e8d5 100644
--- a/net/colo.h
+++ b/net/colo.h
@@ -89,9 +89,9 @@ typedef struct Connection {
 uint32_t connection_key_hash(const void *opaque);
 int connection_key_equal(const void *opaque1, const void *opaque2);
 int parse_packet_early(Packet *pkt);
-void extract_ip_and_port(uint32_t tmp_ports, ConnectionKey *key, Packet *pkt);
-void fill_connection_key(Packet *pkt, ConnectionKey *key);
-void reverse_connection_key(ConnectionKey *key);
+void extract_ip_and_port(uint32_t tmp_ports, ConnectionKey *key,
+ Packet *pkt, bool reverse);
+void fill_connection_key(Packet *pkt, ConnectionKey *key, bool reverse);
 Connection *connection_new(ConnectionKey *key);
 void connection_destroy(void *opaque);
 Connection *connection_get(GHashTable *connection_track_table,
diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
index cb3a96c..bf05023 100644
--- a/net/filter-rewriter.c
+++ b/net/filter-rewriter.c
@@ -279,15 +279,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState 
*nf,
  */
 if (pkt && is_tcp_packet(pkt)) {
 
-fill_connection_key(pkt, &key);
-
-if (sender == nf->netdev) {
-/*
- * We need make tcp TX and RX packet
- * into one connection.
- */
-reverse_connection_key(&key);
-}
+fill_connection_key(pkt, &key, sender == nf->netdev);
 
 /* After failover we needn't change new TCP packet */
 if (s->failover_mode &&
-- 
1.8.3.1




[PATCH v7 0/2] Optimized some codes and fixed PVM hang when enabling auto-converge

2021-11-08 Thread Rao, Lei
From: "Rao, Lei" 

Changes since v1-v6:
--Reset the state of the auto-converge counters at every checkpoint instead 
of directly disabling.
--Remove cpu_throttle_stop from mig_throttle_counter_reset.

The series of patches include:
Reduced the PVM stop time during checkpoint.
Fixed the PVM hang when enabling auto-converge feature for COLO.

Rao, Lei (2):
  Reset the auto-converge counter at every checkpoint.
  Reduce the PVM stop time during Checkpoint

 migration/colo.c |  4 
 migration/ram.c  | 57 +---
 migration/ram.h  |  1 +
 3 files changed, 59 insertions(+), 3 deletions(-)

-- 
1.8.3.1




[PATCH v7 1/2] Reset the auto-converge counter at every checkpoint.

2021-11-08 Thread Rao, Lei
From: "Rao, Lei" 

If we don't reset the auto-converge counter,
it will continue to run while COLO is running,
and eventually the system will hang due to the
CPU throttle reaching DEFAULT_MIGRATE_MAX_CPU_THROTTLE.
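
For context, the auto-converge escalation step looks roughly like this (a
simplified sketch of mig_throttle_guest_down(), not the exact ram.c code):

static void mig_throttle_guest_down_sketch(void)
{
    MigrationState *s = migrate_get_current();
    uint64_t pct_initial = s->parameters.cpu_throttle_initial;
    uint64_t pct_increment = s->parameters.cpu_throttle_increment;
    int pct_max = s->parameters.max_cpu_throttle;

    if (!cpu_throttle_active()) {
        cpu_throttle_set(pct_initial);
    } else {
        /* Each sync that made too little progress bumps the throttle. */
        cpu_throttle_set(MIN(cpu_throttle_get_percentage() + pct_increment,
                             pct_max));
    }
}

Resetting the counters at every checkpoint keeps this escalation from
compounding across COLO cycles.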

Signed-off-by: Lei Rao 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Lukas Straub 
Tested-by: Lukas Straub 
---
 migration/colo.c | 4 ++++
 migration/ram.c  | 9 +++++++++
 migration/ram.h  | 1 +
 3 files changed, 14 insertions(+)

diff --git a/migration/colo.c b/migration/colo.c
index e3b1f13..2415325 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -459,6 +459,10 @@ static int colo_do_checkpoint_transaction(MigrationState 
*s,
 if (ret < 0) {
 goto out;
 }
+
+if (migrate_auto_converge()) {
+mig_throttle_counter_reset();
+}
 /*
  * Only save VM's live state, which not including device state.
  * TODO: We may need a timeout mechanism to prevent COLO process
diff --git a/migration/ram.c b/migration/ram.c
index 847af46..d5f98e6 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -641,6 +641,15 @@ static void mig_throttle_guest_down(uint64_t 
bytes_dirty_period,
 }
 }
 
+void mig_throttle_counter_reset(void)
+{
+    RAMState *rs = ram_state;
+
+    rs->time_last_bitmap_sync = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    rs->num_dirty_pages_period = 0;
+    rs->bytes_xfer_prev = ram_counters.transferred;
+}
+
 /**
  * xbzrle_cache_zero_page: insert a zero page in the XBZRLE cache
  *
diff --git a/migration/ram.h b/migration/ram.h
index dda1988..c515396 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -50,6 +50,7 @@ bool ramblock_is_ignored(RAMBlock *block);
 int xbzrle_cache_resize(uint64_t new_size, Error **errp);
 uint64_t ram_bytes_remaining(void);
 uint64_t ram_bytes_total(void);
+void mig_throttle_counter_reset(void);
 
 uint64_t ram_pagesize_summary(void);
 int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len);
-- 
1.8.3.1




[PATCH v7 2/2] Reduce the PVM stop time during Checkpoint

2021-11-08 Thread Rao, Lei
From: "Rao, Lei" 

When flushing memory from the ram cache to ram during every checkpoint
on the secondary VM, we can copy contiguous chunks of memory instead of
4096 bytes at a time to reduce the VM stop time during the checkpoint.

Signed-off-by: Lei Rao 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Lukas Straub 
Tested-by: Lukas Straub 
---
 migration/ram.c | 48 +---
 1 file changed, 45 insertions(+), 3 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index d5f98e6..863035d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -845,6 +845,41 @@ migration_clear_memory_region_dirty_bitmap_range(RAMBlock 
*rb,
 }
 }
 
+/*
+ * colo_bitmap_find_dirty: find contiguous dirty pages from start
+ *
+ * Returns the page offset within memory region of the start of the contiguous
+ * dirty pages
+ *
+ * @rs: current RAM state
+ * @rb: RAMBlock where to search for dirty pages
+ * @start: page where we start the search
+ * @num: the number of contiguous dirty pages
+ */
+static inline
+unsigned long colo_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
+                                     unsigned long start, unsigned long *num)
+{
+    unsigned long size = rb->used_length >> TARGET_PAGE_BITS;
+    unsigned long *bitmap = rb->bmap;
+    unsigned long first, next;
+
+    *num = 0;
+
+    if (ramblock_is_ignored(rb)) {
+        return size;
+    }
+
+    first = find_next_bit(bitmap, size, start);
+    if (first >= size) {
+        return first;
+    }
+    next = find_next_zero_bit(bitmap, size, first + 1);
+    assert(next >= first);
+    *num = next - first;
+    return first;
+}
+
 static inline bool migration_bitmap_clear_dirty(RAMState *rs,
 RAMBlock *rb,
 unsigned long page)
@@ -3895,19 +3930,26 @@ void colo_flush_ram_cache(void)
 block = QLIST_FIRST_RCU(&ram_list.blocks);
 
 while (block) {
-offset = migration_bitmap_find_dirty(ram_state, block, offset);
+unsigned long num = 0;
 
+offset = colo_bitmap_find_dirty(ram_state, block, offset, &num);
 if (!offset_in_ramblock(block,
 ((ram_addr_t)offset) << TARGET_PAGE_BITS)) 
{
 offset = 0;
+num = 0;
 block = QLIST_NEXT_RCU(block, next);
 } else {
-migration_bitmap_clear_dirty(ram_state, block, offset);
+unsigned long i = 0;
+
+for (i = 0; i < num; i++) {
+migration_bitmap_clear_dirty(ram_state, block, offset + i);
+}
 dst_host = block->host
  + (((ram_addr_t)offset) << TARGET_PAGE_BITS);
 src_host = block->colo_cache
  + (((ram_addr_t)offset) << TARGET_PAGE_BITS);
-memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
+memcpy(dst_host, src_host, TARGET_PAGE_SIZE * num);
+offset += num;
 }
 }
 }
-- 
1.8.3.1




[PATCH 2/2] migration/ram.c: Remove the qemu_mutex_lock in colo_flush_ram_cache.

2021-11-10 Thread Rao, Lei
From: "Rao, Lei" 

The code to acquire bitmap_mutex was added in commit
63268c4970a5f126cc9af75f3ccb8057abef5ec0. There is no
need to acquire bitmap_mutex in colo_flush_ram_cache(). This
is because colo_flush_ram_cache() is only called on the COLO
secondary VM, which is the destination side.
On the COLO secondary VM, only the COLO thread will touch
the bitmap of the ram cache.

Signed-off-by: Lei Rao 
---
 migration/ram.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 863035d..2c688f5 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3918,7 +3918,6 @@ void colo_flush_ram_cache(void)
 unsigned long offset = 0;
 
 memory_global_dirty_log_sync();
-qemu_mutex_lock(&ram_state->bitmap_mutex);
 WITH_RCU_READ_LOCK_GUARD() {
 RAMBLOCK_FOREACH_NOT_IGNORED(block) {
 ramblock_sync_dirty_bitmap(ram_state, block);
@@ -3954,7 +3953,6 @@ void colo_flush_ram_cache(void)
 }
 }
 trace_colo_flush_ram_cache_end();
-qemu_mutex_unlock(&ram_state->bitmap_mutex);
 }
 
 /**
-- 
1.8.3.1




[PATCH 1/2] Fixed a QEMU hang when guest poweroff in COLO mode

2021-11-10 Thread Rao, Lei
From: "Rao, Lei" 

When the PVM guest powers off, the COLO thread may be waiting on a
semaphore in colo_process_checkpoint(). So, we should wake up the COLO
thread before migration shutdown.

Signed-off-by: Lei Rao 
---
 include/migration/colo.h |  1 +
 migration/colo.c         | 14 ++++++++++++++
 migration/migration.c    | 10 ++++++++++
 3 files changed, 25 insertions(+)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index 768e1f0..525b45a 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -37,4 +37,5 @@ COLOMode get_colo_mode(void);
 void colo_do_failover(void);
 
 void colo_checkpoint_notify(void *opaque);
+void colo_shutdown(COLOMode mode);
 #endif
diff --git a/migration/colo.c b/migration/colo.c
index 2415325..385c1d7 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -820,6 +820,20 @@ static void 
colo_wait_handle_message(MigrationIncomingState *mis,
 }
 }
 
+void colo_shutdown(COLOMode mode)
+{
+    if (mode == COLO_MODE_PRIMARY) {
+        MigrationState *s = migrate_get_current();
+
+        qemu_event_set(&s->colo_checkpoint_event);
+        qemu_sem_post(&s->colo_exit_sem);
+    } else {
+        MigrationIncomingState *mis = migration_incoming_get_current();
+
+        qemu_sem_post(&mis->colo_incoming_sem);
+    }
+}
+
 void *colo_process_incoming_thread(void *opaque)
 {
 MigrationIncomingState *mis = opaque;
diff --git a/migration/migration.c b/migration/migration.c
index abaf6f9..9df6328 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -225,6 +225,16 @@ void migration_cancel(const Error *error)
 
 void migration_shutdown(void)
 {
+    COLOMode mode = get_colo_mode();
+
+    /*
+     * When the QEMU main thread exits, the COLO thread
+     * may be waiting on a semaphore. So, we should wake up
+     * the COLO thread before migration shutdown.
+     */
+    if (mode != COLO_MODE_NONE) {
+        colo_shutdown(mode);
+    }
 /*
  * Cancel the current migration - that will (eventually)
  * stop the migration using this structure
-- 
1.8.3.1




RE: [PATCH 1/2] Fixed a QEMU hang when guest poweroff in COLO mode

2021-11-10 Thread Rao, Lei
OK, will be changed in V2.

Thanks,
Lei

-Original Message-
From: Juan Quintela  
Sent: Wednesday, November 10, 2021 5:55 PM
To: Rao, Lei 
Cc: Zhang, Chen ; zhang.zhanghaili...@huawei.com; 
dgilb...@redhat.com; qemu-devel@nongnu.org
Subject: Re: [PATCH 1/2] Fixed a QEMU hang when guest poweroff in COLO mode

"Rao, Lei"  wrote:
> From: "Rao, Lei" 
>
> When the PVM guest poweroff, the COLO thread may wait a semaphore in 
> colo_process_checkpoint().So, we should wake up the COLO thread before 
> migration shutdown.
>
> Signed-off-by: Lei Rao 

A couple of notes.

> ---
>  include/migration/colo.h |  1 +
>  migration/colo.c | 14 ++
>  migration/migration.c| 10 ++
>  3 files changed, 25 insertions(+)
>
> diff --git a/include/migration/colo.h b/include/migration/colo.h index 
> 768e1f0..525b45a 100644
> --- a/include/migration/colo.h
> +++ b/include/migration/colo.h
> @@ -37,4 +37,5 @@ COLOMode get_colo_mode(void);  void 
> colo_do_failover(void);
>  
>  void colo_checkpoint_notify(void *opaque);
> +void colo_shutdown(COLOMode mode);
>  #endif
> diff --git a/migration/colo.c b/migration/colo.c index 
> 2415325..385c1d7 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -820,6 +820,20 @@ static void 
> colo_wait_handle_message(MigrationIncomingState *mis,
>  }
>  }
>  
> +void colo_shutdown(COLOMode mode)
> +{
> +if (mode == COLO_MODE_PRIMARY) {
> +MigrationState *s = migrate_get_current();
> +
> +qemu_event_set(&s->colo_checkpoint_event);
> +qemu_sem_post(&s->colo_exit_sem);
> +} else {
> +MigrationIncomingState *mis = 
> + migration_incoming_get_current();
> +
> +qemu_sem_post(&mis->colo_incoming_sem);
> +}
> +}
> +
>  void *colo_process_incoming_thread(void *opaque)  {
>  MigrationIncomingState *mis = opaque; diff --git 
> a/migration/migration.c b/migration/migration.c index abaf6f9..9df6328 
> 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -225,6 +225,16 @@ void migration_cancel(const Error *error)
>  
>  void migration_shutdown(void)
>  {
> +COLOMode mode = get_colo_mode();
> +
> +/*
> + * When the QEMU main thread exit, the COLO thread
> + * may wait a semaphore. So, we should wakeup the
> + * COLO thread before migration shutdown.
> + */
> +if (mode != COLO_MODE_NONE) {
> +colo_shutdown(mode);
> + }

don't put the code on the main path.

Could you just put all of this inside a colo_shutdown() call?

Thanks, Juan.

>  /*
>   * Cancel the current migration - that will (eventually)
>   * stop the migration using this structure




[PATCH v2] ui/vnc.c: Fixed a deadlock bug.

2022-01-04 Thread Rao Lei
The GDB stack is as follows:
(gdb) bt
0  __lll_lock_wait (futex=futex@entry=0x56211df20360, private=0) at 
lowlevellock.c:52
1  0x7f263caf20a3 in __GI___pthread_mutex_lock (mutex=0x56211df20360) at 
../nptl/pthread_mutex_lock.c:80
2  0x56211a757364 in qemu_mutex_lock_impl (mutex=0x56211df20360, 
file=0x56211a804857 "../ui/vnc-jobs.h", line=60)
at ../util/qemu-thread-posix.c:80
3  0x56211a0ef8c7 in vnc_lock_output (vs=0x56211df14200) at 
../ui/vnc-jobs.h:60
4  0x56211a0efcb7 in vnc_clipboard_send (vs=0x56211df14200, count=1, 
dwords=0x7ffdf1701338) at ../ui/vnc-clipboard.c:138
5  0x56211a0f0129 in vnc_clipboard_notify (notifier=0x56211df244c8, 
data=0x56211dd1bbf0) at ../ui/vnc-clipboard.c:209
6  0x56211a75dde8 in notifier_list_notify (list=0x56211afa17d0 
, data=0x56211dd1bbf0) at ../util/notify.c:39
7  0x56211a0bf0e6 in qemu_clipboard_update (info=0x56211dd1bbf0) at 
../ui/clipboard.c:50
8  0x56211a0bf05d in qemu_clipboard_peer_release (peer=0x56211df244c0, 
selection=QEMU_CLIPBOARD_SELECTION_CLIPBOARD)
at ../ui/clipboard.c:41
9  0x56211a0bef9b in qemu_clipboard_peer_unregister (peer=0x56211df244c0) 
at ../ui/clipboard.c:19
10 0x56211a0d45f3 in vnc_disconnect_finish (vs=0x56211df14200) at 
../ui/vnc.c:1358
11 0x56211a0d4c9d in vnc_client_read (vs=0x56211df14200) at ../ui/vnc.c:1611
12 0x56211a0d4df8 in vnc_client_io (ioc=0x56211ce70690, condition=G_IO_IN, 
opaque=0x56211df14200) at ../ui/vnc.c:1649
13 0x56211a5b976c in qio_channel_fd_source_dispatch
(source=0x56211ce50a00, callback=0x56211a0d4d71 , 
user_data=0x56211df14200) at ../io/channel-watch.c:84
14 0x7f263ccede8e in g_main_context_dispatch () at 
/lib/x86_64-linux-gnu/libglib-2.0.so.0
15 0x56211a77d4a1 in glib_pollfds_poll () at ../util/main-loop.c:232
16 0x56211a77d51f in os_host_main_loop_wait (timeout=958545) at 
../util/main-loop.c:255
17 0x56211a77d630 in main_loop_wait (nonblocking=0) at 
../util/main-loop.c:531
18 0x56211a45bc8e in qemu_main_loop () at ../softmmu/runstate.c:726
19 0x56211a0b45fa in main (argc=69, argv=0x7ffdf1701778, 
envp=0x7ffdf17019a8) at ../softmmu/main.c:50

From the call trace, we can see it is a deadlock bug.
vnc_disconnect_finish() acquires the output_mutex,
but the output_mutex is then acquired again in vnc_clipboard_send().
Re-locking the non-recursive mutex causes the deadlock. So, I move
qemu_clipboard_peer_unregister() behind vnc_unlock_output().
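
(For illustration only, a minimal standalone sketch of the failure mode, not
the actual VNC code: on Linux, re-locking a plain non-recursive pthread mutex
from the same thread blocks forever, which is exactly the shape of the
__lll_lock_wait hang in the trace above. The function names below are made up
to mirror the call path.)

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t output_mutex = PTHREAD_MUTEX_INITIALIZER;

static void clipboard_send(void)
{
    pthread_mutex_lock(&output_mutex);   /* second lock: blocks forever */
    printf("never reached\n");
    pthread_mutex_unlock(&output_mutex);
}

static void disconnect_finish(void)
{
    pthread_mutex_lock(&output_mutex);   /* first lock */
    clipboard_send();                    /* re-locks the same mutex */
    pthread_mutex_unlock(&output_mutex);
}

int main(void)
{
    disconnect_finish();                 /* hangs, like the GDB trace shows */
    return 0;
}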

Fixes: 0bf41cab93e ("ui/vnc: clipboard support")
Signed-off-by: Lei Rao 
Reviewed-by: Marc-André Lureau 
---
 ui/vnc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ui/vnc.c b/ui/vnc.c
index 1ed1c7efc6..3ccd33dedc 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -1354,12 +1354,12 @@ void vnc_disconnect_finish(VncState *vs)
 /* last client gone */
 vnc_update_server_surface(vs->vd);
 }
+vnc_unlock_output(vs);
+
 if (vs->cbpeer.notifier.notify) {
 qemu_clipboard_peer_unregister(&vs->cbpeer);
 }
 
-vnc_unlock_output(vs);
-
 qemu_mutex_destroy(&vs->output_mutex);
 if (vs->bh != NULL) {
 qemu_bh_delete(vs->bh);
-- 
2.32.0




Re: [PATCH] block/nbd.c: Fixed IO request coroutine not being wakeup when kill NBD server.

2022-03-08 Thread Rao, Lei




On 3/3/2022 10:05 PM, Rao, Lei wrote:



On 3/3/2022 5:25 PM, Vladimir Sementsov-Ogievskiy wrote:

03.03.2022 05:21, Rao Lei wrote:

During the stress test, there is a probability that the IO request coroutine
can't be awakened when the NBD server is killed.

The GDB stack is as follows:
(gdb) bt
0  0x7f2ff990cbf6 in __ppoll (fds=0x55575de85000, nfds=1, timeout=, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:44
1  0x55575c302e7c in qemu_poll_ns (fds=0x55575de85000, nfds=1, 
timeout=59603140) at ../util/qemu-timer.c:348
2  0x55575c2d3c34 in fdmon_poll_wait (ctx=0x55575dc480f0, 
ready_list=0x7ffd9dd1dae0, timeout=59603140) at ../util/fdmon-poll.c:80
3  0x55575c2d350d in aio_poll (ctx=0x55575dc480f0, blocking=true) at 
../util/aio-posix.c:655
4  0x55575c16eabd in bdrv_do_drained_begin (bs=0x55575dee7fe0, 
recursive=false, parent=0x0, ignore_bds_parents=false, poll=true) at 
../block/io.c:474
5  0x55575c16eba6 in bdrv_drained_begin (bs=0x55575dee7fe0) at 
../block/io.c:480
6  0x55575c1aff33 in quorum_del_child (bs=0x55575dee7fe0, 
child=0x55575dcea690, errp=0x7ffd9dd1dd08) at ../block/quorum.c:1130
7  0x55575c14239b in bdrv_del_child (parent_bs=0x55575dee7fe0, 
child=0x55575dcea690, errp=0x7ffd9dd1dd08) at ../block.c:7705
8  0x55575c12da28 in qmp_x_blockdev_change
 (parent=0x55575df404c0 "colo-disk0", has_child=true, child=0x55575de867f0 
"children.1", has_node=false, node=0x0, errp=0x7ffd9dd1dd08)
 at ../blockdev.c:3676
9  0x55575c258435 in qmp_marshal_x_blockdev_change (args=0x7f2fec008190, 
ret=0x7f2ff7b0bd98, errp=0x7f2ff7b0bd90) at qapi/qapi-commands-block-core.c:1675
10 0x55575c2c6201 in do_qmp_dispatch_bh (opaque=0x7f2ff7b0be30) at 
../qapi/qmp-dispatch.c:129
11 0x55575c2ebb1c in aio_bh_call (bh=0x55575dc429c0) at ../util/async.c:141
12 0x55575c2ebc2a in aio_bh_poll (ctx=0x55575dc480f0) at ../util/async.c:169
13 0x55575c2d2d96 in aio_dispatch (ctx=0x55575dc480f0) at 
../util/aio-posix.c:415
14 0x55575c2ec07f in aio_ctx_dispatch (source=0x55575dc480f0, callback=0x0, 
user_data=0x0) at ../util/async.c:311
15 0x7f2ff9e7cfbd in g_main_context_dispatch () at 
/lib/x86_64-linux-gnu/libglib-2.0.so.0
16 0x55575c2fd581 in glib_pollfds_poll () at ../util/main-loop.c:232
17 0x55575c2fd5ff in os_host_main_loop_wait (timeout=0) at 
../util/main-loop.c:255
18 0x55575c2fd710 in main_loop_wait (nonblocking=0) at 
../util/main-loop.c:531
19 0x55575bfa7588 in qemu_main_loop () at ../softmmu/runstate.c:726
20 0x55575bbee57a in main (argc=60, argv=0x7ffd9dd1e0e8, 
envp=0x7ffd9dd1e2d0) at ../softmmu/main.c:50

(gdb) qemu coroutine 0x55575e16aac0
0  0x55575c2ee7dc in qemu_coroutine_switch (from_=0x55575e16aac0, 
to_=0x7f2ff830fba0, action=COROUTINE_YIELD) at ../util/coroutine-ucontext.c:302
1  0x55575c2fe2a9 in qemu_coroutine_yield () at ../util/qemu-coroutine.c:195
2  0x55575c2fe93c in qemu_co_queue_wait_impl (queue=0x55575dc46170, 
lock=0x7f2b32ad9850) at ../util/qemu-coroutine-lock.c:56
3  0x55575c17ddfb in nbd_co_send_request (bs=0x55575ebfaf20, 
request=0x7f2b32ad9920, qiov=0x55575dfc15d8) at ../block/nbd.c:478
4  0x55575c17f931 in nbd_co_request (bs=0x55575ebfaf20, 
request=0x7f2b32ad9920, write_qiov=0x55575dfc15d8) at ../block/nbd.c:1182
5  0x55575c17fe14 in nbd_client_co_pwritev (bs=0x55575ebfaf20, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, flags=0) at 
../block/nbd.c:1284
6  0x55575c170d25 in bdrv_driver_pwritev (bs=0x55575ebfaf20, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, qiov_offset=0, flags=0)
 at ../block/io.c:1264
7  0x55575c1733b4 in bdrv_aligned_pwritev
 (child=0x55575dff6890, req=0x7f2b32ad9ad0, offset=403487858688, 
bytes=4538368, align=1, qiov=0x55575dfc15d8, qiov_offset=0, flags=0) at 
../block/io.c:2126
8  0x55575c173c67 in bdrv_co_pwritev_part (child=0x55575dff6890, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, qiov_offset=0, flags=0)
 at ../block/io.c:2314
9  0x55575c17391b in bdrv_co_pwritev (child=0x55575dff6890, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, flags=0) at 
../block/io.c:2233
10 0x55575c1ee506 in replication_co_writev (bs=0x55575e9824f0, 
sector_num=788062224, remaining_sectors=8864, qiov=0x55575dfc15d8, flags=0)
 at ../block/replication.c:270
11 0x55575c170eed in bdrv_driver_pwritev (bs=0x55575e9824f0, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, qiov_offset=0, flags=0)
 at ../block/io.c:1297
12 0x55575c1733b4 in bdrv_aligned_pwritev
 (child=0x55575dcea690, req=0x7f2b32ad9e00, offset=403487858688, 
bytes=4538368, align=512, qiov=0x55575dfc15d8, qiov_offset=0, flags=0)
 at ../block/io.c:2126
13 0x55575c173c67 in bdrv_co_pwritev_part (child=0x55575dcea690, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, qiov_offset=0, flags=0)
 at ../block/io.c:2314
14 0x55575c17391b in bdrv_co_pwri

[PATCH v2] block/nbd.c: Fixed IO request coroutine not being wakeup when kill NBD server

2022-03-08 Thread Rao Lei
During the IO stress test, there is a probability that the IO request coroutine
can't be awakened when the NBD server is killed.

The GDB stack is as follows:
(gdb) bt
0  0x7f2ff990cbf6 in __ppoll (fds=0x55575de85000, nfds=1, 
timeout=, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:44
1  0x55575c302e7c in qemu_poll_ns (fds=0x55575de85000, nfds=1, 
timeout=59603140) at ../util/qemu-timer.c:348
2  0x55575c2d3c34 in fdmon_poll_wait (ctx=0x55575dc480f0, 
ready_list=0x7ffd9dd1dae0, timeout=59603140) at ../util/fdmon-poll.c:80
3  0x55575c2d350d in aio_poll (ctx=0x55575dc480f0, blocking=true) at 
../util/aio-posix.c:655
4  0x55575c16eabd in bdrv_do_drained_begin (bs=0x55575dee7fe0, 
recursive=false, parent=0x0, ignore_bds_parents=false, poll=true) at 
../block/io.c:474
5  0x55575c16eba6 in bdrv_drained_begin (bs=0x55575dee7fe0) at 
../block/io.c:480
6  0x55575c1aff33 in quorum_del_child (bs=0x55575dee7fe0, 
child=0x55575dcea690, errp=0x7ffd9dd1dd08) at ../block/quorum.c:1130
7  0x55575c14239b in bdrv_del_child (parent_bs=0x55575dee7fe0, 
child=0x55575dcea690, errp=0x7ffd9dd1dd08) at ../block.c:7705
8  0x55575c12da28 in qmp_x_blockdev_change (parent=0x55575df404c0 
"colo-disk0", has_child=true, child=0x55575de867f0 "children.1", 
has_node=false, node=0x0, errp=0x7ffd9dd1dd08) at ../blockdev.c:3676
9  0x55575c258435 in qmp_marshal_x_blockdev_change (args=0x7f2fec008190, 
ret=0x7f2ff7b0bd98, errp=0x7f2ff7b0bd90) at qapi/qapi-commands-block-core.c:1675
10 0x55575c2c6201 in do_qmp_dispatch_bh (opaque=0x7f2ff7b0be30) at 
../qapi/qmp-dispatch.c:129
11 0x55575c2ebb1c in aio_bh_call (bh=0x55575dc429c0) at ../util/async.c:141
12 0x55575c2ebc2a in aio_bh_poll (ctx=0x55575dc480f0) at ../util/async.c:169
13 0x55575c2d2d96 in aio_dispatch (ctx=0x55575dc480f0) at 
../util/aio-posix.c:415
14 0x55575c2ec07f in aio_ctx_dispatch (source=0x55575dc480f0, callback=0x0, 
user_data=0x0) at ../util/async.c:311
15 0x7f2ff9e7cfbd in g_main_context_dispatch () at 
/lib/x86_64-linux-gnu/libglib-2.0.so.0
16 0x55575c2fd581 in glib_pollfds_poll () at ../util/main-loop.c:232
17 0x55575c2fd5ff in os_host_main_loop_wait (timeout=0) at 
../util/main-loop.c:255
18 0x55575c2fd710 in main_loop_wait (nonblocking=0) at 
../util/main-loop.c:531
19 0x55575bfa7588 in qemu_main_loop () at ../softmmu/runstate.c:726
20 0x55575bbee57a in main (argc=60, argv=0x7ffd9dd1e0e8, 
envp=0x7ffd9dd1e2d0) at ../softmmu/main.c:50

(gdb) qemu coroutine 0x55575e16aac0
0  0x55575c2ee7dc in qemu_coroutine_switch (from_=0x55575e16aac0, 
to_=0x7f2ff830fba0, action=COROUTINE_YIELD) at ../util/coroutine-ucontext.c:302
1  0x55575c2fe2a9 in qemu_coroutine_yield () at ../util/qemu-coroutine.c:195
2  0x55575c2fe93c in qemu_co_queue_wait_impl (queue=0x55575dc46170, 
lock=0x7f2b32ad9850) at ../util/qemu-coroutine-lock.c:56
3  0x55575c17ddfb in nbd_co_send_request (bs=0x55575ebfaf20, 
request=0x7f2b32ad9920, qiov=0x55575dfc15d8) at ../block/nbd.c:478
4  0x55575c17f931 in nbd_co_request (bs=0x55575ebfaf20, 
request=0x7f2b32ad9920, write_qiov=0x55575dfc15d8) at ../block/nbd.c:1182
5  0x55575c17fe14 in nbd_client_co_pwritev (bs=0x55575ebfaf20, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, flags=0) at 
../block/nbd.c:1284
6  0x55575c170d25 in bdrv_driver_pwritev (bs=0x55575ebfaf20, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, qiov_offset=0, flags=0)
at ../block/io.c:1264
7  0x55575c1733b4 in bdrv_aligned_pwritev
(child=0x55575dff6890, req=0x7f2b32ad9ad0, offset=403487858688, 
bytes=4538368, align=1, qiov=0x55575dfc15d8, qiov_offset=0, flags=0) at 
../block/io.c:2126
8  0x55575c173c67 in bdrv_co_pwritev_part (child=0x55575dff6890, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, qiov_offset=0, flags=0)
at ../block/io.c:2314
9  0x55575c17391b in bdrv_co_pwritev (child=0x55575dff6890, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, flags=0) at 
../block/io.c:2233
10 0x55575c1ee506 in replication_co_writev (bs=0x55575e9824f0, 
sector_num=788062224, remaining_sectors=8864, qiov=0x55575dfc15d8, flags=0)
at ../block/replication.c:270
11 0x55575c170eed in bdrv_driver_pwritev (bs=0x55575e9824f0, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, qiov_offset=0, flags=0)
at ../block/io.c:1297
12 0x55575c1733b4 in bdrv_aligned_pwritev
(child=0x55575dcea690, req=0x7f2b32ad9e00, offset=403487858688, 
bytes=4538368, align=512, qiov=0x55575dfc15d8, qiov_offset=0, flags=0)
at ../block/io.c:2126
13 0x55575c173c67 in bdrv_co_pwritev_part (child=0x55575dcea690, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, qiov_offset=0, flags=0)
at ../block/io.c:2314
14 0x55575c17391b in bdrv_co_pwritev (child=0x55575dcea690, 
offset=403487858688, bytes=4538368, qiov=0x55575dfc15d8, flags=0) at 
../block/io.c:2233
15 0x55575c1aeffa in write_quorum_entry (opaque=0x7f2fddaf8

[PATCH] QIO: Add force_quit to the QIOChannel to ensure QIO exits cleanly in some corner case

2021-11-30 Thread Rao, Lei
We found that the QIO channel coroutine could not be awakened in some
corner cases during our stress test for COLO.
The hang that this patch fixes looks as follows:
#0  0x7fad72e24bf6 in __ppoll (fds=0x5563d75861f0, nfds=1, 
timeout=, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:44
#1  0x5563d6977c68 in qemu_poll_ns (fds=0x5563d75861f0, nfds=1, 
timeout=59697630) at util/qemu-timer.c:348
#2  0x5563d697ac44 in aio_poll (ctx=0x5563d755dfa0, blocking=true) 
at util/aio-posix.c:669
#3  0x5563d68ba24f in bdrv_do_drained_begin (bs=0x5563d75ea0c0, 
recursive=false, parent=0x0, ignore_bds_parents=false, poll=true) at 
block/io.c:432
#4  0x5563d68ba338 in bdrv_drained_begin (bs=0x5563d75ea0c0) at 
block/io.c:438
#5  0x5563d689c6d2 in quorum_del_child (bs=0x5563d75ea0c0, 
child=0x5563d7908600, errp=0x7fff3cf5b960) at block/quorum.c:1063
#6  0x5563d684328f in bdrv_del_child (parent_bs=0x5563d75ea0c0, 
child=0x5563d7908600, errp=0x7fff3cf5b960) at block.c:6568
#7  0x5563d658499e in qmp_x_blockdev_change (parent=0x5563d80ec4c0 
"colo-disk0", has_child=true, child=0x5563d7577d30 "children.1", 
has_node=false, node=0x0,errp=0x7fff3cf5b960) at blockdev.c:4494
#8  0x5563d67e8b4e in qmp_marshal_x_blockdev_change 
(args=0x7fad6400ada0, ret=0x7fff3cf5b9f8, errp=0x7fff3cf5b9f0) at 
qapi/qapi-commands-block-core.c:1538
#9  0x5563d691cd9e in do_qmp_dispatch (cmds=0x5563d719a4f0 
, request=0x7fad64009d80, allow_oob=true, errp=0x7fff3cf5ba98)
at qapi/qmp-dispatch.c:132
#10 0x5563d691cfab in qmp_dispatch (cmds=0x5563d719a4f0 
, request=0x7fad64009d80, allow_oob=true) at 
qapi/qmp-dispatch.c:175
#11 0x5563d67b7787 in monitor_qmp_dispatch (mon=0x5563d7605d40, 
req=0x7fad64009d80) at monitor/qmp.c:145
#12 0x5563d67b7b24 in monitor_qmp_bh_dispatcher (data=0x0) at 
monitor/qmp.c:234
#13 0x5563d69754c2 in aio_bh_call (bh=0x5563d7473230) at 
util/async.c:89
#14 0x5563d697555e in aio_bh_poll (ctx=0x5563d7471da0) at 
util/async.c:117
#15 0x5563d697a423 in aio_dispatch (ctx=0x5563d7471da0) at 
util/aio-posix.c:459
#16 0x5563d6975917 in aio_ctx_dispatch (source=0x5563d7471da0, 
callback=0x0, user_data=0x0) at util/async.c:260
#17 0x7fad730e1fbd in g_main_context_dispatch () from 
/lib/x86_64-linux-gnu/libglib-2.0.so.0
#18 0x5563d6978cda in glib_pollfds_poll () at util/main-loop.c:219
#19 0x5563d6978d58 in os_host_main_loop_wait (timeout=977650) at 
util/main-loop.c:242
#20 0x5563d6978e69 in main_loop_wait (nonblocking=0) at 
util/main-loop.c:518
#21 0x5563d658f5ed in main_loop () at vl.c:1814
#22 0x5563d6596bb7 in main (argc=61, argv=0x7fff3cf5c0c8, 
envp=0x7fff3cf5c2b8) at vl.c:450

From the call trace, we can see that the QEMU main thread is waiting for
in_flight to return to zero, but in_flight never reaches 0.
After analysis, the root cause is that the NBD coroutine was not awakened.
Although this bug was found in the COLO test, I think this is a universal
problem in the QIO module; the issue also affects other modules that depend
on QIO, such as NBD. We dump the following data:
$2 = {
  in_flight = 2,
  state = NBD_CLIENT_QUIT,
  connect_status = 0,
  connect_err = 0x0,
  wait_in_flight = false,
  requests = {{
  coroutine = 0x0,
  offset = 273077071872,
  receiving = false,
}, {
  coroutine = 0x7f1164571df0,
  offset = 498792529920,
  receiving = false,
}, {
  coroutine = 0x0,
  offset = 500663590912,
  receiving = false,
}, {
  ...
} },
  reply = {
simple = {
  magic = 1732535960,
  error = 0,
  handle = 0
},
structured = {
  magic = 1732535960,
  flags = 0,
  type = 0,
  handle = 0,
  length = 0
},
{
  magic = 1732535960,
  _skip = 0,
  handle = 0
}
  },
  bs = 0x7f11640e2140,
  reconnect_delay = 0,
  saddr = 0x7f11640e1f80,
  export = 0x7f11640dc560 "parent0",
}
From the data, we can see that the NBD coroutine does not exit normally when
the NBD server is killed (we kill the Secondary VM in the COLO stress test).
It therefore never executes in_flight--, so the QEMU main thread hangs here.
On further analysis, I found that the NBD coroutine yields in
nbd_send_request() or qio_channel_write_all() inside nbd_co_send_request();
the yield happens because the kernel returns EAGAIN (under the stress test).
However, NBD does not know it yielded there, so nbd_recv_coroutines_wake()
won't wake it up because req->receiving is false. If the NBD server
is terminated abnormally at the same time, the G_IO_OUT condition will be invalid, 
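
(Illustration only, not QEMU code: a minimal sketch of the missed-wakeup
pattern described above. The struct and function names are invented to mirror
the shape of the dump; the point is that a wake routine gated on
receiving == true skips a request whose coroutine yielded on the send side.)

#include <stdbool.h>
#include <stdio.h>

struct request {
    void *coroutine;   /* non-NULL: a coroutine is parked on this slot */
    bool receiving;    /* only set while waiting for a reply */
};

/* Mirrors the gating in a recv-side wake routine. */
static void recv_coroutines_wake(struct request *reqs, int n)
{
    for (int i = 0; i < n; i++) {
        if (!reqs[i].coroutine) {
            continue;
        }
        if (reqs[i].receiving) {
            printf("request %d: would be woken\n", i);
        } else {
            /* Yielded in the send path on EAGAIN, so it is skipped. */
            printf("request %d: parked but NOT woken (receiving=false)\n", i);
        }
    }
}

int main(void)
{
    struct request reqs[] = {
        { .coroutine = NULL,        .receiving = false },
        { .coroutine = (void *)0x1, .receiving = false },  /* stuck sender */
    };
    recv_coroutines_wake(reqs, 2);
    return 0;
}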

RE: [PATCH] QIO: Add force_quit to the QIOChannel to ensure QIO exits cleanly in some corner case

2021-12-01 Thread Rao, Lei


-Original Message-
From: Daniel P. Berrangé  
Sent: Wednesday, December 1, 2021 5:11 PM
To: Rao, Lei 
Cc: Zhang, Chen ; ebl...@redhat.com; 
vsement...@virtuozzo.com; kw...@redhat.com; hre...@redhat.com; 
qemu-bl...@nongnu.org; qemu-devel@nongnu.org
Subject: Re: [PATCH] QIO: Add force_quit to the QIOChannel to ensure QIO exits 
cleanly in some corner case

On Wed, Dec 01, 2021 at 03:54:27PM +0800, Rao, Lei wrote:
> We found that the QIO channel coroutine could not be awakened in some 
> corner cases during our stress test for COLO.
> The patch fixes as follow:
> #0  0x7fad72e24bf6 in __ppoll (fds=0x5563d75861f0, nfds=1, 
> timeout=, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:44
> #1  0x5563d6977c68 in qemu_poll_ns (fds=0x5563d75861f0, nfds=1, 
> timeout=59697630) at util/qemu-timer.c:348
> #2  0x5563d697ac44 in aio_poll (ctx=0x5563d755dfa0, 
> blocking=true) at util/aio-posix.c:669
> #3  0x5563d68ba24f in bdrv_do_drained_begin (bs=0x5563d75ea0c0, 
> recursive=false, parent=0x0, ignore_bds_parents=false, poll=true) at 
> block/io.c:432
> #4  0x5563d68ba338 in bdrv_drained_begin (bs=0x5563d75ea0c0) at 
> block/io.c:438
> #5  0x5563d689c6d2 in quorum_del_child (bs=0x5563d75ea0c0, 
> child=0x5563d7908600, errp=0x7fff3cf5b960) at block/quorum.c:1063
> #6  0x5563d684328f in bdrv_del_child (parent_bs=0x5563d75ea0c0, 
> child=0x5563d7908600, errp=0x7fff3cf5b960) at block.c:6568
> #7  0x5563d658499e in qmp_x_blockdev_change 
> (parent=0x5563d80ec4c0 "colo-disk0", has_child=true, child=0x5563d7577d30 
> "children.1", has_node=false, node=0x0,errp=0x7fff3cf5b960) at blockdev.c:4494
> #8  0x5563d67e8b4e in qmp_marshal_x_blockdev_change 
> (args=0x7fad6400ada0, ret=0x7fff3cf5b9f8, errp=0x7fff3cf5b9f0) at 
> qapi/qapi-commands-block-core.c:1538
> #9  0x5563d691cd9e in do_qmp_dispatch (cmds=0x5563d719a4f0 
> , request=0x7fad64009d80, allow_oob=true, errp=0x7fff3cf5ba98)
> at qapi/qmp-dispatch.c:132
> #10 0x5563d691cfab in qmp_dispatch (cmds=0x5563d719a4f0 
> , request=0x7fad64009d80, allow_oob=true) at 
> qapi/qmp-dispatch.c:175
> #11 0x5563d67b7787 in monitor_qmp_dispatch (mon=0x5563d7605d40, 
> req=0x7fad64009d80) at monitor/qmp.c:145
> #12 0x5563d67b7b24 in monitor_qmp_bh_dispatcher (data=0x0) at 
> monitor/qmp.c:234
> #13 0x5563d69754c2 in aio_bh_call (bh=0x5563d7473230) at 
> util/async.c:89
> #14 0x5563d697555e in aio_bh_poll (ctx=0x5563d7471da0) at 
> util/async.c:117
> #15 0x5563d697a423 in aio_dispatch (ctx=0x5563d7471da0) at 
> util/aio-posix.c:459
> #16 0x5563d6975917 in aio_ctx_dispatch (source=0x5563d7471da0, 
> callback=0x0, user_data=0x0) at util/async.c:260
> #17 0x7fad730e1fbd in g_main_context_dispatch () from 
> /lib/x86_64-linux-gnu/libglib-2.0.so.0
> #18 0x5563d6978cda in glib_pollfds_poll () at util/main-loop.c:219
> #19 0x5563d6978d58 in os_host_main_loop_wait (timeout=977650) at 
> util/main-loop.c:242
> #20 0x5563d6978e69 in main_loop_wait (nonblocking=0) at 
> util/main-loop.c:518
> #21 0x5563d658f5ed in main_loop () at vl.c:1814
> #22 0x5563d6596bb7 in main (argc=61, argv=0x7fff3cf5c0c8, 
> envp=0x7fff3cf5c2b8) at vl.c:450
> 
> From the call trace, we can see that the QEMU main thread is waiting for 
> the in_flight return to zero. But the in_filght never reaches 0.
> After analysis, the root cause is that the coroutine of NBD was not 
> awakened. Although this bug was found in the COLO test, I think this is a
> universal problem in the QIO module. This issue also affects other 
> modules depending on QIO such as NBD. We dump the following data:
> $2 = {
>   in_flight = 2,
>   state = NBD_CLIENT_QUIT,
>   connect_status = 0,
>   connect_err = 0x0,
>   wait_in_flight = false,
>   requests = {{
>   coroutine = 0x0,
>   offset = 273077071872,
>   receiving = false,
> }, {
>   coroutine = 0x7f1164571df0,
>   offset = 498792529920,
>   receiving = false,
> }, {
>   coroutine = 0x0,
>   offset = 500663590912,
>   receiving = false,
> }, {
>   ...
> } },
>   reply = {
> simple = {
>   magic = 1732535960,
>   error = 0,
>   handle = 0
> },
> structured = {
>   magic = 1732535960,
>   flags = 0,
>   type = 0,
>   handle = 0,
>   length = 0
> },
>

RE: [PATCH] QIO: Add force_quit to the QIOChannel to ensure QIO exits cleanly in some corner case

2021-12-01 Thread Rao, Lei


-Original Message-
From: Vladimir Sementsov-Ogievskiy  
Sent: Wednesday, December 1, 2021 10:27 PM
To: Rao, Lei ; Daniel P. Berrangé 
Cc: Zhang, Chen ; ebl...@redhat.com; kw...@redhat.com; 
hre...@redhat.com; qemu-bl...@nongnu.org; qemu-devel@nongnu.org
Subject: Re: [PATCH] QIO: Add force_quit to the QIOChannel to ensure QIO exits 
cleanly in some corner case

01.12.2021 12:48, Rao, Lei wrote:
> 
> 
> -Original Message-
> From: Daniel P. Berrangé 
> Sent: Wednesday, December 1, 2021 5:11 PM
> To: Rao, Lei 
> Cc: Zhang, Chen ; ebl...@redhat.com; 
> vsement...@virtuozzo.com; kw...@redhat.com; hre...@redhat.com; 
> qemu-bl...@nongnu.org; qemu-devel@nongnu.org
> Subject: Re: [PATCH] QIO: Add force_quit to the QIOChannel to ensure 
> QIO exits cleanly in some corner case
> 
> On Wed, Dec 01, 2021 at 03:54:27PM +0800, Rao, Lei wrote:
>>  We found that the QIO channel coroutine could not be awakened in some 
>> corner cases during our stress test for COLO.
>>  The patch fixes as follow:
>>  #0  0x7fad72e24bf6 in __ppoll (fds=0x5563d75861f0, nfds=1, 
>> timeout=, sigmask=0x0) at 
>> ../sysdeps/unix/sysv/linux/ppoll.c:44
>>  #1  0x5563d6977c68 in qemu_poll_ns (fds=0x5563d75861f0, nfds=1, 
>> timeout=59697630) at util/qemu-timer.c:348
>>  #2  0x5563d697ac44 in aio_poll (ctx=0x5563d755dfa0, 
>> blocking=true) at util/aio-posix.c:669
>>  #3  0x5563d68ba24f in bdrv_do_drained_begin (bs=0x5563d75ea0c0, 
>> recursive=false, parent=0x0, ignore_bds_parents=false, poll=true) at 
>> block/io.c:432
>>  #4  0x5563d68ba338 in bdrv_drained_begin (bs=0x5563d75ea0c0) at 
>> block/io.c:438
>>  #5  0x5563d689c6d2 in quorum_del_child (bs=0x5563d75ea0c0, 
>> child=0x5563d7908600, errp=0x7fff3cf5b960) at block/quorum.c:1063
>>  #6  0x5563d684328f in bdrv_del_child (parent_bs=0x5563d75ea0c0, 
>> child=0x5563d7908600, errp=0x7fff3cf5b960) at block.c:6568
>>  #7  0x5563d658499e in qmp_x_blockdev_change 
>> (parent=0x5563d80ec4c0 "colo-disk0", has_child=true, child=0x5563d7577d30 
>> "children.1", has_node=false, node=0x0,errp=0x7fff3cf5b960) at 
>> blockdev.c:4494
>>  #8  0x5563d67e8b4e in qmp_marshal_x_blockdev_change 
>> (args=0x7fad6400ada0, ret=0x7fff3cf5b9f8, errp=0x7fff3cf5b9f0) at 
>> qapi/qapi-commands-block-core.c:1538
>>  #9  0x5563d691cd9e in do_qmp_dispatch (cmds=0x5563d719a4f0 
>> , request=0x7fad64009d80, allow_oob=true, errp=0x7fff3cf5ba98)
>>  at qapi/qmp-dispatch.c:132
>>  #10 0x5563d691cfab in qmp_dispatch (cmds=0x5563d719a4f0 
>> , request=0x7fad64009d80, allow_oob=true) at 
>> qapi/qmp-dispatch.c:175
>>  #11 0x5563d67b7787 in monitor_qmp_dispatch (mon=0x5563d7605d40, 
>> req=0x7fad64009d80) at monitor/qmp.c:145
>>  #12 0x5563d67b7b24 in monitor_qmp_bh_dispatcher (data=0x0) at 
>> monitor/qmp.c:234
>>  #13 0x5563d69754c2 in aio_bh_call (bh=0x5563d7473230) at 
>> util/async.c:89
>>  #14 0x5563d697555e in aio_bh_poll (ctx=0x5563d7471da0) at 
>> util/async.c:117
>>  #15 0x5563d697a423 in aio_dispatch (ctx=0x5563d7471da0) at 
>> util/aio-posix.c:459
>>  #16 0x5563d6975917 in aio_ctx_dispatch (source=0x5563d7471da0, 
>> callback=0x0, user_data=0x0) at util/async.c:260
>>  #17 0x7fad730e1fbd in g_main_context_dispatch () from 
>> /lib/x86_64-linux-gnu/libglib-2.0.so.0
>>  #18 0x5563d6978cda in glib_pollfds_poll () at 
>> util/main-loop.c:219
>>  #19 0x5563d6978d58 in os_host_main_loop_wait (timeout=977650) 
>> at util/main-loop.c:242
>>  #20 0x5563d6978e69 in main_loop_wait (nonblocking=0) at 
>> util/main-loop.c:518
>>  #21 0x5563d658f5ed in main_loop () at vl.c:1814
>>  #22 0x5563d6596bb7 in main (argc=61, 
>> argv=0x7fff3cf5c0c8,
>> envp=0x7fff3cf5c2b8) at vl.c:450
>>
>>  From the call trace, we can see that the QEMU main thread is waiting 
>> for the in_flight return to zero. But the in_filght never reaches 0.
>>  After analysis, the root cause is that the coroutine of NBD was not 
>> awakened. Although this bug was found in the COLO test, I think this is a
>>  universal problem in the QIO module. This issue also affects other 
>> modules depending on QIO such as NBD. We dump the following data:
>>  $2 = {
>>in_flight = 2,
>>state = NBD_CLIENT_QUIT,
>>connect_status = 0,
>>connect_err = 0x0,

Re: [PATCH] QIO: Add force_quit to the QIOChannel to ensure QIO exits cleanly in some corner case

2021-12-01 Thread Rao, Lei

Sorry, resending with correct indentation and quoting.

On 12/1/2021 10:27 PM, Vladimir Sementsov-Ogievskiy wrote:

01.12.2021 12:48, Rao, Lei wrote:



-Original Message-
From: Daniel P. Berrangé 
Sent: Wednesday, December 1, 2021 5:11 PM
To: Rao, Lei 
Cc: Zhang, Chen ; ebl...@redhat.com; 
vsement...@virtuozzo.com; kw...@redhat.com; hre...@redhat.com; qemu-bl...@nongnu.org; 
qemu-devel@nongnu.org
Subject: Re: [PATCH] QIO: Add force_quit to the QIOChannel to ensure QIO exits 
cleanly in some corner case

On Wed, Dec 01, 2021 at 03:54:27PM +0800, Rao, Lei wrote:

 We found that the QIO channel coroutine could not be awakened in some 
corner cases during our stress test for COLO.
 The patch fixes as follow:
 #0  0x7fad72e24bf6 in __ppoll (fds=0x5563d75861f0, nfds=1, 
timeout=, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:44
 #1  0x5563d6977c68 in qemu_poll_ns (fds=0x5563d75861f0, nfds=1, 
timeout=59697630) at util/qemu-timer.c:348
 #2  0x5563d697ac44 in aio_poll (ctx=0x5563d755dfa0, blocking=true) 
at util/aio-posix.c:669
 #3  0x5563d68ba24f in bdrv_do_drained_begin (bs=0x5563d75ea0c0, 
recursive=false, parent=0x0, ignore_bds_parents=false, poll=true) at 
block/io.c:432
 #4  0x5563d68ba338 in bdrv_drained_begin (bs=0x5563d75ea0c0) at 
block/io.c:438
 #5  0x5563d689c6d2 in quorum_del_child (bs=0x5563d75ea0c0, 
child=0x5563d7908600, errp=0x7fff3cf5b960) at block/quorum.c:1063
 #6  0x5563d684328f in bdrv_del_child (parent_bs=0x5563d75ea0c0, 
child=0x5563d7908600, errp=0x7fff3cf5b960) at block.c:6568
 #7  0x5563d658499e in qmp_x_blockdev_change (parent=0x5563d80ec4c0 
"colo-disk0", has_child=true, child=0x5563d7577d30 "children.1", 
has_node=false, node=0x0,errp=0x7fff3cf5b960) at blockdev.c:4494
 #8  0x5563d67e8b4e in qmp_marshal_x_blockdev_change 
(args=0x7fad6400ada0, ret=0x7fff3cf5b9f8, errp=0x7fff3cf5b9f0) at 
qapi/qapi-commands-block-core.c:1538
 #9  0x5563d691cd9e in do_qmp_dispatch (cmds=0x5563d719a4f0 
, request=0x7fad64009d80, allow_oob=true, errp=0x7fff3cf5ba98)
 at qapi/qmp-dispatch.c:132
 #10 0x5563d691cfab in qmp_dispatch (cmds=0x5563d719a4f0 
, request=0x7fad64009d80, allow_oob=true) at 
qapi/qmp-dispatch.c:175
 #11 0x5563d67b7787 in monitor_qmp_dispatch (mon=0x5563d7605d40, 
req=0x7fad64009d80) at monitor/qmp.c:145
 #12 0x5563d67b7b24 in monitor_qmp_bh_dispatcher (data=0x0) at 
monitor/qmp.c:234
 #13 0x5563d69754c2 in aio_bh_call (bh=0x5563d7473230) at 
util/async.c:89
 #14 0x5563d697555e in aio_bh_poll (ctx=0x5563d7471da0) at 
util/async.c:117
 #15 0x5563d697a423 in aio_dispatch (ctx=0x5563d7471da0) at 
util/aio-posix.c:459
 #16 0x5563d6975917 in aio_ctx_dispatch (source=0x5563d7471da0, 
callback=0x0, user_data=0x0) at util/async.c:260
 #17 0x7fad730e1fbd in g_main_context_dispatch () from 
/lib/x86_64-linux-gnu/libglib-2.0.so.0
 #18 0x5563d6978cda in glib_pollfds_poll () at util/main-loop.c:219
 #19 0x5563d6978d58 in os_host_main_loop_wait (timeout=977650) at 
util/main-loop.c:242
 #20 0x5563d6978e69 in main_loop_wait (nonblocking=0) at 
util/main-loop.c:518
 #21 0x5563d658f5ed in main_loop () at vl.c:1814
 #22 0x5563d6596bb7 in main (argc=61, argv=0x7fff3cf5c0c8,
envp=0x7fff3cf5c2b8) at vl.c:450

 From the call trace, we can see that the QEMU main thread is waiting for 
the in_flight return to zero. But the in_filght never reaches 0.
 After analysis, the root cause is that the coroutine of NBD was not 
awakened. Although this bug was found in the COLO test, I think this is a
 universal problem in the QIO module. This issue also affects other modules 
depending on QIO such as NBD. We dump the following data:
 $2 = {
   in_flight = 2,
   state = NBD_CLIENT_QUIT,
   connect_status = 0,
   connect_err = 0x0,
   wait_in_flight = false,
   requests = {{
   coroutine = 0x0,
   offset = 273077071872,
   receiving = false,
 }, {
   coroutine = 0x7f1164571df0,
   offset = 498792529920,
   receiving = false,
 }, {
   coroutine = 0x0,
   offset = 500663590912,
   receiving = false,
 }, {
   ...
 } },
   reply = {
 simple = {
   magic = 1732535960,
   error = 0,
   handle = 0
 },
 structured = {
   magic = 1732535960,
   flags = 0,
   type = 0,
   handle = 0,
   length = 0
 },
 {
   magic = 1732535960,
   _skip = 0,
   handle = 0
 }
   },
   bs = 0x7f11640e2140,
   reconnect_delay = 0,
   saddr = 0x7f11640e1f80,
   export = 0x7f11640dc560 "parent0",
 }

Re: [PATCH] QIO: Add force_quit to the QIOChannel to ensure QIO exits cleanly in some corner case

2021-12-02 Thread Rao, Lei




On 12/2/2021 5:54 PM, Vladimir Sementsov-Ogievskiy wrote:

02.12.2021 11:53, Daniel P. Berrangé wrote:

On Thu, Dec 02, 2021 at 01:14:47PM +0800, Rao, Lei wrote:

Sorry, resending with correct indentation and quoting.

On 12/1/2021 10:27 PM, Vladimir Sementsov-Ogievskiy wrote:

01.12.2021 12:48, Rao, Lei wrote:



-Original Message-
From: Daniel P. Berrangé 
Sent: Wednesday, December 1, 2021 5:11 PM
To: Rao, Lei 
Cc: Zhang, Chen ; ebl...@redhat.com; 
vsement...@virtuozzo.com; kw...@redhat.com; hre...@redhat.com; qemu-bl...@nongnu.org; 
qemu-devel@nongnu.org
Subject: Re: [PATCH] QIO: Add force_quit to the QIOChannel to ensure QIO exits 
cleanly in some corner case

On Wed, Dec 01, 2021 at 03:54:27PM +0800, Rao, Lei wrote:

  We found that the QIO channel coroutine could not be awakened in some 
corner cases during our stress test for COLO.
  The patch fixes as follow:
  #0  0x7fad72e24bf6 in __ppoll (fds=0x5563d75861f0, nfds=1, 
timeout=, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:44
  #1  0x5563d6977c68 in qemu_poll_ns (fds=0x5563d75861f0, nfds=1, 
timeout=59697630) at util/qemu-timer.c:348
  #2  0x5563d697ac44 in aio_poll (ctx=0x5563d755dfa0, 
blocking=true) at util/aio-posix.c:669
  #3  0x5563d68ba24f in bdrv_do_drained_begin (bs=0x5563d75ea0c0, 
recursive=false, parent=0x0, ignore_bds_parents=false, poll=true) at 
block/io.c:432
  #4  0x5563d68ba338 in bdrv_drained_begin (bs=0x5563d75ea0c0) at 
block/io.c:438
  #5  0x5563d689c6d2 in quorum_del_child (bs=0x5563d75ea0c0, 
child=0x5563d7908600, errp=0x7fff3cf5b960) at block/quorum.c:1063
  #6  0x5563d684328f in bdrv_del_child (parent_bs=0x5563d75ea0c0, 
child=0x5563d7908600, errp=0x7fff3cf5b960) at block.c:6568
  #7  0x5563d658499e in qmp_x_blockdev_change (parent=0x5563d80ec4c0 
"colo-disk0", has_child=true, child=0x5563d7577d30 "children.1", 
has_node=false, node=0x0,errp=0x7fff3cf5b960) at blockdev.c:4494
  #8  0x5563d67e8b4e in qmp_marshal_x_blockdev_change 
(args=0x7fad6400ada0, ret=0x7fff3cf5b9f8, errp=0x7fff3cf5b9f0) at 
qapi/qapi-commands-block-core.c:1538
  #9  0x5563d691cd9e in do_qmp_dispatch (cmds=0x5563d719a4f0 
, request=0x7fad64009d80, allow_oob=true, errp=0x7fff3cf5ba98)
  at qapi/qmp-dispatch.c:132
  #10 0x5563d691cfab in qmp_dispatch (cmds=0x5563d719a4f0 
, request=0x7fad64009d80, allow_oob=true) at 
qapi/qmp-dispatch.c:175
  #11 0x5563d67b7787 in monitor_qmp_dispatch (mon=0x5563d7605d40, 
req=0x7fad64009d80) at monitor/qmp.c:145
  #12 0x5563d67b7b24 in monitor_qmp_bh_dispatcher (data=0x0) at 
monitor/qmp.c:234
  #13 0x5563d69754c2 in aio_bh_call (bh=0x5563d7473230) at 
util/async.c:89
  #14 0x5563d697555e in aio_bh_poll (ctx=0x5563d7471da0) at 
util/async.c:117
  #15 0x5563d697a423 in aio_dispatch (ctx=0x5563d7471da0) at 
util/aio-posix.c:459
  #16 0x5563d6975917 in aio_ctx_dispatch (source=0x5563d7471da0, 
callback=0x0, user_data=0x0) at util/async.c:260
  #17 0x7fad730e1fbd in g_main_context_dispatch () from 
/lib/x86_64-linux-gnu/libglib-2.0.so.0
  #18 0x5563d6978cda in glib_pollfds_poll () at util/main-loop.c:219
  #19 0x5563d6978d58 in os_host_main_loop_wait (timeout=977650) at 
util/main-loop.c:242
  #20 0x5563d6978e69 in main_loop_wait (nonblocking=0) at 
util/main-loop.c:518
  #21 0x5563d658f5ed in main_loop () at vl.c:1814
  #22 0x5563d6596bb7 in main (argc=61, argv=0x7fff3cf5c0c8,
envp=0x7fff3cf5c2b8) at vl.c:450

  From the call trace, we can see that the QEMU main thread is waiting for 
the in_flight return to zero. But the in_filght never reaches 0.
  After analysis, the root cause is that the coroutine of NBD was not 
awakened. Although this bug was found in the COLO test, I think this is a
  universal problem in the QIO module. This issue also affects other 
modules depending on QIO such as NBD. We dump the following data:
  $2 = {
    in_flight = 2,
    state = NBD_CLIENT_QUIT,
    connect_status = 0,
    connect_err = 0x0,
    wait_in_flight = false,
    requests = {{
    coroutine = 0x0,
    offset = 273077071872,
    receiving = false,
  }, {
    coroutine = 0x7f1164571df0,
    offset = 498792529920,
    receiving = false,
  }, {
    coroutine = 0x0,
    offset = 500663590912,
    receiving = false,
  }, {
    ...
  } },
    reply = {
  simple = {
    magic = 1732535960,
    error = 0,
    handle = 0
  },
  structured = {
    magic = 1732535960,
    flags = 0,
    type = 0,
    handle = 0,
    length = 0
  },
  {
  

[PATCH] COLO: Move some trace code behind qemu_mutex_unlock_iothread()

2021-12-09 Thread Rao, Lei
Signed-off-by: Lei Rao 
---
 migration/colo.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 2415325262..3ccacb29c8 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -683,8 +683,8 @@ static void 
colo_incoming_process_checkpoint(MigrationIncomingState *mis,
 
 qemu_mutex_lock_iothread();
 vm_stop_force_state(RUN_STATE_COLO);
-trace_colo_vm_state_change("run", "stop");
 qemu_mutex_unlock_iothread();
+trace_colo_vm_state_change("run", "stop");
 
 /* FIXME: This is unnecessary for periodic checkpoint mode */
 colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY,
@@ -786,8 +786,8 @@ static void 
colo_incoming_process_checkpoint(MigrationIncomingState *mis,
 
 vmstate_loading = false;
 vm_start();
-trace_colo_vm_state_change("stop", "run");
 qemu_mutex_unlock_iothread();
+trace_colo_vm_state_change("stop", "run");
 
 if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
 return;
@@ -870,8 +870,8 @@ void *colo_process_incoming_thread(void *opaque)
 abort();
 #endif
 vm_start();
-trace_colo_vm_state_change("stop", "run");
 qemu_mutex_unlock_iothread();
+trace_colo_vm_state_change("stop", "run");
 
 colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY,
   &local_err);
-- 
2.32.0




Re: [PATCH] QIO: Add force_quit to the QIOChannel to ensure QIO exits cleanly in some corner case

2021-12-13 Thread Rao, Lei




On 12/3/2021 9:26 AM, Rao, Lei wrote:



On 12/2/2021 5:54 PM, Vladimir Sementsov-Ogievskiy wrote:

02.12.2021 11:53, Daniel P. Berrangé wrote:

On Thu, Dec 02, 2021 at 01:14:47PM +0800, Rao, Lei wrote:

Sorry, resending with correct indentation and quoting.

On 12/1/2021 10:27 PM, Vladimir Sementsov-Ogievskiy wrote:

01.12.2021 12:48, Rao, Lei wrote:



-Original Message-
From: Daniel P. Berrangé 
Sent: Wednesday, December 1, 2021 5:11 PM
To: Rao, Lei 
Cc: Zhang, Chen ; ebl...@redhat.com; 
vsement...@virtuozzo.com; kw...@redhat.com; hre...@redhat.com; qemu-bl...@nongnu.org; 
qemu-devel@nongnu.org
Subject: Re: [PATCH] QIO: Add force_quit to the QIOChannel to ensure QIO exits 
cleanly in some corner case

On Wed, Dec 01, 2021 at 03:54:27PM +0800, Rao, Lei wrote:

  We found that the QIO channel coroutine could not be awakened in some 
corner cases during our stress test for COLO.
  The patch fixes as follow:
  #0  0x7fad72e24bf6 in __ppoll (fds=0x5563d75861f0, nfds=1, 
timeout=, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:44
  #1  0x5563d6977c68 in qemu_poll_ns (fds=0x5563d75861f0, nfds=1, 
timeout=59697630) at util/qemu-timer.c:348
  #2  0x5563d697ac44 in aio_poll (ctx=0x5563d755dfa0, 
blocking=true) at util/aio-posix.c:669
  #3  0x5563d68ba24f in bdrv_do_drained_begin (bs=0x5563d75ea0c0, 
recursive=false, parent=0x0, ignore_bds_parents=false, poll=true) at 
block/io.c:432
  #4  0x5563d68ba338 in bdrv_drained_begin (bs=0x5563d75ea0c0) at 
block/io.c:438
  #5  0x5563d689c6d2 in quorum_del_child (bs=0x5563d75ea0c0, 
child=0x5563d7908600, errp=0x7fff3cf5b960) at block/quorum.c:1063
  #6  0x5563d684328f in bdrv_del_child (parent_bs=0x5563d75ea0c0, 
child=0x5563d7908600, errp=0x7fff3cf5b960) at block.c:6568
  #7  0x5563d658499e in qmp_x_blockdev_change (parent=0x5563d80ec4c0 
"colo-disk0", has_child=true, child=0x5563d7577d30 "children.1", 
has_node=false, node=0x0,errp=0x7fff3cf5b960) at blockdev.c:4494
  #8  0x5563d67e8b4e in qmp_marshal_x_blockdev_change 
(args=0x7fad6400ada0, ret=0x7fff3cf5b9f8, errp=0x7fff3cf5b9f0) at 
qapi/qapi-commands-block-core.c:1538
  #9  0x5563d691cd9e in do_qmp_dispatch (cmds=0x5563d719a4f0 
, request=0x7fad64009d80, allow_oob=true, errp=0x7fff3cf5ba98)
  at qapi/qmp-dispatch.c:132
  #10 0x5563d691cfab in qmp_dispatch (cmds=0x5563d719a4f0 
, request=0x7fad64009d80, allow_oob=true) at 
qapi/qmp-dispatch.c:175
  #11 0x5563d67b7787 in monitor_qmp_dispatch (mon=0x5563d7605d40, 
req=0x7fad64009d80) at monitor/qmp.c:145
  #12 0x5563d67b7b24 in monitor_qmp_bh_dispatcher (data=0x0) at 
monitor/qmp.c:234
  #13 0x5563d69754c2 in aio_bh_call (bh=0x5563d7473230) at 
util/async.c:89
  #14 0x5563d697555e in aio_bh_poll (ctx=0x5563d7471da0) at 
util/async.c:117
  #15 0x5563d697a423 in aio_dispatch (ctx=0x5563d7471da0) at 
util/aio-posix.c:459
  #16 0x5563d6975917 in aio_ctx_dispatch (source=0x5563d7471da0, 
callback=0x0, user_data=0x0) at util/async.c:260
  #17 0x7fad730e1fbd in g_main_context_dispatch () from 
/lib/x86_64-linux-gnu/libglib-2.0.so.0
  #18 0x5563d6978cda in glib_pollfds_poll () at util/main-loop.c:219
  #19 0x5563d6978d58 in os_host_main_loop_wait (timeout=977650) at 
util/main-loop.c:242
  #20 0x5563d6978e69 in main_loop_wait (nonblocking=0) at 
util/main-loop.c:518
  #21 0x5563d658f5ed in main_loop () at vl.c:1814
  #22 0x5563d6596bb7 in main (argc=61, argv=0x7fff3cf5c0c8,
envp=0x7fff3cf5c2b8) at vl.c:450


Signed-off-by: Lei Rao 
Signed-off-by: Zhang Chen 
---
   block/nbd.c  |  5 +
   include/io/channel.h | 19 +++
   io/channel.c | 22 ++
   3 files changed, 46 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c index 5ef462db1b..5ee4eaaf57
100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -208,6 +208,8 @@ static void nbd_teardown_connection(BlockDriverState *bs)
   assert(!s->in_flight);
   if (s->ioc) {
+    qio_channel_set_force_quit(s->ioc, true);
+    qio_channel_coroutines_wake(s->ioc);
   qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH,
NULL);


Calling shutdown here should already be causing the qio_channel_readv_all to
wake up and break out of its
poll() loop. We shouldn't need to add a second way to break out of the poll().

Calling shutdown can wake up the coroutine that is polling, but I think it's
not enough. I tried to forcibly wake up the NBD coroutine, and
it may cause a segmentation fault. The root cause is that the coroutine will
continue to access ioc->xxx in qio_channel_yield(), but the ioc has already
been released and set to NULL, for example in
nbd_co_do_establish_connection(); I think calling shutdown will have the same
result. So,
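
(Again for illustration only, with invented names rather than QEMU's real
types: a minimal sketch of why forcibly resuming the parked operation can
crash once the channel has been released and the owner's pointer cleared.)

#include <stdio.h>
#include <stdlib.h>

struct channel {
    int fd;
};

struct owner {
    struct channel *ioc;   /* cleared by the teardown path */
};

static void resume_parked_op(struct owner *o)
{
    /* The resumed code re-reads o->ioc, which is now NULL: crash. */
    printf("resuming on fd %d\n", o->ioc->fd);
}

int main(void)
{
    struct owner o = { .ioc = calloc(1, sizeof(struct channel)) };

    /* Teardown runs first: release the channel and clear the pointer. */
    free(o.ioc);
    o.ioc = NULL;

    /* Forcibly waking the parked operation now dereferences NULL. */
    resume_parked_op(&o);
    return 0;
}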

Re: [PATCH] QIO: Add force_quit to the QIOChannel to ensure QIO exits cleanly in some corner case

2021-12-13 Thread Rao, Lei




On 12/13/2021 4:38 PM, Vladimir Sementsov-Ogievskiy wrote:

13.12.2021 11:02, Rao, Lei wrote:



On 12/3/2021 9:26 AM, Rao, Lei wrote:



On 12/2/2021 5:54 PM, Vladimir Sementsov-Ogievskiy wrote:

02.12.2021 11:53, Daniel P. Berrangé wrote:

On Thu, Dec 02, 2021 at 01:14:47PM +0800, Rao, Lei wrote:

Sorry, resending with correct indentation and quoting.

On 12/1/2021 10:27 PM, Vladimir Sementsov-Ogievskiy wrote:

01.12.2021 12:48, Rao, Lei wrote:



-Original Message-
From: Daniel P. Berrangé 
Sent: Wednesday, December 1, 2021 5:11 PM
To: Rao, Lei 
Cc: Zhang, Chen ; ebl...@redhat.com; 
vsement...@virtuozzo.com; kw...@redhat.com; hre...@redhat.com; qemu-bl...@nongnu.org; 
qemu-devel@nongnu.org
Subject: Re: [PATCH] QIO: Add force_quit to the QIOChannel to ensure QIO exits 
cleanly in some corner case

On Wed, Dec 01, 2021 at 03:54:27PM +0800, Rao, Lei wrote:

  We found that the QIO channel coroutine could not be awakened in some 
corner cases during our stress test for COLO.
  The patch fixes as follow:
  #0  0x7fad72e24bf6 in __ppoll (fds=0x5563d75861f0, nfds=1, 
timeout=, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:44
  #1  0x5563d6977c68 in qemu_poll_ns (fds=0x5563d75861f0, nfds=1, 
timeout=59697630) at util/qemu-timer.c:348
  #2  0x5563d697ac44 in aio_poll (ctx=0x5563d755dfa0, 
blocking=true) at util/aio-posix.c:669
  #3  0x5563d68ba24f in bdrv_do_drained_begin (bs=0x5563d75ea0c0, 
recursive=false, parent=0x0, ignore_bds_parents=false, poll=true) at 
block/io.c:432
  #4  0x5563d68ba338 in bdrv_drained_begin (bs=0x5563d75ea0c0) at 
block/io.c:438
  #5  0x5563d689c6d2 in quorum_del_child (bs=0x5563d75ea0c0, 
child=0x5563d7908600, errp=0x7fff3cf5b960) at block/quorum.c:1063
  #6  0x5563d684328f in bdrv_del_child (parent_bs=0x5563d75ea0c0, 
child=0x5563d7908600, errp=0x7fff3cf5b960) at block.c:6568
  #7  0x5563d658499e in qmp_x_blockdev_change (parent=0x5563d80ec4c0 
"colo-disk0", has_child=true, child=0x5563d7577d30 "children.1", 
has_node=false, node=0x0,errp=0x7fff3cf5b960) at blockdev.c:4494
  #8  0x5563d67e8b4e in qmp_marshal_x_blockdev_change 
(args=0x7fad6400ada0, ret=0x7fff3cf5b9f8, errp=0x7fff3cf5b9f0) at 
qapi/qapi-commands-block-core.c:1538
  #9  0x5563d691cd9e in do_qmp_dispatch (cmds=0x5563d719a4f0 
, request=0x7fad64009d80, allow_oob=true, errp=0x7fff3cf5ba98)
  at qapi/qmp-dispatch.c:132
  #10 0x5563d691cfab in qmp_dispatch (cmds=0x5563d719a4f0 
, request=0x7fad64009d80, allow_oob=true) at 
qapi/qmp-dispatch.c:175
  #11 0x5563d67b7787 in monitor_qmp_dispatch (mon=0x5563d7605d40, 
req=0x7fad64009d80) at monitor/qmp.c:145
  #12 0x5563d67b7b24 in monitor_qmp_bh_dispatcher (data=0x0) at 
monitor/qmp.c:234
  #13 0x5563d69754c2 in aio_bh_call (bh=0x5563d7473230) at 
util/async.c:89
  #14 0x5563d697555e in aio_bh_poll (ctx=0x5563d7471da0) at 
util/async.c:117
  #15 0x5563d697a423 in aio_dispatch (ctx=0x5563d7471da0) at 
util/aio-posix.c:459
  #16 0x5563d6975917 in aio_ctx_dispatch (source=0x5563d7471da0, 
callback=0x0, user_data=0x0) at util/async.c:260
  #17 0x7fad730e1fbd in g_main_context_dispatch () from 
/lib/x86_64-linux-gnu/libglib-2.0.so.0
  #18 0x5563d6978cda in glib_pollfds_poll () at util/main-loop.c:219
  #19 0x5563d6978d58 in os_host_main_loop_wait (timeout=977650) at 
util/main-loop.c:242
  #20 0x5563d6978e69 in main_loop_wait (nonblocking=0) at 
util/main-loop.c:518
  #21 0x5563d658f5ed in main_loop () at vl.c:1814
  #22 0x5563d6596bb7 in main (argc=61, argv=0x7fff3cf5c0c8,
envp=0x7fff3cf5c2b8) at vl.c:450


Signed-off-by: Lei Rao 
Signed-off-by: Zhang Chen 
---
   block/nbd.c  |  5 +
   include/io/channel.h | 19 +++
   io/channel.c | 22 ++
   3 files changed, 46 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c index 5ef462db1b..5ee4eaaf57
100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -208,6 +208,8 @@ static void nbd_teardown_connection(BlockDriverState *bs)
   assert(!s->in_flight);
   if (s->ioc) {
+    qio_channel_set_force_quit(s->ioc, true);
+    qio_channel_coroutines_wake(s->ioc);
   qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH,
NULL);


Calling shutdown here should already be causing the qio_chanel_readv_all to 
wakeup and break out of its
poll() loop. We shouldn't need to add a second way to break out of the poll().

Calling shutdown can wake up the coroutines which is polling. But I think it's 
not enough. I tried to forcibly wake up the NBD coroutine,
It may cause segment fault. The root cause is that it will continue to access 
ioc->xxx in qio_channel_yield(), but the ioc has been released and set it NULL 
such as i

Re: [PATCH] COLO: Move some trace code behind qemu_mutex_unlock_iothread()

2021-12-13 Thread Rao, Lei




On 12/13/2021 7:45 PM, Dr. David Alan Gilbert wrote:

* Rao, Lei (lei@intel.com) wrote:

Signed-off-by: Lei Rao 


You don't say why you want to move it - it's just a trace, what's the
advantage?


I think it's not necessary to put the trace code in the critical section.
So, moving it behind qemu_mutex_unlock_iothread() may reduce the lock hold time.
I will send a V2 and add a commit message to say why.

Thanks,
Lei



Dave


---
  migration/colo.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 2415325262..3ccacb29c8 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -683,8 +683,8 @@ static void 
colo_incoming_process_checkpoint(MigrationIncomingState *mis,
  
  qemu_mutex_lock_iothread();

  vm_stop_force_state(RUN_STATE_COLO);
-trace_colo_vm_state_change("run", "stop");
  qemu_mutex_unlock_iothread();
+trace_colo_vm_state_change("run", "stop");
  
  /* FIXME: This is unnecessary for periodic checkpoint mode */

  colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY,
@@ -786,8 +786,8 @@ static void 
colo_incoming_process_checkpoint(MigrationIncomingState *mis,
  
  vmstate_loading = false;

  vm_start();
-trace_colo_vm_state_change("stop", "run");
  qemu_mutex_unlock_iothread();
+trace_colo_vm_state_change("stop", "run");
  
  if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {

  return;
@@ -870,8 +870,8 @@ void *colo_process_incoming_thread(void *opaque)
  abort();
  #endif
  vm_start();
-trace_colo_vm_state_change("stop", "run");
  qemu_mutex_unlock_iothread();
+trace_colo_vm_state_change("stop", "run");
  
  colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY,

&local_err);
--
2.32.0





[PATCH v2] COLO: Move some trace code behind qemu_mutex_unlock_iothread()

2021-12-13 Thread Rao, Lei
There is no need to put the trace code in the critical section.
So, moving it behind qemu_mutex_unlock_iothread() can reduce the
lock hold time.

Signed-off-by: Lei Rao 
---
 migration/colo.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 2415325262..3ccacb29c8 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -683,8 +683,8 @@ static void 
colo_incoming_process_checkpoint(MigrationIncomingState *mis,
 
 qemu_mutex_lock_iothread();
 vm_stop_force_state(RUN_STATE_COLO);
-trace_colo_vm_state_change("run", "stop");
 qemu_mutex_unlock_iothread();
+trace_colo_vm_state_change("run", "stop");
 
 /* FIXME: This is unnecessary for periodic checkpoint mode */
 colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY,
@@ -786,8 +786,8 @@ static void 
colo_incoming_process_checkpoint(MigrationIncomingState *mis,
 
 vmstate_loading = false;
 vm_start();
-trace_colo_vm_state_change("stop", "run");
 qemu_mutex_unlock_iothread();
+trace_colo_vm_state_change("stop", "run");
 
 if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
 return;
@@ -870,8 +870,8 @@ void *colo_process_incoming_thread(void *opaque)
 abort();
 #endif
 vm_start();
-trace_colo_vm_state_change("stop", "run");
 qemu_mutex_unlock_iothread();
+trace_colo_vm_state_change("stop", "run");
 
 colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY,
   &local_err);
-- 
2.32.0




[PATCH] net/filter: Optimize filter_send to coroutine

2021-12-23 Thread Rao, Lei
This patch improves the QEMU main thread sleep logic in
qemu_chr_write_buffer(), where the main thread can be blocked and unable to
run other coroutines during the COLO IO stress test.

Our approach is to put filter_send() in a coroutine. In this way,
filter_send() will call qemu_coroutine_yield() in qemu_co_sleep_ns(),
so that it can be scheduled out and the QEMU main thread has the opportunity
to run other tasks.

Signed-off-by: Lei Rao 
Signed-off-by: Zhang Chen 
---
 net/filter-mirror.c | 67 -
 1 file changed, 54 insertions(+), 13 deletions(-)

diff --git a/net/filter-mirror.c b/net/filter-mirror.c
index f20240cc9f..1e9f8b6216 100644
--- a/net/filter-mirror.c
+++ b/net/filter-mirror.c
@@ -20,6 +20,7 @@
 #include "chardev/char-fe.h"
 #include "qemu/iov.h"
 #include "qemu/sockets.h"
+#include "block/aio-wait.h"
 
 #define TYPE_FILTER_MIRROR "filter-mirror"
 typedef struct MirrorState MirrorState;
@@ -42,20 +43,21 @@ struct MirrorState {
 bool vnet_hdr;
 };
 
-static int filter_send(MirrorState *s,
-   const struct iovec *iov,
-   int iovcnt)
+typedef struct FilterSendCo {
+MirrorState *s;
+char *buf;
+ssize_t size;
+bool done;
+int ret;
+} FilterSendCo;
+
+static int _filter_send(MirrorState *s,
+   char *buf,
+   ssize_t size)
 {
 NetFilterState *nf = NETFILTER(s);
 int ret = 0;
-ssize_t size = 0;
 uint32_t len = 0;
-char *buf;
-
-size = iov_size(iov, iovcnt);
-if (!size) {
-return 0;
-}
 
 len = htonl(size);
 ret = qemu_chr_fe_write_all(&s->chr_out, (uint8_t *)&len, sizeof(len));
@@ -80,10 +82,7 @@ static int filter_send(MirrorState *s,
 }
 }
 
-buf = g_malloc(size);
-iov_to_buf(iov, iovcnt, 0, buf, size);
 ret = qemu_chr_fe_write_all(&s->chr_out, (uint8_t *)buf, size);
-g_free(buf);
 if (ret != size) {
 goto err;
 }
@@ -94,6 +93,48 @@ err:
 return ret < 0 ? ret : -EIO;
 }
 
+static void coroutine_fn filter_send_co(void *opaque)
+{
+FilterSendCo *data = opaque;
+
+data->ret = _filter_send(data->s, data->buf, data->size);
+data->done = true;
+g_free(data->buf);
+aio_wait_kick();
+}
+
+static int filter_send(MirrorState *s,
+   const struct iovec *iov,
+   int iovcnt)
+{
+ssize_t size = iov_size(iov, iovcnt);
+char *buf = NULL;
+
+if (!size) {
+return 0;
+}
+
+buf = g_malloc(size);
+iov_to_buf(iov, iovcnt, 0, buf, size);
+
+FilterSendCo data = {
+.s = s,
+.size = size,
+.buf = buf,
+.ret = 0,
+};
+
+Coroutine *co = qemu_coroutine_create(filter_send_co, &data);
+qemu_coroutine_enter(co);
+
+while (!data.done) {
+aio_poll(qemu_get_aio_context(), true);
+}
+
+return data.ret;
+
+}
+
 static void redirector_to_filter(NetFilterState *nf,
  const uint8_t *buf,
  int len)
-- 
2.32.0




Re: [PATCH] net/filter: Optimize filter_send to coroutine

2021-12-24 Thread Rao, Lei




On 12/24/2021 6:07 PM, lizhij...@fujitsu.com wrote:



On 24/12/2021 10:37, Rao, Lei wrote:

This patch is to improve the logic of QEMU main thread sleep code in
qemu_chr_write_buffer() where it can be blocked and can't run other
coroutines during COLO IO stress test.

Our approach is to put filter_send() in a coroutine. In this way,
filter_send() will call qemu_coroutine_yield() in qemu_co_sleep_ns(),
so that it can be scheduled out and QEMU main thread has opportunity to
run other tasks.

Signed-off-by: Lei Rao 
Signed-off-by: Zhang Chen 
---
   net/filter-mirror.c | 67 -
   1 file changed, 54 insertions(+), 13 deletions(-)

diff --git a/net/filter-mirror.c b/net/filter-mirror.c
index f20240cc9f..1e9f8b6216 100644
--- a/net/filter-mirror.c
+++ b/net/filter-mirror.c
@@ -20,6 +20,7 @@
   #include "chardev/char-fe.h"
   #include "qemu/iov.h"
   #include "qemu/sockets.h"
+#include "block/aio-wait.h"
   
   #define TYPE_FILTER_MIRROR "filter-mirror"

   typedef struct MirrorState MirrorState;
@@ -42,20 +43,21 @@ struct MirrorState {
   bool vnet_hdr;
   };
   
-static int filter_send(MirrorState *s,

-   const struct iovec *iov,
-   int iovcnt)
+typedef struct FilterSendCo {
+MirrorState *s;
+char *buf;
+ssize_t size;
+bool done;
+int ret;
+} FilterSendCo;
+
+static int _filter_send(MirrorState *s,
+   char *buf,
+   ssize_t size)
   {
   NetFilterState *nf = NETFILTER(s);
   int ret = 0;
-ssize_t size = 0;
   uint32_t len = 0;
-char *buf;
-
-size = iov_size(iov, iovcnt);
-if (!size) {
-return 0;
-}
   
   len = htonl(size);

   ret = qemu_chr_fe_write_all(&s->chr_out, (uint8_t *)&len, sizeof(len));
@@ -80,10 +82,7 @@ static int filter_send(MirrorState *s,
   }
   }
   
-buf = g_malloc(size);

-iov_to_buf(iov, iovcnt, 0, buf, size);
   ret = qemu_chr_fe_write_all(&s->chr_out, (uint8_t *)buf, size);
-g_free(buf);
   if (ret != size) {
   goto err;
   }
@@ -94,6 +93,48 @@ err:
   return ret < 0 ? ret : -EIO;
   }
   
+static void coroutine_fn filter_send_co(void *opaque)

+{
+FilterSendCo *data = opaque;
+
+data->ret = _filter_send(data->s, data->buf, data->size);
+data->done = true;
+g_free(data->buf);
+aio_wait_kick();
+}
+
+static int filter_send(MirrorState *s,
+   const struct iovec *iov,
+   int iovcnt)
+{
+ssize_t size = iov_size(iov, iovcnt);
+char *buf = NULL;
+
+if (!size) {
+return 0;
+}
+
+buf = g_malloc(size);
+iov_to_buf(iov, iovcnt, 0, buf, size);
+
+FilterSendCo data = {
+.s = s,
+.size = size,
+.buf = buf,
+.ret = 0,
+};
+
+Coroutine *co = qemu_coroutine_create(filter_send_co, &data);


BTW, do QEMU's older supported GCC versions complain about such a coding style?

int a;
a = foo();
int b = a;


There is a lot of code in this style in QEMU.
The configure file states that we need at least GCC v7.4 to compile QEMU.
So, I think it is not a problem.






+qemu_coroutine_enter(co);
+
+while (!data.done) {
+aio_poll(qemu_get_aio_context(), true);
+}
+
+return data.ret;
+

redundant  newline


will be changed in V2.

Thanks,
Lei



Otherwise,
Reviewed-by: Li Zhijian 




+}
+
   static void redirector_to_filter(NetFilterState *nf,
const uint8_t *buf,
int len)




[PATCH v2] net/filter: Optimize filter_send to coroutine

2021-12-26 Thread Rao Lei
This patch improves the QEMU main thread sleep logic in
qemu_chr_write_buffer(), where the main thread can be blocked and unable to
run other coroutines during the COLO IO stress test.

Our approach is to put filter_send() in a coroutine. In this way,
filter_send() will call qemu_coroutine_yield() in qemu_co_sleep_ns(),
so that it can be scheduled out and the QEMU main thread has the opportunity
to run other tasks.

Signed-off-by: Lei Rao 
Signed-off-by: Zhang Chen 
Reviewed-by: Li Zhijian 
---
 net/filter-mirror.c | 66 -
 1 file changed, 53 insertions(+), 13 deletions(-)

diff --git a/net/filter-mirror.c b/net/filter-mirror.c
index f20240cc9f..34a63b5dbb 100644
--- a/net/filter-mirror.c
+++ b/net/filter-mirror.c
@@ -20,6 +20,7 @@
 #include "chardev/char-fe.h"
 #include "qemu/iov.h"
 #include "qemu/sockets.h"
+#include "block/aio-wait.h"
 
 #define TYPE_FILTER_MIRROR "filter-mirror"
 typedef struct MirrorState MirrorState;
@@ -42,20 +43,21 @@ struct MirrorState {
 bool vnet_hdr;
 };
 
-static int filter_send(MirrorState *s,
-   const struct iovec *iov,
-   int iovcnt)
+typedef struct FilterSendCo {
+MirrorState *s;
+char *buf;
+ssize_t size;
+bool done;
+int ret;
+} FilterSendCo;
+
+static int _filter_send(MirrorState *s,
+   char *buf,
+   ssize_t size)
 {
 NetFilterState *nf = NETFILTER(s);
 int ret = 0;
-ssize_t size = 0;
 uint32_t len = 0;
-char *buf;
-
-size = iov_size(iov, iovcnt);
-if (!size) {
-return 0;
-}
 
 len = htonl(size);
 ret = qemu_chr_fe_write_all(&s->chr_out, (uint8_t *)&len, sizeof(len));
@@ -80,10 +82,7 @@ static int filter_send(MirrorState *s,
 }
 }
 
-buf = g_malloc(size);
-iov_to_buf(iov, iovcnt, 0, buf, size);
 ret = qemu_chr_fe_write_all(&s->chr_out, (uint8_t *)buf, size);
-g_free(buf);
 if (ret != size) {
 goto err;
 }
@@ -94,6 +93,47 @@ err:
 return ret < 0 ? ret : -EIO;
 }
 
+static void coroutine_fn filter_send_co(void *opaque)
+{
+FilterSendCo *data = opaque;
+
+data->ret = _filter_send(data->s, data->buf, data->size);
+data->done = true;
+g_free(data->buf);
+aio_wait_kick();
+}
+
+static int filter_send(MirrorState *s,
+   const struct iovec *iov,
+   int iovcnt)
+{
+ssize_t size = iov_size(iov, iovcnt);
+char *buf = NULL;
+
+if (!size) {
+return 0;
+}
+
+buf = g_malloc(size);
+iov_to_buf(iov, iovcnt, 0, buf, size);
+
+FilterSendCo data = {
+.s = s,
+.size = size,
+.buf = buf,
+.ret = 0,
+};
+
+Coroutine *co = qemu_coroutine_create(filter_send_co, &data);
+qemu_coroutine_enter(co);
+
+while (!data.done) {
+aio_poll(qemu_get_aio_context(), true);
+}
+
+return data.ret;
+}
+
 static void redirector_to_filter(NetFilterState *nf,
  const uint8_t *buf,
  int len)
-- 
2.32.0
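
A side note on the pattern used above: the busy-wait loop
"while (!data.done) { aio_poll(qemu_get_aio_context(), true); }" paired with
aio_wait_kick() in the coroutine is essentially what the AIO_WAIT_WHILE()
helper from "block/aio-wait.h" (already included by this patch) wraps up.
A rough sketch of the equivalent, assuming filter_send() is only ever called
from the main loop thread:

    Coroutine *co = qemu_coroutine_create(filter_send_co, &data);
    qemu_coroutine_enter(co);
    AIO_WAIT_WHILE(NULL, !data.done);   /* woken up by aio_wait_kick() */
    return data.ret;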




[PATCH] migration/colo.c: Add missed return in error handling

2021-12-27 Thread Rao Lei
When doing failover and checkpoint, some return statements are missing in the
error handling paths. Let's add them.

Signed-off-by: Lei Rao 
---
 migration/colo.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 5f7071b3cd..014d3cba01 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -94,12 +94,15 @@ static void secondary_vm_do_failover(void)
 if (local_err) {
 error_report_err(local_err);
 local_err = NULL;
+return;
 }
 
 /* Notify all filters of all NIC to do checkpoint */
 colo_notify_filters_event(COLO_EVENT_FAILOVER, &local_err);
 if (local_err) {
 error_report_err(local_err);
+local_err = NULL;
+return;
 }
 
 if (!autostart) {
@@ -178,6 +181,7 @@ static void primary_vm_do_failover(void)
 if (local_err) {
 error_report_err(local_err);
 local_err = NULL;
+return;
 }
 
 /* Notify COLO thread that failover work is finished */
@@ -507,12 +511,11 @@ static int colo_do_checkpoint_transaction(MigrationState 
*s,
 goto out;
 }
 
-ret = 0;
-
 qemu_mutex_lock_iothread();
 vm_start();
 qemu_mutex_unlock_iothread();
 trace_colo_vm_state_change("stop", "run");
+return 0;
 
 out:
 if (local_err) {
-- 
2.32.0




[RFC PATCH] ui/vnc.c: Fixed a deadlock bug.

2021-12-28 Thread Rao Lei
The GDB stack is as follows:
(gdb) bt
0  __lll_lock_wait (futex=futex@entry=0x56211df20360, private=0) at 
lowlevellock.c:52
1  0x7f263caf20a3 in __GI___pthread_mutex_lock (mutex=0x56211df20360) at 
../nptl/pthread_mutex_lock.c:80
2  0x56211a757364 in qemu_mutex_lock_impl (mutex=0x56211df20360, 
file=0x56211a804857 "../ui/vnc-jobs.h", line=60)
at ../util/qemu-thread-posix.c:80
3  0x56211a0ef8c7 in vnc_lock_output (vs=0x56211df14200) at 
../ui/vnc-jobs.h:60
4  0x56211a0efcb7 in vnc_clipboard_send (vs=0x56211df14200, count=1, 
dwords=0x7ffdf1701338) at ../ui/vnc-clipboard.c:138
5  0x56211a0f0129 in vnc_clipboard_notify (notifier=0x56211df244c8, 
data=0x56211dd1bbf0) at ../ui/vnc-clipboard.c:209
6  0x56211a75dde8 in notifier_list_notify (list=0x56211afa17d0 
, data=0x56211dd1bbf0) at ../util/notify.c:39
7  0x56211a0bf0e6 in qemu_clipboard_update (info=0x56211dd1bbf0) at 
../ui/clipboard.c:50
8  0x56211a0bf05d in qemu_clipboard_peer_release (peer=0x56211df244c0, 
selection=QEMU_CLIPBOARD_SELECTION_CLIPBOARD)
at ../ui/clipboard.c:41
9  0x56211a0bef9b in qemu_clipboard_peer_unregister (peer=0x56211df244c0) 
at ../ui/clipboard.c:19
10 0x56211a0d45f3 in vnc_disconnect_finish (vs=0x56211df14200) at 
../ui/vnc.c:1358
11 0x56211a0d4c9d in vnc_client_read (vs=0x56211df14200) at ../ui/vnc.c:1611
12 0x56211a0d4df8 in vnc_client_io (ioc=0x56211ce70690, condition=G_IO_IN, 
opaque=0x56211df14200) at ../ui/vnc.c:1649
13 0x56211a5b976c in qio_channel_fd_source_dispatch
(source=0x56211ce50a00, callback=0x56211a0d4d71 , 
user_data=0x56211df14200) at ../io/channel-watch.c:84
14 0x7f263ccede8e in g_main_context_dispatch () at 
/lib/x86_64-linux-gnu/libglib-2.0.so.0
15 0x56211a77d4a1 in glib_pollfds_poll () at ../util/main-loop.c:232
16 0x56211a77d51f in os_host_main_loop_wait (timeout=958545) at 
../util/main-loop.c:255
17 0x56211a77d630 in main_loop_wait (nonblocking=0) at 
../util/main-loop.c:531
18 0x56211a45bc8e in qemu_main_loop () at ../softmmu/runstate.c:726
19 0x56211a0b45fa in main (argc=69, argv=0x7ffdf1701778, 
envp=0x7ffdf17019a8) at ../softmmu/main.c:50

From the call trace, we can see it is a deadlock bug.
vnc_disconnect_finish() acquires the output_mutex,
but the output_mutex is then acquired again in vnc_clipboard_send().
Locking it twice from the same thread causes the deadlock. So, I moved
qemu_clipboard_peer_unregister() to after vnc_unlock_output().
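
As a minimal illustration (a standalone sketch with assumed names, not the
actual QEMU code), re-locking a plain, non-recursive mutex from the same
thread hangs exactly like the trace above:

    #include <pthread.h>

    static pthread_mutex_t output_mutex = PTHREAD_MUTEX_INITIALIZER;

    static void clipboard_send(void)
    {
        /* second lock from the same thread: blocks forever (or is
         * undefined), which is what the __lll_lock_wait frame shows */
        pthread_mutex_lock(&output_mutex);
        pthread_mutex_unlock(&output_mutex);
    }

    static void disconnect_finish(void)
    {
        pthread_mutex_lock(&output_mutex);   /* first lock */
        clipboard_send();                    /* never returns */
        pthread_mutex_unlock(&output_mutex);
    }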

Signed-off-by: Lei Rao 
---
 ui/vnc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ui/vnc.c b/ui/vnc.c
index 1ed1c7efc6..3ccd33dedc 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -1354,12 +1354,12 @@ void vnc_disconnect_finish(VncState *vs)
 /* last client gone */
 vnc_update_server_surface(vs->vd);
 }
+vnc_unlock_output(vs);
+
 if (vs->cbpeer.notifier.notify) {
 qemu_clipboard_peer_unregister(&vs->cbpeer);
 }
 
-vnc_unlock_output(vs);
-
 qemu_mutex_destroy(&vs->output_mutex);
 if (vs->bh != NULL) {
 qemu_bh_delete(vs->bh);
-- 
2.32.0




RE: [PATCH v3 07/10] Reset the auto-converge counter at every checkpoint.

2021-03-24 Thread Rao, Lei
After testing, I think you are right.
Will remove the cpu_throttle_stop() in V4.

Thanks,
Lei.

-Original Message-
From: Dr. David Alan Gilbert  
Sent: Thursday, March 25, 2021 12:40 AM
To: Rao, Lei 
Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; quint...@redhat.com; pbonz...@redhat.com; 
lukasstra...@web.de; qemu-devel@nongnu.org
Subject: Re: [PATCH v3 07/10] Reset the auto-converge counter at every 
checkpoint.

* leirao (lei@intel.com) wrote:
> From: "Rao, Lei" 
> 
> if we don't reset the auto-converge counter, it will continue to run 
> with COLO running, and eventually the system will hang due to the CPU 
> throttle reaching DEFAULT_MIGRATE_MAX_CPU_THROTTLE.
> 
> Signed-off-by: Lei Rao 
> ---
>  migration/colo.c |  4 
>  migration/ram.c  | 10 ++
>  migration/ram.h  |  1 +
>  3 files changed, 15 insertions(+)
> 
> diff --git a/migration/colo.c b/migration/colo.c index 
> 1aaf316..723ffb8 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -459,6 +459,10 @@ static int colo_do_checkpoint_transaction(MigrationState 
> *s,
>  if (ret < 0) {
>  goto out;
>  }
> +
> +if (migrate_auto_converge()) {
> +mig_throttle_counter_reset();
> +}
>  /*
>   * Only save VM's live state, which not including device state.
>   * TODO: We may need a timeout mechanism to prevent COLO process 
> diff --git a/migration/ram.c b/migration/ram.c index 72143da..e795a8d 
> 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -652,6 +652,16 @@ static void mig_throttle_guest_down(uint64_t 
> bytes_dirty_period,
>  }
>  }
>  
> +void mig_throttle_counter_reset(void) {
> +RAMState *rs = ram_state;
> +
> +rs->time_last_bitmap_sync = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> +rs->num_dirty_pages_period = 0;
> +rs->bytes_xfer_prev = ram_counters.transferred;
> +cpu_throttle_stop();

I think this is right, so:

Reviewed-by: Dr. David Alan Gilbert 

However, do you really need the cpu_throttle_stop()?  Shouldn't the previous
iteration have called that in migration_iteration_finish()?

Dave

> +}
> +
>  /**
>   * xbzrle_cache_zero_page: insert a zero page in the XBZRLE cache
>   *
> diff --git a/migration/ram.h b/migration/ram.h index 6378bb3..3f78175 
> 100644
> --- a/migration/ram.h
> +++ b/migration/ram.h
> @@ -50,6 +50,7 @@ bool ramblock_is_ignored(RAMBlock *block);  int 
> xbzrle_cache_resize(uint64_t new_size, Error **errp);  uint64_t 
> ram_bytes_remaining(void);  uint64_t ram_bytes_total(void);
> +void mig_throttle_counter_reset(void);
>  
>  uint64_t ram_pagesize_summary(void);
>  int ram_save_queue_pages(const char *rbname, ram_addr_t start, 
> ram_addr_t len);
> --
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




RE: [PATCH v4 09/10] Add the function of colo_bitmap_clear_diry.

2021-03-25 Thread Rao, Lei


-Original Message-
From: Dr. David Alan Gilbert  
Sent: Friday, March 26, 2021 2:08 AM
To: Rao, Lei 
Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; quint...@redhat.com; pbonz...@redhat.com; 
lukasstra...@web.de; qemu-devel@nongnu.org
Subject: Re: [PATCH v4 09/10] Add the function of colo_bitmap_clear_diry.

* leirao (lei@intel.com) wrote:
> From: "Rao, Lei" 
> 
> When we use continuous dirty memory copy for flushing ram cache on 
> secondary VM, we can also clean up the bitmap of contiguous dirty page 
> memory. This also can reduce the VM stop time during checkpoint.
> 
> Signed-off-by: Lei Rao 
> ---
>  migration/ram.c | 29 +
>  1 file changed, 25 insertions(+), 4 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c index a258466..ae1e659 
> 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -855,6 +855,30 @@ unsigned long colo_bitmap_find_dirty(RAMState *rs, 
> RAMBlock *rb,
>  return first;
>  }
>  
> +/**
> + * colo_bitmap_clear_dirty:when we flush ram cache to ram, we will 
> +use
> + * continuous memory copy, so we can also clean up the bitmap of 
> +contiguous
> + * dirty memory.
> + */
> +static inline bool colo_bitmap_clear_dirty(RAMState *rs,
> +   RAMBlock *rb,
> +   unsigned long start,
> +   unsigned long num) {
> +bool ret;
> +unsigned long i = 0;
> +
> +qemu_mutex_lock(&rs->bitmap_mutex);

Please use QEMU_LOCK_GUARD(&rs->bitmap_mutex);

Will be changed in V5. Thanks.

> +for (i = 0; i < num; i++) {
> +ret = test_and_clear_bit(start + i, rb->bmap);
> +if (ret) {
> +rs->migration_dirty_pages--;
> +}
> +}
> +qemu_mutex_unlock(&rs->bitmap_mutex);
> +return ret;

This implementation is missing the clear_bmap code that 
migration_bitmap_clear_dirty has.
I think that's necessary now.

Are we sure there's any benefit in this?

Dave

There is such a note about clear_bmap in struct RAMBlock:
"On destination side, this should always be NULL, and the variable
`clear_bmap_shift' is meaningless."
This means that clear_bmap is always NULL on the secondary VM, and flushing
the ram cache to ram only ever happens on the secondary VM.
So, I think the clear_bmap code is unnecessary for COLO.
As for the benefits: when the number of dirty pages flushed from the ram cache
to ram is large, this reduces the number of lock acquisitions.
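
A rough sketch of the difference (simplified forms, not the exact QEMU
functions):

    /* per-page: the bitmap mutex is taken and dropped once per dirty page */
    for (i = 0; i < num; i++) {
        qemu_mutex_lock(&rs->bitmap_mutex);
        if (test_and_clear_bit(start + i, rb->bmap)) {
            rs->migration_dirty_pages--;
        }
        qemu_mutex_unlock(&rs->bitmap_mutex);
    }

    /* batched: one lock/unlock covers the whole contiguous run */
    qemu_mutex_lock(&rs->bitmap_mutex);
    for (i = 0; i < num; i++) {
        if (test_and_clear_bit(start + i, rb->bmap)) {
            rs->migration_dirty_pages--;
        }
    }
    qemu_mutex_unlock(&rs->bitmap_mutex);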

Lei

> +}
> +
>  static inline bool migration_bitmap_clear_dirty(RAMState *rs,
>  RAMBlock *rb,
>  unsigned long page) 
> @@ -3700,7 +3724,6 @@ void colo_flush_ram_cache(void)
>  void *src_host;
>  unsigned long offset = 0;
>  unsigned long num = 0;
> -unsigned long i = 0;
>  
>  memory_global_dirty_log_sync();
>  WITH_RCU_READ_LOCK_GUARD() {
> @@ -3722,9 +3745,7 @@ void colo_flush_ram_cache(void)
>  num = 0;
>  block = QLIST_NEXT_RCU(block, next);
>  } else {
> -for (i = 0; i < num; i++) {
> -migration_bitmap_clear_dirty(ram_state, block, offset + 
> i);
> -}
> +colo_bitmap_clear_dirty(ram_state, block, offset, 
> + num);
>  dst_host = block->host
>   + (((ram_addr_t)offset) << TARGET_PAGE_BITS);
>  src_host = block->colo_cache
> --
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




RE: [PATCH v4 09/10] Add the function of colo_bitmap_clear_diry.

2021-04-08 Thread Rao, Lei
The performance data has been added to the commit message in V6.

Thanks,
Lei.

-Original Message-
From: Dr. David Alan Gilbert  
Sent: Monday, March 29, 2021 7:32 PM
To: Rao, Lei 
Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; quint...@redhat.com; pbonz...@redhat.com; 
lukasstra...@web.de; qemu-devel@nongnu.org
Subject: Re: [PATCH v4 09/10] Add the function of colo_bitmap_clear_diry.

* Rao, Lei (lei@intel.com) wrote:
> 
> -Original Message-
> From: Dr. David Alan Gilbert 
> Sent: Friday, March 26, 2021 2:08 AM
> To: Rao, Lei 
> Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
> jasow...@redhat.com; quint...@redhat.com; pbonz...@redhat.com; 
> lukasstra...@web.de; qemu-devel@nongnu.org
> Subject: Re: [PATCH v4 09/10] Add the function of colo_bitmap_clear_diry.
> 
> * leirao (lei....@intel.com) wrote:
> > From: "Rao, Lei" 
> > 
> > When we use continuous dirty memory copy for flushing ram cache on 
> > secondary VM, we can also clean up the bitmap of contiguous dirty 
> > page memory. This also can reduce the VM stop time during checkpoint.
> > 
> > Signed-off-by: Lei Rao 
> > ---
> >  migration/ram.c | 29 +
> >  1 file changed, 25 insertions(+), 4 deletions(-)
> > 
> > diff --git a/migration/ram.c b/migration/ram.c index 
> > a258466..ae1e659
> > 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -855,6 +855,30 @@ unsigned long colo_bitmap_find_dirty(RAMState *rs, 
> > RAMBlock *rb,
> >  return first;
> >  }
> >  
> > +/**
> > + * colo_bitmap_clear_dirty:when we flush ram cache to ram, we will 
> > +use
> > + * continuous memory copy, so we can also clean up the bitmap of 
> > +contiguous
> > + * dirty memory.
> > + */
> > +static inline bool colo_bitmap_clear_dirty(RAMState *rs,
> > +   RAMBlock *rb,
> > +   unsigned long start,
> > +   unsigned long num) {
> > +bool ret;
> > +unsigned long i = 0;
> > +
> > +qemu_mutex_lock(&rs->bitmap_mutex);
> 
> Please use QEMU_LOCK_GUARD(&rs->bitmap_mutex);
> 
> Will be changed in V5. Thanks.
> 
> > +for (i = 0; i < num; i++) {
> > +ret = test_and_clear_bit(start + i, rb->bmap);
> > +if (ret) {
> > +rs->migration_dirty_pages--;
> > +}
> > +}
> > +qemu_mutex_unlock(&rs->bitmap_mutex);
> > +return ret;
> 
> This implementation is missing the clear_bmap code that 
> migration_bitmap_clear_dirty has.
> I think that's necessary now.
> 
> Are we sure there's any benefit in this?
> 
> Dave
> 
> There is such a note about clear_bmap in struct RAMBlock:
> "On destination side, this should always be NULL, and the variable 
> `clear_bmap_shift' is meaningless."
> This means that clear_bmap is always NULL on secondary VM. And for the 
> behavior of flush ram cache to ram, we will always only happen on secondary 
> VM.
> So, I think the clear_bmap code is unnecessary for COLO.

Ah yes; can you add a comment there to note this is on the secondary to make 
that clear.

> As for the benefits, When the number of dirty pages from flush ram cache to 
> ram is too much. it will reduce the number of locks acquired.

It might be good to measure the benefit.

Dave

> Lei
> 
> > +}
> > +
> >  static inline bool migration_bitmap_clear_dirty(RAMState *rs,
> >  RAMBlock *rb,
> >  unsigned long page) 
> > @@ -3700,7 +3724,6 @@ void colo_flush_ram_cache(void)
> >  void *src_host;
> >  unsigned long offset = 0;
> >  unsigned long num = 0;
> > -unsigned long i = 0;
> >  
> >  memory_global_dirty_log_sync();
> >  WITH_RCU_READ_LOCK_GUARD() {
> > @@ -3722,9 +3745,7 @@ void colo_flush_ram_cache(void)
> >  num = 0;
> >  block = QLIST_NEXT_RCU(block, next);
> >  } else {
> > -for (i = 0; i < num; i++) {
> > -migration_bitmap_clear_dirty(ram_state, block, offset 
> > + i);
> > -}
> > +colo_bitmap_clear_dirty(ram_state, block, offset, 
> > + num);
> >  dst_host = block->host
> >   + (((ram_addr_t)offset) << TARGET_PAGE_BITS);
> >  src_host = block->colo_cache
> > --
> > 1.8.3.1
> > 
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




RE: [PATCH v2 05/10] Optimize the function of packet_new

2021-03-12 Thread Rao, Lei
How about defining a new function named packet_new_nocopy?
In its comments, we can tell the caller not to release the buffer, because
packet_destroy() will release it.
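
Something like the following sketch (a hypothetical signature, mirroring the
existing packet_new() in net/colo.c):

    /*
     * packet_new_nocopy: create a Packet that takes ownership of @data.
     * The caller must not free @data; packet_destroy() will free it
     * together with the Packet.
     */
    Packet *packet_new_nocopy(void *data, int size, int vnet_hdr_len)
    {
        Packet *pkt = g_slice_new(Packet);

        pkt->data = data;   /* adopt the caller's buffer, no g_memdup() */
        pkt->size = size;
        pkt->vnet_hdr_len = vnet_hdr_len;
        /* ... set up the remaining fields exactly as packet_new() does ... */

        return pkt;
    }

Then colo-compare can keep a g_memdup() at its call sites while
filter-rewriter passes its buffer in directly, which is where the extra copy
is saved.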

Thanks,
Lei.
-Original Message-
From: lizhij...@fujitsu.com  
Sent: Friday, March 12, 2021 2:53 PM
To: Rao, Lei ; Zhang, Chen ; 
jasow...@redhat.com; quint...@redhat.com; dgilb...@redhat.com; 
pbonz...@redhat.com; lukasstra...@web.de
Cc: qemu-devel@nongnu.org
Subject: Re: [PATCH v2 05/10] Optimize the function of packet_new



On 3/12/21 1:02 PM, leirao wrote:
> From: "Rao, Lei" 
>
> if we put the data copy outside the packet_new(), then for the 
> filter-rewrite module, there will be one less memory copy in the 
> processing of each network packet.
>
> Signed-off-by: Lei Rao 
> ---
>   net/colo-compare.c| 7 +--
>   net/colo.c| 4 ++--
>   net/colo.h| 2 +-
>   net/filter-rewriter.c | 1 -
>   4 files changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/net/colo-compare.c b/net/colo-compare.c index 
> 9e18baa..8bdf5a8 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -247,14 +247,17 @@ static int packet_enqueue(CompareState *s, int mode, 
> Connection **con)
>   ConnectionKey key;
>   Packet *pkt = NULL;
>   Connection *conn;
> +char *data = NULL;
>   int ret;
>   
>   if (mode == PRIMARY_IN) {
> -pkt = packet_new(s->pri_rs.buf,
> +data = g_memdup(s->pri_rs.buf, s->pri_rs.packet_len);
> +pkt = packet_new(data,
>s->pri_rs.packet_len,
>s->pri_rs.vnet_hdr_len);
>   } else {
> -pkt = packet_new(s->sec_rs.buf,
> +data = g_memdup(s->sec_rs.buf, s->sec_rs.packet_len);
> +pkt = packet_new(data,
>s->sec_rs.packet_len,
>s->sec_rs.vnet_hdr_len);
>   }
> diff --git a/net/colo.c b/net/colo.c
> index ef00609..08fb37e 100644
> --- a/net/colo.c
> +++ b/net/colo.c
> @@ -155,11 +155,11 @@ void connection_destroy(void *opaque)
>   g_slice_free(Connection, conn);
>   }
>   
> -Packet *packet_new(const void *data, int size, int vnet_hdr_len)
> +Packet *packet_new(void *data, int size, int vnet_hdr_len)
>   {
>   Packet *pkt = g_slice_new(Packet);
>   
> -pkt->data = g_memdup(data, size);
> +pkt->data = data;

If so, should packet_destroy() free() data that may not have been allocated by itself?

Thanks
Zhijian


RE: [PATCH v2 08/10] Reduce the PVM stop time during Checkpoint

2021-03-12 Thread Rao, Lei
I think it's enough to know the start and the number of dirty pages in the
bitmap, because the size of the consecutive dirty pages can be calculated from
them. By the way, no problems were found in our automated testing.

Thanks,
Lei.

-Original Message-
From: lizhij...@fujitsu.com  
Sent: Friday, March 12, 2021 3:54 PM
To: Rao, Lei ; Zhang, Chen ; 
jasow...@redhat.com; quint...@redhat.com; dgilb...@redhat.com; 
pbonz...@redhat.com; lukasstra...@web.de
Cc: qemu-devel@nongnu.org
Subject: Re: [PATCH v2 08/10] Reduce the PVM stop time during Checkpoint



On 3/12/21 1:03 PM, leirao wrote:
> From: "Rao, Lei" 
>
> When flushing memory from ram cache to ram during every checkpoint on 
> secondary VM, we can copy continuous chunks of memory instead of
> 4096 bytes per time to reduce the time of VM stop during checkpoint.
>
> Signed-off-by: Lei Rao 
> ---
>   migration/ram.c | 44 +---
>   1 file changed, 41 insertions(+), 3 deletions(-)
>
> diff --git a/migration/ram.c b/migration/ram.c index e795a8d..b269637 
> 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -823,6 +823,39 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, 
> RAMBlock *rb,
>   return next;
>   }
>   
> +/*
> + * colo_bitmap_find_diry:find contiguous dirty pages from start
> + *
> + * Returns the page offset within memory region of the start of the 
> +contiguout
> + * dirty page
> + *
> + * @rs: current RAM state
> + * @rb: RAMBlock where to search for dirty pages
> + * @start: page where we start the search
> + * @num: the number of contiguous dirty pages  */ static inline 
> +unsigned long colo_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
> + unsigned long start, unsigned 
> +long *num) {
> +unsigned long size = rb->used_length >> TARGET_PAGE_BITS;
> +unsigned long *bitmap = rb->bmap;
> +unsigned long first, next;
> +
> +if (ramblock_is_ignored(rb)) {
> +return size;
> +}
> +
> +first = find_next_bit(bitmap, size, start);
> +if (first >= size) {
> +return first;
> +}
> +next = find_next_zero_bit(bitmap, size, first + 1);
> +assert(next >= first);
> +*num = next - first;
> +return first;

The idea is outstanding

I wonder whether it should return (next - 1)?

Thanks
Zhijian


> +}
> +
>   static inline bool migration_bitmap_clear_dirty(RAMState *rs,
>   RAMBlock *rb,
>   unsigned long page) 
> @@ -3669,6 +3702,8 @@ void colo_flush_ram_cache(void)
>   void *dst_host;
>   void *src_host;
>   unsigned long offset = 0;
> +unsigned long num = 0;
> +unsigned long i = 0;
>   
>   memory_global_dirty_log_sync();
>   WITH_RCU_READ_LOCK_GUARD() {
> @@ -3682,19 +3717,22 @@ void colo_flush_ram_cache(void)
>   block = QLIST_FIRST_RCU(&ram_list.blocks);
>   
>   while (block) {
> -offset = migration_bitmap_find_dirty(ram_state, block, offset);
> +offset = colo_bitmap_find_dirty(ram_state, block, offset, 
> + &num);
>   
>   if (((ram_addr_t)offset) << TARGET_PAGE_BITS
>   >= block->used_length) {
>   offset = 0;
> +num = 0;
>   block = QLIST_NEXT_RCU(block, next);
>   } else {
> -migration_bitmap_clear_dirty(ram_state, block, offset);
> +for (i = 0; i < num; i++) {
> +migration_bitmap_clear_dirty(ram_state, block, offset + 
> i);
> +}
>   dst_host = block->host
>+ (((ram_addr_t)offset) << TARGET_PAGE_BITS);
>   src_host = block->colo_cache
>+ (((ram_addr_t)offset) << TARGET_PAGE_BITS);
> -memcpy(dst_host, src_host, TARGET_PAGE_SIZE);
> +memcpy(dst_host, src_host, TARGET_PAGE_SIZE * num);
>   }
>   }
>   }



RE: [PATCH v2 05/10] Optimize the function of packet_new

2021-03-12 Thread Rao, Lei
Oh, I understand what you mean, and will change it in V3.

Thanks,
Lei.

-Original Message-
From: lizhij...@fujitsu.com  
Sent: Friday, March 12, 2021 6:23 PM
To: Rao, Lei ; Zhang, Chen ; 
jasow...@redhat.com; quint...@redhat.com; dgilb...@redhat.com; 
pbonz...@redhat.com; lukasstra...@web.de
Cc: qemu-devel@nongnu.org
Subject: Re: [PATCH v2 05/10] Optimize the function of packet_new



> +offset = colo_bitmap_find_dirty(ram_state, block, offset, 
> + &num);
IIUC, this return value will be passed to the next round as the start index,
so you should skip the pages that have already been checked.


Thanks


On 3/12/21 5:56 PM, Rao, Lei wrote:
> How about redefine a function named packet_new_nocopy?
> In comments, we can tell the caller don't release the buffer and the 
> packet_destroy will release it.
>
> Thanks,
> Lei.
> -Original Message-
> From:lizhij...@fujitsu.com  
> Sent: Friday, March 12, 2021 2:53 PM
> To: Rao, Lei; Zhang, 
> Chen;jasow...@redhat.com;quint...@redhat.com;dgi
> lb...@redhat.com;pbonz...@redhat.com;lukasstra...@web.de
> Cc:qemu-devel@nongnu.org
> Subject: Re: [PATCH v2 05/10] Optimize the function of packet_new
>
>


RE: [PATCH v6 09/10] Add the function of colo_bitmap_clear_dirty

2021-04-16 Thread Rao, Lei
Hi, Dave

The performance data has been added to the commit messages.
Do you have any other suggestions?

Thanks
Lei.

-Original Message-
From: Rao, Lei  
Sent: Friday, April 9, 2021 11:21 AM
To: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; quint...@redhat.com; dgilb...@redhat.com; 
pbonz...@redhat.com; lukasstra...@web.de
Cc: qemu-devel@nongnu.org; Rao, Lei 
Subject: [PATCH v6 09/10] Add the function of colo_bitmap_clear_dirty

From: "Rao, Lei" 

When we use continuous dirty memory copy for flushing ram cache on secondary 
VM, we can also clean up the bitmap of contiguous dirty page memory. This also 
can reduce the VM stop time during checkpoint.

The performance test for COLO as follow:

Server configuration:
CPU :Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
MEM :251G(type:DDR4 Speed:2666 MT/s)
SSD :Intel 730 and DC S35x0/3610/3700 Series SSDs

dirty pages:3189376  migration_bitmap_clear_dirty time consuming(ns):105194000 
dirty pages:3189784  migration_bitmap_clear_dirty time consuming(ns):105297000 
dirty pages:3190501  migration_bitmap_clear_dirty time consuming(ns):10541 
dirty pages:3188734  migration_bitmap_clear_dirty time consuming(ns):105138000 
dirty pages:3189464  migration_bitmap_clear_dirty time consuming(ns):111736000 
dirty pages:3188558  migration_bitmap_clear_dirty time consuming(ns):105079000 
dirty pages:3239489  migration_bitmap_clear_dirty time consuming(ns):106761000

dirty pages:3190240  colo_bitmap_clear_dirty time consuming(ns):8369000
dirty pages:3189293  colo_bitmap_clear_dirty time consuming(ns):8388000
dirty pages:3189171  colo_bitmap_clear_dirty time consuming(ns):8641000
dirty pages:3189099  colo_bitmap_clear_dirty time consuming(ns):828
dirty pages:3189974  colo_bitmap_clear_dirty time consuming(ns):8352000
dirty pages:3189471  colo_bitmap_clear_dirty time consuming(ns):8348000
dirty pages:3189681  colo_bitmap_clear_dirty time consuming(ns):8426000

It can be seen from the data that colo_bitmap_clear_dirty() is more efficient.

Signed-off-by: Lei Rao 
Reviewed-by: Lukas Straub 
Tested-by: Lukas Straub 
---
 migration/ram.c | 36 +++-
 1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c index 8661d82..11275cd 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -857,6 +857,36 @@ unsigned long colo_bitmap_find_dirty(RAMState *rs, 
RAMBlock *rb,
 return first;
 }
 
+/**
+ * colo_bitmap_clear_dirty:when we flush ram cache to ram, we will use
+ * continuous memory copy, so we can also clean up the bitmap of 
+contiguous
+ * dirty memory.
+ */
+static inline bool colo_bitmap_clear_dirty(RAMState *rs,
+   RAMBlock *rb,
+   unsigned long start,
+   unsigned long num) {
+bool ret;
+unsigned long i = 0;
+
+/*
+ * Since flush ram cache to ram can only happen on Secondary VM.
+ * and the clear bitmap always is NULL on destination side.
+ * Therefore, there is unnecessary to judge whether the
+ * clear_bitmap needs clear.
+ */
+QEMU_LOCK_GUARD(&rs->bitmap_mutex);
+for (i = 0; i < num; i++) {
+ret = test_and_clear_bit(start + i, rb->bmap);
+if (ret) {
+rs->migration_dirty_pages--;
+}
+}
+
+return ret;
+}
+
 static inline bool migration_bitmap_clear_dirty(RAMState *rs,
 RAMBlock *rb,
 unsigned long page) @@ 
-3774,11 +3804,7 @@ void colo_flush_ram_cache(void)
 num = 0;
 block = QLIST_NEXT_RCU(block, next);
 } else {
-unsigned long i = 0;
-
-for (i = 0; i < num; i++) {
-migration_bitmap_clear_dirty(ram_state, block, offset + i);
-}
+colo_bitmap_clear_dirty(ram_state, block, offset, num);
 dst_host = block->host
  + (((ram_addr_t)offset) << TARGET_PAGE_BITS);
 src_host = block->colo_cache
--
1.8.3.1




RE: [PATCH v6 09/10] Add the function of colo_bitmap_clear_dirty

2021-04-26 Thread Rao, Lei
Hi, Dave

I think this set of patches is beneficial to upstream. Please check the
performance data. If you have any other ideas, please let me know.

Thanks
Lei.

-Original Message-
From: Rao, Lei 
Sent: Friday, April 16, 2021 3:57 PM
To: dgilb...@redhat.com
Cc: qemu-devel@nongnu.org; Zhang, Chen ; 
lizhij...@cn.fujitsu.com; jasow...@redhat.com; quint...@redhat.com; 
pbonz...@redhat.com; lukasstra...@web.de
Subject: RE: [PATCH v6 09/10] Add the function of colo_bitmap_clear_dirty

Hi, Dave

The performance data has been added to the commit messages.
Do you have any other suggestions?

Thanks
Lei.

-Original Message-
From: Rao, Lei 
Sent: Friday, April 9, 2021 11:21 AM
To: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; quint...@redhat.com; dgilb...@redhat.com; 
pbonz...@redhat.com; lukasstra...@web.de
Cc: qemu-devel@nongnu.org; Rao, Lei 
Subject: [PATCH v6 09/10] Add the function of colo_bitmap_clear_dirty

From: "Rao, Lei" 

When we use continuous dirty memory copy for flushing ram cache on secondary 
VM, we can also clean up the bitmap of contiguous dirty page memory. This also 
can reduce the VM stop time during checkpoint.

The performance test for COLO as follow:

Server configuration:
CPU :Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
MEM :251G(type:DDR4 Speed:2666 MT/s)
SSD :Intel 730 and DC S35x0/3610/3700 Series SSDs

dirty pages:3189376  migration_bitmap_clear_dirty time consuming(ns):105194000 
dirty pages:3189784  migration_bitmap_clear_dirty time consuming(ns):105297000 
dirty pages:3190501  migration_bitmap_clear_dirty time consuming(ns):10541 
dirty pages:3188734  migration_bitmap_clear_dirty time consuming(ns):105138000 
dirty pages:3189464  migration_bitmap_clear_dirty time consuming(ns):111736000 
dirty pages:3188558  migration_bitmap_clear_dirty time consuming(ns):105079000 
dirty pages:3239489  migration_bitmap_clear_dirty time consuming(ns):106761000

dirty pages:3190240  colo_bitmap_clear_dirty time consuming(ns):8369000
dirty pages:3189293  colo_bitmap_clear_dirty time consuming(ns):8388000
dirty pages:3189171  colo_bitmap_clear_dirty time consuming(ns):8641000
dirty pages:3189099  colo_bitmap_clear_dirty time consuming(ns):828
dirty pages:3189974  colo_bitmap_clear_dirty time consuming(ns):8352000
dirty pages:3189471  colo_bitmap_clear_dirty time consuming(ns):8348000
dirty pages:3189681  colo_bitmap_clear_dirty time consuming(ns):8426000

It can be seen from the data that colo_bitmap_clear_dirty() is more efficient.

Signed-off-by: Lei Rao 
Reviewed-by: Lukas Straub 
Tested-by: Lukas Straub 
---
 migration/ram.c | 36 +++-
 1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c index 8661d82..11275cd 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -857,6 +857,36 @@ unsigned long colo_bitmap_find_dirty(RAMState *rs, 
RAMBlock *rb,
 return first;
 }
 
+/**
+ * colo_bitmap_clear_dirty:when we flush ram cache to ram, we will use
+ * continuous memory copy, so we can also clean up the bitmap of 
+contiguous
+ * dirty memory.
+ */
+static inline bool colo_bitmap_clear_dirty(RAMState *rs,
+   RAMBlock *rb,
+   unsigned long start,
+   unsigned long num) {
+bool ret;
+unsigned long i = 0;
+
+/*
+ * Since flush ram cache to ram can only happen on Secondary VM.
+ * and the clear bitmap always is NULL on destination side.
+ * Therefore, there is unnecessary to judge whether the
+ * clear_bitmap needs clear.
+ */
+QEMU_LOCK_GUARD(&rs->bitmap_mutex);
+for (i = 0; i < num; i++) {
+ret = test_and_clear_bit(start + i, rb->bmap);
+if (ret) {
+rs->migration_dirty_pages--;
+}
+}
+
+return ret;
+}
+
 static inline bool migration_bitmap_clear_dirty(RAMState *rs,
 RAMBlock *rb,
 unsigned long page) @@ 
-3774,11 +3804,7 @@ void colo_flush_ram_cache(void)
 num = 0;
 block = QLIST_NEXT_RCU(block, next);
 } else {
-unsigned long i = 0;
-
-for (i = 0; i < num; i++) {
-migration_bitmap_clear_dirty(ram_state, block, offset + i);
-}
+colo_bitmap_clear_dirty(ram_state, block, offset, num);
 dst_host = block->host
  + (((ram_addr_t)offset) << TARGET_PAGE_BITS);
 src_host = block->colo_cache
--
1.8.3.1




RE: [PATCH 07/10] Disable auto-coverge before entering COLO mode.

2021-02-25 Thread Rao, Lei
Sorry for the late reply due to CNY.
Auto-converge ensures that live migration can be completed smoothly when there
are too many dirty pages.
COLO may encounter the same situation when rebuilding a new secondary VM.
So, I think it is necessary to enable COLO and auto-converge at the same time.

Thanks,
Lei.

-Original Message-
From: Lukas Straub  
Sent: Sunday, February 14, 2021 6:52 PM
To: Rao, Lei 
Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; zhang.zhanghaili...@huawei.com; quint...@redhat.com; 
dgilb...@redhat.com; qemu-devel@nongnu.org
Subject: Re: [PATCH 07/10] Disable auto-coverge before entering COLO mode.

On Wed, 13 Jan 2021 10:46:32 +0800
leirao  wrote:

> From: "Rao, Lei" 
> 
> If we don't disable the feature of auto-converge for live migration 
> before entering COLO mode, it will continue to run with COLO running, 
> and eventually the system will hang due to the CPU throttle reaching 
> DEFAULT_MIGRATE_MAX_CPU_THROTTLE.
> 
> Signed-off-by: Lei Rao 
> ---
>  migration/migration.c | 16 +++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c index 
> 31417ce..6ab37e5 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1673,6 +1673,20 @@ void migrate_set_block_enabled(bool value, Error 
> **errp)
>  qapi_free_MigrationCapabilityStatusList(cap);
>  }
>  
> +static void colo_auto_converge_enabled(bool value, Error **errp) {
> +MigrationCapabilityStatusList *cap = NULL;
> +
> +if (migrate_colo_enabled() && migrate_auto_converge()) {
> +QAPI_LIST_PREPEND(cap,
> +  migrate_cap_add(MIGRATION_CAPABILITY_AUTO_CONVERGE,
> +  value));
> +qmp_migrate_set_capabilities(cap, errp);
> +qapi_free_MigrationCapabilityStatusList(cap);
> +}
> +cpu_throttle_stop();
> +}
> +

I think it's better to error out in migration_prepare or migrate_caps_check if
both COLO and auto-converge are enabled.
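
Roughly something like this (only a sketch of that suggestion, assuming it
goes into the existing capability check in migration/migration.c):

    /* hypothetical check in migrate_caps_check() */
    if (cap_list[MIGRATION_CAPABILITY_X_COLO] &&
        cap_list[MIGRATION_CAPABILITY_AUTO_CONVERGE]) {
        error_setg(errp, "COLO is not compatible with auto-converge");
        return false;
    }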

>  static void migrate_set_block_incremental(MigrationState *s, bool 
> value)  {
>  s->parameters.block_incremental = value; @@ -3401,7 +3415,7 @@ 
> static MigIterateState migration_iteration_run(MigrationState *s)  
> static void migration_iteration_finish(MigrationState *s)  {
>  /* If we enabled cpu throttling for auto-converge, turn it off. */
> -cpu_throttle_stop();
> +colo_auto_converge_enabled(false, &error_abort);
>  
>  qemu_mutex_lock_iothread();
>  switch (s->state) {



-- 




RE: [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint

2021-02-25 Thread Rao, Lei
If the user executes a normal shutdown and QEMU crashes, I think this is
unacceptable.
Since we can avoid this situation, why not do it?

Thanks,
Lei.

-Original Message-
From: Lukas Straub  
Sent: Sunday, February 14, 2021 7:46 PM
To: Rao, Lei 
Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; zhang.zhanghaili...@huawei.com; quint...@redhat.com; 
dgilb...@redhat.com; qemu-devel@nongnu.org
Subject: Re: [PATCH 02/10] Fix the qemu crash when guest shutdown during 
checkpoint

On Fri, 29 Jan 2021 02:57:57 +
"Rao, Lei"  wrote:

> The state will be set RUN_STATE_COLO in colo_do_checkpoint_transaction(). If 
> the guest executes power off or shutdown at this time and the QEMU main 
> thread will call vm_shutdown(), it will set the state to RUN_STATE_SHUTDOWN.
> The state switch from RUN_STATE_COLO to RUN_STATE_SHUTDOWN is not defined in 
> runstate_transitions_def. this will cause QEMU crash. Although this is small 
> probability, it may still happen.

This patch fixes the 'colo' -> 'shutdown' transition. AFAIK then 
colo_do_checkpoint_transaction will call vm_start() again, which does 
'shutdown' -> 'running' and (rightfully) crashes. So I think it is better to 
crash here too.

>  By the way. Do you have any comments about other patches?
> Thanks,
> Lei.
> 
> -Original Message-
> From: Lukas Straub 
> Sent: Thursday, January 28, 2021 2:24 AM
> To: Rao, Lei 
> Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
> jasow...@redhat.com; zhang.zhanghaili...@huawei.com; 
> quint...@redhat.com; dgilb...@redhat.com; qemu-devel@nongnu.org
> Subject: Re: [PATCH 02/10] Fix the qemu crash when guest shutdown 
> during checkpoint
> 
> On Thu, 21 Jan 2021 01:48:31 +
> "Rao, Lei"  wrote:
> 
> > The Primary VM can be shut down when it is in COLO state, which may trigger 
> > this bug.  
> 
> Do you have a backtrace for this bug?
> 
> > About 'shutdown' -> 'colo' -> 'running', I think you are right, I did have 
> > the problems you said. For 'shutdown'->'colo', The fixed 
> > patch(5647051f432b7c9b57525470b0a79a31339062d2) have been merged.
> > Recently, I found another bug as follows in the test.
> > qemu-system-x86_64: invalid runstate transition: 'shutdown' -> 'running'
> > Aborted (core dumped)
> > The gdb bt as following:
> > #0  __GI_raise (sig=sig@entry=6) at 
> > ../sysdeps/unix/sysv/linux/raise.c:50
> > #1  0x7faa3d613859 in __GI_abort () at abort.c:79
> > #2  0x55c5a21268fd in runstate_set (new_state=RUN_STATE_RUNNING) at 
> > vl.c:723
> > #3  0x55c5a1f8cae4 in vm_prepare_start () at 
> > /home/workspace/colo-qemu/cpus.c:2206
> > #4  0x55c5a1f8cb1b in vm_start () at 
> > /home/workspace/colo-qemu/cpus.c:2213
> > #5  0x55c5a2332bba in migration_iteration_finish (s=0x55c5a4658810) 
> > at migration/migration.c:3376
> > #6  0x55c5a2332f3b in migration_thread (opaque=0x55c5a4658810) at 
> > migration/migration.c:3527
> > #7  0x55c5a251d68a in qemu_thread_start (args=0x55c5a5491a70) at 
> > util/qemu-thread-posix.c:519
> > #8  0x7faa3d7e9609 in start_thread (arg=) at 
> > pthread_create.c:477
> > #9  0x7faa3d710293 in clone () at
> > ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> > 
> > For the bug, I made the following changes:
> > @@ -3379,7 +3379,9 @@ static void 
> > migration_iteration_finish(MigrationState *s)
> >  case MIGRATION_STATUS_CANCELLED:
> >  case MIGRATION_STATUS_CANCELLING:
> >  if (s->vm_was_running) {
> > -vm_start();
> > +if (!runstate_check(RUN_STATE_SHUTDOWN)) {
> > +vm_start();
> > +}
> >  } else {
> >  if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
> >  runstate_set(RUN_STATE_POSTMIGRATE);
> >  
> > I will send the patch to community after more test.
> > 
> > Thanks,
> > Lei.
> > 
> > -Original Message-
> > From: Lukas Straub 
> > Sent: Thursday, January 21, 2021 3:13 AM
> > To: Rao, Lei 
> > Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
> > jasow...@redhat.com; zhang.zhanghaili...@huawei.com; 
> > quint...@redhat.com; dgilb...@redhat.com; qemu-devel@nongnu.org
> > Subject: Re: [PATCH 02/10] Fix the qemu crash when guest shutdown 
> > during checkpoint
> > 
> > On Wed, 13 Jan 2021 10:46:27 +0800
> > leirao  wrote:
> >   
> > > From: "Rao, Lei"

RE: [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint

2021-01-28 Thread Rao, Lei
The state will be set to RUN_STATE_COLO in colo_do_checkpoint_transaction(). If
the guest executes a power off or shutdown at this time, the QEMU main thread
will call vm_shutdown(), which sets the state to RUN_STATE_SHUTDOWN.
The state switch from RUN_STATE_COLO to RUN_STATE_SHUTDOWN is not defined in
runstate_transitions_def, so this will cause a QEMU crash. Although the
probability is small, it may still happen. By the way, do you have any comments
about the other patches?

Thanks,
Lei.

-Original Message-
From: Lukas Straub  
Sent: Thursday, January 28, 2021 2:24 AM
To: Rao, Lei 
Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; zhang.zhanghaili...@huawei.com; quint...@redhat.com; 
dgilb...@redhat.com; qemu-devel@nongnu.org
Subject: Re: [PATCH 02/10] Fix the qemu crash when guest shutdown during 
checkpoint

On Thu, 21 Jan 2021 01:48:31 +
"Rao, Lei"  wrote:

> The Primary VM can be shut down when it is in COLO state, which may trigger 
> this bug.

Do you have a backtrace for this bug?

> About 'shutdown' -> 'colo' -> 'running', I think you are right, I did have 
> the problems you said. For 'shutdown'->'colo', The fixed 
> patch(5647051f432b7c9b57525470b0a79a31339062d2) have been merged.
> Recently, I found another bug as follows in the test.
>   qemu-system-x86_64: invalid runstate transition: 'shutdown' -> 'running'
>   Aborted (core dumped)
> The gdb bt as following:
> #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x7faa3d613859 in __GI_abort () at abort.c:79
> #2  0x55c5a21268fd in runstate_set (new_state=RUN_STATE_RUNNING) at 
> vl.c:723
> #3  0x55c5a1f8cae4 in vm_prepare_start () at 
> /home/workspace/colo-qemu/cpus.c:2206
> #4  0x55c5a1f8cb1b in vm_start () at 
> /home/workspace/colo-qemu/cpus.c:2213
> #5  0x55c5a2332bba in migration_iteration_finish (s=0x55c5a4658810) 
> at migration/migration.c:3376
> #6  0x55c5a2332f3b in migration_thread (opaque=0x55c5a4658810) at 
> migration/migration.c:3527
> #7  0x55c5a251d68a in qemu_thread_start (args=0x55c5a5491a70) at 
> util/qemu-thread-posix.c:519
> #8  0x7faa3d7e9609 in start_thread (arg=) at 
> pthread_create.c:477
> #9  0x7faa3d710293 in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> 
> For the bug, I made the following changes:
>   @@ -3379,7 +3379,9 @@ static void 
> migration_iteration_finish(MigrationState *s)
>  case MIGRATION_STATUS_CANCELLED:
>  case MIGRATION_STATUS_CANCELLING:
>  if (s->vm_was_running) {
> -vm_start();
> +if (!runstate_check(RUN_STATE_SHUTDOWN)) {
> +vm_start();
> +}
>  } else {
>  if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
>  runstate_set(RUN_STATE_POSTMIGRATE);
>
> I will send the patch to community after more test.
> 
> Thanks,
> Lei.
> 
> -Original Message-
> From: Lukas Straub 
> Sent: Thursday, January 21, 2021 3:13 AM
> To: Rao, Lei 
> Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
> jasow...@redhat.com; zhang.zhanghaili...@huawei.com; 
> quint...@redhat.com; dgilb...@redhat.com; qemu-devel@nongnu.org
> Subject: Re: [PATCH 02/10] Fix the qemu crash when guest shutdown 
> during checkpoint
> 
> On Wed, 13 Jan 2021 10:46:27 +0800
> leirao  wrote:
> 
> > From: "Rao, Lei" 
> > 
> > This patch fixes the following:
> > qemu-system-x86_64: invalid runstate transition: 'colo' ->'shutdown'
> > Aborted (core dumped)
> > 
> > Signed-off-by: Lei Rao 
> 
> I wonder how that is possible, since the VM is stopped during 'colo' state.
> 
> Unrelated to this patch, I think this area needs some work since the 
> following unintended runstate transition is possible:
> 'shutdown' -> 'colo' -> 'running'.
> 
> > ---
> >  softmmu/runstate.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/softmmu/runstate.c b/softmmu/runstate.c index 
> > 636aab0..455ad0d 100644
> > --- a/softmmu/runstate.c
> > +++ b/softmmu/runstate.c
> > @@ -125,6 +125,7 @@ static const RunStateTransition 
> > runstate_transitions_def[] = {
> >  { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
> >  
> >  { RUN_STATE_COLO, RUN_STATE_RUNNING },
> > +{ RUN_STATE_COLO, RUN_STATE_SHUTDOWN},
> >  
> >  { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
> >  { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },
> 
> 
> 



-- 




RE: [PATCH 07/10] Disable auto-coverge before entering COLO mode.

2021-01-13 Thread Rao, Lei
I think there is a difference between doing checkpoints in COLO and live
migration.
The feature of auto-converge is to ensure that live migration succeeds even
when dirty pages are generated faster than they can be transferred.
But for COLO, we force the VM to stop whenever a checkpoint is being taken.
This ensures the checkpoint succeeds and has nothing to do with auto-converge.

Thanks,
Lei.

-Original Message-
From: Dr. David Alan Gilbert  
Sent: Wednesday, January 13, 2021 7:32 PM
To: Rao, Lei 
Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; zhang.zhanghaili...@huawei.com; quint...@redhat.com; 
qemu-devel@nongnu.org
Subject: Re: [PATCH 07/10] Disable auto-coverge before entering COLO mode.

* leirao (lei@intel.com) wrote:
> From: "Rao, Lei" 
> 
> If we don't disable the feature of auto-converge for live migration 
> before entering COLO mode, it will continue to run with COLO running, 
> and eventually the system will hang due to the CPU throttle reaching 
> DEFAULT_MIGRATE_MAX_CPU_THROTTLE.
> 
> Signed-off-by: Lei Rao 

I don't think that's the right answer, because it would seem reasonable to use 
auto-converge to ensure that a COLO snapshot succeeded by limiting guest CPU 
time.  Is the right fix here to reset the state of the auto-converge counters 
at the start of each colo snapshot?

Dave

> ---
>  migration/migration.c | 16 +++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c index 
> 31417ce..6ab37e5 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1673,6 +1673,20 @@ void migrate_set_block_enabled(bool value, Error 
> **errp)
>  qapi_free_MigrationCapabilityStatusList(cap);
>  }
>  
> +static void colo_auto_converge_enabled(bool value, Error **errp) {
> +MigrationCapabilityStatusList *cap = NULL;
> +
> +if (migrate_colo_enabled() && migrate_auto_converge()) {
> +QAPI_LIST_PREPEND(cap,
> +  migrate_cap_add(MIGRATION_CAPABILITY_AUTO_CONVERGE,
> +  value));
> +qmp_migrate_set_capabilities(cap, errp);
> +qapi_free_MigrationCapabilityStatusList(cap);
> +}
> +cpu_throttle_stop();
> +}
> +
>  static void migrate_set_block_incremental(MigrationState *s, bool 
> value)  {
>  s->parameters.block_incremental = value; @@ -3401,7 +3415,7 @@ 
> static MigIterateState migration_iteration_run(MigrationState *s)  
> static void migration_iteration_finish(MigrationState *s)  {
>  /* If we enabled cpu throttling for auto-converge, turn it off. */
> -cpu_throttle_stop();
> +colo_auto_converge_enabled(false, &error_abort);
>  
>  qemu_mutex_lock_iothread();
>  switch (s->state) {
> --
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




RE: [PATCH 03/10] Optimize the function of filter_send

2021-01-20 Thread Rao, Lei
OK, you are right,  I will change it in V2.

Thanks,
Lei.

-Original Message-
From: Lukas Straub  
Sent: Thursday, January 21, 2021 3:21 AM
To: Rao, Lei 
Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; zhang.zhanghaili...@huawei.com; quint...@redhat.com; 
dgilb...@redhat.com; qemu-devel@nongnu.org
Subject: Re: [PATCH 03/10] Optimize the function of filter_send

On Wed, 13 Jan 2021 10:46:28 +0800
leirao  wrote:

> From: "Rao, Lei" 
> 
> The iov_size has been calculated in filter_send(). we can directly 
> return the size.In this way, this is no need to repeat calculations in 
> filter_redirector_receive_iov();
> 
> Signed-off-by: Lei Rao 
> ---
>  net/filter-mirror.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/net/filter-mirror.c b/net/filter-mirror.c index 
> f8e6500..7fa2eb3 100644
> --- a/net/filter-mirror.c
> +++ b/net/filter-mirror.c
> @@ -88,7 +88,7 @@ static int filter_send(MirrorState *s,
>  goto err;
>  }
>  
> -return 0;
> +return size;
>  
>  err:
>  return ret < 0 ? ret : -EIO;
> @@ -159,7 +159,7 @@ static ssize_t filter_mirror_receive_iov(NetFilterState 
> *nf,
>  int ret;
>  
>  ret = filter_send(s, iov, iovcnt);
> -if (ret) {
> +if (ret <= 0) {
>  error_report("filter mirror send failed(%s)", strerror(-ret));
>  }

0 is a valid return value if the data to send has size = 0.

> @@ -182,10 +182,10 @@ static ssize_t 
> filter_redirector_receive_iov(NetFilterState *nf,
>  
>  if (qemu_chr_fe_backend_connected(&s->chr_out)) {
>  ret = filter_send(s, iov, iovcnt);
> -if (ret) {
> +if (ret <= 0) {
>  error_report("filter redirector send failed(%s)", 
> strerror(-ret));
>  }

ditto

> -return iov_size(iov, iovcnt);
> +return ret;
>  } else {
>  return 0;
>  }



-- 




RE: [PATCH 02/10] Fix the qemu crash when guest shutdown during checkpoint

2021-01-20 Thread Rao, Lei
The Primary VM can be shut down when it is in COLO state, which may trigger 
this bug.
About 'shutdown' -> 'colo' -> 'running', I think you are right, I did have the
problems you described. For 'shutdown'->'colo', the fix
(commit 5647051f432b7c9b57525470b0a79a31339062d2) has been merged.
Recently, I found another bug in testing, as follows.
qemu-system-x86_64: invalid runstate transition: 'shutdown' -> 'running'
Aborted (core dumped)
The gdb bt as following:
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x7faa3d613859 in __GI_abort () at abort.c:79
#2  0x55c5a21268fd in runstate_set (new_state=RUN_STATE_RUNNING) at 
vl.c:723
#3  0x55c5a1f8cae4 in vm_prepare_start () at 
/home/workspace/colo-qemu/cpus.c:2206
#4  0x55c5a1f8cb1b in vm_start () at 
/home/workspace/colo-qemu/cpus.c:2213
#5  0x55c5a2332bba in migration_iteration_finish (s=0x55c5a4658810) at 
migration/migration.c:3376
#6  0x55c5a2332f3b in migration_thread (opaque=0x55c5a4658810) at 
migration/migration.c:3527
#7  0x55c5a251d68a in qemu_thread_start (args=0x55c5a5491a70) at 
util/qemu-thread-posix.c:519
#8  0x7faa3d7e9609 in start_thread (arg=) at 
pthread_create.c:477
#9  0x7faa3d710293 in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

For the bug, I made the following changes:
@@ -3379,7 +3379,9 @@ static void 
migration_iteration_finish(MigrationState *s)
 case MIGRATION_STATUS_CANCELLED:
 case MIGRATION_STATUS_CANCELLING:
 if (s->vm_was_running) {
-vm_start();
+if (!runstate_check(RUN_STATE_SHUTDOWN)) {
+vm_start();
+}
 } else {
 if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
 runstate_set(RUN_STATE_POSTMIGRATE);
 
I will send the patch to the community after more testing.

Thanks,
Lei.

-Original Message-
From: Lukas Straub  
Sent: Thursday, January 21, 2021 3:13 AM
To: Rao, Lei 
Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; zhang.zhanghaili...@huawei.com; quint...@redhat.com; 
dgilb...@redhat.com; qemu-devel@nongnu.org
Subject: Re: [PATCH 02/10] Fix the qemu crash when guest shutdown during 
checkpoint

On Wed, 13 Jan 2021 10:46:27 +0800
leirao  wrote:

> From: "Rao, Lei" 
> 
> This patch fixes the following:
> qemu-system-x86_64: invalid runstate transition: 'colo' ->'shutdown'
> Aborted (core dumped)
> 
> Signed-off-by: Lei Rao 

I wonder how that is possible, since the VM is stopped during 'colo' state.

Unrelated to this patch, I think this area needs some work since the following 
unintended runstate transition is possible:
'shutdown' -> 'colo' -> 'running'.

> ---
>  softmmu/runstate.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/softmmu/runstate.c b/softmmu/runstate.c index 
> 636aab0..455ad0d 100644
> --- a/softmmu/runstate.c
> +++ b/softmmu/runstate.c
> @@ -125,6 +125,7 @@ static const RunStateTransition 
> runstate_transitions_def[] = {
>  { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
>  
>  { RUN_STATE_COLO, RUN_STATE_RUNNING },
> +{ RUN_STATE_COLO, RUN_STATE_SHUTDOWN},
>  
>  { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
>  { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },



-- 




RE: [PATCH 4/7] colo: fixed 'Segmentation fault' when the simplex mode PVM poweroff

2021-07-04 Thread Rao, Lei
It's been a long time since this bug, I will reproduce it to get the GDB stack, 
but it may take some time.

Thanks,
Lei.

-Original Message-
From: Lukas Straub  
Sent: Sunday, July 4, 2021 4:36 AM
To: Rao, Lei 
Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; zhang.zhanghaili...@huawei.com; quint...@redhat.com; 
dgilb...@redhat.com; like.xu.li...@gmail.com; qemu-devel@nongnu.org; Like Xu 

Subject: Re: [PATCH 4/7] colo: fixed 'Segmentation fault' when the simplex mode 
PVM poweroff

On Thu, 17 Jun 2021 10:47:12 +0800
Lei Rao  wrote:

> From: "Rao, Lei" 
> 
> When a PVM completed its SVM failover steps and begins to run in the 
> simplex mode, QEMU would encounter a 'Segmentation fault' if the guest 
> poweroff with the following calltrace:
> 
> Program received signal SIGSEGV, Segmentation fault.
> 
> This is because primary_vm_do_failover() would call 
> "qemu_file_shutdown (s->rp_state.from_dst_file);" and later the 
> migration_shutdown() would do it again. So, we should set the 
> s->rp_state.from_dst_file to NULL.

Hello,
Please provide a backtrace for such bugfixes. It helps the reviewers better
understand the bug and the fix, and it helps you check whether it is actually
correct. I suggest enabling coredumps in your test (or even production)
systems, so that for every crash you definitely have a backtrace.

> Signed-off-by: Like Xu 
> Signed-off-by: Lei Rao 
> ---
>  migration/colo.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/colo.c b/migration/colo.c index 
> 616dc00..c25e488 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -156,14 +156,15 @@ static void primary_vm_do_failover(void)
>  
>  /*
>   * Wake up COLO thread which may blocked in recv() or send(),
> - * The s->rp_state.from_dst_file and s->to_dst_file may use the
> - * same fd, but we still shutdown the fd for twice, it is harmless.
> + * The s->to_dst_file may use the same fd, but we still shutdown
> + * the fd for twice, it is harmless.
>   */

This change to the comment is incorrect. Shutting down a file multiple times is
safe and not a bug in itself. If it leads to a crash anyway, that points to a
bug in the qemu_file_shutdown() function or similar.

>  if (s->to_dst_file) {
>  qemu_file_shutdown(s->to_dst_file);
>  }
>  if (s->rp_state.from_dst_file) {
>  qemu_file_shutdown(s->rp_state.from_dst_file);
> +s->rp_state.from_dst_file = NULL;
>  }

This is incorrect, as the file won't be closed after this and is leaked. And 
indeed, when applying the patch to master, qemu crashes when triggering 
failover on the primary, due to the leak:

qemu-system-x86_64: ../util/yank.c:107: yank_unregister_instance: Assertion 
`QLIST_EMPTY(&entry->yankfns)' failed.

>  old_state = failover_set_state(FAILOVER_STATUS_ACTIVE,



-- 




RE: [PATCH 4/7] colo: fixed 'Segmentation fault' when the simplex mode PVM poweroff

2021-07-05 Thread Rao, Lei
I can reproduce this bug successfully and the GDB stack is as follows:
Program terminated with signal SIGSEGV, Segmentation fault.
#0  object_class_dynamic_cast (class=0x55c8f5d2bf50, typename=0x55c8f2f7379e 
"qio-channel") at qom/object.c:832
832 if (type->class->interfaces &&
[Current thread is 1 (Thread 0x7f756e97eb00 (LWP 1811577))]
(gdb) bt
#0  object_class_dynamic_cast (class=0x55c8f5d2bf50, typename=0x55c8f2f7379e 
"qio-channel") at qom/object.c:832
#1  0x55c8f2c3dd14 in object_dynamic_cast (obj=0x55c8f543ac00, 
typename=0x55c8f2f7379e "qio-channel") at qom/object.c:763
#2  0x55c8f2c3ddce in object_dynamic_cast_assert (obj=0x55c8f543ac00, 
typename=0x55c8f2f7379e "qio-channel",
file=0x55c8f2f73780 "migration/qemu-file-channel.c", line=117, 
func=0x55c8f2f73800 <__func__.18724> "channel_shutdown") at qom/object.c:786
#3  0x55c8f2bbc6ac in channel_shutdown (opaque=0x55c8f543ac00, rd=true, 
wr=true, errp=0x0) at migration/qemu-file-channel.c:117
#4  0x55c8f2bba56e in qemu_file_shutdown (f=0x7f7558070f50) at 
migration/qemu-file.c:67
#5  0x55c8f2ba5373 in migrate_fd_cancel (s=0x55c8f4ccf3f0) at 
migration/migration.c:1699
#6  0x55c8f2ba1992 in migration_shutdown () at migration/migration.c:187
#7  0x55c8f29a5b77 in main (argc=69, argv=0x7fff3e9e8c08, 
envp=0x7fff3e9e8e38) at vl.c:4512

From the call trace we can see that the core dump is caused by
QIOChannel *ioc = QIO_CHANNEL(opaque) in channel_shutdown().
After analysis, I think what you said is right: shutting down a file multiple
times is safe and not a bug in itself. The reason for the segmentation fault
is that after doing failover, the COLO thread will
qemu_fclose(s->rp_state.from_dst_file) in colo_process_checkpoint();
if we power off the guest at the same time, this causes qemu_file_shutdown()
to be called after qemu_fclose(from_dst_file). As a result, QEMU will crash.
So, I think the better way is as follows:
  /*
 * Must be called after failover BH is completed,
 * Or the failover BH may shutdown the wrong fd that
 * re-used by other threads after we release here.
 */
 if (s->rp_state.from_dst_file) {
qemu_fclose(s->rp_state.from_dst_file);
+  s->rp_state.from_dst_file = NULL;
  }
After qemu_fclose(s->rp_state.from_dst_file), we set from_dst_file to NULL.

Way to reproduce the bug:
Kill the SVM and execute the failover QMP command, then immediately execute
the poweroff command in the guest; this has a high probability of reproducing
the bug.

Thanks,
Lei.


-Original Message-
From: Rao, Lei 
Sent: Monday, July 5, 2021 10:54 AM
To: Lukas Straub 
Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; zhang.zhanghaili...@huawei.com; quint...@redhat.com; 
dgilb...@redhat.com; like.xu.li...@gmail.com; qemu-devel@nongnu.org; Like Xu 

Subject: RE: [PATCH 4/7] colo: fixed 'Segmentation fault' when the simplex mode 
PVM poweroff

It's been a long time since this bug, I will reproduce it to get the GDB stack, 
but it may take some time.

Thanks,
Lei.

-Original Message-
From: Lukas Straub 
Sent: Sunday, July 4, 2021 4:36 AM
To: Rao, Lei 
Cc: Zhang, Chen ; lizhij...@cn.fujitsu.com; 
jasow...@redhat.com; zhang.zhanghaili...@huawei.com; quint...@redhat.com; 
dgilb...@redhat.com; like.xu.li...@gmail.com; qemu-devel@nongnu.org; Like Xu 

Subject: Re: [PATCH 4/7] colo: fixed 'Segmentation fault' when the simplex mode 
PVM poweroff

On Thu, 17 Jun 2021 10:47:12 +0800
Lei Rao  wrote:

> From: "Rao, Lei" 
> 
> When a PVM completed its SVM failover steps and begins to run in the 
> simplex mode, QEMU would encounter a 'Segmentation fault' if the guest 
> poweroff with the following calltrace:
> 
> Program received signal SIGSEGV, Segmentation fault.
> 
> This is because primary_vm_do_failover() would call 
> "qemu_file_shutdown (s->rp_state.from_dst_file);" and later the
> migration_shutdown() would do it again. So, we should set the 
> s->rp_state.from_dst_file to NULL.

Hello,
Please provide a backtrace with such bugfixes. It helps the reviewers better 
understand the bug and the fix, and it helps you check whether the fix is 
actually correct. I suggest you enable coredumps on your test (or even 
production) systems, so that you always have a backtrace for every crash.

> Signed-off-by: Like Xu 
> Signed-off-by: Lei Rao 
> ---
>  migration/colo.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/colo.c b/migration/colo.c index
> 616dc00..c25e488 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -156,14 +156,15 @@ static void primary_vm_do_failover(void)
>  
>  /*
>   * Wake up COLO thread which may blocked in recv() or send(),
>

RE: [PATCH v1 1/1] COLO: only flush dirty ram pages from colo cache

2020-09-21 Thread Rao, Lei
Hi, Derek and Chen

ram_bulk_stage was false by default before Hailiang's patch.
For COLO it does not seem to be used, so I think there is no need to reset it 
to true.

Thanks,
Lei.

From: Derek Su  
Sent: Tuesday, September 22, 2020 11:48 AM
To: Zhang, Chen 
Cc: qemu-devel ; Rao, Lei ; 
zhang.zhanghaili...@huawei.com; quint...@redhat.com; dgilb...@redhat.com
Subject: Re: [PATCH v1 1/1] COLO: only flush dirty ram pages from colo cache

Hi, Chen

Sure.

BTW, I just went through Lei's patch.
ram_bulk_stage might need to be reset to true after stopping the COLO service, 
as my patch does.
What is your opinion?

Thanks.


Best regards,
Derek


Zhang, Chen <chen.zh...@intel.com> wrote on Tuesday, September 22, 2020 at 11:41 AM:
Hi Derek and Lei,
 
It looks the same as Lei's patch:
[PATCH 2/3] Reduce the time of checkpoint for COLO
Can you two discuss merging them into one patch?
 
Thanks
Zhang Chen
 
From: Derek Su <dere...@qnap.com> 
Sent: Tuesday, September 22, 2020 11:31 AM
To: qemu-devel <qemu-devel@nongnu.org>
Cc: zhang.zhanghaili...@huawei.com; quint...@redhat.com; 
dgilb...@redhat.com; Zhang, Chen <chen.zh...@intel.com>
Subject: Re: [PATCH v1 1/1] COLO: only flush dirty ram pages from colo cache
 
Hello, all
 
Ping...
 
Regards,
Derek Su
 
Derek Su <dere...@qnap.com> wrote on Thursday, September 10, 2020 at 6:47 PM:
On the secondary side, colo_flush_ram_cache() calls
migration_bitmap_find_dirty() to find the dirty pages and flush
them to the host. But ram_state's ram_bulk_stage flag is always
enabled on the secondary side, which leads to copying all RAM
pages instead of only the dirty ones.

Here, ram_bulk_stage on the secondary side is disabled during the
preparation of the COLO incoming process to avoid flushing the
whole RAM cache.

Signed-off-by: Derek Su <mailto:dere...@qnap.com>
---
 migration/colo.c |  6 +-
 migration/ram.c  | 10 ++
 migration/ram.h  |  3 +++
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/migration/colo.c b/migration/colo.c
index ea7d1e9d4e..6e644db306 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -844,6 +844,8 @@ void *colo_process_incoming_thread(void *opaque)
         return NULL;
     }

+    colo_disable_ram_bulk_stage();
+
     failover_init_state();

     mis->to_src_file = qemu_file_get_return_path(mis->from_src_file);
@@ -873,7 +875,7 @@ void *colo_process_incoming_thread(void *opaque)
         goto out;
     }
 #else
-        abort();
+    abort();
 #endif
     vm_start();
     trace_colo_vm_state_change("stop", "run");
@@ -924,6 +926,8 @@ out:
         qemu_fclose(fb);
     }

+    colo_enable_ram_bulk_stage();
+
     /* Hope this not to be too long to loop here */
     qemu_sem_wait(&mis->colo_incoming_sem);
     qemu_sem_destroy(&mis->colo_incoming_sem);
diff --git a/migration/ram.c b/migration/ram.c
index 76d4fee5d5..65e9b12058 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3357,6 +3357,16 @@ static bool postcopy_is_running(void)
     return ps >= POSTCOPY_INCOMING_LISTENING && ps < POSTCOPY_INCOMING_END;
 }

+void colo_enable_ram_bulk_stage(void)
+{
+    ram_state->ram_bulk_stage = true;
+}
+
+void colo_disable_ram_bulk_stage(void)
+{
+    ram_state->ram_bulk_stage = false;
+}
+
 /*
  * Flush content of RAM cache into SVM's memory.
  * Only flush the pages that be dirtied by PVM or SVM or both.
diff --git a/migration/ram.h b/migration/ram.h
index 2eeaacfa13..c1c0ebbe0f 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -69,4 +69,7 @@ void colo_flush_ram_cache(void);
 void colo_release_ram_cache(void);
 void colo_incoming_start_dirty_log(void);

+void colo_enable_ram_bulk_stage(void);
+void colo_disable_ram_bulk_stage(void);
+
 #endif
-- 
2.25.1
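
(As a side note on the patch above: a minimal, self-contained sketch of why an
always-true ram_bulk_stage defeats the dirty bitmap. The helper name and types
are hypothetical; this is not QEMU's migration_bitmap_find_dirty(), only an
illustration of the behaviour described in the commit message.)

#include <stdbool.h>
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical illustration only (not QEMU code): when bulk_stage is set,
 * the dirty bitmap is never consulted, so every page from 'start' onwards
 * is treated as dirty and the whole RAM cache ends up being flushed.
 */
static size_t find_dirty_page(const bool *dirty_bitmap, size_t nr_pages,
                              size_t start, bool bulk_stage)
{
    if (bulk_stage && start > 0) {
        return start;                 /* bitmap ignored: page assumed dirty */
    }
    for (size_t i = start; i < nr_pages; i++) {
        if (dirty_bitmap[i]) {
            return i;                 /* first page actually marked dirty */
        }
    }
    return nr_pages;                  /* nothing dirty */
}

int main(void)
{
    bool bitmap[8] = { false };       /* no page is dirty */

    /* Bulk stage: page 1 is reported dirty even though the bitmap is clean. */
    printf("bulk:   %zu\n", find_dirty_page(bitmap, 8, 1, true));
    /* Normal stage: the bitmap is honoured and nothing gets flushed. */
    printf("normal: %zu\n", find_dirty_page(bitmap, 8, 1, false));
    return 0;
}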


RE: [PATCH 2/3] Reduce the time of checkpoint for COLO

2020-09-22 Thread Rao, Lei
Got it.
It looks more reasonable. Will be changed in V2.

Thanks,
Lei.

-Original Message-
From: Li Zhijian  
Sent: Tuesday, September 22, 2020 2:58 PM
To: Rao, Lei ; Zhang, Chen ; 
jasow...@redhat.com; quint...@redhat.com; dgilb...@redhat.com; 
pbonz...@redhat.com
Cc: qemu-devel@nongnu.org
Subject: Re: [PATCH 2/3] Reduce the time of checkpoint for COLO



On 9/19/20 11:10 AM, leirao wrote:
> we should set ram_bulk_stage to false after ram_state_init, otherwise 
> the bitmap will be unused in migration_bitmap_find_dirty.
> all pages in ram cache will be flushed to the ram of secondary guest 
> for each checkpoint.
>
> Signed-off-by: leirao 
> ---
>   migration/ram.c | 12 
>   1 file changed, 12 insertions(+)
>
> diff --git a/migration/ram.c b/migration/ram.c index 76d4fee..6a2b6c1 
> 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -3019,6 +3019,17 @@ static void 
> decompress_data_with_multi_threads(QEMUFile *f,
>   }
>   
>   /*
> + * we must set ram_bulk_stage to false, otherwise in
> + * migration_bitmap_find_dirty the bitmap will be unused and
> + * all the pages in ram cache will be flushed to the ram of
> + * secondary VM.
> + */
> +static void colo_set_ram_state(RAMState *rsp)
This function name is too general; how about:

colo_init_ram_state(ram_state)
{
 ram_state_init(&ram_state);
 ram_state->ram_bulk_stage = false;
}

Thanks
Zhijian

> +{
> +    rsp->ram_bulk_stage = false;
> +}
> +
> +/*
>* colo cache: this is for secondary VM, we cache the whole
>* memory of the secondary VM, it is need to hold the global lock
>* to call this helper.
> @@ -3062,6 +3073,7 @@ int colo_init_ram_cache(void)
>   }
>   
>   ram_state_init(&ram_state);
> +    colo_set_ram_state(ram_state);
>   return 0;
>   }
>