On 2015/5/19 0:48, Dr. David Alan Gilbert wrote:
* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:
Besides the normal checkpoints, which are triggered by the result of the net
packet comparison, we also do an additional checkpoint periodically. This
reduces the number of dirty pages sent in a single checkpoint when no
checkpoint has happened for a long time (a special case where the net packets
are always consistent).
Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com>
Signed-off-by: Yang Hongyang <yan...@cn.fujitsu.com>
---
migration/colo.c | 29 +++++++++++++++++++++--------
1 file changed, 21 insertions(+), 8 deletions(-)
diff --git a/migration/colo.c b/migration/colo.c
index 9ef4554..da5bc5e 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -10,6 +10,7 @@
* later. See the COPYING file in the top-level directory.
*/
+#include "qemu/timer.h"
#include "sysemu/sysemu.h"
#include "migration/migration-colo.h"
#include "qemu/error-report.h"
@@ -32,6 +33,13 @@
*/
#define CHECKPOINT_MIN_PERIOD 100 /* unit: ms */
+/*
+ * force checkpoint timer: unit ms
+ * this is large because COLO checkpoint will mostly depend on
+ * COLO compare module.
+ */
+#define CHECKPOINT_MAX_PEROID 10000
+
enum {
COLO_READY = 0x46,
@@ -340,14 +348,7 @@ static void *colo_thread(void *opaque)
proxy_checkpoint_req = colo_proxy_compare();
if (proxy_checkpoint_req < 0) {
goto out;
- } else if (!proxy_checkpoint_req) {
- /*
- * No checkpoint is needed, wait for 1ms and then
- * check if we need checkpoint again
- */
- g_usleep(1000);
- continue;
- } else {
+ } else if (proxy_checkpoint_req) {
int64_t interval;
current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
@@ -357,8 +358,20 @@ static void *colo_thread(void *opaque)
g_usleep((1000*(CHECKPOINT_MIN_PERIOD - interval)));
}
DPRINTF("Net packets is not consistent!!!\n");
+ goto do_checkpoint;
+ }
+
+ /*
+ * No proxy checkpoint is requested, wait for 100ms
+ * and then check if we need checkpoint again.
+ */
+ current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+ if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
+ g_usleep(100000);
+ continue;
This 100ms sleep is interesting - can you explain its purpose? Is it
just to save CPU time in the colo thread? It used to be 1ms (above
and in the previous version).
You are right, it is just to save CPU time. I'm not sure, but isn't 1ms a
little too short?
In our latest patch, we actually send some dirty pages to the slave side
during this sleep time, if there are any.
The MIN_PERIOD already stops the checkpoints being too close together,
so this is a separate sleep from that.
}
+do_checkpoint:
/* start a colo checkpoint */
if (colo_do_checkpoint_transaction(s, colo_control)) {
goto out;
--
1.7.12.4
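
For readers following the discussion, the per-iteration decision made by the
patched loop can be sketched as a small standalone function. This is only an
illustration of the logic in the diff above, not code from the patch: the
names colo_decide, colo_action, colo_decision and IDLE_POLL_MS are
hypothetical, and the maximum period is spelled CHECKPOINT_MAX_PERIOD here
although the patch spells it CHECKPOINT_MAX_PEROID.

```c
#include <stdint.h>

#define CHECKPOINT_MIN_PERIOD 100    /* ms, as in the patch */
#define CHECKPOINT_MAX_PERIOD 10000  /* ms, CHECKPOINT_MAX_PEROID in the patch */
#define IDLE_POLL_MS          100    /* hypothetical name for g_usleep(100000) */

/* Hypothetical result type: what the loop does next. */
typedef enum {
    ACT_ABORT,               /* compare failed: "goto out" */
    ACT_SLEEP_THEN_RECHECK,  /* sleep, then "continue" */
    ACT_CHECKPOINT           /* optional sleep, then "goto do_checkpoint" */
} colo_action;

typedef struct {
    colo_action action;
    int64_t sleep_ms;        /* time to sleep before acting */
} colo_decision;

/*
 * proxy_req: result of the packet comparison (<0 error, 0 consistent,
 *            >0 checkpoint requested).
 * now_ms / last_checkpoint_ms: host clock readings in milliseconds.
 */
static colo_decision colo_decide(int proxy_req, int64_t now_ms,
                                 int64_t last_checkpoint_ms)
{
    colo_decision d = { ACT_CHECKPOINT, 0 };
    int64_t interval = now_ms - last_checkpoint_ms;

    if (proxy_req < 0) {
        d.action = ACT_ABORT;
    } else if (proxy_req > 0) {
        /* MIN_PERIOD keeps requested checkpoints from being too close. */
        if (interval < CHECKPOINT_MIN_PERIOD) {
            d.sleep_ms = CHECKPOINT_MIN_PERIOD - interval;
        }
    } else if (interval < CHECKPOINT_MAX_PERIOD) {
        /* Nothing requested and not yet overdue: idle poll. */
        d.action = ACT_SLEEP_THEN_RECHECK;
        d.sleep_ms = IDLE_POLL_MS;
    }
    /* Otherwise: no request but MAX_PERIOD elapsed, force a checkpoint. */
    return d;
}
```

Written this way, the two sleeps discussed above are visibly separate: the
MIN_PERIOD sleep only delays a requested checkpoint, while the 100ms sleep is
just the polling interval when the packets stay consistent.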
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK