Re: [Xen-devel] [RFC PATCH] tools: remove blktap2 related code and documentation

2016-08-16 Thread Yang Hongyang
On Mon, Aug 15, 2016 at 6:50 PM, Wei Liu  wrote:

> Blktap2 has effectively been dead code for a few years.
>
> Notable changes in this patch:
>
> 0. Unhook blktap2 from build system
> 1. Now libxl no longer supports TAP ask backend, appropriate assertions
>

s/ask/disk/

>    are added and some code paths now return ERROR_FAIL
> 2. Tap is no longer a supported backend in doc
> 3. Remove relevant entries in MAINTAINERS
>
> A patch to actually remove blktap2 directory will come later.
>
> Signed-off-by: Wei Liu 
> ---
> Compile-test only at this stage.
>
> Ross, do you have any objection to this? I haven't seen any update from the
> joint blktap2 maintenance for a few months.
>
> Cc: Andrew Cooper 
> Cc: George Dunlap 
> Cc: Ian Jackson 
> Cc: Jan Beulich 
> Cc: Konrad Rzeszutek Wilk 
> Cc: Stefano Stabellini 
> Cc: Tim Deegan 
> Cc: Shriram Rajagopalan 
> Cc: Yang Hongyang 
> Cc: Ross Philipson 
> Cc: Lars Kurth 
> ---
>  .gitignore  | 14 --
>  INSTALL |  4 --
>  MAINTAINERS |  2 -
>  config/Tools.mk.in  |  1 -
>  docs/misc/xl-disk-configuration.txt |  2 +-
>  tools/Makefile  |  1 -
>  tools/Rules.mk  | 17 +--
>  tools/config.h.in   |  6 ---
>  tools/configure | 83 
>  tools/configure.ac  | 22 -
>  tools/libxl/Makefile|  8 +---
>  tools/libxl/check-xl-disk-parse |  2 +-
>  tools/libxl/libxl.c | 25 ++
>  tools/libxl/libxl_blktap2.c | 94 --
> ---
>  tools/libxl/libxl_device.c  | 32 ++---
>  tools/libxl/libxl_dm.c  | 17 ++-
>  tools/libxl/libxl_internal.h| 19 
>  tools/libxl/libxl_noblktap2.c   | 42 -
>  tools/xenstore/hashtable.c  |  5 --
>  tools/xenstore/hashtable.h  |  5 --
>  tools/xenstore/hashtable_private.h  |  5 --
>  21 files changed, 13 insertions(+), 393 deletions(-)
>  delete mode 100644 tools/libxl/libxl_blktap2.c
>  delete mode 100644 tools/libxl/libxl_noblktap2.c
>
> diff --git a/.gitignore b/.gitignore
> index d193820..ea2 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -97,19 +97,6 @@ tools/libs/evtchn/headers.chk
>  tools/libs/gnttab/headers.chk
>  tools/libs/call/headers.chk
>  tools/libs/foreignmemory/headers.chk
> -tools/blktap2/daemon/blktapctrl
> -tools/blktap2/drivers/img2qcow
> -tools/blktap2/drivers/lock-util
> -tools/blktap2/drivers/qcow-create
> -tools/blktap2/drivers/qcow2raw
> -tools/blktap2/drivers/tapdisk
> -tools/blktap2/drivers/tapdisk-client
> -tools/blktap2/drivers/tapdisk-diff
> -tools/blktap2/drivers/tapdisk-stream
> -tools/blktap2/drivers/tapdisk2
> -tools/blktap2/drivers/td-util
> -tools/blktap2/vhd/vhd-update
> -tools/blktap2/vhd/vhd-util
>  tools/console/xenconsole
>  tools/console/xenconsoled
>  tools/console/client/_paths.h
> @@ -327,7 +314,6 @@ tools/libxl/*.pyc
>  tools/libxl/libxl-save-helper
>  tools/libxl/test_timedereg
>  tools/libxl/test_fdderegrace
> -tools/blktap2/control/tap-ctl
>  tools/firmware/etherboot/eb-roms.h
>  tools/firmware/etherboot/gpxe-git-snapshot.tar.gz
>  tools/misc/xenwatchdogd
> diff --git a/INSTALL b/INSTALL
> index 9759354..3b255c7 100644
> --- a/INSTALL
> +++ b/INSTALL
> @@ -144,10 +144,6 @@ this detection and the sysv runlevel scripts have to be used.
>--with-systemd=DIR
>--with-systemd-modules-load=DIR
>
> -The old backend drivers are disabled because qdisk is now the default.
> -This option can be used to build them anyway.
> -  --enable-blktap2
> -
>  Build various stubom components, some are only example code. Its usually
>  enough to specify just --enable-stubdom and leave these options alone.
>--enable-ioemu-stubdom
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 97720a8..d54795b 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -322,8 +322,6 @@ M:  Shriram Rajagopalan 
>  M: Yang Hongyang 
>  S: Maintained
>  F: docs/README.remus
> -F: tools/blktap2/drivers/block-remus.c
> -F: tools/blktap2/drivers/hashtable*
>  F: tools/libxl/libxl_remus_*
>  F: tools/libxl/libxl_netbuffer.c
>  F: tools/libxl/libxl_nonetbuffer.c
> diff --git a/config/Tools.mk.in b/config/Tools.mk.in
> index 0f79f4e..511406c 100644
> --- a/config/Tools.mk.in
> +++ b/config/Tools.mk.in
> @@ -56,7 +56,6 @@ CONFIG_ROMBIOS  := @rombios@
>  CONFIG_SEABIOS  := @seabios@
>  CONFIG_QEMU_TRAD:= @qemu_traditional@
>  CONFIG_QEMU_XEN := @qe

Re: [Xen-devel] Current LibXL Status

2015-11-19 Thread Yang Hongyang

On 2015-11-19 20:16, Ian Campbell wrote:

On Thu, 2015-11-19 at 11:55 +, Ian Campbell wrote:

On Thu, 2015-11-19 at 11:48 +, Ian Campbell wrote:

On Thu, 2015-11-19 at 11:33 +, Andrew Cooper wrote:


The majority of those cases are not appropriate uses of exit().
AFAIR, the *only* valid use of exit() in a library is to clean up in a
child process from a library-initiated fork().


... or (in this case) in the libxl-save-helper (separate process).

The only one I can find which isn't one of these is
in libxl__event_disaster, and that is only if the applications (or language
bindings) haven't provided a suitable disaster callback.


Was looking at 4.4, in staging I also see a very odd one in
drbd_preresume_async, which isn't obviously in a child process AFAICT.

Hongyang, what prevents that exit from killing the whole toolstack
process?


I had missed an _async suffix on that function versus the one which was the
actual callback; it is invoked via drbd_async_call, which involves a fork().


Yeah, it is in a child process.



Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel



--
Thanks,
Yang



Re: [Xen-devel] [PATCH v1 COLO Pre 03/12] tools/libxc: export xc_bitops.h

2015-06-03 Thread Yang Hongyang



On 06/02/2015 06:11 PM, Andrew Cooper wrote:

On 02/06/15 10:26, Yang Hongyang wrote:

When we are under COLO, we will send dirty page bitmap info from
secondary to primary at every checkpoint. So we need to get/test
the dirty page bitmap. We just expose xc_bitops.h for libxl use.

NOTE:
   Need to make clean and rerun configure to get it compiled.

Signed-off-by: Yang Hongyang 


I like this change, but let's take the opportunity to fix some of the
issues in it.


Thanks, will fix in next version.




---
  tools/libxc/include/xc_bitops.h | 76 +
  tools/libxc/xc_bitops.h | 76 -
  2 files changed, 76 insertions(+), 76 deletions(-)
  create mode 100644 tools/libxc/include/xc_bitops.h
  delete mode 100644 tools/libxc/xc_bitops.h

diff --git a/tools/libxc/include/xc_bitops.h b/tools/libxc/include/xc_bitops.h
new file mode 100644
index 000..cd749f4
--- /dev/null
+++ b/tools/libxc/include/xc_bitops.h
@@ -0,0 +1,76 @@
+#ifndef XC_BITOPS_H
+#define XC_BITOPS_H 1


No need for a 1 here


+
+/* bitmap operations for single threaded access */
+
+#include 
+#include 
+
+#define BITS_PER_LONG (sizeof(unsigned long) * 8)


All defines like this need XC_ prefixes, and CHAR_BIT should be used in
preference to 8.


+#define ORDER_LONG (sizeof(unsigned long) == 4 ? 5 : 6)


This name is misleading, as it is in terms of bits not bytes.
XC_BITMAP_SHIFT perhaps?


+
+#define BITMAP_ENTRY(_nr,_bmap) ((_bmap))[(_nr)/BITS_PER_LONG]
+#define BITMAP_SHIFT(_nr) ((_nr) % BITS_PER_LONG)


I would recommend dropping these and open coding the few cases below.
It would be far more clear.


+
+/* calculate required space for number of longs needed to hold nr_bits */
+static inline int bitmap_size(int nr_bits)


"int" has been inappropriate everywhere in this file.  unsigned long
please (or settle on unsigned int everywhere)


+{
+int nr_long, nr_bytes;
+nr_long = (nr_bits + BITS_PER_LONG - 1) >> ORDER_LONG;


This calculation can overflow.

(nr_bits >> ORDER_LONG) + !!(nr_bits % BITS_PER_LONG)


+nr_bytes = nr_long * sizeof(unsigned long);
+return nr_bytes;
+}
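
The overflow comment on the rounding calculation can be illustrated with a small sketch. It assumes an unsigned long interface, as requested in the review, and uses the hypothetical names XC_BITS_PER_LONG and xc_bitmap_size; neither name is taken from the patch, and the next version may well choose different ones.

```c
#include <limits.h>   /* CHAR_BIT */

/* Hypothetical XC_-prefixed name, following the review comments. */
#define XC_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Number of bytes needed to hold nr_bits, rounded up to whole longs.
 * Dividing first and then adding one for a partial word avoids the
 * wrap-around that (nr_bits + XC_BITS_PER_LONG - 1) can hit when
 * nr_bits is close to the top of its range.
 */
static inline unsigned long xc_bitmap_size(unsigned long nr_bits)
{
    unsigned long nr_longs = nr_bits / XC_BITS_PER_LONG
                             + !!(nr_bits % XC_BITS_PER_LONG);

    return nr_longs * sizeof(unsigned long);
}
```

With the add-then-shift form, a request for ULONG_MAX bits would wrap to a tiny allocation; the divide-then-adjust form keeps the intermediate value in range.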
+
+static inline unsigned long *bitmap_alloc(int nr_bits)
+{
+return calloc(1, bitmap_size(nr_bits));
+}
+
+static inline void bitmap_set(unsigned long *addr, int nr_bits)
+{
+memset(addr, 0xff, bitmap_size(nr_bits));
+}
+
+static inline void bitmap_clear(unsigned long *addr, int nr_bits)
+{
+memset(addr, 0, bitmap_size(nr_bits));
+}
+
+static inline int test_bit(int nr, unsigned long *addr)


const *addr, as this is a read-only operation.


+{
+return (BITMAP_ENTRY(nr, addr) >> BITMAP_SHIFT(nr)) & 1;
+}
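
A sketch of how the const suggestion and the earlier "open code the macros" suggestion might combine for the read-only accessor. The names xc_test_bit and XC_BITS_PER_LONG are hypothetical and only serve the illustration; the parameter order is left as in the patch.

```c
#include <limits.h>   /* CHAR_BIT */

/* Hypothetical XC_-prefixed name, following the review comments. */
#define XC_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Read-only query, so the bitmap is taken through a const pointer. */
static inline int xc_test_bit(unsigned long nr, const unsigned long *addr)
{
    /* Word index and bit offset are open coded instead of using macros. */
    return (addr[nr / XC_BITS_PER_LONG] >> (nr % XC_BITS_PER_LONG)) & 1;
}
```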
+
+static inline void clear_bit(int nr, unsigned long *addr)
+{
+BITMAP_ENTRY(nr, addr) &= ~(1UL << BITMAP_SHIFT(nr));
+}
+
+static inline void set_bit(int nr, unsigned long *addr)
+{
+BITMAP_ENTRY(nr, addr) |= (1UL << BITMAP_SHIFT(nr));
+}


It would be nice to be consistent on whether the bitmap pointer or the
bit is the first parameter.  Perhaps a second cleanup patch which makes
this consistent and adjusts all current callers.

~Andrew


+
+static inline int test_and_clear_bit(int nr, unsigned long *addr)
+{
+int oldbit = test_bit(nr, addr);
+clear_bit(nr, addr);
+return oldbit;
+}
+
+static inline int test_and_set_bit(int nr, unsigned long *addr)
+{
+int oldbit = test_bit(nr, addr);
+set_bit(nr, addr);
+return oldbit;
+}
+
+static inline void bitmap_or(unsigned long *dst, const unsigned long *other,
+ int nr_bits)
+{
+int i, nr_longs = (bitmap_size(nr_bits) / sizeof(unsigned long));
+for ( i = 0; i < nr_longs; ++i )
+dst[i] |= other[i];
+}
+
+#endif  /* XC_BITOPS_H */
diff --git a/tools/libxc/xc_bitops.h b/tools/libxc/xc_bitops.h
deleted file mode 100644
index cd749f4..000
--- a/tools/libxc/xc_bitops.h
+++ /dev/null
@@ -1,76 +0,0 @@
-#ifndef XC_BITOPS_H
-#define XC_BITOPS_H 1
-
-/* bitmap operations for single threaded access */
-
-#include 
-#include 
-
-#define BITS_PER_LONG (sizeof(unsigned long) * 8)
-#define ORDER_LONG (sizeof(unsigned long) == 4 ? 5 : 6)
-
-#define BITMAP_ENTRY(_nr,_bmap) ((_bmap))[(_nr)/BITS_PER_LONG]
-#define BITMAP_SHIFT(_nr) ((_nr) % BITS_PER_LONG)
-
-/* calculate required space for number of longs needed to hold nr_bits */
-static inline int bitmap_size(int nr_bits)
-{
-int nr_long, nr_bytes;
-nr_long = (nr_bits + BITS_PER_LONG - 1) >> ORDER_LONG;
-nr_bytes = nr_long * sizeof(unsigned long);
-return nr_bytes;
-}
-
-static inline unsigned long *bitmap_alloc(int nr_bits)
-{
-return calloc(1, bitmap_size(nr_bits));
-}
-
-static inline void bitmap_set(unsigned long *addr, int nr_bits)
-{
-memset(addr, 0xff, bitmap_size(nr_bits));
-}
-
-static inline void bitmap_clear(unsigned long *addr, int nr_bits)
-{
-memset(addr, 0, bitmap_size(nr_bits));
-}
-
-static inline

Re: [Xen-devel] [PATCH v1 COLO Pre 03/12] tools/libxc: export xc_bitops.h

2015-06-04 Thread Yang Hongyang



On 06/04/2015 04:36 PM, Ian Campbell wrote:

On Thu, 2015-06-04 at 09:01 +0800, Yang Hongyang wrote:


On 06/02/2015 06:11 PM, Andrew Cooper wrote:

On 02/06/15 10:26, Yang Hongyang wrote:

When we are under COLO, we will send dirty page bitmap info from
secondary to primary at every checkpoint. So we need to get/test
the dirty page bitmap. We just expose xc_bitops.h for libxl use.

NOTE:
Need to make clean and rerun configure to get it compiled.

Signed-off-by: Yang Hongyang 


I like this change, but let's take the opportunity to fix some of the
issues in it.


Thanks, will fix in next version.


Please do fix and move in two separate patches, probably fix first
although I don't mind overall.


Sure, will do.



Ian.

.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 0/6] Misc cleanups for libxl

2015-06-04 Thread Yang Hongyang



On 06/03/2015 06:51 PM, Andrew Cooper wrote:

On 03/06/15 09:01, Yang Hongyang wrote:

This patchset mainly focuses on libxl save. Most of the patches
simply move code out of libxl_dom.c, except for one refactoring patch.

Please see the individual patches for details.

Can get the whole patchset from:
 https://github.com/macrosheep/xen/tree/misc-libxl-v2


Overall, these look like good changes, although I have not reviewed them
in detail.


Thank you! I hope it won't affect migration v2 too much.



~Andrew



v1->v2:
   - use dsps for suspend_state and dss for save_state.
   - move resume code to libxl_dom_suspend.c
   - move toolstack save/restore code to libxl_dom_save.c
   - move refactor patch to the end so that rebasing the patchset is easier.

Yang Hongyang (6):
   tools/libxl: rename libxl__domain_suspend to libxl__domain_save
   tools/libxl: move domain suspend code into libxl_dom_suspend.c
   tools/libxl: move domain resume code into libxl_dom_suspend.c
   tools/libxl: move remus code into libxl_remus.c
   tools/libxl: move save/restore code into libxl_dom_save.c
   libxl/save: Refactor libxl__domain_suspend_state

  tools/libxl/Makefile |5 +-
  tools/libxl/libxl.c  |  126 +---
  tools/libxl/libxl_dom.c  | 1202 --
  tools/libxl/libxl_dom_save.c |  672 +
  tools/libxl/libxl_dom_suspend.c  |  465 +++
  tools/libxl/libxl_internal.h |   65 ++-
  tools/libxl/libxl_netbuffer.c|2 +-
  tools/libxl/libxl_remus.c|  307 ++
  tools/libxl/libxl_save_callout.c |2 +-
  9 files changed, 1503 insertions(+), 1343 deletions(-)
  create mode 100644 tools/libxl/libxl_dom_save.c
  create mode 100644 tools/libxl/libxl_dom_suspend.c
  create mode 100644 tools/libxl/libxl_remus.c



.



--
Thanks,
Yang.



[Xen-devel] [PATCH v2 COLOPre 06/13] tools/libxl: Introduce a new internal API libxl__domain_unpause()

2015-06-07 Thread Yang Hongyang
From: Wen Congyang 

The guest is paused after libxl_domain_create_restore().
The secondary VM runs in COLO mode, so we need to unpause
the guest. The existing libxl_domain_unpause() is a public
API, not an internal one, so introduce an internal API,
libxl__domain_unpause(), for this. No functional change.
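
The split between a public entry point and an internal helper can be sketched in isolation. Everything below is a stand-in: struct demo_ctx and struct demo_gc are hypothetical replacements for libxl_ctx and libxl__gc, and the comments only point at where GC_INIT and GC_FREE sit in the real code.

```c
#include <stdint.h>

/* Hypothetical stand-ins for libxl_ctx and libxl__gc; not the real types. */
struct demo_ctx { int unpause_calls; };
struct demo_gc  { struct demo_ctx *owner; };

/* Internal variant: the caller already holds a gc, so none is created here. */
static int demo__domain_unpause(struct demo_gc *gc, uint32_t domid)
{
    (void)domid;                 /* the sketch does not talk to a hypervisor */
    gc->owner->unpause_calls++;
    return 0;
}

/* Public variant: set up the gc, delegate, then tear the gc down. */
static int demo_domain_unpause(struct demo_ctx *ctx, uint32_t domid)
{
    struct demo_gc gc = { .owner = ctx };   /* plays the role of GC_INIT */
    int rc = demo__domain_unpause(&gc, domid);

    /* GC_FREE would release gc-owned allocations at this point. */
    return rc;
}
```

Internal callers that already own a gc call the double-underscore variant directly, while external callers keep using the unchanged public signature.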

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
---
 tools/libxl/libxl.c  | 20 ++--
 tools/libxl/libxl_internal.h |  1 +
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index ba2da92..d5691dc 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -933,9 +933,8 @@ out:
 return AO_INPROGRESS;
 }
 
-int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
+int libxl__domain_unpause(libxl__gc *gc, uint32_t domid)
 {
-GC_INIT(ctx);
 char *path;
 char *state;
 int ret, rc = 0;
@@ -947,7 +946,7 @@ int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
 }
 
 if (type == LIBXL_DOMAIN_TYPE_HVM) {
-uint32_t dm_domid = libxl_get_stubdom_id(ctx, domid);
+uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
 
 path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
 state = libxl__xs_read(gc, XBT_NULL, path);
@@ -957,12 +956,21 @@ int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
  NULL, NULL, NULL);
 }
 }
-ret = xc_domain_unpause(ctx->xch, domid);
-if (ret<0) {
-LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "unpausing domain %d", domid);
+
+ret = xc_domain_unpause(CTX->xch, domid);
+if (ret < 0) {
+LIBXL__LOG_ERRNO(CTX, LIBXL__LOG_ERROR, "unpausing domain %d", domid);
 rc = ERROR_FAIL;
 }
  out:
+return rc;
+}
+
+int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
+{
+GC_INIT(ctx);
+int rc = libxl__domain_unpause(gc, domid);
+
 GC_FREE;
 return rc;
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 20364c6..366470f 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1044,6 +1044,7 @@ _hidden int libxl__domain_restore(libxl__gc *gc, uint32_t domid);
 _hidden int libxl__domain_resume(libxl__gc *gc, uint32_t domid,
  int suspend_cancel);
 _hidden int libxl__domain_s3_resume(libxl__gc *gc, int domid);
+_hidden int libxl__domain_unpause(libxl__gc *gc, uint32_t domid);
 
 /* returns 0 or 1, or a libxl error code */
 _hidden int libxl__domain_pvcontrol_available(libxl__gc *gc, uint32_t domid);
-- 
1.9.1




[Xen-devel] [PATCH v2 COLOPre 00/13] Prerequisite patches for COLO

2015-06-07 Thread Yang Hongyang
This patchset is a prerequisite for the COLO feature. For an explanation
of COLO, see http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping

This patchset is based on:
[PATCH v2 0/6] Misc cleanups for libxl

You can also get the patchset from:
https://github.com/macrosheep/xen/tree/colo-v6

v1->v2:
 - Rebased to [PATCH v2 0/6] Misc cleanups for libxl
 - Add a bugfix for the error handling of process_record

Wen Congyang (4):
  tools/libxc: support to resume uncooperative HVM guests
  tools/libxl: Introduce a new internal API libxl__domain_unpause()
  tools/libxl: Update libxl_save_msgs_gen.pl to support return data from
xl to xc
  tools/libxl: Add back channel to allow migration target send data back

Yang Hongyang (9):
  libxc/restore: fix error handle of process_record
  libxc/restore: zero ioreq page only one time
  tools/libxc: export xc_bitops.h
  tools/libxl: introduce a new API libxl__domain_restore() to load qemu
state
  tools/libxl: Update libxl__domain_unpause() to support qemu-xen
  tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty()
  tools/libxl: rename remus device to checkpoint device
  tools/libxl: adjust the indentation
  tools/libxl: don't touch remus in checkpoint_device

 tools/libxc/include/xc_bitops.h   |  76 
 tools/libxc/xc_bitops.h   |  76 
 tools/libxc/xc_resume.c   |  22 ++-
 tools/libxc/xc_sr_restore.c   |  28 +--
 tools/libxc/xc_sr_restore_x86_hvm.c   |   3 +-
 tools/libxl/Makefile  |   2 +-
 tools/libxl/libxl.c   |  62 +--
 tools/libxl/libxl_checkpoint_device.c | 282 +
 tools/libxl/libxl_create.c|  14 +-
 tools/libxl/libxl_dom_save.c  | 128 +
 tools/libxl/libxl_internal.h  | 171 ++
 tools/libxl/libxl_netbuffer.c | 117 ++--
 tools/libxl/libxl_nonetbuffer.c   |  10 +-
 tools/libxl/libxl_qmp.c   |  10 ++
 tools/libxl/libxl_remus.c | 140 ++-
 tools/libxl/libxl_remus_device.c  | 327 --
 tools/libxl/libxl_remus_disk_drbd.c   |  56 +++---
 tools/libxl/libxl_save_callout.c  |  31 
 tools/libxl/libxl_save_helper.c   |  17 ++
 tools/libxl/libxl_save_msgs_gen.pl|  65 ++-
 tools/libxl/libxl_types.idl   |  11 +-
 tools/libxl/xl_cmdimpl.c  |   7 +
 22 files changed, 970 insertions(+), 685 deletions(-)
 create mode 100644 tools/libxc/include/xc_bitops.h
 delete mode 100644 tools/libxc/xc_bitops.h
 create mode 100644 tools/libxl/libxl_checkpoint_device.c
 delete mode 100644 tools/libxl/libxl_remus_device.c

-- 
1.9.1




[Xen-devel] [PATCH v2 COLOPre 07/13] tools/libxl: Update libxl__domain_unpause() to support qemu-xen

2015-06-07 Thread Yang Hongyang
Currently, libxl__domain_unpause() only supports
qemu-xen-traditional. Update it to support qemu-xen.

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
---
 tools/libxl/libxl.c | 42 +-
 1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index d5691dc..5c843c2 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -933,10 +933,37 @@ out:
 return AO_INPROGRESS;
 }
 
-int libxl__domain_unpause(libxl__gc *gc, uint32_t domid)
+static int libxl__domain_unpause_device_model(libxl__gc *gc, uint32_t domid)
 {
 char *path;
 char *state;
+
+switch (libxl__device_model_version_running(gc, domid)) {
+case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
+uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
+
+path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
+state = libxl__xs_read(gc, XBT_NULL, path);
+if (state != NULL && !strcmp(state, "paused")) {
+libxl__qemu_traditional_cmd(gc, domid, "continue");
+libxl__wait_for_device_model_deprecated(gc, domid, "running",
+NULL, NULL, NULL);
+}
+break;
+}
+case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+if (libxl__qmp_resume(gc, domid))
+return ERROR_FAIL;
+break;
+default:
+return ERROR_INVAL;
+}
+
+return 0;
+}
+
+int libxl__domain_unpause(libxl__gc *gc, uint32_t domid)
+{
 int ret, rc = 0;
 
 libxl_domain_type type = libxl__domain_type(gc, domid);
@@ -946,14 +973,11 @@ int libxl__domain_unpause(libxl__gc *gc, uint32_t domid)
 }
 
 if (type == LIBXL_DOMAIN_TYPE_HVM) {
-uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
-
-path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
-state = libxl__xs_read(gc, XBT_NULL, path);
-if (state != NULL && !strcmp(state, "paused")) {
-libxl__qemu_traditional_cmd(gc, domid, "continue");
-libxl__wait_for_device_model_deprecated(gc, domid, "running",
- NULL, NULL, NULL);
+rc = libxl__domain_unpause_device_model(gc, domid);
+if (rc < 0) {
+LOG(ERROR, "failed to unpause device model for domain %u:%d",
+domid, rc);
+goto out;
 }
 }
 
-- 
1.9.1




[Xen-devel] [PATCH v2 COLOPre 03/13] libxc/restore: zero ioreq page only one time

2015-06-07 Thread Yang Hongyang
The ioreq page contains an evtchn, which is set when we resume the
secondary VM for the first time. The hypervisor checks whether the
evtchn is corrupted, so we cannot zero the ioreq page more
than once.

ioreq->state is always STATE_IOREQ_NONE after the VM is
suspended, so it is OK to zero the page only once.
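
The "zero only once" idea can be sketched with an explicit flag. Note that this is a swapped-in technique for illustration: the patch itself keys the decision off ctx->restore.buffer_all_records, whereas the struct demo_restore and its zeroed flag below are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical restore state holding a stand-in for the ioreq page. */
struct demo_restore {
    bool zeroed;               /* has the page been cleared already? */
    unsigned char page[16];    /* stand-in for the guest ioreq page */
};

/* Clear the page on the first call only; later calls leave it untouched. */
static void demo_clear_ioreq_page_once(struct demo_restore *r)
{
    if (r->zeroed)
        return;

    memset(r->page, 0, sizeof(r->page));
    r->zeroed = true;
}
```

After the first checkpoint, whatever the first resume wrote into the page (the evtchn, in the real case) survives every later restore pass.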

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen congyang 
CC: Andrew Cooper 
---
 tools/libxc/xc_sr_restore_x86_hvm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/xc_sr_restore_x86_hvm.c b/tools/libxc/xc_sr_restore_x86_hvm.c
index 6f5af0e..06177e0 100644
--- a/tools/libxc/xc_sr_restore_x86_hvm.c
+++ b/tools/libxc/xc_sr_restore_x86_hvm.c
@@ -78,7 +78,8 @@ static int handle_hvm_params(struct xc_sr_context *ctx,
 break;
 case HVM_PARAM_IOREQ_PFN:
 case HVM_PARAM_BUFIOREQ_PFN:
-xc_clear_domain_page(xch, ctx->domid, entry->value);
+if ( !ctx->restore.buffer_all_records )
+xc_clear_domain_page(xch, ctx->domid, entry->value);
 break;
 }
 
-- 
1.9.1




[Xen-devel] [PATCH v2 COLOPre 05/13] tools/libxl: introduce a new API libxl__domain_restore() to load qemu state

2015-06-07 Thread Yang Hongyang
The secondary VM runs in COLO mode, so we will repeat
the following steps again and again:
1. suspend both the primary VM and the secondary VM
2. sync the state
3. resume both the primary VM and the secondary VM
We send qemu's state each time in step 2, and the
secondary's qemu should read it each time before the
secondary VM is resumed. Introduce a new API, libxl__domain_restore(),
to do this. It should be called before resuming the
secondary VM.
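
The ordering constraint in the description (load qemu state before the secondary resumes, on every checkpoint) can be sketched as a loop that records its steps. All names below are hypothetical; the real work is done by libxl and qemu, and this only models the sequence.

```c
#include <stddef.h>

/* Steps of one checkpoint, in the order the description lists them. */
enum colo_step {
    COLO_SUSPEND,        /* 1. suspend primary and secondary */
    COLO_SYNC_STATE,     /* 2. sync state, including qemu's state */
    COLO_RESTORE_QEMU,   /* secondary loads qemu state before resuming */
    COLO_RESUME          /* 3. resume primary and secondary */
};

/* Hypothetical recorder so the ordering can be inspected afterwards. */
struct colo_trace {
    enum colo_step steps[64];
    size_t len;
};

static void colo_record(struct colo_trace *t, enum colo_step s)
{
    if (t->len < sizeof(t->steps) / sizeof(t->steps[0]))
        t->steps[t->len++] = s;
}

/* Run nr_checkpoints iterations of the suspend/sync/restore/resume cycle. */
static void colo_run_checkpoints(struct colo_trace *t,
                                 unsigned int nr_checkpoints)
{
    for (unsigned int i = 0; i < nr_checkpoints; i++) {
        colo_record(t, COLO_SUSPEND);
        colo_record(t, COLO_SYNC_STATE);
        colo_record(t, COLO_RESTORE_QEMU);  /* must precede the resume */
        colo_record(t, COLO_RESUME);
    }
}
```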

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
---
 tools/libxl/libxl_dom_save.c | 47 
 tools/libxl/libxl_internal.h |  4 
 tools/libxl/libxl_qmp.c  | 10 ++
 3 files changed, 61 insertions(+)

diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index 74a6bae..f9627f8 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -663,6 +663,53 @@ int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
 return 0;
 }
 
+int libxl__domain_restore(libxl__gc *gc, uint32_t domid)
+{
+int rc = 0;
+
+libxl_domain_type type = libxl__domain_type(gc, domid);
+if (type != LIBXL_DOMAIN_TYPE_HVM) {
+rc = ERROR_FAIL;
+goto out;
+}
+
+rc = libxl__domain_restore_device_model(gc, domid);
+if (rc)
+LOG(ERROR, "failed to restore device mode for domain %u:%d",
+domid, rc);
+out:
+return rc;
+}
+
+int libxl__domain_restore_device_model(libxl__gc *gc, uint32_t domid)
+{
+char *state_file;
+int rc;
+
+switch (libxl__device_model_version_running(gc, domid)) {
+case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
+/* not supported now */
+rc = ERROR_INVAL;
+break;
+case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+/*
+ * This function may be called too many times for the same gc,
+ * so we use NOGC, and free the memory before return to avoid
+ * OOM.
+ */
+state_file = libxl__sprintf(NOGC,
+XC_DEVICE_MODEL_RESTORE_FILE".%d",
+domid);
+rc = libxl__qmp_restore(gc, domid, state_file);
+free(state_file);
+break;
+default:
+rc = ERROR_INVAL;
+}
+
+return rc;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 1905195..20364c6 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1022,6 +1022,7 @@ _hidden int libxl__domain_rename(libxl__gc *gc, uint32_t domid,
 
 _hidden int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
  uint32_t size, void *data);
+_hidden int libxl__domain_restore_device_model(libxl__gc *gc, uint32_t domid);
 _hidden int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid);
 
 _hidden const char *libxl__userdata_path(libxl__gc *gc, uint32_t domid,
@@ -1039,6 +1040,7 @@ _hidden int libxl__userdata_store(libxl__gc *gc, uint32_t domid,
   const char *userdata_userid,
   const uint8_t *data, int datalen);
 
+_hidden int libxl__domain_restore(libxl__gc *gc, uint32_t domid);
 _hidden int libxl__domain_resume(libxl__gc *gc, uint32_t domid,
  int suspend_cancel);
 _hidden int libxl__domain_s3_resume(libxl__gc *gc, int domid);
@@ -1651,6 +1653,8 @@ _hidden int libxl__qmp_stop(libxl__gc *gc, int domid);
 _hidden int libxl__qmp_resume(libxl__gc *gc, int domid);
 /* Save current QEMU state into fd. */
 _hidden int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename);
+/* Load current QEMU state from fd. */
+_hidden int libxl__qmp_restore(libxl__gc *gc, int domid, const char *filename);
 /* Set dirty bitmap logging status */
 _hidden int libxl__qmp_set_global_dirty_log(libxl__gc *gc, int domid, bool enable);
 _hidden int libxl__qmp_insert_cdrom(libxl__gc *gc, int domid, const libxl_device_disk *disk);
diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
index 9aa7e2e..a6f1a21 100644
--- a/tools/libxl/libxl_qmp.c
+++ b/tools/libxl/libxl_qmp.c
@@ -892,6 +892,16 @@ int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename)
NULL, NULL);
 }
 
+int libxl__qmp_restore(libxl__gc *gc, int domid, const char *state_file)
+{
+libxl__json_object *args = NULL;
+
+qmp_parameters_add_string(gc, &args, "filename", state_file);
+
+return qmp_run_command(gc, domid, "xen-load-devices-state", args,
+   NULL, NULL);
+}
+
 static int qmp_change(libxl__gc *gc, libxl__qmp_handler *qmp,
   char *device, char *target, char *arg)
 {
-- 
1.9.1




[Xen-devel] [PATCH v2 COLOPre 10/13] tools/libxl: Add back channel to allow migration target send data back

2015-06-07 Thread Yang Hongyang
From: Wen Congyang 

In COLO mode, the slave needs to send data to the master, but io_fd
can only be written by the master and only be read by the slave.
Save recv_fd in domain_suspend_state, and send_fd in
domain_create_state.
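
Why a second descriptor is needed can be shown with two one-way pipes. This is only a sketch under stated assumptions: struct channel_end and make_channels are hypothetical, and a real migration stream runs over whatever transport the toolstack set up rather than local pipes.

```c
#include <unistd.h>

/* Hypothetical pair of descriptors as seen from one end of the migration. */
struct channel_end {
    int send_fd;   /* this end writes here */
    int recv_fd;   /* this end reads here */
};

/*
 * Build two one-way pipes: a forward stream (primary -> secondary) and a
 * back channel (secondary -> primary). Returns 0 on success, -1 on error.
 */
static int make_channels(struct channel_end *primary,
                         struct channel_end *secondary)
{
    int fwd[2], back[2];

    if (pipe(fwd) < 0)
        return -1;
    if (pipe(back) < 0) {
        close(fwd[0]);
        close(fwd[1]);
        return -1;
    }

    primary->send_fd   = fwd[1];   /* primary writes the forward stream */
    secondary->recv_fd = fwd[0];   /* secondary reads it */
    secondary->send_fd = back[1];  /* secondary writes the back channel */
    primary->recv_fd   = back[0];  /* primary reads it */
    return 0;
}
```

The forward descriptor alone cannot carry data from the restore side back to the sender, which is why each side stores the extra descriptor for the reverse direction.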

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
---
 tools/libxl/libxl.c  |  2 +-
 tools/libxl/libxl_create.c   | 14 ++
 tools/libxl/libxl_internal.h |  2 ++
 tools/libxl/libxl_types.idl  |  7 +++
 tools/libxl/xl_cmdimpl.c |  7 +++
 5 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 5c843c2..36b97fe 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -832,7 +832,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
 dss->callback = remus_failover_cb;
 dss->domid = domid;
 dss->fd = send_fd;
-/* TODO do something with recv_fd */
+dss->recv_fd = recv_fd;
 dss->type = type;
 dss->live = 1;
 dss->debug = 0;
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 86384d2..bd8149c 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1577,8 +1577,8 @@ static void domain_create_cb(libxl__egc *egc,
  int rc, uint32_t domid);
 
 static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
-uint32_t *domid,
-int restore_fd, int checkpointed_stream,
+uint32_t *domid, int restore_fd,
+int send_fd, int checkpointed_stream,
 const libxl_asyncop_how *ao_how,
 const libxl_asyncprogress_how *aop_console_how)
 {
@@ -1591,6 +1591,7 @@ static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
 libxl_domain_config_init(&cdcs->dcs.guest_config_saved);
 libxl_domain_config_copy(ctx, &cdcs->dcs.guest_config_saved, d_config);
 cdcs->dcs.restore_fd = restore_fd;
+cdcs->dcs.send_fd = send_fd;
 cdcs->dcs.callback = domain_create_cb;
 cdcs->dcs.checkpointed_stream = checkpointed_stream;
 libxl__ao_progress_gethow(&cdcs->dcs.aop_console_how, aop_console_how);
@@ -1619,7 +1620,7 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
 const libxl_asyncop_how *ao_how,
 const libxl_asyncprogress_how *aop_console_how)
 {
-return do_domain_create(ctx, d_config, domid, -1, 0,
+return do_domain_create(ctx, d_config, domid, -1, -1, 0,
 ao_how, aop_console_how);
 }
 
@@ -1629,7 +1630,12 @@ int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config,
 const libxl_asyncop_how *ao_how,
 const libxl_asyncprogress_how *aop_console_how)
 {
-return do_domain_create(ctx, d_config, domid, restore_fd,
+int send_fd = -1;
+
+if (params->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_COLO)
+send_fd = params->send_fd;
+
+return do_domain_create(ctx, d_config, domid, restore_fd, send_fd,
 params->checkpointed_stream, ao_how, aop_console_how);
 }
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index fbbae93..6d214b5 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2874,6 +2874,7 @@ struct libxl__domain_save_state {
 
 uint32_t domid;
 int fd;
+int recv_fd;
 libxl_domain_type type;
 int live;
 int debug;
@@ -3143,6 +3144,7 @@ struct libxl__domain_create_state {
 libxl_domain_config *guest_config;
 libxl_domain_config guest_config_saved; /* vanilla config */
 int restore_fd;
+int send_fd;
 libxl__domain_create_cb *callback;
 libxl_asyncprogress_how aop_console_how;
 /* private to domain_create */
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 23f27d4..8a3d7ba 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -198,6 +198,12 @@ libxl_viridian_enlightenment = Enumeration("viridian_enlightenment", [
 (3, "reference_tsc"),
 ])
 
+libxl_checkpointed_stream = Enumeration("checkpointed_stream", [
+(0, "NONE"),
+(1, "REMUS"),
+(2, "COLO"),
+], init_val = 0)
+
 #
 # Complex libxl types
 #
@@ -346,6 +352,7 @@ libxl_domain_create_info = Struct("domain_create_info",[
 
 libxl_domain_restore_params = Struct("domain_restore_params", [
 ("checkpointed_stream", integer),
+("send_fd", integer),
 ])
 
 libxl_domain_sched_params = Struct("domain_sched_params",[
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index c858068..adfadd1 100644
--- a/tools/libxl/

[Xen-devel] [PATCH v2 COLOPre 02/13] tools/libxc: support to resume uncooperative HVM guests

2015-06-07 Thread Yang Hongyang
From: Wen Congyang 

For PVHVM, the hypercall return code is 0, and the guest can be resumed
in a new domain context.
Currently we suspend and resume a PVHVM guest like this:
1. suspend it via evtchn
2. modify the return code to 1
3. the guest knows that the suspend is cancelled, and we use the fast path
   to resume it.

Under COLO, we update the guest's state (memory, CPU registers,
device status...). In this case, we cannot use the fast path to resume it.
Keep the return code 0, and use the slow path to resume the guest. We have
updated the guest state, so we call it a new domain context.

For HVM, the hypercall is a NOP.

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
---
 tools/libxc/xc_resume.c | 22 ++
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c
index e67bebd..bd82334 100644
--- a/tools/libxc/xc_resume.c
+++ b/tools/libxc/xc_resume.c
@@ -109,6 +109,23 @@ static int xc_domain_resume_cooperative(xc_interface *xch, uint32_t domid)
 return do_domctl(xch, &domctl);
 }
 
+static int xc_domain_resume_hvm(xc_interface *xch, uint32_t domid)
+{
+DECLARE_DOMCTL;
+
+/*
+ * If it is PVHVM, the hypercall return code is 0, because this
+ * is not a fast path resume, we do not modify_returncode as in
+ * xc_domain_resume_cooperative.
+ * (resuming it in a new domain context)
+ *
+ * If it is a HVM, the hypercall is a NOP.
+ */
+domctl.cmd = XEN_DOMCTL_resumedomain;
+domctl.domain = domid;
+return do_domctl(xch, &domctl);
+}
+
 static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
 {
 DECLARE_DOMCTL;
@@ -138,10 +155,7 @@ static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
  */
 #if defined(__i386__) || defined(__x86_64__)
 if ( info.hvm )
-{
-ERROR("Cannot resume uncooperative HVM guests");
-return rc;
-}
+return xc_domain_resume_hvm(xch, domid);
 
 if ( xc_domain_get_guest_width(xch, domid, &dinfo->guest_width) != 0 )
 {
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 COLOPre 01/13] libxc/restore: fix error handle of process_record

2015-06-07 Thread Yang Hongyang
If process_record() returns RECORD_NOT_PROCESSED for an optional record,
restore still fails. This patch fixes that.
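
A minimal sketch of the intended decision, outside of libxc, might look like this. The numeric values of REC_TYPE_OPTIONAL and RECORD_NOT_PROCESSED below are assumptions for illustration (check the libxc headers for the real ones), and classify_record_result() is a hypothetical helper, not a function in the tree.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define REC_TYPE_OPTIONAL    0x80000000U
#define RECORD_NOT_PROCESSED 1

/*
 * Map the result of handling one record to the caller's decision:
 * returns 0 to keep restoring, -1 to abort.
 */
static int classify_record_result(int rc, uint32_t rec_type)
{
    if (rc == RECORD_NOT_PROCESSED) {
        if (rec_type & REC_TYPE_OPTIONAL)
            return 0;    /* unhandled optional record: ignore, continue */
        return -1;       /* unhandled mandatory record: fatal */
    }
    return rc ? -1 : 0;  /* any other non-zero rc is an error */
}

/* Small usage check covering the four cases. */
static void classify_record_demo(void)
{
    assert(classify_record_result(RECORD_NOT_PROCESSED, 0x80000001U) == 0);
    assert(classify_record_result(RECORD_NOT_PROCESSED, 0x00000001U) == -1);
    assert(classify_record_result(0, 0x00000001U) == 0);
    assert(classify_record_result(-1, 0x80000001U) == -1);
}
```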

Signed-off-by: Yang Hongyang 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
---
 tools/libxc/xc_sr_restore.c | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 9e27dba..2d2edd3 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -560,19 +560,6 @@ static int process_record(struct xc_sr_context *ctx, struct xc_sr_record *rec)
 free(rec->data);
 rec->data = NULL;
 
-if ( rc == RECORD_NOT_PROCESSED )
-{
-if ( rec->type & REC_TYPE_OPTIONAL )
-DPRINTF("Ignoring optional record %#x (%s)",
-rec->type, rec_type_to_str(rec->type));
-else
-{
-ERROR("Mandatory record %#x (%s) not handled",
-  rec->type, rec_type_to_str(rec->type));
-rc = -1;
-}
-}
-
 return rc;
 }
 
@@ -678,7 +665,20 @@ static int restore(struct xc_sr_context *ctx)
 else
 {
 rc = process_record(ctx, &rec);
-if ( rc )
+if ( rc == RECORD_NOT_PROCESSED )
+{
+if ( rec.type & REC_TYPE_OPTIONAL )
+DPRINTF("Ignoring optional record %#x (%s)",
+rec.type, rec_type_to_str(rec.type));
+else
+{
+ERROR("Mandatory record %#x (%s) not handled",
+  rec.type, rec_type_to_str(rec.type));
+rc = -1;
+goto err;
+}
+}
+else if ( rc )
 goto err;
 }
 
-- 
1.9.1




[Xen-devel] [PATCH v2 COLOPre 08/13] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty()

2015-06-07 Thread Yang Hongyang
When the secondary VM is running in COLO mode, we need to send the
secondary VM's dirty page information to the primary at each checkpoint,
so we have to enable qemu logdirty on the secondary.

libxl__domain_suspend_common_switch_qemu_logdirty() enables qemu
logdirty, but it uses domain_save_state and calls
libxl__xc_domain_saverestore_async_callback_done() before it exits,
so it cannot be used for the secondary VM.

Update libxl__domain_suspend_common_switch_qemu_logdirty() to
introduce a new API, libxl__domain_common_switch_qemu_logdirty().
This API uses only libxl__logdirty_switch, and calls lds->callback
before it exits.
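
The shape of this refactoring can be sketched in isolation as below. This is a simplified model, not libxl code: struct logdirty_switch, struct save_state and the function names are hypothetical stand-ins, and container_of is a local macro written in the usual offsetof style. The point is that the common helper only sees the embedded switch object, while the save path recovers its own state from it in the callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the libxl structures. */
struct logdirty_switch;
typedef void logdirty_cb(struct logdirty_switch *lds, int rc);

struct logdirty_switch {
    logdirty_cb *callback;   /* invoked when the switch completes */
};

struct save_state {
    int last_rc;                      /* result recorded for the save path */
    struct logdirty_switch logdirty;  /* embedded, so the owner is recoverable */
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Save-path completion: recover the owning save_state from its member. */
static void save_logdirty_done(struct logdirty_switch *lds, int rc)
{
    struct save_state *ss = container_of(lds, struct save_state, logdirty);

    ss->last_rc = rc;
}

/* Common helper: knows only about logdirty_switch, never about save_state. */
static void common_switch_logdirty(struct logdirty_switch *lds, int enable_ok)
{
    assert(lds->callback != NULL);
    lds->callback(lds, enable_ok ? 0 : -1);
}
```

A restore-side caller would embed its own struct logdirty_switch and install a different callback, without touching the common helper.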

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
---
 tools/libxl/libxl_dom_save.c | 78 ++--
 tools/libxl/libxl_internal.h |  8 +
 2 files changed, 54 insertions(+), 32 deletions(-)

diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index f9627f8..c15e9f1 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -44,7 +44,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
 const char *watch_path, const char *event_path);
 static void switch_logdirty_done(libxl__egc *egc,
- libxl__domain_save_state *dss, int ok);
+ libxl__logdirty_switch *lds, int ok);
 
 static void logdirty_init(libxl__logdirty_switch *lds)
 {
@@ -54,13 +54,10 @@ static void logdirty_init(libxl__logdirty_switch *lds)
 }
 
 static void domain_suspend_switch_qemu_xen_traditional_logdirty
-   (int domid, unsigned enable,
-libxl__save_helper_state *shs)
+   (libxl__egc *egc, int domid, unsigned enable,
+libxl__logdirty_switch *lds)
 {
-libxl__egc *egc = shs->egc;
-libxl__domain_save_state *dss = CONTAINER_OF(shs, *dss, shs);
-libxl__logdirty_switch *lds = &dss->logdirty;
-STATE_AO_GC(dss->ao);
+STATE_AO_GC(lds->ao);
 int rc;
 xs_transaction_t t = 0;
 const char *got;
@@ -122,64 +119,81 @@ static void domain_suspend_switch_qemu_xen_traditional_logdirty
  out:
 LOG(ERROR,"logdirty switch failed (rc=%d), aborting suspend",rc);
 libxl__xs_transaction_abort(gc, &t);
-switch_logdirty_done(egc,dss,-1);
+switch_logdirty_done(egc,lds,-1);
 }
 
 static void domain_suspend_switch_qemu_xen_logdirty
-   (int domid, unsigned enable,
-libxl__save_helper_state *shs)
+   (libxl__egc *egc, int domid, unsigned enable,
+libxl__logdirty_switch *lds)
 {
-libxl__egc *egc = shs->egc;
-libxl__domain_save_state *dss = CONTAINER_OF(shs, *dss, shs);
-STATE_AO_GC(dss->ao);
+STATE_AO_GC(lds->ao);
 int rc;
 
 rc = libxl__qmp_set_global_dirty_log(gc, domid, enable);
 if (!rc) {
-libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+lds->callback(egc, lds, 0);
 } else {
 LOG(ERROR,"logdirty switch failed (rc=%d), aborting suspend",rc);
-libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
+lds->callback(egc, lds, -1);
 }
 }
 
+static void libxl__domain_suspend_switch_qemu_logdirty_done
+(libxl__egc *egc, libxl__logdirty_switch *lds, int rc)
+{
+libxl__domain_save_state *dss = CONTAINER_OF(lds, *dss, logdirty);
+
+libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, rc);
+}
+
 void libxl__domain_suspend_common_switch_qemu_logdirty
(int domid, unsigned enable, void *user)
 {
 libxl__save_helper_state *shs = user;
 libxl__egc *egc = shs->egc;
 libxl__domain_save_state *dss = CONTAINER_OF(shs, *dss, shs);
-STATE_AO_GC(dss->ao);
+
+/* convenience aliases */
+libxl__logdirty_switch *const lds = &dss->logdirty;
+
+lds->callback = libxl__domain_suspend_switch_qemu_logdirty_done;
+libxl__domain_common_switch_qemu_logdirty(egc, domid, enable, lds);
+}
+
+void libxl__domain_common_switch_qemu_logdirty(libxl__egc *egc,
+   int domid, unsigned enable,
+   libxl__logdirty_switch *lds)
+{
+STATE_AO_GC(lds->ao);
 
 switch (libxl__device_model_version_running(gc, domid)) {
 case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
-domain_suspend_switch_qemu_xen_traditional_logdirty(domid, enable, shs);
+domain_suspend_switch_qemu_xen_traditional_logdirty(egc, domid, enable,
+lds);
 break;
 case LIBXL_DEVICE_MO

[Xen-devel] [PATCH v2 COLOPre 11/13] tools/libxl: rename remus device to checkpoint device

2015-06-07 Thread Yang Hongyang
This patch is auto generated by the following commands:
 1. git mv tools/libxl/libxl_remus_device.c tools/libxl/libxl_checkpoint_device.c
 2. perl -pi -e 's/libxl_remus_device/libxl_checkpoint_device/g' tools/libxl/Makefile
 3. perl -pi -e 's/\blibxl__remus_devices/libxl__checkpoint_devices/g' tools/libxl/*.[ch]
 4. perl -pi -e 's/\blibxl__remus_device\b/libxl__checkpoint_device/g' tools/libxl/*.[ch]
 5. perl -pi -e 's/\blibxl__remus_device_instance_ops\b/libxl__checkpoint_device_instance_ops/g' tools/libxl/*.[ch]
 6. perl -pi -e 's/\blibxl__remus_callback\b/libxl__checkpoint_callback/g' tools/libxl/*.[ch]
 7. perl -pi -e 's/\bremus_device_init\b/checkpoint_device_init/g' tools/libxl/*.[ch]
 8. perl -pi -e 's/\bremus_devices_setup\b/checkpoint_devices_setup/g' tools/libxl/*.[ch]
 9. perl -pi -e 's/\bdefine_remus_checkpoint_api\b/define_checkpoint_api/g' tools/libxl/*.[ch]
10. perl -pi -e 's/\brds\b/cds/g' tools/libxl/*.[ch]
11. perl -pi -e 's/REMUS_DEVICE/CHECKPOINT_DEVICE/g' tools/libxl/*.[ch] tools/libxl/*.idl
12. perl -pi -e 's/REMUS_DEVOPS/CHECKPOINT_DEVOPS/g' tools/libxl/*.[ch] tools/libxl/*.idl
13. perl -pi -e 's/\bremus\b/checkpoint/g' tools/libxl/libxl_checkpoint_device.[ch]
14. perl -pi -e 's/\bremus device/checkpoint device/g' tools/libxl/libxl_internal.h
15. perl -pi -e 's/\bRemus device/checkpoint device/g' tools/libxl/libxl_internal.h
16. perl -pi -e 's/\bremus abstract/checkpoint abstract/g' tools/libxl/libxl_internal.h
17. perl -pi -e 's/\bremus invocation/checkpoint invocation/g' tools/libxl/libxl_internal.h
18. perl -pi -e 's/\blibxl__remus_device_\(/libxl__checkpoint_device_(/g' tools/libxl/libxl_internal.h

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
---
 tools/libxl/Makefile  |   2 +-
 tools/libxl/libxl_checkpoint_device.c | 327 ++
 tools/libxl/libxl_internal.h  | 112 ++--
 tools/libxl/libxl_netbuffer.c | 108 +--
 tools/libxl/libxl_nonetbuffer.c   |  10 +-
 tools/libxl/libxl_remus.c |  76 
 tools/libxl/libxl_remus_device.c  | 327 --
 tools/libxl/libxl_remus_disk_drbd.c   |  52 +++---
 tools/libxl/libxl_types.idl   |   4 +-
 9 files changed, 509 insertions(+), 509 deletions(-)
 create mode 100644 tools/libxl/libxl_checkpoint_device.c
 delete mode 100644 tools/libxl/libxl_remus_device.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index df51b22..cd63dac 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -56,7 +56,7 @@ else
 LIBXL_OBJS-y += libxl_nonetbuffer.o
 endif
 
-LIBXL_OBJS-y += libxl_remus.o libxl_remus_device.o libxl_remus_disk_drbd.o
+LIBXL_OBJS-y += libxl_remus.o libxl_checkpoint_device.o libxl_remus_disk_drbd.o
 
 LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
 LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl_checkpoint_device.c b/tools/libxl/libxl_checkpoint_device.c
new file mode 100644
index 000..109cd23
--- /dev/null
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -0,0 +1,327 @@
+/*
+ * Copyright (C) 2014 FUJITSU LIMITED
+ * Author: Yang Hongyang 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+extern const libxl__checkpoint_device_instance_ops remus_device_nic;
+extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
+static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
+&remus_device_nic,
+&remus_device_drbd_disk,
+NULL,
+};
+
+/*- helper functions -*/
+
+static int init_device_subkind(libxl__checkpoint_devices_state *cds)
+{
+/* init device subkind-specific state in the libxl ctx */
+int rc;
+STATE_AO_GC(cds->ao);
+
+if (libxl__netbuffer_enabled(gc)) {
+rc = init_subkind_nic(cds);
+if (rc) goto out;
+}
+
+rc = init_subkind_drbd_disk(cds);
+if (rc) goto out;
+
+rc = 0;
+out:
+return rc;
+}
+
+static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
+{
+/* cleanup device subkind-specific state in the libxl ctx */
+STATE_AO_GC(cds->ao);
+
+if (libxl__netbuffe

[Xen-devel] [PATCH v2 COLOPre 12/13] tools/libxl: adjust the indentation

2015-06-07 Thread Yang Hongyang
This is just tidying up after the previous automatic renaming.

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
---
 tools/libxl/libxl_checkpoint_device.c | 21 +++--
 tools/libxl/libxl_internal.h  | 19 +++
 2 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/tools/libxl/libxl_checkpoint_device.c b/tools/libxl/libxl_checkpoint_device.c
index 109cd23..226f159 100644
--- a/tools/libxl/libxl_checkpoint_device.c
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -73,9 +73,9 @@ static void devices_teardown_cb(libxl__egc *egc,
 /* checkpoint device setup and teardown */
 
 static libxl__checkpoint_device* checkpoint_device_init(libxl__egc *egc,
-  libxl__checkpoint_devices_state *cds,
-  libxl__device_kind kind,
-  void *libxl_dev)
+libxl__checkpoint_devices_state *cds,
+libxl__device_kind kind,
+void *libxl_dev)
 {
 libxl__checkpoint_device *dev = NULL;
 
@@ -89,9 +89,10 @@ static libxl__checkpoint_device* checkpoint_device_init(libxl__egc *egc,
 }
 
 static void checkpoint_devices_setup(libxl__egc *egc,
-libxl__checkpoint_devices_state *cds);
+ libxl__checkpoint_devices_state *cds);
 
-void libxl__checkpoint_devices_setup(libxl__egc *egc, libxl__checkpoint_devices_state *cds)
+void libxl__checkpoint_devices_setup(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds)
 {
 int i, rc;
 
@@ -137,7 +138,7 @@ out:
 }
 
 static void checkpoint_devices_setup(libxl__egc *egc,
-libxl__checkpoint_devices_state *cds)
+ libxl__checkpoint_devices_state *cds)
 {
 int i, rc;
 
@@ -285,12 +286,12 @@ static void devices_checkpoint_cb(libxl__egc *egc,
 
 /* API implementations */
 
-#define define_checkpoint_api(api)\
-void libxl__checkpoint_devices_##api(libxl__egc *egc,\
-libxl__checkpoint_devices_state *cds)\
+#define define_checkpoint_api(api)  \
+void libxl__checkpoint_devices_##api(libxl__egc *egc,   \
+libxl__checkpoint_devices_state *cds)   \
 {   \
 int i;  \
-libxl__checkpoint_device *dev;   \
+libxl__checkpoint_device *dev;  \
 \
 STATE_AO_GC(cds->ao);   \
 \
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 5399601..3a1360b 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2673,7 +2673,8 @@ typedef struct libxl__save_helper_state {
  * Each device type needs to implement the interfaces specified in
  * the libxl__checkpoint_device_instance_ops if it wishes to support Remus.
  *
- * The high-level control flow through the checkpoint device layer is shown below:
+ * The high-level control flow through the checkpoint device layer is shown
+ * below:
  *
  * xl remus
  *  |->  libxl_domain_remus_start
@@ -2734,7 +2735,8 @@ int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
 void cleanup_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
 
 typedef void libxl__checkpoint_callback(libxl__egc *,
-   libxl__checkpoint_devices_state *, int rc);
+libxl__checkpoint_devices_state *,
+int rc);
 
 /*
  * State associated with a checkpoint invocation, including parameters
@@ -2742,7 +2744,7 @@ typedef void libxl__checkpoint_callback(libxl__egc *,
  * save/restore machinery.
  */
 struct libxl__checkpoint_devices_state {
-/* must be set by caller of libxl__checkpoint_device_(setup|teardown) */
+/*-- must be set by caller of libxl__checkpoint_device_(setup|teardown) --*/
 
 libxl__ao *ao;
 uint32_t domid;
@@ -2755,7 +2757,8 @@ struct libxl__checkpoint_devices_state {
 /*
  * this array is allocated before setup the checkpoint devices by the
  * checkpoint abstract layer.
- * devs may be NULL, means there's no checkpoint devices that has been set up.
+ * devs may be NULL, means there's no checkpoint devices that has been
+ * set up.
  * the size of this array is 'num_devices', which is the tot

[Xen-devel] [PATCH v2 COLOPre 09/13] tools/libxl: Update libxl_save_msgs_gen.pl to support return data from xl to xc

2015-06-07 Thread Yang Hongyang
From: Wen Congyang 

Currently, all callbacks return an integer value or void, so we cannot
return data to xc via a callback. Update libxl_save_msgs_gen.pl
to support this case.
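
The wire format used for such a reply is a 64-bit length followed by the payload. A stand-alone sketch of that framing over a plain file descriptor could look like the following; the helper names (write_all, read_all, send_reply_data, recv_reply_data) are hypothetical and only mirror the idea, they are not the libxl functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Write exactly len bytes; returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes; returns 0 on success, -1 on error or early EOF. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);

        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Sender side: 64-bit length first, then the payload itself. */
static int send_reply_data(int fd, const void *data, uint64_t size)
{
    if (write_all(fd, &size, sizeof(size)))
        return -1;
    return write_all(fd, data, (size_t)size);
}

/* Receiver side: returns a malloc'd buffer (caller frees), NULL on failure. */
static uint8_t *recv_reply_data(int fd, uint64_t *size_out)
{
    uint64_t size;
    uint8_t *data;

    assert(size_out != NULL);
    if (read_all(fd, &size, sizeof(size)))
        return NULL;
    data = malloc(size ? (size_t)size : 1);
    if (!data)
        return NULL;
    if (read_all(fd, data, (size_t)size)) {
        free(data);
        return NULL;
    }
    *size_out = size;
    return data;
}
```

Both ends must agree on the width and byte order of the length field; that is fine for a pipe between two processes on the same host, which is the situation here.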

Signed-off-by: Wen Congyang 
---
 tools/libxl/libxl_internal.h   |  3 ++
 tools/libxl/libxl_save_callout.c   | 31 ++
 tools/libxl/libxl_save_helper.c| 17 ++
 tools/libxl/libxl_save_msgs_gen.pl | 65 ++
 4 files changed, 109 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 0b62107..fbbae93 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3180,6 +3180,9 @@ _hidden void libxl__xc_domain_save_done(libxl__egc*, void *dss_void,
  * When they are ready to indicate completion, they call this. */
 void libxl__xc_domain_saverestore_async_callback_done(libxl__egc *egc,
libxl__save_helper_state *shs, int return_value);
+void libxl__xc_domain_saverestore_async_callback_done_with_data(libxl__egc *egc,
+   libxl__save_helper_state *shs,
+   const void *data, uint64_t size);
 
 
 _hidden void libxl__domain_suspend_common_switch_qemu_logdirty
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index cd342b9..5c691eb 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -145,6 +145,15 @@ void libxl__xc_domain_saverestore_async_callback_done(libxl__egc *egc,
 shs->egc = 0;
 }
 
+void libxl__xc_domain_saverestore_async_callback_done_with_data(libxl__egc *egc,
+   libxl__save_helper_state *shs,
+   const void *data, uint64_t size)
+{
+shs->egc = egc;
+libxl__srm_callout_sendreply_data(data, size, shs);
+shs->egc = 0;
+}
+
 /*- helper execution -*/
 
 static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
@@ -370,6 +379,28 @@ void libxl__srm_callout_sendreply(int r, void *user)
 helper_failed(egc, shs, ERROR_FAIL);
 }
 
+void libxl__srm_callout_sendreply_data(const void *data, uint64_t size, void *user)
+{
+libxl__save_helper_state *shs = user;
+libxl__egc *egc = shs->egc;
+STATE_AO_GC(shs->ao);
+int errnoval;
+
+errnoval = libxl_write_exactly(CTX, libxl__carefd_fd(shs->pipes[0]),
+   &size, sizeof(size), shs->stdin_what,
+   "callback return data length");
+if (errnoval)
+goto out;
+
+errnoval = libxl_write_exactly(CTX, libxl__carefd_fd(shs->pipes[0]),
+   data, size, shs->stdin_what,
+   "callback return data");
+
+out:
+if (errnoval)
+helper_failed(egc, shs, ERROR_FAIL);
+}
+
 void libxl__srm_callout_callback_log(uint32_t level, uint32_t errnoval,
   const char *context, const char *formatted, void *user)
 {
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 74826a1..44c5807 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -155,6 +155,23 @@ int helper_getreply(void *user)
 return v;
 }
 
+uint8_t *helper_getreply_data(void *user)
+{
+uint64_t size;
+int r = read_exactly(0, &size, sizeof(size));
+uint8_t *data;
+
+if (r <= 0)
+exit(-2);
+
+data = helper_allocbuf(size, user);
+r = read_exactly(0, data, size);
+if (r <= 0)
+exit(-2);
+
+return data;
+}
+
 /*- other callbacks -*/
 
 static int toolstack_save_fd;
diff --git a/tools/libxl/libxl_save_msgs_gen.pl b/tools/libxl/libxl_save_msgs_gen.pl
index 6b4b65e..41ee000 100755
--- a/tools/libxl/libxl_save_msgs_gen.pl
+++ b/tools/libxl/libxl_save_msgs_gen.pl
@@ -15,6 +15,7 @@ our @msgs = (
 # and its null-ness needs to be passed through to the helper's xc
 #   W  - needs a return value; callback is synchronous
 #   A  - needs a return value; callback is asynchronous
+#   B  - return value is an pointer
 [  1, 'sr', "log",   [qw(uint32_t level
  uint32_t errnoval
  STRING context
@@ -99,23 +100,28 @@ our $libxl = "libxl__srm";
 our $callback = "${libxl}_callout_callback";
 our $receiveds = "${libxl}_callout_received";
 our $sendreply = "${libxl}_callout_sendreply";
+our $sendreply_data = "${libxl}_callout_sendreply_data";
 our $getcallbacks = "${libxl}_callout_get_callbacks";
 our $enumcallbacks = "${libxl}_callout_enumcallbacks";
 sub cbtype ($) { "${libxl}_".$_[0]."_autogen_callbacks"; };
 
 f_decl($sendreply, 'callout', 'void', "(int r, void *user)");
+f_decl($sendreply_data, 'callout', 'void',
+   "(const void *data, uint64_t size, void *user)");
 
 our $helper = "helper";
 our $encode = "${helper}_stub";
 our $allocbuf = "${helper}_allocbuf";
 

[Xen-devel] [PATCH v2 COLOPre 04/13] tools/libxc: export xc_bitops.h

2015-06-07 Thread Yang Hongyang
When we are under COLO, we will send dirty page bitmap info from
secondary to primary at every checkpoint. So we need to get/test
the dirty page bitmap. We just expose xc_bitops.h for libxl use.

NOTE:
  Need to make clean and rerun configure to get it compiled.
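
To illustrate the kind of use libxl would make of these helpers, here is a self-contained sketch of merging a secondary's dirty bitmap into the primary's. It deliberately does not include xc_bitops.h; the dirty_* helpers below are hypothetical re-implementations with different names, written only so that the example compiles on its own.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned longs needed to hold nr_bits bits. */
static size_t dirty_longs(size_t nr_bits)
{
    return (nr_bits + BITS_PER_LONG - 1) / BITS_PER_LONG;
}

/* Allocate a zeroed bitmap; caller frees. */
static unsigned long *dirty_alloc(size_t nr_bits)
{
    return calloc(dirty_longs(nr_bits), sizeof(unsigned long));
}

static void dirty_set(unsigned long *bm, size_t nr)
{
    assert(bm != NULL);
    bm[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
}

static int dirty_test(const unsigned long *bm, size_t nr)
{
    return (int)((bm[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL);
}

/* OR the secondary's dirty pages into the primary's bitmap. */
static void dirty_merge(unsigned long *primary,
                        const unsigned long *secondary, size_t nr_bits)
{
    size_t i, n = dirty_longs(nr_bits);

    for (i = 0; i < n; i++)
        primary[i] |= secondary[i];
}

/* Count the pages that would have to be resent at this checkpoint. */
static size_t dirty_count(const unsigned long *bm, size_t nr_bits)
{
    size_t i, count = 0;

    for (i = 0; i < nr_bits; i++)
        count += (size_t)dirty_test(bm, i);
    return count;
}
```

A page dirtied on either side ends up set in the merged bitmap, so a page dirtied on both sides is still counted once.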

Signed-off-by: Yang Hongyang 
---
 tools/libxc/include/xc_bitops.h | 76 +
 tools/libxc/xc_bitops.h | 76 -
 2 files changed, 76 insertions(+), 76 deletions(-)
 create mode 100644 tools/libxc/include/xc_bitops.h
 delete mode 100644 tools/libxc/xc_bitops.h

diff --git a/tools/libxc/include/xc_bitops.h b/tools/libxc/include/xc_bitops.h
new file mode 100644
index 000..cd749f4
--- /dev/null
+++ b/tools/libxc/include/xc_bitops.h
@@ -0,0 +1,76 @@
+#ifndef XC_BITOPS_H
+#define XC_BITOPS_H 1
+
+/* bitmap operations for single threaded access */
+
+#include 
+#include 
+
+#define BITS_PER_LONG (sizeof(unsigned long) * 8)
+#define ORDER_LONG (sizeof(unsigned long) == 4 ? 5 : 6)
+
+#define BITMAP_ENTRY(_nr,_bmap) ((_bmap))[(_nr)/BITS_PER_LONG]
+#define BITMAP_SHIFT(_nr) ((_nr) % BITS_PER_LONG)
+
+/* calculate required space for number of longs needed to hold nr_bits */
+static inline int bitmap_size(int nr_bits)
+{
+int nr_long, nr_bytes;
+nr_long = (nr_bits + BITS_PER_LONG - 1) >> ORDER_LONG;
+nr_bytes = nr_long * sizeof(unsigned long);
+return nr_bytes;
+}
+
+static inline unsigned long *bitmap_alloc(int nr_bits)
+{
+return calloc(1, bitmap_size(nr_bits));
+}
+
+static inline void bitmap_set(unsigned long *addr, int nr_bits)
+{
+memset(addr, 0xff, bitmap_size(nr_bits));
+}
+
+static inline void bitmap_clear(unsigned long *addr, int nr_bits)
+{
+memset(addr, 0, bitmap_size(nr_bits));
+}
+
+static inline int test_bit(int nr, unsigned long *addr)
+{
+return (BITMAP_ENTRY(nr, addr) >> BITMAP_SHIFT(nr)) & 1;
+}
+
+static inline void clear_bit(int nr, unsigned long *addr)
+{
+BITMAP_ENTRY(nr, addr) &= ~(1UL << BITMAP_SHIFT(nr));
+}
+
+static inline void set_bit(int nr, unsigned long *addr)
+{
+BITMAP_ENTRY(nr, addr) |= (1UL << BITMAP_SHIFT(nr));
+}
+
+static inline int test_and_clear_bit(int nr, unsigned long *addr)
+{
+int oldbit = test_bit(nr, addr);
+clear_bit(nr, addr);
+return oldbit;
+}
+
+static inline int test_and_set_bit(int nr, unsigned long *addr)
+{
+int oldbit = test_bit(nr, addr);
+set_bit(nr, addr);
+return oldbit;
+}
+
+static inline void bitmap_or(unsigned long *dst, const unsigned long *other,
+ int nr_bits)
+{
+int i, nr_longs = (bitmap_size(nr_bits) / sizeof(unsigned long));
+for ( i = 0; i < nr_longs; ++i )
+dst[i] |= other[i];
+}
+
+#endif  /* XC_BITOPS_H */
diff --git a/tools/libxc/xc_bitops.h b/tools/libxc/xc_bitops.h
deleted file mode 100644
index cd749f4..000
--- a/tools/libxc/xc_bitops.h
+++ /dev/null
@@ -1,76 +0,0 @@
-#ifndef XC_BITOPS_H
-#define XC_BITOPS_H 1
-
-/* bitmap operations for single threaded access */
-
-#include 
-#include 
-
-#define BITS_PER_LONG (sizeof(unsigned long) * 8)
-#define ORDER_LONG (sizeof(unsigned long) == 4 ? 5 : 6)
-
-#define BITMAP_ENTRY(_nr,_bmap) ((_bmap))[(_nr)/BITS_PER_LONG]
-#define BITMAP_SHIFT(_nr) ((_nr) % BITS_PER_LONG)
-
-/* calculate required space for number of longs needed to hold nr_bits */
-static inline int bitmap_size(int nr_bits)
-{
-int nr_long, nr_bytes;
-nr_long = (nr_bits + BITS_PER_LONG - 1) >> ORDER_LONG;
-nr_bytes = nr_long * sizeof(unsigned long);
-return nr_bytes;
-}
-
-static inline unsigned long *bitmap_alloc(int nr_bits)
-{
-return calloc(1, bitmap_size(nr_bits));
-}
-
-static inline void bitmap_set(unsigned long *addr, int nr_bits)
-{
-memset(addr, 0xff, bitmap_size(nr_bits));
-}
-
-static inline void bitmap_clear(unsigned long *addr, int nr_bits)
-{
-memset(addr, 0, bitmap_size(nr_bits));
-}
-
-static inline int test_bit(int nr, unsigned long *addr)
-{
-return (BITMAP_ENTRY(nr, addr) >> BITMAP_SHIFT(nr)) & 1;
-}
-
-static inline void clear_bit(int nr, unsigned long *addr)
-{
-BITMAP_ENTRY(nr, addr) &= ~(1UL << BITMAP_SHIFT(nr));
-}
-
-static inline void set_bit(int nr, unsigned long *addr)
-{
-BITMAP_ENTRY(nr, addr) |= (1UL << BITMAP_SHIFT(nr));
-}
-
-static inline int test_and_clear_bit(int nr, unsigned long *addr)
-{
-int oldbit = test_bit(nr, addr);
-clear_bit(nr, addr);
-return oldbit;
-}
-
-static inline int test_and_set_bit(int nr, unsigned long *addr)
-{
-int oldbit = test_bit(nr, addr);
-set_bit(nr, addr);
-return oldbit;
-}
-
-static inline void bitmap_or(unsigned long *dst, const unsigned long *other,
- int nr_bits)
-{
-int i, nr_longs = (bitmap_size(nr_bits) / sizeof(unsigned long));
-for ( i = 0; i < nr_longs; ++i )
-dst[i] |= other[i];
-}
-
-#endif  /* XC_BITOPS_H */
-- 
1.9.1



[Xen-devel] [PATCH v2 COLOPre 13/13] tools/libxl: don't touch remus in checkpoint_device

2015-06-07 Thread Yang Hongyang
The checkpoint device is an abstraction layer for doing checkpoints.
COLO can also use it to do checkpoints. But there is still some code
in the checkpoint device layer which touches Remus:
1. remus_ops: we use the Remus ops directly in the checkpoint
   device layer. Store them in the checkpoint device state instead.
2. concrete layer's private members: add a new structure, remus
   state, and move them into it.
3. init/cleanup device subkind: we call (init|cleanup)_subkind_nic
   and (init|cleanup)_subkind_drbd_disk directly in the checkpoint
   device layer. Call them before calling
   libxl__checkpoint_devices_setup() or after calling
   libxl__checkpoint_devices_teardown() instead.
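
Point 1 above amounts to passing the ops table in through the state object rather than naming a global in the generic layer. A minimal, self-contained sketch of that pattern follows; struct device_ops, struct checkpoint_state, find_ops() and the numeric device kinds are all hypothetical simplifications, not the libxl definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified per-backend operations. */
struct device_ops {
    const char *name;
    int (*match)(int device_kind);  /* non-zero if this backend handles it */
};

/* Generic layer state: the concrete layer fills in 'ops' before setup. */
struct checkpoint_state {
    const struct device_ops *const *ops;  /* NULL-terminated table */
};

static int match_nic(int kind)  { return kind == 1; }
static int match_disk(int kind) { return kind == 2; }

static const struct device_ops nic_ops  = { "nic",  match_nic };
static const struct device_ops disk_ops = { "disk", match_disk };

/* One table per concrete layer; the generic layer never names it. */
static const struct device_ops *const remus_like_ops[] = {
    &nic_ops,
    &disk_ops,
    NULL,
};

/* Generic lookup: walks whatever table the caller stored in the state. */
static const struct device_ops *find_ops(const struct checkpoint_state *cs,
                                         int kind)
{
    size_t i;

    assert(cs->ops != NULL);
    for (i = 0; cs->ops[i] != NULL; i++)
        if (cs->ops[i]->match(kind))
            return cs->ops[i];
    return NULL;
}
```

A second concrete layer (COLO, in this series) would supply its own table through the same field, and find_ops() would not change.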

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
---
 tools/libxl/libxl.c   |  2 +-
 tools/libxl/libxl_checkpoint_device.c | 52 ++---
 tools/libxl/libxl_dom_save.c  |  3 +-
 tools/libxl/libxl_internal.h  | 40 ++--
 tools/libxl/libxl_netbuffer.c | 51 +++-
 tools/libxl/libxl_remus.c | 88 ---
 tools/libxl/libxl_remus_disk_drbd.c   |  8 ++--
 7 files changed, 135 insertions(+), 109 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 36b97fe..10d3d82 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -841,7 +841,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
 assert(info);
 
 /* Point of no return */
-libxl__remus_setup(egc, dss);
+libxl__remus_setup(egc, &dss->rs);
 return AO_INPROGRESS;
 
  out:
diff --git a/tools/libxl/libxl_checkpoint_device.c b/tools/libxl/libxl_checkpoint_device.c
index 226f159..0a16dbb 100644
--- a/tools/libxl/libxl_checkpoint_device.c
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -17,46 +17,6 @@
 
 #include "libxl_internal.h"
 
-extern const libxl__checkpoint_device_instance_ops remus_device_nic;
-extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
-static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
-&remus_device_nic,
-&remus_device_drbd_disk,
-NULL,
-};
-
-/*- helper functions -*/
-
-static int init_device_subkind(libxl__checkpoint_devices_state *cds)
-{
-/* init device subkind-specific state in the libxl ctx */
-int rc;
-STATE_AO_GC(cds->ao);
-
-if (libxl__netbuffer_enabled(gc)) {
-rc = init_subkind_nic(cds);
-if (rc) goto out;
-}
-
-rc = init_subkind_drbd_disk(cds);
-if (rc) goto out;
-
-rc = 0;
-out:
-return rc;
-}
-
-static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
-{
-/* cleanup device subkind-specific state in the libxl ctx */
-STATE_AO_GC(cds->ao);
-
-if (libxl__netbuffer_enabled(gc))
-cleanup_subkind_nic(cds);
-
-cleanup_subkind_drbd_disk(cds);
-}
-
 /*- setup() and teardown() -*/
 
 /* callbacks */
@@ -94,14 +54,10 @@ static void checkpoint_devices_setup(libxl__egc *egc,
 void libxl__checkpoint_devices_setup(libxl__egc *egc,
  libxl__checkpoint_devices_state *cds)
 {
-int i, rc;
+int i;
 
 STATE_AO_GC(cds->ao);
 
-rc = init_device_subkind(cds);
-if (rc)
-goto out;
-
 cds->num_devices = 0;
 cds->num_nics = 0;
 cds->num_disks = 0;
@@ -134,7 +90,7 @@ void libxl__checkpoint_devices_setup(libxl__egc *egc,
 return;
 
 out:
-cds->callback(egc, cds, rc);
+cds->callback(egc, cds, 0);
 }
 
 static void checkpoint_devices_setup(libxl__egc *egc,
@@ -172,7 +128,7 @@ static void device_setup_iterate(libxl__egc *egc, libxl__ao_device *aodev)
 goto out;
 
 do {
-dev->ops = remus_ops[++dev->ops_index];
+dev->ops = dev->cds->ops[++dev->ops_index];
 if (!dev->ops) {
 libxl_device_nic * nic = NULL;
 libxl_device_disk * disk = NULL;
@@ -271,8 +227,6 @@ static void devices_teardown_cb(libxl__egc *egc,
 cds->disks = NULL;
 cds->num_disks = 0;
 
-cleanup_device_subkind(cds);
-
 cds->callback(egc, cds, rc);
 }
 
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index c15e9f1..cb3d8db 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -402,7 +402,6 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
 dsps->dm_savefile = libxl__device_model_savefile(gc, domid);
 
 if (r_info != NULL) {
-dss->interval = r_info->interval;
 dss->xcflags |= XCFLAGS_CHECKPOINTED;
 if (libxl_defbool_val(r_info->compression))
 dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
@@ -601,7 +600,7 @@ static void domain_save_done(libxl__egc *egc,
  * from sending checkpoints. Teardown the network buffers and
  * release netlink resources.  This is an async op.
  */
-libxl__remus_teardown(egc, dss, rc);
+libxl__remus_teardown(egc

[Xen-devel] [PATCH v6 COLO 11/15] COLO proxy: preresume, postresume and checkpoint

2015-06-07 Thread Yang Hongyang
preresume, postresume and checkpoint

Signed-off-by: Yang Hongyang 
---
 tools/libxl/libxl_colo.h   |  3 +++
 tools/libxl/libxl_colo_proxy.c | 57 ++
 2 files changed, 60 insertions(+)

diff --git a/tools/libxl/libxl_colo.h b/tools/libxl/libxl_colo.h
index 5983aa0..872c652 100644
--- a/tools/libxl/libxl_colo.h
+++ b/tools/libxl/libxl_colo.h
@@ -47,4 +47,7 @@ extern void libxl__colo_save_teardown(libxl__egc *egc,
 
 extern int colo_proxy_setup(libxl__colo_proxy_state *cps);
 extern void colo_proxy_teardown(libxl__colo_proxy_state *cps);
+extern void colo_proxy_preresume(libxl__colo_proxy_state *cps);
+extern void colo_proxy_postresume(libxl__colo_proxy_state *cps);
+extern int colo_proxy_checkpoint(libxl__colo_proxy_state *cps);
 #endif
diff --git a/tools/libxl/libxl_colo_proxy.c b/tools/libxl/libxl_colo_proxy.c
index 9f1243e..c8ff722 100644
--- a/tools/libxl/libxl_colo_proxy.c
+++ b/tools/libxl/libxl_colo_proxy.c
@@ -208,3 +208,60 @@ void colo_proxy_teardown(libxl__colo_proxy_state *cps)
 cps->sock_fd = -1;
 }
 }
+
+/* = colo-proxy: preresume, postresume and checkpoint == */
+
+void colo_proxy_preresume(libxl__colo_proxy_state *cps)
+{
+colo_proxy_send(cps, NULL, 0, COLO_CHECKPOINT);
+/* TODO: need to handle if the call fails... */
+}
+
+void colo_proxy_postresume(libxl__colo_proxy_state *cps)
+{
+/* nothing to do... */
+}
+
+
+typedef struct colo_msg {
+bool is_checkpoint;
+} colo_msg;
+
+/*
+do checkpoint: return 1
+error: return -1
+do not checkpoint: return 0
+*/
+int colo_proxy_checkpoint(libxl__colo_proxy_state *cps)
+{
+uint8_t *buff;
+int64_t size;
+struct nlmsghdr *h;
+struct colo_msg *m;
+int ret = -1;
+
+size = colo_proxy_recv(cps, &buff, MSG_DONTWAIT);
+
+/* timeout, return no checkpoint message. */
+if (size <= 0) {
+return 0;
+}
+
+h = (struct nlmsghdr *) buff;
+
+if (h->nlmsg_type == NLMSG_ERROR) {
+goto out;
+}
+
+if (h->nlmsg_len < NLMSG_LENGTH(sizeof(*m))) {
+goto out;
+}
+
+m = NLMSG_DATA(h);
+
+ret = m->is_checkpoint ? 1 : 0;
+
+out:
+free(buff);
+return ret;
+}
-- 
1.9.1




[Xen-devel] [PATCH v6 COLO 13/15] setup and control colo proxy on primary side

2015-06-07 Thread Yang Hongyang
setup and control colo proxy on primary side

Signed-off-by: Yang Hongyang 
---
 tools/libxl/libxl_colo_save.c | 125 +++---
 tools/libxl/libxl_internal.h  |   1 +
 2 files changed, 118 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl_colo_save.c b/tools/libxl/libxl_colo_save.c
index 80fd605..9a4f501 100644
--- a/tools/libxl/libxl_colo_save.c
+++ b/tools/libxl/libxl_colo_save.c
@@ -19,9 +19,11 @@
 #include "libxl_internal.h"
 #include "libxl_colo.h"
 
+extern const libxl__checkpoint_device_instance_ops colo_save_device_nic;
 extern const libxl__checkpoint_device_instance_ops colo_save_device_qdisk;
 
 static const libxl__checkpoint_device_instance_ops *colo_ops[] = {
+&colo_save_device_nic,
 &colo_save_device_qdisk,
 NULL,
 };
@@ -33,9 +35,15 @@ static int init_device_subkind(libxl__checkpoint_devices_state *cds)
 int rc;
 STATE_AO_GC(cds->ao);
 
-rc = init_subkind_qdisk(cds);
+rc = init_subkind_colo_nic(cds);
 if (rc) goto out;
 
+rc = init_subkind_qdisk(cds);
+if (rc) {
+cleanup_subkind_colo_nic(cds);
+goto out;
+}
+
 rc = 0;
 out:
 return rc;
@@ -46,6 +54,7 @@ static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
 /* cleanup device subkind-specific state in the libxl ctx */
 STATE_AO_GC(cds->ao);
 
+cleanup_subkind_colo_nic(cds);
 cleanup_subkind_qdisk(cds);
 }
 
@@ -76,14 +85,28 @@ void libxl__colo_save_setup(libxl__egc *egc, libxl__colo_save_state *css)
 css->svm_running = false;
 css->paused = true;
 css->qdisk_setuped = false;
+libxl__ev_child_init(&css->child);
 
-/* TODO: nic support */
-cds->device_kind_flags = (1 << LIBXL__DEVICE_KIND_VBD);
+if (dss->remus->netbufscript)
+css->colo_proxy_script = libxl__strdup(gc, dss->remus->netbufscript);
+else
+css->colo_proxy_script = GCSPRINTF("%s/colo-proxy-setup",
+   libxl__xen_script_dir_path());
+
+cds->device_kind_flags = (1 << LIBXL__DEVICE_KIND_VIF) |
+ (1 << LIBXL__DEVICE_KIND_VBD);
 cds->ops = colo_ops;
 cds->callback = colo_save_setup_done;
 cds->ao = ao;
 cds->domid = dss->domid;
 
+css->cps.ao = ao;
+if (colo_proxy_setup(&css->cps)) {
+LOG(ERROR, "COLO: failed to setup colo proxy for guest with domid %u",
+cds->domid);
+goto out;
+}
+
 if (init_device_subkind(cds))
 goto out;
 
@@ -157,6 +180,7 @@ static void colo_teardown_done(libxl__egc *egc,
 libxl__domain_save_state *dss = CONTAINER_OF(css, *dss, css);
 
 cleanup_device_subkind(cds);
+colo_proxy_teardown(&css->cps);
 dss->callback(egc, dss, rc);
 }
 
@@ -437,6 +461,8 @@ static void colo_read_svm_ready_done(libxl__egc *egc,
 goto out;
 }
 
+colo_proxy_preresume(&css->cps);
+
 css->svm_running = true;
 css->cds.callback = colo_preresume_cb;
 libxl__checkpoint_devices_preresume(egc, &css->cds);
@@ -530,6 +556,8 @@ static void colo_read_svm_resumed_done(libxl__egc *egc,
 goto out;
 }
 
+colo_proxy_postresume(&css->cps);
+
 ok = 1;
 
 out:
@@ -538,6 +566,91 @@ out:
 
 
 /* = colo: wait new checkpoint = */
+
+static void colo_start_new_checkpoint(libxl__egc *egc,
+  libxl__checkpoint_devices_state *cds,
+  int rc);
+static void colo_proxy_async_wait_for_checkpoint(libxl__colo_save_state *css);
+static void colo_proxy_async_call_done(libxl__egc *egc,
+   libxl__ev_child *child,
+   int pid,
+   int status);
+
+static void colo_proxy_async_call(libxl__egc *egc,
+  libxl__colo_save_state *css,
+  void func(libxl__colo_save_state *),
+  libxl__ev_child_callback callback)
+{
+int pid = -1, rc;
+
+STATE_AO_GC(css->cds.ao);
+
+/* Fork and call */
+pid = libxl__ev_child_fork(gc, &css->child, callback);
+if (pid == -1) {
+LOG(ERROR, "unable to fork");
+rc = ERROR_FAIL;
+goto out;
+}
+
+if (!pid) {
+/* child */
+func(css);
+/* notreached */
+abort();
+}
+
+return;
+
+out:
+callback(egc, &css->child, -1, 1);
+}
+
+static void colo_proxy_wait_for_checkpoint(libxl__egc *egc,
+   libxl__colo_save_state *css)
+{
+colo_proxy_async_call(egc, css,
+  colo_proxy_async_wait_for_checkpoint,
+  colo_proxy_async_call_d

[Xen-devel] [PATCH v6 COLO 12/15] COLO nic: implement COLO nic subkind

2015-06-07 Thread Yang Hongyang
implement COLO nic subkind.

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
---
 tools/hotplug/Linux/Makefile |   1 +
 tools/hotplug/Linux/colo-proxy-setup | 131 +++
 tools/libxl/Makefile |   1 +
 tools/libxl/libxl_colo_nic.c | 317 +++
 tools/libxl/libxl_internal.h |   5 +
 tools/libxl/libxl_types.idl  |   1 +
 6 files changed, 456 insertions(+)
 create mode 100755 tools/hotplug/Linux/colo-proxy-setup
 create mode 100644 tools/libxl/libxl_colo_nic.c

diff --git a/tools/hotplug/Linux/Makefile b/tools/hotplug/Linux/Makefile
index d94a9cb..1c28bea 100644
--- a/tools/hotplug/Linux/Makefile
+++ b/tools/hotplug/Linux/Makefile
@@ -25,6 +25,7 @@ XEN_SCRIPTS += vscsi
 XEN_SCRIPTS += block-iscsi
 XEN_SCRIPTS += block-drbd-probe
 XEN_SCRIPTS += $(XEN_SCRIPTS-y)
+XEN_SCRIPTS += colo-proxy-setup
 
 SUBDIRS-$(CONFIG_SYSTEMD) += systemd
 
diff --git a/tools/hotplug/Linux/colo-proxy-setup b/tools/hotplug/Linux/colo-proxy-setup
new file mode 100755
index 000..08a93de
--- /dev/null
+++ b/tools/hotplug/Linux/colo-proxy-setup
@@ -0,0 +1,131 @@
+#! /bin/bash
+
+dir=$(dirname "$0")
+. "$dir/xen-hotplug-common.sh"
+. "$dir/hotplugpath.sh"
+. "$dir/xen-network-ft.sh"
+
+findCommand "$@"
+
+if [ "$command" != "setup" -a  "$command" != "teardown" ]
+then
+echo "Invalid command: $command"
+log err "Invalid command: $command"
+exit 1
+fi
+
+evalVariables "$@"
+
+: ${vifname:?}
+: ${forwarddev:?}
+: ${mode:?}
+: ${index:?}
+: ${bridge:?}
+
+forwardbr="colobr0"
+
+if [ "$mode" != "primary" -a "$mode" != "secondary" ]
+then
+echo "Invalid mode: $mode"
+log err "Invalid mode: $mode"
+exit 1
+fi
+
+if [ $index -lt 0 ] || [ $index -gt 100 ]; then
+echo "index overflow"
+exit 1
+fi
+
+function setup_primary()
+{
+do_without_error tc qdisc add dev $vifname root handle 1: prio
+do_without_error tc filter add dev $vifname parent 1: protocol ip prio 10 \
+u32 match u32 0 0 flowid 1:2 action mirred egress mirror dev $forwarddev
+do_without_error tc filter add dev $vifname parent 1: protocol arp prio 11 \
+u32 match u32 0 0 flowid 1:2 action mirred egress mirror dev $forwarddev
+do_without_error tc filter add dev $vifname parent 1: protocol ipv6 prio \
+12 u32 match u32 0 0 flowid 1:2 action mirred egress mirror \
+dev $forwarddev
+
+do_without_error modprobe nf_conntrack_ipv4
+do_without_error modprobe xt_PMYCOLO sec_dev=$forwarddev
+
+do_without_error /usr/local/sbin/iptables -t mangle -I PREROUTING -m physdev --physdev-in \
+$vifname -j PMYCOLO --index $index
+do_without_error /usr/local/sbin/ip6tables -t mangle -I PREROUTING -m physdev --physdev-in \
+$vifname -j PMYCOLO --index $index
+do_without_error /usr/local/sbin/arptables -I INPUT -i $forwarddev -j MARK --set-mark $index
+}
+
+function teardown_primary()
+{
+do_without_error tc filter del dev $vifname parent 1: protocol ip prio 10 u32 match u32 \
+0 0 flowid 1:2 action mirred egress mirror dev $forwarddev
+do_without_error tc filter del dev $vifname parent 1: protocol arp prio 11 u32 match u32 \
+0 0 flowid 1:2 action mirred egress mirror dev $forwarddev
+do_without_error tc filter del dev $vifname parent 1: protocol ipv6 prio 12 u32 match u32 \
+0 0 flowid 1:2 action mirred egress mirror dev $forwarddev
+do_without_error tc qdisc del dev $vifname root handle 1: prio
+
+do_without_error /usr/local/sbin/iptables -t mangle -F
+do_without_error /usr/local/sbin/ip6tables -t mangle -F
+do_without_error /usr/local/sbin/arptables -F
+do_without_error rmmod xt_PMYCOLO
+}
+
+function setup_secondary()
+{
+do_without_error brctl delif $bridge $vifname
+do_without_error brctl addbr $forwardbr
+do_without_error brctl addif $forwardbr $vifname
+do_without_error brctl addif $forwardbr $forwarddev
+do_without_error modprobe xt_SECCOLO
+
+do_without_error /usr/local/sbin/iptables -t mangle -I PREROUTING -m physdev --physdev-in \
+$vifname -j SECCOLO --index $index
+do_without_error /usr/local/sbin/ip6tables -t mangle -I PREROUTING -m physdev --physdev-in \
+$vifname -j SECCOLO --index $index
+}
+
+function teardown_secondary()
+{
+do_without_error brctl delif $forwardbr $forwarddev
+do_without_error brctl delif $forwardbr $vifname
+do_without_error brctl delbr $forwardbr
+do_without_error brctl addif $bridge $vifname
+
+do_without_error /usr/local/sbin/iptables -t mangle -F
+do_without_error /usr/local/sbin/ip6tables -t mangle -F
+do_without_error rmmod xt_SECCOLO
+}
+
+case "$command" in
+setup)
+if [ &qu

[Xen-devel] [PATCH v6 COLO 10/15] COLO proxy: implement setup/teardown of COLO proxy module

2015-06-07 Thread Yang Hongyang
Implement setup/teardown of the COLO proxy module.
We use netlink to communicate with the proxy module.
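
As a rough illustration of the netlink framing used here, the sketch below fills in a header-only request the way colo_proxy_send() in this patch does. The operation codes are copied from the patch; the helper name colo_fill_header() is hypothetical, and unlike the patch it zeroes the header first and uses NLMSG_LENGTH(0) for the length. It is Linux-only and is not the code in libxl_colo_proxy.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Operation codes as listed in the patch. */
enum colo_netlink_op {
    COLO_QUERY_CHECKPOINT = (NLMSG_MIN_TYPE + 1),
    COLO_CHECKPOINT,
    COLO_FAILOVER,
    COLO_PROXY_INIT,
    COLO_PROXY_RESET,
};

/*
 * Hypothetical helper: build a header-only request for the proxy module.
 * Only COLO_PROXY_INIT asks the kernel for an acknowledgement.
 */
static void colo_fill_header(struct nlmsghdr *h, int type, uint32_t index)
{
    memset(h, 0, sizeof(*h));
    h->nlmsg_len = NLMSG_LENGTH(0);      /* header only, no payload */
    h->nlmsg_type = (uint16_t)type;
    h->nlmsg_flags = NLM_F_REQUEST;
    if (type == COLO_PROXY_INIT)
        h->nlmsg_flags |= NLM_F_ACK;
    h->nlmsg_pid = index;                /* the patch stores cps->index here */
}
```

A caller would point an iovec at the filled header and pass it to sendmsg() on an AF_NETLINK socket, as the patch does.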

Signed-off-by: Yang Hongyang 
---
 tools/libxl/Makefile   |   1 +
 tools/libxl/libxl_colo.h   |   2 +
 tools/libxl/libxl_colo_proxy.c | 210 +
 tools/libxl/libxl_internal.h   |  12 +++
 4 files changed, 225 insertions(+)
 create mode 100644 tools/libxl/libxl_colo_proxy.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index d93b271..b45fe62 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -59,6 +59,7 @@ endif
 LIBXL_OBJS-y += libxl_remus.o libxl_checkpoint_device.o libxl_remus_disk_drbd.o
 LIBXL_OBJS-y += libxl_colo_restore.o libxl_colo_save.o
 LIBXL_OBJS-y += libxl_colo_qdisk.o
+LIBXL_OBJS-y += libxl_colo_proxy.o
 
 LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
 LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl_colo.h b/tools/libxl/libxl_colo.h
index 26a2563..5983aa0 100644
--- a/tools/libxl/libxl_colo.h
+++ b/tools/libxl/libxl_colo.h
@@ -45,4 +45,6 @@ extern void libxl__colo_save_teardown(libxl__egc *egc,
   libxl__colo_save_state *css,
   int rc);
 
+extern int colo_proxy_setup(libxl__colo_proxy_state *cps);
+extern void colo_proxy_teardown(libxl__colo_proxy_state *cps);
 #endif
diff --git a/tools/libxl/libxl_colo_proxy.c b/tools/libxl/libxl_colo_proxy.c
new file mode 100644
index 000..9f1243e
--- /dev/null
+++ b/tools/libxl/libxl_colo_proxy.c
@@ -0,0 +1,210 @@
+/*
+ * Copyright (C) 2015 FUJITSU LIMITED
+ * Author: Yang Hongyang 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+#include "libxl_colo.h"
+#include <linux/netlink.h>
+
+#define NETLINK_COLO 28
+
+enum colo_netlink_op {
+COLO_QUERY_CHECKPOINT = (NLMSG_MIN_TYPE + 1),
+COLO_CHECKPOINT,
+COLO_FAILOVER,
+COLO_PROXY_INIT,
+COLO_PROXY_RESET, /* UNUSED, will be used for continuous FT */
+};
+
+/* = colo-proxy: helper functions == */
+
+static int colo_proxy_send(libxl__colo_proxy_state *cps, uint8_t *buff, uint64_t size, int type)
+{
+struct sockaddr_nl sa;
+struct nlmsghdr msg;
+struct iovec iov;
+struct msghdr mh;
+int ret;
+
+STATE_AO_GC(cps->ao);
+
+memset(&sa, 0, sizeof(sa));
+sa.nl_family = AF_NETLINK;
+sa.nl_pid = 0;
+sa.nl_groups = 0;
+
+msg.nlmsg_len = NLMSG_SPACE(0);
+msg.nlmsg_flags = NLM_F_REQUEST;
+if (type == COLO_PROXY_INIT) {
+msg.nlmsg_flags |= NLM_F_ACK;
+}
+msg.nlmsg_seq = 0;
+/* This is untrusty */
+msg.nlmsg_pid = cps->index;
+msg.nlmsg_type = type;
+
+iov.iov_base = &msg;
+iov.iov_len = msg.nlmsg_len;
+
+mh.msg_name = &sa;
+mh.msg_namelen = sizeof(sa);
+mh.msg_iov = &iov;
+mh.msg_iovlen = 1;
+mh.msg_control = NULL;
+mh.msg_controllen = 0;
+mh.msg_flags = 0;
+
+ret = sendmsg(cps->sock_fd, &mh, 0);
+if (ret <= 0) {
+LOG(ERROR, "can't send msg to kernel by netlink: %s",
+strerror(errno));
+}
+
+return ret;
+}
+
+/* error or nothing received: return <= 0, otherwise return the number of bytes received */
+static int64_t colo_proxy_recv(libxl__colo_proxy_state *cps, uint8_t **buff, int flags)
+{
+struct sockaddr_nl sa;
+struct iovec iov;
+struct msghdr mh = {
+.msg_name = &sa,
+.msg_namelen = sizeof(sa),
+.msg_iov = &iov,
+.msg_iovlen = 1,
+};
+uint32_t size = 16384;
+int64_t len = 0;
+int ret;
+
+STATE_AO_GC(cps->ao);
+uint8_t *tmp = libxl__malloc(NOGC, size);
+
+iov.iov_base = tmp;
+iov.iov_len = size;
+next:
+   ret = recvmsg(cps->sock_fd, &mh, flags);
+if (ret <= 0) {
+goto out;
+}
+
+len += ret;
+if (mh.msg_flags & MSG_TRUNC) {
+size += 16384;
+tmp = libxl__realloc(NOGC, tmp, size);
+iov.iov_base = tmp + len;
+iov.iov_len = size - len;
+goto next;
+}
+
+*buff = tmp;
+return len;
+
+out:
+free(tmp);
+*buff = NULL;
+return ret;
+}
+
+/* = colo-proxy: setup and teardown == */
+
+int colo_proxy_setup(libxl__colo_proxy_state *cps)
+{
+int skfd = 0;
+struct sockaddr_nl sa;
+struct nlmsghdr *h;
+struct timeval tv = {0, 

[Xen-devel] [PATCH v6 COLO 03/15] primary vm suspend/get_dirty_pfn/resume/checkpoint code

2015-06-07 Thread Yang Hongyang
From: Wen Congyang 

We repeat the following steps for every checkpoint:
1. Suspend primary vm
   a. Suspend primary vm
   b. Do postsuspend
   c. Read LIBXL_COLO_SVM_SUSPENDED sent by the secondary
   d. Read the secondary vm's dirty page information (count + pfn list)
2. Get dirty pfn list callback, used by libxc
   a. Return the secondary vm's dirty pfn list
3. Resume primary vm
   a. Read LIBXL_COLO_SVM_READY from the secondary
   b. Do preresume
   c. Resume primary vm
   d. Read LIBXL_COLO_SVM_RESUMED from the secondary
4. Wait for a new checkpoint
   a. Wait for a new checkpoint (not implemented)
   b. Send LIBXL_COLO_NEW_CHECKPOINT to the secondary
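
The ordering of control values in the steps above can be sketched as a small check. The enum values are the ones defined in libxl_colo.h by this series; primary_check_round() is a hypothetical helper written only for illustration, not a function from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Control values as defined in libxl_colo.h by this series. */
enum {
    LIBXL_COLO_NEW_CHECKPOINT = 1,
    LIBXL_COLO_SVM_SUSPENDED,
    LIBXL_COLO_SVM_READY,
    LIBXL_COLO_SVM_RESUMED,
};

/*
 * Hypothetical helper: verify the control values the primary read from the
 * secondary during one round. Returns the value the primary sends next
 * (LIBXL_COLO_NEW_CHECKPOINT), or -1 if the round was out of order.
 */
static int primary_check_round(const int *received, size_t n)
{
    static const int expected[] = {
        LIBXL_COLO_SVM_SUSPENDED,   /* step 1c */
        LIBXL_COLO_SVM_READY,       /* step 3a */
        LIBXL_COLO_SVM_RESUMED,     /* step 3d */
    };
    size_t i;

    if (n != sizeof(expected) / sizeof(expected[0]))
        return -1;
    for (i = 0; i < n; i++)
        if (received[i] != expected[i])
            return -1;
    return LIBXL_COLO_NEW_CHECKPOINT;   /* step 4b */
}
```

The real code reads each value asynchronously from the migration stream; the array here only stands in for that sequence.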

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
---
 tools/libxc/include/xenguest.h |  12 +
 tools/libxl/Makefile   |   2 +-
 tools/libxl/libxl.c|   6 +-
 tools/libxl/libxl_colo.h   |  10 +
 tools/libxl/libxl_colo_save.c  | 643 +
 tools/libxl/libxl_dom_save.c   |  15 +-
 tools/libxl/libxl_internal.h   |  31 +-
 tools/libxl/libxl_save_msgs_gen.pl |   1 +
 tools/libxl/libxl_types.idl|   1 +
 9 files changed, 712 insertions(+), 9 deletions(-)
 create mode 100644 tools/libxl/libxl_colo_save.c

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 86bcf9c..d5902a6 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -75,6 +75,18 @@ struct save_callbacks {
  */
 int (*toolstack_save)(uint32_t domid, uint8_t **buf, uint32_t *len, void *data);
 
+/* Called after the guest is suspended.
+ *
+ * returns the list of dirty pfn:
+ *  struct {
+ *  uint64_t count;
+ *  uint64_t pfn[];
+ *  };
+ *
+ *  Note: the caller must free the return value.
+ */
+uint8_t *(*get_dirty_pfn)(void *data);
+
 /* to be provided as the last argument to each callback function */
 void* data;
 };
diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 82cc4c2..88c5426 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -57,7 +57,7 @@ LIBXL_OBJS-y += libxl_nonetbuffer.o
 endif
 
 LIBXL_OBJS-y += libxl_remus.o libxl_checkpoint_device.o libxl_remus_disk_drbd.o
-LIBXL_OBJS-y += libxl_colo_restore.o
+LIBXL_OBJS-y += libxl_colo_restore.o libxl_colo_save.o
 
 LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
 LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 10d3d82..1145ae4 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -17,6 +17,7 @@
 #include "libxl_osdeps.h"
 
 #include "libxl_internal.h"
+#include "libxl_colo.h"
 
 #define PAGE_TO_MEMKB(pages) ((pages) * 4)
 #define BACKEND_STRING_SIZE 5
@@ -841,7 +842,10 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
 assert(info);
 
 /* Point of no return */
-libxl__remus_setup(egc, &dss->rs);
+if (libxl_defbool_val(info->colo))
+libxl__colo_save_setup(egc, &dss->css);
+else
+libxl__remus_setup(egc, &dss->rs);
 return AO_INPROGRESS;
 
  out:
diff --git a/tools/libxl/libxl_colo.h b/tools/libxl/libxl_colo.h
index 91df275..26a2563 100644
--- a/tools/libxl/libxl_colo.h
+++ b/tools/libxl/libxl_colo.h
@@ -35,4 +35,14 @@ extern void libxl__colo_restore_teardown(libxl__egc *egc,
  libxl__colo_restore_state *crs,
  int rc);
 
+extern void libxl__colo_save_domain_suspend_callback(void *data);
+extern void libxl__colo_save_domain_resume_callback(void *data);
+extern void libxl__colo_save_domain_checkpoint_callback(void *data);
+extern void libxl__colo_save_get_dirty_pfn_callback(void *data);
+extern void libxl__colo_save_setup(libxl__egc *egc,
+   libxl__colo_save_state *css);
+extern void libxl__colo_save_teardown(libxl__egc *egc,
+  libxl__colo_save_state *css,
+  int rc);
+
 #endif
diff --git a/tools/libxl/libxl_colo_save.c b/tools/libxl/libxl_colo_save.c
new file mode 100644
index 000..153ec57
--- /dev/null
+++ b/tools/libxl/libxl_colo_save.c
@@ -0,0 +1,643 @@
+/*
+ * Copyright (C) 2014 FUJITSU LIMITED
+ * Author: Wen Congyang 
+ * Yang Hongyang 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "

[Xen-devel] [PATCH v6 COLO 02/15] secondary vm suspend/resume/checkpoint code

2015-06-07 Thread Yang Hongyang
From: Wen Congyang 

The secondary vm runs in COLO mode, so we repeat
the following steps for every checkpoint:
1. Resume secondary vm
   a. Send LIBXL_COLO_SVM_READY to master.
   b. If it is not the first resume, call libxl__checkpoint_devices_preresume().
   c. If it is the first resume(resume right after live migration),
  - call libxl__xc_domain_restore_done() to build the secondary vm.
  - enable secondary vm's logdirty.
  - call libxl__domain_resume() to resume secondary vm.
  - call libxl__checkpoint_devices_setup() to setup checkpoint devices.
   d. Send LIBXL_COLO_SVM_RESUMED to master.
2. Wait for a new checkpoint
   a. Call libxl__checkpoint_devices_commit().
   b. Read LIBXL_COLO_NEW_CHECKPOINT from master.
3. Suspend secondary vm
   a. Suspend secondary vm.
   b. Call libxl__checkpoint_devices_postsuspend().
   c. Get secondary vm's dirty page information.
   d. Send LIBXL_COLO_SVM_SUSPENDED to master.
   e. Send secondary vm's dirty page information to master(count + pfn list).
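
The "count + pfn list" payload mentioned in step 3e matches the layout described in the get_dirty_pfn callback comment elsewhere in this series (a uint64_t count followed by that many uint64_t pfns). A minimal sketch of building such a buffer follows; the struct and function names are hypothetical, and how libxl actually collects the dirty pfns is not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Layout assumed from the callback comment: count, then count pfns. */
struct dirty_pfn_list {
    uint64_t count;
    uint64_t pfn[];   /* flexible array member */
};

/* Hypothetical helper: allocate and fill the payload. Caller frees it. */
static struct dirty_pfn_list *dirty_pfn_list_new(const uint64_t *pfns,
                                                 uint64_t count)
{
    struct dirty_pfn_list *l;

    /* Refuse counts whose byte size would overflow size_t. */
    if (count > (SIZE_MAX - sizeof(*l)) / sizeof(uint64_t))
        return NULL;

    l = malloc(sizeof(*l) + (size_t)count * sizeof(uint64_t));
    if (!l)
        return NULL;

    l->count = count;
    if (count)
        memcpy(l->pfn, pfns, (size_t)count * sizeof(uint64_t));
    return l;
}

/* Number of bytes that would be written to the stream for this list. */
static size_t dirty_pfn_list_bytes(const struct dirty_pfn_list *l)
{
    return sizeof(*l) + (size_t)l->count * sizeof(uint64_t);
}
```

Sending the whole buffer in one write keeps the count and the list together, which is what lets the reader size its receive buffer from the first eight bytes.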

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
---
 tools/libxc/include/xenguest.h |   20 +
 tools/libxl/Makefile   |1 +
 tools/libxl/libxl_colo.h   |   38 ++
 tools/libxl/libxl_colo_restore.c   | 1158 
 tools/libxl/libxl_create.c |  116 +++-
 tools/libxl/libxl_dom_save.c   |2 +-
 tools/libxl/libxl_internal.h   |   24 +
 tools/libxl/libxl_save_callout.c   |6 +-
 tools/libxl/libxl_save_msgs_gen.pl |6 +-
 9 files changed, 1364 insertions(+), 7 deletions(-)
 create mode 100644 tools/libxl/libxl_colo.h
 create mode 100644 tools/libxl/libxl_colo_restore.c

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 7581263..86bcf9c 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -98,6 +98,26 @@ int xc_domain_save2(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_ite
 
 /* callbacks provided by xc_domain_restore */
 struct restore_callbacks {
+/* Called after a new checkpoint to suspend the guest.
+ */
+int (*suspend)(void* data);
+
+/* Called after the secondary vm is ready to resume.
+ * Callback function resumes the guest & the device model,
+ *  returns to xc_domain_restore.
+ */
+int (*postcopy)(void* data);
+
+/* callback to wait a new checkpoint
+ *
+ * returns:
+ * 0: terminate checkpointing gracefully
+ * 1: take another checkpoint */
+int (*checkpoint)(void* data);
+
+/* Enable qemu-dm logging dirty pages to xen */
+int (*switch_qemu_logdirty)(int domid, unsigned enable, void *data); /* HVM only */
+
 /* callback to restore toolstack specific data */
 int (*toolstack_restore)(uint32_t domid, const uint8_t *buf,
 uint32_t size, void* data);
diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index cd63dac..82cc4c2 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -57,6 +57,7 @@ LIBXL_OBJS-y += libxl_nonetbuffer.o
 endif
 
 LIBXL_OBJS-y += libxl_remus.o libxl_checkpoint_device.o libxl_remus_disk_drbd.o
+LIBXL_OBJS-y += libxl_colo_restore.o
 
 LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
 LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl_colo.h b/tools/libxl/libxl_colo.h
new file mode 100644
index 000..91df275
--- /dev/null
+++ b/tools/libxl/libxl_colo.h
@@ -0,0 +1,38 @@
+/*
+ * Copyright (C) 2014 FUJITSU LIMITED
+ * Author: Wen Congyang 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#ifndef LIBXL_COLO_H
+#define LIBXL_COLO_H
+
+/*
+ * values to control suspend/resume primary vm and secondary vm
+ * at the same time
+ */
+enum {
+LIBXL_COLO_NEW_CHECKPOINT = 1,
+LIBXL_COLO_SVM_SUSPENDED,
+LIBXL_COLO_SVM_READY,
+LIBXL_COLO_SVM_RESUMED,
+};
+
+extern void libxl__colo_restore_done(libxl__egc *egc, void *dcs_void,
+ int ret, int retval, int errnoval);
+extern void libxl__colo_restore_setup(libxl__egc *egc,
+  libxl__colo_restore_state *crs);
+extern void libxl__colo_restore_teardown(libxl__egc *egc,
+ libxl__colo_restore_state *crs,
+ int rc);
+
+#endif
diff --git a/tools/libxl/libxl_colo_restore.c b/tools/libxl/libxl_colo_restore.c
new file mode 100644
index 000..6c39758

[Xen-devel] [PATCH v6 COLO 04/15] libxc/restore: support COLO restore

2015-06-07 Thread Yang Hongyang
Call the resume/checkpoint/suspend callbacks when the secondary vm's
state is consistent with the primary's.
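
The return-value convention that handle_checkpoint() applies to these callbacks in the diff below (0 means internal error, 2 means a broken channel and therefore failover, anything else means continue) can be sketched as follows. The callback table, the stub callbacks and both function names are hypothetical; only BROKEN_CHANNEL and the callback order (postcopy, checkpoint, suspend) are taken from the patch, and the stream_complete() call that precedes them is left out.

```c
#include <assert.h>
#include <stddef.h>

#define BROKEN_CHANNEL 2   /* value taken from the patch */

/* Hypothetical table mirroring the three callbacks used for COLO. */
struct colo_callbacks {
    int (*postcopy)(void *data);
    int (*checkpoint)(void *data);
    int (*suspend)(void *data);
    void *data;
};

/* Map a callback result to the rc used by the restore loop. */
static int map_callback_result(int ret)
{
    if (ret == 0)
        return -1;               /* internal error */
    if (ret == 2)
        return BROKEN_CHANNEL;   /* read/write error: fail over */
    return 0;                    /* keep going */
}

/* Run one COLO round, stopping at the first callback that fails. */
static int colo_checkpoint_round(const struct colo_callbacks *cb)
{
    int (*steps[])(void *) = { cb->postcopy, cb->checkpoint, cb->suspend };
    size_t i;

    for (i = 0; i < sizeof(steps) / sizeof(steps[0]); i++) {
        int rc = map_callback_result(steps[i](cb->data));
        if (rc)
            return rc;
    }
    return 0;
}

/* Stub callbacks for illustration: count calls through the data pointer. */
static int step_ok(void *data)     { ++*(int *)data; return 1; }
static int step_broken(void *data) { ++*(int *)data; return 2; }
```

With step_broken in the checkpoint slot, the round stops after two calls and returns BROKEN_CHANNEL, which restore() treats as a request to fail over.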

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
CC: Andrew Cooper 
---
 tools/libxc/xc_sr_common.h  | 11 +--
 tools/libxc/xc_sr_restore.c | 63 -
 tools/libxc/xc_sr_restore_x86_hvm.c |  1 +
 3 files changed, 72 insertions(+), 3 deletions(-)

diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 565c5da..382bf76 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -132,8 +132,11 @@ struct xc_sr_restore_ops
  *
  * @return 0 for success, -1 for failure, or the sentinel value
  * RECORD_NOT_PROCESSED.
+ * BROKEN_CHANNEL: if we are under Remus/COLO, this means the master may be
+ * dead, and we will fail over.
  */
 #define RECORD_NOT_PROCESSED 1
+#define BROKEN_CHANNEL 2
 int (*process_record)(struct xc_sr_context *ctx, struct xc_sr_record *rec);
 
 /**
@@ -205,8 +208,12 @@ struct xc_sr_context
 uint32_t guest_type;
 uint32_t guest_page_size;
 
-/* Plain VM, or checkpoints over time. */
-bool checkpointed;
+/*
+ * 0: Plain VM
+ * 1: Remus
+ * 2: COLO
+ */
+int checkpointed;
 
 /* Currently buffering records between a checkpoint */
 bool buffer_all_records;
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 2d2edd3..982a70e 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -1,4 +1,5 @@
 #include 
+#include 
 
 #include "xc_sr_common.h"
 
@@ -472,7 +473,7 @@ static int process_record(struct xc_sr_context *ctx, struct xc_sr_record *rec);
 static int handle_checkpoint(struct xc_sr_context *ctx)
 {
 xc_interface *xch = ctx->xch;
-int rc = 0;
+int rc = 0, ret;
 unsigned i;
 
 if ( !ctx->restore.checkpointed )
@@ -498,6 +499,46 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
 else
 ctx->restore.buffer_all_records = true;
 
+if ( ctx->restore.checkpointed == 2 )
+{
+#define HANDLE_CALLBACK_RETURN_VALUE(ret)   \
+do {\
+if ( ret == 0 ) \
+{   \
+/* Some internal error happens */   \
+rc = -1;\
+goto err;   \
+}   \
+else if ( ret == 2 )\
+{   \
+/* Reading/writing error, do failover */\
+rc = BROKEN_CHANNEL;\
+goto err;   \
+}   \
+} while (0)
+
+/* COLO */
+
+/* We need to resume guest */
+rc = ctx->restore.ops.stream_complete(ctx);
+if ( rc )
+goto err;
+
+/* TODO: call restore_results */
+
+/* Resume secondary vm */
+ret = ctx->restore.callbacks->postcopy(ctx->restore.callbacks->data);
+HANDLE_CALLBACK_RETURN_VALUE(ret);
+
+/* wait for new checkpoint */
+ret = ctx->restore.callbacks->checkpoint(ctx->restore.callbacks->data);
+HANDLE_CALLBACK_RETURN_VALUE(ret);
+
+/* suspend secondary vm */
+ret = ctx->restore.callbacks->suspend(ctx->restore.callbacks->data);
+HANDLE_CALLBACK_RETURN_VALUE(ret);
+}
+
  err:
 return rc;
 }
@@ -678,6 +719,8 @@ static int restore(struct xc_sr_context *ctx)
 goto err;
 }
 }
+else if ( rc == BROKEN_CHANNEL )
+goto remus_failover;
 else if ( rc )
 goto err;
 }
@@ -685,6 +728,15 @@ static int restore(struct xc_sr_context *ctx)
 } while ( rec.type != REC_TYPE_END );
 
  remus_failover:
+
+if ( ctx->restore.checkpointed == 2 )
+{
+/* With COLO, we have already called stream_complete */
+rc = 0;
+IPRINTF("COLO Failover");
+goto done;
+}
+
 /*
  * With Remus, if we reach here, there must be some error on primary,
  * failover from the last checkpoint state.
@@ -735,6 +787,15 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
 ctx.restore.checkpointed = checkpointed_stream;
 ctx.restore.callbacks = callbacks;
 
+/* Sanity checks for callbacks. */
+if ( ctx.restore.checkpointed == 2 )
+{
+/* this is COLO restore */
+assert(callbacks->suspend &&
+   cal

[Xen-devel] [PATCH v6 COLO 00/15] COarse-grain LOck-stepping Virtual Machines for Non-stop Service

2015-06-07 Thread Yang Hongyang
This patchset implements the COLO feature for Xen.
For details on installing and using COLO, refer to:
http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping

This patchset is based on:
[PATCH v2 COLOPre 00/13] Prerequisite patches for COLO

Only HVM guests are supported for now. The code is also hosted on GitHub:
https://github.com/macrosheep/xen/tree/colo-v6

Patch 1: Add readme
Patch 2-7: COLO framework related code
Patch 8-9: Implement disk replication
Patch 10-15: Implement nic replication

Changelog from v5 to v6:
1. based on migration v2(libxc)
2. split the patchset into prerequisite patchset and this main patchset.

Changelog from v4 to v5:
1. rebase to the latest xen upstream
2. disk replication: blktap2->qdisk
3. nic replication: colo-agent->colo-proxy

Changelog from v3 to v4:
1. rebase to newest xen
2. bug fix

Changelog from v2 to v3:
1. rebase to newest remus
2. add nic replication support

Changelog from v1 to v2:
1. rebase to newest remus
2. add disk replication support

Wen Congyang (6):
  secondary vm suspend/resume/checkpoint code
  primary vm suspend/get_dirty_pfn/resume/checkpoint code
  send store mfn and console mfn to xl before resuming secondary vm
  implement the cmdline for COLO
  Support colo mode for qemu disk
  COLO: use qemu block replication

Yang Hongyang (9):
  docs: add colo readme
  libxc/restore: support COLO restore
  libxc/save: support COLO save
  COLO proxy: implement setup/teardown of COLO proxy module
  COLO proxy: preresume, postresume and checkpoint
  COLO nic: implement COLO nic subkind
  setup and control colo proxy on primary side
  setup and control colo proxy on secondary side
  cmdline switches and config vars to control colo-proxy

 docs/README.colo |9 +
 docs/man/xl.conf.pod.5   |6 +
 docs/man/xl.pod.1|   11 +-
 tools/hotplug/Linux/Makefile |1 +
 tools/hotplug/Linux/colo-proxy-setup |  131 
 tools/libxc/include/xenguest.h   |   40 ++
 tools/libxc/xc_sr_common.h   |   11 +-
 tools/libxc/xc_sr_restore.c  |   67 +-
 tools/libxc/xc_sr_restore_x86_hvm.c  |1 +
 tools/libxc/xc_sr_save.c |   49 +-
 tools/libxl/Makefile |4 +
 tools/libxl/libxl.c  |   70 +-
 tools/libxl/libxl_colo.h |   53 ++
 tools/libxl/libxl_colo_nic.c |  317 +
 tools/libxl/libxl_colo_proxy.c   |  267 
 tools/libxl/libxl_colo_qdisk.c   |  209 ++
 tools/libxl/libxl_colo_restore.c | 1192 ++
 tools/libxl/libxl_colo_save.c|  784 ++
 tools/libxl/libxl_create.c   |  156 -
 tools/libxl/libxl_device.c   |   38 ++
 tools/libxl/libxl_dm.c   |  262 +++-
 tools/libxl/libxl_dom_save.c |   17 +-
 tools/libxl/libxl_internal.h |   94 ++-
 tools/libxl/libxl_qmp.c  |   31 +
 tools/libxl/libxl_save_callout.c |6 +-
 tools/libxl/libxl_save_msgs_gen.pl   |9 +-
 tools/libxl/libxl_types.idl  |8 +
 tools/libxl/libxlu_disk_l.l  |5 +
 tools/libxl/xl.c |3 +
 tools/libxl/xl.h |1 +
 tools/libxl/xl_cmdimpl.c |   92 ++-
 tools/libxl/xl_cmdtable.c|4 +-
 32 files changed, 3896 insertions(+), 52 deletions(-)
 create mode 100644 docs/README.colo
 create mode 100755 tools/hotplug/Linux/colo-proxy-setup
 create mode 100644 tools/libxl/libxl_colo.h
 create mode 100644 tools/libxl/libxl_colo_nic.c
 create mode 100644 tools/libxl/libxl_colo_proxy.c
 create mode 100644 tools/libxl/libxl_colo_qdisk.c
 create mode 100644 tools/libxl/libxl_colo_restore.c
 create mode 100644 tools/libxl/libxl_colo_save.c

-- 
1.9.1




[Xen-devel] [PATCH v6 COLO 07/15] implement the cmdline for COLO

2015-06-07 Thread Yang Hongyang
From: Wen Congyang 

Add a new option, -c, to the 'xl remus' command. Use -c to
enable COLO HA instead of Remus HA.

Update man pages to reflect the addition of a new option to
'xl remus' command.

Also add a new option -c to the internal command 'xl migrate-receive'.

Signed-off-by: Wen Congyang 
---
 docs/man/xl.pod.1 | 12 +--
 tools/libxl/libxl.c   | 16 ++
 tools/libxl/xl_cmdimpl.c  | 53 +++
 tools/libxl/xl_cmdtable.c |  4 +++-
 4 files changed, 73 insertions(+), 12 deletions(-)

diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index 4eb929d..f5e97d7 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -447,12 +447,15 @@ Print huge (!) amount of debug during the migration process.
 
 =item B [I] I I
 
-Enable Remus HA for domain. By default B relies on ssh as a transport
-mechanism between the two hosts.
+Enable Remus HA or COLO HA for domain. By default B relies on ssh as a
+transport mechanism between the two hosts.
 
 N.B: Remus support in xl is still in experimental (proof-of-concept) phase.
  Disk replication support is limited to DRBD disks.
 
+ COLO support in xl is still in experimental (proof-of-concept) phase.
+ There is no support for network or disk at the moment.
+
 B
 
 =over 4
@@ -498,6 +501,11 @@ Disable network output buffering. Requires enabling unsafe mode.
 
 Disable disk replication. Requires enabling unsafe mode.
 
+=item B<-c>
+
+Enable COLO HA. It conflicts with B<-i> and B<-b>, and memory
+checkpoint compression must be disabled.
+
 =back
 
 =item B I
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 1145ae4..7df2466 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -811,6 +811,22 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
 goto out;
 }
 
+/* The caller must set this defbool */
+if (libxl_defbool_is_default(info->colo)) {
+LOG(ERROR, "colo mode must be enabled/disabled");
+rc = ERROR_FAIL;
+goto out;
+}
+
+if (libxl_defbool_val(info->colo)) {
+libxl_defbool_setdefault(&info->compression, false);
+if (libxl_defbool_val(info->compression)) {
+LOG(ERROR, "cannot use memory checkpoint compression in COLO mode");
+rc = ERROR_FAIL;
+goto out;
+}
+}
+
 libxl_defbool_setdefault(&info->allow_unsafe, false);
 libxl_defbool_setdefault(&info->blackhole, false);
 libxl_defbool_setdefault(&info->compression, true);
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index adfadd1..4bbadd3 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -4273,6 +4273,9 @@ static void migrate_receive(int debug, int daemonize, int monitor,
 dom_info.send_fd = send_fd;
 dom_info.migration_domname_r = &migration_domname;
 dom_info.checkpointed_stream = remus;
+if (remus == LIBXL_CHECKPOINTED_STREAM_COLO)
+/* COLO uses stdout to send control message to master */
+dom_info.quiet = 1;
 
 rc = create_domain(&dom_info);
 if (rc < 0) {
@@ -4287,7 +4290,8 @@ static void migrate_receive(int debug, int daemonize, int monitor,
 /* If we are here, it means that the sender (primary) has crashed.
  * TODO: Split-Brain Check.
  */
-fprintf(stderr, "migration target: Remus Failover for domain %u\n",
+fprintf(stderr, "migration target: %s Failover for domain %u\n",
+remus == LIBXL_CHECKPOINTED_STREAM_COLO ? "COLO" : "Remus",
 domid);
 
 /*
@@ -4304,15 +4308,21 @@ static void migrate_receive(int debug, int daemonize, int monitor,
 rc = libxl_domain_rename(ctx, domid, migration_domname,
  common_domname);
 if (rc)
-fprintf(stderr, "migration target (Remus): "
+fprintf(stderr, "migration target (%s): "
 "Failed to rename domain from %s to %s:%d\n",
+remus == LIBXL_CHECKPOINTED_STREAM_COLO ? "COLO" : "Remus",
 migration_domname, common_domname, rc);
 }
 
+if (remus == LIBXL_CHECKPOINTED_STREAM_COLO)
+/* The guest is running after failover in COLO mode */
+exit(rc ? -ERROR_FAIL: 0);
+
 rc = libxl_domain_unpause(ctx, domid);
 if (rc)
-fprintf(stderr, "migration target (Remus): "
+fprintf(stderr, "migration target (%s): "
 "Failed to unpause domain %s (id: %u):%d\n",
+remus == LIBXL_CHECKPOINTED_STREAM_COLO ? "COLO" : "Remus",
 common_domname, domid, rc);
 
 exit(rc ? -ERROR_FAIL: 0);
@@ -4458,7 +4468,7 @@ int main_migrate_receive(int argc, char **argv)
 int debug = 0, daemonize = 1, monitor = 1, remus = 0;
 int opt;
 
-SWITCH_FOREACH_OPT(opt, "Fedr", NULL,

[Xen-devel] [PATCH v6 COLO 08/15] Support colo mode for qemu disk

2015-06-07 Thread Yang Hongyang
From: Wen Congyang 

Usage: disk = ['...,colo,colo-params=xxx,active-disk=xxx,hidden-disk=xxx...']
The format of colo-params: host:port:exportname=xx
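
For illustration only (not part of this patch): the colo-params format above can be split into the NBD address and the export name roughly as follows. split_colo_params() is a hypothetical helper, not a libxl function, and the address used in the usage note is made up.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libxl): split a colo-params string of
 * the form "host:port:exportname=NAME" into the "host:port" address and
 * the export name. Returns 0 on success, -1 if the string is malformed
 * or one of the output buffers is too small.
 */
int split_colo_params(const char *params,
                      char *addr, size_t addr_len,
                      char *name, size_t name_len)
{
    static const char key[] = ":exportname=";
    const char *p = strstr(params, key);
    size_t n;

    if (!p)
        return -1;              /* no export name key present */

    n = (size_t)(p - params);   /* length of the "host:port" part */
    if (n == 0 || n >= addr_len)
        return -1;

    p += strlen(key);           /* now points at NAME */
    if (*p == '\0' || strlen(p) >= name_len)
        return -1;              /* empty or oversized export name */

    memcpy(addr, params, n);
    addr[n] = '\0';
    strcpy(name, p);            /* safe: length was checked above */
    return 0;
}
```

With an input such as "192.168.0.2:8889:exportname=disk0" this yields the address "192.168.0.2:8889" and the export name "disk0"; a string without ":exportname=" is rejected.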

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
---
 docs/man/xl.pod.1   |   2 +-
 tools/libxl/libxl.c |  42 ++-
 tools/libxl/libxl_create.c  |  25 -
 tools/libxl/libxl_device.c  |  38 +++
 tools/libxl/libxl_dm.c  | 262 ++--
 tools/libxl/libxl_types.idl |   5 +
 tools/libxl/libxlu_disk_l.l |   5 +
 7 files changed, 367 insertions(+), 12 deletions(-)

diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index f5e97d7..1c2ee24 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -454,7 +454,7 @@ N.B: Remus support in xl is still in experimental (proof-of-concept) phase.
  Disk replication support is limited to DRBD disks.
 
  COLO support in xl is still in experimental (proof-of-concept) phase.
- There is no support for network or disk at the moment.
+ There is no support for network at the moment.
 
 B
 
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 7df2466..4a5957c 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -2273,6 +2273,8 @@ int libxl__device_disk_setdefault(libxl__gc *gc, libxl_device_disk *disk)
 int rc;
 
 libxl_defbool_setdefault(&disk->discard_enable, !!disk->readwrite);
+libxl_defbool_setdefault(&disk->colo_enable, false);
+libxl_defbool_setdefault(&disk->colo_restore_enable, false);
 
 rc = libxl__resolve_domid(gc, disk->backend_domname, &disk->backend_domid);
 if (rc < 0) return rc;
@@ -2473,6 +2475,14 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
 flexarray_append(back, "params");
 flexarray_append(back, libxl__sprintf(gc, "%s:%s",
   libxl__device_disk_string_of_format(disk->format), disk->pdev_path));
+if (libxl_defbool_val(disk->colo_enable)) {
+flexarray_append(back, "colo-params");
+flexarray_append(back, libxl__sprintf(gc, "%s", disk->colo_params));
+flexarray_append(back, "active-disk");
+flexarray_append(back, libxl__sprintf(gc, "%s", disk->active_disk));
+flexarray_append(back, "hidden-disk");
+flexarray_append(back, libxl__sprintf(gc, "%s", disk->hidden_disk));
+}
 assert(device->backend_kind == LIBXL__DEVICE_KIND_QDISK);
 break;
 default:
@@ -2587,7 +2597,10 @@ static int libxl__device_disk_from_xs_be(libxl__gc *gc,
 goto cleanup;
 }
 
-/* "params" may not be present; but everything else must be. */
+/*
+ * "params" and "colo-params" may not be present; but everything
+ * else must be.
+ */
 tmp = xs_read(ctx->xsh, XBT_NULL,
   libxl__sprintf(gc, "%s/params", be_path), &len);
 if (tmp && strchr(tmp, ':')) {
@@ -2597,6 +2610,33 @@ static int libxl__device_disk_from_xs_be(libxl__gc *gc,
 disk->pdev_path = tmp;
 }
 
+tmp = xs_read(ctx->xsh, XBT_NULL,
+  libxl__sprintf(gc, "%s/colo-params", be_path), &len);
+if (tmp) {
+libxl_defbool_set(&disk->colo_enable, true);
+disk->colo_params = tmp;
+} else {
+libxl_defbool_set(&disk->colo_enable, false);
+}
+
+if (libxl_defbool_val(disk->colo_enable)) {
+tmp = xs_read(ctx->xsh, XBT_NULL,
+  libxl__sprintf(gc, "%s/active-disk", be_path), &len);
+if (!tmp) {
+LOG(ERROR, "Missing xenstore node %s/active-disk", be_path);
+goto cleanup;
+}
+disk->active_disk = tmp;
+
+tmp = xs_read(ctx->xsh, XBT_NULL,
+  libxl__sprintf(gc, "%s/hidden-disk", be_path), &len);
+if (!tmp) {
+LOG(ERROR, "Missing xenstore node %s/hidden-disk", be_path);
+goto cleanup;
+}
+disk->hidden_disk = tmp;
+}
+
 
 tmp = libxl__xs_read(gc, XBT_NULL,
  libxl__sprintf(gc, "%s/type", be_path));
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 6e307f3..17d0d18 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1727,12 +1727,29 @@ static void domain_create_cb(libxl__egc *egc,
 
 libxl__ao_complete(egc, ao, rc);
 }
-
+
+static void set_disk_colo_restore(libxl_domain_config *d_config)
+{
+int i;
+
+for (i = 0; i < d_config->num_disks; i++)
+libxl_defbool_set(&d_config->disks[i].colo_restore_enable, tr

[Xen-devel] [PATCH v6 COLO 05/15] send store mfn and console mfn to xl before resuming secondary vm

2015-06-07 Thread Yang Hongyang
From: Wen Congyang 

We will call libxl__xc_domain_restore_done() to rebuild the secondary vm, but
rebuilding it requires the store mfn and console mfn. So make restore_results
a function pointer in the callbacks struct (struct {save,restore}_callbacks),
and use this callback to send the store mfn and console mfn to xl.
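
For illustration only (not part of this patch): the callback arrangement described above can be sketched as below. The demo_* types and functions are hypothetical, cut-down stand-ins and not the real libxc/libxl definitions.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct restore_callbacks. */
struct demo_restore_callbacks {
    void (*restore_results)(unsigned long store_mfn,
                            unsigned long console_mfn, void *data);
    void *data;                 /* opaque pointer handed back to callbacks */
};

/* What the toolstack side wants to remember (illustrative only). */
struct demo_results {
    unsigned long store_mfn;
    unsigned long console_mfn;
    int received;
};

/* Toolstack-side callback: record the two frame numbers for later use. */
void demo_restore_results(unsigned long store_mfn,
                          unsigned long console_mfn, void *data)
{
    struct demo_results *res = data;

    res->store_mfn = store_mfn;
    res->console_mfn = console_mfn;
    res->received = 1;
}

/*
 * Restore-side checkpoint step: report the results through the callback
 * before the guest is resumed. Returns -1 if no callback is registered.
 */
int demo_handle_checkpoint(const struct demo_restore_callbacks *cb,
                           unsigned long store_mfn,
                           unsigned long console_mfn)
{
    if (!cb || !cb->restore_results)
        return -1;
    cb->restore_results(store_mfn, console_mfn, cb->data);
    return 0;
}
```

The point of the function pointer is that the restore side does not need to know how the toolstack stores the values; it only has to call the callback before resuming the secondary.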

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
CC: Andrew Cooper 
---
 tools/libxc/include/xenguest.h | 8 
 tools/libxc/xc_sr_restore.c| 8 ++--
 tools/libxl/libxl_colo_restore.c   | 5 -
 tools/libxl/libxl_create.c | 1 +
 tools/libxl/libxl_save_msgs_gen.pl | 2 +-
 5 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index d5902a6..50096b9 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -130,6 +130,14 @@ struct restore_callbacks {
 /* Enable qemu-dm logging dirty pages to xen */
 int (*switch_qemu_logdirty)(int domid, unsigned enable, void *data); /* HVM only */
 
+/*
+ * callback to send store mfn and console mfn to xl
+ * if we want to resume vm before xc_domain_save()
+ * exits.
+ */
+void (*restore_results)(unsigned long store_mfn, unsigned long console_mfn,
+void *data);
+
 /* callback to restore toolstack specific data */
 int (*toolstack_restore)(uint32_t domid, const uint8_t *buf,
 uint32_t size, void* data);
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 982a70e..5e2efd8 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -524,7 +524,10 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
 if ( rc )
 goto err;
 
-/* TODO: call restore_results */
+/* call restore_results */
+ctx->restore.callbacks->restore_results(ctx->restore.xenstore_gfn,
+ctx->restore.console_gfn,
+ctx->restore.callbacks->data);
 
 /* Resume secondary vm */
 ret = ctx->restore.callbacks->postcopy(ctx->restore.callbacks->data);
@@ -793,7 +796,8 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
 /* this is COLO restore */
 assert(callbacks->suspend &&
callbacks->checkpoint &&
-   callbacks->postcopy);
+   callbacks->postcopy &&
+   callbacks->restore_results);
 }
 
 IPRINTF("In experimental %s", __func__);
diff --git a/tools/libxl/libxl_colo_restore.c b/tools/libxl/libxl_colo_restore.c
index 6c39758..c613c15 100644
--- a/tools/libxl/libxl_colo_restore.c
+++ b/tools/libxl/libxl_colo_restore.c
@@ -153,11 +153,6 @@ static void colo_resume_vm(libxl__egc *egc,
 return;
 }
 
-/*
- * TODO: get store mfn and console mfn
- *  We should call the callback restore_results in
- *  xc_domain_restore() before resuming the guest.
- */
 libxl__xc_domain_restore_done(egc, dcs, 0, 0, 0);
 
 return;
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 1548b70..6e307f3 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1157,6 +1157,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
 rc = ERROR_INVAL;
 goto out;
 }
+callbacks->restore_results = libxl__srm_callout_callback_restore_results;
 
 if (checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_COLO) {
 crs->ao = ao;
diff --git a/tools/libxl/libxl_save_msgs_gen.pl b/tools/libxl/libxl_save_msgs_gen.pl
index fbb2d67..2ecd25d 100755
--- a/tools/libxl/libxl_save_msgs_gen.pl
+++ b/tools/libxl/libxl_save_msgs_gen.pl
@@ -32,7 +32,7 @@ our @msgs = (
 #toolstack_save  done entirely `by hand'
 [  7, 'rcxW',   "toolstack_restore", [qw(uint32_t domid
 BLOCK tsdata)] ],
-[  8, 'r',  "restore_results",   ['unsigned long', 'store_mfn',
+[  8, 'rcx',"restore_results",   ['unsigned long', 'store_mfn',
   'unsigned long', 'console_mfn'] 
],
 [  9, 'srW',"complete",  [qw(int retval
  int errnoval)] ],
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v6 COLO 09/15] COLO: use qemu block replication

2015-06-07 Thread Yang Hongyang
From: Wen Congyang 

The guest should be paused before doing COLO!!!

Signed-off-by: Wen Congyang 
---
 tools/libxl/Makefile |   1 +
 tools/libxl/libxl_colo_qdisk.c   | 209 +++
 tools/libxl/libxl_colo_restore.c |  21 +++-
 tools/libxl/libxl_colo_save.c|  36 ++-
 tools/libxl/libxl_internal.h |  18 
 tools/libxl/libxl_qmp.c  |  31 ++
 6 files changed, 312 insertions(+), 4 deletions(-)
 create mode 100644 tools/libxl/libxl_colo_qdisk.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 88c5426..d93b271 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -58,6 +58,7 @@ endif
 
 LIBXL_OBJS-y += libxl_remus.o libxl_checkpoint_device.o libxl_remus_disk_drbd.o
 LIBXL_OBJS-y += libxl_colo_restore.o libxl_colo_save.o
+LIBXL_OBJS-y += libxl_colo_qdisk.o
 
 LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
 LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl_colo_qdisk.c b/tools/libxl/libxl_colo_qdisk.c
new file mode 100644
index 000..d73572e
--- /dev/null
+++ b/tools/libxl/libxl_colo_qdisk.c
@@ -0,0 +1,209 @@
+/*
+ * Copyright (C) 2015 FUJITSU LIMITED
+ * Author: Wen Congyang 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+typedef struct libxl__colo_qdisk {
+libxl__checkpoint_device *dev;
+} libxl__colo_qdisk;
+
+/* == init() and cleanup() == */
+int init_subkind_qdisk(libxl__checkpoint_devices_state *cds)
+{
+/*
+ * We don't know if we use qemu block replication, so
+ * we cannot start block replication here.
+ */
+return 0;
+}
+
+void cleanup_subkind_qdisk(libxl__checkpoint_devices_state *cds)
+{
+}
+
+/* == setup() and teardown() == */
+static void colo_qdisk_setup(libxl__egc *egc, libxl__checkpoint_device *dev,
+ bool primary)
+{
+const libxl_device_disk *disk = dev->backend_dev;
+const char *addr = NULL;
+const char *export_name;
+int ret, rc = 0;
+
+/* Convenience aliases */
+libxl__checkpoint_devices_state *const cds = dev->cds;
+const char *colo_params = disk->colo_params;
+const int domid = cds->domid;
+
+EGC_GC;
+
+if (disk->backend != LIBXL_DISK_BACKEND_QDISK ||
+!libxl_defbool_val(disk->colo_enable)) {
+rc = ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH;
+goto out;
+}
+
+export_name = strstr(colo_params, ":exportname=");
+if (!export_name) {
+rc = ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH;
+goto out;
+}
+export_name += strlen(":exportname=");
+if (export_name[0] == 0) {
+rc = ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH;
+goto out;
+}
+
+dev->matched = 1;
+
+if (primary) {
+/* NBD server is not ready, so we cannot start block replication now */
+goto out;
+} else {
+libxl__colo_restore_state *crs = CONTAINER_OF(cds, *crs, cds);
+int len;
+
+if (crs->qdisk_setuped)
+goto out;
+
+crs->qdisk_setuped = true;
+
+len = export_name - strlen(":exportname=") - colo_params;
+addr = libxl__strndup(gc, colo_params, len);
+}
+
+ret = libxl__qmp_block_start_replication(gc, domid, primary, addr);
+if (ret)
+rc = ERROR_FAIL;
+
+out:
+dev->aodev.rc = rc;
+dev->aodev.callback(egc, &dev->aodev);
+}
+
+static void colo_qdisk_teardown(libxl__egc *egc, libxl__checkpoint_device *dev,
+bool primary)
+{
+int ret, rc = 0;
+
+/* Convenience aliases */
+libxl__checkpoint_devices_state *const cds = dev->cds;
+const int domid = cds->domid;
+
+EGC_GC;
+
+if (primary) {
+libxl__colo_save_state *css = CONTAINER_OF(cds, *css, cds);
+
+if (!css->qdisk_setuped)
+goto out;
+
+css->qdisk_setuped = false;
+} else {
+libxl__colo_restore_state *crs = CONTAINER_OF(cds, *crs, cds);
+
+if (!crs->qdisk_setuped)
+goto out;
+
+crs->qdisk_setuped = false;
+}
+
+ret = libxl__qmp_block_stop_replication(gc, domid, primary);
+if (ret)
+rc = ERROR_FAIL;
+
+out:
+dev->aodev.rc = rc;
+dev->aodev.callback(egc, &dev->aodev);
+}
+
+/* == checkpointing APIs == */
+/* should be called after libxl__checkpoint_device_instance_o

[Xen-devel] [PATCH v6 COLO 01/15] docs: add colo readme

2015-06-07 Thread Yang Hongyang
add colo readme, refer to
http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping

Signed-off-by: Yang Hongyang 
---
 docs/README.colo | 9 +
 1 file changed, 9 insertions(+)
 create mode 100644 docs/README.colo

diff --git a/docs/README.colo b/docs/README.colo
new file mode 100644
index 000..466eb72
--- /dev/null
+++ b/docs/README.colo
@@ -0,0 +1,9 @@
+COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop Service)
+project is a high availability solution. Both primary VM (PVM) and secondary VM
+(SVM) run in parallel. They receive the same request from client, and generate
+response in parallel too. If the response packets from PVM and SVM are
+identical, they are released immediately. Otherwise, a VM checkpoint (on demand)
+is conducted.
+
+See the website at http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
+for details.
-- 
1.9.1




[Xen-devel] [PATCH v6 COLO 06/15] libxc/save: support COLO save

2015-06-07 Thread Yang Hongyang
Call callbacks->get_dirty_pfn() after suspending the primary vm to
get the pages dirtied on the secondary vm, and send every page that is
dirty on the primary or on the secondary to the secondary.
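
For illustration only (not part of this patch): merging the secondary's dirty pfn list into the primary's dirty bitmap could look roughly like this. demo_merge_dirty_pfns() is a hypothetical name, and the list layout (entry 0 holds the count) is taken from the way the patch reads it. The sketch assumes valid pfns are 0 .. p2m_size - 1, so it also rejects pfn == p2m_size.

```c
#include <limits.h>
#include <stdint.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical sketch: pfn_list[0] holds the number of entries that
 * follow. Each listed pfn is OR-ed into the bitmap, so the result is the
 * union of the pages already marked and the pages in the list.
 * Returns 0 on success, or -1 if a pfn lies outside [0, p2m_size).
 * On failure the bitmap may already contain the earlier entries.
 */
int demo_merge_dirty_pfns(const uint64_t *pfn_list, unsigned long p2m_size,
                          unsigned long *bitmap)
{
    uint64_t count = pfn_list[0];
    uint64_t i;

    for (i = 0; i < count; i++) {
        uint64_t pfn = pfn_list[i + 1];

        if (pfn >= p2m_size)
            return -1;          /* out-of-range pfn from the secondary */

        bitmap[pfn / DEMO_BITS_PER_LONG] |=
            1UL << (pfn % DEMO_BITS_PER_LONG);
    }
    return 0;
}
```

Because the pfns are OR-ed in, a page ends up being sent if either side dirtied it since the last checkpoint.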

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
CC: Andrew Cooper 
---
 tools/libxc/xc_sr_save.c | 49 +++-
 1 file changed, 48 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index d63b783..cda61ed 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -515,6 +515,31 @@ static int send_memory_live(struct xc_sr_context *ctx)
 return rc;
 }
 
+static int update_dirty_bitmap(uint8_t *(*get_dirty_pfn)(void *), void *data,
+   unsigned long p2m_size, unsigned long *bitmap)
+{
+uint64_t *pfn_list;
+uint64_t count, i;
+uint64_t pfn;
+
+pfn_list = (uint64_t *)get_dirty_pfn(data);
+assert(pfn_list);
+
+count = pfn_list[0];
+for (i = 0; i < count; i++) {
+pfn = pfn_list[i + 1];
+if (pfn > p2m_size) {
+errno = EINVAL;
+return -1;
+}
+
+set_bit(pfn, bitmap);
+}
+
+free(pfn_list);
+return 0;
+}
+
 /*
  * Suspend the domain and send dirty memory.
  * This is the last iteration of the live migration and the
@@ -555,6 +580,19 @@ static int suspend_and_send_dirty(struct xc_sr_context *ctx)
 
 bitmap_or(dirty_bitmap, ctx->save.deferred_pages, ctx->save.p2m_size);
 
+if ( !ctx->save.live && ctx->save.callbacks->get_dirty_pfn )
+{
+rc = update_dirty_bitmap(ctx->save.callbacks->get_dirty_pfn,
+ ctx->save.callbacks->data,
+ ctx->save.p2m_size,
+ dirty_bitmap);
+if ( rc )
+{
+PERROR("Failed to get secondary vm's dirty pages");
+goto out;
+}
+}
+
 rc = send_dirty_pages(ctx, stats.dirty_count + ctx->save.nr_deferred_pages);
 if ( rc )
 goto out;
@@ -784,7 +822,16 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
 if ( rc )
 goto err;
 
-ctx->save.callbacks->postcopy(ctx->save.callbacks->data);
+rc = ctx->save.callbacks->postcopy(ctx->save.callbacks->data);
+if ( !rc ) {
+if ( !errno )
+{
+/* Postcopy request failed (without errno, using EINVAL) */
+errno = EINVAL;
+}
+rc = -1;
+goto err;
+}
 
 rc = ctx->save.callbacks->checkpoint(ctx->save.callbacks->data);
 if ( rc <= 0 )
-- 
1.9.1




[Xen-devel] [PATCH v6 COLO 14/15] setup and control colo proxy on secondary side

2015-06-07 Thread Yang Hongyang
setup and control colo proxy on secondary side

Signed-off-by: Yang Hongyang 
---
 tools/libxl/libxl_colo_restore.c | 28 +---
 tools/libxl/libxl_internal.h |  3 +++
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl_colo_restore.c b/tools/libxl/libxl_colo_restore.c
index 6731bd0..9c659c0 100644
--- a/tools/libxl/libxl_colo_restore.c
+++ b/tools/libxl/libxl_colo_restore.c
@@ -65,9 +65,11 @@ static void libxl__colo_restore_domain_resume_callback(void *data);
 static void libxl__colo_restore_domain_checkpoint_callback(void *data);
 static void libxl__colo_restore_domain_suspend_callback(void *data);
 
+extern const libxl__checkpoint_device_instance_ops colo_restore_device_nic;
 extern const libxl__checkpoint_device_instance_ops colo_restore_device_qdisk;
 
 static const libxl__checkpoint_device_instance_ops *colo_restore_ops[] = {
+&colo_restore_device_nic,
 &colo_restore_device_qdisk,
 NULL,
 };
@@ -167,8 +169,14 @@ static int init_device_subkind(libxl__checkpoint_devices_state *cds)
 int rc;
 STATE_AO_GC(cds->ao);
 
+rc = init_subkind_colo_nic(cds);
+if (rc) goto out;
+
 rc = init_subkind_qdisk(cds);
-if (rc)  goto out;
+if (rc) {
+cleanup_subkind_colo_nic(cds);
+goto out;
+}
 
 rc = 0;
 out:
@@ -180,6 +188,7 @@ static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
 /* cleanup device subkind-specific state in the libxl ctx */
 STATE_AO_GC(cds->ao);
 
+cleanup_subkind_colo_nic(cds);
 cleanup_subkind_qdisk(cds);
 }
 
@@ -398,6 +407,8 @@ static void colo_restore_teardown_done(libxl__egc *egc,
 if (crcs->teardown_devices)
 cleanup_device_subkind(cds);
 
+colo_proxy_teardown(&crs->cps);
+
 rc = crcs->saved_rc;
 if (!rc) {
 crcs->callback = do_failover_done;
@@ -607,6 +618,8 @@ static void colo_restore_preresume_cb(libxl__egc *egc,
 goto out;
 }
 
+colo_proxy_preresume(&crs->cps);
+
 colo_restore_resume_vm(egc, crcs);
 
 return;
@@ -643,6 +656,8 @@ static void colo_resume_vm_done(libxl__egc *egc,
 
 crcs->status = LIBXL_COLO_RESUMED;
 
+colo_proxy_postresume(&crs->cps);
+
 /* avoid calling libxl__xc_domain_restore_done() more than once */
 if (crs->saved_cb) {
 dcs->callback = crs->saved_cb;
@@ -792,13 +807,20 @@ static void colo_setup_checkpoint_devices(libxl__egc *egc,
 
 STATE_AO_GC(crs->ao);
 
-/* TODO: nic support */
-cds->device_kind_flags = (1 << LIBXL__DEVICE_KIND_VBD);
+cds->device_kind_flags = (1 << LIBXL__DEVICE_KIND_VIF) |
+ (1 << LIBXL__DEVICE_KIND_VBD);
 cds->callback = colo_restore_setup_cds_done;
 cds->ao = ao;
 cds->domid = crs->domid;
 cds->ops = colo_restore_ops;
 
+crs->cps.ao = ao;
+if (colo_proxy_setup(&crs->cps)) {
+LOG(ERROR, "COLO: failed to setup colo proxy for guest with domid %u",
+cds->domid);
+goto out;
+}
+
 if (init_device_subkind(cds))
 goto out;
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 0e54865..33bf47b 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3230,6 +3230,9 @@ struct libxl__colo_restore_state {
 
 /* private, used by qdisk block replication */
 bool qdisk_setuped;
+
+/* private, used by colo proxy */
+libxl__colo_proxy_state cps;
 };
 
 struct libxl__domain_create_state {
-- 
1.9.1




[Xen-devel] [PATCH v6 COLO 15/15] cmdline switches and config vars to control colo-proxy

2015-06-07 Thread Yang Hongyang
Add cmdline switches to the 'xl migrate-receive' command to specify
a domain-specific hotplug script to set up the COLO proxy.

Add a new config var 'colo.default.agentscript' to xl.conf, which
allows the user to override the default global script used to
set up the COLO proxy.
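
For illustration only (not part of this patch): the precedence described above can be sketched as below. demo_pick_proxy_script() is a hypothetical helper, and the ordering (command line first, then xl.conf, then the built-in "colo-proxy-setup" under the script directory) is an assumption drawn from this description rather than a statement of what xl does.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: choose the COLO proxy setup script.
 *   1. a domain-specific script given on the command line, if any;
 *   2. otherwise the default configured in xl.conf, if any;
 *   3. otherwise "<script_dir>/colo-proxy-setup".
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int demo_pick_proxy_script(const char *cmdline_script,
                           const char *conf_script,
                           const char *script_dir,
                           char *out, size_t out_len)
{
    int n;

    if (cmdline_script && cmdline_script[0] != '\0')
        n = snprintf(out, out_len, "%s", cmdline_script);
    else if (conf_script && conf_script[0] != '\0')
        n = snprintf(out, out_len, "%s", conf_script);
    else
        n = snprintf(out, out_len, "%s/colo-proxy-setup", script_dir);

    /* snprintf returns the length it wanted to write; detect truncation */
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

Passing NULL for both overrides selects the built-in default, which mirrors the fallback this patch adds in domcreate_bootloader_done().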

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
---
 docs/man/xl.conf.pod.5  |  6 ++
 docs/man/xl.pod.1   |  1 -
 tools/libxl/libxl.c |  6 ++
 tools/libxl/libxl_create.c  | 14 +++--
 tools/libxl/libxl_types.idl |  1 +
 tools/libxl/xl.c|  3 +++
 tools/libxl/xl.h|  1 +
 tools/libxl/xl_cmdimpl.c| 49 ++---
 8 files changed, 66 insertions(+), 15 deletions(-)

diff --git a/docs/man/xl.conf.pod.5 b/docs/man/xl.conf.pod.5
index 8ae19bb..8f7fd28 100644
--- a/docs/man/xl.conf.pod.5
+++ b/docs/man/xl.conf.pod.5
@@ -111,6 +111,12 @@ Configures the default script used by Remus to setup network buffering.
 
 Default: C
 
+=item B
+
+Configures the default script used by COLO to setup colo-proxy.
+
+Default: C
+
 =item B
 
 Configures the default output format used by xl when printing "machine
diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index 1c2ee24..8b425b5 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -454,7 +454,6 @@ N.B: Remus support in xl is still in experimental (proof-of-concept) phase.
  Disk replication support is limited to DRBD disks.
 
  COLO support in xl is still in experimental (proof-of-concept) phase.
- There is no support for network at the moment.
 
 B
 
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 4a5957c..224b54d 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -3375,6 +3375,11 @@ void libxl__device_nic_add(libxl__egc *egc, uint32_t domid,
 flexarray_append(back, nic->ifname);
 }
 
+if (nic->forwarddev) {
+flexarray_append(back, "forwarddev");
+flexarray_append(back, nic->forwarddev);
+}
+
 flexarray_append(back, "mac");
 flexarray_append(back,libxl__sprintf(gc,
 LIBXL_MAC_FMT, LIBXL_MAC_BYTES(nic->mac)));
@@ -3498,6 +3503,7 @@ static int libxl__device_nic_from_xs_be(libxl__gc *gc,
 nic->ip = READ_BACKEND(NOGC, "ip");
 nic->bridge = READ_BACKEND(NOGC, "bridge");
 nic->script = READ_BACKEND(NOGC, "script");
+nic->forwarddev = READ_BACKEND(NOGC, "forwarddev");
 
 /* vif_ioemu nics use the same xenstore entries as vif interfaces */
 tmp = READ_BACKEND(gc, "type");
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 17d0d18..597a64c 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1168,6 +1168,11 @@ static void domcreate_bootloader_done(libxl__egc *egc,
 crs->superpages = superpages;
 crs->pae = pae;
 crs->callback = libxl__colo_restore_setup_done;
+if (dcs->colo_proxy_script)
+crs->colo_proxy_script = libxl__strdup(gc, dcs->colo_proxy_script);
+else
+crs->colo_proxy_script = GCSPRINTF("%s/colo-proxy-setup",
+   libxl__xen_script_dir_path());
 libxl__colo_restore_setup(egc, crs);
 } else
 libxl__xc_domain_restore(egc, dcs,
@@ -1692,6 +1697,7 @@ static void domain_create_cb(libxl__egc *egc,
 static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
 uint32_t *domid, int restore_fd,
 int send_fd, int checkpointed_stream,
+const char *colo_proxy_script,
 const libxl_asyncop_how *ao_how,
 const libxl_asyncprogress_how *aop_console_how)
 {
@@ -1707,6 +1713,7 @@ static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
 cdcs->dcs.send_fd = send_fd;
 cdcs->dcs.callback = domain_create_cb;
 cdcs->dcs.checkpointed_stream = checkpointed_stream;
+cdcs->dcs.colo_proxy_script = colo_proxy_script;
 libxl__ao_progress_gethow(&cdcs->dcs.aop_console_how, aop_console_how);
 cdcs->domid_out = domid;
 
@@ -1750,7 +1757,7 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
 const libxl_asyncprogress_how *aop_console_how)
 {
 unset_disk_colo_restore(d_config);
-return do_domain_create(ctx, d_config, domid, -1, -1, 0,
+return do_domain_create(ctx, d_config, domid, -1, -1, 0, NULL,
 ao_how, aop_console_how);
 }
 
@@ -1761,16 +1768,19 @@ int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config,
 const libxl_asyncprogress_how *aop_console_how)
 {
 int send_fd = -1;
+char *colo_proxy_script = 

Re: [Xen-devel] Xen 4.6 Development Update (five months reminder, 5 WEEKS TO FREEZE)

2015-06-07 Thread Yang Hongyang

On 06/05/2015 09:53 PM, wei.l...@citrix.com wrote:

(Note, please trim your quotes when replying, and also trim the CC list if
necessary. You might also consider changing the subject line of your reply to
"Status of  (Was: Xen 4.6 Development Update (X months reminder)")

Hi all

We are now four months into 4.6 development window. This is an email to keep
track of all the patch series I gathered. It is by no means complete and / or
accurate. Feel free to reply to this email with new projects or correct my
misunderstanding.

= Timeline =

We are planning on a 9-month release cycle, but we could also release a bit
earlier if everything goes well (no blocker, no critical bug).

* Development start: 6 Jan 2015
<=== We are here ===>
* Feature Freeze: 10 Jul 2015
* RCs: TBD
* Release Date: 9 Oct 2015 (could release earlier)

The RCs and release will of course depend on stability and bugs, and
will therefore be fairly unpredictable.

Bug fixes, if acked by the maintainer, can go in anytime before the first
RC. Later on we will need to figure out the risk of regression/reward
to eliminate the possibility of a bug introducing another bug.

= Prognosis =

The states are: none -> fair -> ok -> good -> done

none - nothing yet
fair - still working on it, patches are prototypes or RFC
ok   - patches posted, acting on review
good - some last minute pieces
done - all done, might have bugs

[...]


== Xen toolstack ==

*  Split libxc into multiple libraries (none)
   -  Ian Campbell

*  Remus using migration-v2 (good)
RFC posted - depends on v6 of 'New Migration'
   -  Yang Hongyang


Done.



*  Migration v2 (libxl) (none)
   -  Andrew Cooper


[...]


*  COarse-grain LOck-stepping Virtual Machines in Xen (fair)
RFC v5 posted
   -  Wen Congyang
   -  Gui Jianfeng
   -  Yang Hongyang
   -  Dong, Eddie


This should be ok:
[PATCH v6 COLO 00/15] COarse-grain LOck-stepping Virtual Machines for Non-stop Service





*  tmem migrationv2 patches. (none)
   -  Bob Liu & Andrew Cooper & David Vrabel

*  snapshot API extension (checkpointing disk) (fair)
v10
   -  Chunyan Liu


...

--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 COLOPre 01/13] libxc/restore: fix error handle of process_record

2015-06-08 Thread Yang Hongyang



On 06/08/2015 05:24 PM, Andrew Cooper wrote:

On 08/06/15 04:43, Yang Hongyang wrote:

If the error is RECORD_NOT_PROCESSED and the record is optional,
restore will still fail. This patch fixes that.
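
For illustration only (not part of this patch): the intended decision can be sketched as below. The DEMO_* constants and demo_check_record_rc() are hypothetical stand-ins; the real values and names live in libxc.

```c
#include <stdint.h>

/* Illustrative values only; the real constants are defined in libxc. */
#define DEMO_RECORD_NOT_PROCESSED 1
#define DEMO_REC_TYPE_OPTIONAL    0x80000000U

/*
 * Decide what the restore loop should do with the return value of a
 * record handler. Returns 0 to carry on with the next record, -1 to
 * abort the restore.
 */
int demo_check_record_rc(int rc, uint32_t rec_type)
{
    if (rc == DEMO_RECORD_NOT_PROCESSED) {
        if (rec_type & DEMO_REC_TYPE_OPTIONAL)
            return 0;   /* unhandled optional record: ignore it */
        return -1;      /* unhandled mandatory record: fatal */
    }
    return rc ? -1 : 0; /* any other non-zero rc is an error */
}
```

The case the commit message is about is the first branch: an optional record that nobody handled must not turn into a failed restore.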

Signed-off-by: Yang Hongyang 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
---
  tools/libxc/xc_sr_restore.c | 28 ++--
  1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 9e27dba..2d2edd3 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -560,19 +560,6 @@ static int process_record(struct xc_sr_context *ctx, struct xc_sr_record *rec)
  free(rec->data);
  rec->data = NULL;

-if ( rc == RECORD_NOT_PROCESSED )
-{
-if ( rec->type & REC_TYPE_OPTIONAL )
-DPRINTF("Ignoring optional record %#x (%s)",
-rec->type, rec_type_to_str(rec->type));


You would be best setting rc to 0 here, rather than moving the logic out
of process_record().


There will be another error type in COLO, which indicates a failover and
needs to be handled in restore(), so I moved the error handling down to
avoid duplicated code. Otherwise RECORD_NOT_PROCESSED would be handled in
process_record(), while the other error type returned from process_record()
would be handled in restore().



~Andrew


-else
-{
-ERROR("Mandatory record %#x (%s) not handled",
-  rec->type, rec_type_to_str(rec->type));
-rc = -1;
-}
-}
-
  return rc;
  }

@@ -678,7 +665,20 @@ static int restore(struct xc_sr_context *ctx)
  else
  {
  rc = process_record(ctx, &rec);
-if ( rc )
+if ( rc == RECORD_NOT_PROCESSED )
+{
+if ( rec.type & REC_TYPE_OPTIONAL )
+DPRINTF("Ignoring optional record %#x (%s)",
+rec.type, rec_type_to_str(rec.type));
+else
+{
+ERROR("Mandatory record %#x (%s) not handled",
+  rec.type, rec_type_to_str(rec.type));
+rc = -1;
+goto err;
+}
+}
+else if ( rc )
  goto err;
  }



.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 COLOPre 03/13] libxc/restore: zero ioreq page only one time

2015-06-08 Thread Yang Hongyang



On 06/08/2015 05:46 PM, Andrew Cooper wrote:

On 08/06/15 04:43, Yang Hongyang wrote:

ioreq page contains evtchn which will be set when we resume the
secondary vm the first time. The hypervisor will check if the
evtchn is corrupted, so we cannot zero the ioreq page more
than one time.

The ioreq->state is always STATE_IOREQ_NONE after the vm is
suspended, so it is OK if we only zero it one time.
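
For illustration only (not part of this patch): the "zero only once" rule can be sketched as below. demo_maybe_clear_ioreq() and DEMO_PAGE_SIZE are hypothetical, and the sketch assumes, as the diff suggests, that buffer_all_records is false only while the initial stream is being processed.

```c
#include <stdbool.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096

/*
 * Hypothetical sketch: clear the ioreq page only during the initial
 * pass. Once records are being buffered (later checkpoints), the page
 * already holds the event channel set up when the secondary first
 * resumed, so it is left alone. Returns true if the page was cleared.
 */
bool demo_maybe_clear_ioreq(unsigned char *page, bool buffer_all_records)
{
    if (buffer_all_records)
        return false;           /* later checkpoints: keep the page */

    memset(page, 0, DEMO_PAGE_SIZE);
    return true;
}
```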

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen congyang 
CC: Andrew Cooper 


The issue here is that we are running the restore algorithm over a
domain which has already been running in Xen for a while.  This is a
brand new usecase, as far as I am aware.


Exactly.



Does the qemu process associated with this domain get frozen while the
secondary is being reset, or does the process get destroyed and recreated?


What do you mean by reset? Do you mean the secondary is suspended at a checkpoint?



I have a gut feeling that it would be safer to clear all of the page
other than the event channel, but that depends on exactly what else is
going on.  What we absolutely don't want is an update to this page
from the primary with an in-progress IOREQ.

~Andrew


---
  tools/libxc/xc_sr_restore_x86_hvm.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/xc_sr_restore_x86_hvm.c b/tools/libxc/xc_sr_restore_x86_hvm.c
index 6f5af0e..06177e0 100644
--- a/tools/libxc/xc_sr_restore_x86_hvm.c
+++ b/tools/libxc/xc_sr_restore_x86_hvm.c
@@ -78,7 +78,8 @@ static int handle_hvm_params(struct xc_sr_context *ctx,
  break;
  case HVM_PARAM_IOREQ_PFN:
  case HVM_PARAM_BUFIOREQ_PFN:
-xc_clear_domain_page(xch, ctx->domid, entry->value);
+if ( !ctx->restore.buffer_all_records )
+xc_clear_domain_page(xch, ctx->domid, entry->value);
  break;
  }



.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 COLOPre 04/13] tools/libxc: export xc_bitops.h

2015-06-08 Thread Yang Hongyang

Just to note that xc_bitops.h needs cleanup, as Andy pointed out in v1.
It will be done in v3.

On 06/08/2015 11:43 AM, Yang Hongyang wrote:

When we are under COLO, we will send dirty page bitmap info from
secondary to primary at every checkpoint. So we need to get/test
the dirty page bitmap. We just expose xc_bitops.h for libxl use.

NOTE:
   Need to make clean and rerun configure to get it compiled.

Signed-off-by: Yang Hongyang 
---
  tools/libxc/include/xc_bitops.h | 76 +
  tools/libxc/xc_bitops.h | 76 -
  2 files changed, 76 insertions(+), 76 deletions(-)
  create mode 100644 tools/libxc/include/xc_bitops.h
  delete mode 100644 tools/libxc/xc_bitops.h

diff --git a/tools/libxc/include/xc_bitops.h b/tools/libxc/include/xc_bitops.h
new file mode 100644
index 000..cd749f4
--- /dev/null
+++ b/tools/libxc/include/xc_bitops.h
@@ -0,0 +1,76 @@
+#ifndef XC_BITOPS_H
+#define XC_BITOPS_H 1
+
+/* bitmap operations for single threaded access */
+
+#include <stdlib.h>
+#include <string.h>
+
+#define BITS_PER_LONG (sizeof(unsigned long) * 8)
+#define ORDER_LONG (sizeof(unsigned long) == 4 ? 5 : 6)
+
+#define BITMAP_ENTRY(_nr,_bmap) ((_bmap))[(_nr)/BITS_PER_LONG]
+#define BITMAP_SHIFT(_nr) ((_nr) % BITS_PER_LONG)
+
+/* calculate required space for number of longs needed to hold nr_bits */
+static inline int bitmap_size(int nr_bits)
+{
+int nr_long, nr_bytes;
+nr_long = (nr_bits + BITS_PER_LONG - 1) >> ORDER_LONG;
+nr_bytes = nr_long * sizeof(unsigned long);
+return nr_bytes;
+}
+
+static inline unsigned long *bitmap_alloc(int nr_bits)
+{
+return calloc(1, bitmap_size(nr_bits));
+}
+
+static inline void bitmap_set(unsigned long *addr, int nr_bits)
+{
+memset(addr, 0xff, bitmap_size(nr_bits));
+}
+
+static inline void bitmap_clear(unsigned long *addr, int nr_bits)
+{
+memset(addr, 0, bitmap_size(nr_bits));
+}
+
+static inline int test_bit(int nr, unsigned long *addr)
+{
+return (BITMAP_ENTRY(nr, addr) >> BITMAP_SHIFT(nr)) & 1;
+}
+
+static inline void clear_bit(int nr, unsigned long *addr)
+{
+BITMAP_ENTRY(nr, addr) &= ~(1UL << BITMAP_SHIFT(nr));
+}
+
+static inline void set_bit(int nr, unsigned long *addr)
+{
+BITMAP_ENTRY(nr, addr) |= (1UL << BITMAP_SHIFT(nr));
+}
+
+static inline int test_and_clear_bit(int nr, unsigned long *addr)
+{
+int oldbit = test_bit(nr, addr);
+clear_bit(nr, addr);
+return oldbit;
+}
+
+static inline int test_and_set_bit(int nr, unsigned long *addr)
+{
+int oldbit = test_bit(nr, addr);
+set_bit(nr, addr);
+return oldbit;
+}
+
+static inline void bitmap_or(unsigned long *dst, const unsigned long *other,
+ int nr_bits)
+{
+int i, nr_longs = (bitmap_size(nr_bits) / sizeof(unsigned long));
+for ( i = 0; i < nr_longs; ++i )
+dst[i] |= other[i];
+}
+
+#endif  /* XC_BITOPS_H */
diff --git a/tools/libxc/xc_bitops.h b/tools/libxc/xc_bitops.h
deleted file mode 100644
index cd749f4..000
--- a/tools/libxc/xc_bitops.h
+++ /dev/null
@@ -1,76 +0,0 @@
-#ifndef XC_BITOPS_H
-#define XC_BITOPS_H 1
-
-/* bitmap operations for single threaded access */
-
-#include <stdlib.h>
-#include <string.h>
-
-#define BITS_PER_LONG (sizeof(unsigned long) * 8)
-#define ORDER_LONG (sizeof(unsigned long) == 4 ? 5 : 6)
-
-#define BITMAP_ENTRY(_nr,_bmap) ((_bmap))[(_nr)/BITS_PER_LONG]
-#define BITMAP_SHIFT(_nr) ((_nr) % BITS_PER_LONG)
-
-/* calculate required space for number of longs needed to hold nr_bits */
-static inline int bitmap_size(int nr_bits)
-{
-int nr_long, nr_bytes;
-nr_long = (nr_bits + BITS_PER_LONG - 1) >> ORDER_LONG;
-nr_bytes = nr_long * sizeof(unsigned long);
-return nr_bytes;
-}
-
-static inline unsigned long *bitmap_alloc(int nr_bits)
-{
-return calloc(1, bitmap_size(nr_bits));
-}
-
-static inline void bitmap_set(unsigned long *addr, int nr_bits)
-{
-memset(addr, 0xff, bitmap_size(nr_bits));
-}
-
-static inline void bitmap_clear(unsigned long *addr, int nr_bits)
-{
-memset(addr, 0, bitmap_size(nr_bits));
-}
-
-static inline int test_bit(int nr, unsigned long *addr)
-{
-return (BITMAP_ENTRY(nr, addr) >> BITMAP_SHIFT(nr)) & 1;
-}
-
-static inline void clear_bit(int nr, unsigned long *addr)
-{
-BITMAP_ENTRY(nr, addr) &= ~(1UL << BITMAP_SHIFT(nr));
-}
-
-static inline void set_bit(int nr, unsigned long *addr)
-{
-BITMAP_ENTRY(nr, addr) |= (1UL << BITMAP_SHIFT(nr));
-}
-
-static inline int test_and_clear_bit(int nr, unsigned long *addr)
-{
-int oldbit = test_bit(nr, addr);
-clear_bit(nr, addr);
-return oldbit;
-}
-
-static inline int test_and_set_bit(int nr, unsigned long *addr)
-{
-int oldbit = test_bit(nr, addr);
-set_bit(nr, addr);
-return oldbit;
-}
-
-static inline void bitmap_or(unsigned long *dst, const unsigned long *other,
- int nr_bits)
-{
-int i, nr_longs = (bitmap_size(nr_

Re: [Xen-devel] [PATCH v6 COLO 04/15] libxc/restore: support COLO restore

2015-06-08 Thread Yang Hongyang



On 06/08/2015 06:39 PM, Andrew Cooper wrote:

On 08/06/15 04:45, Yang Hongyang wrote:

Call the resume/checkpoint/suspend callbacks while the secondary vm's
status is consistent with the primary's.

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
CC: Andrew Cooper 
---
  tools/libxc/xc_sr_common.h  | 11 +--
  tools/libxc/xc_sr_restore.c | 63 -
  tools/libxc/xc_sr_restore_x86_hvm.c |  1 +
  3 files changed, 72 insertions(+), 3 deletions(-)

diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 565c5da..382bf76 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -132,8 +132,11 @@ struct xc_sr_restore_ops
   *
   * @return 0 for success, -1 for failure, or the sentinel value
   * RECORD_NOT_PROCESSED.
+ * BROKEN_CHANNEL: if we are under Remus/COLO, this means master may dead,
+ * we will failover.


"this means that the master"


Thanks.




   */
  #define RECORD_NOT_PROCESSED 1
+#define BROKEN_CHANNEL 2
  int (*process_record)(struct xc_sr_context *ctx, struct xc_sr_record 
*rec);

  /**
@@ -205,8 +208,12 @@ struct xc_sr_context
  uint32_t guest_type;
  uint32_t guest_page_size;

-/* Plain VM, or checkpoints over time. */
-bool checkpointed;
+/*
+ * 0: Plain VM
+ * 1: Remus
+ * 2: COLO
+ */
+int checkpointed;


I think this would be nicer as

enum {
STREAM_PLAIN,
STREAM_REMUS,
STREAM_COLO,
} stream;

perhaps?  It would reduce the use of a magic 2 in the code.


This is another place that I missed, good catch, and it's better, thank you.





  /* Currently buffering records between a checkpoint */
  bool buffer_all_records;
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 2d2edd3..982a70e 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -1,4 +1,5 @@
  #include 
+#include 

  #include "xc_sr_common.h"

@@ -472,7 +473,7 @@ static int process_record(struct xc_sr_context *ctx, struct 
xc_sr_record *rec);
  static int handle_checkpoint(struct xc_sr_context *ctx)
  {
  xc_interface *xch = ctx->xch;
-int rc = 0;
+int rc = 0, ret;
  unsigned i;

  if ( !ctx->restore.checkpointed )
@@ -498,6 +499,46 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
  else
  ctx->restore.buffer_all_records = true;

+if ( ctx->restore.checkpointed == 2 )
+{
+#define HANDLE_CALLBACK_RETURN_VALUE(ret)   \


I would ideally like to avoid macros like this in an effort to avoid the
code slipping back into the state that the legacy code was in, but at
least it is local to the area used.


+do {\
+if ( ret == 0 ) \
+{   \
+/* Some internal error happens */   \
+rc = -1;\
+goto err;   \
+}   \
+else if ( ret == 2 )\
+{   \
+/* Reading/writing error, do failover */\
+rc = BROKEN_CHANNEL;\
+goto err;   \
+}   \
+} while (0)


This should have the logic inverted somewhat, to cover all possible
values of ret, including the negative half.


yes, will fix in the next version.



e.g.

if ( ret == 1 )
 rc = 0; /* Success */
else
{
 if ( ret == 2 )
 rc = BROKEN_CHANNEL;
 else
 rc = -1; /* Some unspecified error */
 goto err;
}
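
One way to get the same coverage without a macro is a small helper that maps
the callback return value to rc. This is only a sketch: sk_callback_rc() and
its self-test are hypothetical names, and the meaning of the return values
(1 for success, 2 for a broken channel, anything else an error) is taken from
the snippet above.

```c
#include <assert.h>

#define BROKEN_CHANNEL 2 /* value taken from the patch under review */

/*
 * Hypothetical helper: translate a callback return value into rc.
 * Every value other than 1 and 2, including negatives, is an error.
 */
static int sk_callback_rc(int ret)
{
    if ( ret == 1 )
        return 0;              /* Success */
    if ( ret == 2 )
        return BROKEN_CHANNEL; /* Reading/writing error, do failover */
    return -1;                 /* Some unspecified error */
}

/* Usage example covering success, failover, and both error halves. */
static void sk_callback_rc_selftest(void)
{
    assert(sk_callback_rc(1) == 0);
    assert(sk_callback_rc(2) == BROKEN_CHANNEL);
    assert(sk_callback_rc(0) == -1);
    assert(sk_callback_rc(-1) == -1);
    assert(sk_callback_rc(3) == -1);
}
```

The caller would then do rc = sk_callback_rc(ret); if ( rc ) goto err; after
each callback, which keeps the control flow visible at the call site.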


+
+/* COLO */
+
+/* We need to resume guest */
+rc = ctx->restore.ops.stream_complete(ctx);
+if ( rc )
+goto err;
+
+/* TODO: call restore_results */
+
+/* Resume secondary vm */
+ret = ctx->restore.callbacks->postcopy(ctx->restore.callbacks->data);
+HANDLE_CALLBACK_RETURN_VALUE(ret);
+
+/* wait for new checkpoint */
+ret = ctx->restore.callbacks->checkpoint(ctx->restore.callbacks->data);
+HANDLE_CALLBACK_RETURN_VALUE(ret);
+
+/* suspend secondary vm */
+ret = ctx->restore.callbacks->suspend(ctx->restore.callbacks->data);
+HANDLE_CALLBACK_RETURN_VALUE(ret);


Please #undef HANDLE_CALLBACK_RETURN_VALUE here.


OK.




+}
+
   err:
  return rc;
  }
@@ -678,6 +719,8 @@ static int restore(struct xc_sr_context *ctx)
  goto err;
  }
  }
+ 

Re: [Xen-devel] [PATCH v6 COLO 05/15] send store mfn and console mfn to xl before resuming secondary vm

2015-06-08 Thread Yang Hongyang



On 06/08/2015 08:16 PM, Andrew Cooper wrote:

On 08/06/15 04:45, Yang Hongyang wrote:

From: Wen Congyang 

We will call libxl__xc_domain_restore_done() to rebuild the secondary vm, but
we need the store mfn and console mfn when rebuilding it. So make
restore_results a function pointer in the callbacks struct and in struct
{save,restore}_callbacks, and use this callback to send the store mfn and
console mfn to xl.

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
CC: Andrew Cooper 
---
  tools/libxc/include/xenguest.h | 8 
  tools/libxc/xc_sr_restore.c| 8 ++--
  tools/libxl/libxl_colo_restore.c   | 5 -
  tools/libxl/libxl_create.c | 1 +
  tools/libxl/libxl_save_msgs_gen.pl | 2 +-
  5 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index d5902a6..50096b9 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -130,6 +130,14 @@ struct restore_callbacks {
  /* Enable qemu-dm logging dirty pages to xen */
  int (*switch_qemu_logdirty)(int domid, unsigned enable, void *data); /* 
HVM only */

+/*
+ * callback to send store mfn and console mfn to xl
+ * if we want to resume vm before xc_domain_save()
+ * exits.
+ */
+void (*restore_results)(unsigned long store_mfn, unsigned long console_mfn,
+void *data);
+
  /* callback to restore toolstack specific data */
  int (*toolstack_restore)(uint32_t domid, const uint8_t *buf,
  uint32_t size, void* data);
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 982a70e..5e2efd8 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -524,7 +524,10 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
  if ( rc )
  goto err;

-/* TODO: call restore_results */
+/* call restore_results */


I would drop this comment.  It is entirely redundant now.


will do, thanks.



Otherwise, looks good.

~Andrew


+ctx->restore.callbacks->restore_results(ctx->restore.xenstore_gfn,
+ctx->restore.console_gfn,
+ctx->restore.callbacks->data);

  /* Resume secondary vm */
  ret = ctx->restore.callbacks->postcopy(ctx->restore.callbacks->data);
@@ -793,7 +796,8 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, 
uint32_t dom,
  /* this is COLO restore */
  assert(callbacks->suspend &&
 callbacks->checkpoint &&
-   callbacks->postcopy);
+   callbacks->postcopy &&
+   callbacks->restore_results);
  }

  IPRINTF("In experimental %s", __func__);
diff --git a/tools/libxl/libxl_colo_restore.c b/tools/libxl/libxl_colo_restore.c
index 6c39758..c613c15 100644
--- a/tools/libxl/libxl_colo_restore.c
+++ b/tools/libxl/libxl_colo_restore.c
@@ -153,11 +153,6 @@ static void colo_resume_vm(libxl__egc *egc,
  return;
  }

-/*
- * TODO: get store mfn and console mfn
- *  We should call the callback restore_results in
- *  xc_domain_restore() before resuming the guest.
- */
  libxl__xc_domain_restore_done(egc, dcs, 0, 0, 0);

  return;
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 1548b70..6e307f3 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1157,6 +1157,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
  rc = ERROR_INVAL;
  goto out;
  }
+callbacks->restore_results = libxl__srm_callout_callback_restore_results;

  if (checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_COLO) {
  crs->ao = ao;
diff --git a/tools/libxl/libxl_save_msgs_gen.pl b/tools/libxl/libxl_save_msgs_gen.pl
index fbb2d67..2ecd25d 100755
--- a/tools/libxl/libxl_save_msgs_gen.pl
+++ b/tools/libxl/libxl_save_msgs_gen.pl
@@ -32,7 +32,7 @@ our @msgs = (
  #toolstack_save  done entirely `by hand'
  [  7, 'rcxW',   "toolstack_restore", [qw(uint32_t domid
  BLOCK tsdata)] ],
-[  8, 'r',  "restore_results",   ['unsigned long', 'store_mfn',
+[  8, 'rcx',"restore_results",   ['unsigned long', 'store_mfn',
'unsigned long', 'console_mfn'] 
],
  [  9, 'srW',"complete",  [qw(int retval
   int errnoval)] ],





--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 COLOPre 03/13] libxc/restore: zero ioreq page only one time

2015-06-08 Thread Yang Hongyang



On 06/08/2015 06:15 PM, Andrew Cooper wrote:

On 08/06/15 10:58, Yang Hongyang wrote:



On 06/08/2015 05:46 PM, Andrew Cooper wrote:

On 08/06/15 04:43, Yang Hongyang wrote:

ioreq page contains evtchn which will be set when we resume the
secondary vm the first time. The hypervisor will check if the
evtchn is corrupted, so we cannot zero the ioreq page more
than one time.

The ioreq->state is always STATE_IOREQ_NONE after the vm is
suspended, so it is OK if we only zero it one time.

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen congyang 
CC: Andrew Cooper 


The issue here is that we are running the restore algorithm over a
domain which has already been running in Xen for a while.  This is a
brand new usecase, as far as I am aware.


Exactly.



Does the qemu process associated with this domain get frozen while the
secondary is being reset, or does the process get destroyed and
recreated.


What do you mean by reset? Do you mean that the secondary is suspended at
the checkpoint?


Well - at the point that the buffered records are being processed, we
are in the process of resetting the state of the secondary to match the
primary.


Yes, at this point the qemu process associated with this domain is frozen.
The suspend callback calls libxl__qmp_stop() (vm_stop() in qemu) to pause
qemu. After we have processed all records, qemu is restored with the received
state; that is why we add a libxl__qmp_restore() API (qemu_load_vmstate() in
qemu) to restore qemu with the received state. Currently in libxl, qemu can
only be started with the received state; there is no API to load received
state into a qemu that has already been running for a while.



~Andrew





I have a gut feeling that it would be safer to clear all of the page
other than the event channel, but that depends on exactly what else is
going on.  What we absolutely don't want is an update to this page
from the primary with an in-progress IOREQ.

~Andrew


---
   tools/libxc/xc_sr_restore_x86_hvm.c | 3 ++-
   1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/xc_sr_restore_x86_hvm.c
b/tools/libxc/xc_sr_restore_x86_hvm.c
index 6f5af0e..06177e0 100644
--- a/tools/libxc/xc_sr_restore_x86_hvm.c
+++ b/tools/libxc/xc_sr_restore_x86_hvm.c
@@ -78,7 +78,8 @@ static int handle_hvm_params(struct xc_sr_context
*ctx,
   break;
   case HVM_PARAM_IOREQ_PFN:
   case HVM_PARAM_BUFIOREQ_PFN:
-xc_clear_domain_page(xch, ctx->domid, entry->value);
+if ( !ctx->restore.buffer_all_records )
+xc_clear_domain_page(xch, ctx->domid, entry->value);
   break;
   }











--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v6 COLO 06/15] libxc/save: support COLO save

2015-06-08 Thread Yang Hongyang



On 06/08/2015 09:04 PM, Andrew Cooper wrote:

On 08/06/15 04:45, Yang Hongyang wrote:

Call callbacks->get_dirty_pfn() after suspending the primary vm to
get the dirty pages on the secondary vm, and send the pages that are dirty
on either the primary or the secondary to the secondary.

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
CC: Andrew Cooper 
---
  tools/libxc/xc_sr_save.c | 49 +++-
  1 file changed, 48 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index d63b783..cda61ed 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -515,6 +515,31 @@ static int send_memory_live(struct xc_sr_context *ctx)
  return rc;
  }

+static int update_dirty_bitmap(uint8_t *(*get_dirty_pfn)(void *), void *data,
+   unsigned long p2m_size, unsigned long *bitmap)


This function should take a ctx rather than having the caller expand 3
parameters.  Also, "update_dirty_bitmap" is a little misleading, as it
isn't querying the hypervisor for the dirty bitmap.


ok.




+{
+uint64_t *pfn_list;
+uint64_t count, i;
+uint64_t pfn;
+
+pfn_list = (uint64_t *)get_dirty_pfn(data);


This looks like a recipe for width-errors.  The get_dirty_pfn() call
should take a pointer to a struct for it to fill.


But the size is unknown to the caller. pfn_list[0] is the count of
pfns.




+assert(pfn_list);


This should turn into an error rather than an abort().


Even if there are no dirty pages on the secondary, pfn_list shouldn't be
NULL; pfn_list[0] will just be 0. If pfn_list is NULL,
an unexpected error must have happened.




+
+count = pfn_list[0];
+for (i = 0; i < count; i++) {


style


+pfn = pfn_list[i + 1];
+if (pfn > p2m_size) {
+errno = EINVAL;
+return -1;
+}
+
+set_bit(pfn, bitmap);
+}
+
+free(pfn_list);
+return 0;
+}
+
  /*
   * Suspend the domain and send dirty memory.
   * This is the last iteration of the live migration and the
@@ -555,6 +580,19 @@ static int suspend_and_send_dirty(struct xc_sr_context 
*ctx)

  bitmap_or(dirty_bitmap, ctx->save.deferred_pages, ctx->save.p2m_size);

+if ( !ctx->save.live && ctx->save.callbacks->get_dirty_pfn )
+{


Shouldn't get_dirty_pfn be mandatory for COLO streams (even if it is a
noop to start with) ?


It should be mandatory; it shouldn't be a noop under COLO. Perhaps we should
add a sanity check at the beginning. The problem is that the save side does not
have a param passed from libxl to indicate the stream type (like
checkpointed_stream on the restore side). So do we need to add another XCFLAGS?
Currently there is XCFLAGS_CHECKPOINTED, which represents Remus; we might need
to change this to
XCFLAGS_STREAM_REMUS
XCFLAGS_STREAM_COLO
so that we know what kind of stream we are handling?



~Andrew


+rc = update_dirty_bitmap(ctx->save.callbacks->get_dirty_pfn,
+ ctx->save.callbacks->data,
+ ctx->save.p2m_size,
+ dirty_bitmap);
+if ( rc )
+{
+PERROR("Failed to get secondary vm's dirty pages");
+goto out;
+}
+}
+
  rc = send_dirty_pages(ctx, stats.dirty_count + 
ctx->save.nr_deferred_pages);
  if ( rc )
  goto out;
@@ -784,7 +822,16 @@ static int save(struct xc_sr_context *ctx, uint16_t 
guest_type)
  if ( rc )
  goto err;

-ctx->save.callbacks->postcopy(ctx->save.callbacks->data);
+rc = ctx->save.callbacks->postcopy(ctx->save.callbacks->data);
+if ( !rc ) {
+if ( !errno )
+{
+/* Postcopy request failed (without errno, using EINVAL) */
+errno = EINVAL;
+}
+rc = -1;
+goto err;
+}

  rc = ctx->save.callbacks->checkpoint(ctx->save.callbacks->data);
  if ( rc <= 0 )





--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v6 COLO 06/15] libxc/save: support COLO save

2015-06-08 Thread Yang Hongyang



On 06/08/2015 09:04 PM, Andrew Cooper wrote:

On 08/06/15 04:45, Yang Hongyang wrote:

Call callbacks->get_dirty_pfn() after suspending the primary vm to
get the dirty pages on the secondary vm, and send the pages that are dirty
on either the primary or the secondary to the secondary.

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
CC: Andrew Cooper 
---
  tools/libxc/xc_sr_save.c | 49 +++-
  1 file changed, 48 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index d63b783..cda61ed 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -515,6 +515,31 @@ static int send_memory_live(struct xc_sr_context *ctx)
  return rc;
  }

+static int update_dirty_bitmap(uint8_t *(*get_dirty_pfn)(void *), void *data,
+   unsigned long p2m_size, unsigned long *bitmap)


This function should take a ctx rather than having the caller expand 3
parameters.  Also, "update_dirty_bitmap" is a little misleading, as it
isn't querying the hypervisor for the dirty bitmap.


how about merge_secondary_dirty_bitmap()?




+{
+uint64_t *pfn_list;
+uint64_t count, i;
+uint64_t pfn;
+
+pfn_list = (uint64_t *)get_dirty_pfn(data);


This looks like a recipe for width-errors.  The get_dirty_pfn() call
should take a pointer to a struct for it to fill.


+assert(pfn_list);


This should turn into an error rather than an abort().


+
+count = pfn_list[0];
+for (i = 0; i < count; i++) {


style


+pfn = pfn_list[i + 1];
+if (pfn > p2m_size) {
+errno = EINVAL;
+return -1;
+}
+
+set_bit(pfn, bitmap);
+}
+
+free(pfn_list);
+return 0;
+}
+
  /*
   * Suspend the domain and send dirty memory.
   * This is the last iteration of the live migration and the
@@ -555,6 +580,19 @@ static int suspend_and_send_dirty(struct xc_sr_context 
*ctx)

  bitmap_or(dirty_bitmap, ctx->save.deferred_pages, ctx->save.p2m_size);

+if ( !ctx->save.live && ctx->save.callbacks->get_dirty_pfn )
+{


Shouldn't get_dirty_pfn be mandatory for COLO streams (even if it is a
noop to start with) ?

~Andrew


+rc = update_dirty_bitmap(ctx->save.callbacks->get_dirty_pfn,
+ ctx->save.callbacks->data,
+ ctx->save.p2m_size,
+ dirty_bitmap);
+if ( rc )
+{
+PERROR("Failed to get secondary vm's dirty pages");
+goto out;
+}
+}
+
  rc = send_dirty_pages(ctx, stats.dirty_count + 
ctx->save.nr_deferred_pages);
  if ( rc )
  goto out;
@@ -784,7 +822,16 @@ static int save(struct xc_sr_context *ctx, uint16_t 
guest_type)
  if ( rc )
  goto err;

-ctx->save.callbacks->postcopy(ctx->save.callbacks->data);
+rc = ctx->save.callbacks->postcopy(ctx->save.callbacks->data);
+if ( !rc ) {
+if ( !errno )
+{
+/* Postcopy request failed (without errno, using EINVAL) */
+errno = EINVAL;
+}
+rc = -1;
+goto err;
+}

  rc = ctx->save.callbacks->checkpoint(ctx->save.callbacks->data);
  if ( rc <= 0 )





--
Thanks,
Yang.



Re: [Xen-devel] save & restore failed when tmem enabled in Xen 4.1 & Xen 4.3

2015-06-09 Thread Yang Hongyang



On 06/08/2015 02:30 PM, yunfang tai wrote:

Hi Andrew,
 Thank you for your reply!
 I do not know much about migration V2. Was it integrated to Xen? If
integrated, from which version?


It's intended to be integrated into Xen 4.6. The libxc part has already
been merged upstream, but the libxl part is still a work in progress.


 Thank you!!

Best Regards,
Yunfang

2015-06-06 3:00 GMT+08:00 Andrew Cooper <andrew.coop...@citrix.com>:

On 05/06/15 19:45, Konrad Rzeszutek Wilk wrote:
 > On Thu, Jun 04, 2015 at 10:27:06PM +0800, yunfang tai wrote:
 >> Hi all,
 > Hey!
 >> Recently, I am testing the TMEM support on Xen. I discovered that when
 >> enabled TMEM in ubuntu 14.10 as guest on Xen 4.1 & Xen 4.3, "xm save" & "xm
 >> restore" failed after there are more than 1000 pages put in persistent pool
 >> of TMEM in Xen. My operations are list as follows:
 > Is it exactly 1000 or just about? I presume it does not matter how much but
 > that you discovered it by having 1000 of them?
 >
 >> In ubuntu guest (8 cores , 8GB):
 >> sudo modprobe tmem
 >> (than wait for the selfballoon to finish)
 >> dd if=/dev/zero of=/tmp/test.img bs=10M count=1000
 >> dd if=/tmp/test.img of=/dev/null bs=10M
 >> dd if=/tmp/test.img of=/dev/null bs=10M
 >> .
 >> (until more than 1000 pages put in persistent pool)
 >> In Domain 0:
 >> (add tmem in grub.cfg)
 >> xm save ubuntu test.save
 >> xm restore ubuntu test.save
 >>
 >> When TMEM is not enabled, save & restore success after these operations.
 >> But if TMEM is enabled, save & restore fail.
 > Are there any errors from the logs? Anything?
 >> Does anyone test about save & restore when enabled TMEM in Xen?? Is there
 >> anything I do wrong?
 > Well lets see what broke. But I think Andrew discovered that the
 > migration protocol when it came to 'tmem' was not up to snuff. CC-ing him
 > just to confirm.
 >
 > (Andrew, for the persistent part of this - it conceptually should
 > get all of the tmem memory that pushed to the hypervisor back in the
 > image. When you were looking at migrationv2 did you just skim through
 > that or mostly ignored it?)

Took a look at the code, attempted to figure out what was going on, then
decided to ignore it for the time being.

As a baseline, there is no error checking of hypercalls or their
returned data putting the data into the stream.

Migration v2 currently has no TMEM support, and I would suggest
re-implementing it from scratch over attempting to port what currently
exists for legacy.

~Andrew







--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v6 COLO 06/15] libxc/save: support COLO save

2015-06-09 Thread Yang Hongyang



On 06/09/2015 03:20 PM, Andrew Cooper wrote:

On 09/06/2015 04:15, Yang Hongyang wrote:



On 06/08/2015 09:04 PM, Andrew Cooper wrote:

On 08/06/15 04:45, Yang Hongyang wrote:

Call callbacks->get_dirty_pfn() after suspending the primary vm to
get the dirty pages on the secondary vm, and send the pages that are dirty
on either the primary or the secondary to the secondary.

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
CC: Andrew Cooper 
---
   tools/libxc/xc_sr_save.c | 49
+++-
   1 file changed, 48 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index d63b783..cda61ed 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -515,6 +515,31 @@ static int send_memory_live(struct
xc_sr_context *ctx)
   return rc;
   }

+static int update_dirty_bitmap(uint8_t *(*get_dirty_pfn)(void *),
void *data,
+   unsigned long p2m_size, unsigned
long *bitmap)


This function should take a ctx rather than having the caller expand 3
parameters.  Also, "update_dirty_bitmap" is a little misleading, as it
isn't querying the hypervisor for the dirty bitmap.


ok.


(Merging the other thread)


how about merge_secondary_dirty_bitmap()?


Much better!






+{
+uint64_t *pfn_list;
+uint64_t count, i;
+uint64_t pfn;
+
+pfn_list = (uint64_t *)get_dirty_pfn(data);


This looks like a recipe for width-errors.  The get_dirty_pfn() call
should take a pointer to a struct for it to fill.


But the size is unknown to the caller. pfn_list[0] is the count of
pfns.




+assert(pfn_list);


This should turn into an error rather than an abort().


Even if there are no dirty pages on the secondary, pfn_list shouldn't be
NULL; pfn_list[0] will just be 0. If pfn_list is NULL,
an unexpected error must have happened.


get_dirty_pfn() should be declared alongside a

struct pfn_data
{
 uint64_t count;
 uint64_t *pfns;
};

and this function here should create one of these on the stack and pass
it by pointer to get_dirty_pfn().  I might also be tempted to rename
this to get_remote_logdirty() or similar, to indicate that it is a
source of logdirty data from something other than the current hypervisor.
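
A minimal sketch of that shape, assuming the callback reports failure with a
non-zero return and keeps ownership of the pfn array. The names
sk_get_remote_logdirty_fn, sk_static_source() and sk_merge_remote_dirty() are
hypothetical and exist only for illustration; only struct pfn_data comes from
the suggestion above. Rejecting pfn >= nr_pfns is also an assumption about the
intended bound.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

struct pfn_data
{
    uint64_t count;
    uint64_t *pfns;
};

/* Hypothetical callback type: fill *out, return 0 on success. */
typedef int (*sk_get_remote_logdirty_fn)(struct pfn_data *out, void *data);

#define SK_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Example source: data points at a struct pfn_data that is handed back. */
static int sk_static_source(struct pfn_data *out, void *data)
{
    assert(out != NULL); /* the caller always passes a stack object */
    if ( !data )
        return -1;
    *out = *(const struct pfn_data *)data;
    return 0;
}

/*
 * Merge the remote dirty pfns into bitmap, which covers nr_pfns bits.
 * Bad input yields -1 with errno set instead of aborting the process.
 */
static int sk_merge_remote_dirty(sk_get_remote_logdirty_fn get, void *data,
                                 unsigned long *bitmap, uint64_t nr_pfns)
{
    struct pfn_data pd = { 0, NULL };
    uint64_t i;

    if ( !get || !bitmap || get(&pd, data) || (pd.count && !pd.pfns) )
    {
        errno = EINVAL;
        return -1;
    }

    /* Validate everything first so a bad entry leaves bitmap untouched. */
    for ( i = 0; i < pd.count; ++i )
    {
        if ( pd.pfns[i] >= nr_pfns )
        {
            errno = EINVAL;
            return -1;
        }
    }

    for ( i = 0; i < pd.count; ++i )
        bitmap[pd.pfns[i] / SK_BITS_PER_LONG] |=
            1UL << (pd.pfns[i] % SK_BITS_PER_LONG);

    return 0;
}
```

With the count carried in the struct, the caller never has to reinterpret a
byte buffer as a uint64_t array, which is where the width errors would creep in.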


This is a callback; I can't find a way to pass a pointer from libxc to libxl,
because libxl cannot access the pointed-to data... The struct can still be used
to represent the data, however.

I like the rename, it sounds much better.








+
+count = pfn_list[0];
+for (i = 0; i < count; i++) {


style


+pfn = pfn_list[i + 1];
+if (pfn > p2m_size) {
+errno = EINVAL;
+return -1;
+}
+
+set_bit(pfn, bitmap);
+}
+
+free(pfn_list);
+return 0;
+}
+
   /*
* Suspend the domain and send dirty memory.
* This is the last iteration of the live migration and the
@@ -555,6 +580,19 @@ static int suspend_and_send_dirty(struct
xc_sr_context *ctx)

   bitmap_or(dirty_bitmap, ctx->save.deferred_pages,
ctx->save.p2m_size);

+if ( !ctx->save.live && ctx->save.callbacks->get_dirty_pfn )
+{


Shouldn't get_dirty_pfn be mandatory for COLO streams (even if it is a
noop to start with) ?


It should be mandatory; it shouldn't be a noop under COLO. Perhaps we should
add a sanity check at the beginning. The problem is that the save side does not
have a param passed from libxl to indicate the stream type (like
checkpointed_stream on the restore side). So do we need to add another XCFLAGS?
Currently there is XCFLAGS_CHECKPOINTED, which represents Remus; we might need
to change this to
XCFLAGS_STREAM_REMUS
XCFLAGS_STREAM_COLO
so that we know what kind of stream we are handling?


checkpointed_stream started out as a bugfix for a legacy stream
migration breakage.  Really, this information should have been passed
right from the start.


Did I miss the bugfix? Is it not upstream?



It would probably be best to take the enum{} suggested elsewhere and
make it a top level ctx item, and have it present for both save and
restore, with suitable parameters passed in from the top.  (When I am
finally able to take out the legacy code, there is going to be a severe
pruning/consolidation of the parameters.)
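
To make the idea concrete, here is a sketch of a stream type shared by save
and restore, with a sanity check that makes get_dirty_pfn mandatory for COLO.
All sk_ names are hypothetical; the get_dirty_pfn signature is copied from the
patch, and the rest is an assumption about how such a check might look.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical top level stream type, mirroring the enum suggested earlier. */
enum sk_stream_type
{
    SK_STREAM_PLAIN,
    SK_STREAM_REMUS,
    SK_STREAM_COLO,
};

/* Reduced stand-in for the save callbacks; only the relevant members. */
struct sk_save_callbacks
{
    uint8_t *(*get_dirty_pfn)(void *data); /* signature as in the patch */
    void *data;
};

/* Placeholder callback so the example has something to register. */
static uint8_t *sk_dummy_get_dirty_pfn(void *data)
{
    (void)data;
    return NULL;
}

/* Return 0 if the callbacks are sufficient for the stream type, else -1. */
static int sk_validate_save(enum sk_stream_type type,
                            const struct sk_save_callbacks *cb)
{
    switch ( type )
    {
    case SK_STREAM_PLAIN:
    case SK_STREAM_REMUS:
        return 0;

    case SK_STREAM_COLO:
        if ( cb && cb->get_dirty_pfn )
            return 0;
        break;
    }

    errno = EINVAL;
    return -1;
}

/* Usage example: a COLO stream without get_dirty_pfn is rejected up front. */
static void sk_validate_save_example(void)
{
    struct sk_save_callbacks cb = { NULL, NULL };

    assert(sk_validate_save(SK_STREAM_REMUS, &cb) == 0);
    assert(sk_validate_save(SK_STREAM_COLO, &cb) == -1);
    cb.get_dirty_pfn = sk_dummy_get_dirty_pfn;
    assert(sk_validate_save(SK_STREAM_COLO, &cb) == 0);
}
```

Doing the check once at the start means the later code can call the callback
unconditionally for COLO instead of testing the pointer at each use.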


This is what I thought when I saw the enum{} suggested.



~Andrew





~Andrew


+rc = update_dirty_bitmap(ctx->save.callbacks->get_dirty_pfn,
+ ctx->save.callbacks->data,
+ ctx->save.p2m_size,
+ dirty_bitmap);
+if ( rc )
+{
+PERROR("Failed to get secondary vm's dirty pages");
+goto out;
+}
+}
+
   rc = send_dirty_pages(ctx, stats.dirty_count +
ctx->save.nr_deferred_pages);
   if ( rc )
   goto out;
@@ -784,7 +822,16 @@ static int save(struct xc_sr_context *ctx,
uint16_t guest_type)
   if 

Re: [Xen-devel] [PATCH v6 COLO 06/15] libxc/save: support COLO save

2015-06-09 Thread Yang Hongyang



On 06/09/2015 04:51 PM, Andrew Cooper wrote:

On 09/06/15 09:45, Yang Hongyang wrote:



Even if there are no dirty pages on the secondary, pfn_list shouldn't be
NULL; pfn_list[0] will just be 0. If pfn_list is NULL,
an unexpected error must have happened.


get_dirty_pfn() should be declared alongside a

struct pfn_data
{
  uint64_t count;
  uint64_t *pfns;
};

and this function here should create one of these on the stack and pass
it by pointer to get_dirty_pfn().  I might also be tempted to rename
this to get_remote_logdirty() or similar, to indicate that it is a
source of logdirty data from something other than the current
hypervisor.


This is a callback; I can't find a way to pass a pointer from libxc to libxl,
because libxl cannot access the pointed-to data... The struct can still be used
to represent the data, however.


Right - my point is that it should be the implementation of
get_remote_logdirty() (i.e. in libxl_save_helper) which is responsible
for unpackaging the data from whatever RPC method is used, rather than
the caller.


Now I know what you mean, I will fix it in the next version, thanks!






Shouldn't get_dirty_pfn be mandatory for COLO streams (even if it is a
noop to start with) ?


It should be mandatory; it shouldn't be a noop under COLO. Perhaps we should
add a sanity check at the beginning. The problem is that the save side does not
have a param passed from libxl to indicate the stream type (like
checkpointed_stream on the restore side). So do we need to add another XCFLAGS?
Currently there is XCFLAGS_CHECKPOINTED, which represents Remus; we might need
to change this to
XCFLAGS_STREAM_REMUS
XCFLAGS_STREAM_COLO
so that we know what kind of stream we are handling?


checkpointed_stream started out as a bugfix for a legacy stream
migration breakage.  Really, this information should have been passed
right from the start.


Did I miss the bugfix? Is it not upstream?


c/s 7051d5c


Ah, you are talking about the restore side; I'm talking about the save
side's checkpointed_stream. So should I also post a prereq patch to
add checkpointed_stream to the save side, or is there already a
fix out there?



~Andrew



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v6 COLO 06/15] libxc/save: support COLO save

2015-06-09 Thread Yang Hongyang



On 06/09/2015 05:10 PM, Andrew Cooper wrote:

On 09/06/15 10:09, Yang Hongyang wrote:








Shouldn't get_dirty_pfn be mandatory for COLO streams (even if it
is a
noop to start with) ?


It should be mandatory; it shouldn't be a noop under COLO. Perhaps we should
add a sanity check at the beginning. The problem is that the save side does not
have a param passed from libxl to indicate the stream type (like
checkpointed_stream on the restore side). So do we need to add another XCFLAGS?
Currently there is XCFLAGS_CHECKPOINTED, which represents Remus; we might need
to change this to
XCFLAGS_STREAM_REMUS
XCFLAGS_STREAM_COLO
so that we know what kind of stream we are handling?


checkpointed_stream started out as a bugfix for a legacy stream
migration breakage.  Really, this information should have been passed
right from the start.


Did I miss the bugfix? is it not in upstream?


c/s 7051d5c


Ah, you are talking about the restore side; I'm talking about the
save-side checkpointed_stream. So should I also post a prereq patch to
add checkpointed_stream to the save side, or is there already a
fix out there?


Sorry for being unclear.  You will have to add one to the save side.
The restore side only has one as a bugfix.
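
To make the proposal concrete, here is a minimal sketch of a save-side stream type with an up-front sanity check. Every name in it (stream_type, save_callbacks, check_save_callbacks, noop_cb) is hypothetical and only loosely modelled on the discussion; it is not the libxc API, and the rule that every checkpointed stream needs a checkpoint callback is an assumption.

```c
#include <stddef.h>

/* Hypothetical stream types, standing in for the proposed
 * XCFLAGS_STREAM_REMUS / XCFLAGS_STREAM_COLO distinction. */
enum stream_type {
    STREAM_NONE,   /* plain live migration */
    STREAM_REMUS,  /* checkpointed stream, Remus */
    STREAM_COLO,   /* checkpointed stream, COLO */
};

/* Hypothetical subset of the save-side callbacks. */
struct save_callbacks {
    int (*checkpoint)(void *data);
    int (*get_dirty_pfn)(void *data);
    void *data;
};

/* Example callback that does nothing and reports success. */
static int noop_cb(void *data)
{
    (void)data;
    return 0;
}

/* Returns 0 if the callbacks are usable for this stream type, -1 if not. */
static int check_save_callbacks(enum stream_type type,
                                const struct save_callbacks *cb)
{
    if ( type == STREAM_NONE )
        return 0;

    /* Assumption: every checkpointed stream needs a checkpoint callback. */
    if ( !cb || !cb->checkpoint )
        return -1;

    /* get_dirty_pfn is treated as mandatory for COLO only. */
    if ( type == STREAM_COLO && !cb->get_dirty_pfn )
        return -1;

    return 0;
}
```

Running the check once, before any data is sent, would turn a missing get_dirty_pfn callback into an immediate error rather than a failure at the first checkpoint.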


Got it~ thanks!



~Andrew
.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 COLOPre 03/13] libxc/restore: zero ioreq page only one time

2015-06-09 Thread Yang Hongyang



On 06/09/2015 03:30 PM, Andrew Cooper wrote:

On 09/06/2015 01:59, Yang Hongyang wrote:



On 06/08/2015 06:15 PM, Andrew Cooper wrote:

On 08/06/15 10:58, Yang Hongyang wrote:



On 06/08/2015 05:46 PM, Andrew Cooper wrote:

On 08/06/15 04:43, Yang Hongyang wrote:

ioreq page contains evtchn which will be set when we resume the
secondary vm the first time. The hypervisor will check if the
evtchn is corrupted, so we cannot zero the ioreq page more
than one time.

The ioreq->state is always STATE_IOREQ_NONE after the vm is
suspended, so it is OK if we only zero it one time.

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen congyang 
CC: Andrew Cooper 


The issue here is that we are running the restore algorithm over a
domain which has already been running in Xen for a while.  This is a
brand new usecase, as far as I am aware.


Exactly.



Does the qemu process associated with this domain get frozen while the
secondary is being reset, or does the process get destroyed and
recreated.


What do you mean by reset? do you mean secondary is suspended at
checkpoint?


Well - at the point that the buffered records are being processed, we
are in the process of resetting the state of the secondary to match the
primary.


Yes, at this point the qemu process associated with this domain is
frozen. The suspend callback calls libxl__qmp_stop() (vm_stop() in qemu)
to pause qemu. After we have processed all records, qemu is restored
with the received state; that's why we add a libxl__qmp_restore()
(qemu_load_vmstate() in qemu) API to restore qemu with the received
state. Currently in libxl, qemu can only start with the received state;
there is no API to load received state into a qemu that has already been
running for a while.


Now I consider this more, it is absolutely wrong to not zero the page
here.  The event channel in the page is not guaranteed to be the same
between the primary and secondary,


That's why we don't zero it on secondary.


and we don't want to unexpectedly
find a pending/in-flight ioreq.


ioreq->state is always STATE_IOREQ_NONE after the vm is suspended, there
should be no pending/in-flight ioreq at checkpoint.



Either qemu needs to take care of re-initialising the event channels
back to appropriate values, or Xen should tolerate the channels
disappearing.

~Andrew
.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 COLOPre 04/13] tools/libxc: export xc_bitops.h

2015-06-10 Thread Yang Hongyang



On 06/10/2015 11:20 PM, Ian Campbell wrote:

On Mon, 2015-06-08 at 11:43 +0800, Yang Hongyang wrote:

When we are under COLO, we will send dirty page bitmap info from
secondary to primary at every checkpoint.


... and this is a _libxl_ operation? Is that the right layer here?


For the first question: yes, this is done in the suspend callback on the
restore side. We do this in libxl because currently we have only added a
back channel on the libxl side. There is no back channel in libxc.

Considering this more, if we do this in libxc, the code will be
less complex: we can drop the 4th and 9th patches of this series and also
get rid of the get_dirty_pfn() callback. Instead we would add a patch to
add a back channel in libxc.

For the second question, I'm not sure. What's Andrew's opinion? Which
is the right layer for this operation, libxl or libxc?
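
For readers unfamiliar with the bitmap helpers in xc_bitops.h, here is a small self-contained sketch of how a dirty bitmap received from the secondary might be merged into the primary's bitmap. The helpers are simplified copies of the ones in the header; bitmap_bytes and merge_dirty are hypothetical names used only for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define BITS_PER_LONG (sizeof(unsigned long) * 8)

/* Bytes needed to hold nr_bits, rounded up to whole longs. */
static size_t bitmap_bytes(int nr_bits)
{
    size_t nr_long = ((size_t)nr_bits + BITS_PER_LONG - 1) / BITS_PER_LONG;

    return nr_long * sizeof(unsigned long);
}

static void set_bit(int nr, unsigned long *addr)
{
    addr[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
}

static int test_bit(int nr, const unsigned long *addr)
{
    return (addr[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1;
}

static void bitmap_or(unsigned long *dst, const unsigned long *other,
                      int nr_bits)
{
    size_t i, nr_longs = bitmap_bytes(nr_bits) / sizeof(unsigned long);

    for ( i = 0; i < nr_longs; ++i )
        dst[i] |= other[i];
}

/*
 * Hypothetical helper: fold the secondary's dirty bitmap into the
 * primary's and return how many pfns are dirty on either side.
 */
static int merge_dirty(unsigned long *primary, const unsigned long *secondary,
                       int nr_pfns)
{
    int pfn, count = 0;

    bitmap_or(primary, secondary, nr_pfns);
    for ( pfn = 0; pfn < nr_pfns; pfn++ )
        count += test_bit(pfn, primary);

    return count;
}
```

The merged bitmap is what the primary would then walk to decide which pages to resend at the checkpoint.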




  So we need to get/test
the dirty page bitmap. We just expose xc_bitops.h for libxl use.

NOTE:
   Need to make clean and rerun configure to get it compiled.

Signed-off-by: Yang Hongyang 
---
  tools/libxc/include/xc_bitops.h | 76 +
  tools/libxc/xc_bitops.h | 76 -
  2 files changed, 76 insertions(+), 76 deletions(-)
  create mode 100644 tools/libxc/include/xc_bitops.h
  delete mode 100644 tools/libxc/xc_bitops.h

diff --git a/tools/libxc/include/xc_bitops.h b/tools/libxc/include/xc_bitops.h
new file mode 100644
index 000..cd749f4
--- /dev/null
+++ b/tools/libxc/include/xc_bitops.h
@@ -0,0 +1,76 @@
+#ifndef XC_BITOPS_H
+#define XC_BITOPS_H 1
+
+/* bitmap operations for single threaded access */
+
+#include <stdlib.h>
+#include <string.h>
+
+#define BITS_PER_LONG (sizeof(unsigned long) * 8)
+#define ORDER_LONG (sizeof(unsigned long) == 4 ? 5 : 6)
+
+#define BITMAP_ENTRY(_nr,_bmap) ((_bmap))[(_nr)/BITS_PER_LONG]
+#define BITMAP_SHIFT(_nr) ((_nr) % BITS_PER_LONG)
+
+/* calculate required space for number of longs needed to hold nr_bits */
+static inline int bitmap_size(int nr_bits)
+{
+    int nr_long, nr_bytes;
+    nr_long = (nr_bits + BITS_PER_LONG - 1) >> ORDER_LONG;
+    nr_bytes = nr_long * sizeof(unsigned long);
+    return nr_bytes;
+}
+
+static inline unsigned long *bitmap_alloc(int nr_bits)
+{
+    return calloc(1, bitmap_size(nr_bits));
+}
+
+static inline void bitmap_set(unsigned long *addr, int nr_bits)
+{
+    memset(addr, 0xff, bitmap_size(nr_bits));
+}
+
+static inline void bitmap_clear(unsigned long *addr, int nr_bits)
+{
+    memset(addr, 0, bitmap_size(nr_bits));
+}
+
+static inline int test_bit(int nr, unsigned long *addr)
+{
+    return (BITMAP_ENTRY(nr, addr) >> BITMAP_SHIFT(nr)) & 1;
+}
+
+static inline void clear_bit(int nr, unsigned long *addr)
+{
+    BITMAP_ENTRY(nr, addr) &= ~(1UL << BITMAP_SHIFT(nr));
+}
+
+static inline void set_bit(int nr, unsigned long *addr)
+{
+    BITMAP_ENTRY(nr, addr) |= (1UL << BITMAP_SHIFT(nr));
+}
+
+static inline int test_and_clear_bit(int nr, unsigned long *addr)
+{
+    int oldbit = test_bit(nr, addr);
+    clear_bit(nr, addr);
+    return oldbit;
+}
+
+static inline int test_and_set_bit(int nr, unsigned long *addr)
+{
+    int oldbit = test_bit(nr, addr);
+    set_bit(nr, addr);
+    return oldbit;
+}
+
+static inline void bitmap_or(unsigned long *dst, const unsigned long *other,
+                             int nr_bits)
+{
+    int i, nr_longs = (bitmap_size(nr_bits) / sizeof(unsigned long));
+    for ( i = 0; i < nr_longs; ++i )
+        dst[i] |= other[i];
+}
+
+#endif  /* XC_BITOPS_H */
diff --git a/tools/libxc/xc_bitops.h b/tools/libxc/xc_bitops.h
deleted file mode 100644
index cd749f4..000
--- a/tools/libxc/xc_bitops.h
+++ /dev/null
@@ -1,76 +0,0 @@
-#ifndef XC_BITOPS_H
-#define XC_BITOPS_H 1
-
-/* bitmap operations for single threaded access */
-
-#include <stdlib.h>
-#include <string.h>
-
-#define BITS_PER_LONG (sizeof(unsigned long) * 8)
-#define ORDER_LONG (sizeof(unsigned long) == 4 ? 5 : 6)
-
-#define BITMAP_ENTRY(_nr,_bmap) ((_bmap))[(_nr)/BITS_PER_LONG]
-#define BITMAP_SHIFT(_nr) ((_nr) % BITS_PER_LONG)
-
-/* calculate required space for number of longs needed to hold nr_bits */
-static inline int bitmap_size(int nr_bits)
-{
-int nr_long, nr_bytes;
-nr_long = (nr_bits + BITS_PER_LONG - 1) >> ORDER_LONG;
-nr_bytes = nr_long * sizeof(unsigned long);
-return nr_bytes;
-}
-
-static inline unsigned long *bitmap_alloc(int nr_bits)
-{
-return calloc(1, bitmap_size(nr_bits));
-}
-
-static inline void bitmap_set(unsigned long *addr, int nr_bits)
-{
-memset(addr, 0xff, bitmap_size(nr_bits));
-}
-
-static inline void bitmap_clear(unsigned long *addr, int nr_bits)
-{
-memset(addr, 0, bitmap_size(nr_bits));
-}
-
-static inline int test_bit(int nr, unsigned long *addr)
-{
-return (BITMAP_ENTRY(nr, addr) >> BITMAP_SHIFT(nr)) & 1;
-}
-
-static inline void clear_bit(int nr, unsigned long *addr)
-{
-BITMAP_ENTRY(nr, addr) &= ~(1UL <

Re: [Xen-devel] [PATCH v2 COLOPre 05/13] tools/libxl: introduce a new API libxl__domain_restore() to load qemu state

2015-06-10 Thread Yang Hongyang



On 06/10/2015 11:35 PM, Ian Campbell wrote:

On Mon, 2015-06-08 at 11:43 +0800, Yang Hongyang wrote:

Secondary vm is running in colo mode. So we will do
the following things again and again:
1. suspend both primary vm and secondary vm
2. sync the state
3. resume both primary vm and secondary vm
We will send qemu's state each time in step2, and
slave's qemu should read it each time before resuming
secondary vm. Introduce a new API libxl__domain_restore()
to do it. This API should be called before resuming
secondary vm.


Is this a preexisting qemu interface or one to be added?


We added the qemu interface "xen-load-devices-state",
it's not in qemu upstream yet.





Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
---
  tools/libxl/libxl_dom_save.c | 47 
  tools/libxl/libxl_internal.h |  4 
  tools/libxl/libxl_qmp.c  | 10 ++
  3 files changed, 61 insertions(+)

diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index 74a6bae..f9627f8 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -663,6 +663,53 @@ int libxl__toolstack_restore(uint32_t domid, const uint8_t 
*buf,
  return 0;
  }

+int libxl__domain_restore(libxl__gc *gc, uint32_t domid)
+{
+    int rc = 0;
+
+    libxl_domain_type type = libxl__domain_type(gc, domid);
+    if (type != LIBXL_DOMAIN_TYPE_HVM) {
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    rc = libxl__domain_restore_device_model(gc, domid);
+    if (rc)
+        LOG(ERROR, "failed to restore device mode for domain %u:%d",
+            domid, rc);
+out:
+    return rc;
+}
+
+int libxl__domain_restore_device_model(libxl__gc *gc, uint32_t domid)
+{
+    char *state_file;
+    int rc;
+
+    switch (libxl__device_model_version_running(gc, domid)) {
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
+        /* not supported now */
+        rc = ERROR_INVAL;
+        break;
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+        /*
+         * This function may be called too many times for the same gc,
+         * so we use NOGC, and free the memory before return to avoid
+         * OOM.
+         */
+        state_file = libxl__sprintf(NOGC,
+                                    XC_DEVICE_MODEL_RESTORE_FILE".%d",
+                                    domid);
+        rc = libxl__qmp_restore(gc, domid, state_file);
+        free(state_file);
+        break;
+    default:
+        rc = ERROR_INVAL;
+    }
+
+    return rc;
+}
+
  /*
   * Local variables:
   * mode: C
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 1905195..20364c6 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1022,6 +1022,7 @@ _hidden int libxl__domain_rename(libxl__gc *gc, uint32_t 
domid,

  _hidden int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
   uint32_t size, void *data);
+_hidden int libxl__domain_restore_device_model(libxl__gc *gc, uint32_t domid);
  _hidden int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid);

  _hidden const char *libxl__userdata_path(libxl__gc *gc, uint32_t domid,
@@ -1039,6 +1040,7 @@ _hidden int libxl__userdata_store(libxl__gc *gc, uint32_t 
domid,
const char *userdata_userid,
const uint8_t *data, int datalen);

+_hidden int libxl__domain_restore(libxl__gc *gc, uint32_t domid);
  _hidden int libxl__domain_resume(libxl__gc *gc, uint32_t domid,
   int suspend_cancel);
  _hidden int libxl__domain_s3_resume(libxl__gc *gc, int domid);
@@ -1651,6 +1653,8 @@ _hidden int libxl__qmp_stop(libxl__gc *gc, int domid);
  _hidden int libxl__qmp_resume(libxl__gc *gc, int domid);
  /* Save current QEMU state into fd. */
  _hidden int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename);
+/* Load current QEMU state from fd. */
+_hidden int libxl__qmp_restore(libxl__gc *gc, int domid, const char *filename);
  /* Set dirty bitmap logging status */
  _hidden int libxl__qmp_set_global_dirty_log(libxl__gc *gc, int domid, bool 
enable);
  _hidden int libxl__qmp_insert_cdrom(libxl__gc *gc, int domid, const 
libxl_device_disk *disk);
diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
index 9aa7e2e..a6f1a21 100644
--- a/tools/libxl/libxl_qmp.c
+++ b/tools/libxl/libxl_qmp.c
@@ -892,6 +892,16 @@ int libxl__qmp_save(libxl__gc *gc, int domid, const char 
*filename)
 NULL, NULL);
  }

+int libxl__qmp_restore(libxl__gc *gc, int domid, const char *state_file)
+{
+libxl__json_object *args = NULL;
+
+qmp_parameters_add_string(gc, &args, "filename", state_file);
+
+return qmp_run_command(gc, domid, "xen-load-devices-state", args,
+   NULL, NULL);
+}
+
  static int qmp_change(libxl__gc *gc, libxl_

Re: [Xen-devel] [PATCH v2 COLOPre 01/13] libxc/restore: fix error handle of process_record

2015-06-10 Thread Yang Hongyang



On 06/10/2015 10:55 PM, Ian Campbell wrote:

On Mon, 2015-06-08 at 11:43 +0800, Yang Hongyang wrote:

If the err is RECORD_NOT_PROCESSED, and it is an optional record,
restore will still fail. This patch fixes that.


Whichever approach you take to fixing this, please say _how_ the change
fixes it, it's not at all clear why moving this code should matter.

And if there is an ulterior motive behind the move, please say that too.


Okay, will describe this in the next version, thank you!
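
As far as can be read from the diff, the old process_record() logged an optional record as ignored but still returned the non-zero RECORD_NOT_PROCESSED, so the caller's "if ( rc ) goto err;" failed the restore anyway. The sketch below illustrates the caller-side handling under that reading. The constant values and both function bodies are assumptions for illustration, not the libxc code.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed values, chosen for illustration only. */
#define RECORD_NOT_PROCESSED 1
#define REC_TYPE_OPTIONAL    0x80000000U

/* Hypothetical handler that only knows record type 1. */
static int process_record(uint32_t type)
{
    return (type == 1) ? 0 : RECORD_NOT_PROCESSED;
}

/*
 * Caller-side handling: an unhandled optional record is skipped,
 * an unhandled mandatory record is fatal, and any other non-zero
 * return code is passed through as an error.
 */
static int handle_record(uint32_t type)
{
    int rc = process_record(type);

    if ( rc == RECORD_NOT_PROCESSED )
    {
        if ( type & REC_TYPE_OPTIONAL )
        {
            fprintf(stderr, "Ignoring optional record %#x\n",
                    (unsigned int)type);
            return 0;
        }

        fprintf(stderr, "Mandatory record %#x not handled\n",
                (unsigned int)type);
        return -1;
    }

    return rc;
}
```

The point is that "not processed" stops being an error by itself; only the mandatory case is turned into one.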





Signed-off-by: Yang Hongyang 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
---
  tools/libxc/xc_sr_restore.c | 28 ++--
  1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 9e27dba..2d2edd3 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -560,19 +560,6 @@ static int process_record(struct xc_sr_context *ctx, 
struct xc_sr_record *rec)
  free(rec->data);
  rec->data = NULL;

-if ( rc == RECORD_NOT_PROCESSED )
-{
-if ( rec->type & REC_TYPE_OPTIONAL )
-DPRINTF("Ignoring optional record %#x (%s)",
-rec->type, rec_type_to_str(rec->type));
-else
-{
-ERROR("Mandatory record %#x (%s) not handled",
-  rec->type, rec_type_to_str(rec->type));
-rc = -1;
-}
-}
-
  return rc;
  }

@@ -678,7 +665,20 @@ static int restore(struct xc_sr_context *ctx)
  else
  {
  rc = process_record(ctx, &rec);
-if ( rc )
+if ( rc == RECORD_NOT_PROCESSED )
+{
+if ( rec.type & REC_TYPE_OPTIONAL )
+DPRINTF("Ignoring optional record %#x (%s)",
+rec.type, rec_type_to_str(rec.type));
+else
+{
+ERROR("Mandatory record %#x (%s) not handled",
+  rec.type, rec_type_to_str(rec.type));
+rc = -1;
+goto err;
+}
+}
+else if ( rc )
  goto err;
  }




.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 COLOPre 06/13] tools/libxl: Introduce a new internal API libxl__domain_unpause()

2015-06-10 Thread Yang Hongyang



On 06/10/2015 11:37 PM, Ian Campbell wrote:

On Mon, 2015-06-08 at 11:43 +0800, Yang Hongyang wrote:

From: Wen Congyang 

The guest is paused after libxl_domain_create_restore().
Secondary vm is running in colo mode. So we need to unpause
the guest. The current API libxl_domain_unpause() is
not an internal API. Introduce a new API to support it.
No functional change.


In general there is nothing wrong with using a public function
internally. Is there some special consideration here?


It's just that we thought it better to use internal functions for
internal purposes.
Most public functions take ctx as the first parameter, while internal
functions take gc/egc as the first parameter (although we can get ctx
from a gc and call public functions when needed).
If it doesn't matter, we can drop this patch.
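
The public/internal split described here can be sketched with toy types. The structs and function names below are hypothetical stand-ins for libxl_ctx, libxl__gc and the two unpause functions; in real libxl, GC_INIT and GC_FREE also manage gc-owned allocations, which this sketch leaves out.

```c
/* Toy stand-in for libxl_ctx. */
struct ctx {
    int unpaused;   /* counts unpause requests, for demonstration */
};

/* Toy stand-in for libxl__gc: it can always reach its owning ctx. */
struct gc {
    struct ctx *owner;
};

/* Internal variant: takes a gc and reaches the ctx through it. */
static int domain_unpause_internal(struct gc *gc, unsigned int domid)
{
    (void)domid;
    gc->owner->unpaused++;
    return 0;
}

/* Public variant: sets up a gc from the ctx and delegates. */
static int domain_unpause_public(struct ctx *ctx, unsigned int domid)
{
    struct gc gc = { ctx };

    return domain_unpause_internal(&gc, domid);
}
```

Internal code that already holds a gc can call either variant; the internal one merely avoids setting up a second gc.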





Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
---
  tools/libxl/libxl.c  | 20 ++--
  tools/libxl/libxl_internal.h |  1 +
  2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index ba2da92..d5691dc 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -933,9 +933,8 @@ out:
  return AO_INPROGRESS;
  }

-int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
+int libxl__domain_unpause(libxl__gc *gc, uint32_t domid)
  {
-GC_INIT(ctx);
  char *path;
  char *state;
  int ret, rc = 0;
@@ -947,7 +946,7 @@ int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
  }

  if (type == LIBXL_DOMAIN_TYPE_HVM) {
-uint32_t dm_domid = libxl_get_stubdom_id(ctx, domid);
+uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);

  path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
  state = libxl__xs_read(gc, XBT_NULL, path);
@@ -957,12 +956,21 @@ int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
   NULL, NULL, NULL);
  }
  }
-ret = xc_domain_unpause(ctx->xch, domid);
-if (ret<0) {
-LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "unpausing domain %d", domid);
+
+ret = xc_domain_unpause(CTX->xch, domid);
+if (ret < 0) {
+LIBXL__LOG_ERRNO(CTX, LIBXL__LOG_ERROR, "unpausing domain %d", domid);
  rc = ERROR_FAIL;
  }
   out:
+return rc;
+}
+
+int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
+{
+GC_INIT(ctx);
+int rc = libxl__domain_unpause(gc, domid);
+
  GC_FREE;
  return rc;
  }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 20364c6..366470f 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1044,6 +1044,7 @@ _hidden int libxl__domain_restore(libxl__gc *gc, uint32_t 
domid);
  _hidden int libxl__domain_resume(libxl__gc *gc, uint32_t domid,
   int suspend_cancel);
  _hidden int libxl__domain_s3_resume(libxl__gc *gc, int domid);
+_hidden int libxl__domain_unpause(libxl__gc *gc, uint32_t domid);

  /* returns 0 or 1, or a libxl error code */
  _hidden int libxl__domain_pvcontrol_available(libxl__gc *gc, uint32_t domid);



.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 COLOPre 05/13] tools/libxl: introduce a new API libxl__domain_restore() to load qemu state

2015-06-11 Thread Yang Hongyang



On 06/11/2015 04:43 PM, Ian Campbell wrote:

On Thu, 2015-06-11 at 10:09 +0800, Yang Hongyang wrote:


On 06/10/2015 11:35 PM, Ian Campbell wrote:

On Mon, 2015-06-08 at 11:43 +0800, Yang Hongyang wrote:

Secondary vm is running in colo mode. So we will do
the following things again and again:
1. suspend both primary vm and secondary vm
2. sync the state
3. resume both primary vm and secondary vm
We will send qemu's state each time in step2, and
slave's qemu should read it each time before resuming
secondary vm. Introduce a new API libxl__domain_restore()
to do it. This API should be called before resuming
secondary vm.


Is this a preexisting qemu interface or one to be added?


We added the qemu interface "xen-load-devices-state",
it's not in qemu upstream yet.


OK, please mention this dependency in the commit text since we will want
to be sure the interface is going to be accepted in this form by QEMU
upstream before we start using it. Please also CC the QEMU maintainers
on this patch in the future (by adding Cc: below the S-o-b if you don't
want to spam them the whole series), I've added them here now.

In particular "devices" seems odd to me, perhaps
"xen-load-device-state"?


This API is the inverse of "xen-save-devices-state"; we used the name
"xen-load-devices-state" in order to follow the existing naming style...









Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
---
   tools/libxl/libxl_dom_save.c | 47 

   tools/libxl/libxl_internal.h |  4 
   tools/libxl/libxl_qmp.c  | 10 ++
   3 files changed, 61 insertions(+)

diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index 74a6bae..f9627f8 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -663,6 +663,53 @@ int libxl__toolstack_restore(uint32_t domid, const uint8_t 
*buf,
   return 0;
   }

+int libxl__domain_restore(libxl__gc *gc, uint32_t domid)
+{
+    int rc = 0;
+
+    libxl_domain_type type = libxl__domain_type(gc, domid);
+    if (type != LIBXL_DOMAIN_TYPE_HVM) {
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    rc = libxl__domain_restore_device_model(gc, domid);
+    if (rc)
+        LOG(ERROR, "failed to restore device mode for domain %u:%d",
+            domid, rc);
+out:
+    return rc;
+}
+
+int libxl__domain_restore_device_model(libxl__gc *gc, uint32_t domid)
+{
+    char *state_file;
+    int rc;
+
+    switch (libxl__device_model_version_running(gc, domid)) {
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
+        /* not supported now */
+        rc = ERROR_INVAL;
+        break;
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+        /*
+         * This function may be called too many times for the same gc,
+         * so we use NOGC, and free the memory before return to avoid
+         * OOM.
+         */
+        state_file = libxl__sprintf(NOGC,
+                                    XC_DEVICE_MODEL_RESTORE_FILE".%d",
+                                    domid);
+        rc = libxl__qmp_restore(gc, domid, state_file);
+        free(state_file);
+        break;
+    default:
+        rc = ERROR_INVAL;
+    }
+
+    return rc;
+}
+
   /*
* Local variables:
* mode: C
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 1905195..20364c6 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1022,6 +1022,7 @@ _hidden int libxl__domain_rename(libxl__gc *gc, uint32_t 
domid,

   _hidden int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
uint32_t size, void *data);
+_hidden int libxl__domain_restore_device_model(libxl__gc *gc, uint32_t domid);
   _hidden int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid);

   _hidden const char *libxl__userdata_path(libxl__gc *gc, uint32_t domid,
@@ -1039,6 +1040,7 @@ _hidden int libxl__userdata_store(libxl__gc *gc, uint32_t 
domid,
 const char *userdata_userid,
 const uint8_t *data, int datalen);

+_hidden int libxl__domain_restore(libxl__gc *gc, uint32_t domid);
   _hidden int libxl__domain_resume(libxl__gc *gc, uint32_t domid,
int suspend_cancel);
   _hidden int libxl__domain_s3_resume(libxl__gc *gc, int domid);
@@ -1651,6 +1653,8 @@ _hidden int libxl__qmp_stop(libxl__gc *gc, int domid);
   _hidden int libxl__qmp_resume(libxl__gc *gc, int domid);
   /* Save current QEMU state into fd. */
   _hidden int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename);
+/* Load current QEMU state from fd. */
+_hidden int libxl__qmp_restore(libxl__gc *gc, int domid, const char *filename);
   /* Set dirty bitmap logging status */
   _hidden int libxl__qmp_set_global_dirty_log(libxl__gc *gc, int domid, bool 
enable);

Re: [Xen-devel] [PATCH v2 COLOPre 03/13] libxc/restore: zero ioreq page only one time

2015-06-11 Thread Yang Hongyang



On 06/11/2015 07:14 PM, Wen Congyang wrote:

On 06/11/2015 06:20 PM, Paul Durrant wrote:

-Original Message-
From: Wen Congyang [mailto:we...@cn.fujitsu.com]
Sent: 11 June 2015 09:48
To: Paul Durrant; Andrew Cooper; Yang Hongyang; xen-devel@lists.xen.org
Cc: Wei Liu; Ian Campbell; guijianf...@cn.fujitsu.com;
yunhong.ji...@intel.com; Eddie Dong; rshri...@cs.ubc.ca; Ian Jackson
Subject: Re: [Xen-devel] [PATCH v2 COLOPre 03/13] libxc/restore: zero ioreq
page only one time

On 06/11/2015 04:32 PM, Paul Durrant wrote:

-Original Message-
From: Wen Congyang [mailto:we...@cn.fujitsu.com]
Sent: 11 June 2015 02:14
To: Paul Durrant; Andrew Cooper; Yang Hongyang; xen-

de...@lists.xen.org

Cc: Wei Liu; Ian Campbell; guijianf...@cn.fujitsu.com;
yunhong.ji...@intel.com; Eddie Dong; rshri...@cs.ubc.ca; Ian Jackson
Subject: Re: [Xen-devel] [PATCH v2 COLOPre 03/13] libxc/restore: zero

ioreq

page only one time

On 06/10/2015 07:47 PM, Paul Durrant wrote:

-Original Message-
From: xen-devel-boun...@lists.xen.org [mailto:xen-devel-
boun...@lists.xen.org] On Behalf Of Wen Congyang
Sent: 10 June 2015 12:38
To: Paul Durrant; Andrew Cooper; Yang Hongyang; xen-

de...@lists.xen.org

Cc: Wei Liu; Ian Campbell; guijianf...@cn.fujitsu.com;
yunhong.ji...@intel.com; Eddie Dong; rshri...@cs.ubc.ca; Ian Jackson
Subject: Re: [Xen-devel] [PATCH v2 COLOPre 03/13] libxc/restore: zero

ioreq

page only one time

On 06/10/2015 06:58 PM, Paul Durrant wrote:

-Original Message-
From: Wen Congyang [mailto:we...@cn.fujitsu.com]
Sent: 10 June 2015 11:55
To: Paul Durrant; Andrew Cooper; Yang Hongyang; xen-

de...@lists.xen.org

Cc: Wei Liu; Ian Campbell; yunhong.ji...@intel.com; Eddie Dong;
guijianf...@cn.fujitsu.com; rshri...@cs.ubc.ca; Ian Jackson
Subject: Re: [Xen-devel] [PATCH v2 COLOPre 03/13] libxc/restore:

zero

ioreq

page only one time

On 06/10/2015 06:40 PM, Paul Durrant wrote:

-Original Message-
From: Wen Congyang [mailto:we...@cn.fujitsu.com]
Sent: 10 June 2015 10:06
To: Andrew Cooper; Yang Hongyang; xen-devel@lists.xen.org;

Paul

Durrant

Cc: Wei Liu; Ian Campbell; yunhong.ji...@intel.com; Eddie Dong;
guijianf...@cn.fujitsu.com; rshri...@cs.ubc.ca; Ian Jackson
Subject: Re: [Xen-devel] [PATCH v2 COLOPre 03/13] libxc/restore:

zero

ioreq

page only one time

Cc: Paul Durrant

On 06/10/2015 03:44 PM, Andrew Cooper wrote:

On 10/06/2015 06:26, Yang Hongyang wrote:



On 06/09/2015 03:30 PM, Andrew Cooper wrote:

On 09/06/2015 01:59, Yang Hongyang wrote:



On 06/08/2015 06:15 PM, Andrew Cooper wrote:

On 08/06/15 10:58, Yang Hongyang wrote:



On 06/08/2015 05:46 PM, Andrew Cooper wrote:

On 08/06/15 04:43, Yang Hongyang wrote:

ioreq page contains evtchn which will be set when we resume the
secondary vm the first time. The hypervisor will check if the
evtchn is corrupted, so we cannot zero the ioreq page more
than one time.

The ioreq->state is always STATE_IOREQ_NONE after the vm is
suspended, so it is OK if we only zero it one time.

Signed-off-by: Yang Hongyang
Signed-off-by: Wen congyang
CC: Andrew Cooper


The issue here is that we are running the restore algorithm over a
domain which has already been running in Xen for a while.  This is a
brand new usecase, as far as I am aware.


Exactly.



Does the qemu process associated with this domain get frozen while the
secondary is being reset, or does the process get destroyed and
recreated.


What do you mean by reset? do you mean secondary is suspended at
checkpoint?


Well - at the point that the buffered records are being processed, we
are in the process of resetting the state of the secondary to match the
primary.


Yes, at this point the qemu process associated with this domain is
frozen. The suspend callback calls libxl__qmp_stop() (vm_stop() in qemu)
to pause qemu. After we have processed all records, qemu is restored
with the received state; that's why we add a libxl__qmp_restore()
(qemu_load_vmstate() in qemu) API to restore qemu with the received
state. Currently in libxl, qemu can only start with the received state;
there is no API to load received state into a qemu that has already been
running for a while.


Now I consider this more, it is absolutely wrong to not zero the page
here.  The event channel in the page is not guaranteed to be the same
between the primary and secondary,


That's why we don't zero it on secondary.


I think you missed my point.  Apologies for the double negative.  It
must, under all circumstances, be zeroed at this point, for safety
reasons.

The page in question is subject to logdirty just like any other guest
pages, which means that if the guest writes to it naturally (i.e. not a
Xen or Qemu write, both of whom have magic mappings which are not
subject to logdirty), it will be transmitted in the stream.  As the
event channel could be different, the lack of zeroing it at this point
means that the event channel would be wrong as oppo

Re: [Xen-devel] [PATCH v2 COLOPre 03/13] libxc/restore: zero ioreq page only one time

2015-06-11 Thread Yang Hongyang



On 06/11/2015 06:20 PM, Paul Durrant wrote:

-Original Message-
From: Wen Congyang [mailto:we...@cn.fujitsu.com]
Sent: 11 June 2015 09:48
To: Paul Durrant; Andrew Cooper; Yang Hongyang; xen-devel@lists.xen.org
Cc: Wei Liu; Ian Campbell; guijianf...@cn.fujitsu.com;

[...]




In our implementation, we don't start a new emulator. The codes can work,
but some bugs may be not triggered.



How do you reconcile the incoming QEMU save record with the running emulator 
state?


We introduce a QMP command "xen-load-devices-state" (libxl__qmp_restore)
which can restore the emulator state. The steps to restore the emulator
state at a checkpoint are:

1. libxl__qmp_stop    -> vm_stop() in qemu
2. libxl__qmp_restore -> load_vmstate() in qemu
3. libxl__qmp_resume  -> vm_start() in qemu
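
The three steps can be sketched as a single sequence with error handling. This is a simplified illustration: dm_ops, checkpoint_restore_dm and the fake_* functions are hypothetical stand-ins, and in the actual series the stop happens in the suspend callback, not in the same function as the restore and resume.

```c
#include <stddef.h>

/* Hypothetical stand-ins for libxl__qmp_stop/restore/resume. */
struct dm_ops {
    int (*stop)(int domid);
    int (*restore)(int domid, const char *state_file);
    int (*resume)(int domid);
};

/* Fake backend that records the order of the calls it receives. */
static char trace[8];
static int trace_len;

static int fake_stop(int domid)
{
    (void)domid;
    trace[trace_len++] = 'S';
    return 0;
}

static int fake_restore(int domid, const char *state_file)
{
    (void)domid;
    trace[trace_len++] = 'L';
    return state_file ? 0 : -1;   /* fail when no state file is given */
}

static int fake_resume(int domid)
{
    (void)domid;
    trace[trace_len++] = 'R';
    return 0;
}

/* Stop, load the received state, resume; bail out on the first error. */
static int checkpoint_restore_dm(const struct dm_ops *ops, int domid,
                                 const char *state_file)
{
    int rc;

    rc = ops->stop(domid);                 /* 1. pause the emulator */
    if ( rc )
        return rc;

    rc = ops->restore(domid, state_file);  /* 2. load received state */
    if ( rc )
        return rc;                         /* emulator stays paused */

    return ops->resume(domid);             /* 3. let it run again */
}
```

Leaving the emulator paused when the load fails is a design choice made for this sketch only; the thread does not say how the real code handles that case.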



   Paul


Thanks
Wen Congyang



   Paul



Thanks
Wen Congyang



   Paul


We will set to the guest to a new state, the old state should be

dropped.


Thanks
Wen Congyang



   Paul



Thanks
Wen Congyang



   Paul



Thanks
Wen Congyang



~Andrew
.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 COLOPre 03/13] libxc/restore: zero ioreq page only one time

2015-06-11 Thread Yang Hongyang



On 06/11/2015 08:54 PM, Yang Hongyang wrote:
[...]

this patch now.


Ok, this really is a historical patch...



Having tested, it is ok to drop this patch now.



Thanks
Wen Congyang



In our implementation, we don't start a new emulator. The codes can work,
but some bugs may be not triggered.



How do you reconcile the incoming QEMU save record with the running emulator
state?


--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 COLOPre 07/13] tools/libxl: Update libxl__domain_unpause() to support qemu-xen

2015-06-14 Thread Yang Hongyang



On 06/12/2015 08:33 PM, Wei Liu wrote:

On Mon, Jun 08, 2015 at 11:43:11AM +0800, Yang Hongyang wrote:

Currently, libxl__domain_unpause() only supports
qemu-xen-traditional. Update it to support qemu-xen.

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 


This looks very similar to an existing function called
libxl__domain_resume_device_model. Maybe you don't need to invent a new
function.


---
  tools/libxl/libxl.c | 42 +-
  1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index d5691dc..5c843c2 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -933,10 +933,37 @@ out:
  return AO_INPROGRESS;
  }

-int libxl__domain_unpause(libxl__gc *gc, uint32_t domid)
+static int libxl__domain_unpause_device_model(libxl__gc *gc, uint32_t domid)
  {
  char *path;
  char *state;
+
+switch (libxl__device_model_version_running(gc, domid)) {
+case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
+uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
+
+path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
+state = libxl__xs_read(gc, XBT_NULL, path);
+if (state != NULL && !strcmp(state, "paused")) {


The only difference between your function and
libxl__domain_unpause_device_model is the check for "state" node. I
think you can just add the check to libxl__domain_resume_device_model
and use that function.


I'm not sure whether changing the existing function's behavior will affect the
existing callers. If there's no problem with doing so, I will do as you
suggest in the next version.



Wei.
.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 COLOPre 10/13] tools/libxl: Add back channel to allow migration target send data back

2015-06-14 Thread Yang Hongyang



On 06/12/2015 08:54 PM, Wei Liu wrote:

On Mon, Jun 08, 2015 at 11:43:14AM +0800, Yang Hongyang wrote:

From: Wen Congyang 

In COLO mode, the slave needs to send data to the master, but the io_fd
can only be written on the master and only be read on the slave.
Save recv_fd in domain_suspend_state, and send_fd in
domain_create_state.



You failed to mention in the commit message that new structures are
introduced in the IDL.


Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
---
  tools/libxl/libxl.c  |  2 +-
  tools/libxl/libxl_create.c   | 14 ++
  tools/libxl/libxl_internal.h |  2 ++
  tools/libxl/libxl_types.idl  |  7 +++
  tools/libxl/xl_cmdimpl.c |  7 +++


You also need to add LIBXL_HAVE in libxl.h.


  5 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 5c843c2..36b97fe 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -832,7 +832,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
  dss->callback = remus_failover_cb;
  dss->domid = domid;
  dss->fd = send_fd;
-/* TODO do something with recv_fd */
+dss->recv_fd = recv_fd;
  dss->type = type;
  dss->live = 1;
  dss->debug = 0;
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 86384d2..bd8149c 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1577,8 +1577,8 @@ static void domain_create_cb(libxl__egc *egc,
   int rc, uint32_t domid);

  static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
-uint32_t *domid,
-int restore_fd, int checkpointed_stream,
+uint32_t *domid, int restore_fd,
+int send_fd, int checkpointed_stream,
  const libxl_asyncop_how *ao_how,
  const libxl_asyncprogress_how *aop_console_how)
  {
@@ -1591,6 +1591,7 @@ static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
  libxl_domain_config_init(&cdcs->dcs.guest_config_saved);
  libxl_domain_config_copy(ctx, &cdcs->dcs.guest_config_saved, d_config);
  cdcs->dcs.restore_fd = restore_fd;
+cdcs->dcs.send_fd = send_fd;
  cdcs->dcs.callback = domain_create_cb;
  cdcs->dcs.checkpointed_stream = checkpointed_stream;
  libxl__ao_progress_gethow(&cdcs->dcs.aop_console_how, aop_console_how);
@@ -1619,7 +1620,7 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
  const libxl_asyncop_how *ao_how,
  const libxl_asyncprogress_how *aop_console_how)
  {
-return do_domain_create(ctx, d_config, domid, -1, 0,
+return do_domain_create(ctx, d_config, domid, -1, -1, 0,
  ao_how, aop_console_how);
  }

@@ -1629,7 +1630,12 @@ int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config,
  const libxl_asyncop_how *ao_how,
  const libxl_asyncprogress_how *aop_console_how)
  {
-return do_domain_create(ctx, d_config, domid, restore_fd,
+int send_fd = -1;
+
+if (params->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_COLO)
+send_fd = params->send_fd;
+
+return do_domain_create(ctx, d_config, domid, restore_fd, send_fd,
  params->checkpointed_stream, ao_how, aop_console_how);
  }

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index fbbae93..6d214b5 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2874,6 +2874,7 @@ struct libxl__domain_save_state {

  uint32_t domid;
  int fd;
+int recv_fd;
  libxl_domain_type type;
  int live;
  int debug;
@@ -3143,6 +3144,7 @@ struct libxl__domain_create_state {
  libxl_domain_config *guest_config;
  libxl_domain_config guest_config_saved; /* vanilla config */
  int restore_fd;
+int send_fd;
  libxl__domain_create_cb *callback;
  libxl_asyncprogress_how aop_console_how;
  /* private to domain_create */
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 23f27d4..8a3d7ba 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -198,6 +198,12 @@ libxl_viridian_enlightenment = Enumeration("viridian_enlightenment", [
  (3, "reference_tsc"),
  ])

+libxl_checkpointed_stream = Enumeration("checkpointed_stream", [
+(0, "NONE"),
+(1, "REMUS"),
+(2, "COLO"),
+], init_val = 0)


The default init_val is 0 so you don't need to write it down.


Okay.




+
  #
  # Complex libxl types
  #
@@ -346,6 +352,7 @@ libxl_domain_create_info = Struct("domain_create_info",[


Re: [Xen-devel] [PATCH v2 COLOPre 10/13] tools/libxl: Add back channel to allow migration target send data back

2015-06-14 Thread Yang Hongyang



On 06/12/2015 11:04 PM, Ian Jackson wrote:

Wei Liu writes ("Re: [Xen-devel] [PATCH v2 COLOPre 10/13] tools/libxl: Add back channel to allow migration target send data back"):

On Mon, Jun 08, 2015 at 11:43:14AM +0800, Yang Hongyang wrote:

From: Wen Congyang 

In COLO mode, the slave needs to send data to the master, but io_fd
can only be written on the master and can only be read on the slave.
Save recv_fd in domain_suspend_state, and send_fd in
domain_create_state.

...

  libxl_domain_restore_params = Struct("domain_restore_params", [
  ("checkpointed_stream", integer),
+("send_fd", integer),


I'm not entirely sure if we want to bury an extra argument here.

After looking at code I think you're trying to work around API
limitation. I think we are safe to extend the API -- we've already done
that before. See libxl.h around line 990.

Ian and Ian, what do you think?


I agree with you, Wei.  I don't think an fd should be in
libxl_domain_restore_params at all.


Then I'll just extend the params of libxl_domain_create_restore().
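
A rough sketch of what "extending the params" can look like while keeping existing callers working. Everything here is hypothetical: create_restore_v1(), create_restore_v2() and struct restore_params are stand-ins invented for illustration, not the real libxl entry points or types, and the return values only report whether a back channel was supplied.

```c
#include <assert.h>

/* Hypothetical stand-in for the restore parameters; not the libxl type. */
struct restore_params {
    int checkpointed_stream;
};

/*
 * Hypothetical new-style entry point: the back-channel fd is an explicit
 * argument instead of a field buried in the params struct.
 * Returns -1 on a bad restore_fd, 1 if a back channel was supplied, else 0.
 */
static int create_restore_v2(int restore_fd, int send_fd,
                             const struct restore_params *params)
{
    (void)params; /* unused in this sketch */
    if (restore_fd < 0)
        return -1;
    return send_fd >= 0 ? 1 : 0;
}

/* Hypothetical old-style entry point kept for existing callers. */
static int create_restore_v1(int restore_fd,
                             const struct restore_params *params)
{
    /* Old callers have no back channel, so pass -1. */
    return create_restore_v2(restore_fd, -1, params);
}
```

The design point is only that the old signature stays available as a thin wrapper, so the fd no longer needs to live in libxl_domain_restore_params.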



We need to understand what the API semantics are.  Are we going to
introduce a new libxl API entrypoint?  We already have
libxl_domain_remus_start.


We use libxl_domain_remus_start for COLO. COLO is an option of "xl remus".



Ian.
.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 COLOPre 11/13] tools/libxl: rename remus device to checkpoint device

2015-06-14 Thread Yang Hongyang



On 06/12/2015 10:57 PM, Ian Jackson wrote:

Wei Liu writes ("Re: [Xen-devel] [PATCH v2 COLOPre 11/13] tools/libxl: rename remus device to checkpoint device"):

On Fri, Jun 12, 2015 at 02:30:46PM +0100, Wei Liu wrote:

On Mon, Jun 08, 2015 at 11:43:15AM +0800, Yang Hongyang wrote:

-(-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
-(-19, "REMUS_DEVICE_NOT_SUPPORTED"),
+(-18, "CHECKPOINT_DEVOPS_DOES_NOT_MATCH"),
+(-19, "CHECKPOINT_DEVICE_NOT_SUPPORTED"),


You should add two new error numbers.



And in that case you might also need to go through all places to make
sure the correct error numbers are returned, i.e. the old remus code path
still returns the REMUS error codes and the new CHECKPOINT code path
returns the new error codes.

I merely speak from API backward compatibility point of view. If you
think what I suggest doesn't make sense, please let me know.


To me this line of reasoning prompts me to ask: what would be wrong with
leaving the word REMUS in the error names, and simply updating the
descriptions?

After all, AFAICT the circumstances are very similar.  I don't think it
makes sense to require libxl to do something like
rc = were_we_doing_colo_not_remus ? CHECKPOINT_BLAH : REMUS_BLAH;

Please do contradict me if I have misunderstood...


COLO and REMUS are both checkpoint devices. The checkpoint device layer is
a more abstract layer shared by COLO and REMUS, so its error codes can be
used by both. We therefore do not distinguish between COLO and REMUS at
this level; users are already aware of which one they are running (colo
or remus).
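
For illustration only, one way a rename can avoid breaking existing users is to keep the numeric values and leave the old names as aliases. This is a sketch of that general technique, not something proposed in the thread; the ERR_* identifiers are made up, and only the values -18 and -19 come from the patch.

```c
#include <assert.h>

/* Hypothetical names; only the numeric values are taken from the patch. */
enum checkpoint_error {
    ERR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH = -18,
    ERR_CHECKPOINT_DEVICE_NOT_SUPPORTED  = -19
};

/* Compatibility aliases so code written against the old names still builds. */
#define ERR_REMUS_DEVOPS_DOES_NOT_MATCH ERR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH
#define ERR_REMUS_DEVICE_NOT_SUPPORTED  ERR_CHECKPOINT_DEVICE_NOT_SUPPORTED

/* Map an error code to a short description shared by COLO and Remus. */
static const char *checkpoint_strerror(int rc)
{
    switch (rc) {
    case ERR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH:
        return "checkpoint device ops do not match";
    case ERR_CHECKPOINT_DEVICE_NOT_SUPPORTED:
        return "checkpoint device not supported";
    default:
        return "unknown error";
    }
}
```

With aliases like these, no code path has to choose between a REMUS and a CHECKPOINT code at run time, which is the situation Ian's were_we_doing_colo_not_remus example warns against.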



Ian.
.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 COLOPre 13/13] tools/libxl: don't touch remus in checkpoint_device

2015-06-14 Thread Yang Hongyang



On 06/12/2015 09:28 PM, Wei Liu wrote:

On Mon, Jun 08, 2015 at 11:43:17AM +0800, Yang Hongyang wrote:

Checkpoint device is an abstract layer to do checkpoint.
COLO can also use it to do checkpoint. But there are
still some codes in checkpoint device which touch remus:
1. remus_ops: we use remus ops directly in checkpoint
device. Store it in checkpoint device state.
2. concrete layer's private member: add a new structure
remus state, and move them to remus state.
3. init/cleanup device subkind: we call (init|cleanup)_subkind_nic
and (init|cleanup)_subkind_drbd_disk directly in checkpoint
device. Call them before calling libxl__checkpoint_devices_setup()
or after calling libxl__checkpoint_devices_teardown().




From the look of it this patch is mostly refactoring and doesn't involve
functional changes, right? If so please state that in commit message.


Yes, it is pure refactoring with no functional changes. I will mention
that in the next version.



I suppose this needs review from remus maintainer.

Wei.
.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 COLOPre 04/13] tools/libxc: export xc_bitops.h

2015-06-14 Thread Yang Hongyang



On 06/11/2015 06:55 PM, Ian Campbell wrote:

On Thu, 2015-06-11 at 11:45 +0100, Andrew Cooper wrote:

On 11/06/15 09:41, Ian Campbell wrote:

On Thu, 2015-06-11 at 10:07 +0800, Yang Hongyang wrote:

On 06/10/2015 11:20 PM, Ian Campbell wrote:

On Mon, 2015-06-08 at 11:43 +0800, Yang Hongyang wrote:

When we are under COLO, we will send dirty page bitmap info from
secondary to primary at every checkpoint.

... and this is a _libxl_ operation? Is that the right layer here?

For the first question: yes, this is done in the suspend callback on the
restore side. We do this in libxl because currently we have only added a
back channel on the libxl side. There is no back channel in libxc.

Thinking about this more, if we do this in libxc the code will be
less complex: we can drop the 4th and 9th patches of this series and also
get rid of the get_dirty_pfn() callback. Instead, we would add a patch that
adds a back channel to libxc.

That sounds better to me, but let's see what Andrew thinks.


For the second question, I'm not sure. What is Andrew's opinion? Which
is the right layer for this operation, libxl or libxc?


There are a number of bits of information which would be useful going in
"the backchannel".

Some are definitely more appropriate at the libxc level, but others are
more appropriate at the libxl.

If you recall from the hackathon, there was an Alibaba usecase where
they wanted a positive success/fail from the receiving side that the VM
has started up successfully before choosing between cleaning up or
continuing the VM on the sending side.  This would have to be a libxl
level backchannel.


FWIW this particular case is currently an xl level backchannel, but I
think your general point stands.


So do you both agree that we should add a backchannel to libxc and move this
operation to the libxc layer? What do the other tools maintainers think?




Whatever happens, backchannel wise, it should be a sensibly
type/length/chunk'd stream.  (I think there is a spec or two floating
around somewhere which might be a good start ;p)  There should probably
be a bit of active negotiation at the start of the backchannel to a)
confirm you have the correct backchannel and b) the backchannel is
actually functioning.

The data on "the backchannel" is always going to be in reply to an
action taking place in the primary channel, but there are complications
in that the libxc bit is inherently a blocking model.  In terms of
coordination, I am leaning towards the view of it being easier and
cleaner for each level to maintain its own backchannel communication.
The libxc bits can expect to read some records out of the backchannel at
each checkpoint and take appropriate actions before starting the next
checkpoint.

Thoughts?

~Andrew
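
To make the "type/length" framing idea concrete, here is a minimal sketch of encoding and decoding one backchannel record. The layout (a 32-bit type, a 32-bit length, body padded to 8 bytes, host byte order) and the bc_* names are assumptions made for illustration; they are not taken from any patch in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BC_HDR_SIZE 8u /* assumed header: uint32 type + uint32 length */

/* Round a body length up to the next multiple of 8 bytes (assumed padding). */
static size_t bc_padded(uint32_t len)
{
    return ((size_t)len + 7u) & ~(size_t)7u;
}

/* Write one record into buf. Returns bytes written, or 0 if it does not fit. */
static size_t bc_encode(uint8_t *buf, size_t cap, uint32_t type,
                        const void *body, uint32_t len)
{
    size_t total = BC_HDR_SIZE + bc_padded(len);

    if (cap < total)
        return 0;
    memcpy(buf, &type, sizeof(type));
    memcpy(buf + 4, &len, sizeof(len));
    memset(buf + BC_HDR_SIZE, 0, total - BC_HDR_SIZE); /* zero the padding */
    if (len)
        memcpy(buf + BC_HDR_SIZE, body, len);
    return total;
}

/* Parse one record. Returns bytes consumed, or 0 if buf is truncated. */
static size_t bc_decode(const uint8_t *buf, size_t avail, uint32_t *type,
                        uint32_t *len, const uint8_t **body)
{
    size_t total;

    if (avail < BC_HDR_SIZE)
        return 0;
    memcpy(type, buf, sizeof(*type));
    memcpy(len, buf + 4, sizeof(*len));
    total = BC_HDR_SIZE + bc_padded(*len);
    if (avail < total)
        return 0;
    *body = buf + BC_HDR_SIZE;
    return total;
}
```

A receiver built this way can skip record types it does not understand by advancing by the returned size, and an initial negotiation record is just one more type in the same framing.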




.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v6 COLO 02/15] secondary vm suspend/resume/checkpoint code

2015-06-14 Thread Yang Hongyang



On 06/12/2015 10:23 PM, Wei Liu wrote:

On Mon, Jun 08, 2015 at 11:45:46AM +0800, Yang Hongyang wrote:

From: Wen Congyang 

The secondary vm runs in colo mode, so the following steps are repeated
for every checkpoint:
1. Resume secondary vm
a. Send LIBXL_COLO_SVM_READY to master.
b. If it is not the first resume, call libxl__checkpoint_devices_preresume().
c. If it is the first resume (resume right after live migration),
   - call libxl__xc_domain_restore_done() to build the secondary vm.
   - enable secondary vm's logdirty.
   - call libxl__domain_resume() to resume secondary vm.
   - call libxl__checkpoint_devices_setup() to set up checkpoint devices.
d. Send LIBXL_COLO_SVM_RESUMED to master.
2. Wait for a new checkpoint
a. Call libxl__checkpoint_devices_commit().
b. Read LIBXL_COLO_NEW_CHECKPOINT from master.
3. Suspend secondary vm
a. Suspend secondary vm.
b. Call libxl__checkpoint_devices_postsuspend().
c. Get secondary vm's dirty page information.
d. Send LIBXL_COLO_SVM_SUSPENDED to master.
e. Send secondary vm's dirty page information to master (count + pfn list).
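
The cycle above can be read as a three-stage state machine. The sketch below models only the stage order and the control messages the secondary sends in each stage; the enum values mirror the LIBXL_COLO_* values in the patch, while the svm_* names and the stage enum are assumptions made for illustration.

```c
#include <assert.h>

/* Control messages; values mirror the LIBXL_COLO_* enum in the patch. */
enum colo_msg {
    COLO_NEW_CHECKPOINT = 1,
    COLO_SVM_SUSPENDED,
    COLO_SVM_READY,
    COLO_SVM_RESUMED
};

/* Hypothetical stage names for steps 1, 2 and 3 of the list above. */
enum svm_stage { SVM_RESUME, SVM_WAIT_CHECKPOINT, SVM_SUSPEND };

/* The secondary cycles resume -> wait -> suspend -> resume -> ... */
static enum svm_stage svm_next_stage(enum svm_stage s)
{
    switch (s) {
    case SVM_RESUME:          return SVM_WAIT_CHECKPOINT;
    case SVM_WAIT_CHECKPOINT: return SVM_SUSPEND;
    case SVM_SUSPEND:         return SVM_RESUME;
    }
    return SVM_RESUME;
}

/* Fill out[] with the messages the secondary sends in stage s; return count. */
static int svm_msgs_sent(enum svm_stage s, enum colo_msg out[2])
{
    switch (s) {
    case SVM_RESUME:
        out[0] = COLO_SVM_READY;
        out[1] = COLO_SVM_RESUMED;
        return 2;
    case SVM_WAIT_CHECKPOINT:
        return 0; /* only reads COLO_NEW_CHECKPOINT from the master */
    case SVM_SUSPEND:
        out[0] = COLO_SVM_SUSPENDED;
        return 1; /* the dirty page data follows as a separate payload */
    }
    return 0;
}
```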

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
---

[...]

+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#ifndef LIBXL_COLO_H
+#define LIBXL_COLO_H
+
+/*
+ * values to control suspend/resume primary vm and secondary vm
+ * at the same time
+ */
+enum {
+LIBXL_COLO_NEW_CHECKPOINT = 1,
+LIBXL_COLO_SVM_SUSPENDED,
+LIBXL_COLO_SVM_READY,
+LIBXL_COLO_SVM_RESUMED,
+};
+


Any reason to not have this in IDL?


No, will move it to IDL in the next version.




+extern void libxl__colo_restore_done(libxl__egc *egc, void *dcs_void,
+ int ret, int retval, int errnoval);
+extern void libxl__colo_restore_setup(libxl__egc *egc,
+  libxl__colo_restore_state *crs);
+extern void libxl__colo_restore_teardown(libxl__egc *egc,
+ libxl__colo_restore_state *crs,
+ int rc);
+
+#endif
diff --git a/tools/libxl/libxl_colo_restore.c b/tools/libxl/libxl_colo_restore.c
new file mode 100644
index 000..6c39758
--- /dev/null
+++ b/tools/libxl/libxl_colo_restore.c
@@ -0,0 +1,1158 @@
+/*
+ * Copyright (C) 2014 FUJITSU LIMITED
+ * Author: Wen Congyang 
+ *     Yang Hongyang 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+#include "libxl_colo.h"
+#include "xc_bitops.h"
+
+#define XC_PAGE_SHIFT   12
+#define PAGE_SHIFT  XC_PAGE_SHIFT


I don't think you need these.


+#define ROUNDUP(_x,_w) (((unsigned long)(_x)+(1UL<<(_w))-1) & ~((1UL<<(_w))-1))
+#define NRPAGES(x) (ROUNDUP(x, PAGE_SHIFT) >> PAGE_SHIFT)


And you can use XC_PAGE_SHIFT directly in above macro.


Okay, thanks.
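
A sketch of the macros after the suggested cleanup, with NRPAGES using XC_PAGE_SHIFT directly. XC_PAGE_SHIFT is defined locally here only so the fragment compiles on its own; in the tree it is assumed to come from the libxc headers.

```c
#include <assert.h>

/* Assumption: local fallback so this sketch is self-contained. */
#ifndef XC_PAGE_SHIFT
#define XC_PAGE_SHIFT 12
#endif

/* Round _x up to the next multiple of 2^_w. */
#define ROUNDUP(_x, _w) \
    (((unsigned long)(_x) + (1UL << (_w)) - 1) & ~((1UL << (_w)) - 1))

/* Number of pages needed to hold x bytes. */
#define NRPAGES(x) (ROUNDUP(x, XC_PAGE_SHIFT) >> XC_PAGE_SHIFT)
```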




+
+enum {
+LIBXL_COLO_SETUPED,
+LIBXL_COLO_SUSPENDED,
+LIBXL_COLO_RESUMED,
+};
+


Move it to IDL as well?


Ok.




+typedef struct libxl__colo_restore_checkpoint_state libxl__colo_restore_checkpoint_state;
+struct libxl__colo_restore_checkpoint_state {
+xc_hypercall_buffer_t _dirty_bitmap;
+xc_hypercall_buffer_t *dirty_bitmap;


This one looks like layer violation to me. I don't have other good
suggestion on how to do this though. Maybe Ian and Ian have better idea.


We are discussing moving this operation to the libxc layer. What's your opinion?
Please refer to the 4th COLOPre patch.




+unsigned long p2m_size;
+libxl__domain_suspend_state dsps;
+libxl__datacopier_state dc;
+uint8_t section;


This could use a better name like "stage" / "state"?


"stage" would be better, thank you.




+libxl__logdirty_switch lds;
+libxl__colo_restore_state *crs;
+int status;
+bool preresume;
+/* used for teardown */
+int teardown_devices;
+int saved_rc;
+
+void (*callback)(libxl__egc *,
+ libxl__colo_restore_checkpoint_state *,
+ int);
+
+/*
+ * 0: secondary vm's dirty bitmap for domain @domid
+ * 1: secondary vm is ready(domain @domid)
+ * 2: secondary vm is resumed(domain @domid)
+ * 3. new checkpoint is triggered(domain @domid)
+ */
+ 

Re: [Xen-devel] [PATCH v6 COLO 02/15] secondary vm suspend/resume/checkpoint code

2015-06-14 Thread Yang Hongyang

Hi Ian J, Wei,

On 06/12/2015 10:51 PM, Ian Jackson wrote:

Wei Liu writes ("Re: [Xen-devel] [PATCH v6 COLO 02/15] secondary vm suspend/resume/checkpoint code"):

On Mon, Jun 08, 2015 at 11:45:46AM +0800, Yang Hongyang wrote:

From: Wen Congyang 
+crcs->status = LIBXL_COLO_RESUMED;
+
+/* avoid calling libxl__xc_domain_restore_done() more than once */
+if (crs->saved_cb) {
+dcs->callback = crs->saved_cb;
+crs->saved_cb = NULL;


I have a feeling that this trick should be avoided. But I'm not an
expert on this so I will defer judgement to Ian J.


Yes, this trick should be avoided.  It will make the resulting
control flow very confusing.


I agree that this part is a bit tricky. I will try to find another
way to do this, perhaps by adding a state variable that indicates which
stage we are in: the first boot or a checkpoint.
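
A minimal sketch of the "explicit state variable" alternative, as opposed to swapping callback pointers. The names (restore_phase, restore_ctx, on_resumed) are hypothetical, and the counters stand in for the real work, such as calling libxl__xc_domain_restore_done() once on first boot.

```c
#include <assert.h>

/* Hypothetical phase marker replacing the saved_cb pointer swap. */
enum restore_phase { PHASE_FIRST_BOOT, PHASE_CHECKPOINT };

struct restore_ctx {
    enum restore_phase phase;
    int build_calls;     /* how often the one-off "build domain" step ran */
    int preresume_calls; /* how often the per-checkpoint step ran */
};

/* Called each time the secondary has been resumed. */
static void on_resumed(struct restore_ctx *c)
{
    if (c->phase == PHASE_FIRST_BOOT) {
        c->build_calls++;            /* one-off work, done exactly once */
        c->phase = PHASE_CHECKPOINT; /* later resumes take the other path */
    } else {
        c->preresume_calls++;        /* normal per-checkpoint path */
    }
}
```

The control flow stays readable because the branch is taken on a named phase rather than on whether a callback pointer happens to be NULL.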



Ian.
.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v6 COLO 12/15] COLO nic: implement COLO nic subkind

2015-06-14 Thread Yang Hongyang



On 06/12/2015 10:35 PM, Wei Liu wrote:

On Mon, Jun 08, 2015 at 11:45:56AM +0800, Yang Hongyang wrote:

implement COLO nic subkind.

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
---
  tools/hotplug/Linux/Makefile |   1 +
  tools/hotplug/Linux/colo-proxy-setup | 131 +++


There are hardcoded paths in this script. Please avoid that.

For one Debian has iptables under /sbin, not /usr/local/sbin.


We are using a modified iptables here. But hardcoding paths is not a good
thing; I will avoid it in the next version.



Wei.
.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH 00/27] Libxl migration v2

2015-06-15 Thread Yang Hongyang



On 06/15/2015 09:44 PM, Andrew Cooper wrote:

This series adds support for the libxl migration v2 stream, and untangles the
existing layering violations of the toolstack and qemu records.

At the end of the series, legacy migration is no longer used.

Note: Remus support is broken and (RFC) fixed in separate patches in this
series.  It was too tangled to fix in a bisectable fashion.  Plain
suspend/migrate/resume however is (should be) bisectable along the entire
series.


A quick test on both PV and HVM guests shows that Remus support is still
broken. The Remus save/restore part works, but failover does not. To solve this:
On the libxl side:
1. Buffer the toolstack and qemu records at each checkpoint.
2. If the stream read fails on the xl side, drop the buffered records and
   return an error code that indicates a failover.
3. If the whole stream has been buffered (xl side), process/apply the
   toolstack and qemu records and return success.
4. If applying the toolstack and qemu records fails, return an error.

On the libxc side:
Check the return value of the checkpoint callback; if it indicates a
failover, do the failover.
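
The proposal above can be sketched as a pair of functions, one per layer. All names are hypothetical, the real code would be asynchronous, and the numeric return values (0 error, 1 success, 2 failover) are an assumption chosen for illustration.

```c
#include <assert.h>

/* Assumed return convention for the checkpoint callback. */
enum ckpt_result { CKPT_ERROR = 0, CKPT_SUCCESS = 1, CKPT_FAILOVER = 2 };

/* Stand-in for the toolstack and qemu records buffered at a checkpoint. */
struct ckpt_buffer {
    int nr_records;
};

/* libxl side: decide the outcome of one checkpoint. */
static enum ckpt_result checkpoint_cb(struct ckpt_buffer *b,
                                      int stream_read_ok, int apply_ok)
{
    if (!stream_read_ok) {
        b->nr_records = 0; /* drop the partial checkpoint */
        return CKPT_FAILOVER;
    }
    if (!apply_ok)
        return CKPT_ERROR; /* applying the buffered records failed */
    b->nr_records = 0;     /* records applied and consumed */
    return CKPT_SUCCESS;
}

/*
 * libxc side: act on the callback result.
 * Returns -1 on a fatal error, 0 otherwise; *do_failover is set when the
 * restore loop should stop and let the guest run from the last good state.
 */
static int handle_checkpoint_result(enum ckpt_result r, int *do_failover)
{
    *do_failover = (r == CKPT_FAILOVER);
    return r == CKPT_ERROR ? -1 : 0;
}
```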



There are a couple of outstanding questions:

1) What to do about the toolstack/xenstore record.  It is currently being
passed around as a blob, but it might be better to split it out.

2) What (if any) ABI/API qualifications are needed? (Particularly in reference
to patch 21)

The Remus code is untested by me, but is hopefully in the correct ballpark.
All other combinations of suspend/migrate/resume have been tested with PV and
HVM guests (qemu-trad and qemu-upstream), including 32 -> 64 bit migration
(which was the underlying bug causing us to write migration v2 in the first
place).

There are some further improvements which could be made.  In particular, it
appears that sending the toolstack record on each checkpoint is redundant, and
there is certainly room for some more pruning of the legacy migration code.

Anyway, thoughts/comments welcome.  Please test!

~Andrew


Andrew Cooper (22):
   tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
   tools/libxc: Always compile the compat qemu variables into xc_sr_context
   tools/libxl: Stash all restore parameters in domain_create_state
   tools/xl: Mandatory flag indicating the format of the migration stream
   tools/libxl: Introduce ROUNDUP()
   tools/libxl: Extra APIs for the save helper
   tools/libxl: Pass restore_fd as a parameter to libxl__xc_domain_restore()
   docs: Libxl migration v2 stream specification
   tools/python: Libxc migration v2 infrastructure
   tools/python: Libxl migration v2 infrastructure
   tools/python: Verification utility for v2 stream spec compliance
   tools/python: Conversion utility for legacy migration streams
   tools/libxl: Support converting a legacy stream to a v2 stream
   tools/libxl: Convert a legacy stream if needed
   tools/libxc+libxl+xl: Restore v2 streams
   tools/libxc+libxl+xl: Save v2 streams
   docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams
   tools/libxl: [RFC] Write checkpoint records into the stream
   tools/libx{c,l}: [RFC] Introduce restore_callbacks.checkpoint()
   tools/libxl: [RFC] Handle checkpoint records in a libxl migration v2 stream
   tools/libxc: Drop all XG_LIBXL_HVM_COMPAT code from libxc
   tools/libxl: Drop all knowledge of toolstack callbacks

Ian Jackson (2):
   libxl: cancellation: Preparations for save/restore cancellation
   libxl: cancellation: Handle SIGTERM in save/restore helper

Ross Lagerwall (3):
   tools/libxl: Migration v2 stream format
   tools/libxl: Infrastructure for reading a libxl migration v2 stream
   tools/libxl: Infrastructure for writing a v2 stream

  docs/specs/libxl-migration-stream.pandoc  |  218 
  tools/libxc/Makefile  |2 -
  tools/libxc/include/xenguest.h|3 +
  tools/libxc/xc_sr_common.h|5 -
  tools/libxc/xc_sr_restore.c   |   33 +-
  tools/libxc/xc_sr_restore_x86_hvm.c   |  124 -
  tools/libxc/xc_sr_save_x86_hvm.c  |   36 --
  tools/libxl/Makefile  |2 +
  tools/libxl/libxl_aoutils.c   |7 +
  tools/libxl/libxl_convert_callout.c   |  146 ++
  tools/libxl/libxl_create.c|   80 +--
  tools/libxl/libxl_dom.c   |   61 +--
  tools/libxl/libxl_internal.h  |  140 -
  tools/libxl/libxl_save_callout.c  |   63 +--
  tools/libxl/libxl_save_helper.c   |   95 ++--
  tools/libxl/libxl_save_msgs_gen.pl|9 +-
  tools/libxl/libxl_sr_stream_format.h  |   58 +++
  tools/libxl/libxl_stream_read.c   |  663 
  tools/libxl/libxl_stream_write.c  |  640 +++
  tools/libxl/libxl_types.idl   |2 +
  tools/libxl/xl_cmdimpl.c  |9 +-
  tools/python/Makefile   

Re: [Xen-devel] [PATCH 24/27] tools/libx{c, l}: [RFC] Introduce restore_callbacks.checkpoint()

2015-06-15 Thread Yang Hongyang



On 06/15/2015 09:44 PM, Andrew Cooper wrote:

And call it when a checkpoint record is found in the libxc stream.

Signed-off-by: Andrew Cooper 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
---
  tools/libxc/include/xenguest.h |3 +++
  tools/libxc/xc_sr_restore.c|   15 ++-
  tools/libxl/libxl_save_msgs_gen.pl |2 +-
  3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 7581263..b0d27ed 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -102,6 +102,9 @@ struct restore_callbacks {
  int (*toolstack_restore)(uint32_t domid, const uint8_t *buf,
  uint32_t size, void* data);

+/* A checkpoint record has been found in the stream */


Describe the return value, e.g:
2 failover
1 success
0 error


+int (*checkpoint)(void* data);
+
  /* to be provided as the last argument to each callback function */
  void* data;
  };
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 9e27dba..5e0f817 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -1,5 +1,7 @@
  #include 

+#include 
+
  #include "xc_sr_common.h"

  /*
@@ -472,7 +474,7 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
  static int handle_checkpoint(struct xc_sr_context *ctx)
  {
  xc_interface *xch = ctx->xch;
-int rc = 0;
+int rc = 0, ret;
  unsigned i;

  if ( !ctx->restore.checkpointed )
@@ -482,6 +484,13 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
  goto err;
  }

+ret = ctx->restore.callbacks->checkpoint(ctx->restore.callbacks->data);


Should check whether we need to failover.


+if ( ret )
+{
+rc = -1;
+goto err;
+}
+
  if ( ctx->restore.buffer_all_records )
  {
  IPRINTF("All records buffered");
@@ -735,6 +744,10 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
  ctx.restore.checkpointed = checkpointed_stream;
  ctx.restore.callbacks = callbacks;

+/* Sanity checks for callbacks. */
+if (checkpointed_stream)
+assert(callbacks->checkpoint);
+
  IPRINTF("In experimental %s", __func__);
  DPRINTF("fd %d, dom %u, hvm %u, pae %u, superpages %d"
  ", checkpointed_stream %d", io_fd, dom, hvm, pae,
diff --git a/tools/libxl/libxl_save_msgs_gen.pl b/tools/libxl/libxl_save_msgs_gen.pl
index 6b4b65e..36b279e 100755
--- a/tools/libxl/libxl_save_msgs_gen.pl
+++ b/tools/libxl/libxl_save_msgs_gen.pl
@@ -25,7 +25,7 @@ our @msgs = (
  'unsigned long', 'total'] ],
  [  3, 'scxA',   "suspend", [] ],
  [  4, 'scxA',   "postcopy", [] ],
-[  5, 'scxA',   "checkpoint", [] ],
+[  5, 'srcxA',   "checkpoint", [] ],
  [  6, 'scxA',   "switch_qemu_logdirty",  [qw(int domid
unsigned enable)] ],
  #toolstack_save  done entirely `by hand'



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 1/6] tools/libxl: rename libxl__domain_suspend to libxl__domain_save

2015-06-16 Thread Yang Hongyang



On 06/16/2015 08:59 PM, Ian Campbell wrote:

On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote:

Rename libxl__domain_suspend() to libxl__domain_save() since it
actually does the domain save work.


This results in some strangeness in that some functions called *save*
are now passed a struct called *suspend*. I think this is probably
temporary and is all fixed up by the end of the series, is that true?


Yes, it is fixed by the refactor of the suspend state.



If so then this temporary state of affairs is:
 Acked-by: Ian Campbell 


.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 3/6] tools/libxl: move domain resume code into libxl_dom_suspend.c

2015-06-16 Thread Yang Hongyang



On 06/16/2015 09:04 PM, Ian Campbell wrote:

On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote:

move domain resume code into libxl_dom_suspend.c.


Even though it has "resume" in the name, I'm not sure that
libxl__domain_s3_resume is a good candidate for moving to the suspend
code. It's called only from libxl_send_trigger and IIRC we don't really
implement s3, just a fake version where the domain is paused/unpaused
(rather than destroyed and resumed, say).

Having moved libxl__domain_resume and libxl__domain_resume_device_model
into the same file I think the latter could become static. (If you do
that in this patch please say something along the lines of "pure code
motion except for making libxl__domain_resume_device_model static" in
the commit message)


ok, will fix this in the next version, thank you for the review!




.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 2/6] tools/libxl: move domain suspend code into libxl_dom_suspend.c

2015-06-16 Thread Yang Hongyang



On 06/16/2015 09:00 PM, Ian Campbell wrote:

On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote:

Move domain suspend code into a separate file libxl_dom_suspend.c.
export an API libxl__domain_suspend() which wrappers the static


just "..which wraps the..."


will fix, thank you !




function domain_suspend_callback_common() for internal use.

Note that the newly added file libxl_dom_suspend.c is used for
suspend/resume code.

Signed-off-by: Yang Hongyang 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
Acked-by: Ian Campbell 
---
  tools/libxl/Makefile|   3 +-
  tools/libxl/libxl_dom.c | 350 +---
  tools/libxl/libxl_dom_suspend.c | 381 
  tools/libxl/libxl_internal.h|   6 +
  4 files changed, 393 insertions(+), 347 deletions(-)
  create mode 100644 tools/libxl/libxl_dom_suspend.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index cc9c152..3f98d62 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -95,7 +95,8 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
libxl_internal.o libxl_utils.o libxl_uuid.o \
libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
libxl_save_callout.o _libxl_save_msgs_callout.o \
-   libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
+   libxl_qmp.o libxl_event.o libxl_fork.o libxl_dom_suspend.o \
+   $(LIBXL_OBJS-y)
  LIBXL_OBJS += libxl_genid.o
  LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index cce04dd..9444329 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1103,11 +1103,6 @@ int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,

  /* Domain suspend (save) */

-static void domain_save_done(libxl__egc *egc,
- libxl__domain_suspend_state *dss, int rc);
-static void domain_suspend_callback_common_done(libxl__egc *egc,
-libxl__domain_suspend_state *dss, int ok);
-
  /*- complicated callback, called by xc_domain_save -*/

  /*
@@ -1324,35 +1319,6 @@ static void switch_logdirty_done(libxl__egc *egc,

  /*- callbacks, called by xc_domain_save -*/

-int libxl__domain_suspend_device_model(libxl__gc *gc,
-   libxl__domain_suspend_state *dss)
-{
-int ret = 0;
-uint32_t const domid = dss->domid;
-const char *const filename = dss->dm_savefile;
-
-switch (libxl__device_model_version_running(gc, domid)) {
-case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
-LOG(DEBUG, "Saving device model state to %s", filename);
-libxl__qemu_traditional_cmd(gc, domid, "save");
-libxl__wait_for_device_model_deprecated(gc, domid, "paused", NULL, NULL, NULL);
-break;
-}
-case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-if (libxl__qmp_stop(gc, domid))
-return ERROR_FAIL;
-/* Save DM state into filename */
-ret = libxl__qmp_save(gc, domid, filename);
-if (ret)
-unlink(filename);
-break;
-default:
-return ERROR_INVAL;
-}
-
-return ret;
-}
-
  int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
  {

@@ -1373,301 +1339,6 @@ int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
  return 0;
  }

-static void domain_suspend_common_wait_guest(libxl__egc *egc,
- libxl__domain_suspend_state *dss);
-static void domain_suspend_common_guest_suspended(libxl__egc *egc,
- libxl__domain_suspend_state *dss);
-
-static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
-  libxl__xswait_state *xswa, int rc, const char *state);
-static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc,
-libxl__ev_evtchn *evev);
-static void suspend_common_wait_guest_watch(libxl__egc *egc,
-  libxl__ev_xswatch *xsw, const char *watch_path, const char *event_path);
-static void suspend_common_wait_guest_check(libxl__egc *egc,
-libxl__domain_suspend_state *dss);
-static void suspend_common_wait_guest_timeout(libxl__egc *egc,
-  libxl__ev_time *ev, const struct timeval *requested_abs);
-
-static void domain_suspend_common_failed(libxl__egc *egc,
- libxl__domain_suspend_state *dss);
-static void domain_suspend_common_done(libxl__egc *egc,
-   libxl__domain_suspend_state *dss,
-   bool ok);
-
-static bool domain_suspend_pvcontrol_acked(const char *state) {
-/* any value other than "suspend

Re: [Xen-devel] [PATCH v2 4/6] tools/libxl: move remus code into libxl_remus.c

2015-06-16 Thread Yang Hongyang



On 06/16/2015 09:08 PM, Ian Campbell wrote:

On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote:

move remus code into libxl_remus.c.


Please say something like "... by refactoring bits of
libxl_domain_remus_start and domain_save_done into X and Y and moving
the remaining functionality unchanged into the new file".

I gave two examples of functions which changed there, but please make
sure the list is complete+accurate.


Okay, will fix the commit message next time, thank you!




Signed-off-by: Yang Hongyang 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
---
  tools/libxl/Makefile |   2 +-
  tools/libxl/libxl.c  |  55 +---
  tools/libxl/libxl_dom.c  | 206 +
  tools/libxl/libxl_internal.h |  11 ++
  tools/libxl/libxl_remus.c| 304 +++
  5 files changed, 318 insertions(+), 260 deletions(-)
  create mode 100644 tools/libxl/libxl_remus.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 3f98d62..8535eaa 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -56,7 +56,7 @@ else
  LIBXL_OBJS-y += libxl_nonetbuffer.o
  endif

-LIBXL_OBJS-y += libxl_remus_device.o libxl_remus_disk_drbd.o
+LIBXL_OBJS-y += libxl_remus.o libxl_remus_device.o libxl_remus_disk_drbd.o

  LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
  LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 77c6a36..0f9248e 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -792,10 +792,6 @@ out:
  return ptr;
  }

-static void libxl__remus_setup_done(libxl__egc *egc,
-libxl__remus_devices_state *rds, int rc);
-static void libxl__remus_setup_failed(libxl__egc *egc,
-  libxl__remus_devices_state *rds, int rc);
  static void remus_failover_cb(libxl__egc *egc,
libxl__domain_suspend_state *dss, int rc);

@@ -844,63 +840,14 @@ int libxl_domain_remus_start(libxl_ctx *ctx, 
libxl_domain_remus_info *info,

  assert(info);

-/* Convenience aliases */
-libxl__remus_devices_state *const rds = &dss->rds;
-
-if (libxl_defbool_val(info->netbuf)) {
-if (!libxl__netbuffer_enabled(gc)) {
-LOG(ERROR, "Remus: No support for network buffering");
-rc = ERROR_FAIL;
-goto out;
-}
-rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
-}
-
-if (libxl_defbool_val(info->diskbuf))
-rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
-
-rds->ao = ao;
-rds->domid = domid;
-rds->callback = libxl__remus_setup_done;
-
  /* Point of no return */
-libxl__remus_devices_setup(egc, rds);
+libxl__remus_setup(egc, dss);
  return AO_INPROGRESS;

   out:
  return AO_ABORT(rc);
  }

-static void libxl__remus_setup_done(libxl__egc *egc,
-libxl__remus_devices_state *rds, int rc)
-{
-libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-STATE_AO_GC(dss->ao);
-
-if (!rc) {
-libxl__domain_save(egc, dss);
-return;
-}
-
-LOG(ERROR, "Remus: failed to setup device for guest with domid %u, rc %d",
-dss->domid, rc);
-rds->callback = libxl__remus_setup_failed;
-libxl__remus_devices_teardown(egc, rds);
-}
-
-static void libxl__remus_setup_failed(libxl__egc *egc,
-  libxl__remus_devices_state *rds, int rc)
-{
-libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-STATE_AO_GC(dss->ao);
-
-if (rc)
-LOG(ERROR, "Remus: failed to teardown device after setup failed"
-" for guest with domid %u, rc %d", dss->domid, rc);
-
-dss->callback(egc, dss, rc);
-}
-
  static void remus_failover_cb(libxl__egc *egc,
libxl__domain_suspend_state *dss, int rc)
  {
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 701e9f7..0f81081 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1409,189 +1409,6 @@ int libxl__toolstack_save(uint32_t domid, uint8_t **buf,
  return 0;
  }

-/*- remus callbacks -*/
-static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
-libxl__domain_suspend_state *dss, int ok);
-static void remus_devices_postsuspend_cb(libxl__egc *egc,
- libxl__remus_devices_state *rds,
- int rc);
-static void remus_devices_preresume_cb(libxl__egc *egc,
-   libxl__remus_devices_state *rds,
-   int rc);
-
-static void libxl__remus_domain_suspend_callback(void *data)
-{
-libxl__save_helper_

Re: [Xen-devel] [PATCH v2 5/6] tools/libxl: move save/restore code into libxl_dom_save.c

2015-06-16 Thread Yang Hongyang



On 06/16/2015 09:09 PM, Ian Campbell wrote:

On Wed, 2015-06-03 at 16:01 +0800, Yang Hongyang wrote:

move save/restore code into libxl_dom_save.c.


If this (unlike other patches in the series) is purely code motion
please indicate that this is the case.

You might also like to consider refactoring things such that all patches
are pure motion.


Yes, this is pure code motion, with no refactoring or other changes.







--
Thanks,
Yang.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 COLOPre 09/13] tools/libxl: Update libxl_save_msgs_gen.pl to support return data from xl to xc

2015-06-16 Thread Yang Hongyang

Hi Ian,

On 06/16/2015 07:05 PM, Ian Jackson wrote:

Yang Hongyang writes ("[Xen-devel] [PATCH v2 COLOPre 09/13] tools/libxl: Update 
libxl_save_msgs_gen.pl to support return data from xl to xc"):

From: Wen Congyang 

  Currently, all callbacks return an integer value or void. We cannot
  return some data to xc via callback. Update libxl_save_msgs_gen.pl
  to support this case.


Thanks for this.  I would have some comments on the details, but first
I want to properly understand your use case.  So while I'm the author
and maintainer of this save helper, I won't review this in detail just
yet.  I'm following the thread about what this is for...


We need to send the pfns of the secondary's dirty pages back to the primary.
The primary will then send every page that was dirtied on either the primary
or the secondary to the secondary. In this way the secondary's memory will be
consistent with the primary's.

As we discussed in [PATCH v2 COLOPre 04/13] tools/libxc: export xc_bitops.h,
if we move this operation to the libxc layer, this patch could be dropped.
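
As a rough illustration of the idea above (not code from this series): if
each side tracks its dirty pages in a bitmap indexed by pfn, the set of pages
the primary has to resend is the bitwise OR of the two bitmaps. The names
merge_dirty_bitmaps() and pfn_is_dirty(), and the flat unsigned long array
layout, are assumptions made for this sketch only.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Sketch-local constant, not taken from libxc. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: number of longs needed to hold nr_pfns bits. */
static size_t bitmap_longs(size_t nr_pfns)
{
    return (nr_pfns + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG;
}

/*
 * Fold the secondary's dirty bitmap into the primary's one, so that the
 * primary ends up with every pfn dirtied on either side.
 */
static void merge_dirty_bitmaps(unsigned long *primary,
                                const unsigned long *secondary,
                                size_t nr_pfns)
{
    size_t i, n = bitmap_longs(nr_pfns);

    assert(primary && secondary);

    for (i = 0; i < n; i++)
        primary[i] |= secondary[i];
}

/* Returns non-zero if the given pfn is marked dirty in the bitmap. */
static int pfn_is_dirty(const unsigned long *bitmap, size_t pfn)
{
    return (bitmap[pfn / SKETCH_BITS_PER_LONG] >>
            (pfn % SKETCH_BITS_PER_LONG)) & 1UL;
}
```

After the merge, the primary would walk its own bitmap and send each page
whose bit is set; where that walk lives (libxl or libxc) is exactly the
open question in this thread.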



Thanks,
Ian.



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream

2015-06-16 Thread Yang Hongyang



On 06/15/2015 09:44 PM, Andrew Cooper wrote:
[...]

+
+static void write_emulator_record(libxl__egc *egc,
+  libxl__stream_write_state *stream)
+{
+libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
+libxl__datacopier_state *dc = &stream->dc;
+STATE_AO_GC(stream->ao);
+struct libxl_sr_rec_hdr rec = { REC_TYPE_EMULATOR_CONTEXT, 0 };
+struct libxl_sr_emulator_hdr ehdr = { 0 };
+struct stat st;
+int ret = 0;
+uint32_t qemu_state_len;
+
+assert(dss->type == LIBXL_DOMAIN_TYPE_HVM);
+
+/* Convenience aliases */
+const char *const filename = dss->dm_savefile;
+const uint32_t domid = dss->domid;
+
+switch(libxl__device_model_version_running(gc, domid)) {
+case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
+ehdr.id = EMULATOR_QEMU_TRADITIONAL;
+break;
+
+case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+ehdr.id = EMULATOR_QEMU_UPSTREAM;
+break;
+
+default:
+ret = ERROR_FAIL;
+goto err;
+}
+
+ret = libxl__domain_suspend_device_model(gc, dss);


This call is no longer needed: the suspend callback has already called
this function, and the emulator context has already been saved to a file.

Under Remus, this call would also stop the primary's emulator. The
postcopy callback resumes the primary, so we shouldn't suspend the
device model again in the checkpoint callback.


+if (ret)
+goto err;
+
+dc->readwhat = GCSPRINTF("qemu save file %s", filename);
+dc->copywhat = "emulator record";
+dc->writewhat = "save/migration stream";
+dc->callback = emulator_body_done;
+
+dc->readfd = open(filename, O_RDONLY);
+if (dc->readfd < 0) {
+LOGE(ERROR, "unable to open %s", dc->readwhat);
+goto err;
+}
+
+if (fstat(dc->readfd, &st))
+{
+LOGE(ERROR, "unable to fstat %s", dc->readwhat);
+goto err;
+}
+
+if (!S_ISREG(st.st_mode)) {
+LOG(ERROR, "%s is not a plain file!", dc->readwhat);
+goto err;
+}
+
+qemu_state_len = st.st_size;
+rec.length = qemu_state_len + sizeof(ehdr);
+
+ret = libxl__datacopier_start(dc);
+if (ret)
+goto err;
+
+libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
+libxl__datacopier_prefixdata(egc, dc, &ehdr, sizeof(ehdr));
+
+stream->padding = ROUNDUP(qemu_state_len, REC_ALIGN_ORDER) - 
qemu_state_len;
+return;
+
+ err:
+assert(ret);
+stream_failed(egc, stream, ret);
+}
+
+static void emulator_body_done(libxl__egc *egc,
+   libxl__datacopier_state *dc,
+   int onwrite, int errnoval)
+{
+libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
+STATE_AO_GC(stream->ao);
+int ret = 0;
+
+if (onwrite || errnoval) {
+ret = ERROR_FAIL;
+goto err;
+}
+
+dc->readwhat = "";
+dc->readfd = -1;
+
+if (stream->padding) {
+assert(stream->padding < (1U << REC_ALIGN_ORDER));
+
+dc->copywhat = "emulator padding";
+dc->writewhat = "save/migration stream";
+dc->callback = emulator_padding_done;
+
+ret = libxl__datacopier_start(dc);
+if (ret)
+goto err;
+
+libxl__datacopier_prefixdata(egc, dc, zero_padding, stream->padding);
+return;
+}
+
+emulator_padding_done(egc, dc, 0, 0);
+return;
+
+ err:
+assert(ret);
+stream_failed(egc, stream, ret);
+}
+
+static void emulator_padding_done(libxl__egc *egc,
+  libxl__datacopier_state *dc,
+  int onwrite, int errnoval)
+{
+libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
+STATE_AO_GC(stream->ao);
+int ret = 0;
+
+if (onwrite || errnoval) {
+ret = ERROR_FAIL;
+goto err;
+}
+
+write_end_record(egc, stream);
+return;
+
+ err:
+assert(ret);
+stream_failed(egc, stream, ret);
+}
+
+static void write_end_record(libxl__egc *egc,
+ libxl__stream_write_state *stream)
+{
+libxl__datacopier_state *dc = &stream->dc;
+STATE_AO_GC(stream->ao);
+struct libxl_sr_rec_hdr rec = { REC_TYPE_END, 0 };
+int ret = 0;
+
+dc->copywhat = "suspend footer";
+dc->writewhat = "save/migration stream";
+dc->callback = end_record_done;
+
+ret = libxl__datacopier_start(dc);
+if (ret)
+goto err;
+
+libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
+return;
+
+ err:
+assert(ret);
+stream_failed(egc, stream, ret);
+}
+
+static void end_record_done(libxl__egc *egc,
+libxl__datacopier_state *dc,
+int onwrite, int errnoval)
+{
+libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
+STATE_AO_GC(stream->ao);
+int ret = 0;
+
+if (onwrite || errnoval) {
+ret = ERROR_FAIL;
+goto err;
+}
+
+stream_success(egc, stream);

Re: [Xen-devel] [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream

2015-06-17 Thread Yang Hongyang



On 06/15/2015 09:44 PM, Andrew Cooper wrote:
[...]

+
+static void stream_success(libxl__egc *egc,
+   libxl__stream_write_state *stream);
+static void stream_failed(libxl__egc *egc,
+  libxl__stream_write_state *stream, int ret);
+static void stream_done(libxl__egc *egc,
+libxl__stream_write_state *stream);
+
+static void check_stream_finished(libxl__egc *egc,
+  libxl__domain_suspend_state *dcs,


s/dcs/dss/


+  int rc, const char *what);
+
+/* Event callbacks for plain VM. */
+static void stream_header_done(libxl__egc *egc,
+   libxl__datacopier_state *dc,
+   int onwrite, int errnoval);
+static void libxc_header_done(libxl__egc *egc,
+  libxl__datacopier_state *dc,
+  int onwrite, int errnoval);
+/* libxl__xc_domain_save_done() lives here, event-order wise. */
+static void write_toolstack_record(libxl__egc *egc,
+   libxl__stream_write_state *stream);
+static void toolstack_record_done(libxl__egc *egc,
+  libxl__datacopier_state *dc,
+  int onwrite, int errnoval);
+static void write_emulator_record(libxl__egc *egc,
+  libxl__stream_write_state *stream);
+static void emulator_body_done(libxl__egc *egc,
+   libxl__datacopier_state *dc,
+   int onwrite, int errnoval);
+static void emulator_padding_done(libxl__egc *egc,
+  libxl__datacopier_state *dc,
+  int onwrite, int errnoval);
+static void write_end_record(libxl__egc *egc,
+ libxl__stream_write_state *stream);
+static void end_record_done(libxl__egc *egc,
+libxl__datacopier_state *dc,
+int onwrite, int errnoval);
+
+void libxl__stream_write_start(libxl__egc *egc,
+   libxl__stream_write_state *stream)
+{
+libxl__datacopier_state *dc = &stream->dc;
+STATE_AO_GC(stream->ao);
+struct libxl_sr_hdr hdr = { 0 };
+int ret = 0;
+
+assert(!stream->running);
+stream->running = true;
+
+memset(dc, 0, sizeof(*dc));
+dc->readwhat = "";
+dc->copywhat = "suspend header";
+dc->writewhat = "save/migration stream";
+dc->ao = ao;
+dc->readfd = -1;
+dc->writefd = stream->fd;
+dc->maxsz = INT_MAX;
+dc->bytes_to_read = INT_MAX;
+dc->callback = stream_header_done;
+
+ret = libxl__datacopier_start(dc);
+if (ret)
+goto err;
+
+hdr.ident   = htobe64(RESTORE_STREAM_IDENT);
+hdr.version = htobe32(RESTORE_STREAM_VERSION);
+hdr.options = htobe32(0);
+
+libxl__datacopier_prefixdata(egc, dc, &hdr, sizeof(hdr));
+return;
+
+ err:
+assert(ret);
+stream_failed(egc, stream, ret);
+}
+
+void libxl__stream_write_abort(libxl__egc *egc,
+   libxl__stream_write_state *stream, int rc)
+{
+stream_failed(egc, stream, rc);
+}
+
+static void stream_success(libxl__egc *egc, libxl__stream_write_state *stream)
+{
+stream->rc = 0;
+stream->running = false;
+
+stream_done(egc, stream);
+}
+
+static void stream_failed(libxl__egc *egc,
+  libxl__stream_write_state *stream, int rc)
+{
+assert(rc);
+stream->rc = rc;
+
+if (stream->running) {
+stream->running = false;
+stream_done(egc, stream);
+}
+}
+
+static void stream_done(libxl__egc *egc,
+libxl__stream_write_state *stream)
+{
+libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
+
+assert(!stream->running);
+
+check_stream_finished(egc, dss, stream->rc, "stream");
+}
+
+static void check_stream_finished(libxl__egc *egc,
+  libxl__domain_suspend_state *dss,
+  int rc, const char *what)
+{
+libxl__stream_write_state *stream = &dss->sws;
+STATE_AO_GC(dss->ao);
+
+LOG(INFO, "Task '%s' joining (rc %d)", what, rc);
+
+if (rc && !stream->joined_rc) {
+bool skip = false;
+/* First reported failure from joining tasks.  Tear everything down */
+stream->joined_rc = rc;
+
+if (libxl__stream_write_inuse(&dss->sws)) {
+skip = true;
+libxl__stream_write_abort(egc, &dss->sws, rc);
+}
+
+if (libxl__save_helper_inuse(&dss->shs)) {
+skip = true;
+libxl__save_helper_abort(egc, &dss->shs);
+}
+
+/* There is at least one more active task to join - wait for its
+   callback */
+if ( skip )
+return;
+}
+
+if (libxl__stream_write_inuse(&dss->sws))
+LOG(DEBUG, "stream still in use");
+e

Re: [Xen-devel] [PATCH 24/27] tools/libx{c, l}: [RFC] Introduce restore_callbacks.checkpoint()

2015-06-17 Thread Yang Hongyang



On 06/15/2015 09:44 PM, Andrew Cooper wrote:

And call it when a checkpoint record is found in the libxc stream.

Signed-off-by: Andrew Cooper 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
---
  tools/libxc/include/xenguest.h |3 +++
  tools/libxc/xc_sr_restore.c|   15 ++-
  tools/libxl/libxl_save_msgs_gen.pl |2 +-
  3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 7581263..b0d27ed 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -102,6 +102,9 @@ struct restore_callbacks {
  int (*toolstack_restore)(uint32_t domid, const uint8_t *buf,
  uint32_t size, void* data);

+/* A checkpoint record has been found in the stream */
+int (*checkpoint)(void* data);
+
  /* to be provided as the last argument to each callback function */
  void* data;
  };
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 9e27dba..5e0f817 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -1,5 +1,7 @@
  #include 

+#include 
+
  #include "xc_sr_common.h"

  /*
@@ -472,7 +474,7 @@ static int handle_page_data(struct xc_sr_context *ctx, 
struct xc_sr_record *rec)
  static int handle_checkpoint(struct xc_sr_context *ctx)
  {
  xc_interface *xch = ctx->xch;
-int rc = 0;
+int rc = 0, ret;
  unsigned i;

  if ( !ctx->restore.checkpointed )
@@ -482,6 +484,13 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
  goto err;
  }

+ret = ctx->restore.callbacks->checkpoint(ctx->restore.callbacks->data);
+if ( ret )
+{
+rc = -1;
+goto err;
+}
+
  if ( ctx->restore.buffer_all_records )
  {
  IPRINTF("All records buffered");
@@ -735,6 +744,10 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, 
uint32_t dom,
  ctx.restore.checkpointed = checkpointed_stream;
  ctx.restore.callbacks = callbacks;

+/* Sanity checks for callbacks. */
+if (checkpointed_stream)


Coding style: libxc uses "if ( checkpointed_stream )", with spaces inside
the parentheses.


+assert(callbacks->checkpoint);
+
  IPRINTF("In experimental %s", __func__);
  DPRINTF("fd %d, dom %u, hvm %u, pae %u, superpages %d"
  ", checkpointed_stream %d", io_fd, dom, hvm, pae,
diff --git a/tools/libxl/libxl_save_msgs_gen.pl 
b/tools/libxl/libxl_save_msgs_gen.pl
index 6b4b65e..36b279e 100755
--- a/tools/libxl/libxl_save_msgs_gen.pl
+++ b/tools/libxl/libxl_save_msgs_gen.pl
@@ -25,7 +25,7 @@ our @msgs = (
  'unsigned long', 'total'] ],
  [  3, 'scxA',   "suspend", [] ],
  [  4, 'scxA',   "postcopy", [] ],
-[  5, 'scxA',   "checkpoint", [] ],
+[  5, 'srcxA',   "checkpoint", [] ],
  [  6, 'scxA',   "switch_qemu_logdirty",  [qw(int domid
unsigned enable)] ],
  #toolstack_save  done entirely `by hand'



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v2 COLOPre 07/13] tools/libxl: Update libxl__domain_unpause() to support qemu-xen

2015-06-17 Thread Yang Hongyang



On 06/16/2015 12:22 AM, Wei Liu wrote:

On Mon, Jun 15, 2015 at 09:29:55AM +0800, Yang Hongyang wrote:



On 06/12/2015 08:33 PM, Wei Liu wrote:

On Mon, Jun 08, 2015 at 11:43:11AM +0800, Yang Hongyang wrote:

Currently, libxl__domain_unpause() only supports
qemu-xen-traditional. Update it to support qemu-xen.

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 


This looks very similar to an existing function called
libxl__domain_resume_device_model. Maybe you don't need to invent a new
function.


---
  tools/libxl/libxl.c | 42 +-
  1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index d5691dc..5c843c2 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -933,10 +933,37 @@ out:
  return AO_INPROGRESS;
  }

-int libxl__domain_unpause(libxl__gc *gc, uint32_t domid)
+static int libxl__domain_unpause_device_model(libxl__gc *gc, uint32_t domid)
  {
  char *path;
  char *state;
+
+switch (libxl__device_model_version_running(gc, domid)) {
+case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
+uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
+
+path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
+state = libxl__xs_read(gc, XBT_NULL, path);
+if (state != NULL && !strcmp(state, "paused")) {


The only difference between your function and
libxl__domain_resume_device_model is the check for the "state" node. I
think you can just add the check to libxl__domain_resume_device_model
and use that function.


I'm not sure whether changing the existing function's behavior will affect
its existing callers. If there's no problem with doing so, I will do as you
suggest in the next version.



Qemu-dm currently has several states. libxl__domain_resume_device_model
doesn't check the state and writes unconditionally. I think checking
before writing would be an improvement.


fixed, thank you!
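
To make the "check before writing" point concrete, here is a minimal
sketch. It does not use the real libxl or xenstore API: struct dm_nodes
stands in for the device model's xenstore nodes, resume_dm_if_paused() is a
hypothetical name, and the "continue" command string is an assumption about
what the resume path writes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the device model's xenstore nodes; not a libxl type. */
struct dm_nodes {
    const char *state;    /* e.g. "paused", "running", or NULL if absent */
    const char *command;  /* last command written, or NULL if none */
};

/*
 * Only ask the device model to continue if it reports itself as paused.
 * Returns 1 if the command was written, 0 if the write was skipped.
 */
static int resume_dm_if_paused(struct dm_nodes *dm)
{
    assert(dm);

    /* Missing node or any other state: leave the device model alone. */
    if (dm->state == NULL || strcmp(dm->state, "paused") != 0)
        return 0;

    dm->command = "continue";
    return 1;
}
```

With this shape, callers that invoke the function on an already-running
device model see a no-op rather than an unconditional write.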



Wei.



Wei.



--
Thanks,
Yang.




--
Thanks,
Yang.



Re: [Xen-devel] [PATCH 03/27] tools/libxl: Stash all restore parameters in domain_create_state

2015-06-17 Thread Yang Hongyang



On 06/15/2015 09:44 PM, Andrew Cooper wrote:

Shortly more parameters will appear, and this saves unboxing each one.

No functional change.

Signed-off-by: Andrew Cooper 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 


Reviewed-by: Yang Hongyang 


---
  tools/libxl/libxl_create.c   |   12 ++--
  tools/libxl/libxl_internal.h |2 +-
  tools/libxl/libxl_save_callout.c |2 +-
  3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 86384d2..385891c 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1577,8 +1577,8 @@ static void domain_create_cb(libxl__egc *egc,
   int rc, uint32_t domid);

  static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
-uint32_t *domid,
-int restore_fd, int checkpointed_stream,
+uint32_t *domid, int restore_fd,
+const libxl_domain_restore_params *params,
  const libxl_asyncop_how *ao_how,
  const libxl_asyncprogress_how *aop_console_how)
  {
@@ -1591,8 +1591,8 @@ static int do_domain_create(libxl_ctx *ctx, 
libxl_domain_config *d_config,
  libxl_domain_config_init(&cdcs->dcs.guest_config_saved);
  libxl_domain_config_copy(ctx, &cdcs->dcs.guest_config_saved, d_config);
  cdcs->dcs.restore_fd = restore_fd;
+if (params) cdcs->dcs.restore_params = *params;
  cdcs->dcs.callback = domain_create_cb;
-cdcs->dcs.checkpointed_stream = checkpointed_stream;
  libxl__ao_progress_gethow(&cdcs->dcs.aop_console_how, aop_console_how);
  cdcs->domid_out = domid;

@@ -1619,7 +1619,7 @@ int libxl_domain_create_new(libxl_ctx *ctx, 
libxl_domain_config *d_config,
  const libxl_asyncop_how *ao_how,
  const libxl_asyncprogress_how *aop_console_how)
  {
-return do_domain_create(ctx, d_config, domid, -1, 0,
+return do_domain_create(ctx, d_config, domid, -1, NULL,
  ao_how, aop_console_how);
  }

@@ -1629,8 +1629,8 @@ int libxl_domain_create_restore(libxl_ctx *ctx, 
libxl_domain_config *d_config,
  const libxl_asyncop_how *ao_how,
  const libxl_asyncprogress_how 
*aop_console_how)
  {
-return do_domain_create(ctx, d_config, domid, restore_fd,
-params->checkpointed_stream, ao_how, 
aop_console_how);
+return do_domain_create(ctx, d_config, domid, restore_fd, params,
+ao_how, aop_console_how);
  }

  /*
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 6226c18..796bd21 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3122,11 +3122,11 @@ struct libxl__domain_create_state {
  libxl_domain_config *guest_config;
  libxl_domain_config guest_config_saved; /* vanilla config */
  int restore_fd;
+libxl_domain_restore_params restore_params;
  libxl__domain_create_cb *callback;
  libxl_asyncprogress_how aop_console_how;
  /* private to domain_create */
  int guest_domid;
-int checkpointed_stream;
  libxl__domain_build_state build_state;
  libxl__bootloader_state bl;
  libxl__stub_dm_spawn_state dmss;
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 40b25e4..3585a84 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -59,7 +59,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, 
libxl__domain_create_state *dcs,
  state->store_domid, state->console_port,
  state->console_domid,
  hvm, pae, superpages,
-cbflags, dcs->checkpointed_stream,
+cbflags, dcs->restore_params.checkpointed_stream,
  };

  dcs->shs.ao = ao;



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v6 COLO 01/15] docs: add colo readme

2015-06-24 Thread Yang Hongyang



On 06/16/2015 06:56 PM, Ian Campbell wrote:

On Mon, 2015-06-08 at 11:45 +0800, Yang Hongyang wrote:

add colo readme, refer to
http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping

Signed-off-by: Yang Hongyang 


This is fine as far as it goes but I wonder if perhaps
docs/README.{remus,colo} ought to be moved into docs/misc, perhaps
converted to markdown (which should be trivial) and perhaps merged into
a single document about checkpointing?


Agreed that we can add a checkpointing.txt to docs/misc and describe
Remus/COLO in that file. But can we do this later, once the COLO feature is
merged? At that point we can do it in a single patch.



The reason for the move is twofold, first it is a bit atypical for docs
to live in the top-level docs dir and secondly moving it into misc will
cause it to appear automatically at
http://xenbits.xen.org/docs/unstable/ etc.

Ian.

---
  docs/README.colo | 9 +
  1 file changed, 9 insertions(+)
  create mode 100644 docs/README.colo

diff --git a/docs/README.colo b/docs/README.colo
new file mode 100644
index 000..466eb72
--- /dev/null
+++ b/docs/README.colo
@@ -0,0 +1,9 @@
+COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop Service)
+project is a high availability solution. Both primary VM (PVM) and secondary VM
+(SVM) run in parallel. They receive the same request from client, and generate
+response in parallel too. If the response packets from PVM and SVM are
+identical, they are released immediately. Otherwise, a VM checkpoint (on 
demand)
+is conducted.
+
+See the website at http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
+for details.






--
Thanks,
Yang.



[Xen-devel] [PATCH] Remus: update email address in MAINTAINERS file

2016-03-01 Thread Yang Hongyang
From: Yang Hongyang 

Signed-off-by: Yang Hongyang 
Cc: Shriram Rajagopalan 
Cc: Wei Liu 
Cc: Ian Jackson 
Cc: Ian Campbell 
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index db14cfe..27280a4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -275,7 +275,7 @@ T:  git git://xenbits.xen.org/qemu-xen.git
 
 REMUS
 M: Shriram Rajagopalan 
-M: Yang Hongyang 
+M: Yang Hongyang 
 S: Maintained
 F: docs/README.remus
 F: tools/blktap2/drivers/block-remus.c
-- 
2.6.0




Re: [Xen-devel] [PATCH v2 COLOPre 11/13] tools/libxl: rename remus device to checkpoint device

2015-06-24 Thread Yang Hongyang



On 06/16/2015 06:53 PM, Ian Campbell wrote:

On Mon, 2015-06-15 at 17:24 +0100, Wei Liu wrote:

On Mon, Jun 15, 2015 at 09:45:54AM +0800, Yang Hongyang wrote:



On 06/12/2015 10:57 PM, Ian Jackson wrote:

Wei Liu writes ("Re: [Xen-devel] [PATCH v2 COLOPre 11/13] tools/libxl: rename remus 
device to checkpoint device"):

On Fri, Jun 12, 2015 at 02:30:46PM +0100, Wei Liu wrote:

On Mon, Jun 08, 2015 at 11:43:15AM +0800, Yang Hongyang wrote:

-(-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
-(-19, "REMUS_DEVICE_NOT_SUPPORTED"),
+(-18, "CHECKPOINT_DEVOPS_DOES_NOT_MATCH"),
+(-19, "CHECKPOINT_DEVICE_NOT_SUPPORTED"),


You should add two new error numbers.



And in that case you might also need to go through all places to make
sure the correct error numbers are returned. I.e. old remus code path
still returns REMUS error code and new CHECKPOINT code path returns new
error code.

I merely speak from API backward compatibility point of view. If you
think what I suggest doesn't make sense, please let me know.


This line of reasoning prompts me to ask: what would be wrong with
leaving the word REMUS in the error names, and simply updating the
descriptions?

After all, AFAICT the circumstances are very similar.  I don't think it
makes sense to require libxl to do something like
rc = were_we_doing_colo_not_remus ? CHECKPOINT_BLAH : REMUS_BLAH;

Please do contradict me if I have misunderstood...


COLO and REMUS are both checkpoint devices. We use the checkpoint device
layer as a more abstract layer for both COLO and REMUS. As for the error
codes, they can be used by both COLO and REMUS, so we don't distinguish
whether we are doing COLO or REMUS; users are aware of what they're
executing (COLO or REMUS).



Right. So continue using REMUS_ error code is fine.


Seems like it would also be OK to switch the name and then in libxl.h

#if LIBXL_API_VERSION < 0xWHENEVER
#define REMUS_BLAH CHECKPOINT_BLAH
#define ...
#endif

_If_ we think the new names make more sense going fwd...


Well, I think the new names are better, and I also think it is safe to just
rename them: I can't find any users of these error codes other than
Remus/COLO, which only use them internally.
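
For reference, the compatibility-alias approach suggested above could look
roughly like the sketch below. Every identifier here is a placeholder
(SKETCH_API_VERSION, the SKETCH_ERROR_* names and the 0x040600 cut-off are
invented for illustration); only the error numbers -18 and -19 come from the
patch.

```c
#include <assert.h>

/* Pretend the application asked for an old API version. */
#ifndef SKETCH_API_VERSION
#define SKETCH_API_VERSION 0x040500
#endif

/* New, more generic names keep the existing numeric values. */
enum sketch_error {
    SKETCH_ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH = -18,
    SKETCH_ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED  = -19,
};

/* Old names stay available to callers built against the older API. */
#if SKETCH_API_VERSION < 0x040600
#define SKETCH_ERROR_REMUS_DEVOPS_DOES_NOT_MATCH \
    SKETCH_ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH
#define SKETCH_ERROR_REMUS_DEVICE_NOT_SUPPORTED \
    SKETCH_ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED
#endif

/* An "old" caller that still spells the error code with the REMUS name. */
static int old_caller_sees_unsupported(int rc)
{
    /* Return codes in this sketch are zero or negative. */
    assert(rc <= 0);
    return rc == SKETCH_ERROR_REMUS_DEVICE_NOT_SUPPORTED;
}
```

Because the numeric values do not change, binaries built against the old
names keep working either way; the aliases only matter for source
compatibility.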







--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v6 COLO 07/15] implement the cmdline for COLO

2015-06-24 Thread Yang Hongyang



On 06/16/2015 07:19 PM, Ian Campbell wrote:

On Mon, 2015-06-08 at 11:45 +0800, Yang Hongyang wrote:

From: Wen Congyang 

Add a new option -c to the command 'xl remus'. If you want
to use COLO HA instead of Remus HA, please use -c option.

Update man pages to reflect the addition of a new option to
'xl remus' command.

Also add a new option -c to the internal command 'xl migrate-receive'.


I asked about whether COLO was an extension or a peer to Remus in an
earlier patch. the answer may have an impact here too.


We implemented COLO based on Remus, so we assume it is an extension to Remus.




@@ -498,6 +501,11 @@ Disable network output buffering. Requires enabling unsafe 
mode.

  Disable disk replication. Requires enabling unsafe mode.

+=item B<-c>
+
+Enable COLO HA. It is conflict with B<-i> and B<-b>, and memory


"It conflicts with" or "This conflicts with".


+checkpoint compression must be disabled.
+
  =back

  =item B I
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 1145ae4..7df2466 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -811,6 +811,22 @@ int libxl_domain_remus_start(libxl_ctx *ctx, 
libxl_domain_remus_info *info,
  goto out;
  }

+/* The caller must set this defbool */
+if (libxl_defbool_is_default(info->colo)) {
+LOG(ERROR, "colo mode must be enabled/disabled");


As I wondered earlier -- this suggests it should not be a defbool, or
that the interfaces should split.


+rc = ERROR_FAIL;
+goto out;
+}
+
+if (libxl_defbool_val(info->colo)) {
+libxl_defbool_setdefault(&info->compression, false);


Assuming this isn't invalidated by the above comments, you should make
the existing:
  libxl_defbool_setdefault(&info->compression, true);
into
  libxl_defbool_setdefault(&info->compression, libxl_defbool_val(colo));

and then do an error check later.
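
A self-contained sketch of the "derive the default, then check" pattern
described above. The defbool type and helpers are simplified stand-ins for
libxl_defbool, and resolve_compression() is a hypothetical name. The sketch
also assumes the default should be the negation of the colo setting
(compression on unless COLO is requested), which is what the patch itself
does.

```c
#include <assert.h>
#include <stdbool.h>

/* Tri-state stand-in for libxl_defbool; the real layout is not assumed. */
typedef enum { DB_DEFAULT = 0, DB_FALSE = -1, DB_TRUE = 1 } defbool;

/* Set the value only if the caller has not chosen one already. */
static void defbool_setdefault(defbool *db, bool b)
{
    if (*db == DB_DEFAULT)
        *db = b ? DB_TRUE : DB_FALSE;
}

/* Read a defbool that must already have been resolved. */
static bool defbool_val(defbool db)
{
    assert(db != DB_DEFAULT);
    return db == DB_TRUE;
}

/*
 * Pick the compression default from the colo setting in one place, then
 * reject the one combination that cannot work.
 * Returns 0 on success, -1 if compression was explicitly enabled with COLO.
 */
static int resolve_compression(defbool colo, defbool *compression)
{
    defbool_setdefault(compression, !defbool_val(colo));

    if (defbool_val(colo) && defbool_val(*compression))
        return -1;

    return 0;
}
```

The benefit is that there is a single setdefault call for compression, so
the later unconditional setdefault(..., true) cannot silently disagree with
the COLO-specific one.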


+if (libxl_defbool_val(info->compression)) {
+LOG(ERROR, "cannot use memory checkpoint compression in COLO 
mode");
+rc = ERROR_FAIL;
+goto out;
+}
+}
+
  libxl_defbool_setdefault(&info->allow_unsafe, false);
  libxl_defbool_setdefault(&info->blackhole, false);
  libxl_defbool_setdefault(&info->compression, true);
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index adfadd1..4bbadd3 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -4273,6 +4273,9 @@ static void migrate_receive(int debug, int daemonize, int 
monitor,
  dom_info.send_fd = send_fd;
  dom_info.migration_domname_r = &migration_domname;
  dom_info.checkpointed_stream = remus;
+if (remus == LIBXL_CHECKPOINTED_STREAM_COLO)
+/* COLO uses stdout to send control message to master */
+dom_info.quiet = 1;


Please set a const char * to either "COLO" or "Remus" here and use it
everywhere you've currently got an open coded decision on that.
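
Something along these lines, as a sketch only: the enum, checkpoint_name()
and format_failover_msg() are hypothetical names invented for this example,
and the message text is copied from the hunk below.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the checkpointed-stream kind; not the libxl enum. */
enum stream_kind { STREAM_NONE, STREAM_REMUS, STREAM_COLO };

/* Decide the human-readable name once. */
static const char *checkpoint_name(enum stream_kind kind)
{
    return kind == STREAM_COLO ? "COLO" : "Remus";
}

/*
 * Every message then uses the same string instead of repeating the
 * "kind == COLO ? ... : ..." decision at each call site.
 */
static int format_failover_msg(char *buf, size_t len,
                               enum stream_kind kind, unsigned domid)
{
    const char *ha_name = checkpoint_name(kind);

    assert(buf && len);

    return snprintf(buf, len,
                    "migration target: %s Failover for domain %u\n",
                    ha_name, domid);
}
```

In migrate_receive() this would amount to computing the name right after
the stream type is known and passing it to each fprintf().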



  rc = create_domain(&dom_info);
  if (rc < 0) {
@@ -4287,7 +4290,8 @@ static void migrate_receive(int debug, int daemonize, int 
monitor,
  /* If we are here, it means that the sender (primary) has crashed.
   * TODO: Split-Brain Check.
   */
-fprintf(stderr, "migration target: Remus Failover for domain %u\n",
+fprintf(stderr, "migration target: %s Failover for domain %u\n",
+remus == LIBXL_CHECKPOINTED_STREAM_COLO ? "COLO" : "Remus",
  domid);

  /*
@@ -4304,15 +4308,21 @@ static void migrate_receive(int debug, int daemonize, 
int monitor,
  rc = libxl_domain_rename(ctx, domid, migration_domname,
   common_domname);
  if (rc)
-fprintf(stderr, "migration target (Remus): "
+fprintf(stderr, "migration target (%s): "
  "Failed to rename domain from %s to %s:%d\n",
+remus == LIBXL_CHECKPOINTED_STREAM_COLO ? "COLO" : 
"Remus",
  migration_domname, common_domname, rc);
  }

+if (remus == LIBXL_CHECKPOINTED_STREAM_COLO)
+/* The guest is running after failover in COLO mode */
+exit(rc ? -ERROR_FAIL: 0);
+
  rc = libxl_domain_unpause(ctx, domid);
  if (rc)
-fprintf(stderr, "migration target (Remus): "
+fprintf(stderr, "migration target (%s): "
  "Failed to unpause domain %s (id: %u):%d\n",
+remus == LIBXL_CHECKPOINTED_STREAM_COLO ? "COLO" : "Remus",
  common_domname, domid, rc);

  exit(rc ? -ERROR_FAIL: 0);
@@ -4458,7 +4468,

Re: [Xen-devel] [PATCH v6 COLO 10/15] COLO proxy: implement setup/teardown of COLO proxy module

2015-06-24 Thread Yang Hongyang



On 06/16/2015 07:26 PM, Ian Campbell wrote:

On Tue, 2015-06-16 at 12:24 +0100, Ian Campbell wrote:

On Mon, 2015-06-08 at 11:45 +0800, Yang Hongyang wrote:

Set up and tear down the COLO proxy module.
We use netlink to communicate with the proxy module.


What is a COLO proxy module and where would one get hold of such a
thing?

Is this a new kernel feature with a patch? If so then please link to its
posting to the appropriate upstream and indicate what you understand of
its progress upstream.

(I seem to remember discussing a COLO networking component at the
hackathon which seemed like it could be done using existing components,
is that this?)


IIRC the existing component I was thinking of was
http://www.netfilter.org/projects/libnetfilter_queue/ which allows
userspace to do pretty advanced filtering, queueing, gating, delaying
etc of packets.


We are not using a userspace solution because we are worried about
performance: a huge number of packets will pass through, and the
context-switch cost would add significant overhead. The colo-proxy module:
https://lkml.org/lkml/2015/6/18/32



Ian.




--
Thanks,
Yang.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 COLOPre 01/26] tools/libxl: rename libxl__domain_suspend to libxl__domain_save

2015-06-24 Thread Yang Hongyang
Rename libxl__domain_suspend() to libxl__domain_save() since it
actually does the domain save work.

This results in some strangeness in that some functions called *save*
are now passed a struct called *suspend*. This is temporary and is all
fixed up later by the refactoring of the suspend_state.

Signed-off-by: Yang Hongyang 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
---
 tools/libxl/libxl.c  |  4 ++--
 tools/libxl/libxl_dom.c  | 14 +++---
 tools/libxl/libxl_internal.h |  4 ++--
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 9117b01..5a70062 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -911,7 +911,7 @@ static void libxl__remus_setup_done(libxl__egc *egc,
 STATE_AO_GC(dss->ao);
 
 if (!rc) {
-libxl__domain_suspend(egc, dss);
+libxl__domain_save(egc, dss);
 return;
 }
 
@@ -978,7 +978,7 @@ int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd, int flags,
 dss->live = flags & LIBXL_SUSPEND_LIVE;
 dss->debug = flags & LIBXL_SUSPEND_DEBUG;
 
-libxl__domain_suspend(egc, dss);
+libxl__domain_save(egc, dss);
 return AO_INPROGRESS;
 
  out_err:
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 43915a2..9d9e409 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1109,8 +1109,8 @@ int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
 
 static void stream_done(libxl__egc *egc,
 libxl__domain_suspend_state *dss, int rc);
-static void domain_suspend_done(libxl__egc *egc,
-libxl__domain_suspend_state *dss, int rc);
+static void domain_save_done(libxl__egc *egc,
+ libxl__domain_suspend_state *dss, int rc);
 static void domain_suspend_callback_common_done(libxl__egc *egc,
 libxl__domain_suspend_state *dss, int ok);
 
@@ -1960,7 +1960,7 @@ static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
 
 /*- main code for suspending, in order of execution -*/
 
-void libxl__domain_suspend(libxl__egc *egc, libxl__domain_suspend_state *dss)
+void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
 {
 STATE_AO_GC(dss->ao);
 int port;
@@ -2045,13 +2045,13 @@ void libxl__domain_suspend(libxl__egc *egc, libxl__domain_suspend_state *dss)
 return;
 
  out:
-domain_suspend_done(egc, dss, rc);
+domain_save_done(egc, dss, rc);
 }
 
 static void stream_done(libxl__egc *egc,
 libxl__domain_suspend_state *dss, int rc)
 {
-domain_suspend_done(egc, dss, rc);
+domain_save_done(egc, dss, rc);
 }
 
 static void save_device_model_datacopier_done(libxl__egc *egc,
@@ -2150,8 +2150,8 @@ static void remus_teardown_done(libxl__egc *egc,
libxl__remus_devices_state *rds,
int rc);
 
-static void domain_suspend_done(libxl__egc *egc,
-libxl__domain_suspend_state *dss, int rc)
+static void domain_save_done(libxl__egc *egc,
+ libxl__domain_suspend_state *dss, int rc)
 {
 STATE_AO_GC(dss->ao);
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index e0f6e09..19ebaab 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3264,8 +3264,8 @@ struct libxl__domain_create_state {
 /*- Domain suspend (save) functions -*/
 
 /* calls dss->callback when done */
-_hidden void libxl__domain_suspend(libxl__egc *egc,
-   libxl__domain_suspend_state *dss);
+_hidden void libxl__domain_save(libxl__egc *egc,
+libxl__domain_suspend_state *dss);
 
 
 /* calls libxl__xc_domain_suspend_done when done */
-- 
1.9.1




[Xen-devel] [PATCH v3 COLOPre 02/26] tools/libxl: move domain suspend code into libxl_dom_suspend.c

2015-06-24 Thread Yang Hongyang
Move domain suspend code into a separate file libxl_dom_suspend.c.
Add an API libxl__domain_suspend() which wraps the static
function domain_suspend_callback_common() for internal use.
Export the existing API libxl__domain_suspend_callback() used by
libxc to suspend the guest during migration.

Note that the newly added file libxl_dom_suspend.c is used for
suspend/resume code.

Signed-off-by: Yang Hongyang 
CC: Ian Jackson 
CC: Wei Liu 
Acked-by: Ian Campbell 
---
 tools/libxl/Makefile|   2 +-
 tools/libxl/libxl_dom.c | 346 +---
 tools/libxl/libxl_dom_suspend.c | 381 
 tools/libxl/libxl_internal.h|   6 +
 4 files changed, 389 insertions(+), 346 deletions(-)
 create mode 100644 tools/libxl/libxl_dom_suspend.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 63e32f7..e98e26f 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -96,7 +96,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o 
\
libxl_stream_read.o libxl_stream_write.o \
libxl_save_callout.o _libxl_save_msgs_callout.o \
-   libxl_convert_callout.o \
+   libxl_convert_callout.o libxl_dom_suspend.o \
libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += libxl_genid.o
 LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 9d9e409..3b02562 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -,8 +,6 @@ static void stream_done(libxl__egc *egc,
 libxl__domain_suspend_state *dss, int rc);
 static void domain_save_done(libxl__egc *egc,
  libxl__domain_suspend_state *dss, int rc);
-static void domain_suspend_callback_common_done(libxl__egc *egc,
-libxl__domain_suspend_state *dss, int ok);
 
 /*- complicated callback, called by xc_domain_save -*/
 
@@ -1328,37 +1326,6 @@ static void switch_logdirty_done(libxl__egc *egc,
 libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, broke);
 }
 
-/*- callbacks, called by xc_domain_save -*/
-
-int libxl__domain_suspend_device_model(libxl__gc *gc,
-   libxl__domain_suspend_state *dss)
-{
-int ret = 0;
-uint32_t const domid = dss->domid;
-const char *const filename = dss->dm_savefile;
-
-switch (libxl__device_model_version_running(gc, domid)) {
-case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
-LOG(DEBUG, "Saving device model state to %s", filename);
-libxl__qemu_traditional_cmd(gc, domid, "save");
-libxl__wait_for_device_model_deprecated(gc, domid, "paused", NULL, NULL, NULL);
-break;
-}
-case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-if (libxl__qmp_stop(gc, domid))
-return ERROR_FAIL;
-/* Save DM state into filename */
-ret = libxl__qmp_save(gc, domid, filename);
-if (ret)
-unlink(filename);
-break;
-default:
-return ERROR_INVAL;
-}
-
-return ret;
-}
-
 int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
 {
 
@@ -1379,301 +1346,6 @@ int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
 return 0;
 }
 
-static void domain_suspend_common_wait_guest(libxl__egc *egc,
- libxl__domain_suspend_state *dss);
-static void domain_suspend_common_guest_suspended(libxl__egc *egc,
- libxl__domain_suspend_state *dss);
-
-static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
-  libxl__xswait_state *xswa, int rc, const char *state);
-static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc,
-libxl__ev_evtchn *evev);
-static void suspend_common_wait_guest_watch(libxl__egc *egc,
-  libxl__ev_xswatch *xsw, const char *watch_path, const char *event_path);
-static void suspend_common_wait_guest_check(libxl__egc *egc,
-libxl__domain_suspend_state *dss);
-static void suspend_common_wait_guest_timeout(libxl__egc *egc,
-  libxl__ev_time *ev, const struct timeval *requested_abs);
-
-static void domain_suspend_common_failed(libxl__egc *egc,
- libxl__domain_suspend_state *dss);
-static void domain_suspend_common_done(libxl__egc *egc,
-   libxl__domain_suspend_state *dss,
-   bool ok);
-
-static bool domain_suspend_pvcontrol_acked(const char *state) {
-/* any value other than "suspend", including ENOENT (i.e. !state), is OK */
-if (!state) return 1;
- 

[Xen-devel] [PATCH v3 COLOPre 00/26] Prerequisite patches for COLO

2015-06-24 Thread Yang Hongyang
This patchset is a prerequisite for the COLO feature. For what COLO is, refer
to http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping

This patchset is based on Andrew Cooper's Libxl migration v2:
  
http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/libxl-migv2-v1

I have only compile-tested it, because Remus on Libxl migration v2 still
needs to be fixed.

You can also get the patchset from:
  https://github.com/macrosheep/xen/tree/colo-v7

v2->v3:
 - Merge '[PATCH v2 0/6] Misc cleanups for libxl' into this patchset
   for easy review
 - Addressed review comments
 - Add back channel to libxc
 - Introduce should_checkpoint callback
 - Introduce DIRTY_BITMAP record on libxc side
 - Introduce COLO_CONTEXT record on libxl side
 - Ported to Libxl migration v2

v1->v2:
 - Rebased to [PATCH v2 0/6] Misc cleanups for libxl
 - Add a bugfix for the error handling of process_record


Wen Congyang (5):
  tools/libxc: support to resume uncooperative HVM guests
  tools/libxl: Add back channel to allow migration target send data back
  tools/libxl: refactor write stream to support back channel
  tools/libxl: refactor read stream to support back channel
  docs/libxl: Introduce COLO_CONTEXT to support migration v2 colo
streams

Yang Hongyang (21):
  tools/libxl: rename libxl__domain_suspend to libxl__domain_save
  tools/libxl: move domain suspend code into libxl_dom_suspend.c
  tools/libxl: move domain resume code into libxl_dom_suspend.c
  tools/libxl: move remus code into libxl_remus.c
  tools/libxl: move save/restore code into libxl_dom_save.c
  libxl/save: Refactor libxl__domain_suspend_state
  libxc/restore: fix error handle of process_record
  tools/libxl: introduce enum type libxl_checkpointed_stream
  migration/save: pass checkpointed_stream from libxl to libxc
  tools/libxl: introduce a new API libxl__domain_restore() to load qemu
state
  tools/libxl: Update libxl_domain_unpause() to support qemu-xen
  tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty()
  tools/libxl: export logdirty_init
  tools/libx{l,c}: add back channel to libxc
  tools/libx{l,c}: introduce should_checkpoint callback
  tools/libx{l,c}: add postcopy/suspend callback to restore side
  libxc/migration: Specification update for DIRTY_BITMAP records
  libxc/migration: export read_record for common use
  tools/libxl: rename remus device to checkpoint device
  tools/libxl: adjust the indentation
  tools/libxl: don't touch remus in checkpoint_device

 docs/specs/libxc-migration-stream.pandoc |   23 +-
 docs/specs/libxl-migration-stream.pandoc |   21 +-
 tools/libxc/include/xenguest.h   |   41 +-
 tools/libxc/xc_domain_restore.c  |4 +-
 tools/libxc/xc_domain_save.c |6 +-
 tools/libxc/xc_nomigrate.c   |3 +-
 tools/libxc/xc_resume.c  |   22 +-
 tools/libxc/xc_sr_common.c   |   50 ++
 tools/libxc/xc_sr_common.h   |   16 +-
 tools/libxc/xc_sr_restore.c  |   93 +--
 tools/libxc/xc_sr_save.c |5 +-
 tools/libxc/xc_sr_stream_format.h|1 +
 tools/libxl/Makefile |4 +-
 tools/libxl/libxl.c  |  119 +--
 tools/libxl/libxl.h  |   29 +-
 tools/libxl/libxl_checkpoint_device.c|  282 +++
 tools/libxl/libxl_create.c   |   41 +-
 tools/libxl/libxl_dom.c  | 1171 --
 tools/libxl/libxl_dom_save.c |  713 ++
 tools/libxl/libxl_dom_suspend.c  |  446 
 tools/libxl/libxl_internal.h |  250 ---
 tools/libxl/libxl_netbuffer.c|  117 +--
 tools/libxl/libxl_nonetbuffer.c  |   10 +-
 tools/libxl/libxl_qmp.c  |   10 +
 tools/libxl/libxl_remus.c|  386 ++
 tools/libxl/libxl_remus_device.c |  327 -
 tools/libxl/libxl_remus_disk_drbd.c  |   56 +-
 tools/libxl/libxl_save_callout.c |   27 +-
 tools/libxl/libxl_save_helper.c  |9 +-
 tools/libxl/libxl_save_msgs_gen.pl   |   11 +-
 tools/libxl/libxl_sr_stream_format.h |   11 +
 tools/libxl/libxl_stream_read.c  |   33 +-
 tools/libxl/libxl_stream_write.c |   46 +-
 tools/libxl/libxl_types.idl  |   10 +-
 tools/libxl/xl_cmdimpl.c |   21 +-
 tools/python/xen/migration/libxl.py  |9 +
 36 files changed, 2465 insertions(+), 1958 deletions(-)
 create mode 100644 tools/libxl/libxl_checkpoint_device.c
 create mode 100644 tools/libxl/libxl_dom_save.c
 create mode 100644 tools/libxl/libxl_dom_suspend.c
 create mode 100644 tools/libxl/libxl_remus.c
 delete mode 100644 tools/libxl/libxl_remus_device.c

-- 
1.9.1




[Xen-devel] [PATCH v3 COLOPre 03/26] tools/libxl: move domain resume code into libxl_dom_suspend.c

2015-06-24 Thread Yang Hongyang
Move domain resume code into libxl_dom_suspend.c.
Pure code motion except for making
libxl__domain_resume_device_model() static.

Signed-off-by: Yang Hongyang 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl.c | 33 -
 tools/libxl/libxl_dom.c | 20 ---
 tools/libxl/libxl_dom_suspend.c | 55 +
 tools/libxl/libxl_internal.h|  1 -
 4 files changed, 55 insertions(+), 54 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 5a70062..1f52bf0 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -510,39 +510,6 @@ int libxl_domain_rename(libxl_ctx *ctx, uint32_t domid,
 return rc;
 }
 
-int libxl__domain_resume(libxl__gc *gc, uint32_t domid, int suspend_cancel)
-{
-int rc = 0;
-
-if (xc_domain_resume(CTX->xch, domid, suspend_cancel)) {
-LOGE(ERROR, "xc_domain_resume failed for domain %u", domid);
-rc = ERROR_FAIL;
-goto out;
-}
-
-libxl_domain_type type = libxl__domain_type(gc, domid);
-if (type == LIBXL_DOMAIN_TYPE_INVALID) {
-rc = ERROR_FAIL;
-goto out;
-}
-
-if (type == LIBXL_DOMAIN_TYPE_HVM) {
-rc = libxl__domain_resume_device_model(gc, domid);
-if (rc) {
-LOG(ERROR, "failed to resume device model for domain %u:%d",
-domid, rc);
-goto out;
-}
-}
-
-if (!xs_resume_domain(CTX->xsh, domid)) {
-LOGE(ERROR, "xs_resume_domain failed for domain %u", domid);
-rc = ERROR_FAIL;
-}
-out:
-return rc;
-}
-
 int libxl_domain_resume(libxl_ctx *ctx, uint32_t domid, int suspend_cancel,
 const libxl_asyncop_how *ao_how)
 {
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 3b02562..f457f72 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1326,26 +1326,6 @@ static void switch_logdirty_done(libxl__egc *egc,
 libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, broke);
 }
 
-int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
-{
-
-switch (libxl__device_model_version_running(gc, domid)) {
-case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
-libxl__qemu_traditional_cmd(gc, domid, "continue");
-libxl__wait_for_device_model_deprecated(gc, domid, "running", NULL, NULL, NULL);
-break;
-}
-case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-if (libxl__qmp_resume(gc, domid))
-return ERROR_FAIL;
-break;
-default:
-return ERROR_INVAL;
-}
-
-return 0;
-}
-
 static inline char *physmap_path(libxl__gc *gc, uint32_t dm_domid,
  uint32_t domid,
  char *phys_offset, char *node)
diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
index 4edf936..5f1f4fd 100644
--- a/tools/libxl/libxl_dom_suspend.c
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -372,6 +372,61 @@ static void domain_suspend_callback_common_done(libxl__egc *egc,
 {
 libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
 }
+
+/*=== Domain resume */
+
+static int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
+{
+
+switch (libxl__device_model_version_running(gc, domid)) {
+case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
+libxl__qemu_traditional_cmd(gc, domid, "continue");
+libxl__wait_for_device_model_deprecated(gc, domid, "running", NULL, NULL, NULL);
+break;
+}
+case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+if (libxl__qmp_resume(gc, domid))
+return ERROR_FAIL;
+break;
+default:
+return ERROR_INVAL;
+}
+
+return 0;
+}
+
+int libxl__domain_resume(libxl__gc *gc, uint32_t domid, int suspend_cancel)
+{
+int rc = 0;
+
+if (xc_domain_resume(CTX->xch, domid, suspend_cancel)) {
+LOGE(ERROR, "xc_domain_resume failed for domain %u", domid);
+rc = ERROR_FAIL;
+goto out;
+}
+
+libxl_domain_type type = libxl__domain_type(gc, domid);
+if (type == LIBXL_DOMAIN_TYPE_INVALID) {
+rc = ERROR_FAIL;
+goto out;
+}
+
+if (type == LIBXL_DOMAIN_TYPE_HVM) {
+rc = libxl__domain_resume_device_model(gc, domid);
+if (rc) {
+LOG(ERROR, "failed to resume device model for domain %u:%d",
+domid, rc);
+goto out;
+}
+}
+
+if (!xs_resume_domain(CTX->xsh, domid)) {
+LOGE(ERROR, "xs_resume_domain failed for domain %u", domid);
+rc = ERROR_FAIL;
+}
+out:
+return rc;
+}
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 5

[Xen-devel] [PATCH v3 COLOPre 04/26] tools/libxl: move remus code into libxl_remus.c

2015-06-24 Thread Yang Hongyang
Do the following things:
- Rename the 2 checkpoint callbacks to:
  libxl__remus_domain_{save/restore}_checkpoint_callback
- Move the Remus callbacks into libxl_remus.c and export the following
  callbacks for internal use:
  * libxl__remus_domain_suspend_callback
  * libxl__remus_domain_resume_callback
  * libxl__remus_domain_save_checkpoint_callback
  * libxl__remus_domain_restore_checkpoint_callback
- Refactor Remus setup/teardown and introduce 2 APIs to
  set up/tear down Remus:
  * libxl__remus_setup
  * libxl__remus_teardown
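
The setup path being moved chains callbacks: a successful device setup starts the save, while a failed setup triggers a device teardown whose completion is then reported to the caller. A simplified, synchronous sketch of that control flow follows. All names are hypothetical stand-ins; the real code is asynchronous and keeps this state in dss and dss->rds.

```c
#include <assert.h>
#include <stddef.h>

struct remus_sketch;
typedef void remus_done_fn(struct remus_sketch *rs, int rc);

/* Hypothetical state; the real code keeps this in dss and dss->rds. */
struct remus_sketch {
    int setup_rc;        /* result the fake device setup reports */
    int teardown_rc;     /* result the fake device teardown reports */
    int save_started;    /* set when the save phase would begin */
    int teardown_called; /* set when devices were torn down */
    int finished;        /* set when the caller's completion path ran */
    int final_rc;        /* rc handed to the completion path */
    remus_done_fn *callback;
};

/* Runs once teardown after a failed setup has completed. */
static void setup_failed(struct remus_sketch *rs, int rc)
{
    rs->finished = 1;
    rs->final_rc = rc;
}

/* Fake teardown: records the call and invokes the current callback. */
static void devices_teardown(struct remus_sketch *rs)
{
    rs->teardown_called = 1;
    rs->callback(rs, rs->teardown_rc);
}

/* Runs once device setup has completed, successfully or not. */
static void setup_done(struct remus_sketch *rs, int rc)
{
    if (!rc) {
        rs->save_started = 1;   /* the real code starts the domain save here */
        return;
    }
    rs->callback = setup_failed;
    devices_teardown(rs);
}

/* Entry point: install the completion callback, then run the fake setup. */
static void remus_setup_sketch(struct remus_sketch *rs)
{
    assert(rs != NULL);
    rs->callback = setup_done;
    rs->callback(rs, rs->setup_rc);
}
```

As in the code removed from libxl.c, the rc passed to the final callback in this sketch is the teardown result, not the original setup error.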

Signed-off-by: Yang Hongyang 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/Makefile |   2 +-
 tools/libxl/libxl.c  |  55 +---
 tools/libxl/libxl_create.c   |  24 +---
 tools/libxl/libxl_dom.c  | 204 +--
 tools/libxl/libxl_internal.h |  12 ++
 tools/libxl/libxl_remus.c| 325 +++
 6 files changed, 342 insertions(+), 280 deletions(-)
 create mode 100644 tools/libxl/libxl_remus.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index e98e26f..0c04ae7 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -56,7 +56,7 @@ else
 LIBXL_OBJS-y += libxl_nonetbuffer.o
 endif
 
-LIBXL_OBJS-y += libxl_remus_device.o libxl_remus_disk_drbd.o
+LIBXL_OBJS-y += libxl_remus.o libxl_remus_device.o libxl_remus_disk_drbd.o
 
 LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
 LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 1f52bf0..b939f2f 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -792,10 +792,6 @@ out:
 return ptr;
 }
 
-static void libxl__remus_setup_done(libxl__egc *egc,
-libxl__remus_devices_state *rds, int rc);
-static void libxl__remus_setup_failed(libxl__egc *egc,
-  libxl__remus_devices_state *rds, int rc);
 static void remus_failover_cb(libxl__egc *egc,
   libxl__domain_suspend_state *dss, int rc);
 
@@ -844,63 +840,14 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
 
 assert(info);
 
-/* Convenience aliases */
-libxl__remus_devices_state *const rds = &dss->rds;
-
-if (libxl_defbool_val(info->netbuf)) {
-if (!libxl__netbuffer_enabled(gc)) {
-LOG(ERROR, "Remus: No support for network buffering");
-rc = ERROR_FAIL;
-goto out;
-}
-rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
-}
-
-if (libxl_defbool_val(info->diskbuf))
-rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
-
-rds->ao = ao;
-rds->domid = domid;
-rds->callback = libxl__remus_setup_done;
-
 /* Point of no return */
-libxl__remus_devices_setup(egc, rds);
+libxl__remus_setup(egc, dss);
 return AO_INPROGRESS;
 
  out:
 return AO_ABORT(rc);
 }
 
-static void libxl__remus_setup_done(libxl__egc *egc,
-libxl__remus_devices_state *rds, int rc)
-{
-libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-STATE_AO_GC(dss->ao);
-
-if (!rc) {
-libxl__domain_save(egc, dss);
-return;
-}
-
-LOG(ERROR, "Remus: failed to setup device for guest with domid %u, rc %d",
-dss->domid, rc);
-rds->callback = libxl__remus_setup_failed;
-libxl__remus_devices_teardown(egc, rds);
-}
-
-static void libxl__remus_setup_failed(libxl__egc *egc,
-  libxl__remus_devices_state *rds, int rc)
-{
-libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-STATE_AO_GC(dss->ao);
-
-if (rc)
-LOG(ERROR, "Remus: failed to teardown device after setup failed"
-" for guest with domid %u, rc %d", dss->domid, rc);
-
-dss->callback(egc, dss, rc);
-}
-
 static void remus_failover_cb(libxl__egc *egc,
   libxl__domain_suspend_state *dss, int rc)
 {
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index ac918bd..dfea992 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -747,27 +747,6 @@ static int store_libxl_entry(libxl__gc *gc, uint32_t domid,
 libxl_device_model_version_to_string(b_info->device_model_version));
 }
 
-/*- remus asynchronous checkpoint callback -*/
-
-static void remus_checkpoint_stream_done(
-libxl__egc *egc, libxl__domain_create_state *dcs, int rc);
-
-static void libxl__remus_domain_checkpoint_callback(void *data)
-{
-libxl__save_helper_state *shs = data;
-libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
-libxl__egc *egc = dcs->shs.egc;
-STATE_AO_GC(dcs->ao);
-
-libxl__stream_read_start_checkpoint(egc, &dcs->s

[Xen-devel] [PATCH v3 COLOPre 09/26] tools/libxl: introduce enum type libxl_checkpointed_stream

2015-06-24 Thread Yang Hongyang
Introduce enum type libxl_checkpointed_stream in IDL.
Rename the last argument of migrate_receive from "remus" to
"checkpointed" since the semantics of this parameter have
changed.
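
A minimal sketch of how the enum and the renamed parameter fit together. The typedef below is a local stand-in mirroring the two IDL entries in this patch (0 = NONE, 1 = REMUS), and stream_from_opt() is a hypothetical helper that does not exist in xl.

```c
#include <assert.h>

/* Local stand-in mirroring the IDL entries in this patch. */
typedef enum {
    CHECKPOINTED_STREAM_NONE = 0,
    CHECKPOINTED_STREAM_REMUS = 1,
} checkpointed_stream;

/*
 * Hypothetical helper: map an option character to a stream type,
 * leaving the current value untouched for unrelated options.
 */
static checkpointed_stream stream_from_opt(int opt, checkpointed_stream current)
{
    if (opt == 'r')
        return CHECKPOINTED_STREAM_REMUS;
    return current;
}

/* Because NONE is 0, a plain truth test still means "checkpointed stream". */
static int is_checkpointed(checkpointed_stream s)
{
    return s != CHECKPOINTED_STREAM_NONE;
}
```

Keeping NONE at 0 is what lets the existing "if (checkpointed)" test in migrate_receive() keep working unchanged once further stream types are added.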

Signed-off-by: Yang Hongyang 
---
 tools/libxl/libxl_types.idl |  5 +
 tools/libxl/xl_cmdimpl.c| 13 +++--
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 7418d92..7c82f33 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -198,6 +198,11 @@ libxl_viridian_enlightenment = Enumeration("viridian_enlightenment", [
 (3, "reference_tsc"),
 ])
 
+libxl_checkpointed_stream = Enumeration("checkpointed_stream", [
+(0, "NONE"),
+(1, "REMUS"),
+])
+
 #
 # Complex libxl types
 #
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 35bc26d..c965ef5 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -4249,7 +4249,7 @@ static void migrate_domain(uint32_t domid, const char *rune, int debug,
 }
 
 static void migrate_receive(int debug, int daemonize, int monitor,
-int send_fd, int recv_fd, int remus)
+int send_fd, int recv_fd, int checkpointed)
 {
 uint32_t domid;
 int rc, rc2;
@@ -4274,7 +4274,7 @@ static void migrate_receive(int debug, int daemonize, int monitor,
 dom_info.paused = 1;
 dom_info.migrate_fd = recv_fd;
 dom_info.migration_domname_r = &migration_domname;
-dom_info.checkpointed_stream = remus;
+dom_info.checkpointed_stream = checkpointed;
 
 rc = create_domain(&dom_info);
 if (rc < 0) {
@@ -4285,7 +4285,7 @@ static void migrate_receive(int debug, int daemonize, int monitor,
 
 domid = rc;
 
-if (remus) {
+if (checkpointed) {
 /* If we are here, it means that the sender (primary) has crashed.
  * TODO: Split-Brain Check.
  */
@@ -4456,7 +4456,8 @@ int main_restore(int argc, char **argv)
 
 int main_migrate_receive(int argc, char **argv)
 {
-int debug = 0, daemonize = 1, monitor = 1, remus = 0;
+int debug = 0, daemonize = 1, monitor = 1;
+int checkpointed = LIBXL_CHECKPOINTED_STREAM_NONE;
 int opt;
 
 SWITCH_FOREACH_OPT(opt, "Fedr", NULL, "migrate-receive", 0) {
@@ -4471,7 +4472,7 @@ int main_migrate_receive(int argc, char **argv)
 debug = 1;
 break;
 case 'r':
-remus = 1;
+checkpointed = LIBXL_CHECKPOINTED_STREAM_REMUS;
 break;
 }
 
@@ -4481,7 +4482,7 @@ int main_migrate_receive(int argc, char **argv)
 }
 migrate_receive(debug, daemonize, monitor,
 STDOUT_FILENO, STDIN_FILENO,
-remus);
+checkpointed);
 
 return 0;
 }
-- 
1.9.1




[Xen-devel] [PATCH v3 COLOPre 05/26] tools/libxl: move save/restore code into libxl_dom_save.c

2015-06-24 Thread Yang Hongyang
This is purely code motion.

Signed-off-by: Yang Hongyang 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/Makefile |   2 +-
 tools/libxl/libxl_dom.c  | 607 -
 tools/libxl/libxl_dom_save.c | 634 +++
 3 files changed, 635 insertions(+), 608 deletions(-)
 create mode 100644 tools/libxl/libxl_dom_save.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 0c04ae7..d61c191 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -96,7 +96,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o 
\
libxl_stream_read.o libxl_stream_write.o \
libxl_save_callout.o _libxl_save_msgs_callout.o \
-   libxl_convert_callout.o libxl_dom_suspend.o \
+   libxl_convert_callout.o libxl_dom_suspend.o 
libxl_dom_save.o \
libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += libxl_genid.o
 LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 6f4cda8..ccab2f3 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1024,613 +1024,6 @@ int libxl__qemu_traditional_cmd(libxl__gc *gc, uint32_t domid,
 return libxl__xs_write(gc, XBT_NULL, path, "%s", cmd);
 }
 
-struct libxl__physmap_info {
-uint64_t phys_offset;
-uint64_t start_addr;
-uint64_t size;
-uint32_t namelen;
-char name[];
-};
-
-#define TOOLSTACK_SAVE_VERSION 1
-
-static inline char *restore_helper(libxl__gc *gc, uint32_t dm_domid,
-   uint32_t domid,
-   uint64_t phys_offset, char *node)
-{
-return libxl__device_model_xs_path(gc, dm_domid, domid,
-   "/physmap/%"PRIx64"/%s",
-   phys_offset, node);
-}
-
-int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
- uint32_t size, void *user)
-{
-libxl__save_helper_state *shs = user;
-libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
-STATE_AO_GC(dcs->ao);
-int i, ret;
-const uint8_t *ptr = buf;
-uint32_t count = 0, version = 0;
-struct libxl__physmap_info* pi;
-char *xs_path;
-uint32_t dm_domid;
-
-LOG(DEBUG,"domain=%"PRIu32" toolstack data size=%"PRIu32, domid, size);
-
-if (size < sizeof(version) + sizeof(count)) {
-LOG(ERROR, "wrong size");
-return -1;
-}
-
-memcpy(&version, ptr, sizeof(version));
-ptr += sizeof(version);
-
-if (version != TOOLSTACK_SAVE_VERSION) {
-LOG(ERROR, "wrong version");
-return -1;
-}
-
-memcpy(&count, ptr, sizeof(count));
-ptr += sizeof(count);
-
-if (size < sizeof(version) + sizeof(count) +
-count * (sizeof(struct libxl__physmap_info))) {
-LOG(ERROR, "wrong size");
-return -1;
-}
-
-dm_domid = libxl_get_stubdom_id(CTX, domid);
-for (i = 0; i < count; i++) {
-pi = (struct libxl__physmap_info*) ptr;
-ptr += sizeof(struct libxl__physmap_info) + pi->namelen;
-
-xs_path = restore_helper(gc, dm_domid, domid,
- pi->phys_offset, "start_addr");
-ret = libxl__xs_write(gc, 0, xs_path, "%"PRIx64, pi->start_addr);
-if (ret)
-return -1;
-xs_path = restore_helper(gc, dm_domid, domid, pi->phys_offset, "size");
-ret = libxl__xs_write(gc, 0, xs_path, "%"PRIx64, pi->size);
-if (ret)
-return -1;
-if (pi->namelen > 0) {
-xs_path = restore_helper(gc, dm_domid, domid,
- pi->phys_offset, "name");
-ret = libxl__xs_write(gc, 0, xs_path, "%s", pi->name);
-if (ret)
-return -1;
-}
-}
-return 0;
-}
-
-/* Domain suspend (save) */
-
-static void stream_done(libxl__egc *egc,
-libxl__domain_suspend_state *dss, int rc);
-static void domain_save_done(libxl__egc *egc,
- libxl__domain_suspend_state *dss, int rc);
-
-/*- complicated callback, called by xc_domain_save -*/
-
-/*
- * We implement the other end of protocol for controlling qemu-dm's
- * logdirty.  There is no documentation for this protocol, but our
- * counterparty's implementation is in
- * qemu-xen-traditional.git:xenstore.c in the function
- * xenstore_process_logdirty_event
- */
-
-static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_t

[Xen-devel] [PATCH v3 COLOPre 06/26] libxl/save: Refactor libxl__domain_suspend_state

2015-06-24 Thread Yang Hongyang
Currently struct libxl__domain_suspend_state contains 2 types of state:
save state and suspend state. This patch separates the two.
The motivation is that COLO will need to suspend/resume
continuously, so we need a more common suspend state.

After this change, dss stands for libxl__domain_save_state,
dsps stands for libxl__domain_suspend_state.
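
One way to picture the split is shown below as a sketch under stated assumptions. The struct and field names are hypothetical, and embedding the suspend state inside the save state is an assumption made for illustration; it is not taken from the visible part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* dsps: hypothetical stand-in for the suspend-only state. */
struct suspend_state_sketch {
    uint32_t domid;
    int guest_responded;
};

/* dss: hypothetical stand-in for the save state, embedding a dsps. */
struct save_state_sketch {
    uint32_t domid;
    int fd;
    int live;
    struct suspend_state_sketch dsps;
};

/* Initialise the save state and the suspend state it carries. */
static void save_state_init(struct save_state_sketch *dss,
                            uint32_t domid, int fd, int live)
{
    assert(dss != NULL);
    dss->domid = domid;
    dss->fd = fd;
    dss->live = live;
    dss->dsps.domid = domid;
    dss->dsps.guest_responded = 0;
}

/* Recover the enclosing save state from a pointer to its embedded dsps. */
static struct save_state_sketch *
save_state_of(struct suspend_state_sketch *dsps)
{
    return (struct save_state_sketch *)
        ((char *)dsps - offsetof(struct save_state_sketch, dsps));
}
```

The point of the separation is that code which only suspends and resumes, as COLO needs to do repeatedly, can work on a dsps without carrying the fd, flags, and stream state that belong to a save.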

Signed-off-by: Yang Hongyang 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
---
 tools/libxl/libxl.c  |  10 +--
 tools/libxl/libxl_dom_save.c |  74 +
 tools/libxl/libxl_dom_suspend.c  | 167 ---
 tools/libxl/libxl_internal.h |  55 +++--
 tools/libxl/libxl_netbuffer.c|   2 +-
 tools/libxl/libxl_remus.c|  39 -
 tools/libxl/libxl_save_callout.c |   2 +-
 tools/libxl/libxl_stream_write.c |  26 +++---
 8 files changed, 199 insertions(+), 176 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index b939f2f..8b63eff 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -793,7 +793,7 @@ out:
 }
 
 static void remus_failover_cb(libxl__egc *egc,
-  libxl__domain_suspend_state *dss, int rc);
+  libxl__domain_save_state *dss, int rc);
 
 /* TODO: Explicit Checkpoint acknowledgements via recv_fd. */
 int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
@@ -801,7 +801,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
  const libxl_asyncop_how *ao_how)
 {
 AO_CREATE(ctx, domid, ao_how);
-libxl__domain_suspend_state *dss;
+libxl__domain_save_state *dss;
 int rc;
 
 libxl_domain_type type = libxl__domain_type(gc, domid);
@@ -849,7 +849,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
 }
 
 static void remus_failover_cb(libxl__egc *egc,
-  libxl__domain_suspend_state *dss, int rc)
+  libxl__domain_save_state *dss, int rc)
 {
 STATE_AO_GC(dss->ao);
 /*
@@ -861,7 +861,7 @@ static void remus_failover_cb(libxl__egc *egc,
 }
 
 static void domain_suspend_cb(libxl__egc *egc,
-  libxl__domain_suspend_state *dss, int rc)
+  libxl__domain_save_state *dss, int rc)
 {
 STATE_AO_GC(dss->ao);
 libxl__ao_complete(egc,ao,rc);
@@ -880,7 +880,7 @@ int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd, int flags,
 goto out_err;
 }
 
-libxl__domain_suspend_state *dss;
+libxl__domain_save_state *dss;
 GCNEW(dss);
 
 dss->ao = ao;
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index 9ee74ba..93061c7 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -30,9 +30,10 @@ struct libxl__physmap_info {
 /*= Domain save */
 
 static void stream_done(libxl__egc *egc,
-libxl__domain_suspend_state *dss, int rc);
+libxl__domain_save_state *dss, int rc);
 static void domain_save_done(libxl__egc *egc,
- libxl__domain_suspend_state *dss, int rc);
+ libxl__domain_save_state *dss, int rc);
+
 
 /*- complicated callback, called by xc_domain_save -*/
 
@@ -49,7 +50,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, 
libxl__ev_time *ev,
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
 const char *watch_path, const char *event_path);
 static void switch_logdirty_done(libxl__egc *egc,
- libxl__domain_suspend_state *dss, int ok);
+ libxl__domain_save_state *dss, int ok);
 
 static void logdirty_init(libxl__logdirty_switch *lds)
 {
@@ -63,7 +64,7 @@ static void 
domain_suspend_switch_qemu_xen_traditional_logdirty
 libxl__save_helper_state *shs)
 {
 libxl__egc *egc = shs->egc;
-libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs);
+libxl__domain_save_state *dss = CONTAINER_OF(shs, *dss, shs);
 libxl__logdirty_switch *lds = &dss->logdirty;
 STATE_AO_GC(dss->ao);
 int rc;
@@ -135,7 +136,7 @@ static void domain_suspend_switch_qemu_xen_logdirty
 libxl__save_helper_state *shs)
 {
 libxl__egc *egc = shs->egc;
-libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs);
+libxl__domain_save_state *dss = CONTAINER_OF(shs, *dss, shs);
 STATE_AO_GC(dss->ao);
 int rc;
 
@@ -153,7 +154,7 @@ void libxl__domain_suspend_common_switch_qemu_logdirty
 {
 libxl__save_helper_state *shs = user;
 libxl__egc *egc = shs->egc;
-libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs);
+libxl__domain_save_state *ds

[Xen-devel] [PATCH v3 COLOPre 07/26] libxc/restore: fix error handle of process_record

2015-06-24 Thread Yang Hongyang
If process_record() returns RECORD_NOT_PROCESSED for an optional record,
restore still fails. There are two ways to fix this:
  a. set rc to 0 when the record is optional;
  b. do the error handling in the caller.
We choose b because COLO will add another error type, which indicates a
failover and needs to be handled in restore(). Moving the error handling
out keeps the logic clear: otherwise RECORD_NOT_PROCESSED would be handled
in process_record() while the other error type returned from
process_record() would be handled in restore().
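
As a rough illustration of option b, the caller-side policy could look like
the sketch below. The constants, struct record and the process_record()
stub are simplified stand-ins assumed for this example only; they are not
the libxc definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins; the real definitions live in libxc. */
#define RECORD_NOT_PROCESSED 1
#define REC_TYPE_OPTIONAL    0x80000000U

struct record { uint32_t type; };

/* Stub: pretend that only record type 0x1 is understood. */
static int process_record(const struct record *rec)
{
    if ((rec->type & ~REC_TYPE_OPTIONAL) == 0x1)
        return 0;
    return RECORD_NOT_PROCESSED;
}

/* Caller-side policy: returns 0 to keep reading the stream, -1 to abort. */
int handle_record(const struct record *rec)
{
    int rc = process_record(rec);

    if (rc == RECORD_NOT_PROCESSED) {
        if (rec->type & REC_TYPE_OPTIONAL) {
            /* Unknown but optional: skip it and carry on. */
            fprintf(stderr, "Ignoring optional record %#x\n",
                    (unsigned)rec->type);
            return 0;
        }
        /* Unknown and mandatory: the stream cannot be restored. */
        fprintf(stderr, "Mandatory record %#x not handled\n",
                (unsigned)rec->type);
        return -1;
    }

    /* Any other non-zero value is a plain failure. */
    return rc ? -1 : 0;
}
```

With the decision kept in the caller, a further return value (such as the
COLO failover indication) can be added as one more branch in the same place.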

Signed-off-by: Yang Hongyang 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
---
 tools/libxc/xc_sr_restore.c | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index fd45775..d5645e0 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -569,19 +569,6 @@ static int process_record(struct xc_sr_context *ctx, 
struct xc_sr_record *rec)
 free(rec->data);
 rec->data = NULL;
 
-if ( rc == RECORD_NOT_PROCESSED )
-{
-if ( rec->type & REC_TYPE_OPTIONAL )
-DPRINTF("Ignoring optional record %#x (%s)",
-rec->type, rec_type_to_str(rec->type));
-else
-{
-ERROR("Mandatory record %#x (%s) not handled",
-  rec->type, rec_type_to_str(rec->type));
-rc = -1;
-}
-}
-
 return rc;
 }
 
@@ -669,7 +656,20 @@ static int restore(struct xc_sr_context *ctx)
 else
 {
 rc = process_record(ctx, &rec);
-if ( rc )
+if ( rc == RECORD_NOT_PROCESSED )
+{
+if ( rec.type & REC_TYPE_OPTIONAL )
+DPRINTF("Ignoring optional record %#x (%s)",
+rec.type, rec_type_to_str(rec.type));
+else
+{
+ERROR("Mandatory record %#x (%s) not handled",
+  rec.type, rec_type_to_str(rec.type));
+rc = -1;
+goto err;
+}
+}
+else if ( rc )
 goto err;
 }
 
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 COLOPre 08/26] tools/libxc: support to resume uncooperative HVM guests

2015-06-24 Thread Yang Hongyang
From: Wen Congyang 

1. Suspend:
a. PVHVM and PV: both are suspended the same way, by sending the suspend
   request to the guest. If the guest doesn't support evtchn, the xenstore
   variant is used, suspending the guest via the XenBus control node.
b. Pure HVM: we call xc_domain_shutdown(..., SHUTDOWN_suspend) to suspend
   the guest.

2. Resume:
a. Fast path
   In this case, we don't change the guest's state.
   PV: modify the return code to 1, and then call the domctl
   XEN_DOMCTL_resumedomain.
   PVHVM: same as PV.
   HVM: do nothing in modify_returncode, and then call the domctl
   XEN_DOMCTL_resumedomain.
b. Slow path
   In this case, we have changed the guest's state.
   PV: update start info and reset all secondary CPU states, then call the
   domctl XEN_DOMCTL_resumedomain.
   PVHVM and HVM cannot be resumed.

For PVHVM, in my test, calling only the domctl XEN_DOMCTL_resumedomain
works. I am not sure whether we should also update start info and reset
all secondary CPU states.

For a pure HVM guest, in my test, calling only the domctl
XEN_DOMCTL_resumedomain works.

So we can call libxl__domain_resume(..., 1) if we haven't changed the guest
state, and libxl__domain_resume(..., 0) otherwise.
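
The resume matrix described in this message can be summarised as a small
decision function. This is only a sketch: guest_kind, resume_step and both
function names are invented for illustration, and the slow-path handling of
PVHVM/HVM assumes the behaviour this patch proposes (issue the domctl only).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, used only for this illustration. */
enum guest_kind { GUEST_PV, GUEST_PVHVM, GUEST_HVM };

enum resume_step {
    STEP_MODIFY_RETURNCODE    = 1 << 0, /* suspend hypercall returns 1 */
    STEP_UPDATE_START_INFO    = 1 << 1,
    STEP_RESET_SECONDARY_CPUS = 1 << 2,
    STEP_RESUMEDOMAIN_DOMCTL  = 1 << 3, /* XEN_DOMCTL_resumedomain */
};

/* Second argument for libxl__domain_resume(): 1 = fast path, 0 = slow. */
int suspend_cancel_for(bool guest_state_changed)
{
    return guest_state_changed ? 0 : 1;
}

/* Bitmask of the steps taken for a given guest kind and path. */
int resume_steps(enum guest_kind kind, int suspend_cancel)
{
    int steps = STEP_RESUMEDOMAIN_DOMCTL; /* every path ends with the domctl */

    if (suspend_cancel) {
        /* Fast path: modify_returncode does nothing for pure HVM. */
        if (kind != GUEST_HVM)
            steps |= STEP_MODIFY_RETURNCODE;
    } else if (kind == GUEST_PV) {
        /* Slow path: only PV rebuilds start info and secondary CPUs. */
        steps |= STEP_UPDATE_START_INFO | STEP_RESET_SECONDARY_CPUS;
    }

    return steps;
}
```

For example, resume_steps(GUEST_HVM, suspend_cancel_for(true)) yields only
STEP_RESUMEDOMAIN_DOMCTL, which matches what the new xc_domain_resume_hvm()
does.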

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
---
 tools/libxc/xc_resume.c | 22 ++
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c
index e67bebd..bd82334 100644
--- a/tools/libxc/xc_resume.c
+++ b/tools/libxc/xc_resume.c
@@ -109,6 +109,23 @@ static int xc_domain_resume_cooperative(xc_interface *xch, 
uint32_t domid)
 return do_domctl(xch, &domctl);
 }
 
+static int xc_domain_resume_hvm(xc_interface *xch, uint32_t domid)
+{
+DECLARE_DOMCTL;
+
+/*
+ * If it is PVHVM, the hypercall return code is 0, because this
+ * is not a fast path resume, we do not modify_returncode as in
+ * xc_domain_resume_cooperative.
+ * (resuming it in a new domain context)
+ *
+ * If it is a HVM, the hypercall is a NOP.
+ */
+domctl.cmd = XEN_DOMCTL_resumedomain;
+domctl.domain = domid;
+return do_domctl(xch, &domctl);
+}
+
 static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
 {
 DECLARE_DOMCTL;
@@ -138,10 +155,7 @@ static int xc_domain_resume_any(xc_interface *xch, 
uint32_t domid)
  */
 #if defined(__i386__) || defined(__x86_64__)
 if ( info.hvm )
-{
-ERROR("Cannot resume uncooperative HVM guests");
-return rc;
-}
+return xc_domain_resume_hvm(xch, domid);
 
 if ( xc_domain_get_guest_width(xch, domid, &dinfo->guest_width) != 0 )
 {
-- 
1.9.1




[Xen-devel] [PATCH v3 COLOPre 13/26] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty()

2015-06-24 Thread Yang Hongyang
The secondary VM runs in COLO mode, and we need to send its dirty page
information to the master at each checkpoint, so we have to enable qemu
logdirty on the secondary.

libxl__domain_suspend_common_switch_qemu_logdirty() enables qemu
logdirty, but it uses domain_save_state and calls
libxl__xc_domain_saverestore_async_callback_done()
before it exits, so it cannot be used for the secondary VM.

Update libxl__domain_suspend_common_switch_qemu_logdirty() and
introduce a new API, libxl__domain_common_switch_qemu_logdirty().
This API only uses libxl__logdirty_switch, and calls
lds->callback before it exits.
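
The shape of this refactoring, stripped of all libxl machinery, is roughly
the following. It is a simplified sketch: every type and function name here
is made up for illustration, the egc/ao plumbing is omitted, and the
"switch" itself is a stub that fails for any enable value other than 0 or 1.

```c
#include <assert.h>
#include <stddef.h>

struct logdirty_switch;
typedef void logdirty_cb(struct logdirty_switch *lds, int rc);

/* Common part: knows nothing about who embeds it. */
struct logdirty_switch {
    logdirty_cb *callback;
};

/* Save-side state that embeds the common part (hypothetical layout). */
struct save_state {
    int last_rc;
    struct logdirty_switch logdirty;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define CONTAINER_OF_FIELD(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Common API: does the switch, then reports through lds->callback. */
void common_switch_logdirty(struct logdirty_switch *lds, unsigned enable)
{
    int rc = (enable <= 1) ? 0 : -1; /* stand-in for the real qemu switch */

    lds->callback(lds, rc);
}

/* Save-side completion: maps lds back to its owner and finishes there. */
static void save_side_done(struct logdirty_switch *lds, int rc)
{
    struct save_state *ss =
        CONTAINER_OF_FIELD(lds, struct save_state, logdirty);

    ss->last_rc = rc;
}

/* Save-side wrapper: installs its own callback, then uses the common API. */
void save_switch_logdirty(struct save_state *ss, unsigned enable)
{
    ss->logdirty.callback = save_side_done;
    common_switch_logdirty(&ss->logdirty, enable);
}
```

A restore-side (secondary) user would embed the same logdirty_switch in its
own state and install a different callback, without touching the save path.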

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
Acked-by: Ian Campbell 
---
 tools/libxl/libxl_dom_save.c | 78 ++--
 tools/libxl/libxl_internal.h |  8 +
 2 files changed, 54 insertions(+), 32 deletions(-)

diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index 0ad2894..5becc68 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -50,7 +50,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, 
libxl__ev_time *ev,
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
 const char *watch_path, const char *event_path);
 static void switch_logdirty_done(libxl__egc *egc,
- libxl__domain_save_state *dss, int ok);
+ libxl__logdirty_switch *lds, int ok);
 
 static void logdirty_init(libxl__logdirty_switch *lds)
 {
@@ -60,13 +60,10 @@ static void logdirty_init(libxl__logdirty_switch *lds)
 }
 
 static void domain_suspend_switch_qemu_xen_traditional_logdirty
-   (int domid, unsigned enable,
-libxl__save_helper_state *shs)
+   (libxl__egc *egc, int domid, unsigned enable,
+libxl__logdirty_switch *lds)
 {
-libxl__egc *egc = shs->egc;
-libxl__domain_save_state *dss = CONTAINER_OF(shs, *dss, shs);
-libxl__logdirty_switch *lds = &dss->logdirty;
-STATE_AO_GC(dss->ao);
+STATE_AO_GC(lds->ao);
 int rc;
 xs_transaction_t t = 0;
 const char *got;
@@ -128,64 +125,81 @@ static void 
domain_suspend_switch_qemu_xen_traditional_logdirty
  out:
 LOG(ERROR,"logdirty switch failed (rc=%d), aborting suspend",rc);
 libxl__xs_transaction_abort(gc, &t);
-switch_logdirty_done(egc,dss,-1);
+switch_logdirty_done(egc,lds,-1);
 }
 
 static void domain_suspend_switch_qemu_xen_logdirty
-   (int domid, unsigned enable,
-libxl__save_helper_state *shs)
+   (libxl__egc *egc, int domid, unsigned enable,
+libxl__logdirty_switch *lds)
 {
-libxl__egc *egc = shs->egc;
-libxl__domain_save_state *dss = CONTAINER_OF(shs, *dss, shs);
-STATE_AO_GC(dss->ao);
+STATE_AO_GC(lds->ao);
 int rc;
 
 rc = libxl__qmp_set_global_dirty_log(gc, domid, enable);
 if (!rc) {
-libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+lds->callback(egc, lds, 0);
 } else {
 LOG(ERROR,"logdirty switch failed (rc=%d), aborting suspend",rc);
-libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
+lds->callback(egc, lds, -1);
 }
 }
 
+static void libxl__domain_suspend_switch_qemu_logdirty_done
+(libxl__egc *egc, libxl__logdirty_switch *lds, int rc)
+{
+libxl__domain_save_state *dss = CONTAINER_OF(lds, *dss, logdirty);
+
+libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, rc);
+}
+
 void libxl__domain_suspend_common_switch_qemu_logdirty
(int domid, unsigned enable, void *user)
 {
 libxl__save_helper_state *shs = user;
 libxl__egc *egc = shs->egc;
 libxl__domain_save_state *dss = CONTAINER_OF(shs, *dss, shs);
-STATE_AO_GC(dss->ao);
+
+/* convenience aliases */
+libxl__logdirty_switch *const lds = &dss->logdirty;
+
+lds->callback = libxl__domain_suspend_switch_qemu_logdirty_done;
+libxl__domain_common_switch_qemu_logdirty(egc, domid, enable, lds);
+}
+
+void libxl__domain_common_switch_qemu_logdirty(libxl__egc *egc,
+   int domid, unsigned enable,
+   libxl__logdirty_switch *lds)
+{
+STATE_AO_GC(lds->ao);
 
 switch (libxl__device_model_version_running(gc, domid)) {
 case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
-domain_suspend_switch_qemu_xen_traditional_logdirty(domid, enable, 
shs);
+domain_suspend_switch_qemu_xen_traditional_logdirty(egc, domid, enable,
+lds);
 break;
 case LIBXL_DEVICE_MO

[Xen-devel] [PATCH v3 COLOPre 14/26] tools/libxl: export logdirty_init

2015-06-24 Thread Yang Hongyang
We need to enable logdirty on secondary, so we export logdirty_init
for internal use. Rename it to libxl__logdirty_init.

Signed-off-by: Yang Hongyang 
---
 tools/libxl/libxl_dom_save.c | 4 ++--
 tools/libxl/libxl_internal.h | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index 5becc68..5797148 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -52,7 +52,7 @@ static void switch_logdirty_xswatch(libxl__egc *egc, 
libxl__ev_xswatch*,
 static void switch_logdirty_done(libxl__egc *egc,
  libxl__logdirty_switch *lds, int ok);
 
-static void logdirty_init(libxl__logdirty_switch *lds)
+void libxl__logdirty_init(libxl__logdirty_switch *lds)
 {
 lds->cmd_path = 0;
 libxl__ev_xswatch_init(&lds->watch);
@@ -377,7 +377,7 @@ void libxl__domain_save(libxl__egc *egc, 
libxl__domain_save_state *dss)
 goto out;
 }
 
-logdirty_init(&dss->logdirty);
+libxl__logdirty_init(&dss->logdirty);
 dss->logdirty.ao = ao;
 libxl__xswait_init(&dsps->pvcontrol);
 libxl__ev_evtchn_init(&dsps->guest_evtchn);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index b15c24a..5f875ee 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2921,6 +2921,8 @@ typedef struct libxl__logdirty_switch {
 libxl__ev_time timeout;
 } libxl__logdirty_switch;
 
+_hidden void libxl__logdirty_init(libxl__logdirty_switch *lds);
+
 struct libxl__domain_suspend_state {
 /* set by caller of domain_suspend_callback_common */
 libxl__ao *ao;
-- 
1.9.1




[Xen-devel] [PATCH v3 COLOPre 10/26] migration/save: pass checkpointed_stream from libxl to libxc

2015-06-24 Thread Yang Hongyang
Pass checkpointed_stream from libxl to libxc.
It won't affect legacy migration because legacy migration
doesn't use this parameter.

Signed-off-by: Yang Hongyang 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
---
 tools/libxc/include/xenguest.h   |  9 ++---
 tools/libxc/xc_domain_save.c |  6 --
 tools/libxc/xc_nomigrate.c   |  3 ++-
 tools/libxc/xc_sr_common.h   |  2 +-
 tools/libxc/xc_sr_save.c |  5 +++--
 tools/libxl/libxl.c  |  2 ++
 tools/libxl/libxl_dom_save.c | 12 +---
 tools/libxl/libxl_internal.h |  1 +
 tools/libxl/libxl_save_callout.c |  2 +-
 tools/libxl/libxl_save_helper.c  |  3 ++-
 10 files changed, 31 insertions(+), 14 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index b0d27ed..d7c8fe4 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -30,7 +30,6 @@
 #define XCFLAGS_HVM   (1 << 2)
 #define XCFLAGS_STDVGA(1 << 3)
 #define XCFLAGS_CHECKPOINT_COMPRESS(1 << 4)
-#define XCFLAGS_CHECKPOINTED(1 << 5)
 
 #define X86_64_B_SIZE   64 
 #define X86_32_B_SIZE   32
@@ -85,16 +84,20 @@ struct save_callbacks {
  * @parm xch a handle to an open hypervisor interface
  * @parm fd the file descriptor to save a domain to
  * @parm dom the id of the domain
+ * @parm checkpointed_stream non-zero if the far end of the stream is using
+ *   checkpointing
  * @return 0 on success, -1 on failure
  */
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t 
max_iters,
uint32_t max_factor, uint32_t flags /* XCFLAGS_xxx */,
-   struct save_callbacks* callbacks, int hvm);
+   struct save_callbacks* callbacks, int hvm,
+   int checkpointed_stream);
 
 /* Domain Save v2 */
 int xc_domain_save2(xc_interface *xch, int io_fd, uint32_t dom, uint32_t 
max_iters,
 uint32_t max_factor, uint32_t flags,
-struct save_callbacks* callbacks, int hvm);
+struct save_callbacks* callbacks, int hvm,
+int checkpointed_stream);
 
 /* callbacks provided by xc_domain_restore */
 struct restore_callbacks {
diff --git a/tools/libxc/xc_domain_save.c b/tools/libxc/xc_domain_save.c
index 301e770..c76980e 100644
--- a/tools/libxc/xc_domain_save.c
+++ b/tools/libxc/xc_domain_save.c
@@ -802,7 +802,8 @@ static int save_tsc_info(xc_interface *xch, uint32_t dom, 
int io_fd)
 
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t 
max_iters,
uint32_t max_factor, uint32_t flags,
-   struct save_callbacks* callbacks, int hvm)
+   struct save_callbacks* callbacks, int hvm,
+   int checkpointed_stream)
 {
 xc_dominfo_t info;
 DECLARE_DOMCTL;
@@ -897,7 +898,8 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t 
dom, uint32_t max_iter
 if ( getenv("XG_MIGRATION_V2") )
 {
 return xc_domain_save2(xch, io_fd, dom, max_iters,
-   max_factor, flags, callbacks, hvm);
+   max_factor, flags, callbacks, hvm,
+   checkpointed_stream);
 }
 
 DPRINTF("%s: starting save of domid %u", __func__, dom);
diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
index 76978a0..374d5bf 100644
--- a/tools/libxc/xc_nomigrate.c
+++ b/tools/libxc/xc_nomigrate.c
@@ -23,7 +23,8 @@
 
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t 
max_iters,
uint32_t max_factor, uint32_t flags,
-   struct save_callbacks* callbacks, int hvm)
+   struct save_callbacks* callbacks, int hvm,
+   int checkpointed_stream)
 {
 errno = ENOSYS;
 return -1;
diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 42af55b..04f32f5 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -175,7 +175,7 @@ struct xc_sr_context
 bool live;
 
 /* Plain VM, or checkpoints over time. */
-bool checkpointed;
+int checkpointed;
 
 /* Further debugging information in the stream. */
 bool debug;
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index d63b783..6102b66 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -820,7 +820,8 @@ static int save(struct xc_sr_context *ctx, uint16_t 
guest_type)
 
 int xc_domain_save2(xc_interface *xch, int io_fd, uint32_t dom,
 uint32_t max_iters, uint32_t max_factor, uint32_t flags,
-struct save_callbacks* callbacks, int hvm)
+struct save_callbacks* callbacks, int hvm,
+int checkpointed_stream)
 {
 xen_pfn_t nr_pfns;
 struct xc_sr_context ctx =
@@ -833,7 +834

[Xen-devel] [PATCH v3 COLOPre 12/26] tools/libxl: Update libxl_domain_unpause() to support qemu-xen

2015-06-24 Thread Yang Hongyang
Currently, libxl_domain_unpause() only supports
qemu-xen-traditional. Update it to support qemu-xen.
We use libxl__domain_resume_device_model() to unpause the guest's dm.

Signed-off-by: Yang Hongyang 
---
 tools/libxl/libxl.c | 15 +--
 tools/libxl/libxl_dom_suspend.c | 15 ---
 tools/libxl/libxl_internal.h|  1 +
 3 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 2ca59ea..59e2dfe 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -938,8 +938,6 @@ out:
 int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
 {
 GC_INIT(ctx);
-char *path;
-char *state;
 int ret, rc = 0;
 
 libxl_domain_type type = libxl__domain_type(gc, domid);
@@ -949,14 +947,11 @@ int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
 }
 
 if (type == LIBXL_DOMAIN_TYPE_HVM) {
-uint32_t dm_domid = libxl_get_stubdom_id(ctx, domid);
-
-path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
-state = libxl__xs_read(gc, XBT_NULL, path);
-if (state != NULL && !strcmp(state, "paused")) {
-libxl__qemu_traditional_cmd(gc, domid, "continue");
-libxl__wait_for_device_model_deprecated(gc, domid, "running",
- NULL, NULL, NULL);
+rc = libxl__domain_resume_device_model(gc, domid);
+if (rc < 0) {
+LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "failed to unpause device model "
+   "for domain %u:%d", domid, rc);
+goto out;
 }
 }
 ret = xc_domain_unpause(ctx->xch, domid);
diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
index 1c486c4..3edbd2e 100644
--- a/tools/libxl/libxl_dom_suspend.c
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -376,13 +376,22 @@ static void 
domain_suspend_callback_common_done(libxl__egc *egc,
 
 /*=== Domain resume */
 
-static int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
+int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
 {
+char *path;
+char *state;
 
 switch (libxl__device_model_version_running(gc, domid)) {
 case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
-libxl__qemu_traditional_cmd(gc, domid, "continue");
-libxl__wait_for_device_model_deprecated(gc, domid, "running", NULL, 
NULL, NULL);
+uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
+
+path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
+state = libxl__xs_read(gc, XBT_NULL, path);
+if (state != NULL && !strcmp(state, "paused")) {
+libxl__qemu_traditional_cmd(gc, domid, "continue");
+libxl__wait_for_device_model_deprecated(gc, domid, "running",
+NULL, NULL, NULL);
+}
 break;
 }
 case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 0adc9b4..6960280 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3326,6 +3326,7 @@ static inline bool libxl__save_helper_inuse(const 
libxl__save_helper_state *shs)
 /* Each time the dm needs to be saved, we must call suspend and then save */
 _hidden int libxl__domain_suspend_device_model(libxl__gc *gc,
libxl__domain_suspend_state *dsps);
+_hidden int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid);
 _hidden void libxl__domain_save_device_model(libxl__egc *egc,
  libxl__domain_save_state *dss,
  libxl__save_device_model_cb *callback);
-- 
1.9.1




[Xen-devel] [PATCH v3 COLOPre 18/26] tools/libx{l, c}: add postcopy/suspend callback to restore side

2015-06-24 Thread Yang Hongyang
The secondary (restore side) runs under COLO, so it also needs the
postcopy/suspend callbacks.

Signed-off-by: Yang Hongyang 
---
 tools/libxc/include/xenguest.h | 10 ++
 tools/libxl/libxl_save_msgs_gen.pl |  4 ++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index e804a1d..dcc441a 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -114,6 +114,16 @@ struct restore_callbacks {
 int (*toolstack_restore)(uint32_t domid, const uint8_t *buf,
 uint32_t size, void* data);
 
+/* Called after a new checkpoint to suspend the guest.
+ */
+int (*suspend)(void* data);
+
+/* Called after the secondary vm is ready to resume.
+ * Callback function resumes the guest & the device model,
+ *  returns to xc_domain_restore.
+ */
+int (*postcopy)(void* data);
+
 /* A checkpoint record has been found in the stream */
 int (*checkpoint)(void* data);
 
diff --git a/tools/libxl/libxl_save_msgs_gen.pl 
b/tools/libxl/libxl_save_msgs_gen.pl
index 7284975..86cd395 100755
--- a/tools/libxl/libxl_save_msgs_gen.pl
+++ b/tools/libxl/libxl_save_msgs_gen.pl
@@ -23,8 +23,8 @@ our @msgs = (
  STRING doing_what),
 'unsigned long', 'done',
 'unsigned long', 'total'] ],
-[  3, 'scxA',   "suspend", [] ],
-[  4, 'scxA',   "postcopy", [] ],
+[  3, 'srcxA',  "suspend", [] ],
+[  4, 'srcxA',  "postcopy", [] ],
 [  5, 'srcxA',   "checkpoint", [] ],
 [  6, 'srcxA',  "should_checkpoint", [] ],
 [  7, 'scxA',   "switch_qemu_logdirty",  [qw(int domid
-- 
1.9.1




[Xen-devel] [PATCH v3 COLOPre 11/26] tools/libxl: introduce a new API libxl__domain_restore() to load qemu state

2015-06-24 Thread Yang Hongyang
The secondary VM runs in COLO mode, so we repeat the following steps
again and again:
1. suspend both the primary VM and the secondary VM
2. sync the state
3. resume both the primary VM and the secondary VM
We send qemu's state each time in step 2, and the slave's qemu must
read it each time before the secondary VM is resumed. Introduce a new
API, libxl__domain_restore(), to do this. It should be called before
resuming the secondary VM.
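
Seen from the secondary, one iteration of that loop might be ordered as in
the sketch below. The colo_ops structure and all function names are
hypothetical and exist only to show where the device model restore sits;
the stubs merely record the order in which they are called.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-checkpoint hooks; none of these names exist in libxl. */
struct colo_ops {
    int (*suspend)(void *ctx);
    int (*sync_state)(void *ctx);
    int (*restore_dm)(void *ctx); /* role played by libxl__domain_restore() */
    int (*resume)(void *ctx);
};

/* One checkpoint on the secondary: stop at the first failing step. */
int colo_secondary_checkpoint(const struct colo_ops *ops, void *ctx)
{
    int rc;

    if ((rc = ops->suspend(ctx)) != 0)
        return rc;
    if ((rc = ops->sync_state(ctx)) != 0)
        return rc;
    /* qemu state arrived during sync; load it before the guest runs again. */
    if ((rc = ops->restore_dm(ctx)) != 0)
        return rc;
    return ops->resume(ctx);
}

/* Tracing stubs that append each step name to a log. */
struct colo_trace { char log[64]; };

static int trace_step(void *ctx, const char *name)
{
    struct colo_trace *t = ctx;
    size_t used = strlen(t->log);

    snprintf(t->log + used, sizeof(t->log) - used, "%s;", name);
    return 0;
}

static int trace_suspend(void *ctx)    { return trace_step(ctx, "suspend"); }
static int trace_sync(void *ctx)       { return trace_step(ctx, "sync"); }
static int trace_restore_dm(void *ctx) { return trace_step(ctx, "restore-dm"); }
static int trace_resume(void *ctx)     { return trace_step(ctx, "resume"); }

const struct colo_ops colo_trace_ops = {
    trace_suspend, trace_sync, trace_restore_dm, trace_resume,
};
```

Running colo_secondary_checkpoint(&colo_trace_ops, &t) on a zeroed
struct colo_trace leaves the step names in t.log in call order.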

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
Cc: Anthony Perard 
---
 tools/libxl/libxl_dom_save.c | 49 
 tools/libxl/libxl_internal.h |  3 +++
 tools/libxl/libxl_qmp.c  | 10 +
 3 files changed, 62 insertions(+)

diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index 8fe1625..0ad2894 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -639,6 +639,55 @@ int libxl__toolstack_restore(uint32_t domid, const uint8_t 
*buf,
 }
 return 0;
 }
+
+static int libxl__domain_restore_device_model(libxl__gc *gc, uint32_t domid);
+
+int libxl__domain_restore(libxl__gc *gc, uint32_t domid)
+{
+int rc = 0;
+
+libxl_domain_type type = libxl__domain_type(gc, domid);
+if (type != LIBXL_DOMAIN_TYPE_HVM) {
+rc = ERROR_FAIL;
+goto out;
+}
+
+rc = libxl__domain_restore_device_model(gc, domid);
+if (rc)
+LOG(ERROR, "failed to restore device model for domain %u:%d",
+domid, rc);
+out:
+return rc;
+}
+
+static int libxl__domain_restore_device_model(libxl__gc *gc, uint32_t domid)
+{
+char *state_file;
+int rc;
+
+switch (libxl__device_model_version_running(gc, domid)) {
+case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
+/* not supported now */
+rc = ERROR_INVAL;
+break;
+case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+/*
+ * This function may be called too many times for the same gc,
+ * so we use NOGC, and free the memory before return to avoid
+ * OOM.
+ */
+state_file = libxl__sprintf(NOGC,
+XC_DEVICE_MODEL_RESTORE_FILE".%d",
+domid);
+rc = libxl__qmp_restore(gc, domid, state_file);
+free(state_file);
+break;
+default:
+rc = ERROR_INVAL;
+}
+
+return rc;
+}
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 573c844..0adc9b4 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1043,6 +1043,7 @@ _hidden int libxl__userdata_store(libxl__gc *gc, uint32_t 
domid,
   const char *userdata_userid,
   const uint8_t *data, int datalen);
 
+_hidden int libxl__domain_restore(libxl__gc *gc, uint32_t domid);
 _hidden int libxl__domain_resume(libxl__gc *gc, uint32_t domid,
  int suspend_cancel);
 
@@ -1654,6 +1655,8 @@ _hidden int libxl__qmp_stop(libxl__gc *gc, int domid);
 _hidden int libxl__qmp_resume(libxl__gc *gc, int domid);
 /* Save current QEMU state into fd. */
 _hidden int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename);
+/* Load current QEMU state from fd. */
+_hidden int libxl__qmp_restore(libxl__gc *gc, int domid, const char *filename);
 /* Set dirty bitmap logging status */
 _hidden int libxl__qmp_set_global_dirty_log(libxl__gc *gc, int domid, bool 
enable);
 _hidden int libxl__qmp_insert_cdrom(libxl__gc *gc, int domid, const 
libxl_device_disk *disk);
diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
index 9aa7e2e..a6f1a21 100644
--- a/tools/libxl/libxl_qmp.c
+++ b/tools/libxl/libxl_qmp.c
@@ -892,6 +892,16 @@ int libxl__qmp_save(libxl__gc *gc, int domid, const char 
*filename)
NULL, NULL);
 }
 
+int libxl__qmp_restore(libxl__gc *gc, int domid, const char *state_file)
+{
+libxl__json_object *args = NULL;
+
+qmp_parameters_add_string(gc, &args, "filename", state_file);
+
+return qmp_run_command(gc, domid, "xen-load-devices-state", args,
+   NULL, NULL);
+}
+
 static int qmp_change(libxl__gc *gc, libxl__qmp_handler *qmp,
   char *device, char *target, char *arg)
 {
-- 
1.9.1



