From: "Michael R. Hines" <mrhi...@us.ibm.com>

This patchset introduces RDMA-based live migration to QEMU.
A copy of this documentation is located online:

http://wiki.qemu.org/Features/RDMALiveMigration

DESIGN:
==========

1. In order to provide maximum cross-device compatibility, we use the
   librdmacm library, which abstracts out the RDMA capabilities of each
   individual type of RDMA device, including InfiniBand, iWARP, and RoCE.
   This patch has been tested on both RoCE and InfiniBand devices from
   Mellanox.

2. A new file named "migration-rdma.c" contains the core code required
   to perform librdmacm connection establishment and the transfer of
   actual RDMA contents.

3. Files "arch_init.c" and "savevm.c" have been modified to transfer the
   VM's memory in the standard live migration path using RDMA instead
   of TCP.

4. None of the original logic for migration of devices and protocol
   synchronization changes: that still happens simultaneously over TCP,
   as it normally does.

5. Currently, the XBZRLE capability and the detection of zero pages
   (dup_page()) significantly slow down the empirical throughput observed
   when RDMA is activated, so the code path skips these capabilities when
   RDMA is enabled. Hopefully, we can stop doing this in the future and
   come up with a way to preserve these capabilities simultaneously with
   the use of RDMA.

PERFORMANCE:
============

Using a 40 Gbps InfiniBand link performing a worst-case stress test:

1. Average worst-case RDMA throughput with
   $ stress --vm-bytes 1024M --vm 1 --vm-keep
   Approximately 26 Gbps

2. Average worst-case TCP throughput with
   $ stress --vm-bytes 1024M --vm 1 --vm-keep
   Approximately 8 Gbps (using IPoIB, IP over InfiniBand)

Average downtime (stop time) ranges between 28 and 33 milliseconds.

An *exhaustive* paper (2010) with additional performance details is
linked on the QEMU wiki:

http://wiki.qemu.org/Features/RDMALiveMigration

USAGE:
==========

Complete instructions for compiling and running with RDMA are also
available on the wiki (probably too much for a cover letter).

Signed-off-by: Michael R. Hines <mrhi...@us.ibm.com>
---
 Makefile.objs |  1 +
 configure     | 25 +++++++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/Makefile.objs b/Makefile.objs
index 68eb0ce..38767cc 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -57,6 +57,7 @@ common-obj-$(CONFIG_POSIX) += os-posix.o
 common-obj-$(CONFIG_LINUX) += fsdev/
 common-obj-y += migration.o migration-tcp.o
+common-obj-$(CONFIG_RDMA) += migration-rdma.o
 common-obj-y += qemu-char.o #aio.o
 common-obj-y += block-migration.o
 common-obj-y += page_cache.o
diff --git a/configure b/configure
index b7635e4..893935f 100755
--- a/configure
+++ b/configure
@@ -170,6 +170,7 @@ xfs=""
 vhost_net="no"
 kvm="no"
+rdma="no"
 gprof="no"
 debug_tcg="no"
 debug="no"
@@ -897,6 +898,10 @@ for opt do
   ;;
   --enable-virtio-blk-data-plane) virtio_blk_data_plane="yes"
   ;;
+  --enable-rdma) rdma="yes"
+  ;;
+  --disable-rdma) rdma="no"
+  ;;
   *) echo "ERROR: unknown option $opt"; show_help="yes"
   ;;
 esac
@@ -1087,6 +1092,8 @@ echo " --enable-bluez enable bluez stack connectivity"
 echo " --disable-slirp disable SLIRP userspace network connectivity"
 echo " --disable-kvm disable KVM acceleration support"
 echo " --enable-kvm enable KVM acceleration support"
+echo " --disable-rdma disable RDMA-based migration support"
+echo " --enable-rdma enable RDMA-based migration support"
 echo " --enable-tcg-interpreter enable TCG with bytecode interpreter (TCI)"
 echo " --disable-nptl disable usermode NPTL support"
 echo " --enable-nptl enable usermode NPTL support"
@@ -1718,6 +1725,18 @@ EOF
   libs_softmmu="$sdl_libs $libs_softmmu"
 fi
+if test "$rdma" = "yes" ; then
+  cat > $TMPC <<EOF
+#include <rdma/rdma_cma.h>
+int main(void) { return 0; }
+EOF
+  rdma_libs="-lrdmacm"
+  if ! compile_prog "" "$rdma_libs" ; then
+    feature_not_found "rdma"
+  fi
+
+fi
+
 ##########################################
 # VNC TLS/WS detection
 if test "$vnc" = "yes" -a \( "$vnc_tls" != "no" -o "$vnc_ws" != "no" \) ; then
@@ -3318,6 +3337,7 @@ echo "Linux AIO support $linux_aio"
 echo "ATTR/XATTR support $attr"
 echo "Install blobs $blobs"
 echo "KVM support $kvm"
+echo "RDMA support $rdma"
 echo "TCG interpreter $tcg_interpreter"
 echo "fdt support $fdt"
 echo "preadv support $preadv"
@@ -4278,6 +4298,11 @@ if [ "$pixman" = "internal" ]; then
   echo "config-host.h: subdir-pixman" >> $config_host_mak
 fi
+if test "$rdma" = "yes" ; then
+echo "CONFIG_RDMA=y" >> $config_host_mak
+echo "LIBS+=$rdma_libs" >> $config_host_mak
+fi
+
 # build tree in object directory in case the source is not in the current directory
 DIRS="tests tests/tcg tests/tcg/cris tests/tcg/lm32"
 DIRS="$DIRS pc-bios/optionrom pc-bios/spapr-rtas"
--
1.7.10.4
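
For anyone reading the configure changes in isolation, the flag handling and the resulting config-host.mak entries can be sketched as a stand-alone script. This is only an illustration under stated assumptions, not the configure script itself: the function names parse_rdma_opts and emit_rdma_config are hypothetical, output goes to stdout rather than to $config_host_mak, and the librdmacm probe (compile_prog) is left out entirely.

```shell
#!/bin/sh
# Hypothetical stand-alone sketch of the --enable-rdma / --disable-rdma
# handling in the patch. Function names are made up for illustration.

# Print "yes" or "no"; the last matching flag wins, the default is "no"
# (the same default the patch gives the rdma variable).
parse_rdma_opts() {
    rdma="no"
    for opt in "$@"; do
        case "$opt" in
            --enable-rdma)  rdma="yes" ;;
            --disable-rdma) rdma="no"  ;;
        esac
    done
    echo "$rdma"
}

# Print the two lines the patch appends to config-host.mak when RDMA is
# enabled; print nothing otherwise.
emit_rdma_config() {
    if test "$1" = "yes"; then
        echo "CONFIG_RDMA=y"
        echo "LIBS+=-lrdmacm"
    fi
}

emit_rdma_config "$(parse_rdma_opts "$@")"
```

Run with --enable-rdma, the sketch prints the CONFIG_RDMA=y and LIBS lines; run with no flags or with --disable-rdma, it prints nothing. CONFIG_RDMA=y is what causes the new Makefile.objs rule to add migration-rdma.o to the build.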