From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>

If the remote host or the network dies during a migration, the socket
can sit waiting on a long timeout, so migrate_cancel can't complete for
a long time (and you can't start a new migration to somewhere else).
Here 'long' means the TCP timeout, which is ~15 mins.
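
As an illustration of the stall, and of how shutdown(2) can break it
from another thread, here is a minimal standalone sketch. It is not
QEMU code: it assumes Linux, uses an AF_UNIX socketpair in place of a
TCP migration stream, and the names writer_state, writer_thread and
shutdown_unblocks_writer are invented for this example.

```c
#include <errno.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper state for this sketch; not taken from QEMU. */
struct writer_state {
    int fd;          /* socket the writer sends on */
    int saved_errno; /* errno from the send() that finally failed */
};

/*
 * Keep sending until send() fails. The peer never reads, so once the
 * socket buffer fills up the call blocks, much like a migration stream
 * whose destination has gone away.
 */
static void *writer_thread(void *opaque)
{
    struct writer_state *st = opaque;
    char buf[4096];

    memset(buf, 0, sizeof(buf));
    for (;;) {
        /* MSG_NOSIGNAL: report EPIPE instead of raising SIGPIPE */
        if (send(st->fd, buf, sizeof(buf), MSG_NOSIGNAL) < 0) {
            st->saved_errno = errno;
            return NULL;
        }
    }
}

/*
 * Returns the errno seen by the writer after shutdown(2) was called on
 * its socket, or -1 if the setup failed.
 */
int shutdown_unblocks_writer(void)
{
    int sv[2];
    struct writer_state st = { .fd = -1, .saved_errno = 0 };
    pthread_t tid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    st.fd = sv[0];
    if (pthread_create(&tid, NULL, writer_thread, &st) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    sleep(1);                   /* let the writer fill the buffer and block */
    shutdown(sv[0], SHUT_RDWR); /* the "cancel": fails the pending send() */
    pthread_join(tid, NULL);    /* returns promptly instead of hanging */

    close(sv[0]);
    close(sv[1]);
    return st.saved_errno;
}
```

Calling shutdown_unblocks_writer() from a small main() (built with
-pthread) should return EPIPE on Linux. Without the shutdown() call,
the pthread_join() would never return, because nothing ever drains the
peer end of the socket.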

This patch set uses the shutdown(2) syscall to unblock any writes/sends
that are in progress, so that migrate_cancel can complete quickly.

1/3: socket shutdown
   - An updated patch from my postcopy world to add a shut_down method
     on QEMUFile - only for 'socket' (where the syscall is supported).

2/3: Handle bi-directional communication for fd migration
   - A patch from Cristian Klein to use the socket QEMUFile for FDs that
     are passed in, if the FDs are sockets; this is needed so that
     libvirt migrations can take advantage of the other patches.
     Again, this patch (and its naming) comes from the postcopy world.

3/3: migration_cancel: shutdown migration socket
   - A new patch that uses the shutdown in migrate_fd_cancel.

Note that this does not fix the timeout if you try to migrate to an
already dead host; the connect timeout is typically a much shorter
2 minutes anyway.

Dave

Cristian Klein (1):
  Handle bi-directional communication for fd migration

Dr. David Alan Gilbert (2):
  socket shutdown
  migration_cancel: shutdown migration socket

 include/migration/qemu-file.h | 10 ++++++++++
 include/qemu/sockets.h        |  7 +++++++
 migration/fd.c                | 24 ++++++++++++++++++++++--
 migration/migration.c         | 12 ++++++++++++
 migration/qemu-file-unix.c    | 23 +++++++++++++++++++----
 migration/qemu-file.c         | 12 ++++++++++++
 6 files changed, 82 insertions(+), 6 deletions(-)

-- 
2.1.0