Does anyone have tips on troubleshooting live migration? I've got several E5-2650 servers running in a test environment, with kernel 3.10.26 and qemu 1.7.0. If I start a VM guest (say Ubuntu, Debian, or CentOS), I can migrate it around from host to host to host just fine. But if I wait a while (say 1 hour) and then try to migrate, the migration succeeds but the guest is hosed: it no longer pings and its CPU is thrashing. I've tried to strace it and don't see anything that other working hosts aren't doing, and I've tried gdb, but I'm not entirely sure what I'm doing there. I also tried downgrading to qemu 1.6.1. I've found dozens of reports of this kind of behavior, but they all turned out to be due to other things (migrating between different host CPUs, someone blaming virtio or memballoon only to later find a fix like changing the machine type, etc.). I'm at a loss. This seems to work just fine with stock CentOS builds.
I'd be happy to try to capture a core if someone is willing to look at it. Here's an example xml:

<domain type='kvm'>
  <name>VM12</name>
  <uuid>dd25acfc-e24d-4de6-814c-72ac465bc208</uuid>
  <description></description>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>2</vcpu>
  <cputune>
    <shares>2000</shares>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-1.7'>hvm</type>
    <boot dev='cdrom'/>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu>
  </cpu>
  <clock offset='utc'>
    <timer name='kvmclock' tickpolicy='catchup'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/sdc'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw' cache='none'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <interface type='bridge'>
      <mac address='02:00:09:66:00:18'/>
      <source bridge='br1000192'/>
      <model type='virtio'/>
      <bandwidth>
        <inbound average='128000' peak='128000'/>
        <outbound average='128000' peak='128000'/>
      </bandwidth>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/VM12.agent'/>
      <target type='virtio' name='VM12.vport'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='none'/>
</domain>
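
In case it helps anyone compare a failing guest's config against a working one, here is a rough Python sketch (standard library only) that pulls out the settings the other reports pointed at: machine type, the <cpu> element, clock timers, virtio devices, and memballoon. The function name `migration_relevant_settings`, the choice of fields, and the trimmed `SAMPLE` string are my own assumptions for illustration; this is not a libvirt API and it does not diagnose anything by itself.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the domain XML above, kept only as illustrative input.
SAMPLE = """
<domain type='kvm'>
  <name>VM12</name>
  <os><type arch='x86_64' machine='pc-i440fx-1.7'>hvm</type></os>
  <cpu></cpu>
  <clock offset='utc'><timer name='kvmclock' tickpolicy='catchup'/></clock>
  <devices>
    <disk type='block' device='disk'><target dev='vda' bus='virtio'/></disk>
    <disk type='file' device='cdrom'><target dev='hdc' bus='ide'/></disk>
    <interface type='bridge'><model type='virtio'/></interface>
    <memballoon model='virtio'/>
  </devices>
</domain>
"""


def migration_relevant_settings(domain_xml: str) -> dict:
    """Summarize domain settings worth comparing between hosts (hypothetical helper)."""
    root = ET.fromstring(domain_xml.strip())
    os_type = root.find("./os/type")
    cpu = root.find("./cpu")
    balloon = root.find("./devices/memballoon")

    # Collect target device names of disks attached on the virtio bus.
    virtio_disks = []
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        if target is not None and target.get("bus") == "virtio":
            virtio_disks.append(target.get("dev"))

    # Count network interfaces whose model is virtio.
    virtio_nics = 0
    for iface in root.findall("./devices/interface"):
        model = iface.find("model")
        if model is not None and model.get("type") == "virtio":
            virtio_nics += 1

    return {
        "machine": os_type.get("machine") if os_type is not None else None,
        # An empty <cpu> element yields None for both mode and model.
        "cpu_mode": cpu.get("mode") if cpu is not None else None,
        "cpu_model": cpu.findtext("model") if cpu is not None else None,
        "timers": [(t.get("name"), t.get("tickpolicy"))
                   for t in root.findall("./clock/timer")],
        "virtio_disks": virtio_disks,
        "virtio_nics": virtio_nics,
        "memballoon": balloon.get("model") if balloon is not None else None,
    }


if __name__ == "__main__":
    for key, value in migration_relevant_settings(SAMPLE).items():
        print(f"{key}: {value}")
```

Running the same function on the XML dumped from the source and destination hosts and comparing the two dictionaries would at least rule out a config mismatch between them.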