Launchpad has imported 27 comments from the remote bug at https://bugzilla.redhat.com/show_bug.cgi?id=1202453.
If you reply to an imported comment from within Launchpad, your comment will be sent to the remote bug automatically. Read more about Launchpad's inter-bugtracker facilities at https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2015-03-16T16:27:06+00:00 Lukas wrote:

Created attachment 1002391
domain.xml

There is a bug in libvirt (built from current master: 51f9f03a4ca50b070c0fbfb29748d49f583e15e1) when live migrating a VM with a big storage attached to it - the migration fails with "error: operation failed: migration job: unexpectedly failed".

Not sure what's the threshold for the storage size to trigger the bug, but a guest with 30GB storage fails to migrate in our test lab.

This only happens when the --tunnelled parameter is passed to "virsh migrate".

# virsh migrate --live --p2p --copy-storage-inc --tunnelled ubuntuutopic "qemu+tcp://lab5/system"
error: operation failed: migration job: unexpectedly failed

# on the other hand, this WORKS OK:
virsh migrate --live --p2p --copy-storage-inc ubuntuutopic "qemu+tcp://lab5/system"

libvirt: current master - 51f9f03a4ca50b070c0fbfb29748d49f583e15e1
qemu: 2.0.0+dfsg-2ubuntu1.10
linux kernel: 3.13.0-46-generic #79-Ubuntu

Versions are the same on both boxes. libvirtd.conf only changed to listen on TCP and not to require authentication.

Logs and the domain xml attached.

Steps to Reproduce:
1. create a new domain on host1 (if you can't reproduce, you might need to create a domain with a bigger storage)
2. setup host2 - precreate an empty qcow2 disk in the corresponding location, change libvirtd config to listen on tcp port
3. run 'virsh migrate --live --p2p --copy-storage-inc --tunnelled GUEST_VM "qemu+tcp://host2/system"' on host1

Actual results:
error: operation failed: migration job: unexpectedly failed

Expected results:
migration succeeds just like when --tunnelled is not used

Domain and logs attached.
Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/5

------------------------------------------------------------------------
On 2015-03-16T16:30:12+00:00 Lukas wrote:

Created attachment 1002394
destination libvirtd log

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/7

------------------------------------------------------------------------
On 2015-03-16T16:30:57+00:00 Lukas wrote:

Created attachment 1002395
source node libvirtd log

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/8

------------------------------------------------------------------------
On 2015-03-16T16:33:57+00:00 Lukas wrote:

Created attachment 1002397
destination libvirt/qemu/guest.log

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/9

------------------------------------------------------------------------
On 2015-03-16T16:57:28+00:00 Jiri wrote:

As the error message from source daemon suggests, the reason is a different way of transferring disk images with p2p vs tunnelled migration. The preferred way is using NBD but this is unfortunately impossible with tunnelled migration. Thus it falls back to the old way of storage migration. I'm not sure how much this older method is supported by QEMU community but you can try to raise the issue with them.

There doesn't seem to be any bug in libvirt here. Except for the lack of NBD support with tunnelled migration. But that's rather a request for new feature.

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/10

------------------------------------------------------------------------
On 2015-03-16T17:26:55+00:00 Lukas wrote:

Thanks for quick answer. Just two things.

1) The migration works fine when --tunnelled is not used. Based on that I'd assume native QEMU migration works fine.

2) Could libvirt provide better errors logs when this fail occurs?
Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/11

------------------------------------------------------------------------
On 2015-03-16T19:16:39+00:00 Jiri wrote:

1) There are two implementations of storage migration in QEMU. The old variant ("migrate -b" monitor command) and the new variant using NBD. The usage of --tunnelled forces libvirt to switch from NBD to the old implementation when asking QEMU to migrate. It's QEMU doing the migration including storage in both cases. According to the logs NBD based storage migration works fine for you while the old implementation doesn't work.

2) The error actually comes from QEMU so unless it provides anything better to us, we can't report it. And there's nothing interesting in the qemu log on destination host, which doesn't make things any better. Can you also check that log file on the source host?

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/12

------------------------------------------------------------------------
On 2015-03-16T22:01:06+00:00 Lukas wrote:

Thanks for clarification.

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/13

------------------------------------------------------------------------
On 2015-03-18T12:50:49+00:00 Lukas wrote:

Hi Jiri

I did more tests at my side and it turns out it very well might be an issue in libvirt so I have to reopen this issue.

I have done the following tests:

Test A)
1) start libvirt on hostA and hostB
2) start GUEST on hostA
3) create an empty disk on hostB with qemu-img create (might not be necessary with recent enough libvirt)
4) start migration using 'virsh migrate --live --p2p --copy-storage-inc --tunnelled GUEST "qemu+tcp://hostB/system"'
# at this point migration fails with "unexpectedly failed"
5) Now I stopped libvirt on hostA and hostB
6) I have manually started qemu with -incoming on hostB
7) I have connected via QMP to hostA and executed "migrate blk=true inc=true uri=tcp:10.0.1.31:49152" in qmp-shell
# migration fails with "unexpectedly failed"

At this point I suspected the problem to be in qemu. However, when I do everything manually the live migration works, i.e.:

Test B)
1) stop libvirt on hostA and hostB
2) start GUEST on hostA
3) create an empty disk on hostB with qemu-img create
4) start qemu with -incoming on hostB
5) connect via QMP to hostA and execute "migrate blk=true inc=true uri=tcp:10.0.1.31:49152" in qmp-shell

migration works!

btw. this issue might be more important than it seems because openstack nova defaults to tunnelled migration.

Thanks!

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/16

------------------------------------------------------------------------
On 2015-03-19T13:13:18+00:00 Jiri wrote:

(In reply to Lukas Vacek from comment #8)
> Test B)
> 1) stop libvirt on hostA and hostB
> 2) start GUEST on hostA
> 3) create an empty disk on hostB with qemu-img create

I see it now. Another difference between migrating storage using NBD vs. the old way is in this step 3. Current libvirt (as of 1.2.13) will precreate the disk on the destination host but only when NBD is used. If it's not used, the files need to be properly created on the destination before starting migration (I think Nova takes care of this). With older libvirt, the files need to exist even if NBD is used.
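
The precreation step described above (an empty image on the destination before a non-NBD storage migration) could be scripted along these lines. This is only a minimal sketch: the helper name `qemu_img_create_argv` and the image path are hypothetical, the 30G size simply mirrors the 30GB guest from the original report, and the function only builds the `qemu-img create` command line without running it.

```python
import shlex


def qemu_img_create_argv(path, size, fmt="qcow2"):
    """Build the argv for creating an empty disk image.

    Hypothetical helper for illustration; it does not execute anything.
    """
    if not path or not size:
        raise ValueError("path and size are required")
    # qemu-img create -f <format> <filename> <size>
    return ["qemu-img", "create", "-f", fmt, path, size]


if __name__ == "__main__":
    # Path and size are assumptions, not values taken from the attached logs.
    argv = qemu_img_create_argv("/var/lib/libvirt/images/guest.qcow2", "30G")
    print(shlex.join(argv))
```

The resulting command would be run on the destination host, at the same path the domain XML uses on the source, before starting the tunnelled migration.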

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/18

------------------------------------------------------------------------
On 2015-03-19T14:08:15+00:00 Lukas wrote:

Agreed. But I don't think it's the cause of the issue because I precreate the files exactly the same way in Test A and Test B.

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/19

------------------------------------------------------------------------
On 2015-03-19T14:14:13+00:00 Jiri wrote:

Heh, I'm blind. Anyway, could you please post the logs I asked for on IRC few days ago? Turn on debug logs (http://wiki.libvirt.org/page/DebugLogs), run the migration and attach libvirtd.log and guest.log files from both source and destination hosts.

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/20

------------------------------------------------------------------------
On 2015-03-31T13:48:05+00:00 Lukas wrote:

Created attachment 1009066
libvirtd source log

log_filters="3:rpc 3:remote 3:util.json 3:util.event 3:node_device 3:util.object 3:util.netlink 3:access"

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/22

------------------------------------------------------------------------
On 2015-03-31T13:48:36+00:00 Lukas wrote:

Created attachment 1009067
libvirtd destination log

log_filters="3:rpc 3:remote 3:util.json 3:util.event 3:node_device 3:util.object 3:util.netlink 3:access"

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/23

------------------------------------------------------------------------
On 2015-03-31T13:51:46+00:00 Lukas wrote:

Created attachment 1009068
new qemu/guest.log on source

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/24

------------------------------------------------------------------------
On 2015-03-31T13:52:01+00:00 Lukas wrote:

Created attachment 1009070
new qemu/guest.log on
destination

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/25

------------------------------------------------------------------------
On 2015-03-31T13:52:53+00:00 Lukas wrote:

First of all, sorry I didn't get to this earlier. We did some reorganizing of our lab env so I could reproduce the test with logs on only now.

It dies with another error now. However, direct qemu migration works as does not-tunnelled libvirt migration.

root@lab1:/var/lib/libvirt# virsh migrate --live --p2p --copy-storage-inc --tunnelled ubuntuutopic "qemu+tcp://lab2/system"
error: Unable to read from monitor: Connection reset by peer

Libvirt debug logs attached.

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/26

------------------------------------------------------------------------
On 2015-04-08T07:38:40+00:00 Kashyap wrote:

(In reply to Lukas Vacek from comment #16)
> First of all, sorry I didn't get to this earlier. We did some reorganizing
> of our lab env so I could reproduce the test with logs on only now.
>
> It dies with another error now. However, direct qemu migration works as does
> not-tunnelled libvirt migration.
>
> root@lab1:/var/lib/libvirt# virsh migrate --live --p2p --copy-storage-inc
> --tunnelled ubuntuutopic "qemu+tcp://lab2/system"
> error: Unable to read from monitor: Connection reset by peer

Just a side question, can you also reproduce it with qemu+ssh? I was just testing a slight variant of the above CLI yesterday with qemu+ssh on Fedora 22, and it worked:

  $ virsh migrate --verbose --copy-storage-all --p2p --live cvm1 \
      qemu+ssh://root@desthost/system

(NOTE: The above assumes root on src can SSH to dst without any password prompt, so, for testing you might want to quickly create SSH keys with empty passphrase, assuming it's a trusted network.)

> Libvirt debug logs attached.

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/27

------------------------------------------------------------------------
On 2015-04-08T08:17:53+00:00 Lukas wrote:

Just wondering, is qemu+tcp working for you or not?

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/28

------------------------------------------------------------------------
On 2015-04-09T11:40:24+00:00 Kashyap wrote:

Yes, qemu+tcp is working for me, I tested four variants (refer further below) with these versions:

  kernel-4.0.0-0.rc5.git4.1.fc22.x86_64
  libvirt-daemon-kvm-1.2.13-2.fc22.x86_64
  qemu-system-x86-2.3.0-0.2.rc1.fc22.x86_64

Config setup
------------

I had this config in destination's libvirtd.conf:

  $ cat /etc/libvirt/libvirtd.conf | grep -v ^$ | grep -v ^#
  listen_tls = 0
  listen_tcp = 1
  auth_tcp = "none"

And started the libvirtd daemon on the destination with:

  $ cat /etc/sysconfig/libvirtd | grep -v ^$ | grep -v ^#
  LIBVIRTD_ARGS="--listen"

Since I'm testing in a trusted network, I also had SSH access (via public/private keys) to root on destination host without any password prompts.
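
The three libvirtd.conf settings listed above can also be checked mechanically. The following is a minimal sketch, assuming a simple `key = value` file with `#` comment lines (it does not handle trailing inline comments); the helper names `parse_libvirtd_conf` and `is_unauthenticated_tcp` are hypothetical and not part of libvirt.

```python
def parse_libvirtd_conf(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments.

    Mirrors the effect of `grep -v ^$ | grep -v ^#` on the file.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key = value line
        settings[key.strip()] = value.strip().strip('"')
    return settings


def is_unauthenticated_tcp(settings):
    """True if the config matches the plain-TCP, no-auth test setup."""
    return (
        settings.get("listen_tls") == "0"
        and settings.get("listen_tcp") == "1"
        and settings.get("auth_tcp") == "none"
    )


SAMPLE = '''
# Test-lab settings only; unauthenticated TCP is for trusted networks.
listen_tls = 0
listen_tcp = 1
auth_tcp = "none"
'''

if __name__ == "__main__":
    print(is_unauthenticated_tcp(parse_libvirtd_conf(SAMPLE)))
```

Running the same check against the file on both hosts is a quick way to confirm that the source and destination daemons are configured identically before comparing migration results.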

Tests
-----

I just tested three variants of migration with qemu+tcp, successfully:

(1) Native migration, client to two libvirtd servers

  $ virsh migrate --verbose --copy-storage-all \
      --live cvm1 qemu+tcp://kashyapc@devstack3/system

(2) Native migration, client to and peer2peer between, two libvirtd servers

  $ virsh migrate --verbose --copy-storage-all \
      --p2p --live cvm1 qemu+tcp://kashyapc@devstack3/system

(3) Tunnelled migration, client and peer2peer between two libvirtd servers

  $ virsh migrate --verbose --copy-storage-all \
      --p2p --tunnelled --live cvm1 qemu+tcp://kashyapc@devstack3/system

Successful libvirtd log (with debug filter set) for the 3rd variant:

  https://kashyapc.fedorapeople.org/virt/temp/tunnelled-p2p-migration-qemu-tcp-libvirtd-log.txt

Additionally, I also tested the below (without explicit '--copy-storage-all' flag, it works too):

  $ virsh migrate --verbose --p2p --tunnelled \
      --live cvm1 qemu+tcp://kashyapc@devstack3/system

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/29

------------------------------------------------------------------------
On 2015-04-09T12:00:07+00:00 Kashyap wrote:

Closing the bug, per comment #19. Feel free to reopen in case you can provide a reliable reproducer with appropriate logs.

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/30

------------------------------------------------------------------------
On 2015-04-09T12:04:37+00:00 Lukas wrote:

I'd like to test with qemu+ssh but after I have provided the debug logs I have downgraded qemu on our lab boxes. I think it would be best to raise a separate issue for the problem with qemu+ssh.

Thanks,
Lucas

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/31

------------------------------------------------------------------------
On 2015-04-09T12:17:30+00:00 Kashyap wrote:

(In reply to Kashyap Chamarthy from comment #19)

[. . .]

[Just correcting the terminology for migration scenarios (2) and (3).]

Assuming I'm reading this doc correctly. (Libvirt devs, please correct me if I'm wrong.)

  http://libvirt.org/migration.html#scenarios

> Tests
> -----
>
> I just tested three variants of migration with qemu+tcp, successfully:
>
>
> (1) Native migration, client to two libvirtd servers
>
>   $ virsh migrate --verbose --copy-storage-all \
>       --live cvm1 qemu+tcp://kashyapc@devstack3/system
>
> (2) Native migration, client to and peer2peer between, two libvirtd servers

The below is called "Native migration, peer2peer between two libvirtd servers"

Refer: http://libvirt.org/migration.html#nativepeer2peer

>
>   $ virsh migrate --verbose --copy-storage-all \
>       --p2p --live cvm1 qemu+tcp://kashyapc@devstack3/system
>
> (3) Tunnelled migration, client and peer2peer between two libvirtd servers

The below is called "Tunnelled migration, peer2peer between two libvirtd servers"

Refer: http://libvirt.org/migration.html#scenariotunnelpeer2peer2

>
>   $ virsh migrate --verbose --copy-storage-all \
>       --p2p --tunnelled --live cvm1 qemu+tcp://kashyapc@devstack3/system

[. . .]

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/32

------------------------------------------------------------------------
On 2015-04-27T11:22:24+00:00 Lukas wrote:

bump

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/33

------------------------------------------------------------------------
On 2015-05-28T21:10:36+00:00 Frank wrote:

Hi,

I ran into the same problem and using qemu+tcp instead of qemu+ssh solved it. However, it took me a lot of hours to figure this out. :(

I like to add that the error seems to depend on the VMs workload. I was able to reproduce the error with a higher workload while the live migration worked fine with a lighter workload.

Best,
Frank

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/34

------------------------------------------------------------------------
On 2016-04-10T20:43:11+00:00 Cole wrote:

It's been a while since the last report. Is anyone still seeing this with more recent libvirt + distro?

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/37

------------------------------------------------------------------------
On 2016-05-02T14:34:28+00:00 Cole wrote:

Since there's no response, closing as DEFERRED. But if anyone is still affected with newer libvirt versions, please re-open and we can triage from there

Reply at: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1432630/comments/38

** Changed in: libvirt
       Status: Unknown => Won't Fix

** Changed in: libvirt
   Importance: Unknown => Medium

https://bugs.launchpad.net/bugs/1432630

Title:
  libvirt tunnelled migration fails with "migration job: unexpectedly failed"
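
For anyone retesting this report, the failing and working invocations from the thread differ only in the --tunnelled flag. The sketch below builds both command lines side by side so they can be run back to back; the helper name `virsh_migrate_argv` is hypothetical, only flags that appear in the comments above are used, and nothing is executed.

```python
import shlex


def virsh_migrate_argv(domain, dest_uri, tunnelled=False, storage="inc"):
    """Build a `virsh migrate` argv using only flags quoted in this report.

    storage: "inc" for --copy-storage-inc, "all" for --copy-storage-all,
    or None to omit the storage-copy flag.
    """
    argv = ["virsh", "migrate", "--live", "--p2p"]
    if storage == "inc":
        argv.append("--copy-storage-inc")
    elif storage == "all":
        argv.append("--copy-storage-all")
    elif storage is not None:
        raise ValueError("storage must be 'inc', 'all' or None")
    if tunnelled:
        # Per the discussion above, this is the flag that moves libvirt
        # off NBD and onto the older QEMU storage migration path.
        argv.append("--tunnelled")
    argv += [domain, dest_uri]
    return argv


if __name__ == "__main__":
    for tunnelled in (True, False):
        argv = virsh_migrate_argv(
            "ubuntuutopic", "qemu+tcp://lab5/system", tunnelled=tunnelled
        )
        print(shlex.join(argv))
```

Keeping every other argument identical makes it easier to attribute a difference in outcome to the tunnelled path rather than to the domain, the URI, or the storage-copy mode.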