On 09/03/2015 05:37 AM, Ian Campbell wrote:
On Thu, 2015-09-03 at 11:26 +0100, Ian Campbell wrote:
Notice that it has bound to 127.0.1.1 and not to 10.80.228.77!
So while I investigate how to make d-i not create these entries I also
removed the line from /etc/hosts such that looking up the FQDN gives the
non-local IP. But:
root@moss-bug:/var/log# strace -o /tmp/virsh -fff virsh --debug 0 migrate
--live debian.guest.osstest xen+ssh://10.80.228.77
migrate: live(bool): (none)
migrate: domain(optdata): debian.guest.osstest
migrate: desturi(optdata): xen+ssh://10.80.228.77
migrate: found option <domain>: debian.guest.osstest
migrate: <domain> trying as domain NAME
migrate: found option <domain>: debian.guest.osstest
migrate: <domain> trying as domain NAME
error: internal error: Failed to send migration data to destination host
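As an aside on the /etc/hosts point above: a minimal sketch for checking whether a host's FQDN still resolves to a loopback address such as 127.0.1.1. The helper names (is_loopback, fqdn_addresses) are made up for illustration and are not part of libvirt or osstest; only the Python standard library is used.

```python
import ipaddress
import socket


def is_loopback(addr: str) -> bool:
    """Return True if addr lies in a loopback range (e.g. 127.0.0.0/8)."""
    return ipaddress.ip_address(addr).is_loopback


def fqdn_addresses(name=None):
    """Resolve name (default: this host's FQDN) to a sorted list of IPv4 addresses.

    Hypothetical helper: it goes through the system resolver, so
    /etc/hosts entries are honoured in the usual nsswitch order.
    """
    name = name or socket.getfqdn()
    infos = socket.getaddrinfo(name, None, socket.AF_INET)
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Flag addresses that a migration peer could not connect back to.
    for addr in fqdn_addresses():
        note = "loopback, not reachable by peers" if is_loopback(addr) else "ok"
        print(f"{addr}: {note}")
```

Run on each host before migrating; if the FQDN reports a loopback address, the /etc/hosts entry is presumably still in effect.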
The sender's libxl-driver.log says:
2015-09-03 12:29:45 BST libxl-save-helper: debug: starting save: Success
2015-09-03 12:29:45 BST xc: detail: fd 27, dom 3, max_iters 0, max_factor 0,
flags 1, hvm 0
2015-09-03 12:29:45 BST xc: info: Saving domain 3, type x86 PV
2015-09-03 12:29:45 BST xc: detail: 64 bits, 4 levels
2015-09-03 12:29:45 BST xc: detail: max_pfn 0x1ffff, p2m_frames 256
2015-09-03 12:29:45 BST xc: detail: max_mfn 0x120000
2015-09-03 12:29:46 BST xc: error: Failed to write page data to stream (104 =
Connection reset by peer): Internal error
2015-09-03 12:29:46 BST xc: error: Save failed (104 = Connection reset by
peer): Internal error
2015-09-03 12:29:46 BST libxl-save-helper: debug: complete r=-1: Connection
reset by peer
2015-09-03 12:29:46 BST libxl: error:
libxl_stream_write.c:329:libxl__xc_domain_save_done: saving domain: domain did
not respond to suspend request: Connection reset by peer
2015-09-03 12:29:46 BST libxl: debug: libxl_event.c:1874:libxl__ao_complete: ao
0x7f67b3f63e90: complete, rc=-8
2015-09-03 12:29:46 BST libxl: debug: libxl_event.c:1843:libxl__ao__destroy: ao
0x7f67b3f63e90: destroy
2015-09-03 12:29:46 BST libxl: debug: libxl.c:526:libxl_domain_resume: ao
0x7f67b3fa44b0: create: how=(nil) callback=(nil) poller=0x7f67a0002610
2015-09-03 12:29:46 BST xc: error: Dom 3 not suspended: (shutdown 0, reason
255): Internal error
2015-09-03 12:29:46 BST libxl: error:
libxl_dom_suspend.c:409:libxl__domain_resume: xc_domain_resume failed for
domain 3: Invalid argument
2015-09-03 12:29:46 BST libxl: debug: libxl_event.c:1874:libxl__ao_complete: ao
0x7f67b3fa44b0: complete, rc=-3
2015-09-03 12:29:46 BST libxl: debug: libxl.c:529:libxl_domain_resume: ao
0x7f67b3fa44b0: inprogress: poller=0x7f67a0002610, flags=ic
2015-09-03 12:29:46 BST libxl: debug: libxl_event.c:1843:libxl__ao__destroy: ao
0x7f67b3fa44b0: destroy
While the receiver has:
2015-09-03 12:29:45 BST libxl-save-helper: debug: starting restore: Success
2015-09-03 12:29:45 BST xc: detail: fd 31, dom 4, hvm 0, pae 0, superpages 0,
checkpointed_stream 0
2015-09-03 12:29:45 BST xc: info: Found x86 PV domain from Xen 4.6
2015-09-03 12:29:45 BST xc: info: Restoring domain
2015-09-03 12:29:45 BST xc: detail: 64 bits, 4 levels
2015-09-03 12:29:45 BST xc: detail: max_mfn 0x120000
2015-09-03 12:29:45 BST xc: detail: Expanded p2m from 0 to 0x1ffff
2015-09-03 12:29:45 BST xc: error: Failed to read 4202504 bytes of data for
record (0x00000001, Page data) (11 = Resource temporarily unavailabl): Internal
error
2015-09-03 12:29:45 BST xc: error: Restore failed (11 = Resource temporarily
unavailabl): Internal error
2015-09-03 12:29:45 BST libxl-save-helper: debug: complete r=-1: Resource
temporarily unavailable
2015-09-03 12:29:45 BST libxl: error:
libxl_stream_read.c:749:libxl__xc_domain_restore_done: restoring domain:
Resource temporarily unavailable
2015-09-03 12:29:45 BST libxl: error:
libxl_create.c:1141:domcreate_rebuild_done: cannot (re-)build domain: -3
2015-09-03 12:29:46 BST libxl: debug: libxl.c:1708:devices_destroy_cb: forked
pid 18738 for destroy of domain 4
2015-09-03 12:29:46 BST libxl: debug: libxl_event.c:1874:libxl__ao_complete: ao
0x7fbb7687e900: complete, rc=-3
2015-09-03 12:29:46 BST libxl: debug: libxl_event.c:1843:libxl__ao__destroy: ao
0x7fbb7687e900: destroy
"xc: error: Failed to write page data to stream (104 = Connection reset by
peer): Internal error" seems to be the initial failure.
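To pick the first error out of logs like the ones quoted above without reading them by eye, something along these lines might help. It is only a sketch: the line format is assumed from the excerpts in this mail (timestamp, timezone, component, level, message), the names records and first_error are hypothetical, and lines that do not start with a timestamp are treated as wrapped continuations of the previous record.

```python
import re

# Assumed record header, based only on the excerpts in this thread:
#   2015-09-03 12:29:46 BST xc: error: <message>
HEADER = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w+) ([\w-]+): (\w+): ?(.*)$"
)


def records(text):
    """Yield (timestamp, component, level, message), re-joining wrapped lines."""
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            if current is not None:
                yield tuple(current)
            # Group 2 is the timezone, which is not needed here.
            current = [match.group(1), match.group(3),
                       match.group(4), match.group(5)]
        elif current is not None:
            # No timestamp: continuation of the previous record.
            current[3] = (current[3] + " " + line).strip()
    if current is not None:
        yield tuple(current)


def first_error(text):
    """Return the first record whose level is 'error', or None."""
    for record in records(text):
        if record[2] == "error":
            return record
    return None
```

Note that the timestamps only have one-second resolution, so comparing the first error on the sender with the first error on the receiver gives at best a rough ordering.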
I wonder if this has anything to do with migration V2? I noticed a migration
regression a few days back, but later realized that the sender was 4.5 and
receiver was 4.6. I planned to see if migration worked through libvirt between
two 4.6 hosts, but before doing so I had to re-purpose the machines for another
task. I think libvirt needs some work to accommodate migration V2...
Regards,
Jim