Comments inline. FYI: please CC mrhi...@us.ibm.com,
because it helps me know when to scroll through the bazillion qemu-devel
emails.
I have things separated out into folders and rules, but a direct CC is
better =)
On 05/03/2013 07:28 PM, Chegu Vinod wrote:
Hi Michael,
I picked up the qemu bits from your github branch and gave it a try.
(BTW the setup I was given temporary access to has a pair of MLX's IB
QDR cards connected back to back via QSFP cables)
Observed a couple of things and wanted to share... perhaps you may be
aware of them already, or perhaps they are unrelated to your specific
changes? (Note: I still haven't finished reviewing your changes.)
a) x-rdma-pin-all off case
Seems to work only sometimes but fails at other times. Here is an
example...
(qemu) rdma: Accepting rdma connection...
rdma: Memory pin all: disabled
rdma: verbs context after listen: 0x555556757d50
rdma: dest_connect Source GID: fe80::2:c903:9:53a5, Dest GID:
fe80::2:c903:9:5855
rdma: Accepted migration
qemu-system-x86_64: VQ 1 size 0x100 Guest index 0x4d2 inconsistent
with Host index 0x4ec: delta 0xffe6
qemu: warning: error while loading state for instance 0x0 of device
'virtio-net'
load of migration failed
Can you give me more details about the configuration of your VM?
b) x-rdma-pin-all on case:
The guest is not resuming on the target host, i.e. the source host's
qemu states that the migration is complete, but the guest is no longer
responsive (it doesn't seem to have crashed, but it's stuck somewhere).
Have you seen this behavior before ? Any tips on how I could extract
additional info ?
Is the QEMU monitor still responsive?
Can you capture a screenshot of the guest's console to see if there is a
panic?
What kind of storage is attached to the VM?
Besides the noted restrictions/issues around having to pin all of
guest memory... if the pinning is done as part of starting the
migration, it ends up taking a noticeably long time for larger guests.
I wonder whether that should be counted as part of the total migration
time?
That's a good question. The pin-all option should not be slowing down
your VM too much, as the VM should still be running until the
migration_thread() actually kicks in and starts the migration.
I need more information on the configuration of your VM: guest
operating system, architecture, and so forth. And, as before, whether
it's QEMU that is unresponsive or the guest that has panicked.
Also, the act of pinning all the memory seems to "freeze" the guest.
E.g.: for larger enterprise-sized guests (say 128GB and higher), the
guest is "frozen" for anywhere from nearly a minute (~50 seconds) to
multiple minutes as the guest size increases... which IMO kind of
defeats the purpose of live guest migration.
That's bad =) There must be a bug somewhere... the largest VM I can
create on my hardware is ~16GB, so let me give that a try and try to
track down the problem.
I'd like to hear whether you have already thought about other
alternatives to address this issue. For example, would it be better to
pin all of the guest's memory as part of starting the guest itself?
Yes, there are restrictions when we do pinning... but it can help with
performance.
For such a large VM, I would definitely recommend pinning, because I'm
assuming you have enough processors or a large enough application to
actually *use* that much memory, which suggests that even after the
bulk phase of the migration has completed, your VM is probably going
to remain pretty busy.
It's just a matter of me tracking down what's causing the freeze and
fixing it... I'll look into it right now on my machine.
---
BTW, a different (yet sort of related) topic... recently a patch went
into upstream that provided an option to qemu to mlock all of guest
memory :
https://lists.gnu.org/archive/html/qemu-devel/2013-04/msg03947.html .
I had no idea... very interesting.
but when attempting to mlock for larger guests, a lot of time is
spent bringing each page into cache and clearing/zeroing it, etc.
https://lists.gnu.org/archive/html/qemu-devel/2013-04/msg04161.html
Wow, I didn't know that either. Perhaps that is what's causing the
entire QEMU process and its threads to seize up.
It may be necessary to run the pinning command *outside* of QEMU's I/O
lock in a separate thread if it's really that much overhead.
Thanks a lot for pointing this out.
----
Note: Basic TCP-based live guest migration with the same qemu version
still works fine on the same hosts over a pair of non-RDMA 10Gb NICs
connected back-to-back.
Acknowledged.