On 12/01/2010 11:01 AM, Avi Kivity wrote:
On 12/01/2010 06:56 PM, Anthony Liguori wrote:
On 12/01/2010 10:52 AM, Avi Kivity wrote:
On 12/01/2010 06:49 PM, Anthony Liguori wrote:
We need actual measurements instead of speculations.
Yes, I agree 100%. I think the place to start is what I suggested
in a previous note in this thread: we need to measure actual stall
time in the guest.
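
For concreteness, here is a minimal sketch (not from the thread; the
interval and threshold are arbitrary) of one way to measure stall time
from inside the guest: ask for short naps and report whenever the
observed gap is much longer than requested, which puts a lower bound
on how long the vcpu was off the cpu.

/* Guest-side stall detector sketch.  CLOCK_MONOTONIC keeps advancing
 * while the vcpu is descheduled, so an oversized gap indicates a stall. */
#include <inttypes.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    const int64_t threshold_ns = 10000000;   /* report gaps > 10 ms */
    struct timespec prev, now;

    clock_gettime(CLOCK_MONOTONIC, &prev);
    for (;;) {
        usleep(1000);                         /* ask for a 1 ms nap */
        clock_gettime(CLOCK_MONOTONIC, &now);
        int64_t delta = (now.tv_sec - prev.tv_sec) * 1000000000LL
                        + (now.tv_nsec - prev.tv_nsec);
        if (delta > threshold_ns) {
            printf("stall: slept %" PRId64 " ms instead of 1 ms\n",
                   delta / 1000000);
        }
        prev = now;
    }
}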
I'd actually start at the host. How much time does
ioctl(KVM_GET_DIRTY_LOG) take? What's the percentage of time
qemu_mutex is held?
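
As an illustration of the first measurement (the function and variable
names below are placeholders, not actual QEMU identifiers), the ioctl
can be wrapped with CLOCK_MONOTONIC timestamps at its call site:

/* Sketch: time each KVM_GET_DIRTY_LOG call and log the latency. */
#include <inttypes.h>
#include <stdio.h>
#include <time.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

static int timed_get_dirty_log(int vm_fd, struct kvm_dirty_log *log)
{
    int64_t start = now_ns();
    int ret = ioctl(vm_fd, KVM_GET_DIRTY_LOG, log);

    fprintf(stderr, "KVM_GET_DIRTY_LOG slot %u: %" PRId64 " ns\n",
            log->slot, now_ns() - start);
    return ret;
}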
The question is, what are the actual symptoms of the problem? It's
not necessarily a bad thing if KVM_GET_DIRTY_LOG takes a long time
while qemu_mutex is held.
Whether or not qemu_mutex is held, long KVM_GET_DIRTY_LOG runtimes
are bad, since they are a lower bound on your downtime. And
KVM_GET_DIRTY_LOG does a lot of work, and invokes
synchronize_srcu_expedited(), which can be very slow.
That's fine, and you're right, it's a useful thing to do, but this
series originated because of a problem and we ought to make sure we
capture what the actual problem is. That's not to say we shouldn't
improve things that could stand to be improved.
Is the problem that the monitor responds slowly? Is the problem that
the guest isn't consistently getting execution time? Is the problem
simply that the guest isn't getting enough total execution time?
All three can happen if qemu_mutex is held too long.
Right, but I'm starting to think that the root of the problem is not
that it's being held too long but that it's being held too often.
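
One rough way to tell the two apart, sketched here with placeholder
names rather than the real qemu_mutex helpers, is to wrap lock/unlock
so that both the acquisition count and the hold times are accumulated:

/* Sketch: count acquisitions and track total/maximum hold time for a
 * big lock.  All counters are updated while the lock is held, so no
 * extra synchronization is needed for them. */
#include <inttypes.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static int64_t acquisitions, total_held_ns, max_held_ns, lock_start_ns;

static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

static void big_lock_acquire(void)
{
    pthread_mutex_lock(&big_lock);
    acquisitions++;
    lock_start_ns = now_ns();
}

static void big_lock_release(void)
{
    int64_t held = now_ns() - lock_start_ns;

    total_held_ns += held;
    if (held > max_held_ns) {
        max_held_ns = held;
    }
    pthread_mutex_unlock(&big_lock);
}

static void big_lock_report(void)
{
    fprintf(stderr,
            "acquired %" PRId64 " times, total %" PRId64 " ms, max %" PRId64 " ms\n",
            acquisitions, total_held_ns / 1000000, max_held_ns / 1000000);
}

Many short acquisitions with a small maximum hold time would point at
"too often"; a handful of long ones would point at "too long".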
Regards,
Anthony Liguori
The third can happen for other reasons (like KVM_GET_DIRTY_LOG
holding the mmu spinlock for too long, which can be fixed with O(1)
write protection).
I think we need to identify exactly what the problem is before we
look for sources.