On 02/25/2010 11:33 AM, Avi Kivity wrote:
On 02/25/2010 07:15 PM, Anthony Liguori wrote:
I agree. Further, once we fine-grain device threading, the iothread
essentially disappears and is replaced by device-specific threads.
There's no "idle" anymore.
That's a nice idea, but how is io dispatch handled? Is everything
synchronous or do we continue to program asynchronously?
Simple stuff can be kept asynchronous, complex stuff (like qcow2)
ought to be made synchronous (it uses threads anyway, so we don't lose
anything). Stuff like vnc can go either way.
We've discussed this before and I still contend that threads do not make
qcow2 any simpler.
It's very difficult to mix concepts.
We're complicated enough to have conflicting requirements and a large
code base with its own inertia, so no choice really.
I personally don't anticipate per-device threading but rather
anticipate re-entrant device models. I would expect all I/O to be
dispatched within the I/O thread and the VCPU threads to be able to
execute device models simultaneously with the I/O thread.
That means long-running operations on the iothread can lock out other
completions.
Candidates for their own threads are:
- live migration
- block format drivers (except linux-aio, perhaps have a thread for
the aio completion handler)
- vnc
- sdl
- sound?
- hotplug, esp. memory
Each such thread could run the same loop as the iothread. Any
pollable fd or timer would be associated with a thread, so things
continue as normal more or less. Unassociated objects continue with
the main iothread.
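
A rough sketch of what "the same loop as the iothread" could look like when
each thread owns its own set of fds, in plain C on top of poll(). Every name
here (struct event_loop, loop_add_fd, loop_run_once, count_and_drain) is made
up for illustration and is not QEMU's API. Timers are left out, and a real
version would need locking around the handler table if other threads are
allowed to register fds.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_HANDLERS 8

typedef void (*fd_handler)(int fd, void *opaque);

/* Hypothetical per-thread loop state: one instance per thread. */
struct event_loop {
    struct pollfd fds[MAX_HANDLERS];
    fd_handler cb[MAX_HANDLERS];
    void *opaque[MAX_HANDLERS];
    nfds_t nfds;
};

void loop_init(struct event_loop *l)
{
    l->nfds = 0;
}

/* Associate a readable fd with this loop; returns 0, or -1 if full. */
int loop_add_fd(struct event_loop *l, int fd, fd_handler cb, void *opaque)
{
    if (l->nfds >= MAX_HANDLERS) {
        return -1;
    }
    l->fds[l->nfds].fd = fd;
    l->fds[l->nfds].events = POLLIN;
    l->fds[l->nfds].revents = 0;
    l->cb[l->nfds] = cb;
    l->opaque[l->nfds] = opaque;
    l->nfds++;
    return 0;
}

/*
 * One loop iteration: wait up to timeout_ms, then dispatch every handler
 * whose fd is readable. Returns the number of handlers run, 0 on timeout,
 * or -1 if poll() failed.
 */
int loop_run_once(struct event_loop *l, int timeout_ms)
{
    int dispatched = 0;
    int n = poll(l->fds, l->nfds, timeout_ms);

    if (n <= 0) {
        return n;
    }
    for (nfds_t i = 0; i < l->nfds; i++) {
        if (l->fds[i].revents & POLLIN) {
            l->cb[i](l->fds[i].fd, l->opaque[i]);
            dispatched++;
        }
    }
    return dispatched;
}

/* Example handler: consume one byte and bump the counter behind opaque. */
void count_and_drain(int fd, void *opaque)
{
    char byte;

    if (read(fd, &byte, 1) == 1) {
        (*(int *)opaque)++;
    }
}
```

A migration or vnc thread would then just call loop_run_once() in a loop on
its own struct event_loop, and anything not registered there would stay with
the main iothread's instance.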
Is the point latency or increasing available CPU resources? If the
device models are re-entrant, that reduces a ton of the demand on the
qemu_mutex, which means that the IO thread can run uncontended. While we
have evidence that the VCPU threads and IO threads are competing with
each other today, I don't think we have any evidence to suggest that the
IO thread is starving itself with long-running events.
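
To make "re-entrant device model" concrete, here is a minimal sketch in C with
pthreads, assuming a toy device whose state is guarded by its own lock so that
callers never need a global mutex. The device (struct counter_dev), its
functions, and vcpu_thread are hypothetical stand-ins, not QEMU code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define WRITES_PER_THREAD 10000

/* Hypothetical device: one accumulator register with a per-device lock. */
struct counter_dev {
    pthread_mutex_t lock;
    uint32_t count;
};

void counter_dev_init(struct counter_dev *d)
{
    pthread_mutex_init(&d->lock, NULL);
    d->count = 0;
}

/* Register write: adds val to the counter. Safe to call from any thread. */
void counter_dev_write(struct counter_dev *d, uint32_t val)
{
    pthread_mutex_lock(&d->lock);
    d->count += val;
    pthread_mutex_unlock(&d->lock);
}

/* Register read: returns the current counter. Safe from any thread. */
uint32_t counter_dev_read(struct counter_dev *d)
{
    uint32_t v;

    pthread_mutex_lock(&d->lock);
    v = d->count;
    pthread_mutex_unlock(&d->lock);
    return v;
}

/* Stand-in for a VCPU thread hitting the device while others do the same. */
void *vcpu_thread(void *opaque)
{
    struct counter_dev *d = opaque;

    for (int i = 0; i < WRITES_PER_THREAD; i++) {
        counter_dev_write(d, 1);
    }
    return NULL;
}
```

Since the only lock taken is the device's own, two such threads and the I/O
thread contend only when they touch the same device, which is the reduction
in global mutex demand described above.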
With the device model, I'd like to see us move toward a very well
defined API for each device to use. Part of the reason for this is to
limit the scope of the devices in such a way that we can enforce this at
compile time. Then we can introduce locking within devices with some
level of guarantee that we've covered the API that devices are actually
consuming.
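
One way to get that kind of compile-time enforcement in C is an opaque handle:
the device only ever sees a forward declaration and a handful of functions, so
any attempt to reach into core state fails to compile. The sketch below is an
assumption about how this might look, squeezed into a single file; struct
dev_api, dev_api_set_irq, and the toy_uart device are invented names, not
existing QEMU interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* ---- What a device would be allowed to include (hypothetical header) ---- */

struct dev_api;                                  /* opaque to devices */
void dev_api_set_irq(struct dev_api *api, int level);

/* ---- Device model: compiled against the declarations above only ---- */

struct toy_uart {
    struct dev_api *api;
    uint8_t rx;
    int rx_full;
};

/* A byte arrives from the host side: latch it and raise the interrupt. */
void toy_uart_receive(struct toy_uart *u, uint8_t byte)
{
    u->rx = byte;
    u->rx_full = 1;
    dev_api_set_irq(u->api, 1);
    /* u->api->irq_level = 1; would not compile here: incomplete type. */
}

/* Guest reads the data register: hand over the byte and lower the line. */
uint8_t toy_uart_read(struct toy_uart *u)
{
    u->rx_full = 0;
    dev_api_set_irq(u->api, 0);
    return u->rx;
}

/* ---- Core side: the only place that knows the layout ---- */

struct dev_api {
    int irq_level;
    unsigned calls;      /* counts how often devices used the API */
};

void dev_api_set_irq(struct dev_api *api, int level)
{
    /* The single entry point is also the natural place to add locking. */
    api->irq_level = level;
    api->calls++;
}
```

Because every path from the device into the core goes through dev_api_*
functions, adding a lock inside them covers everything the device can
actually consume.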
For host services though, it's much more difficult to isolate them like
this. I'm not necessarily claiming that this will never be the right
thing to do, but I don't think we really have the evidence today to
suggest that we should focus on this in the short term.
Regards,
Anthony Liguori