On 03/30/2010 12:23 AM, Anthony Liguori wrote:
It's not sufficient. If you have a single thread that runs both live
migrations and timers, then timers will be backlogged behind live
migration, or you'll have to yield often. This is regardless of the
locking model (and of course having threads without fixing the
locking is insufficient as well; live migration accesses guest memory,
so it needs the big qemu lock).
But what's the solution? Sending every timer in a separate thread?
We'll hit the same problem if we implement an arbitrary limit to
number of threads.
A completion that's expected to take a couple of microseconds at most
can live in the iothread. A completion that's expected to take a couple
of milliseconds wants its own thread. We'll have to think about
anything in between.
vnc and migration can perform large amounts of work in a single
completion; they're limited only by the socket send rate and our
internal rate-limiting, both of which are outside our control. Most device
timers are O(1). virtio completions probably fall into the annoying
"have to think about it" department.
What I'm skeptical of is whether converting virtio-9p or qcow2 to
handle each request in a separate thread is really going to improve
things.
Currently qcow2 isn't even fully asynchronous, so it can't fail to
improve things.
Unless it introduces more data corruptions which is my concern with
any significant change to qcow2.
It's possible to move qcow2 to a thread without any significant change
to it (simply run the current code in its own thread, protected by a
mutex). Further changes would be very incremental.
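Roughly like this (just a sketch; Qcow2Request, qcow2_dequeue_request and
the other helpers are invented names, not anything in the tree):

    #include <pthread.h>

    /* sketch: run the existing synchronous qcow2 code in one worker
     * thread, serialized by a single mutex, so qcow2 itself is untouched */
    static pthread_mutex_t qcow2_lock = PTHREAD_MUTEX_INITIALIZER;

    static void *qcow2_worker(void *opaque)
    {
        Qcow2Request *req;

        for (;;) {
            req = qcow2_dequeue_request();      /* blocks until work arrives */
            pthread_mutex_lock(&qcow2_lock);
            req->ret = qcow2_do_request(req);   /* today's unmodified code */
            pthread_mutex_unlock(&qcow2_lock);
            qcow2_signal_completion(req);       /* e.g. kick a bottom half */
        }
        return NULL;
    }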
The VNC server is another area where I think multithreading would be
a bad idea.
If the vnc server is stuffing a few megabytes of screen into a
socket, then timers will be delayed behind it, unless you litter the
code with calls to bottom halves. Even worse if it does complicated
compression and encryption.
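"Littering the code with bottom halves" would look something like this;
qemu_bh_new()/qemu_bh_schedule() are the existing bottom-half API, the vnc
helpers and fields are invented for illustration:

    /* sketch: chop a long vnc update into bottom-half-sized chunks so
     * the iothread can run timers in between */
    static void vnc_send_some(void *opaque)
    {
        VncState *vs = opaque;

        vnc_send_one_rectangle(vs);         /* bounded amount of work */
        if (vnc_has_more_rectangles(vs)) {
            qemu_bh_schedule(vs->send_bh);  /* come back in the next main loop iteration */
        }
    }

    /* setup, once per client: vs->send_bh = qemu_bh_new(vnc_send_some, vs); */

Multiply that by every slow path and you see the clutter I mean.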
Sticking the VNC server in its own thread would be fine. Trying to
make the VNC server multithreaded, though, would be problematic.
Why would it be problematic? Each client gets its own threads; they
don't interact at all, do they?
I don't see a need to do it though (beyond dropping it into a thread).
Basically, sticking isolated components in a single thread should be
pretty reasonable.
Now you're doomed. It's easy to declare things "isolated components"
one by one; pretty soon the main loop will be gone.
But if those system calls are blocking, you need a thread?
You can dispatch just the system call to a thread pool. The
advantage of doing that is that you don't need to worry about
locking since the system calls are not (usually) handling shared state.
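Roughly the pattern posix-aio-compat.c already uses for block I/O; a
minimal sketch (SyscallRequest and queue_completion are invented names):

    #include <unistd.h>
    #include <errno.h>

    /* sketch: only the blocking syscall runs in the worker; completion
     * is handed back to the iothread, so no qemu state is touched here */
    static void *syscall_worker(void *opaque)
    {
        SyscallRequest *req = opaque;

        req->ret = pread(req->fd, req->buf, req->len, req->offset);
        req->err = req->ret < 0 ? errno : 0;
        queue_completion(req);  /* e.g. write to a pipe/eventfd to wake the iothread */
        return NULL;
    }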
There is always implied shared state. If you're doing direct guest
memory access, you need to lock memory against hotunplug, or the
syscall will end up writing into freed memory. If the device can be
hotunplugged, you need to make sure all threads have returned before
unplugging it.
There are other ways to handle hot unplug (like reference counting)
that avoid this problem.
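Something like this (a sketch only; the field and helper names are made
up): take a reference before handing a request to a worker, and let the
unplug complete only when the count drops to zero:

    /* sketch: per-device reference count guarding hot unplug */
    static void device_get(Device *dev)
    {
        __sync_fetch_and_add(&dev->refcount, 1);
    }

    static void device_put(Device *dev)
    {
        if (__sync_sub_and_fetch(&dev->refcount, 1) == 0 &&
            dev->unplug_pending) {
            complete_unplug(dev);   /* no worker is touching the device any more */
        }
    }

Workers bracket their guest memory accesses with device_get()/device_put(),
and the unplug path just sets unplug_pending and drops its own reference.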
That's just more clever locking.
Ultimately, this comes down to a question of lock granularity and
thread granularity. I don't think it's a good idea to start with the
assumption that we want extremely fine granularity. There's certainly
very low hanging fruit with respect to threading.
Sure. Currently the hotspots are block devices (except raw) and hpet
(seen with large Windows guests). The latter includes the bus lookup
and hpet itself; hpet reads can be performed locklessly if we're clever.
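For example (purely a sketch of the idea; the seqlock helpers are
kernel-style names that don't exist in qemu, and the fields are
hypothetical): a counter read only needs a consistent snapshot of the
enable bit and the offset, which a seqlock can give without the global lock:

    /* sketch: lockless read of the hpet main counter */
    static uint64_t hpet_read_counter(HPETState *s)
    {
        unsigned seq;
        uint64_t ticks;

        do {
            seq = read_seqbegin(&s->seqlock);   /* hypothetical helper */
            if (s->enabled) {
                ticks = ns_to_hpet_ticks(host_clock_ns() + s->hpet_offset);
            } else {
                ticks = s->frozen_counter;
            }
        } while (read_seqretry(&s->seqlock, seq));

        return ticks;
    }

Writers (guest config writes, vm state changes) take the write side; reads
never block.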
On a philosophical note, threads may make it easier to model complex
hardware that includes a processor, for example our scsi card (and
how about using tcg as a jit to boost it? :)
Yeah, it's hard to argue that script evaluation shouldn't be done in
a thread. But that doesn't prevent me from being very cautious
about how and where we use threading :-)
Caution where threads are involved is a good thing. They are
inevitable however, IMO.
We already are using threads so they aren't just inevitable, they're
reality. I still don't think using threads would significantly
simplify virtio-9p.
I meant, exposing qemu core to the threads instead of pretending they
aren't there. I'm not familiar with 9p so don't hold much of an
opinion, but didn't you say you need threads in order to handle async
syscalls? That may not be the deep threading we're discussing here.
btw, IIUC currently disk hotunplug will stall a guest, no? We need
async aio_flush().
--
Do not meddle in the internals of kernels, for they are subtle and quick to
panic.