On 03/29/2010 01:36 AM, jvrao wrote:
Aneesh Kumar K.V wrote:
From: Anthony Liguori<aligu...@us.ibm.com>
We have implemented all the VFS calls in a state-machine model so that we
are prepared for the model where the VCPU thread(s) do the initial work
until they need to block, then submit that work (via a function pointer)
to a thread pool. A thread in that thread pool picks up the work and
completes the blocking call; when the blocking call returns, a callback is
invoked in the IO thread. The IO thread then runs until the next blocking
function, and the cycle starts again.
Basically, the VCPU/IO threads do all the non-blocking work and let the
threads in the thread pool work on the blocking calls like mkdir(), stat(),
etc.
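
To make the flow above concrete, here is a rough sketch of one request
written as explicit states. It is in Python purely for brevity (QEMU itself
is C), and every name in it (submit_blocking, MkdirRequest, run_io_loop,
demo) is made up for illustration and is not part of QEMU or the virtio-9p
code. The main loop stands in for the IO thread, and the queue stands in for
whatever mechanism delivers completion callbacks to it.

```python
import os
import queue
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Hypothetical names throughout; this is not QEMU's API.
completions = queue.Queue()            # callbacks waiting to run in the "IO thread"
pool = ThreadPoolExecutor(max_workers=2)


def submit_blocking(fn, args, callback):
    """Run fn(*args) on a pool thread, then queue callback(result) for the IO thread."""
    def work():
        try:
            result = fn(*args)
        except OSError as err:
            result = err
        completions.put(lambda: callback(result))
    pool.submit(work)


class MkdirRequest:
    """One request as explicit states: start -> after_mkdir -> finish."""

    def __init__(self, path):
        self.path = path
        self.trace = []
        self.result = None
        self.done = False

    def start(self):
        # Non-blocking setup would go here; the blocking call goes to the pool.
        self.trace.append("start")
        submit_blocking(os.mkdir, (self.path,), self.after_mkdir)

    def after_mkdir(self, result):
        self.trace.append("after_mkdir")
        if isinstance(result, OSError):
            self.finish(result)
            return
        submit_blocking(os.stat, (self.path,), self.finish)

    def finish(self, result):
        self.trace.append("finish")
        self.result = result
        self.done = True


def run_io_loop(requests):
    """Stand-in for the IO thread: run queued callbacks until every request is done."""
    while not all(r.done for r in requests):
        completions.get()()


def demo():
    with tempfile.TemporaryDirectory() as root:
        reqs = [MkdirRequest(os.path.join(root, name)) for name in ("a", "b")]
        for r in reqs:
            r.start()
        run_io_loop(reqs)
        return [(r.trace, isinstance(r.result, os.stat_result)) for r in reqs]
```

Note that both requests are in flight at once, and that all state transitions
run on the single loop thread; only mkdir() and stat() run on pool threads.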
My question is: why not let the whole job be done by a thread in the
thread pool?
The VCPU thread receives the PDU and hands the entire job over to a worker
thread. When all the work is completed, either the worker thread or the IO
thread (we can switch back at this point if needed) marks the request as
completed in the virtqueue and injects an interrupt to notify the guest.
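
For comparison, a rough sketch of this alternative, again in Python for
brevity and with hypothetical names (handle_request, demo_whole_job): the
whole request runs as straight-line code on one worker thread, with the
blocking calls inline. Collecting the futures stands in for marking the
request completed in the virtqueue and notifying the guest.

```python
import os
import stat
import tempfile
from concurrent.futures import ThreadPoolExecutor


def handle_request(path):
    """Hypothetical handler: the entire request runs on one worker thread."""
    os.mkdir(path)               # blocking call, inline
    st = os.stat(path)           # blocking call, inline
    return st.st_mode


def demo_whole_job():
    with tempfile.TemporaryDirectory() as root, \
            ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(handle_request, os.path.join(root, name))
                   for name in ("a", "b")]
        # Stand-in for "mark completed in the virtqueue and interrupt the guest".
        return [stat.S_ISDIR(f.result()) for f in futures]
```

The handler is shorter, but any device state it touches is now accessed from
worker threads, so that state would need its own locking.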
We can still keep the same number of threads in the thread pool.
This way, we are not increasing the number of threads employed by QEMU, and
it also makes the code a lot easier to read/maintain.
I may be missing something, but I would like to know more about the
advantages of this model.
In qemu, we tend to prefer state-machine based code.
There are a few different models that we have discussed at different points:
1) state machines with a thread pool to make blocking functions
asynchronous (what we have today)
2) co-operative threading
3) pre-emptive threading
All three models are independent of making device models re-entrant. In
order to allow VCPU threads to run in qemu simultaneously, we need each
device to carry a lock and for that lock to be acquired on entry to each
of the device model's public functions.
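
As a rough illustration of the per-device lock, here is a sketch in Python
(for brevity; QEMU is C). The names locked, Device, submit, pending and
hammer are hypothetical and do not correspond to any QEMU interface. Every
public entry point takes the device's lock before touching device state.

```python
import threading
from functools import wraps


def locked(method):
    """Hypothetical decorator: hold the device's lock for the whole call."""
    @wraps(method)
    def wrapper(self, *args, **kwargs):
        with self._lock:
            return method(self, *args, **kwargs)
    return wrapper


class Device:
    """Toy device model: each public function acquires the device's lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = 0

    @locked
    def submit(self):
        # Read-modify-write on device state, safe because the lock is held.
        self._pending = self._pending + 1

    @locked
    def pending(self):
        return self._pending


def hammer(n_threads=4, per_thread=1000):
    """Call submit() from several threads at once and return the final count."""
    dev = Device()

    def worker():
        for _ in range(per_thread):
            dev.submit()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dev.pending()
```

With a plain (non-recursive) lock like this one, a public function must not
call another public function on the same device; shared logic would go in
unlocked internal helpers.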
For individual device models or host services, I think (3) is probably
the worst model overall. I personally think that (1) is better in the
long run but ultimately would need an existence proof to compare against
(2). (2) looks appealing until you actually try to have the device
handle multiple requests at a time.
For virtio-9p, I don't think (1) is much of a burden to be honest. In
particular, when the 9p2000.L dialect is used, there should be a 1-1
correlation between protocol operations and system calls which means
that for the most part, there's really no advantage to threading from a
complexity point of view.
Regards,
Anthony Liguori
Thanks,
JV