Hi,
/!\/!\/!\
I am not a scheduler expert, so my view may be wrong. Dario, feel free to
correct me :).
/!\/!\/!\
On 02/08/2019 14:07, Andrii Anisov wrote:
> On 02.08.19 12:15, Julien Grall wrote:
>>> I can make such a list of how it is done in this series:
>> From the list below it is not clear what the split is between hypervisor
>> time and guest time. See some of the examples below.
> I guess your question is *why* I split hyp/guest time in such a way.
> So for the guest I count the time spent in guest mode, plus the time spent
> in hypervisor mode serving explicit requests from the guest.
> That time may be quite deterministic from the guest's point of view.
> But the time spent by the hypervisor to handle interrupts and update the
> hardware state is not requested by the guest itself. It is virtualization
> overhead, and that overhead heavily depends on the system configuration
> (e.g. how many guests are running).
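
To make sure we mean the same thing, here is how I read that split, as a
minimal sketch (not your series' code; read_ns_counter() and the two hooks
below are made-up names):

#include <stdint.h>

/*
 * Two buckets, as described above: time in guest mode plus hypervisor
 * work explicitly requested by the guest is charged to the guest;
 * interrupts and HW state maintenance are charged to the hypervisor.
 */
struct vcpu_time {
    uint64_t guest_ns;  /* guest mode + guest-requested hypervisor work */
    uint64_t hyp_ns;    /* interrupts, HW state sync, other overhead */
};

/* Hypothetical monotonic clock, e.g. the generic timer counter in ns. */
extern uint64_t read_ns_counter(void);

/* Called on every trap into the hypervisor. */
static void account_trap(struct vcpu_time *t, uint64_t *stamp)
{
    uint64_t now = read_ns_counter();

    t->guest_ns += now - *stamp;   /* time since we last entered the guest */
    *stamp = now;
}

/*
 * Called just before returning to the guest. 'guest_requested' says
 * whether the hypervisor work was an explicit guest request (e.g. a
 * hypercall or an emulated access) or maintenance such as interrupt
 * handling.
 */
static void account_return(struct vcpu_time *t, uint64_t *stamp,
                           int guest_requested)
{
    uint64_t now = read_ns_counter();

    if ( guest_requested )
        t->guest_ns += now - *stamp;
    else
        t->hyp_ns += now - *stamp;
    *stamp = now;
}
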
While the context switch cost will depend on your system configuration, the HW
state synchronization on entry to and exit from the hypervisor will always be
there, even if you have only one guest running or have partitioned your system.
Furthermore, Xen implements a voluntary preemption model. The main preemption
point on Arm is on the return to the guest. So if work initiated by the guest
takes a long time, you may want to defer it until you can preempt without much
trouble.
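
Roughly, the pattern on that path looks like this (just a sketch; all the
helper names are made up, not the actual Xen code):

#include <stdbool.h>

/* Made-up helpers standing in for the real ones. */
extern bool deferred_work_pending(void);  /* guest-initiated work left to do? */
extern void process_one_chunk(void);      /* bounded slice of that work */
extern bool softirq_pending(void);        /* does the scheduler want the pCPU? */
extern void do_softirqs(void);            /* may switch to another vCPU */

/*
 * Voluntary preemption point on the return-to-guest path: long,
 * guest-initiated work is chopped into bounded chunks, and between
 * chunks the scheduler gets a chance to run.  Only once the work is
 * done do we restore the guest state and resume the guest.
 */
static void return_to_guest(void)
{
    while ( deferred_work_pending() )
    {
        process_one_chunk();

        if ( softirq_pending() )
            do_softirqs();
    }

    /* ... sync HW state and eret to the guest ... */
}
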
Your definition of "virtualization overhead" is somewhat unclear. A guest is
not aware that a device may be emulated, so emulating any I/O is per se an
overhead. That overhead may be accounted to the guest or to the hypervisor,
depending on the model agreed on.
There are some issues with accounting some of the work done on exit to
hypervisor time. Let's take the example of the P2M: this task is work deferred
from a system register emulation because we need preemption.
The task can be long running (several hundred milliseconds). A scheduler may
only take the guest time into account and consider that the vCPU does not need
to be descheduled. You run the risk that a vCPU will hog a pCPU and delay any
other vCPU. This is not ideal even for an RT task.
Other work done on exit (e.g. syncing the vGIC state to HW) is less of a
concern wherever it is accounted, because it cannot possibly hog a pCPU.
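
To illustrate the hazard with made-up numbers (a 10ms slice versus ~300ms of
P2M work charged to the hypervisor):

#include <stdbool.h>
#include <stdint.h>

#define SLICE_NS  10000000ULL   /* made-up 10ms scheduling slice */

struct vcpu_time {
    uint64_t guest_ns;   /* what the scheduler is looking at */
    uint64_t hyp_ns;     /* where the ~300ms of P2M work ended up */
};

/*
 * If the budget check only considers guest-accounted time, a vCPU whose
 * guest-initiated P2M work was charged to hyp_ns still looks cheap and
 * is never descheduled, even though it occupied the pCPU for hundreds
 * of milliseconds.
 */
static bool over_budget_guest_only(const struct vcpu_time *t)
{
    return t->guest_ns > SLICE_NS;                 /* misses the P2M work */
}

static bool over_budget_total(const struct vcpu_time *t)
{
    return t->guest_ns + t->hyp_ns > SLICE_NS;     /* sees the whole occupation */
}
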
I understand you want to capture the virtualization overhead. It feels to me
that this needs to be a different category (i.e. neither hypervisor time nor
guest time).
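
Something along these lines, purely as a sketch of the idea (not an existing
interface):

#include <stdint.h>

/*
 * Three buckets instead of two: guest-requested work stays charged to
 * the guest (so it cannot hide from the scheduler), while the
 * unavoidable entry/exit costs and other maintenance are reported
 * separately instead of being folded into either side.
 */
struct vcpu_time {
    uint64_t guest_ns;     /* guest mode + guest-requested hypervisor work */
    uint64_t overhead_ns;  /* entry/exit sync, interrupts, vGIC maintenance */
    uint64_t hyp_ns;       /* remaining hypervisor housekeeping */
};
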
> My idea is the following:
> Accounting that overhead to guests is quite OK for server applications: you
> put the server overhead time on the guests and charge money from their budget.
> Yet for RT applications you will have a more accurate view of the guest
> execution time if you drop that overhead.
> Our target is XEN in safety-critical systems, so I chose the (from my point
> of view) more deterministic approach.
See above: I believe you are building a less secure system by accounting some
of the guest work to the hypervisor.
Cheers,
--
Julien Grall