(+ Dario and George)
Hi Stefano,
I have added Dario and George to get some inputs from the scheduling
part.
On 27/02/2021 01:58, Stefano Stabellini wrote:
On Fri, 26 Feb 2021, Julien Grall wrote:
From: Julien Grall <jgr...@amazon.com>
A vCPU can get scheduled as soon as _VPF_down is cleared. As there is
currently not ordering guarantee in arch_set_info_guest(), it may be
possible that flag can be observed cleared before the new values of
vCPU
registers are observed.
Add an smp_mb() before the flag is cleared to prevent re-ordering.
Signed-off-by: Julien Grall <jgr...@amazon.com>
---
Barriers should work in pair. However, I am not entirely sure
whether to
put the other half. Maybe at the beginning of context_switch_to()?
It should be right after VGCF_online is set or cleared, right?
vcpu_guest_context_t is variable allocated on the heap just for the
purpose of
this call. So an ordering with VGFC_online is not going to do anything.
So it
would be:
xen/arch/arm/domctl.c:arch_get_info_guest
xen/arch/arm/vpsci.c:do_common_cpu_on
But I think it is impossible that either of them get called at the same
time as arch_set_info_guest, which makes me wonder if we actually need
the barrier...
arch_get_info_guest() is called without the domain lock held and I
can't see
any other lock that could prevent it to be called in // of
arch_set_info_guest().
So you could technically get corrupted information from
XEN_DOMCTL_getvcpucontext. For this case, we would want a smp_wmb()
before
writing to v->is_initialised. The corresponding read barrier would be in
vcpu_pause() -> vcpu_sleep_sync() -> sync_vcpu_execstate().
But this is not the issue I was originally trying to solve. Currently,
do_common_cpu_on() will roughly do:
1) domain_lock(d)
2) v->arch.sctlr = ...
v->arch.ttbr0 = ...
3) clear_bit(_VFP_down, &v->pause_flags);
4) domain_unlock(d)
5) vcpu_wake(v);
If we had only one pCPU on the system, then we would only wake the
vCPU in
step 5. We would be fine in this situation. But that's not the
interesting
case.
If you add a second pCPU in the story, it may be possible to have
vcpu_wake()
happening in // (see more below). As there is no memory barrier, step
3 may be
observed before 2. So, assuming the vcpu is runnable, we could start to
schedule a vCPU before any update to the registers (step 2) are
observed.
This means that when context_switch_to() is called, we may end up to
restore
some old values.
Now the question is can vcpu_wake() be called in // from another
pCPU? AFAICT,
it would be only called if a given flag in v->pause_flags is cleared
(e.g.
_VFP_blocked). But can we rely on that?
Even if we can rely on it, v->pause_flags has other flags in it. I
couldn't
rule out that _VPF_down cannot be set at the same time as the other
_VPF_*.
Therefore, I think a barrier is necessary to ensure the ordering.
Do you agree with this analysis?