On 22/07/16 10:27, Corneliu ZUZU wrote:
Hi,
I've been inspecting vm-event code parts to try and understand when
and why domain pausing/locking is done. If I understood correctly,
domain pausing is done solely to force all the vCPUs of that domain to
see a configuration update and act upon it (e.g. in the case of a
XEN_DOMCTL_MONITOR_EVENT_WRITE_CTRLREG that enables CR3 monitoring,
domain pausing/unpausing ensures immediate enabling of CR3
load-exiting for all vCPUs), not to synchronize concurrent operations
(lock-behavior).
Correct. Without the vcpu/domain pause, a vcpu could be midway through
a vmexit path with the monitor configuration changing under its feet.
OTOH, it could be running in guest context, and only receive the update
one scheduling timeslice later.
As for locking, I see that for example vm_event_init_domain(),
vm_event_cleanup_domain() and monitor_domctl() are all protected by
the domctl-lock, but I don't think that's enough.
Here are a few code-paths that led me to believe that:
* do_hvm_op()->monitor_guest_request() reads d.monitor.guest_request_*
resources, but it doesn't seem to take the domctl lock, so it seems
possible for that to happen _while_ those resources are
initialized/cleaned-up
* monitor_guest_request() also calls
monitor_traps()->...->vm_event_wait_slot()->...->vm_event_grab_slot()
which attempts a vm_event_ring_lock(ved), which could also be called
_while_ that's initialized (vm_event_enable()) or cleaned-up
(vm_event_disable())
* hvm_monitor_cr() - e.g. on the code-path
vmx_vmexit_handler(EXIT_REASON_CR_ACCESS)->vmx_cr_access(VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR)->hvm_mov_to_cr()->hvm_set_cr0()->hvm_monitor_crX()
there doesn't seem to be taken into account the possibility of a
concurrent monitor_init_domain()/monitor_cleanup_domain()
Am I missing something with these conclusions?
As a resolution for this, I've been thinking of adding a 'subsys_lock'
field in the vm_event_domain structure, either spinlock or rw-lock,
which would be initialised/uninitialised when d.vm_event is
allocated/freed (domain_create()/complete_domain_destroy()).
I don't think you are missing anything. It seems like a monitor lock is
needed.
FWIW, the domctl lock is only used at the moment because it was easy,
and worked when everything was a domctl. Being a global spinlock, it is
a severe bottlekneck for concurrent domain management, which I need to
find some time to break apart, so the less reliance on it the better
from my point of view.
~Andrew
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel