On 11/22/18 5:37 PM, Roger Pau Monné wrote:
> I don't think you are supposed to try to pause other vcpus while
> holding a lock, as you can see it's quite likely that you will end up
> deadlocking because the vCPU you are trying to pause is stuck waiting
> on the lock that you are holding.
> 
> You should figure out whether you can get into vmx_start_reexecute
> without holding any locks, or alternatively drop the lock, pause the
> vCPUs and pick the lock again.
> 
> See for example how hap_track_dirty_vram releases the lock before
> attempting to pause the domain for this same reason.

Right, this will take more thinking.

I've unlocked the p2m for testing and the initial hang is gone. However,
the same problem now applies to rexec_lock: nothing prevents two or more
VCPUs from arriving in vmx_start_reexecute_instruction() simultaneously,
at which point one of them might take the lock and try to pause the
other, while the other is waiting to take the lock, with predictable
results.

On the other hand, releasing rexec_lock as well will allow two VCPUs to
end up trying to pause each other (especially unpleasant in a 2 VCPU
guest). At any given moment, there should be only one VCPU alive and
trying to reexecute an instruction - and at least one VCPU alive on the
guest.
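
To make that invariant concrete, here is a rough sketch of one possible
arrangement: a non-blocking claim instead of a blocking lock, so the VCPU
that loses the race never sits waiting and can still be paused. This is
not Xen code. rexec_owner, rexec_try_claim() and rexec_release() are
hypothetical names, and plain C11 atomics stand in for whatever primitive
the hypervisor would actually use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical stand-in for per-domain state; not a real Xen field. */
static atomic_int rexec_owner = NO_OWNER;

/*
 * Try to become the single re-executing VCPU. This never blocks: the
 * loser gets false and is expected to back off to a point where it can
 * be paused, rather than spin on a lock held by the winner.
 */
static bool rexec_try_claim(int vcpu_id)
{
    int expected = NO_OWNER;

    return atomic_compare_exchange_strong(&rexec_owner, &expected, vcpu_id);
}

/* Give up ownership; only the current owner may call this. */
static void rexec_release(int vcpu_id)
{
    int expected = vcpu_id;
    bool was_owner =
        atomic_compare_exchange_strong(&rexec_owner, &expected, NO_OWNER);

    assert(was_owner);
    (void)was_owner; /* keep NDEBUG builds free of unused warnings */
}
```

Only the VCPU for which rexec_try_claim() returns true would go on to
pause the others, so two VCPUs can never try to pause each other. What
the losing VCPU should do next (retry after being unpaused, most likely)
is left open here.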

We'll get more coffee, and of course suggestions are appreciated (as has
been all your help).


Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
