On Thu, Jun 18, 2020 at 07:39:04AM -0700, Tamas K Lengyel wrote:
> While forking VMs running a small RTOS system (Zephyr) a Xen crash has been
> observed due to a mm-lock order violation while copying the HVM CPU context
> from the parent. This issue has been identified to be due to
> hap_update_paging_modes first getting a lock on the gfn using get_gfn. This
> call also creates a shared entry in the fork's memory map for the cr3 gfn. The
> function later calls hap_update_cr3 while holding the paging_lock, which
> results in the lock-order violation in vmx_load_pdptrs when it tries to unshare
> the above entry when it grabs the page with the P2M_UNSHARE flag set.
> 
> Since vmx_load_pdptrs only reads from the page its usage of P2M_UNSHARE was
> unnecessary to start with. Using P2M_ALLOC is the appropriate flag to ensure
> the p2m is properly populated.
> 
> Note that the lock order violation is avoided because before the paging_lock is
> taken a lookup is performed with P2M_ALLOC that forks the page, thus the second
> lookup in vmx_load_pdptrs succeeds without having to perform the fork. We keep
> P2M_ALLOC in vmx_load_pdptrs because there are code-paths leading up to it
> which don't take the paging_lock and that have no previous lookup. Currently no
> other code-path exists leading there with the paging_lock taken, thus no
> further adjustments are necessary.
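
For readers unfamiliar with ordered locks, the failure mode described above can be
sketched as a toy model. This is not Xen code: the lock levels, the toy_* names and
the Q_* flags are hypothetical stand-ins, and the only thing taken from the commit
message is the idea that an unshare request needs a lower-ordered lock than the
paging_lock, whereas a read-only lookup of an already populated entry does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock levels: a lock may only be taken if its level is
 * strictly higher than that of every lock already held. */
enum { LEVEL_NONE = 0, LEVEL_SHARING = 1, LEVEL_PAGING = 2 };

/* Hypothetical lookup flags, loosely named after those in the commit message. */
enum { Q_ALLOC = 1, Q_UNSHARE = 2 };

struct toy_ctx {
    int  held_level;   /* highest lock level currently held */
    bool page_shared;  /* the looked-up page is still a shared entry */
    bool violation;    /* set when an out-of-order lock is attempted */
};

/* Take a lock, recording a violation instead of crashing. */
static bool toy_lock(struct toy_ctx *c, int level)
{
    if (level <= c->held_level) {
        c->violation = true;
        return false;
    }
    c->held_level = level;
    return true;
}

/* Look up the page. Only an unshare request on a still-shared page
 * needs the (lower-ordered) sharing lock. */
static void toy_lookup(struct toy_ctx *c, int flags)
{
    if ((flags & Q_UNSHARE) && c->page_shared) {
        int prev = c->held_level;

        if (!toy_lock(c, LEVEL_SHARING))
            return;
        c->page_shared = false;
        c->held_level = prev;  /* release the sharing lock */
    }
}

/* Perform the lookup while holding the paging lock.
 * Returns true if no lock-order violation was recorded. */
static bool toy_update_under_paging_lock(int flags)
{
    struct toy_ctx c = { LEVEL_NONE, true, false };

    toy_lock(&c, LEVEL_PAGING);
    toy_lookup(&c, flags);
    c.held_level = LEVEL_NONE;  /* release the paging lock */
    return !c.violation;
}
```

In this model, calling toy_update_under_paging_lock() with Q_ALLOC alone records no
violation, while adding Q_UNSHARE does, which is the same shape as the change from
P2M_UNSHARE to P2M_ALLOC for a caller that only reads the page.
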
> 
> Signed-off-by: Tamas K Lengyel <tamas.leng...@intel.com>

Reviewed-by: Roger Pau Monné <roger....@citrix.com>

Thanks!
