On Tue, Jun 16, 2020 at 12:31:06PM -0700, Tamas K Lengyel wrote:
> While forking VMs running a small RTOS systems (Zephyr) a Xen crash has been
> observed due to a mm-lock order violation while copying the HVM CPU context
> from the parent. This issue has been identified to be due to
> hap_update_paging_modes getting a lock on the gfn using get_gfn. This call 
> also
> creates a shared entry in the fork's memory map for the cr3 gfn. The function
> later calls hap_update_cr3 while holding the paging_lock, which results in the
> lock-order violation in vmx_load_pdptrs when it tries to unshare the above 
> entry.
> 
> This issue has not affected VMs running other OS's as a call to 
> vmx_load_pdptrs
> is benign if PAE is not enabled or if EFER_LMA is set and returns before
> trying to unshare and map the page.
> 
> Using get_gfn_type to get a lock on the gfn avoids this problem as we can
> populate the fork's gfn with a copied page instead of a shared entry if its
> needed, thus avoiding the lock order violation while holding paging_lock.
> 
> Signed-off-by: Tamas K Lengyel <tamas.leng...@intel.com>
> ---
> The bug seems to have been present since commit 4cb6c4f4941, only discovered
> now due to the heavy use of mem_sharing with VM forks. As this is a simple
> bug-fix it would be nice to include it in the 4.14 release.

I agree it seems like a candidate bugfix to be included. I've added
Paul to the Cc so he is aware.

> ---
>  xen/arch/x86/mm/hap/hap.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
> index 7f84d0c6ea..9ae4c3ae6e 100644
> --- a/xen/arch/x86/mm/hap/hap.c
> +++ b/xen/arch/x86/mm/hap/hap.c
> @@ -748,12 +748,19 @@ static void hap_update_paging_modes(struct vcpu *v)
>      struct domain *d = v->domain;
>      unsigned long cr3_gfn = v->arch.hvm.guest_cr[3] >> PAGE_SHIFT;
>      p2m_type_t t;
> +    p2m_query_t q = P2M_ALLOC;
>  
> -    /* We hold onto the cr3 as it may be modified later, and
> -     * we need to respect lock ordering. No need for 
> -     * checks here as they are performed by vmx_load_pdptrs
> -     * (the potential user of the cr3) */
> -    (void)get_gfn(d, cr3_gfn, &t);
> +    /*
> +     * We hold onto the cr3 as it may be modified later, and
> +     * we need to respect lock ordering. Unshare here if we have to as to 
> avoid
> +     * a lock-order violation later while we are holding the paging_lock.
> +     * Further checks are performed by vmx_load_pdptrs (the potential user of
> +     * the cr3).
> +     */
> +    if ( hvm_pae_enabled(v) && !hvm_long_mode_active(v) )
> +        q |= P2M_UNSHARE;
> +
> +    (void)get_gfn_type(d, cr3_gfn, &t, q);

While there I think you can drop the cast to void.

In order for this to be more resilient, maybe it would be better to
just use get_gfn_unshare directly and avoid checking what paging mode
the guest is currently using?

Or would that be too expensive in terms of performance for the not
affected case?

I feel like relying on the internals of vmx_load_pdptrs here is
fragile.

Thanks, Roger.

Reply via email to