On Wed, May 03, 2023 at 05:57:09PM +0200, Jan Beulich wrote:
> The registration by virtual/linear address has downsides: At least on
> x86 the access is expensive for HVM/PVH domains. Furthermore for 64-bit
> PV domains the areas are inaccessible (and hence cannot be updated by
> Xen) when in guest-user mode, and for HVM guests they may be
> inaccessible when Meltdown mitigations are in place. (There are yet
> more issues.)
> 
> In preparation of the introduction of new vCPU operations allowing to
> register the respective areas (one of the two is x86-specific) by
> guest-physical address, flesh out the map/unmap functions.
> 
> Noteworthy differences from map_vcpu_info():
> - areas can be registered more than once (and de-registered),
> - remote vCPU-s are paused rather than checked for being down (which in
>   principle can change right after the check),
> - the domain lock is taken for a much smaller region.
> 
> Signed-off-by: Jan Beulich <jbeul...@suse.com>
> ---
> RFC: By using global domain page mappings the demand on the underlying
>      VA range may increase significantly. I did consider to use per-
>      domain mappings instead, but they exist for x86 only. Of course we
>      could have arch_{,un}map_guest_area() aliasing global domain page
>      mapping functions on Arm and using per-domain mappings on x86. Yet
>      then again map_vcpu_info() doesn't (and can't) do so.

I guess it's fine as you propose it now; we can look into using
per-domain mappings later.

> RFC: In map_guest_area() I'm not checking the P2M type, instead - just
>      like map_vcpu_info() - solely relying on the type ref acquisition.
>      Checking for p2m_ram_rw alone would be wrong, as at least
>      p2m_ram_logdirty ought to also be okay to use here (and in similar
>      cases, e.g. in Argo's find_ring_mfn()). p2m_is_pageable() could be
>      used here (like altp2m_vcpu_enable_ve() does) as well as in
>      map_vcpu_info(), yet then again the P2M type is stale by the time
>      it is being looked at anyway without the P2M lock held.

check_get_page_from_gfn() already does some type checks on the page.

> ---
> v2: currd -> d, to cover mem-sharing's copy_guest_area(). Re-base over
>     change(s) earlier in the series. Use ~0 as "unmap" request indicator.
> 
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -1576,7 +1576,82 @@ int map_guest_area(struct vcpu *v, paddr
>                     struct guest_area *area,
>                     void (*populate)(void *dst, struct vcpu *v))
>  {
> -    return -EOPNOTSUPP;
> +    struct domain *d = v->domain;
> +    void *map = NULL;
> +    struct page_info *pg = NULL;
> +    int rc = 0;

Should you check/assert that size != 0?

> +    if ( ~gaddr )

I guess I will find out in future patches, but why this special
handling for the ~0 address?

It might be worth adding a comment here, or in the patch description.
Otherwise I would expect the page-boundary-crossing check below to fail
when passed a ~0 address.

> +    {
> +        unsigned long gfn = PFN_DOWN(gaddr);
> +        unsigned int align;
> +        p2m_type_t p2mt;
> +
> +        if ( gfn != PFN_DOWN(gaddr + size - 1) )
> +            return -ENXIO;
> +
> +#ifdef CONFIG_COMPAT
> +        if ( has_32bit_shinfo(d) )
> +            align = alignof(compat_ulong_t);
> +        else
> +#endif
> +            align = alignof(xen_ulong_t);
> +        if ( gaddr & (align - 1) )

IS_ALIGNED() might be clearer.

> +            return -ENXIO;
> +
> +        rc = check_get_page_from_gfn(d, _gfn(gfn), false, &p2mt, &pg);
> +        if ( rc )
> +            return rc;
> +
> +        if ( !get_page_type(pg, PGT_writable_page) )
> +        {
> +            put_page(pg);
> +            return -EACCES;
> +        }
> +
> +        map = __map_domain_page_global(pg);
> +        if ( !map )
> +        {
> +            put_page_and_type(pg);
> +            return -ENOMEM;
> +        }
> +        map += PAGE_OFFSET(gaddr);
> +    }
> +
> +    if ( v != current )
> +    {
> +        if ( !spin_trylock(&d->hypercall_deadlock_mutex) )
> +        {
> +            rc = -ERESTART;
> +            goto unmap;
> +        }
> +
> +        vcpu_pause(v);
> +
> +        spin_unlock(&d->hypercall_deadlock_mutex);
> +    }
> +
> +    domain_lock(d);
> +
> +    if ( map )
> +        populate(map, v);

The call to map_guest_area() in copy_guest_area() passes NULL as the
populate parameter, so won't this crash?

Should you either assert that populate != NULL, or change the
condition to map && populate?

> +
> +    SWAP(area->pg, pg);
> +    SWAP(area->map, map);
> +
> +    domain_unlock(d);
> +
> +    if ( v != current )
> +        vcpu_unpause(v);
> +
> + unmap:
> +    if ( pg )
> +    {
> +        unmap_domain_page_global(map);
> +        put_page_and_type(pg);
> +    }
> +
> +    return rc;
>  }
>  
>  /*
> @@ -1587,9 +1662,24 @@ int map_guest_area(struct vcpu *v, paddr
>  void unmap_guest_area(struct vcpu *v, struct guest_area *area)
>  {
>      struct domain *d = v->domain;
> +    void *map;
> +    struct page_info *pg;
>  
>      if ( v != current )
>          ASSERT(atomic_read(&v->pause_count) | atomic_read(&d->pause_count));
> +
> +    domain_lock(d);
> +    map = area->map;
> +    area->map = NULL;
> +    pg = area->pg;
> +    area->pg = NULL;

FWIW you could also use SWAP() here by initializing both map and pg to
NULL (I know it uses one extra local variable).

Thanks, Roger.
