On 21/12/2022 1:26 pm, Jan Beulich wrote:
> The hook isn't mode dependent, hence it's misplaced in struct
> paging_mode. (Or alternatively I see no reason why the alloc_page() and
> free_page() hooks don't also live there.) Move it to struct
> paging_domain.

How you flush the TLBs has absolutely nothing to do with what mode the
guest is in.

But this hook, too, conflates p2m flushes with vCPU flushes.

> The hook also is used for HVM guests only, so make respective pieces
> conditional upon CONFIG_HVM.
>
> While there also add __must_check to the hook declaration, as it's
> imperative that callers deal with getting back "false".
>
> While moving the shadow implementation, introduce a "curr" local
> variable.
>
> Signed-off-by: Jan Beulich <jbeul...@suse.com>

Reviewed-by: Andrew Cooper <andrew.coop...@citrix.com>, with two
observations.

> --- a/xen/arch/x86/include/asm/paging.h
> +++ b/xen/arch/x86/include/asm/paging.h
> @@ -300,6 +299,12 @@ static inline unsigned long paging_ga_to
>          page_order);
>  }
>  
> +/* Flush selected vCPUs TLBs.  NULL for all. */
> +static inline bool paging_flush_tlb(const unsigned long *vcpu_bitmap)
> +{
> +    return current->domain->arch.paging.flush_tlb(vcpu_bitmap);

Not for this patch, but for cases like this, we should probably drop the
function pointer.

There are only two options, and they're invariant for the context, so

if ( hap )
    hap_flush_tlb(...);
else
    shadow_flush_tlb(...);

will almost certainly be faster on any CPU that Xen is liable to run
on.  Especially as HAP is probably the common case ~100% of the time.
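
For illustration, a minimal sketch of what that could look like.
(hap_flush_tlb() is my placeholder for however the HAP side's hook
ends up being exported; only shadow_flush_tlb() gains that form in
this patch.)

    /* Flush selected vCPUs TLBs.  NULL for all. */
    static inline bool paging_flush_tlb(const unsigned long *vcpu_bitmap)
    {
        /* hap_flush_tlb() assumed exported, mirroring shadow_flush_tlb(). */
        if ( paging_mode_hap(current->domain) )
            return hap_flush_tlb(vcpu_bitmap);
        else
            return shadow_flush_tlb(vcpu_bitmap);
    }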

> --- a/xen/arch/x86/mm/shadow/hvm.c
> +++ b/xen/arch/x86/mm/shadow/hvm.c
> @@ -688,6 +688,66 @@ static void sh_emulate_unmap_dest(struct
>      atomic_inc(&v->domain->arch.paging.shadow.gtable_dirty_version);
>  }
>  
> +static bool flush_vcpu(const struct vcpu *v, const unsigned long *vcpu_bitmap)
> +{
> +    return !vcpu_bitmap || test_bit(v->vcpu_id, vcpu_bitmap);
> +}
> +
> +/* Flush TLB of selected vCPUs.  NULL for all. */
> +bool cf_check shadow_flush_tlb(const unsigned long *vcpu_bitmap)
> +{
> +    static DEFINE_PER_CPU(cpumask_t, flush_cpumask);

The HAP and shadow variants both carry a static percpu mask like this.

However, this is an irqs-on region with no nested locking, so I suspect
this path could share one of the scheduler's percpu scratch variables
instead of duplicating the mask.
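
I.e. something along these lines, assuming the scheduler's per-CPU
cpumask_scratch were given a suitable accessor outside of
sched/private.h:

    bool cf_check shadow_flush_tlb(const unsigned long *vcpu_bitmap)
    {
        /* Borrow the scheduler's scratch mask rather than keeping a
         * private static one; safe here because this is an irqs-on
         * path with no nested locking around its use. */
        cpumask_t *mask = &this_cpu(cpumask_scratch);

        cpumask_clear(mask);

        /* ... remainder unchanged, using 'mask' in place of
         * &this_cpu(flush_cpumask) ... */
    }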

~Andrew
