On 06.02.2024 02:20, George Dunlap wrote:
> --- /dev/null
> +++ b/docs/designs/nested-svm-cpu-features.md
> @@ -0,0 +1,110 @@
> +# Nested SVM (AMD) CPUID requirements
> +
> +The first step in making nested SVM production-ready is to make sure
> +that all features are implemented and well-tested.  To make this
> +tractable, we will initially be limiting the "supported" range of
> +nested virt to a specific subset of host and guest features.  This
> +document describes the criteria for deciding on features, and the
> +rationale behind each feature.
> +
> +For AMD, all virtualization-related features can be found in CPUID
> +leaf 8000000A:edx
> +
> +# Criteria
> +
> +- Processor support: At a minimum we want to support processors from
> +  the last 5 years.  All things being equal, older processors are
> +  better.

Nit: Perhaps missing "covering"? Generally I hope newer processors are
"better".

>  Bits 0:7 were available in the very earliest processors;
> +  and even through bit 15 we should be pretty good support-wise.
> +
> +- Faithfulness to hardware: We need the behavior of the "virtual cpu"
> +  from the L1 hypervisor's perspective to be as close as possible to
> +  the original hardware.  In particular, the behavior of the hardware
> +  on error paths 1) is not easy to understand or test, 2) can be the
> +  source of surprising vulnerabiliies.  (See XSA-7 for an example of a
> +  case where subtle error-handling differences can open up a privilege
> +  escalation.)  We should avoid emulating any bit of the hardware with
> +  complex error paths if we can at all help it.
> +
> +- Cost of implementation: We want to minimize the cost of
> +  implementation (where this includes bringing an existing sub-par
> +  implementation up to speed).  All things being equal, we'll favor a
> +  configuration which does not require any new implementation.
> +
> +- Performance: All things being equal, we'd prefer to choose a set of
> +  L0 / L1 CPUID bits that are faster than slower.
> +
> +
> +# Bits
> +
> +- 0 `NP` *Nested Paging*: Required both for L0 and L1.
> +
> +  Based primarily on faithfulness and performance, as well as
> +  potential cost of implementation.  Available on earliest hardware,
> +  so no compatibility issues.
> +
> +- 1 `LbrVirt` *LBR / debugging virtualization*: Require for L0 and L1.
> +
> +  For L0 this is required for performance: There's no way to tell the
> +  guests not to use the LBR-related registers; and if the guest does,
> +  then you have to save and restore all LBR-related registers on
> +  context switch, which is prohibitive.

"prohibitive" is too strong imo; maybe "undesirable"?

> --- a/xen/arch/x86/hvm/svm/nestedhvm.h
> +++ b/xen/arch/x86/hvm/svm/nestedhvm.h
> @@ -35,6 +35,7 @@ enum nestedhvm_vmexits
>  nestedsvm_check_intercepts(struct vcpu *v, struct cpu_user_regs *regs,
>      uint64_t exitcode);
>  void svm_nested_features_on_efer_update(struct vcpu *v);
> +void __init start_nested_svm(struct hvm_function_table *svm_function_table);

No section placement attributes on declarations, please.

> --- a/xen/arch/x86/hvm/svm/nestedsvm.c
> +++ b/xen/arch/x86/hvm/svm/nestedsvm.c
> @@ -1666,3 +1666,17 @@ void svm_nested_features_on_efer_update(struct vcpu *v)
>          }
>      }
>  }
> +
> +void __init start_nested_svm(struct hvm_function_table *svm_function_table)
> +{
> +    /* 
> +     * Required host functionality to support nested virt.  See
> +     * docs/designs/nested-svm-cpu-features.md for rationale.
> +     */
> +    svm_function_table->caps.nested_virt =
> +        cpu_has_svm_nrips &&
> +        cpu_has_svm_lbrv &&
> +        cpu_has_svm_nrips &&

nrips twice? Was the earlier one meant to be npt?

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -3021,6 +3021,9 @@ const struct hvm_function_table * __init start_vmx(void)
>      if ( cpu_has_vmx_tsc_scaling )
>          vmx_function_table.tsc_scaling.ratio_frac_bits = 48;
>  
> +    /* TODO: Require hardware support before enabling nested virt */
> +    vmx_function_table.caps.nested_virt = vmx_function_table.caps.hap;

This won't have the intended effect if hap_supported() ends up clearing
the bit (used as input here) due to a command line option override. I
wonder if instead this wants doing e.g. in a new hook to be called from
nestedhvm_setup(). On the SVM side the hook function would then be the
start_nested_svm() that you already introduce, with a caps.hap check
added.

Since you leave the other nested-related if() in place in
arch_sanitise_domain_config(), all ought to be well, but I think that
other if() really wants replacing by the one you presently add.

Jan

Reply via email to