On Wed, May 26, 2021 at 02:59:00PM +0200, Jan Beulich wrote:
> Sufficiently old Linux (3.12-ish) accesses these MSRs in an unguarded
> manner. Furthermore these MSRs, at least on Fam11 and older CPUs, are
> also consulted by modern Linux, and their (bogus) built-in zapping of
> #GP faults from MSR accesses leads to it effectively reading zero
> instead of the intended values, which are relevant for PCI BAR placement
> (which ought to all live in MMIO-type space, not in DRAM-type one).
> 
> For SYSCFG, only certain bits get exposed. In fact, whether to expose
> MtrrVarDramEn is debatable: It controls use of not just TOM, but also
> the IORRs. Introduce (consistently named) constants for the bits we're
> interested in and use them in pre-existing code as well.

I think we should also allow access to the IORRs MSRs for coherency
(c001001{6,9}) for the hardware domain.

> As a welcome side effect, verbosity on/of debug builds gets (perhaps
> significantly) reduced.
> 
> Note that at least as far as those MSR accesses by Linux are concerned,
> there's no similar issue for DomU-s, as the accesses sit behind PCI
> device matching logic. The checked for devices would never be exposed to
> DomU-s in the first place. Nevertheless I think that at least for HVM we
> should return sensible values, not 0 (as svm_msr_read_intercept() does
> right now). The intended values may, however, need to be determined by
> hvmloader, and then get made known to Xen.

Could we maybe come up with a fixed memory layout that hvmloader had
to respect?

Ie: DRAM from 0 to 3G, MMIO from 3G to 4G, and then the remaining
DRAM from 4G in a contiguous single block?

hvmloader would have to place BARs that don't fit in the 3G-4G hole at
the end of DRAM (ie: after TOM2).

> 
> Fixes: 322ec7c89f66 ("x86/pv: disallow access to unknown MSRs")
> Reported-by: Olaf Hering <o...@aepfle.de>
> Signed-off-by: Jan Beulich <jbeul...@suse.com>
> 
> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -468,14 +468,14 @@ static void check_syscfg_dram_mod_en(voi
>               return;
>  
>       rdmsrl(MSR_K8_SYSCFG, syscfg);
> -     if (!(syscfg & K8_MTRRFIXRANGE_DRAM_MODIFY))
> +     if (!(syscfg & SYSCFG_MTRR_FIX_DRAM_MOD_EN))
>               return;
>  
>       if (!test_and_set_bool(printed))
>               printk(KERN_ERR "MTRR: SYSCFG[MtrrFixDramModEn] not "
>                       "cleared by BIOS, clearing this bit\n");
>  
> -     syscfg &= ~K8_MTRRFIXRANGE_DRAM_MODIFY;
> +     syscfg &= ~SYSCFG_MTRR_FIX_DRAM_MOD_EN;
>       wrmsrl(MSR_K8_SYSCFG, syscfg);
>  }
>  
> --- a/xen/arch/x86/cpu/mtrr/generic.c
> +++ b/xen/arch/x86/cpu/mtrr/generic.c
> @@ -224,7 +224,7 @@ static void __init print_mtrr_state(cons
>               uint64_t syscfg, tom2;
>  
>               rdmsrl(MSR_K8_SYSCFG, syscfg);
> -             if (syscfg & (1 << 21)) {
> +             if (syscfg & SYSCFG_MTRR_TOM2_EN) {
>                       rdmsrl(MSR_K8_TOP_MEM2, tom2);
>                       printk("%sTOM2: %012"PRIx64"%s\n", level, tom2,
>                              syscfg & (1 << 22) ? " (WB)" : "");
> --- a/xen/arch/x86/msr.c
> +++ b/xen/arch/x86/msr.c
> @@ -339,6 +339,19 @@ int guest_rdmsr(struct vcpu *v, uint32_t
>          *val = msrs->tsc_aux;
>          break;
>  
> +    case MSR_K8_SYSCFG:
> +    case MSR_K8_TOP_MEM1:
> +    case MSR_K8_TOP_MEM2:
> +        if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
> +            goto gp_fault;
> +        if ( !is_hardware_domain(d) )
> +            return X86EMUL_UNHANDLEABLE;

It might be clearer to also handle the !is_hardware_domain case here,
instead of deferring to svm_msr_read_intercept:

if ( is_hardware_domain(d) )
    rdmsrl(msr, *val);
else
    *val = 0;

...

> +        rdmsrl(msr, *val);
> +        if ( msr == MSR_K8_SYSCFG )
> +            *val &= (SYSCFG_TOM2_FORCE_WB | SYSCFG_MTRR_TOM2_EN |
> +                     SYSCFG_MTRR_VAR_DRAM_EN | SYSCFG_MTRR_FIX_DRAM_EN);
> +        break;
> +
>      case MSR_K8_HWCR:
>          if ( !(cp->x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) )
>              goto gp_fault;
> --- a/xen/arch/x86/x86_64/mmconf-fam10h.c
> +++ b/xen/arch/x86/x86_64/mmconf-fam10h.c
> @@ -69,7 +69,7 @@ static void __init get_fam10h_pci_mmconf
>       rdmsrl(address, val);
>  
>       /* TOP_MEM2 is not enabled? */
> -     if (!(val & (1<<21))) {
> +     if (!(val & SYSCFG_MTRR_TOM2_EN)) {
>               tom2 = 1ULL << 32;
>       } else {
>               /* TOP_MEM2 */
> --- a/xen/include/asm-x86/msr-index.h
> +++ b/xen/include/asm-x86/msr-index.h
> @@ -116,6 +116,13 @@
>  #define  PASID_PASID_MASK                   0x000fffff
>  #define  PASID_VALID                        (_AC(1, ULL) << 31)
>  
> +#define MSR_K8_SYSCFG                       0xc0010010
> +#define  SYSCFG_MTRR_FIX_DRAM_EN            (_AC(1, ULL) << 18)
> +#define  SYSCFG_MTRR_FIX_DRAM_MOD_EN        (_AC(1, ULL) << 19)
> +#define  SYSCFG_MTRR_VAR_DRAM_EN            (_AC(1, ULL) << 20)
> +#define  SYSCFG_MTRR_TOM2_EN                (_AC(1, ULL) << 21)
> +#define  SYSCFG_TOM2_FORCE_WB               (_AC(1, ULL) << 22)
> +
>  #define MSR_K8_VM_CR                        0xc0010114
>  #define  VM_CR_INIT_REDIRECTION             (_AC(1, ULL) <<  1)
>  #define  VM_CR_SVM_DISABLE                  (_AC(1, ULL) <<  4)
> @@ -279,10 +286,7 @@
>  #define MSR_K8_TOP_MEM1                      0xc001001a
>  #define MSR_K7_CLK_CTL                       0xc001001b
>  #define MSR_K8_TOP_MEM2                      0xc001001d
> -#define MSR_K8_SYSCFG                        0xc0010010
>  
> -#define K8_MTRRFIXRANGE_DRAM_ENABLE  0x00040000 /* MtrrFixDramEn bit    */
> -#define K8_MTRRFIXRANGE_DRAM_MODIFY  0x00080000 /* MtrrFixDramModEn bit */
>  #define K8_MTRR_RDMEM_WRMEM_MASK     0x18181818 /* Mask: RdMem|WrMem    */

That last one seem to be unused, I wonder if you could also drop it as
part of this cleanup?

Thanks, Roger.

Reply via email to