Re: [PATCH 0/4] Virtualize architectural LBRs
On 18/11/2024 10:52, Andrew Cooper wrote:
> There's also a reason why we haven't got this working yet. There are a
> couple of areas of prerequisite work which need addressing before XSS
> can be enabled properly.
>
> If you're willing to tackle this, then I can explain what needs doing,
> and in roughly which order.

I would appreciate explanations of the pending XSS issues.

Ngoc Tu Dinh | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
Re: [PATCH 0/4] Virtualize architectural LBRs
On 18/11/2024 09:52, Jan Beulich wrote:
> Looking over just the files touched: No change to XSAVE logic at all?

XSAVE is hidden behind a new IA32_XSS bit. I'll try to implement that
next.

Ngoc Tu Dinh | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
Re: [PATCH 0/4] Virtualize architectural LBRs
Hi Andrew,

On 18/11/2024 10:52, Andrew Cooper wrote:
> On 18/11/2024 9:13 am, Tu Dinh wrote:
>> On 18/11/2024 09:52, Jan Beulich wrote:
>>> Looking over just the files touched: No change to XSAVE logic at all?
>> XSAVE is hidden behind a new IA32_XSS bit. I'll try to implement that next.
>
> It's rather more severe than that.
>
> Without XSAVE support, Xen can't context-switch the LBR state when vCPUs
> are scheduled in and out. (In patch 4 you seem to have copied the
> legacy way, which is extremely expensive.)
>
> Architecturally, ARCH_LBR depends on XSAVES so OSes can context switch
> it easily(ish) per thread.
>
> There's also a reason why we haven't got this working yet. There are a
> couple of areas of prerequisite work which need addressing before XSS
> can be enabled properly.
>
> If you're willing to tackle this, then I can explain what needs doing,
> and in roughly which order.
>
> ~Andrew

Following the community call yesterday, I'd like to clarify my
understanding of the issue:

- Firstly, virtual XSS support for architectural LBR must be enabled. I
  noticed that XSS is already implemented, just not enabled; barring the
  LBR format issues below, are there any other issues with the current
  XSS implementation?

- There are LBR format differences between some cores of the same CPU
  (e.g. in Intel hybrid CPUs: P-cores use effective IP while E-cores use
  linear IP). These differences are expected to be handled by
  XSAVES/XRSTORS. However, Xen would have to make sure that LBR MSRs are
  saved/restored by XSS instead of by manually poking MSRs.

- A related issue is handling the compressed XSAVE format for migration
  streams. Xen currently expands/compacts the XSAVE image manually
  during migration; are there any concerns with arch LBR breaking the
  XSAVE migration logic?

My understanding is that as long as we don't manually poke the LBR state
component, and LBR state size remains consistent across hybrid cores in
the same CPU (which it should be, for XSAVE compatibility), there should
be no concern with the XSAVE state itself. However, Xen must check the
CPU features of both sides during migration to make sure that the XSAVE
states are compatible, which is more complex in migrations involving
hosts with hybrid CPUs.

Please tell me if I'm missing any potential issues.

Thanks,

Ngoc Tu Dinh | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
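The LIP/EIP distinction discussed above is the crux of the hybrid-CPU
problem. A minimal sketch (not Xen code) of the translation involved,
assuming the CS segment base is available from the guest state; on cores
reporting LBR_LIP, recorded IPs are linear, so converting between
formats is an add/subtract of that base:

    #include <stdint.h>
    #include <stdbool.h>

    /*
     * Sketch: convert one recorded LBR IP between the linear-IP (LIP)
     * and effective-IP (EIP) formats.  "cs_base" is assumed to come
     * from the guest's current CS descriptor (0 in flat 64-bit mode,
     * where the two formats coincide).
     */
    static uint64_t lbr_fixup_ip(uint64_t ip, uint64_t cs_base,
                                 bool host_lip, bool guest_lip)
    {
        if ( host_lip == guest_lip )
            return ip;                   /* formats already match */

        return host_lip ? ip - cs_base   /* linear -> effective */
                        : ip + cs_base;  /* effective -> linear */
    }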
[RFC PATCH v2 01/10] x86: Add architectural LBR definitions
Signed-off-by: Tu Dinh
---
 xen/arch/x86/include/asm/msr-index.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/xen/arch/x86/include/asm/msr-index.h b/xen/arch/x86/include/asm/msr-index.h
index 9cdb5b2625..97df740b04 100644
--- a/xen/arch/x86/include/asm/msr-index.h
+++ b/xen/arch/x86/include/asm/msr-index.h
@@ -112,6 +112,8 @@
 #define  MCU_OPT_CTRL_GDS_MIT_DIS           (_AC(1, ULL) <<  4)
 #define  MCU_OPT_CTRL_GDS_MIT_LOCK          (_AC(1, ULL) <<  5)
 
+#define MSR_LER_INFO                        0x000001e0
+
 #define MSR_RTIT_OUTPUT_BASE                0x00000560
 #define MSR_RTIT_OUTPUT_MASK                0x00000561
 #define MSR_RTIT_CTL                        0x00000570
@@ -193,6 +195,16 @@
 #define MSR_UARCH_MISC_CTRL                 0x00001b01
 #define  UARCH_CTRL_DOITM                   (_AC(1, ULL) <<  0)
 
+/* Architectural LBR state MSRs */
+#define MSR_LBR_INFO(n)                     (0x00001200 + (n))
+#define MSR_LBR_CTL                         0x000014ce
+#define  LBR_CTL_VALID                      _AC(0x7f000f, ULL)
+#define MSR_LBR_DEPTH                       0x000014cf
+#define MSR_LBR_FROM_IP(n)                  (0x00001500 + (n))
+#define MSR_LBR_TO_IP(n)                    (0x00001600 + (n))
+/* Must be updated along with XSTATE LBR state size */
+#define NUM_MSR_ARCH_LBR_FROM_TO            32
+
 #define MSR_EFER                            _AC(0xc0000080, U) /* Extended Feature Enable Register */
 #define  EFER_SCE                           (_AC(1, ULL) <<  0) /* SYSCALL Enable */
 #define  EFER_LME                           (_AC(1, ULL) <<  8) /* Long Mode Enable */
--
2.43.0

Ngoc Tu Dinh | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
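For orientation, entry n of the architectural LBR stack is spread
across three parallel MSR ranges, so dumping the whole stack at a given
depth is a simple loop over the indices defined above (a sketch using
Xen's rdmsrl()/printk() helpers, not part of the patch):

    /* Sketch: dump the arch LBR stack via the MSR indices above. */
    static void dump_arch_lbr(unsigned int depth /* 8..32, multiple of 8 */)
    {
        unsigned int i;
        uint64_t from, to, info;

        for ( i = 0; i < depth; i++ )
        {
            rdmsrl(MSR_LBR_FROM_IP(i), from);   /* 0x1500 + i */
            rdmsrl(MSR_LBR_TO_IP(i),   to);     /* 0x1600 + i */
            rdmsrl(MSR_LBR_INFO(i),    info);   /* 0x1200 + i */
            printk("LBR[%u]: from %016lx to %016lx info %016lx\n",
                   i, from, to, info);
        }
    }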
[RFC PATCH v2 09/10] x86/vmx: Implement arch LBR
Use guest LBR_CTL in VMCS to limit LBR operation per guest. Use the MSR
bitmap to disable interception of arch LBR MSRs. Reconfigure the bitmap
on each valid LBR depth write.

Signed-off-by: Tu Dinh
---
 xen/arch/x86/domain.c                   |   7 +
 xen/arch/x86/hvm/vmx/vmcs.c             |  11 +-
 xen/arch/x86/hvm/vmx/vmx.c              | 203 ++--
 xen/arch/x86/include/asm/hvm/vmx/vmcs.h |  11 ++
 xen/arch/x86/include/asm/msr.h          |   5 +
 xen/arch/x86/msr.c                      |  86 ++
 6 files changed, 307 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 78a13e6812..8ed35cbbc8 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -2021,6 +2021,13 @@ static void __context_switch(void)
             if ( cpu_has_xsaves && is_hvm_vcpu(n) )
                 set_msr_xss(n->arch.msrs->xss.raw);
         }
+#ifdef CONFIG_HVM
+        /* XRSTORS LBR state behavior depends on MSR_LBR_DEPTH */
+        if ( using_vmx() &&
+             is_hvm_vcpu(n) &&
+             n->domain->arch.cpu_policy->feat.arch_lbr )
+            wrmsrl(MSR_LBR_DEPTH, n->arch.msrs->lbr_depth.raw);
+#endif
         vcpu_restore_fpu_nonlazy(n, false);
         nd->arch.ctxt_switch->to(n);
     }
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 147e998371..a16daad78a 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -203,6 +203,7 @@ static void __init vmx_display_features(void)
     P(cpu_has_vmx_bus_lock_detection, "Bus Lock Detection");
     P(cpu_has_vmx_notify_vm_exiting, "Notify VM Exit");
     P(cpu_has_vmx_virt_spec_ctrl, "Virtualize SPEC_CTRL");
+    P(cpu_has_vmx_guest_lbr_ctl, "Architectural LBR virtualization");
 #undef P
 
     if ( !printed )
@@ -448,7 +449,8 @@ static int vmx_init_vmcs_config(bool bsp)
 
     min = VM_EXIT_ACK_INTR_ON_EXIT;
     opt = (VM_EXIT_SAVE_GUEST_PAT | VM_EXIT_LOAD_HOST_PAT |
-           VM_EXIT_LOAD_HOST_EFER | VM_EXIT_CLEAR_BNDCFGS);
+           VM_EXIT_LOAD_HOST_EFER | VM_EXIT_CLEAR_BNDCFGS |
+           VM_EXIT_CLEAR_GUEST_LBR_CTL);
     min |= VM_EXIT_IA32E_MODE;
     _vmx_vmexit_control = adjust_vmx_controls(
         "VMExit Control", min, opt, MSR_IA32_VMX_EXIT_CTLS, &mismatch);
@@ -489,7 +491,7 @@ static int vmx_init_vmcs_config(bool bsp)
 
     min = 0;
     opt = (VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_GUEST_EFER |
-           VM_ENTRY_LOAD_BNDCFGS);
+           VM_ENTRY_LOAD_BNDCFGS | VM_ENTRY_LOAD_GUEST_LBR_CTL);
     _vmx_vmentry_control = adjust_vmx_controls(
         "VMEntry Control", min, opt, MSR_IA32_VMX_ENTRY_CTLS, &mismatch);
 
@@ -1329,6 +1331,9 @@ static int construct_vmcs(struct vcpu *v)
           | (paging_mode_hap(d) ? 0 : (1U << X86_EXC_PF))
           | (v->arch.fully_eager_fpu ? 0 : (1U << X86_EXC_NM));
 
+    if ( cpu_has_vmx_guest_lbr_ctl )
+        __vmwrite(GUEST_LBR_CTL, 0);
+
     if ( cpu_has_vmx_notify_vm_exiting )
         __vmwrite(NOTIFY_WINDOW, vm_notify_window);
 
@@ -2087,6 +2092,8 @@ void vmcs_dump_vcpu(struct vcpu *v)
            vmr32(GUEST_PREEMPTION_TIMER), vmr32(GUEST_SMBASE));
     printk("DebugCtl = 0x%016lx  DebugExceptions = 0x%016lx\n",
            vmr(GUEST_IA32_DEBUGCTL), vmr(GUEST_PENDING_DBG_EXCEPTIONS));
+    if ( vmentry_ctl & VM_ENTRY_LOAD_GUEST_LBR_CTL )
+        printk("LbrCtl = 0x%016lx\n", vmr(GUEST_LBR_CTL));
     if ( vmentry_ctl & (VM_ENTRY_LOAD_PERF_GLOBAL_CTRL | VM_ENTRY_LOAD_BNDCFGS) )
         printk("PerfGlobCtl = 0x%016lx  BndCfgS = 0x%016lx\n",
                vmr(GUEST_PERF_GLOBAL_CTRL), vmr(GUEST_BNDCFGS));
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 9f1e9d515f..c706f01d79 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -48,6 +48,7 @@
 #include
 #include
 #include
+#include
 #include
 
 static bool __initdata opt_force_ept;
@@ -773,6 +774,67 @@ void vmx_update_exception_bitmap(struct vcpu *v)
     __vmwrite(EXCEPTION_BITMAP, bitmap);
 }
 
+static void cf_check vmx_set_lbr_depth(struct vcpu *v,
+                                       uint32_t depth)
+{
+    struct cpu_policy *cp = v->domain->arch.cpu_policy;
+    bool host_lip, guest_lip;
+    uint32_t i;
+
+    if ( !cp->feat.arch_lbr )
+        return;
+
+    ASSERT(depth != 0 &&
+           depth <= NUM_MSR_ARCH_LBR_FROM_TO &&
+           depth % 8 == 0);
+    ASSERT(cp->basic.lbr_1Ca.supported_depths & ((1u << (depth / 8)) - 1));
+
+    host_lip = current_cpu_has_lbr_lip;
+    guest_lip = !!cp->basic.lbr_1Ca.ip_contains_lip;
+
+    if ( v->arch.msrs->lbr_depth.raw == depth &&
+         v->arch.hvm.vmx.last_host_lip == host_lip )
+        return;
+
+    if ( host_lip != guest_lip )
+    {
+        for ( i = 0; i
[RFC PATCH v2 02/10] x86: Define arch LBR feature bits
Add three featureset words corresponding to the 3 CPUID words in leaf
0x1c.

Intel SDM states that CPUID may indicate an LBR depth of up to 64.
However, since the XSAVE LBR state only goes up to 32 LBRs, don't expose
the other CPUID depth bits for now.

Signed-off-by: Tu Dinh
---
 xen/arch/x86/include/asm/cpufeature.h       |  5 ++
 xen/include/public/arch-x86/cpufeatureset.h | 28 ++-
 xen/include/xen/lib/x86/cpu-policy.h        | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 xen/lib/x86/cpuid.c                         |  6 +++
 4 files changed, 88 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/include/asm/cpufeature.h b/xen/arch/x86/include/asm/cpufeature.h
index 3a06b6f297..4323ffb8cb 100644
--- a/xen/arch/x86/include/asm/cpufeature.h
+++ b/xen/arch/x86/include/asm/cpufeature.h
@@ -219,6 +219,11 @@ static inline bool boot_cpu_has(unsigned int feat)
 #define cpu_has_rfds_no         boot_cpu_has(X86_FEATURE_RFDS_NO)
 #define cpu_has_rfds_clear      boot_cpu_has(X86_FEATURE_RFDS_CLEAR)
 
+/* CPUID level 0x0000001c.eax */
+
+#define current_cpu_has_lbr_lip cpu_has(&current_cpu_data, \
+                                        X86_FEATURE_LBR_LIP)
+
 /* Synthesized. */
 #define cpu_has_arch_perfmon    boot_cpu_has(X86_FEATURE_ARCH_PERFMON)
 #define cpu_has_cpuid_faulting  boot_cpu_has(X86_FEATURE_CPUID_FAULTING)
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 8fa3fb711a..86d3e61438 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -284,7 +284,7 @@ XEN_CPUFEATURE(SERIALIZE,     9*32+14) /*A  SERIALIZE insn */
 XEN_CPUFEATURE(HYBRID,        9*32+15) /*   Heterogeneous platform */
 XEN_CPUFEATURE(TSXLDTRK,      9*32+16) /*a  TSX load tracking suspend/resume insns */
 XEN_CPUFEATURE(PCONFIG,       9*32+18) /*   PCONFIG instruction */
-XEN_CPUFEATURE(ARCH_LBR,      9*32+19) /*   Architectural Last Branch Record */
+XEN_CPUFEATURE(ARCH_LBR,      9*32+19) /*s  Architectural Last Branch Record */
 XEN_CPUFEATURE(CET_IBT,       9*32+20) /*   CET - Indirect Branch Tracking */
 XEN_CPUFEATURE(AMX_BF16,      9*32+22) /*   AMX BFloat16 instruction */
 XEN_CPUFEATURE(AVX512_FP16,   9*32+23) /*A  AVX512 FP16 instructions */
@@ -379,6 +379,32 @@ XEN_CPUFEATURE(RFDS_CLEAR, 16*32+28) /*!A| Register File(s) cleared by V
 
 /* Intel-defined CPU features, MSR_ARCH_CAPS 0x0000010a.edx, word 17 */
 
+/* Intel-defined CPU features, CPUID level 0x0000001c.eax, word 18 */
+XEN_CPUFEATURE(LBR_DEPTH_8,    18*32+ 0) /*s  Depth 8 */
+XEN_CPUFEATURE(LBR_DEPTH_16,   18*32+ 1) /*s  Depth 16 */
+XEN_CPUFEATURE(LBR_DEPTH_24,   18*32+ 2) /*s  Depth 24 */
+XEN_CPUFEATURE(LBR_DEPTH_32,   18*32+ 3) /*s  Depth 32 */
+XEN_CPUFEATURE(LBR_DEPTH_40,   18*32+ 4) /*   Depth 40 */
+XEN_CPUFEATURE(LBR_DEPTH_48,   18*32+ 5) /*   Depth 48 */
+XEN_CPUFEATURE(LBR_DEPTH_56,   18*32+ 6) /*   Depth 56 */
+XEN_CPUFEATURE(LBR_DEPTH_64,   18*32+ 7) /*   Depth 64 */
+XEN_CPUFEATURE(LBR_DCST_RST,   18*32+30) /*s  Deep C-state reset */
+XEN_CPUFEATURE(LBR_LIP,        18*32+31) /*!  IP is linear IP */
+
+/* Intel-defined CPU features, CPUID level 0x0000001c.ebx, word 19 */
+XEN_CPUFEATURE(LBR_CPL_FILTER, 19*32+ 0) /*s  CPL filtering */
+XEN_CPUFEATURE(LBR_BR_FILTER,  19*32+ 1) /*s  Branch filtering */
+XEN_CPUFEATURE(LBR_CALL_STACK, 19*32+ 2) /*s  Call stack mode */
+
+/* Intel-defined CPU features, CPUID level 0x0000001c.ecx, word 20 */
+XEN_CPUFEATURE(LBR_MISPRED,    20*32+ 0) /*s  Mispredict mode */
+XEN_CPUFEATURE(LBR_TIMED,      20*32+ 1) /*s  Timed mode */
+XEN_CPUFEATURE(LBR_BR_TYPE,    20*32+ 2) /*s  Branch type */
+XEN_CPUFEATURE(LBR_EVENT_LOG_0, 20*32+16) /*s  Event logging for counter 0 */
+XEN_CPUFEATURE(LBR_EVENT_LOG_1, 20*32+17) /*s  Event logging for counter 1 */
+XEN_CPUFEATURE(LBR_EVENT_LOG_2, 20*32+18) /*s  Event logging for counter 2 */
+XEN_CPUFEATURE(LBR_EVENT_LOG_3, 20*32+19) /*s  Event logging for counter 3 */
+
 #endif /* XEN_CPUFEATURE */
 
 /* Clean up from a default include.  Close the enum (for C). */
diff --git a/xen/include/xen/lib/x86/cpu-policy.h b/xen/include/xen/lib/x86/cpu-policy.h
index f43e1a3b21..f3b331f36c 100644
--- a/xen/include/xen/lib/x86/cpu-policy.h
+++ b/xen/include/xen/lib/x86/cpu-policy.h
@@ -22,6 +22,9 @@
 #define FEATURESET_7d1       15 /* 0x00000007:1.edx    */
 #define FEATURESET_m10Al     16 /* 0x0000010a.eax      */
 #define FEATURESET_m10Ah     17 /* 0x0000010a.edx      */
+#define FEATURESET_1Ca       18 /* 0x0000001c.eax      */
+#define FEATURESET_1Cb       19 /* 0x0000001c.ebx      */
+#define FEATURESET_1Cc       20 /* 0x0000001c.ecx      */
 
 struct cpuid_leaf
 {
@@ -85,7 +88,7 @@ unsigned int x86_cpuid_lookup_vendor(uint32_t ebx, uint32_t ecx, uint32_t edx);
  */
 const char *x86_cpuid_vendor_to_str(unsigned int vendor);
 
-#define CPUID_GUEST_NR_BASIC      (0xdu + 1)
+#define CPUID_GUEST_NR_BASIC      (
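The depth bits deserve a note: per the SDM's definition of leaf 0x1c,
bit n of CPUID.0x1c:EAX[7:0] advertises support for an LBR depth of
8 * (n + 1), which is why the patch keeps only the low four bits
(depths 8..32). A small sketch of the decoding (illustrative only):

    /* Sketch: largest LBR depth advertised in CPUID.0x1c:EAX[7:0]. */
    static unsigned int max_supported_lbr_depth(uint8_t supported_depths)
    {
        unsigned int n, max = 0;

        for ( n = 0; n < 8; n++ )
            if ( supported_depths & (1u << n) )
                max = 8 * (n + 1);  /* bit 0 => 8, bit 3 => 32, bit 7 => 64 */

        return max;
    }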
[RFC PATCH v2 00/10] Virtualize architectural LBRs
Intel model-specific last branch records (LBRs) were replaced by
architectural LBRs (see Chapter 20 of Intel SDM volume 3B). This
patchset implements virtual LBRs for HVM guests using Intel's "load
guest IA32_LBR_CTL" and "clear IA32_LBR_CTL" VMX controls. It
dynamically intercepts accesses to LBR state to translate between
linear and effective IP depending on the current host CPU core type.

The v2 patchset implements LBR state support in Xen's xstate handling.
Additionally, it adds XSAVES/XRSTORS support to the x86 emulator.
Finally, migration is handled by adding a new HVM save code
CPU_XSAVES_CODE containing a vCPU's compacted xstates as written by
XSAVES.

I'm looking for feedback on emulator handling of XSAVES/XRSTORS,
especially concerning FPU bits, as it's not clear to me what should be
done in these cases.

Tu Dinh (10):
  x86: Add architectural LBR definitions
  x86: Define arch LBR feature bits
  tools: Add arch LBR feature bits
  x86: Calculate arch LBR CPUID policies
  x86: Keep a copy of XSAVE area size
  x86: Enable XSTATE save/restore for arch LBR
  x86/hvm: Don't count XSS bits in XSAVE size
  x86/emulate: Implement XSAVES/XRSTORS for arch LBR
  x86/vmx: Implement arch LBR
  x86/hvm: Enable XSAVES LBR save/restore

 tools/libs/light/libxl_cpuid.c              |   3 +
 tools/misc/xen-cpuid.c                      |   3 +
 tools/tests/x86_emulator/x86-emulate.h      |   2 +
 xen/arch/x86/cpu-policy.c                   |  28 +++
 xen/arch/x86/cpu/common.c                   |   7 +
 xen/arch/x86/domain.c                       |   7 +
 xen/arch/x86/hvm/emulate.c                  |  11 +
 xen/arch/x86/hvm/hvm.c                      |  70 +-
 xen/arch/x86/hvm/vmx/vmcs.c                 |  11 +-
 xen/arch/x86/hvm/vmx/vmx.c                  | 203 ++--
 xen/arch/x86/include/asm/cpufeature.h       |   5 +
 xen/arch/x86/include/asm/domain.h           |   1 +
 xen/arch/x86/include/asm/hvm/hvm.h          |   3 +
 xen/arch/x86/include/asm/hvm/vmx/vmcs.h     |  11 +
 xen/arch/x86/include/asm/msr-index.h        |  12 +
 xen/arch/x86/include/asm/msr.h              |   5 +
 xen/arch/x86/include/asm/xstate.h           |  22 +-
 xen/arch/x86/msr.c                          |  89 ++-
 xen/arch/x86/x86_emulate/0fc7.c             | 260 ++--
 xen/arch/x86/x86_emulate/blk.c              | 142 +++
 xen/arch/x86/x86_emulate/private.h          |   8 +
 xen/arch/x86/x86_emulate/util-xen.c         |  14 ++
 xen/arch/x86/x86_emulate/x86_emulate.c      |  19 ++
 xen/arch/x86/x86_emulate/x86_emulate.h      |  33 +++
 xen/arch/x86/xstate.c                       |  83 +-
 xen/include/public/arch-x86/cpufeatureset.h |  28 ++-
 xen/include/public/arch-x86/hvm/save.h      |   4 +-
 xen/include/xen/lib/x86/cpu-policy.h        |  51 +++-
 xen/lib/x86/cpuid.c                         |   6 +
 29 files changed, 1013 insertions(+), 128 deletions(-)

--
2.43.0

Ngoc Tu Dinh | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
[RFC PATCH v2 03/10] tools: Add arch LBR feature bits
Signed-off-by: Tu Dinh
---
 tools/libs/light/libxl_cpuid.c | 3 +++
 tools/misc/xen-cpuid.c         | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/tools/libs/light/libxl_cpuid.c b/tools/libs/light/libxl_cpuid.c
index 063fe86eb7..05be36f258 100644
--- a/tools/libs/light/libxl_cpuid.c
+++ b/tools/libs/light/libxl_cpuid.c
@@ -342,6 +342,9 @@ int libxl_cpuid_parse_config(libxl_cpuid_policy_list *policy, const char* str)
         CPUID_ENTRY(0x00000007, 1, CPUID_REG_EDX),
         MSR_ENTRY(0x10a, CPUID_REG_EAX),
         MSR_ENTRY(0x10a, CPUID_REG_EDX),
+        CPUID_ENTRY(0x0000001C, NA, CPUID_REG_EAX),
+        CPUID_ENTRY(0x0000001C, NA, CPUID_REG_EBX),
+        CPUID_ENTRY(0x0000001C, NA, CPUID_REG_ECX),
 #undef MSR_ENTRY
 #undef CPUID_ENTRY
     };
diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
index 4c4593528d..4f0fb0a6ea 100644
--- a/tools/misc/xen-cpuid.c
+++ b/tools/misc/xen-cpuid.c
@@ -37,6 +37,9 @@ static const struct {
     { "CPUID 0x00000007:1.edx", "7d1" },
     { "MSR_ARCH_CAPS.lo",       "m10Al" },
     { "MSR_ARCH_CAPS.hi",       "m10Ah" },
+    { "CPUID 0x0000001c.eax",   "1Ca" },
+    { "CPUID 0x0000001c.ebx",   "1Cb" },
+    { "CPUID 0x0000001c.ecx",   "1Cc" },
 };
 
 #define COL_ALIGN "24"
--
2.43.0

Ngoc Tu Dinh | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
[RFC PATCH v2 06/10] x86: Enable XSTATE save/restore for arch LBR
Add a function get_xstate_component_comp() to allow fetching a specific
XSTATE component from a compressed image.

Also add LBR state declarations in xstate.h.

Signed-off-by: Tu Dinh
---
 xen/arch/x86/include/asm/xstate.h | 22 ++++++++++++++++++++-
 xen/arch/x86/msr.c                |  3 +-
 xen/arch/x86/xstate.c             | 79 +++++++++++++++++++++++--------
 3 files changed, 79 insertions(+), 25 deletions(-)

diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h
index 07017cc4ed..cc77f599d7 100644
--- a/xen/arch/x86/include/asm/xstate.h
+++ b/xen/arch/x86/include/asm/xstate.h
@@ -33,13 +33,13 @@ extern uint32_t mxcsr_mask;
 #define XSTATE_FP_SSE  (X86_XCR0_X87 | X86_XCR0_SSE)
 #define XCNTXT_MASK    (X86_XCR0_X87 | X86_XCR0_SSE | X86_XCR0_YMM | \
                         X86_XCR0_OPMASK | X86_XCR0_ZMM | X86_XCR0_HI_ZMM | \
-                        XSTATE_NONLAZY)
+                        XSTATE_NONLAZY | XSTATE_XSAVES_ONLY)
 
 #define XSTATE_ALL     (~(1ULL << 63))
 #define XSTATE_NONLAZY (X86_XCR0_BNDREGS | X86_XCR0_BNDCSR | X86_XCR0_PKRU | \
                         X86_XCR0_TILE_CFG | X86_XCR0_TILE_DATA)
 #define XSTATE_LAZY    (XSTATE_ALL & ~XSTATE_NONLAZY)
-#define XSTATE_XSAVES_ONLY         0
+#define XSTATE_XSAVES_ONLY         (X86_XSS_LBR)
 #define XSTATE_COMPACTION_ENABLED  (1ULL << 63)
 
 #define XSTATE_XSS     (1U << 0)
@@ -91,6 +91,21 @@ struct xstate_bndcsr {
     uint64_t bndstatus;
 };
 
+struct xstate_lbr_entry {
+    uint64_t lbr_from_ip;
+    uint64_t lbr_to_ip;
+    uint64_t lbr_info;
+};
+
+struct xstate_lbr {
+    uint64_t lbr_ctl;
+    uint64_t lbr_depth;
+    uint64_t ler_from_ip;
+    uint64_t ler_to_ip;
+    uint64_t ler_info;
+    struct xstate_lbr_entry entries[32];
+};
+
 /* extended state operations */
 bool __must_check set_xcr0(u64 xfeatures);
 uint64_t get_xcr0(void);
@@ -114,6 +129,9 @@ int xstate_alloc_save_area(struct vcpu *v);
 void xstate_init(struct cpuinfo_x86 *c);
 unsigned int xstate_uncompressed_size(uint64_t xcr0);
 unsigned int xstate_compressed_size(uint64_t xstates);
+void *get_xstate_component_comp(struct xsave_struct *xstate,
+                                unsigned int size,
+                                uint64_t component);
 
 static inline uint64_t xgetbv(unsigned int index)
 {
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 289cf10b78..68a419ac8e 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -522,8 +522,7 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
         if ( !cp->xstate.xsaves )
             goto gp_fault;
 
-        /* No XSS features currently supported for guests */
-        if ( val != 0 )
+        if ( val & ~(uint64_t)XSTATE_XSAVES_ONLY )
             goto gp_fault;
 
         msrs->xss.raw = val;
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index baae8e1a13..607bf9c8a5 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -18,13 +18,16 @@
 #include
 
 /*
- * Maximum size (in byte) of the XSAVE/XRSTOR save area required by all
+ * Maximum size (in byte) of the XSAVE(S)/XRSTOR(S) save area required by all
  * the supported and enabled features on the processor, including the
  * XSAVE.HEADER. We only enable XCNTXT_MASK that we have known.
  */
 static u32 __read_mostly xsave_cntxt_size;
 
-/* A 64-bit bitmask of the XSAVE/XRSTOR features supported by processor. */
+/*
+ * A 64-bit bitmask of the XSAVE(S)/XRSTOR(S) features supported by
+ * processor.
+ */
 u64 __read_mostly xfeature_mask;
 
 unsigned int *__read_mostly xstate_offsets;
@@ -126,7 +129,8 @@ static int setup_xstate_features(bool bsp)
             cpuid_count(XSTATE_CPUID, leaf, &eax, &ebx, &ecx, &edx);
 
             BUG_ON(eax != xstate_sizes[leaf]);
-            BUG_ON(ebx != xstate_offsets[leaf]);
+            if ( (1ul << leaf) & X86_XCR0_STATES )
+                BUG_ON(ebx != xstate_offsets[leaf]);
             BUG_ON(!(ecx & XSTATE_ALIGN64) != !test_bit(leaf, &xstate_align));
         }
     }
@@ -210,7 +214,7 @@ void expand_xsave_states(const struct vcpu *v, void *dest, unsigned int size)
      * non-compacted offset.
      */
     src = xstate;
-    valid = xstate_bv & ~XSTATE_FP_SSE;
+    valid = xstate_bv & ~XSTATE_FP_SSE & ~X86_XSS_STATES;
     while ( valid )
     {
         u64 feature = valid & -valid;
@@ -276,7 +280,7 @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size)
      * possibly compacted offset.
      */
     dest = xstate;
-    valid = xstate_bv & ~XSTATE_FP_SSE;
+    valid = xstate_bv & ~XSTATE_FP_SSE & ~X86_XSS_STATES;
     while ( valid )
     {
         u64 feature = valid & -valid;
@@ -516,7 +520,7 @@ int xstate_alloc_save_area(struct vcpu *v)
          */
         size = XSTATE_AREA_MIN_SIZE;
     }
-    else if ( !is_idle_vcpu(v) || !cpu_has_xsavec )
+    else if ( !is_idle_vcpu(v)
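The offset arithmetic a helper like get_xstate_component_comp() has to
perform follows the SDM's compacted-format rules: components 0 and 1
live in the 512-byte legacy area, the first XSAVES component starts
right after the 64-byte XSAVE header, and any component whose align-64
attribute is set is rounded up to a 64-byte boundary. A sketch under
those rules, with sizes[]/align64[] standing in for Xen's
xstate_sizes/xstate_align tables filled from CPUID leaf 0xd:

    /* Sketch: offset of component "comp" in a compacted XSAVE image. */
    static unsigned int compacted_offset(uint64_t xcomp_bv, unsigned int comp,
                                         const unsigned int sizes[],
                                         const bool align64[])
    {
        unsigned int i, off = 512 + 64;   /* legacy region + XSAVE header */

        for ( i = 2; i < comp; i++ )      /* 0/1 are in the legacy area */
        {
            if ( !(xcomp_bv & (1ull << i)) )
                continue;                 /* absent components take no space */
            if ( align64[i] )
                off = (off + 63) & ~63u;
            off += sizes[i];
        }

        if ( align64[comp] )
            off = (off + 63) & ~63u;

        return off;
    }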
[RFC PATCH v2 08/10] x86/emulate: Implement XSAVES/XRSTORS for arch LBR
Add a new set_lbr_depth HVM function and emulate ops to support LBR
XSAVES/XRSTORS emulation. Add the appropriate instruction hooks to
0fc7.c.

Translate LBR registers using cs.base within a large block emulator
operation.

Signed-off-by: Tu Dinh
---
 tools/tests/x86_emulator/x86-emulate.h |   2 +
 xen/arch/x86/hvm/emulate.c             |  11 ++
 xen/arch/x86/include/asm/hvm/hvm.h     |   3 +
 xen/arch/x86/x86_emulate/0fc7.c        | 260 ++--
 xen/arch/x86/x86_emulate/blk.c         | 142 +++
 xen/arch/x86/x86_emulate/private.h     |   8 +
 xen/arch/x86/x86_emulate/util-xen.c    |  14 ++
 xen/arch/x86/x86_emulate/x86_emulate.c |  19 ++
 xen/arch/x86/x86_emulate/x86_emulate.h |  33 +++
 9 files changed, 422 insertions(+), 70 deletions(-)

diff --git a/tools/tests/x86_emulator/x86-emulate.h b/tools/tests/x86_emulator/x86-emulate.h
index 929c1a72ae..75a9a65ae7 100644
--- a/tools/tests/x86_emulator/x86-emulate.h
+++ b/tools/tests/x86_emulator/x86-emulate.h
@@ -218,6 +218,8 @@ void wrpkru(unsigned int val);
 #define cpu_has_fma4       (cpu_policy.extd.fma4 && xcr0_mask(6))
 #define cpu_has_tbm        cpu_policy.extd.tbm
 
+#define current_cpu_has_lbr_lip cpu_policy.basic.lbr_1Ca.ip_contains_lip
+
 int emul_test_cpuid(
     uint32_t leaf,
     uint32_t subleaf,
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index a1935a1748..c3b0bd4cbe 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -2562,6 +2562,16 @@ static int cf_check hvmemul_vmfunc(
     return rc;
 }
 
+static int cf_check hvmemul_set_lbr_depth(
+    uint32_t depth,
+    struct x86_emulate_ctxt *ctxt)
+{
+    if ( !hvm_funcs.set_lbr_depth )
+        return X86EMUL_UNHANDLEABLE;
+    alternative_vcall(hvm_funcs.set_lbr_depth, current, depth);
+    return X86EMUL_OKAY;
+}
+
 static const struct x86_emulate_ops hvm_emulate_ops = {
     .read          = hvmemul_read,
     .insn_fetch    = hvmemul_insn_fetch,
@@ -2590,6 +2600,7 @@ static const struct x86_emulate_ops hvm_emulate_ops = {
     .get_fpu       = hvmemul_get_fpu,
     .put_fpu       = hvmemul_put_fpu,
     .vmfunc        = hvmemul_vmfunc,
+    .set_lbr_depth = hvmemul_set_lbr_depth,
 };
 
 static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
index cad3a94278..bfce78952f 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -238,6 +238,9 @@ struct hvm_function_table {
     int (*vmtrace_get_option)(struct vcpu *v, uint64_t key, uint64_t *value);
     int (*vmtrace_reset)(struct vcpu *v);
 
+    /* Arch LBR */
+    void (*set_lbr_depth)(struct vcpu *v, uint32_t depth);
+
     uint64_t (*get_reg)(struct vcpu *v, unsigned int reg);
     void (*set_reg)(struct vcpu *v, unsigned int reg, uint64_t val);
 
diff --git a/xen/arch/x86/x86_emulate/0fc7.c b/xen/arch/x86/x86_emulate/0fc7.c
index 5268d5cafd..bb2b308afe 100644
--- a/xen/arch/x86/x86_emulate/0fc7.c
+++ b/xen/arch/x86/x86_emulate/0fc7.c
@@ -10,6 +10,10 @@
 
 #include "private.h"
 
+#if defined(__XEN__) && !defined(X86EMUL_NO_FPU)
+# include
+#endif
+
 /* Avoid namespace pollution. */
 #undef cmpxchg
 
@@ -107,87 +111,203 @@ int x86emul_0fc7(struct x86_emulate_state *s,
     else
     {
-        union {
-            uint32_t u32[2];
-            uint64_t u64[2];
-        } *old, *aux;
-
-        /* cmpxchg8b/cmpxchg16b */
-        generate_exception_if((s->modrm_reg & 7) != 1, X86_EXC_UD);
-        fail_if(!ops->cmpxchg);
-        if ( s->rex_prefix & REX_W )
-        {
-            host_and_vcpu_must_have(cx16);
-            generate_exception_if(!is_aligned(s->ea.mem.seg, s->ea.mem.off, 16,
-                                              ctxt, ops),
-                                  X86_EXC_GP, 0);
-            s->op_bytes = 16;
-        }
-        else
+        switch ( s->modrm_reg & 7 )
         {
-            vcpu_must_have(cx8);
-            s->op_bytes = 8;
-        }
+        default:
+            return X86EMUL_UNRECOGNIZED;
 
-        old = container_of(&mmvalp->ymm[0], typeof(*old), u64[0]);
-        aux = container_of(&mmvalp->ymm[2], typeof(*aux), u64[0]);
+        case 1: /* cmpxchg8b/cmpxchg16b */
+        {
+            union {
+                uint32_t u32[2];
+                uint64_t u64[2];
+            } *old, *aux;
 
-        /* Get actual old value. */
-        if ( (rc = ops->read(s->ea.mem.seg, s->ea.mem.off, old, s->op_bytes,
-                             ctxt)) != X86EMUL_OKAY )
-            goto done;
+            fail_if(!ops->cmpxchg);
+            if ( s->rex_prefix & REX_W )
+            {
+                host_and_vcpu_must_have(cx16);
+                generate_exception_if(!is_aligned(s
[RFC PATCH v2 10/10] x86/hvm: Enable XSAVES LBR save/restore
Add a new save code type CPU_XSAVES_CODE containing a compressed XSAVES
image.

Signed-off-by: Tu Dinh
---
 xen/arch/x86/hvm/hvm.c                 | 67 +++++++++++++++++++++------
 xen/arch/x86/xstate.c                  |  3 +-
 xen/include/public/arch-x86/hvm/save.h |  4 +-
 3 files changed, 60 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c7b93c7d91..e5a50d9fca 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1238,6 +1238,36 @@ static int cf_check hvm_save_cpu_xsave_states(
     return 0;
 }
 
+#define HVM_CPU_XSAVES_SIZE(xcr0) (offsetof(struct hvm_hw_cpu_xsave, \
+                                            save_area) + \
+                                   xstate_compressed_size(xcr0))
+
+static int cf_check hvm_save_cpu_xsaves_states(
+    struct vcpu *v, hvm_domain_context_t *h)
+{
+    struct hvm_hw_cpu_xsave *ctxt;
+    unsigned int size;
+    int err;
+
+    if ( !xsave_enabled(v) )
+        return 0;   /* do nothing */
+
+    size = HVM_CPU_XSAVES_SIZE(v->arch.xcr0_accum);
+    err = _hvm_init_entry(h, CPU_XSAVES_CODE, v->vcpu_id, size);
+    if ( err )
+        return err;
+
+    ctxt = (struct hvm_hw_cpu_xsave *)&h->data[h->cur];
+    h->cur += size;
+    ctxt->xfeature_mask = xfeature_mask;
+    ctxt->xcr0 = v->arch.xcr0;
+    ctxt->xcr0_accum = v->arch.xcr0_accum;
+
+    memcpy(&ctxt->save_area, v->arch.xsave_area, size);
+
+    return 0;
+}
+
 /*
  * Structure layout conformity checks, documenting correctness of the cast in
  * the invocation of validate_xstate() below.
@@ -1311,6 +1341,10 @@ static int cf_check hvm_load_cpu_xsave_states(
     ctxt = (struct hvm_hw_cpu_xsave *)&h->data[h->cur];
     h->cur += desc->length;
 
+    if ( !cpu_has_xsaves &&
+         xsave_area_compressed((const void *)&ctxt->save_area) )
+        return -EOPNOTSUPP;
+
     err = validate_xstate(d, ctxt->xcr0, ctxt->xcr0_accum,
                           (const void *)&ctxt->save_area.xsave_hdr);
     if ( err )
@@ -1322,7 +1356,10 @@ static int cf_check hvm_load_cpu_xsave_states(
                ctxt->xcr0, ctxt->save_area.xsave_hdr.xstate_bv, err);
         return err;
     }
-    size = HVM_CPU_XSAVE_SIZE(ctxt->xcr0_accum);
+    if ( xsave_area_compressed((const void *)&ctxt->save_area) )
+        size = HVM_CPU_XSAVES_SIZE(ctxt->xcr0_accum);
+    else
+        size = HVM_CPU_XSAVE_SIZE(ctxt->xcr0_accum);
     desc_length = desc->length;
     if ( desc_length > size )
     {
@@ -1348,14 +1385,7 @@ static int cf_check hvm_load_cpu_xsave_states(
         desc_length = size;
     }
 
-    if ( xsave_area_compressed((const void *)&ctxt->save_area) )
-    {
-        printk(XENLOG_G_WARNING
-               "HVM%d.%u restore: compressed xsave state not supported\n",
-               d->domain_id, vcpuid);
-        return -EOPNOTSUPP;
-    }
-    else if ( desc_length != size )
+    if ( desc_length != size )
     {
         printk(XENLOG_G_WARNING
                "HVM%d.%u restore mismatch: xsave length %#x != %#x\n",
@@ -1367,8 +1397,13 @@ static int cf_check hvm_load_cpu_xsave_states(
     v->arch.xcr0 = ctxt->xcr0;
     v->arch.xcr0_accum = ctxt->xcr0_accum;
     v->arch.nonlazy_xstate_used = ctxt->xcr0_accum & XSTATE_NONLAZY;
-    compress_xsave_states(v, &ctxt->save_area,
-                          size - offsetof(struct hvm_hw_cpu_xsave, save_area));
+    if ( xsave_area_compressed((const void *)&ctxt->save_area) )
+        memcpy(v->arch.xsave_area, &ctxt->save_area,
+               size - offsetof(struct hvm_hw_cpu_xsave, save_area));
+    else
+        compress_xsave_states(v, &ctxt->save_area,
+                              size - offsetof(struct hvm_hw_cpu_xsave,
+                                              save_area));
 
     return 0;
 }
@@ -1385,6 +1420,7 @@ static const uint32_t msrs_to_send[] = {
     MSR_AMD64_DR1_ADDRESS_MASK,
     MSR_AMD64_DR2_ADDRESS_MASK,
     MSR_AMD64_DR3_ADDRESS_MASK,
+    MSR_LBR_DEPTH,
 };
 
 static int cf_check hvm_save_cpu_msrs(struct vcpu *v, hvm_domain_context_t *h)
@@ -1572,6 +1608,15 @@ static int __init cf_check hvm_register_CPU_save_and_restore(void)
                         sizeof(struct hvm_save_descriptor),
                         HVMSR_PER_VCPU);
 
+    hvm_register_savevm(CPU_XSAVES_CODE,
+                        "CPU_XSAVES",
+                        hvm_save_cpu_xsaves_states,
+                        NULL,
+                        hvm_load_cpu_xsave_states,
+                        HVM_CPU_XSAVES_SIZE(xfeature_mask) +
+                            sizeof(struct hvm_save_descriptor),
+                        HVMSR_PER_VCPU);
+
     hvm_register_savevm(CPU_MSR_CODE,
                         "CPU_MSR",
                         hvm_save_cpu_msrs,
diff --git a/xen/arc
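For context, the xsave_area_compressed() checks sprinkled through the
load path above boil down to one header test: a compacted-format image
sets bit 63 of XCOMP_BV. A sketch, assuming Xen's struct xsave_struct
layout and the XSTATE_COMPACTION_ENABLED constant (1ULL << 63):

    /* Sketch: distinguish compacted from standard-format images. */
    static bool image_is_compressed(const struct xsave_struct *xs)
    {
        return xs->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED;
    }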
[RFC PATCH v2 04/10] x86: Calculate arch LBR CPUID policies
Ensure that the arch LBR feature and its dependents are disabled if any
prerequisites are not available.

Signed-off-by: Tu Dinh
---
 xen/arch/x86/cpu-policy.c | 28 ++++++++++++++++++++++++++++
 xen/arch/x86/cpu/common.c |  7 +++++++
 2 files changed, 35 insertions(+)

diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index 78bc9872b0..b1398b2e3c 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -190,6 +190,16 @@ static void sanitise_featureset(uint32_t *fs)
     }
 }
 
+static void recalculate_arch_lbr(struct cpu_policy *p)
+{
+    if ( p->basic.max_leaf < 0x1c ||
+         !(cpu_policy_xstates(&host_cpu_policy) & X86_XSS_LBR) ||
+         p->basic.lbr_1Ca.supported_depths == 0 )
+        p->feat.arch_lbr = 0;
+    if ( !p->feat.arch_lbr )
+        p->basic.raw[0x1c] = EMPTY_LEAF;
+}
+
 static void recalculate_xstate(struct cpu_policy *p)
 {
     uint64_t xstates = XSTATE_FP_SSE;
@@ -219,6 +229,9 @@ static void recalculate_xstate(struct cpu_policy *p)
     if ( p->feat.amx_tile )
         xstates |= X86_XCR0_TILE_CFG | X86_XCR0_TILE_DATA;
 
+    if ( p->feat.arch_lbr )
+        xstates |= X86_XSS_LBR;
+
     /* Subleaf 0 */
     p->xstate.max_size =
         xstate_uncompressed_size(xstates & ~XSTATE_XSAVES_ONLY);
@@ -271,6 +284,8 @@ static void recalculate_misc(struct cpu_policy *p)
 
     p->basic.raw[0xc] = EMPTY_LEAF;
 
+    zero_leaves(p->basic.raw, 0xe, 0x1b);
+
     p->extd.e1d &= ~CPUID_COMMON_1D_FEATURES;
 
     /* Most of Power/RAS hidden from guests. */
@@ -630,6 +645,7 @@ static void __init calculate_pv_max_policy(void)
     sanitise_featureset(fs);
     x86_cpu_featureset_to_policy(fs, p);
+    recalculate_arch_lbr(p);
     recalculate_xstate(p);
 
     p->extd.raw[0xa] = EMPTY_LEAF; /* No SVM for PV guests. */
@@ -670,6 +686,7 @@ static void __init calculate_pv_def_policy(void)
     }
 
     x86_cpu_featureset_to_policy(fs, p);
+    recalculate_arch_lbr(p);
     recalculate_xstate(p);
 }
 
@@ -755,6 +772,14 @@ static void __init calculate_hvm_max_policy(void)
 
         if ( !cpu_has_vmx_xsaves )
             __clear_bit(X86_FEATURE_XSAVES, fs);
+
+        /*
+         * VMX bitmap is needed for passing through LBR info MSRs.
+         * Require it for virtual arch LBR.
+         */
+        if ( !cpu_has_vmx_guest_lbr_ctl || !cpu_has_vmx_msr_bitmap ||
+             !cpu_has_vmx_xsaves )
+            __clear_bit(X86_FEATURE_ARCH_LBR, fs);
     }
 
     /*
@@ -787,6 +812,7 @@ static void __init calculate_hvm_max_policy(void)
     sanitise_featureset(fs);
     x86_cpu_featureset_to_policy(fs, p);
+    recalculate_arch_lbr(p);
     recalculate_xstate(p);
 
     /* It's always possible to emulate CPUID faulting for HVM guests */
@@ -839,6 +865,7 @@ static void __init calculate_hvm_def_policy(void)
     }
 
     x86_cpu_featureset_to_policy(fs, p);
+    recalculate_arch_lbr(p);
     recalculate_xstate(p);
 }
 
@@ -971,6 +998,7 @@ void recalculate_cpuid_policy(struct domain *d)
 
     p->extd.maxlinaddr = p->extd.lm ? 48 : 32;
 
+    recalculate_arch_lbr(p);
     recalculate_xstate(p);
     recalculate_misc(p);
 
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 067d855bad..0056b55457 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -505,6 +505,13 @@ static void generic_identify(struct cpuinfo_x86 *c)
 		      &c->x86_capability[FEATURESET_Da1],
 		      &tmp, &tmp, &tmp);
 
+	if (c->cpuid_level >= 0x1c)
+		cpuid(0x1c,
+		      &c->x86_capability[FEATURESET_1Ca],
+		      &c->x86_capability[FEATURESET_1Cb],
+		      &c->x86_capability[FEATURESET_1Cc],
+		      &tmp);
+
 	if (test_bit(X86_FEATURE_ARCH_CAPS, c->x86_capability))
 		rdmsr(MSR_ARCH_CAPABILITIES,
 		      c->x86_capability[FEATURESET_m10Al],
--
2.43.0

Ngoc Tu Dinh | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
[RFC PATCH v2 05/10] x86: Keep a copy of XSAVE area size
Signed-off-by: Tu Dinh
---
 xen/arch/x86/include/asm/domain.h | 1 +
 xen/arch/x86/xstate.c             | 1 +
 2 files changed, 2 insertions(+)

diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h
index b79d6badd7..d3f2695c20 100644
--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -638,6 +638,7 @@ struct arch_vcpu
      *     #NM handler, we XRSTOR the states we XSAVE-ed;
      */
     struct xsave_struct *xsave_area;
+    unsigned int xsave_area_size;
     uint64_t xcr0;
     /* Accumulated eXtended features mask for using XSAVE/XRESTORE by Xen
      * itself, as we can never know whether guest OS depends on content
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index af9e345a7a..baae8e1a13 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -550,6 +550,7 @@ int xstate_alloc_save_area(struct vcpu *v)
     save_area->fpu_sse.mxcsr = MXCSR_DEFAULT;
 
     v->arch.xsave_area = save_area;
+    v->arch.xsave_area_size = size;
     v->arch.xcr0 = 0;
     v->arch.xcr0_accum = 0;
 
--
2.43.0

Ngoc Tu Dinh | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
[RFC PATCH v2 07/10] x86/hvm: Don't count XSS bits in XSAVE size
HVM vCPU state images are uncompressed and therefore can't contain XSS
states.

Signed-off-by: Tu Dinh
---
 xen/arch/x86/hvm/hvm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 922c9b3af6..c7b93c7d91 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1208,7 +1208,8 @@ HVM_REGISTER_SAVE_RESTORE(CPU, hvm_save_cpu_ctxt, NULL, hvm_load_cpu_ctxt, 1,
 
 #define HVM_CPU_XSAVE_SIZE(xcr0) (offsetof(struct hvm_hw_cpu_xsave, \
                                            save_area) + \
-                                  xstate_uncompressed_size(xcr0))
+                                  xstate_uncompressed_size(xcr0 & \
+                                                           ~X86_XSS_STATES))
 
 static int cf_check hvm_save_cpu_xsave_states(
     struct vcpu *v, hvm_domain_context_t *h)
--
2.43.0

Ngoc Tu Dinh | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
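The reason XSS bits must be masked out here: the uncompressed XSAVE
layout only has architecturally fixed offsets for XCR0-managed
components, published per component in CPUID leaf 0xd (EBX = offset,
EAX = size); XSS-managed components such as LBR have no uncompressed
offset at all. A sketch of the size computation this implies, using the
usual cpuid_count() helper:

    /* Sketch: uncompressed size = max(offset + size) over set XCR0 bits. */
    static unsigned int uncompressed_size(uint64_t xcr0)
    {
        unsigned int size = 512 + 64;   /* legacy area + XSAVE header */
        unsigned int i, eax, ebx, ecx, edx;

        for ( i = 2; i < 63; i++ )
        {
            if ( !(xcr0 & (1ull << i)) )
                continue;
            cpuid_count(0xd, i, &eax, &ebx, &ecx, &edx);
            if ( ebx + eax > size )
                size = ebx + eax;
        }

        return size;
    }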
[PATCH 3/4] x86: Adjust arch LBR CPU policy
From: Tu Dinh

Allow virtual arch LBR with a single depth that's equal to that of the
host. If this is not possible, disable arch LBR altogether.

Signed-off-by: Tu Dinh
---
 xen/arch/x86/cpu-policy.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index cf6b212fb6..2ac76eb058 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -638,6 +638,36 @@ static void __init calculate_pv_max_policy(void)
     p->extd.raw[0xa] = EMPTY_LEAF; /* No SVM for PV guests. */
 }
 
+/*
+ * Allow virtual arch LBR with a single depth that's equal to that of the
+ * host. If this is not possible, disable arch LBR altogether.
+ */
+static void adjust_arch_lbr_depth(uint32_t fs[FEATURESET_NR_ENTRIES])
+{
+    uint64_t host_lbr_depth;
+    bool lbr_supported = true;
+
+    rdmsrl(MSR_IA32_LASTBRANCH_DEPTH, host_lbr_depth);
+    if ((host_lbr_depth == 0) ||
+        (host_lbr_depth % 8) ||
+        (host_lbr_depth > 64))
+        lbr_supported = false;
+
+    host_lbr_depth = 1ul << ((host_lbr_depth / 8) - 1);
+    if ((host_lbr_depth & fs[FEATURESET_1Ca] & 0xff) == 0)
+        lbr_supported = false;
+
+    if (lbr_supported)
+    {
+        fs[FEATURESET_1Ca] = (fs[FEATURESET_1Ca] & ~0xffu) | host_lbr_depth;
+    }
+    else
+    {
+        __clear_bit(X86_FEATURE_ARCH_LBR, fs);
+        fs[FEATURESET_1Ca] = fs[FEATURESET_1Cb] = fs[FEATURESET_1Cc] = 0;
+    }
+}
+
 static void __init calculate_pv_def_policy(void)
 {
     struct cpu_policy *p = &pv_def_cpu_policy;
@@ -760,6 +790,9 @@ static void __init calculate_hvm_max_policy(void)
             __clear_bit(X86_FEATURE_XSAVES, fs);
     }
 
+    if ( test_bit(X86_FEATURE_ARCH_LBR, fs) )
+        adjust_arch_lbr_depth(fs);
+
     /*
      * Xen doesn't use PKS, so the guest support for it has opted to not use
      * the VMCS load/save controls for efficiency reasons. This depends on
--
2.43.0

Ngoc Tu Dinh | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
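The depth-to-bitmask conversion in adjust_arch_lbr_depth() is easier to
check with concrete numbers (illustration only): MSR_IA32_LASTBRANCH_DEPTH
holds a record count, and CPUID.0x1c:EAX[7:0] carries one bit per
multiple of 8, so a host depth of 32 maps to bit 3:

    /* Sketch: map a host LBR depth to its CPUID.0x1c:EAX[7:0] bit. */
    static uint32_t depth_to_cpuid_bit(uint64_t depth)
    {
        if ( depth == 0 || depth % 8 || depth > 64 )
            return 0;                    /* invalid depth: no bit */

        return 1u << (depth / 8 - 1);    /* 8 -> bit 0, 32 -> bit 3 */
    }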
[PATCH 2/4] x86: Add architectural LBR declarations
From: Tu Dinh

Signed-off-by: Tu Dinh
---
 xen/arch/x86/include/asm/msr-index.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/xen/arch/x86/include/asm/msr-index.h b/xen/arch/x86/include/asm/msr-index.h
index 9cdb5b2625..867deab3c6 100644
--- a/xen/arch/x86/include/asm/msr-index.h
+++ b/xen/arch/x86/include/asm/msr-index.h
@@ -304,6 +304,17 @@
 #define MSR_IA32_LASTINTFROMIP              0x000001dd
 #define MSR_IA32_LASTINTTOIP                0x000001de
 
+/* Architectural LBR state MSRs */
+#define MSR_IA32_LASTBRANCH_CTL             0x000014ce
+#define  LASTBRANCH_CTL_LBREN               (1<<0) /* Enable LBR recording */
+#define  LASTBRANCH_CTL_VALID               _AC(0x7f000f, ULL)
+#define MSR_IA32_LASTBRANCH_DEPTH           0x000014cf
+#define MSR_IA32_LER_INFO                   0x000001e0
+#define MSR_IA32_LASTBRANCH_0_INFO          0x00001200
+#define MSR_IA32_LASTBRANCH_0_FROM_IP       0x00001500
+#define MSR_IA32_LASTBRANCH_0_TO_IP         0x00001600
+#define MAX_MSR_ARCH_LASTBRANCH_FROM_TO     64
+
 #define MSR_IA32_POWER_CTL                  0x000001fc
 
 #define MSR_IA32_MTRR_PHYSBASE(n)           (0x00000200 + 2 * (n))
--
2.43.0

Ngoc Tu Dinh | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
[PATCH 4/4] x86/vmx: Virtualize architectural LBRs
From: Tu Dinh

Virtual architectural LBRs work in guest mode only, using the "load
guest IA32_LBR_CTL" and "clear IA32_LBR_CTL" VMX controls. Intercept
writes to MSR_IA32_LASTBRANCH_{CTL,DEPTH} to inject LBR MSRs into the
guest. MSR_IA32_LASTBRANCH_DEPTH is only allowed to be equal to the
host's.

Signed-off-by: Tu Dinh
---
 xen/arch/x86/cpu-policy.c               |   3 +
 xen/arch/x86/hvm/vmx/vmcs.c             |  11 +-
 xen/arch/x86/hvm/vmx/vmx.c              | 269 +---
 xen/arch/x86/include/asm/hvm/vmx/vmcs.h |   8 +
 4 files changed, 211 insertions(+), 80 deletions(-)

diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index 2ac76eb058..9e78273a79 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -788,6 +788,9 @@ static void __init calculate_hvm_max_policy(void)
 
         if ( !cpu_has_vmx_xsaves )
             __clear_bit(X86_FEATURE_XSAVES, fs);
+
+        if ( !cpu_has_vmx_guest_lbr_ctl )
+            __clear_bit(X86_FEATURE_ARCH_LBR, fs);
     }
 
     if ( test_bit(X86_FEATURE_ARCH_LBR, fs) )
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 147e998371..a16daad78a 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -203,6 +203,7 @@ static void __init vmx_display_features(void)
     P(cpu_has_vmx_bus_lock_detection, "Bus Lock Detection");
     P(cpu_has_vmx_notify_vm_exiting, "Notify VM Exit");
     P(cpu_has_vmx_virt_spec_ctrl, "Virtualize SPEC_CTRL");
+    P(cpu_has_vmx_guest_lbr_ctl, "Architectural LBR virtualization");
 #undef P
 
     if ( !printed )
@@ -448,7 +449,8 @@ static int vmx_init_vmcs_config(bool bsp)
 
     min = VM_EXIT_ACK_INTR_ON_EXIT;
     opt = (VM_EXIT_SAVE_GUEST_PAT | VM_EXIT_LOAD_HOST_PAT |
-           VM_EXIT_LOAD_HOST_EFER | VM_EXIT_CLEAR_BNDCFGS);
+           VM_EXIT_LOAD_HOST_EFER | VM_EXIT_CLEAR_BNDCFGS |
+           VM_EXIT_CLEAR_GUEST_LBR_CTL);
     min |= VM_EXIT_IA32E_MODE;
     _vmx_vmexit_control = adjust_vmx_controls(
         "VMExit Control", min, opt, MSR_IA32_VMX_EXIT_CTLS, &mismatch);
@@ -489,7 +491,7 @@ static int vmx_init_vmcs_config(bool bsp)
 
     min = 0;
     opt = (VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_GUEST_EFER |
-           VM_ENTRY_LOAD_BNDCFGS);
+           VM_ENTRY_LOAD_BNDCFGS | VM_ENTRY_LOAD_GUEST_LBR_CTL);
     _vmx_vmentry_control = adjust_vmx_controls(
         "VMEntry Control", min, opt, MSR_IA32_VMX_ENTRY_CTLS, &mismatch);
 
@@ -1329,6 +1331,9 @@ static int construct_vmcs(struct vcpu *v)
           | (paging_mode_hap(d) ? 0 : (1U << X86_EXC_PF))
          | (v->arch.fully_eager_fpu ? 0 : (1U << X86_EXC_NM));
 
+    if ( cpu_has_vmx_guest_lbr_ctl )
+        __vmwrite(GUEST_LBR_CTL, 0);
+
     if ( cpu_has_vmx_notify_vm_exiting )
         __vmwrite(NOTIFY_WINDOW, vm_notify_window);
 
@@ -2087,6 +2092,8 @@ void vmcs_dump_vcpu(struct vcpu *v)
            vmr32(GUEST_PREEMPTION_TIMER), vmr32(GUEST_SMBASE));
     printk("DebugCtl = 0x%016lx  DebugExceptions = 0x%016lx\n",
            vmr(GUEST_IA32_DEBUGCTL), vmr(GUEST_PENDING_DBG_EXCEPTIONS));
+    if ( vmentry_ctl & VM_ENTRY_LOAD_GUEST_LBR_CTL )
+        printk("LbrCtl = 0x%016lx\n", vmr(GUEST_LBR_CTL));
     if ( vmentry_ctl & (VM_ENTRY_LOAD_PERF_GLOBAL_CTRL | VM_ENTRY_LOAD_BNDCFGS) )
         printk("PerfGlobCtl = 0x%016lx  BndCfgS = 0x%016lx\n",
                vmr(GUEST_PERF_GLOBAL_CTRL), vmr(GUEST_BNDCFGS));
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index b6885d0e27..d417ae17d3 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -423,65 +423,96 @@ static int cf_check vmx_pi_update_irte(const struct vcpu *v,
     return rc;
 }
 
-static const struct lbr_info {
+struct lbr_info {
     u32 base, count;
-} p4_lbr[] = {
-    { MSR_P4_LER_FROM_LIP,          1 },
-    { MSR_P4_LER_TO_LIP,            1 },
-    { MSR_P4_LASTBRANCH_TOS,        1 },
-    { MSR_P4_LASTBRANCH_0_FROM_LIP, NUM_MSR_P4_LASTBRANCH_FROM_TO },
-    { MSR_P4_LASTBRANCH_0_TO_LIP,   NUM_MSR_P4_LASTBRANCH_FROM_TO },
-    { 0, 0 }
+    u64 initial;
+};
+
+static const struct lbr_info p4_lbr[] = {
+    { MSR_P4_LER_FROM_LIP,          1, 0 },
+    { MSR_P4_LER_TO_LIP,            1, 0 },
+    { MSR_P4_LASTBRANCH_TOS,        1, 0 },
+    { MSR_P4_LASTBRANCH_0_FROM_LIP, NUM_MSR_P4_LASTBRANCH_FROM_TO, 0 },
+    { MSR_P4_LASTBRANCH_0_TO_LIP,   NUM_MSR_P4_LASTBRANCH_FROM_TO, 0 },
+    { 0, 0, 0 }
 }, c2_lbr[] = {
-    { MSR_IA32_LASTINTFROMIP,       1 },
-    { MSR_IA32_LASTINTTOIP,         1 },
-    { MSR_C2_LASTBRANCH_TOS,        1 },
-    { MSR_C2_LASTBRANCH_0_FROM_IP,  NUM_MSR_C2_LASTBRANCH_FROM_TO },
-    { MSR_C2_LASTBRANCH_0_TO_IP,    NUM_MSR_C2_LASTBRANCH_FROM_TO },
-    { 0, 0 }
+    { MSR_IA32_LASTINTFROMIP,       1, 0 },
+    { MSR_IA32_LASTINTTOIP,         1, 0 },
+
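A sketch of the interception policy the commit message describes,
assuming helpers along the lines of Xen's
vmx_{set,clear}_msr_intercept() (names and signatures may differ by
tree): the per-record MSRs can only be passed through when guest and
host agree on the IP format; otherwise they stay intercepted so Xen can
translate on each access:

    /* Sketch: toggle MSR-bitmap interception for the LBR record MSRs. */
    static void lbr_update_intercepts(struct vcpu *v, unsigned int depth,
                                      bool host_lip, bool guest_lip)
    {
        unsigned int i;

        for ( i = 0; i < depth; i++ )
        {
            if ( host_lip == guest_lip )
            {
                /* No translation needed: let the guest access directly. */
                vmx_clear_msr_intercept(v, MSR_IA32_LASTBRANCH_0_FROM_IP + i,
                                        VMX_MSR_RW);
                vmx_clear_msr_intercept(v, MSR_IA32_LASTBRANCH_0_TO_IP + i,
                                        VMX_MSR_RW);
            }
            else
            {
                vmx_set_msr_intercept(v, MSR_IA32_LASTBRANCH_0_FROM_IP + i,
                                      VMX_MSR_RW);
                vmx_set_msr_intercept(v, MSR_IA32_LASTBRANCH_0_TO_IP + i,
                                      VMX_MSR_RW);
            }
        }
    }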
[PATCH 0/4] Virtualize architectural LBRs
From: Tu Dinh

Intel model-specific last branch records (LBRs) were replaced by
architectural LBRs (see Chapter 20 of Intel SDM volume 3B). This
patchset implements virtual LBRs for HVM guests using Intel's "load
guest IA32_LBR_CTL" and "clear IA32_LBR_CTL" VMX controls. Add the
necessary CPUID and VMX feature checks into Xen.

Note that in this patchset, MSR_IA32_LASTBRANCH_DEPTH is only allowed
to be equal to the host's.

Tu Dinh (4):
  x86: Add Intel architectural LBR featureset bits
  x86: Add architectural LBR declarations
  x86: Adjust arch LBR CPU policy
  x86/vmx: Virtualize architectural LBRs

 tools/libs/guest/xg_cpuid_x86.c             |   2 +-
 tools/misc/xen-cpuid.c                      |   3 +
 xen/arch/x86/cpu-policy.c                   |  39 +++
 xen/arch/x86/cpu/common.c                   |   7 +
 xen/arch/x86/hvm/vmx/vmcs.c                 |  11 +-
 xen/arch/x86/hvm/vmx/vmx.c                  | 269 ++--
 xen/arch/x86/include/asm/hvm/vmx/vmcs.h     |   8 +
 xen/arch/x86/include/asm/msr-index.h        |  11 +
 xen/include/public/arch-x86/cpufeatureset.h |  28 +-
 xen/include/xen/lib/x86/cpu-policy.h        |  38 ++-
 xen/lib/x86/cpuid.c                         |   6 +
 11 files changed, 339 insertions(+), 83 deletions(-)

--
2.43.0

Ngoc Tu Dinh | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech
[PATCH 1/4] x86: Add Intel architectural LBR featureset bits
From: Tu Dinh

Expose the ARCH_LBR feature to guests. Extend the CPU featureset with 3
words for CPUID leaf 0x1c.

Signed-off-by: Tu Dinh
---
 tools/libs/guest/xg_cpuid_x86.c             |  2 +-
 tools/misc/xen-cpuid.c                      |  3 ++
 xen/arch/x86/cpu-policy.c                   |  3 ++
 xen/arch/x86/cpu/common.c                   |  7 +++++++
 xen/include/public/arch-x86/cpufeatureset.h | 28 ++-
 xen/include/xen/lib/x86/cpu-policy.h        | 38 ++++++++++++++++++++-
 xen/lib/x86/cpuid.c                         |  6 ++++++
 7 files changed, 84 insertions(+), 3 deletions(-)

diff --git a/tools/libs/guest/xg_cpuid_x86.c b/tools/libs/guest/xg_cpuid_x86.c
index 4453178100..64d9baa538 100644
--- a/tools/libs/guest/xg_cpuid_x86.c
+++ b/tools/libs/guest/xg_cpuid_x86.c
@@ -656,7 +656,7 @@ int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid, bool restore,
             p->policy.feat.mpx = test_bit(X86_FEATURE_MPX, host_featureset);
         }
 
-        p->policy.basic.max_leaf = min(p->policy.basic.max_leaf, 0xdu);
+        p->policy.basic.max_leaf = min(p->policy.basic.max_leaf, 0x1cu);
         p->policy.feat.max_subleaf = 0;
         p->policy.extd.max_leaf = min(p->policy.extd.max_leaf, 0x8000001c);
     }
diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
index 4c4593528d..4f0fb0a6ea 100644
--- a/tools/misc/xen-cpuid.c
+++ b/tools/misc/xen-cpuid.c
@@ -37,6 +37,9 @@ static const struct {
     { "CPUID 0x00000007:1.edx", "7d1" },
     { "MSR_ARCH_CAPS.lo",       "m10Al" },
     { "MSR_ARCH_CAPS.hi",       "m10Ah" },
+    { "CPUID 0x0000001c.eax",   "1Ca" },
+    { "CPUID 0x0000001c.ebx",   "1Cb" },
+    { "CPUID 0x0000001c.ecx",   "1Cc" },
 };
 
 #define COL_ALIGN "24"
diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index 78bc9872b0..cf6b212fb6 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -271,6 +271,8 @@ static void recalculate_misc(struct cpu_policy *p)
 
     p->basic.raw[0xc] = EMPTY_LEAF;
 
+    zero_leaves(p->basic.raw, 0xe, 0x1b);
+
     p->extd.e1d &= ~CPUID_COMMON_1D_FEATURES;
 
     /* Most of Power/RAS hidden from guests. */
@@ -303,6 +305,7 @@ static void recalculate_misc(struct cpu_policy *p)
         zero_leaves(p->basic.raw, 0x2, 0x3);
         memset(p->cache.raw, 0, sizeof(p->cache.raw));
         zero_leaves(p->basic.raw, 0x9, 0xa);
+        p->basic.raw[0x1c] = EMPTY_LEAF;
 
         p->extd.vendor_ebx = p->basic.vendor_ebx;
         p->extd.vendor_ecx = p->basic.vendor_ecx;
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 067d855bad..4c8eb188e9 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -505,6 +505,13 @@ static void generic_identify(struct cpuinfo_x86 *c)
 		      &c->x86_capability[FEATURESET_Da1],
 		      &tmp, &tmp, &tmp);
 
+	if (c->cpuid_level >= 0x1c)
+		cpuid(0x1c,
+		      &c->x86_capability[FEATURESET_1Ca],
+		      &c->x86_capability[FEATURESET_1Cb],
+		      &c->x86_capability[FEATURESET_1Cc],
+		      &tmp);
+
 	if (test_bit(X86_FEATURE_ARCH_CAPS, c->x86_capability))
 		rdmsr(MSR_ARCH_CAPABILITIES,
 		      c->x86_capability[FEATURESET_m10Al],
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 8fa3fb711a..9304856fba 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -284,7 +284,7 @@ XEN_CPUFEATURE(SERIALIZE,     9*32+14) /*A  SERIALIZE insn */
 XEN_CPUFEATURE(HYBRID,        9*32+15) /*   Heterogeneous platform */
 XEN_CPUFEATURE(TSXLDTRK,      9*32+16) /*a  TSX load tracking suspend/resume insns */
 XEN_CPUFEATURE(PCONFIG,       9*32+18) /*   PCONFIG instruction */
-XEN_CPUFEATURE(ARCH_LBR,      9*32+19) /*   Architectural Last Branch Record */
+XEN_CPUFEATURE(ARCH_LBR,      9*32+19) /*S  Architectural Last Branch Record */
 XEN_CPUFEATURE(CET_IBT,       9*32+20) /*   CET - Indirect Branch Tracking */
 XEN_CPUFEATURE(AMX_BF16,      9*32+22) /*   AMX BFloat16 instruction */
 XEN_CPUFEATURE(AVX512_FP16,   9*32+23) /*A  AVX512 FP16 instructions */
@@ -379,6 +379,32 @@ XEN_CPUFEATURE(RFDS_CLEAR, 16*32+28) /*!A| Register File(s) cleared by V
 
 /* Intel-defined CPU features, MSR_ARCH_CAPS 0x0000010a.edx, word 17 */
 
+/* Intel-defined CPU features, CPUID level 0x0000001c.eax, word 18 */
+XEN_CPUFEATURE(LBR_DEPTH_8,    18*32+ 0) /*S  Depth 8 */
+XEN_CPUFEATURE(LBR_DEPTH_16,   18*32+ 1) /*S  Depth 16 */
+XEN_CPUFEATURE(LBR_DEPTH_24,   18*32+ 2) /*S  Depth 24 */
+XEN_CPUFEATURE(LBR_DEPTH_32,
[PATCH 2/2] x86: Set up framebuffer given by Multiboot2
Previously, we did not make use of the framebuffer given by Multiboot.
This means graphics will not work in some scenarios, such as booting
from Kexec. Enable the Multiboot framebuffer if it exists and is not
overridden by the EFI probe.
---
 xen/arch/x86/setup.c | 45 +++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 42 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 115f8f6517..04d8be407e 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -551,16 +551,55 @@ struct boot_video_info {
 extern struct boot_video_info boot_vid_info;
 #endif
 
-static void __init parse_video_info(void)
+static void __init parse_video_info(multiboot_info_t *mbi)
 {
 #ifdef CONFIG_VIDEO
     struct boot_video_info *bvi = &bootsym(boot_vid_info);
 
+    /*
+     * fb detection will be in this order:
+     * - efifb (as efifb probe sets a new GOP mode before parse_video_info
+     *   is called, we must use this mode instead of the one given by mbifb)
+     * - mbifb
+     * - vesafb
+     */
+
     /* vga_console_info is filled directly on EFI platform. */
     if ( efi_enabled(EFI_BOOT) )
         return;
 
-    if ( (bvi->orig_video_isVGA == 1) && (bvi->orig_video_mode == 3) )
+    if ( mbi->flags & MBI_FB )
+    {
+        uint64_t lfb_rgb_bitmap = 0;
+
+        vga_console_info.video_type = XEN_VGATYPE_VESA_LFB;
+        vga_console_info.u.vesa_lfb.width = mbi->fb.width;
+        vga_console_info.u.vesa_lfb.height = mbi->fb.height;
+        vga_console_info.u.vesa_lfb.bytes_per_line = mbi->fb.pitch;
+        vga_console_info.u.vesa_lfb.bits_per_pixel = mbi->fb.bpp;
+        vga_console_info.u.vesa_lfb.lfb_base = mbi->fb.addr;
+        vga_console_info.u.vesa_lfb.lfb_size = (mbi->fb.pitch * mbi->fb.height + 0xffff) >> 16;
+
+        vga_console_info.u.vesa_lfb.red_pos = mbi->fb.red_pos;
+        vga_console_info.u.vesa_lfb.red_size = mbi->fb.red_size;
+        lfb_rgb_bitmap |= (((uint64_t)1 << mbi->fb.red_size) - 1) << mbi->fb.red_pos;
+        vga_console_info.u.vesa_lfb.green_pos = mbi->fb.green_pos;
+        vga_console_info.u.vesa_lfb.green_size = mbi->fb.green_size;
+        lfb_rgb_bitmap |= (((uint64_t)1 << mbi->fb.green_size) - 1) << mbi->fb.green_pos;
+        vga_console_info.u.vesa_lfb.blue_pos = mbi->fb.blue_pos;
+        vga_console_info.u.vesa_lfb.blue_size = mbi->fb.blue_size;
+        lfb_rgb_bitmap |= (((uint64_t)1 << mbi->fb.blue_size) - 1) << mbi->fb.blue_pos;
+
+        /* assume non-weird bit format */
+        vga_console_info.u.vesa_lfb.rsvd_pos = find_first_zero_bit(&lfb_rgb_bitmap, sizeof(lfb_rgb_bitmap) * __CHAR_BIT__);
+        vga_console_info.u.vesa_lfb.rsvd_size = mbi->fb.bpp - mbi->fb.red_size - mbi->fb.green_size - mbi->fb.blue_size;
+        if (vga_console_info.u.vesa_lfb.rsvd_pos >= mbi->fb.bpp || vga_console_info.u.vesa_lfb.rsvd_size < 0) {
+            vga_console_info.u.vesa_lfb.rsvd_pos = 0;
+            vga_console_info.u.vesa_lfb.rsvd_size = 0;
+        }
+        vga_console_info.u.vesa_lfb.gbl_caps = 2; /* possibly non-VGA */
+    }
+    else if ( (bvi->orig_video_isVGA == 1) && (bvi->orig_video_mode == 3) )
     {
         vga_console_info.video_type = XEN_VGATYPE_TEXT_MODE_3;
         vga_console_info.u.text_mode_3.font_height = bvi->orig_video_points;
@@ -933,7 +972,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
      */
     hypervisor_name = hypervisor_probe();
 
-    parse_video_info();
+    parse_video_info(mbi);
 
     rdmsrl(MSR_EFER, this_cpu(efer));
     asm volatile ( "mov %%cr4,%0" : "=r" (get_cpu_info()->cr4) );
--
2.25.1
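The lfb_size expression is worth a worked example (numbers purely
illustrative): the field counts 64KiB units, hence the "+ 0xffff"
round-up before the 16-bit shift. For a 1920x1080 mode at 32bpp with a
7680-byte pitch:

    /* Illustration of the lfb_size rounding above (64KiB units). */
    uint32_t pitch = 7680, height = 1080;
    uint32_t bytes = pitch * height;            /* 8294400 bytes */
    uint32_t lfb_size = (bytes + 0xffff) >> 16; /* 127, covering 8323072 bytes */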
[PATCH 1/2] x86: Parse Multiboot2 framebuffer information
Multiboot2 exposes framebuffer data in its boot information tags. Xen
requests this information from the bootloader, but does not make use of
it. Parse this information for later use.
---
 xen/arch/x86/boot/reloc.c    | 22 ++++++++++++++++++++++
 xen/include/xen/multiboot.h  | 17 +++++++++++++++++
 xen/include/xen/multiboot2.h | 33 +++++++++++++++++++++++++++++++++
 3 files changed, 72 insertions(+)

diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c
index 4f4039bb7c..01a53d3ae5 100644
--- a/xen/arch/x86/boot/reloc.c
+++ b/xen/arch/x86/boot/reloc.c
@@ -156,6 +156,8 @@ static multiboot_info_t *mbi2_reloc(u32 mbi_in)
     multiboot_info_t *mbi_out;
     u32 ptr;
     unsigned int i, mod_idx = 0;
+    u64 fbaddr;
+    u8 fbtype;
 
     ptr = alloc_mem(sizeof(*mbi_out));
     mbi_out = _p(ptr);
@@ -254,6 +256,26 @@ static multiboot_info_t *mbi2_reloc(u32 mbi_in)
             ++mod_idx;
             break;
 
+        case MULTIBOOT2_TAG_TYPE_FRAMEBUFFER:
+            fbaddr = get_mb2_data(tag, framebuffer, framebuffer_addr);
+            fbtype = get_mb2_data(tag, framebuffer, framebuffer_type);
+            if (fbaddr == 0 || fbtype != MULTIBOOT2_FRAMEBUFFER_TYPE_RGB)
+                break;
+            mbi_out->flags |= MBI_FB;
+            mbi_out->fb.addr = fbaddr;
+            mbi_out->fb.pitch = get_mb2_data(tag, framebuffer, framebuffer_pitch);
+            mbi_out->fb.width = get_mb2_data(tag, framebuffer, framebuffer_width);
+            mbi_out->fb.height = get_mb2_data(tag, framebuffer, framebuffer_height);
+            mbi_out->fb.bpp = get_mb2_data(tag, framebuffer, framebuffer_bpp);
+            mbi_out->fb.type = fbtype;
+            mbi_out->fb.red_pos = get_mb2_data(tag, framebuffer, framebuffer_red_field_position);
+            mbi_out->fb.red_size = get_mb2_data(tag, framebuffer, framebuffer_red_mask_size);
+            mbi_out->fb.green_pos = get_mb2_data(tag, framebuffer, framebuffer_green_field_position);
+            mbi_out->fb.green_size = get_mb2_data(tag, framebuffer, framebuffer_green_mask_size);
+            mbi_out->fb.blue_pos = get_mb2_data(tag, framebuffer, framebuffer_blue_field_position);
+            mbi_out->fb.blue_size = get_mb2_data(tag, framebuffer, framebuffer_blue_mask_size);
+            break;
+
         case MULTIBOOT2_TAG_TYPE_END:
             return mbi_out;
 
diff --git a/xen/include/xen/multiboot.h b/xen/include/xen/multiboot.h
index d1b43e1183..2d829b5fa7 100644
--- a/xen/include/xen/multiboot.h
+++ b/xen/include/xen/multiboot.h
@@ -42,6 +42,7 @@
 #define MBI_BIOSCONFIG (_AC(1,u) << 8)
 #define MBI_LOADERNAME (_AC(1,u) << 9)
 #define MBI_APM        (_AC(1,u) << 10)
+#define MBI_FB         (_AC(1,u) << 11)
 
 #ifndef __ASSEMBLY__
 
@@ -101,6 +102,22 @@ typedef struct {
 
     /* Valid if flags sets MBI_APM */
     u32 apm_table;
+
+    /* Valid if flags sets MBI_FB */
+    struct {
+        u64 addr;
+        u32 pitch;
+        u32 width;
+        u32 height;
+        u8 bpp;
+        u8 type;
+        u8 red_pos;
+        u8 red_size;
+        u8 green_pos;
+        u8 green_size;
+        u8 blue_pos;
+        u8 blue_size;
+    } fb;
 } multiboot_info_t;
 
 /* The module structure. */
diff --git a/xen/include/xen/multiboot2.h b/xen/include/xen/multiboot2.h
index 5acd225044..a86a080038 100644
--- a/xen/include/xen/multiboot2.h
+++ b/xen/include/xen/multiboot2.h
@@ -177,6 +177,39 @@ typedef struct {
     u32 mod_end;
     char cmdline[];
 } multiboot2_tag_module_t;
+
+typedef struct {
+    u8 red;
+    u8 green;
+    u8 blue;
+} multiboot2_framebuffer_color_t;
+
+typedef struct {
+    u32 type;
+    u32 size;
+    u64 framebuffer_addr;
+    u32 framebuffer_pitch;
+    u32 framebuffer_width;
+    u32 framebuffer_height;
+    u8 framebuffer_bpp;
+    u8 framebuffer_type;
+    u16 reserved;
+
+    union {
+        struct {
+            u16 framebuffer_palette_num_colors;
+            multiboot2_framebuffer_color_t framebuffer_palette[0];
+        };
+        struct {
+            u8 framebuffer_red_field_position;
+            u8 framebuffer_red_mask_size;
+            u8 framebuffer_green_field_position;
+            u8 framebuffer_green_mask_size;
+            u8 framebuffer_blue_field_position;
+            u8 framebuffer_blue_mask_size;
+        };
+    };
+} multiboot2_tag_framebuffer_t;
 #endif /* __ASSEMBLY__ */
 
 #endif /* __MULTIBOOT2_H__ */
--
2.25.1
[PATCH 0/2] x86: Use Multiboot framebuffer
Xen does not currently use the Multiboot framebuffer. This means there
is no graphics output when booting Xen via Kexec. This patchset parses
and uses the Multiboot framebuffer information during boot.

Tu Dinh Ngoc (2):
  x86: Parse Multiboot2 framebuffer information
  x86: Set up framebuffer given by Multiboot2

 xen/arch/x86/boot/reloc.c    | 22 ++++++++++++++
 xen/arch/x86/setup.c         | 45 +++++++++++++++++++++++++++++---
 xen/include/xen/multiboot.h  | 17 ++++++++++++
 xen/include/xen/multiboot2.h | 33 ++++++++++++++++++++++
 4 files changed, 114 insertions(+), 3 deletions(-)

--
2.25.1
[PATCH v2] x86: Use low memory size directly from Multiboot
Previously, Xen used information from the BDA to detect the amount of
available low memory. This does not work in some scenarios, such as
Coreboot, or when booting from Kexec on a UEFI system without CSM. Use
the information supplied directly by the Multiboot boot information
instead.
---
 xen/arch/x86/boot/head.S | 34 ++++++++++++----------------------
 1 file changed, 12 insertions(+), 22 deletions(-)

diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
index dd1bea0d10..62fe3fe55b 100644
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -524,33 +524,23 @@ trampoline_bios_setup:
         mov     %ecx,%fs
         mov     %ecx,%gs
 
-        /* Set up trampoline segment 64k below EBDA */
-        movzwl  0x40e,%ecx          /* EBDA segment */
-        cmp     $0xa000,%ecx        /* sanity check (high) */
-        jae     0f
-        cmp     $0x4000,%ecx        /* sanity check (low) */
-        jae     1f
-0:
-        movzwl  0x413,%ecx          /* use base memory size on failure */
-        shl     $10-4,%ecx
-1:
+        /* Use lower memory size directly from Multiboot */
+        mov     %edx,%ecx
+
         /*
-         * Compare the value in the BDA with the information from the
-         * multiboot structure (if available) and use the smallest.
+         * Old Kexec used to report the value in bytes instead of kilobytes
+         * like it's supposed to, so fix that if detected.
          */
-        cmp     $0x100,%edx         /* is the multiboot value too small? */
-        jb      2f                  /* if so, do not use it */
-        shl     $10-4,%edx
-        cmp     %ecx,%edx           /* compare with BDA value */
-        cmovb   %edx,%ecx           /* and use the smaller */
+        cmpl    $640,%ecx
+        jbe     1f
+        shr     $10,%ecx
+1:
+        /* From arch/x86/smpboot.c: start_eip had better be page-aligned! */
+        shr     $2,%ecx
 
-2:
         /* Reserve memory for the trampoline and the low-memory stack. */
-        sub     $((TRAMPOLINE_SPACE+TRAMPOLINE_STACK_SPACE)>>4),%ecx
+        sub     $((TRAMPOLINE_SPACE+TRAMPOLINE_STACK_SPACE)>>12),%ecx
 
-        /* From arch/x86/smpboot.c: start_eip had better be page-aligned! */
-        xor     %cl, %cl
-        shl     $4, %ecx
+        shl     $12,%ecx
         mov     %ecx,sym_esi(trampoline_phys)
 
 trampoline_setup:
--
2.25.1
[PATCH v3] x86: Prioritize low memory size from Multiboot
Previously, Xen used information from the BDA to detect the amount of
available low memory. This does not work in some scenarios, such as
Coreboot, or when booting from Kexec on a UEFI system without CSM.

Prioritize the information supplied by Multiboot instead. If this is
not available, fall back to the old BDA method.

Signed-off-by: Tu Dinh Ngoc
---
Changes in v3:
- Prioritize using Multiboot's memory information. Fall back to using
  the BDA in case MBI does not supply memory info.

Changes in v2:
- Detect if Multiboot claims there's more than 640 KB of low memory
  (happens with old Kexec versions), and correct the memory unit in
  such cases.
---
 xen/arch/x86/boot/head.S | 44 ++++++++++++++++++++++++++--------------
 1 file changed, 29 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
index dd1bea0d10..da7810060e 100644
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -524,27 +524,41 @@ trampoline_bios_setup:
         mov     %ecx,%fs
         mov     %ecx,%gs
 
-        /* Set up trampoline segment 64k below EBDA */
-        movzwl  0x40e,%ecx          /* EBDA segment */
-        cmp     $0xa000,%ecx        /* sanity check (high) */
-        jae     0f
-        cmp     $0x4000,%ecx        /* sanity check (low) */
-        jae     1f
+        /* Check if Multiboot provides us with low memory size. */
+        mov     %edx,%ecx
+        test    %ecx,%ecx
+        jz      1f
+
+        /*
+         * Old Kexec used to report memory sizes in bytes instead of
+         * kilobytes like it's supposed to.
+         *
+         * If Multiboot reports more than 640 KB of low memory, assume we
+         * have this problem.
+         */
+        cmp     $640,%ecx
+        jbe     0f
+        shr     $10,%ecx
 0:
-        movzwl  0x413,%ecx          /* use base memory size on failure */
+        /* %ecx now contains the low memory size in kilobytes. */
         shl     $10-4,%ecx
+        jmp     3f
+
 1:
         /*
-         * Compare the value in the BDA with the information from the
-         * multiboot structure (if available) and use the smallest.
+         * Multiboot doesn't provide us with memory info. Set up trampoline
+         * segment 64k below EBDA as fallback.
          */
-        cmp     $0x100,%edx         /* is the multiboot value too small? */
-        jb      2f                  /* if so, do not use it */
-        shl     $10-4,%edx
-        cmp     %ecx,%edx           /* compare with BDA value */
-        cmovb   %edx,%ecx           /* and use the smaller */
-
+        movzwl  0x40e,%ecx          /* EBDA segment */
+        cmp     $0xa000,%ecx        /* sanity check (high) */
+        jae     2f
+        cmp     $0x4000,%ecx        /* sanity check (low) */
+        jae     3f
 2:
+        movzwl  0x413,%ecx          /* use base memory size on failure */
+        shl     $10-4,%ecx
+
+3:
         /* Reserve memory for the trampoline and the low-memory stack. */
         sub     $((TRAMPOLINE_SPACE+TRAMPOLINE_STACK_SPACE)>>4),%ecx
--
2.25.1
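The segment arithmetic shared by both branches (labels 0: and 2:) is
the same kilobyte-to-paragraph conversion, easier to read in C (sketch
only; the TRAMPOLINE_* values are placeholders, not Xen's real
constants). The "shl $10-4" turns kilobytes into 16-byte paragraphs,
since real-mode segment registers address memory in paragraphs:

    #define TRAMPOLINE_SPACE        (4 << 10)   /* placeholder */
    #define TRAMPOLINE_STACK_SPACE  (4 << 10)   /* placeholder */

    /* Sketch: the trampoline placement computed in head.S, in C. */
    static unsigned int trampoline_phys_from_kb(unsigned int lowmem_kb)
    {
        unsigned int paras = lowmem_kb << (10 - 4); /* KiB -> paragraphs */

        paras -= (TRAMPOLINE_SPACE + TRAMPOLINE_STACK_SPACE) >> 4;
        paras &= ~0xffu;   /* page-align, as the "xor %cl,%cl" in head.S does */

        return paras << 4; /* back to a byte address */
    }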