PING: [PATCH v2] target/i386/kvm: Refine VMX controls setting for backward compatibility
Dear QEMU Community, Two months have passed since my last submission of the patch aimed at addressing an issue encountered with kernels prior to Linux kernel 5.3. When using the latest version of QEMU with '-cpu host', if the host supports the rdseed instruction but not rdseed exiting, it results in a QEMU error: "error: failed to set MSR 0x48b to 0x*". I understand everyone is busy, and reviewing patches can be quite resource-intensive. Nevertheless, I am eager to hear if there are any further comments, suggestions, or advice on what steps I should take next regarding this patch. Sincerely,Ewan On 11/26/23 22:43, EwanHai wrote: Commit 4a910e1 ("target/i386: do not set unsupported VMX secondary execution controls") implemented a workaround for hosts that have specific CPUID features but do not support the corresponding VMX controls, e.g., hosts support RDSEED but do not support RDSEED-Exiting. In detail, commit 4a910e1 introduced a flag `has_msr_vmx_procbased_clts2`. If KVM has `MSR_IA32_VMX_PROCBASED_CTLS2` in its msr list, QEMU would use KVM's settings, avoiding any modifications to this MSR. However, this commit (4a910e1) didn't account for cases in older Linux kernels(<5.3) where `MSR_IA32_VMX_PROCBASED_CTLS2` is in `kvm_feature_msrs`-obtained by ioctl(KVM_GET_MSR_FEATURE_INDEX_LIST), but not in `kvm_msr_list`-obtained by ioctl(KVM_GET_MSR_INDEX_LIST). As a result,it did not set the `has_msr_vmx_procbased_clts2` flag based on `kvm_msr_list` alone, even though KVM maintains the value of this MSR. This patch supplements the above logic, ensuring that `has_msr_vmx_procbased_clts2` is correctly set by checking both MSR lists, thus maintaining compatibility with older kernels. Signed-off-by: EwanHai --- In response to the suggestions from ZhaoLiu(zhao1@intel.com), the following changes have been implemented in v2: - Adjusted some punctuation in the commit message as per the suggestions. - Added comments to the newly added code to indicate that it is a compatibility fix. v1 link: https://lore.kernel.org/all/20230925071453.14908-1-ewanhai...@zhaoxin.com/ --- target/i386/kvm/kvm.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index 11b8177eff..c8f6c0b531 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -2296,6 +2296,7 @@ void kvm_arch_do_init_vcpu(X86CPU *cpu) static int kvm_get_supported_feature_msrs(KVMState *s) { int ret = 0; +int i; if (kvm_feature_msrs != NULL) { return 0; @@ -2330,6 +2331,19 @@ static int kvm_get_supported_feature_msrs(KVMState *s) return ret; } +/* + * Compatibility fix: + * Older Linux kernels(<5.3) include the MSR_IA32_VMX_PROCBASED_CTLS2 + * only in feature msr list, but not in regular msr list. This lead to + * an issue in older kernel versions where QEMU, through the regular + * MSR list check, assumes the kernel doesn't maintain this msr, + * resulting in incorrect settings by QEMU for this msr. + */ +for (i = 0; i < kvm_feature_msrs->nmsrs; i++) { +if (kvm_feature_msrs->indices[i] == MSR_IA32_VMX_PROCBASED_CTLS2) { +has_msr_vmx_procbased_ctls2 = true; +} +} return 0; }
Re: [PATCH v2] target/i386/kvm: Refine VMX controls setting for backward compatibility
On 2/20/24 03:32, Xiaoyao Li wrote: diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index 11b8177eff..c8f6c0b531 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -2296,6 +2296,7 @@ void kvm_arch_do_init_vcpu(X86CPU *cpu) static int kvm_get_supported_feature_msrs(KVMState *s) { int ret = 0; + int i; if (kvm_feature_msrs != NULL) { return 0; @@ -2330,6 +2331,19 @@ static int kvm_get_supported_feature_msrs(KVMState *s) return ret; } + /* + * Compatibility fix: + * Older Linux kernels(<5.3) include the MSR_IA32_VMX_PROCBASED_CTLS2 we can be more accurate, that kernel version 4.17 to 5.2, reports MSR_IA32_VMX_PROCBASED_CTLS2 in KVM_GET_MSR_FEATURE_INDEX_LIST but not KVM_GET_MSR_INDEX_LIST. Yeah, I'll add this more precise comment to the next patch. + * only in feature msr list, but not in regular msr list. This lead to + * an issue in older kernel versions where QEMU, through the regular + * MSR list check, assumes the kernel doesn't maintain this msr, + * resulting in incorrect settings by QEMU for this msr. + */ + for (i = 0; i < kvm_feature_msrs->nmsrs; i++) { + if (kvm_feature_msrs->indices[i] == MSR_IA32_VMX_PROCBASED_CTLS2) { + has_msr_vmx_procbased_ctls2 = true; + } + } I'm wondering should we move all the initialization of has_msr_*, that associated with feature MSRs, to here. e.g., has_msr_arch_capabs, has_msr_vmx_vmfunc,... I believe this is a more elegant way to fix the issue, which will be reflected in my next patch.
Re: [PATCH v2] target/i386/kvm: Refine VMX controls setting for backward compatibility
On 2/20/24 06:07, Ewan Hai wrote: On 2/20/24 03:32, Xiaoyao Li wrote: diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index 11b8177eff..c8f6c0b531 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -2296,6 +2296,7 @@ void kvm_arch_do_init_vcpu(X86CPU *cpu) static int kvm_get_supported_feature_msrs(KVMState *s) { int ret = 0; + int i; if (kvm_feature_msrs != NULL) { return 0; @@ -2330,6 +2331,19 @@ static int kvm_get_supported_feature_msrs(KVMState *s) return ret; } + /* + * Compatibility fix: + * Older Linux kernels(<5.3) include the MSR_IA32_VMX_PROCBASED_CTLS2 we can be more accurate, that kernel version 4.17 to 5.2, reports MSR_IA32_VMX_PROCBASED_CTLS2 in KVM_GET_MSR_FEATURE_INDEX_LIST but not KVM_GET_MSR_INDEX_LIST. Yeah, I'll add this more precise comment to the next patch. + * only in feature msr list, but not in regular msr list. This lead to + * an issue in older kernel versions where QEMU, through the regular + * MSR list check, assumes the kernel doesn't maintain this msr, + * resulting in incorrect settings by QEMU for this msr. + */ + for (i = 0; i < kvm_feature_msrs->nmsrs; i++) { + if (kvm_feature_msrs->indices[i] == MSR_IA32_VMX_PROCBASED_CTLS2) { + has_msr_vmx_procbased_ctls2 = true; + } + } I'm wondering should we move all the initialization of has_msr_*, that associated with feature MSRs, to here. e.g., has_msr_arch_capabs, has_msr_vmx_vmfunc,... I believe this is a more elegant way to fix the issue, which will be reflected in my next patch. When attempting to move the detection logic for feature MSRs (currently including VMX_VMFUNC, UCODE_REV, ARCH_CAPABILITIES, PROCBASED_CTLS2) from kvm_get_supported_msrs to kvm_get_supported_feature_msrs in the current QEMU, I encountered an "error: failed to set MSR 0x491 to 0x***" on kernel 4.19.67. This issue is due to commit 27c42a1bb ("KVM: nVMX: Enable VMFUNC for the L1 hypervisor", 2017-08-03) exposing VMFUNC to the QEMU guest without corresponding VMFUNC MSR modification code, leading to an error when QEMU proactively tries to set the VMFUNC MSR. This bug affects kernels from 4.14 to 5.2, with a fix introduced in 5.3 by Paolo (e8a70bd4e ("KVM: nVMX: allow setting the VMFUNC controls MSR", 2019-07-02)). Therefore, even if we were to move all feature MSRs to kvm_get_supported_feature_msrs,VMX_VMFUNC could not be moved due to the need to maintain compatibility with different kernel versions. This exception makes our move less elegant. Hence, I am wondering whether we need to move all feature MSRs to kvm_get_supported_feature_msrs. Perhaps we just need to simply move PROCBASED_CTLS2 to fix the "failed set 0x48b ..." type of bugs, and add a comment about it?
Re: [PATCH v3] target/i386/kvm: Refine VMX controls setting for backward compatibility
Sorry for my oversight, I am adding the maintainers who were missed in the previous email. On 6/24/24 05:58, EwanHai wrote: Commit 4a910e1 ("target/i386: do not set unsupported VMX secondary execution controls") implemented a workaround for hosts that have specific CPUID features but do not support the corresponding VMX controls, e.g., hosts support RDSEED but do not support RDSEED-Exiting. In detail, commit 4a910e1 introduced a flag `has_msr_vmx_procbased_clts2`. If KVM has `MSR_IA32_VMX_PROCBASED_CTLS2` in its msr list, QEMU would use KVM's settings, avoiding any modifications to this MSR. However, this commit (4a910e1) didn't account for cases in older Linux kernels(4.17~5.2) where `MSR_IA32_VMX_PROCBASED_CTLS2` is in `kvm_feature_msrs`-obtained by ioctl(KVM_GET_MSR_FEATURE_INDEX_LIST), but not in `kvm_msr_list`-obtained by ioctl(KVM_GET_MSR_INDEX_LIST). As a result,it did not set the `has_msr_vmx_procbased_clts2` flag based on `kvm_msr_list` alone, even though KVM does maintain the value of this MSR. This patch supplements the above logic, ensuring that `has_msr_vmx_procbased_clts2` is correctly set by checking both MSR lists, thus maintaining compatibility with older kernels. Signed-off-by: EwanHai --- Changes in v3: - Use a more precise version range in the comment, specifically "4.17~5.2" instead of "<5.3". Changes in v2: - Adjusted some punctuation in the commit message as per suggestions. - Added comments to the newly added code to indicate that it is a compatibility fix. v1 link: https://lore.kernel.org/all/20230925071453.14908-1-ewanhai...@zhaoxin.com/ v2 link: https://lore.kernel.org/all/20231127034326.257596-1-ewanhai...@zhaoxin.com/ --- target/i386/kvm/kvm.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index 7ad8072748..a7c6c5b2d0 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -2386,6 +2386,7 @@ void kvm_arch_do_init_vcpu(X86CPU *cpu) static int kvm_get_supported_feature_msrs(KVMState *s) { int ret = 0; +int i; if (kvm_feature_msrs != NULL) { return 0; @@ -2420,6 +2421,20 @@ static int kvm_get_supported_feature_msrs(KVMState *s) return ret; } + /* +* Compatibility fix: +* Older Linux kernels (4.17~5.2) report MSR_IA32_VMX_PROCBASED_CTLS2 +* in KVM_GET_MSR_FEATURE_INDEX_LIST but not in KVM_GET_MSR_INDEX_LIST. +* This leads to an issue in older kernel versions where QEMU, +* through the KVM_GET_MSR_INDEX_LIST check, assumes the kernel +* doesn't maintain MSR_IA32_VMX_PROCBASED_CTLS2, resulting in +* incorrect settings by QEMU for this MSR. +*/ +for (i = 0; i < kvm_feature_msrs->nmsrs; i++) { +if (kvm_feature_msrs->indices[i] == MSR_IA32_VMX_PROCBASED_CTLS2) { +has_msr_vmx_procbased_ctls2 = true; +} +} return 0; }
Re: [PATCH v3] target/i386/kvm: Refine VMX controls setting for backward compatibility
On 6/25/24 05:49, Zhao Liu wrote: diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index 7ad8072748..a7c6c5b2d0 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -2386,6 +2386,7 @@ void kvm_arch_do_init_vcpu(X86CPU *cpu) static int kvm_get_supported_feature_msrs(KVMState *s) { int ret = 0; +int i; if (kvm_feature_msrs != NULL) { return 0; @@ -2420,6 +2421,20 @@ static int kvm_get_supported_feature_msrs(KVMState *s) return ret; } + /* +* Compatibility fix: +* Older Linux kernels (4.17~5.2) report MSR_IA32_VMX_PROCBASED_CTLS2 +* in KVM_GET_MSR_FEATURE_INDEX_LIST but not in KVM_GET_MSR_INDEX_LIST. +* This leads to an issue in older kernel versions where QEMU, +* through the KVM_GET_MSR_INDEX_LIST check, assumes the kernel +* doesn't maintain MSR_IA32_VMX_PROCBASED_CTLS2, resulting in +* incorrect settings by QEMU for this MSR. +*/ +for (i = 0; i < kvm_feature_msrs->nmsrs; i++) { nit: `i` could be declared here, for (int i = 0; i < kvm_feature_msrs->nmsrs; i++) { do I need to send a v4 version patch,to do this fix? +if (kvm_feature_msrs->indices[i] == MSR_IA32_VMX_PROCBASED_CTLS2) { +has_msr_vmx_procbased_ctls2 = true; +} +} return 0; } -- 2.34.1 Since the minimum KVM version supported for i386 is v4.5 (docs/system/ target-i386.rst), this fix makes sense, so for this patch, Reviewed-by: Zhao Liu Additionally, has_msr_vmx_vmfunc has the similar compat issue. I think it deserves a fix, too. -Zhao Thanks for your reply. In fact, I've tried to process has_msr_vmx_vmfunc in the same way as has_msr_vmx_procbased_ctls in this patch, but when I tested on Linux kernel 4.19.67, I encountered an "error: failed to set MSR 0x491 to 0x***". This issue is due to Linux kernel commit 27c42a1bb ("KVM: nVMX: Enable VMFUNC for the L1 hypervisor", 2017-08-03) exposing VMFUNC to the QEMU guest without corresponding VMFUNC MSR modification code, leading to an error when QEMU attempts to set the VMFUNC MSR. This bug affects kernels from 4.14 to 5.2, with a fix introduced in 5.3 by Paolo (e8a70bd4e "KVM: nVMX: allow setting the VMFUNC controls MSR", 2019-07-02). So the fix for has_msr_vmx_vmfunc is clearly different from has_msr_vmx_procbased_ctls2. However, due to the different kernel support situations, I have not yet come up with a suitable way to handle the compatibility of has_msr_vmx_procbased_ctls2 across different kernel versions. Therefore, should we consider only fixing has_msr_vmx_procbased_ctls2 this time and addressing has_msr_vmx_vmfunc in a future patch when the timing is more appropriate?
Re: [PATCH 0/4] Add support for Zhaoxin Yongfeng CPU model and other improvements
I’m sorry, but currently Zhaoxin has not released any specs or datasheets related to the current patch. Zhaoxin CPUs are compatible with the x86 architecture, particularly with Intel. For example, you can refer to the Intel SDM (Software Developer’s Manual). Regarding the current patch, except for the features in the EDX of CPUID leaf 0xC000_0001, all other features can be found in the general Intel specs and are already well known. The Zhaoxin-specific features in CPUID leaf 0xC000_0001 were introduced to the Linux Kernel a long time ago. For example, the FEAT_C000_0001_EDX feature word defined in QEMU comes from the Linux kernel’s arch/x86/include/asm/cpufeatures.h. The CPU model is just a feature enumeration function, which I believe does not require as deep an understanding of the features as driver code does. These changes will only affect users attempting to emulate Zhaoxin CPUs and will not impact other Vendor/Micro-Arch CPUs. On 6/25/24 10:31, Zhao Liu wrote: [这封邮件来自外部发件人 谨防风险] Hi EwanHai, On Tue, Jun 25, 2024 at 05:19:01AM -0400, EwanHai wrote: Date: Tue, 25 Jun 2024 05:19:01 -0400 From: EwanHai Subject: [PATCH 0/4] Add support for Zhaoxin Yongfeng CPU model and other improvements X-Mailer: git-send-email 2.34.1 This patch series introduces support for the Zhaoxin Yongfeng CPU model and includes some improvements and updates related to Zhaoxin and VIA CPUs. The changes ensure that QEMU can correctly identify and emulate Zhaoxin CPUs, providing accurate functionality and performance characteristics. ### Summary of Changes EwanHai (4): target/i386: Add support for Zhaoxin/VIA CPU vendor identification target/i386: Add CPUID leaf 0xC000_0001 EDX definitions target/i386: Introduce Zhaoxin Yongfeng CPU model target/i386: Update CMPLegacy handling for Zhaoxin and VIA CPUs target/i386/cpu.c | 130 -- target/i386/cpu.h | 38 ++ 2 files changed, 165 insertions(+), 3 deletions(-) ### Known Bugs 1. Issue with VMX Preemption Timer Rate on Yongfeng CPU: - Description: On Yongfeng CPUs, the VMX preemption timer rate is 128, meaning that bits 4:0 of MSR_IA32_VMX_MISC_CTLS should be set to 7. However, due to Intel's rate being 5, the Linux kernel has hardcoded this value as 5: `#define VMX_MISC_EMULATED_PREEMPTION_TIMER_RATE 5` - Impact: This discrepancy can cause incorrect behavior in the VMX preemption timer on Yongfeng CPUs. - Workaround: A patch to correct this issue in the Linux kernel is currently being prepared and will be submitted soon. Thanks for your patch. Is there some spec/datasheet link that people can refer to? Regards, Zhao
Re: [PATCH 4/4] target/i386: Update CMPLegacy handling for Zhaoxin and VIA CPUs
On 7/3/24 10:49, Xiaoyao Li wrote: On 6/25/2024 5:19 PM, EwanHai wrote: Zhaoxin and VIA CPUs handle the CMPLegacy bit in the same way as Intel CPUs. This patch simplifies the existing logic by using the IS_XXX_CPU macro and includes checks for Zhaoxin and VIA vendors to align their behavior with Intel. Signed-off-by: EwanHai --- target/i386/cpu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 50edff077e..0836416617 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -6945,9 +6945,9 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, * So don't set it here for Intel to make Linux guests happy. */ if (threads_per_pkg > 1) { - if (env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1 || - env->cpuid_vendor2 != CPUID_VENDOR_INTEL_2 || - env->cpuid_vendor3 != CPUID_VENDOR_INTEL_3) { + if (!IS_INTEL_CPU(env) && + !IS_ZHAOXIN_CPU(env) && + !IS_VIA_CPU(env)) { it seems you added ! by mistake. *ecx |= 1 << 1; /* CmpLegacy bit */ } } For CPUID leaf 0x8001 ECX bit 1, Intel defines it as "Bits 04-01: Reserved," whereas AMD defines it as "CmpLegacy, Core multi-processing legacy mode." For Intel CPUs and those following Intel's behavior, this bit should not be set to 1. Therefore, I believe the "!" here is correct.
Re: [PATCH 4/4] target/i386: Update CMPLegacy handling for Zhaoxin and VIA CPUs
On 7/3/24 23:19, Xiaoyao Li wrote: On 7/4/2024 11:14 AM, Ewan Hai wrote: On 7/3/24 10:49, Xiaoyao Li wrote: On 6/25/2024 5:19 PM, EwanHai wrote: Zhaoxin and VIA CPUs handle the CMPLegacy bit in the same way as Intel CPUs. This patch simplifies the existing logic by using the IS_XXX_CPU macro and includes checks for Zhaoxin and VIA vendors to align their behavior with Intel. Signed-off-by: EwanHai --- target/i386/cpu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 50edff077e..0836416617 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -6945,9 +6945,9 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, * So don't set it here for Intel to make Linux guests happy. */ if (threads_per_pkg > 1) { - if (env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1 || - env->cpuid_vendor2 != CPUID_VENDOR_INTEL_2 || - env->cpuid_vendor3 != CPUID_VENDOR_INTEL_3) { + if (!IS_INTEL_CPU(env) && + !IS_ZHAOXIN_CPU(env) && + !IS_VIA_CPU(env)) { it seems you added ! by mistake. *ecx |= 1 << 1; /* CmpLegacy bit */ } } For CPUID leaf 0x8001 ECX bit 1, Intel defines it as "Bits 04-01: Reserved," whereas AMD defines it as "CmpLegacy, Core multi-processing legacy mode." For Intel CPUs and those following Intel's behavior, this bit should not be set to 1. Therefore, I believe the "!" here is correct. Sorry, I misread the original code. I think maybe we can just use is_AMD_CPU(). But I'm not sure if any magic use case with customized VENDOR ID relies on it. So you code looks good to me. Ok, thanks. Additionally, in this patch series, I used some VIA terms, which might cause confusion. I will remove all VIA references in the description of the next version of the patch. Currently, the “Centaurhauls” Vendor ID belongs to Zhaoxin CPUs.
Re: [PATCH v2 0/4] Add support for Zhaoxin Yongfeng CPU model and other improvements
Dear Maintainers, Could you please review the current patchset and let me know if you have any concerns? On 7/4/24 07:25, EwanHai wrote: ### Summary of changes EwanHai (4): target/i386: Add support for Zhaoxin CPU vendor identification target/i386: Add CPUID leaf 0xC000_0001 EDX definitions target/i386: Introduce Zhaoxin Yongfeng CPU model target/i386: Update CMPLegacy handling for Zhaoxin CPUs target/i386/cpu.c | 128 -- target/i386/cpu.h | 41 ++- 2 files changed, 165 insertions(+), 4 deletions(-) ### Changes since v1 1. Removed VIA-related information from the patch description to avoid misunderstanding. 2. Replaced CPUID_VENDOR_VIA with CPUID_VENDOR_ZHAOXIN1 because the "Centaurhauls" vendor ID now belongs to Zhaoxin.The previous CPUID_VENDOR_VIA macro was only defined but never used in QEMU, making this change straightforward. v1 link: https://lore.kernel.org/qemu-devel/20240625091905.1325205-1-ewanhai- o...@zhaoxin.com/ ### Known Issues 1. Issue with VMX Preemption Timer Rate on Yongfeng CPU: - Description: On Yongfeng CPUs, the VMX preemption timer rate is 128, meaning that bits 4:0 of MSR_IA32_VMX_MISC_CTLS should be set to 7. However, due to Intel's rate being 5, the Linux kernel has hardcoded this value as 5: `#define VMX_MISC_EMULATED_PREEMPTION_TIMER_RATE 5`. - Impact: This discrepancy can cause incorrect behavior in the VMX preemption timer on Yongfeng CPUs. - Workaround: A patch to correct this issue in the Linux kernel is currently being prepared and will be submitted soon.
Re: [PATCH v2 4/4] target/i386: Update CMPLegacy handling for Zhaoxin CPUs
Hi Zhao Liu, Thank you for your feedback. On 8/8/24 06:30, Zhao Liu wrote: Hi EwanHai, On Thu, Jul 04, 2024 at 07:25:11AM -0400, EwanHai wrote: Date: Thu, 4 Jul 2024 07:25:11 -0400 From: EwanHai Subject: [PATCH v2 4/4] target/i386: Update CMPLegacy handling for Zhaoxin CPUs X-Mailer: git-send-email 2.34.1 Zhaoxin CPUs handle the CMPLegacy bit in the same way as Intel CPUs. This patch simplifies the existing logic by using the IS_XXX_CPU macro and includes checks for Zhaoxin vendor to align their behavior with Intel. Signed-off-by: EwanHai --- target/i386/cpu.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index a3747fc487..c52a4cf3ba 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -6945,9 +6945,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, * So don't set it here for Intel to make Linux guests happy. */ if (threads_per_pkg > 1) { -if (env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1 || -env->cpuid_vendor2 != CPUID_VENDOR_INTEL_2 || -env->cpuid_vendor3 != CPUID_VENDOR_INTEL_3) { +if (!IS_INTEL_CPU(env) && !IS_ZHAOXIN_CPU(env)) { This change implicitly changes the behavior of existing VIA CPU. Is this a bug for the original VIA? If so, I suggest a separate patch to fix it and explain the effect on the VIA (Zhaoxin1) CPU. Regards, Zhao The reason for this change is not due to a discovered bug, but rather because both Centaurhauls and Shanghai CPUs follow Intel’s behavior regarding the CMPLegacy bit. Specifically, AMD CPUs enumerate the threads per package information in the CPUID leaf 0x8001 output ECX register, while Intel (and **other processors following Intel’s behavior**) do not. Therefore, this modification is simply intended to logically supplement the existing code. Given this, do you think it would be appropriate for me to submit a separate patch to explain this behavior and its effect on VIA (Zhaoxin1) CPUs? If so, I will submmit this change in a separate patch. *ecx |= 1 << 1;/* CmpLegacy bit */ } } -- 2.34.1
Re: [PATCH v2 4/4] target/i386: Update CMPLegacy handling for Zhaoxin CPUs
On 8/8/24 23:22, Zhao Liu wrote: Hi EwanHai, On Thu, Jul 04, 2024 at 07:25:11AM -0400, EwanHai wrote: Date: Thu, 4 Jul 2024 07:25:11 -0400 From: EwanHai Subject: [PATCH v2 4/4] target/i386: Update CMPLegacy handling for Zhaoxin CPUs X-Mailer: git-send-email 2.34.1 Zhaoxin CPUs handle the CMPLegacy bit in the same way as Intel CPUs. Here it could be clearer to say "Don't set up CMPLegacy bit in CPUID[0x8001].ecx for VIA/Zhaoxin CPUs". Ok, I will change this statement. This patch simplifies the existing logic by using the IS_XXX_CPU macro and includes checks for Zhaoxin vendor to align their behavior with Intel. Signed-off-by: EwanHai --- target/i386/cpu.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index a3747fc487..c52a4cf3ba 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -6945,9 +6945,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, * So don't set it here for Intel to make Linux guests happy. */ if (threads_per_pkg > 1) { -if (env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1 || -env->cpuid_vendor2 != CPUID_VENDOR_INTEL_2 || -env->cpuid_vendor3 != CPUID_VENDOR_INTEL_3) { +if (!IS_INTEL_CPU(env) && !IS_ZHAOXIN_CPU(env)) { This change implicitly changes the behavior of existing VIA CPU. Is this a bug for the original VIA? If so, I suggest a separate patch to fix it and explain the effect on the VIA (Zhaoxin1) CPU. Regards, Zhao The reason for this change is not due to a discovered bug, but rather because both Centaurhauls and Shanghai CPUs follow Intel’s behavior regarding the CMPLegacy bit. Specifically, AMD CPUs enumerate the threads per package information in the CPUID leaf 0x8001 output ECX register, while Intel (and **other processors following Intel’s behavior**) do not. Therefore, this modification is simply intended to logically supplement the existing code. I see, thanks. Given this, do you think it would be appropriate for me to submit a separate patch to explain this behavior and its effect on VIA (Zhaoxin1) CPUs? If so, I will submmit this change in a separate patch. I think there's no need to split this. However, I think it's necessary to state the effect of the change in the changelog/commit message. It's also worth stating if it won't have any effect on the OS/software. Afterall, the comment of this bit said it affects Linux kernel. Also, changes to the old VIA behavior are worth stating in the commit message, i.e., this patch's changes to Zhaoxin CPUs include the previous VIA CPUs. Additionally, considering this change is to fix the CPUID which doesn't match the bare metal, then what about changing the subject to "target/i386: Mask CMPLegacy bit in CPUID[0x8001].ecx for Zhaoxin/VIA CPUs"? Thanks, Zhao Thank you for your suggestion; the changes will indeed make it clearer. I have a question: since you’ve already added your reviewed-by tag to the first three patches, if I want to modify these descriptions, should I submit a v3 patchset containing all four patches, or should I only send a new patch titled "target/i386: Mask CMPLegacy bit in CPUID[0x8001].ecx for Zhaoxin/Centaur CPUs"? *ecx |= 1 << 1;/* CmpLegacy bit */ } } -- 2.34.1
Re: [PATCH v2 4/4] target/i386: Update CMPLegacy handling for Zhaoxin CPUs
On 8/8/24 23:47, Zhao Liu wrote: On Thu, Aug 08, 2024 at 11:25:45PM -0400, Ewan Hai wrote: [snip] Thank you for your suggestion; the changes will indeed make it clearer. I have a question: since you’ve already added your reviewed-by tag to the first three patches, if I want to modify these descriptions, should I submit a v3 patchset containing all four patches, or should I only send a new patch titled "target/i386: Mask CMPLegacy bit in CPUID[0x8001].ecx for Zhaoxin/Centaur CPUs"? The v3 should contain all 4 patches, and you can add my R/b tag in the first three patches. Thanks! See you in v3!
Re: [PATCH v3 4/4] target/i386: Mask CMPLegacy bit in CPUID[0x80000001].ECX for Zhaoxin CPUs
Thank you for your review!, I will udpate the commit message according to your suggestions to ensure it provides the most accurate information. On 8/12/24 05:52, Zhao Liu wrote: On Fri, Aug 09, 2024 at 05:42:59AM -0400, EwanHai wrote: Date: Fri, 9 Aug 2024 05:42:59 -0400 From: EwanHai Subject: [PATCH v3 4/4] target/i386: Mask CMPLegacy bit in CPUID[0x8001].ECX for Zhaoxin CPUs X-Mailer: git-send-email 2.34.1 Zhaoxin CPUs (including vendors "Shanghai" and "Centaurhauls") handle the CMPLegacy bit similarly to Intel CPUs. Therefore, this commit masks the CMPLegacy bit in CPUID[0x8001].ECX for Zhaoxin CPUs, just as it is done for Intel CPUs. AMD uses the CMPLegacy bit (CPUID[0x8001].ECX.bit1) along with other CPUID information to enumerate platform topology (e.g., the number of logical processors per package). However, for Intel and other CPUs that follow Intel's behavior, CPUID[0x8001].ECX.bit1 is reserved. - Impact on Intel and similar CPUs: This change has no effect on Intel and similar CPUs, as the goal is to accurately emulate CPU CPUID information. - Impact on Linux Guests running on Intel (and similar) vCPUs: During boot, Linux checks if the CPU supports Hyper-Threading. If it detects Maybe "For the kernel before v6.9, if it detects"? About this change, see the below comment... X86_FEATURE_CMP_LEGACY, it assumes Hyper-Threading is not supported. For Intel and similar vCPUs, if the CMPLegacy bit is not masked in CPUID[0x8001].ECX, Linux will incorrectly assume that Hyper-Threading is not supported, even if the vCPU does support it. ...It seems this issue exists in the kernel before v6.9. Thomas' topology refactoring has fixed this behavior: * commit 22d63660c35e ("x86/cpu: Use common topology code for Intel") * commit 598e719c40d6 ("x86/cpu: Use common topology code for Centaur and Zhaoxin") Signed-off-by: EwanHai --- target/i386/cpu.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) Just the above nit. Otherwise, LGTM, Reviewed-by: Zhao Liu
Re: [PATCH v2] target/i386/kvm: Refine VMX controls setting for backward compatibility
Dear Xiaoyao and Maintainers, Are there any new comments regarding this patch? On 2/22/24 22:13, Ewan Hai wrote: On 2/20/24 06:07, Ewan Hai wrote: On 2/20/24 03:32, Xiaoyao Li wrote: diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index 11b8177eff..c8f6c0b531 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -2296,6 +2296,7 @@ void kvm_arch_do_init_vcpu(X86CPU *cpu) static int kvm_get_supported_feature_msrs(KVMState *s) { int ret = 0; + int i; if (kvm_feature_msrs != NULL) { return 0; @@ -2330,6 +2331,19 @@ static int kvm_get_supported_feature_msrs(KVMState *s) return ret; } + /* + * Compatibility fix: + * Older Linux kernels(<5.3) include the MSR_IA32_VMX_PROCBASED_CTLS2 we can be more accurate, that kernel version 4.17 to 5.2, reports MSR_IA32_VMX_PROCBASED_CTLS2 in KVM_GET_MSR_FEATURE_INDEX_LIST but not KVM_GET_MSR_INDEX_LIST. Yeah, I'll add this more precise comment to the next patch. + * only in feature msr list, but not in regular msr list. This lead to + * an issue in older kernel versions where QEMU, through the regular + * MSR list check, assumes the kernel doesn't maintain this msr, + * resulting in incorrect settings by QEMU for this msr. + */ + for (i = 0; i < kvm_feature_msrs->nmsrs; i++) { + if (kvm_feature_msrs->indices[i] == MSR_IA32_VMX_PROCBASED_CTLS2) { + has_msr_vmx_procbased_ctls2 = true; + } + } I'm wondering should we move all the initialization of has_msr_*, that associated with feature MSRs, to here. e.g., has_msr_arch_capabs, has_msr_vmx_vmfunc,... I believe this is a more elegant way to fix the issue, which will be reflected in my next patch. When attempting to move the detection logic for feature MSRs (currently including VMX_VMFUNC, UCODE_REV, ARCH_CAPABILITIES, PROCBASED_CTLS2) from kvm_get_supported_msrs to kvm_get_supported_feature_msrs in the current QEMU, I encountered an "error: failed to set MSR 0x491 to 0x***" on kernel 4.19.67. This issue is due to commit 27c42a1bb ("KVM: nVMX: Enable VMFUNC for the L1 hypervisor", 2017-08-03) exposing VMFUNC to the QEMU guest without corresponding VMFUNC MSR modification code, leading to an error when QEMU proactively tries to set the VMFUNC MSR. This bug affects kernels from 4.14 to 5.2, with a fix introduced in 5.3 by Paolo (e8a70bd4e ("KVM: nVMX: allow setting the VMFUNC controls MSR", 2019-07-02)). Therefore, even if we were to move all feature MSRs to kvm_get_supported_feature_msrs,VMX_VMFUNC could not be moved due to the need to maintain compatibility with different kernel versions. This exception makes our move less elegant. Hence, I am wondering whether we need to move all feature MSRs to kvm_get_supported_feature_msrs. Perhaps we just need to simply move PROCBASED_CTLS2 to fix the "failed set 0x48b ..." type of bugs, and add a comment about it?
PING: VMX controls setting patch for backward compatibility
Hi Zhao Liu and QEMU/KVM Community, I hope this email finds you well. I am writing to follow up on the conversation we had a month ago regarding my patch submission for refining the VMX controls setting for backward compatibility on QEMU-KVM. On October 27, I responded to the feedback and suggestions provided by Zhao Liu, making necessary corrections and improvements to the patch. However, since then, I haven't received any further responses or reviews. I understand that everyone is busy, and I appreciate the time and effort that goes into reviewing these submissions. I just wanted to check if there are any updates, additional feedback, or steps I should take next. I am more than willing to make further changes if needed. Please let me know if there is anything else required from my side for this patch to move forward. Thank you for your time and attention. I look forward to hearing from you. Best regards, Ewan Hai On 10/27/23 02:08, Ewan Hai wrote: Hi Zhao, since I found last email contains non-plain-text content, and...@vger.kernel.org rejected to receive my mail, so just re-send last mail here, to follow the rule of qemu /kvm community. On 10/25/23 23:20, Zhao Liu wrote: On Mon, Sep 25, 2023 at 03:14:53AM -0400, EwanHai wrote: Date: Mon, 25 Sep 2023 03:14:53 -0400 From: EwanHai Subject: [PATCH] target/i386/kvm: Refine VMX controls setting for backward compatibility X-Mailer: git-send-email 2.34.1 Commit 4a910e1 ("target/i386: do not set unsupported VMX secondary execution controls") implemented a workaround for hosts that have specific CPUID features but do not support the corresponding VMX controls, e.g., hosts support RDSEED but do not support RDSEED-Exiting. In detail, commit 4a910e1 introduced a flag `has_msr_vmx_procbased_clts2`. If KVM has `MSR_IA32_VMX_PROCBASED_CTLS2` in its msr list, QEMU would use KVM's settings, avoiding any modifications to this MSR. However, this commit (4a910e1) didn’t account for cases in older Linux s/didn’t/didn't/ I'll fix it. kernels(e.g., linux-4.19.90) where `MSR_IA32_VMX_PROCBASED_CTLS2` is For this old kernel, it's better to add the brief lifecycle note (e.g., lts, EOL) to illustrate the value of considering such compatibility fixes. I've checked the linux-stable repo, found that MSR_IA32_VMX_PROCBASED_CTLS2 is not included in kvm regular msr list until linux-5.3, and in linux-4.19.x(EOL:Dec,2024), there is also no MSR_IA32_VMX_PROCBASED_CTLS2 in kvm regular msr list. So maybe this is an important compatibility fix for kernel < 5.3. in `kvm_feature_msrs`—obtained by ioctl(KVM_GET_MSR_FEATURE_INDEX_LIST), s/—obtained/-obtained/ I'll fix it. but not in `kvm_msr_list`—obtained by ioctl(KVM_GET_MSR_INDEX_LIST). s/—obtained/-obtained/ I'll fix it. As a result,it did not set the `has_msr_vmx_procbased_clts2` flag based on `kvm_msr_list` alone, even though KVM maintains the value of this MSR. This patch supplements the above logic, ensuring that `has_msr_vmx_procbased_clts2` is correctly set by checking both MSR lists, thus maintaining compatibility with older kernels. Signed-off-by: EwanHai --- target/i386/kvm/kvm.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index af101fcdf6..6299284de4 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -2343,6 +2343,7 @@ void kvm_arch_do_init_vcpu(X86CPU *cpu) static int kvm_get_supported_feature_msrs(KVMState *s) { int ret = 0; + int i; if (kvm_feature_msrs != NULL) { return 0; @@ -2377,6 +2378,11 @@ static int kvm_get_supported_feature_msrs(KVMState *s) return ret; } It's worth adding a comment here to indicate that this is a compatibility fix. -Zhao + for (i = 0; i < kvm_feature_msrs->nmsrs; i++) { + if (kvm_feature_msrs->indices[i] == MSR_IA32_VMX_PROCBASED_CTLS2) { + has_msr_vmx_procbased_ctls2 = true; + } + } return 0; } -- 2.34.1 Plan to use patch bellow, any more suggestion? From a3006fcec3615d98ac1eb252a61952d44aa5029b Mon Sep 17 00:00:00 2001 From: EwanHai Date: Mon, 25 Sep 2023 02:11:59 -0400 Subject: [PATCH] target/i386/kvm: Refine VMX controls setting for backward compatibility Commit 4a910e1 ("target/i386: do not set unsupported VMX secondary execution controls") implemented a workaround for hosts that have specific CPUID features but do not support the corresponding VMX controls, e.g., hosts support RDSEED but do not support RDSEED-Exiting. In detail, commit 4a910e1 introduced a flag `has_msr_vmx_procbased_clts2`. If KVM has `MSR_IA32_VMX_PROCBASED_CTLS2` in its msr list, QEMU would use KVM's settings, avoiding any modifications to this MSR. However, this commit (4a910e1) didn't account for cases in older Linux kernels(<5.3) where `MSR_IA32_VMX_PROCBASED_CTLS2` is in `kvm_feature_msrs`-obta
Re: [PATCH] target/i386/kvm: Refine VMX controls setting for backward compatibility
On 10/25/23 23:20, Zhao Liu wrote: On Mon, Sep 25, 2023 at 03:14:53AM -0400, EwanHai wrote: Date: Mon, 25 Sep 2023 03:14:53 -0400 From: EwanHai Subject: [PATCH] target/i386/kvm: Refine VMX controls setting for backward compatibility X-Mailer: git-send-email 2.34.1 Commit 4a910e1 ("target/i386: do not set unsupported VMX secondary execution controls") implemented a workaround for hosts that have specific CPUID features but do not support the corresponding VMX controls, e.g., hosts support RDSEED but do not support RDSEED-Exiting. In detail, commit 4a910e1 introduced a flag `has_msr_vmx_procbased_clts2`. If KVM has `MSR_IA32_VMX_PROCBASED_CTLS2` in its msr list, QEMU would use KVM's settings, avoiding any modifications to this MSR. However, this commit (4a910e1) didn’t account for cases in older Linux s/didn’t/didn't/ I'll fix it. kernels(e.g., linux-4.19.90) where `MSR_IA32_VMX_PROCBASED_CTLS2` is For this old kernel, it's better to add the brief lifecycle note (e.g., lts, EOL) to illustrate the value of considering such compatibility fixes. I've checked the linux-stable repo, found that MSR_IA32_VMX_PROCBASED_CTLS2 is not included in kvm regular msr list until linux-5.3, and in linux-4.19.x(EOL:Dec,2024), there is also no MSR_IA32_VMX_PROCBASED_CTLS2 in kvm regular msr list. So maybe this is an important compatibility fix for kernel < 5.3. in `kvm_feature_msrs`—obtained by ioctl(KVM_GET_MSR_FEATURE_INDEX_LIST), s/—obtained/-obtained/ I'll fix it. but not in `kvm_msr_list`—obtained by ioctl(KVM_GET_MSR_INDEX_LIST). s/—obtained/-obtained/ I'll fix it. As a result,it did not set the `has_msr_vmx_procbased_clts2` flag based on `kvm_msr_list` alone, even though KVM maintains the value of this MSR. This patch supplements the above logic, ensuring that `has_msr_vmx_procbased_clts2` is correctly set by checking both MSR lists, thus maintaining compatibility with older kernels. Signed-off-by: EwanHai --- target/i386/kvm/kvm.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index af101fcdf6..6299284de4 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -2343,6 +2343,7 @@ void kvm_arch_do_init_vcpu(X86CPU *cpu) static int kvm_get_supported_feature_msrs(KVMState *s) { int ret = 0; +int i; if (kvm_feature_msrs != NULL) { return 0; @@ -2377,6 +2378,11 @@ static int kvm_get_supported_feature_msrs(KVMState *s) return ret; } It's worth adding a comment here to indicate that this is a compatibility fix. -Zhao +for (i = 0; i < kvm_feature_msrs->nmsrs; i++) { +if (kvm_feature_msrs->indices[i] == MSR_IA32_VMX_PROCBASED_CTLS2) { +has_msr_vmx_procbased_ctls2 = true; +} +} return 0; } -- 2.34.1 Plan to use patch bellow, any more suggestion? From a3006fcec3615d98ac1eb252a61952d44aa5029b Mon Sep 17 00:00:00 2001 From: EwanHai Date: Mon, 25 Sep 2023 02:11:59 -0400 Subject: [PATCH] target/i386/kvm: Refine VMX controls setting for backward compatibility Commit 4a910e1 ("target/i386: do not set unsupported VMX secondary execution controls") implemented a workaround for hosts that have specific CPUID features but do not support the corresponding VMX controls, e.g., hosts support RDSEED but do not support RDSEED-Exiting. In detail, commit 4a910e1 introduced a flag `has_msr_vmx_procbased_clts2`. If KVM has `MSR_IA32_VMX_PROCBASED_CTLS2` in its msr list, QEMU would use KVM's settings, avoiding any modifications to this MSR. However, this commit (4a910e1) didn't account for cases in older Linux kernels(<5.3) where `MSR_IA32_VMX_PROCBASED_CTLS2` is in `kvm_feature_msrs`-obtained by ioctl(KVM_GET_MSR_FEATURE_INDEX_LIST), but not in `kvm_msr_list`-obtained by ioctl(KVM_GET_MSR_INDEX_LIST). As a result,it did not set the `has_msr_vmx_procbased_clts2` flag based on `kvm_msr_list` alone, even though KVM maintains the value of this MSR. This patch supplements the above logic, ensuring that `has_msr_vmx_procbased_clts2` is correctly set by checking both MSR lists, thus maintaining compatibility with older kernels. Signed-off-by: EwanHai --- target/i386/kvm/kvm.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index af101fcdf6..3cf95f8579 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -2343,6 +2343,7 @@ void kvm_arch_do_init_vcpu(X86CPU *cpu) static int kvm_get_supported_feature_msrs(KVMState *s) { int ret = 0; +int i; if (kvm_feature_msrs != NULL) { return 0; @@ -2377,6 +2378,19 @@ static int kvm_get_supported_feature_msrs(KVMState *s) return ret; } +/* + * Compatibility fix: + * Older Linux kernels(<5.3) include the MSR_IA32_VMX_PROCBASED_CTLS2 + * only in feature msr list, but not in regular msr list. This lead to + * an issue in older kernel versions where QEMU, throu
Re: [PATCH] target/i386/kvm: Refine VMX controls setting for backward compatibility
Hi Zhao, since I found last email contains non-plain-text content, and...@vger.kernel.org rejected to receive my mail, so just re-send last mail here, to follow the rule of qemu /kvm community. On 10/25/23 23:20, Zhao Liu wrote: On Mon, Sep 25, 2023 at 03:14:53AM -0400, EwanHai wrote: Date: Mon, 25 Sep 2023 03:14:53 -0400 From: EwanHai Subject: [PATCH] target/i386/kvm: Refine VMX controls setting for backward compatibility X-Mailer: git-send-email 2.34.1 Commit 4a910e1 ("target/i386: do not set unsupported VMX secondary execution controls") implemented a workaround for hosts that have specific CPUID features but do not support the corresponding VMX controls, e.g., hosts support RDSEED but do not support RDSEED-Exiting. In detail, commit 4a910e1 introduced a flag `has_msr_vmx_procbased_clts2`. If KVM has `MSR_IA32_VMX_PROCBASED_CTLS2` in its msr list, QEMU would use KVM's settings, avoiding any modifications to this MSR. However, this commit (4a910e1) didn’t account for cases in older Linux s/didn’t/didn't/ I'll fix it. kernels(e.g., linux-4.19.90) where `MSR_IA32_VMX_PROCBASED_CTLS2` is For this old kernel, it's better to add the brief lifecycle note (e.g., lts, EOL) to illustrate the value of considering such compatibility fixes. I've checked the linux-stable repo, found that MSR_IA32_VMX_PROCBASED_CTLS2 is not included in kvm regular msr list until linux-5.3, and in linux-4.19.x(EOL:Dec,2024), there is also no MSR_IA32_VMX_PROCBASED_CTLS2 in kvm regular msr list. So maybe this is an important compatibility fix for kernel < 5.3. in `kvm_feature_msrs`—obtained by ioctl(KVM_GET_MSR_FEATURE_INDEX_LIST), s/—obtained/-obtained/ I'll fix it. but not in `kvm_msr_list`—obtained by ioctl(KVM_GET_MSR_INDEX_LIST). s/—obtained/-obtained/ I'll fix it. As a result,it did not set the `has_msr_vmx_procbased_clts2` flag based on `kvm_msr_list` alone, even though KVM maintains the value of this MSR. This patch supplements the above logic, ensuring that `has_msr_vmx_procbased_clts2` is correctly set by checking both MSR lists, thus maintaining compatibility with older kernels. Signed-off-by: EwanHai --- target/i386/kvm/kvm.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index af101fcdf6..6299284de4 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -2343,6 +2343,7 @@ void kvm_arch_do_init_vcpu(X86CPU *cpu) static int kvm_get_supported_feature_msrs(KVMState *s) { int ret = 0; +int i; if (kvm_feature_msrs != NULL) { return 0; @@ -2377,6 +2378,11 @@ static int kvm_get_supported_feature_msrs(KVMState *s) return ret; } It's worth adding a comment here to indicate that this is a compatibility fix. -Zhao +for (i = 0; i < kvm_feature_msrs->nmsrs; i++) { +if (kvm_feature_msrs->indices[i] == MSR_IA32_VMX_PROCBASED_CTLS2) { +has_msr_vmx_procbased_ctls2 = true; +} +} return 0; } -- 2.34.1 Plan to use patch bellow, any more suggestion? From a3006fcec3615d98ac1eb252a61952d44aa5029b Mon Sep 17 00:00:00 2001 From: EwanHai Date: Mon, 25 Sep 2023 02:11:59 -0400 Subject: [PATCH] target/i386/kvm: Refine VMX controls setting for backward compatibility Commit 4a910e1 ("target/i386: do not set unsupported VMX secondary execution controls") implemented a workaround for hosts that have specific CPUID features but do not support the corresponding VMX controls, e.g., hosts support RDSEED but do not support RDSEED-Exiting. In detail, commit 4a910e1 introduced a flag `has_msr_vmx_procbased_clts2`. If KVM has `MSR_IA32_VMX_PROCBASED_CTLS2` in its msr list, QEMU would use KVM's settings, avoiding any modifications to this MSR. However, this commit (4a910e1) didn't account for cases in older Linux kernels(<5.3) where `MSR_IA32_VMX_PROCBASED_CTLS2` is in `kvm_feature_msrs`-obtained by ioctl(KVM_GET_MSR_FEATURE_INDEX_LIST), but not in `kvm_msr_list`-obtained by ioctl(KVM_GET_MSR_INDEX_LIST). As a result,it did not set the `has_msr_vmx_procbased_clts2` flag based on `kvm_msr_list` alone, even though KVM maintains the value of this MSR. This patch supplements the above logic, ensuring that `has_msr_vmx_procbased_clts2` is correctly set by checking both MSR lists, thus maintaining compatibility with older kernels. Signed-off-by: EwanHai --- target/i386/kvm/kvm.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index af101fcdf6..3cf95f8579 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -2343,6 +2343,7 @@ void kvm_arch_do_init_vcpu(X86CPU *cpu) static int kvm_get_supported_feature_msrs(KVMState *s) { int ret = 0; +int i; if (kvm_feature_msrs != NULL) { return 0; @@ -2377,6 +2378,19 @@ static int kvm_get_supported_feature_msrs(KVMState *s) return ret; } +/* + * Compatibility fix: + * Older Linux
Re: [PATCH v3] target/i386/kvm: Refine VMX controls setting for backward compatibility
Dear Maintainers and Paolo, I hope this message finds you well. I am writing to inquire about the status of the patch I submitted a month ago. Could you please provide any updates or addtional comments regarding its review? Thank you for your time and assistance. Best regards, Ewan On 6/25/24 10:08, Zhao Liu wrote: Additionally, has_msr_vmx_vmfunc has the similar compat issue. I think it deserves a fix, too. -Zhao Thanks for your reply. In fact, I've tried to process has_msr_vmx_vmfunc in the same way as has_msr_vmx_procbased_ctls in this patch, but when I tested on Linux kernel 4.19.67, I encountered an "error: failed to set MSR 0x491 to 0x***". This issue is due to Linux kernel commit 27c42a1bb ("KVM: nVMX: Enable VMFUNC for the L1 hypervisor", 2017-08-03) exposing VMFUNC to the QEMU guest without corresponding VMFUNC MSR modification code, leading to an error when QEMU attempts to set the VMFUNC MSR. This bug affects kernels from 4.14 to 5.2, with a fix introduced in 5.3 by Paolo (e8a70bd4e "KVM: nVMX: allow setting the VMFUNC controls MSR", 2019-07-02). It looks like this fix was not ported to the 4.19 stable kernel. So the fix for has_msr_vmx_vmfunc is clearly different from has_msr_vmx_procbased_ctls2. However, due to the different kernel support situations, I have not yet come up with a suitable way to handle the compatibility of has_msr_vmx_procbased_ctls2 across different kernel versions. Therefore, should we consider only fixing has_msr_vmx_procbased_ctls2 this time and addressing has_msr_vmx_vmfunc in a future patch when the timing is more appropriate? I agree this fix should focus on MSR_IA32_VMX_PROCBASED_CTLS2. But I think at least we need a comment (maybe a TODO) to note the case of has_msr_vmx_vmfunc in a followup patch. Let's wait and see what Paolo will say. -Zhao
[PATCH v2] target/i386: Fix model number of Zhaoxin YongFeng vCPU template
The model number was mistakenly set to 0x0b (11) in commit ff04bc1ac4. The correct value is 0x5b. This mistake occurred because the extended model bits in cpuid[eax=0x1].eax were overlooked, and only the base model was used. This patch corrects the model field. Fixes: ff04bc1ac4 ("target/i386: Introduce Zhaoxin Yongfeng CPU model") Signed-off-by: Ewan Hai Reviewed-by: Zhao Liu --- target/i386/cpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 1b64ceaaba..0dd9788a68 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -5503,7 +5503,7 @@ static const X86CPUDefinition builtin_x86_defs[] = { .level = 0x1F, .vendor = CPUID_VENDOR_ZHAOXIN1, .family = 7, -.model = 11, +.model = 0x5b, .stepping = 3, /* missing: CPUID_HT, CPUID_TM, CPUID_PBE */ .features[FEAT_1_EDX] = -- 2.34.1
Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
[2] As mentioned in [1], QEMU always sets the vCPU's vendor to match the host's vendor when acceleration (KVM or HVF) is enabled. Therefore, if users want to emulate a Zhaoxin CPU on an Intel host, the vendor must be set manually.Furthermore, should we display a warning to users who enable both vPMU and KVM acceleration but do not manually set the guest vendor when it differs from the host vendor? Maybe not? Sometimes I emulate AMD on Intel host, while vendor is still the default :) Okay, handling this situation can be rather complex, so let's keep it simple. I have added a dedicated function to capture the intended behavior for potential future reference. Anyway, Thanks for taking Zhaoxin's situation into account, regardless. +/* + * check_vendor_compatibility_and_warn() returns true if the host and + * guest vendors are compatible for vPMU virtualization. In addition, if + * the guest vendor is not explicitly set in a cross-vendor emulation + * scenario (e.g., a Zhaoxin host emulating an Intel guest or vice versa), + * it issues a warning. + */ +static bool check_vendor_compatibility_and_warn(CPUX86State *env) +{ +char host_vendor[CPUID_VENDOR_SZ + 1]; +uint32_t host_cpuid_vendor1, host_cpuid_vendor2, host_cpuid_vendor3; + +/* Retrieve host vendor info */ +host_cpuid(0x0, 0, NULL, &host_cpuid_vendor1, &host_cpuid_vendor3, + &host_cpuid_vendor2); +x86_cpu_vendor_words2str(host_vendor, host_cpuid_vendor1, + host_cpuid_vendor2, host_cpuid_vendor3); + +/* + * Case A: + * If the host vendor is Intel or Zhaoxin and the guest CPU type is + * either Intel or Zhaoxin, consider them compatible. However, if a + * cross-vendor scenario is detected (e.g., host is Zhaoxin but guest is + * Intel, or vice versa) and the guest vendor fields have not been + * overridden (i.e., they still match the host), then warn the user. + */ +if ((g_str_equal(host_vendor, CPUID_VENDOR_INTEL) || + g_str_equal(host_vendor, CPUID_VENDOR_ZHAOXIN1) || + g_str_equal(host_vendor, CPUID_VENDOR_ZHAOXIN2)) && +(IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env))) +{ +if ((g_str_equal(host_vendor, CPUID_VENDOR_ZHAOXIN1) || + g_str_equal(host_vendor, CPUID_VENDOR_ZHAOXIN2)) && +IS_INTEL_CPU(env) && +(env->cpuid_vendor1 == host_cpuid_vendor1 && + env->cpuid_vendor2 == host_cpuid_vendor2 && + env->cpuid_vendor3 == host_cpuid_vendor3)) +{ +warning_report("vPMU emulation will fail because the guest vendor " +"is not explicitly set. Use '-cpu,vendor=Intel' to " +"emulate Intel vPMU on a Zhaoxin host."); +} +else if (g_str_equal(host_vendor, CPUID_VENDOR_INTEL) && + IS_ZHAOXIN_CPU(env) && + (env->cpuid_vendor1 == host_cpuid_vendor1 && + env->cpuid_vendor2 == host_cpuid_vendor2 && + env->cpuid_vendor3 == host_cpuid_vendor3)) +{ +warning_report("vPMU emulation will fail because the guest vendor" +"is not explicitly set. Use '-cpu,vendor=Zhaoxin' " +"to emulate Zhaoxin vPMU on an Intel host."); +} +return true; +} + +/* + * Case B: + * For other CPU types, if the guest vendor fields exactly match the host, + * consider them compatible. + */ +if (env->cpuid_vendor1 == host_cpuid_vendor1 && +env->cpuid_vendor2 == host_cpuid_vendor2 && +env->cpuid_vendor3 == host_cpuid_vendor3) +{ +return true; +} + +return false; +} +
Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset
On 4/7/25 4:51 PM, Zhao Liu wrote: On Tue, Apr 01, 2025 at 11:35:49AM +0800, Ewan Hai wrote: Date: Tue, 1 Apr 2025 11:35:49 +0800 From: Ewan Hai Subject: Re: [PATCH v2 08/10] target/i386/kvm: reset AMD PMU registers during VM reset [2] As mentioned in [1], QEMU always sets the vCPU's vendor to match the host's vendor when acceleration (KVM or HVF) is enabled. Therefore, if users want to emulate a Zhaoxin CPU on an Intel host, the vendor must be set manually.Furthermore, should we display a warning to users who enable both vPMU and KVM acceleration but do not manually set the guest vendor when it differs from the host vendor? Maybe not? Sometimes I emulate AMD on Intel host, while vendor is still the default :) Okay, handling this situation can be rather complex, so let's keep it simple. I have added a dedicated function to capture the intended behavior for potential future reference. Anyway, Thanks for taking Zhaoxin's situation into account, regardless. Thanks for your code example!! Zhaoxin implements perfmon v2, so I think checking the vendor might be overly complicated. If a check is needed, it seems more reasonable to check the perfmon version rather than the vendor, similar to how avx10 version is checked in x86_cpu_filter_features(). I understand Ewan's concern is that if an Intel guest requires a higher perfmon version that the Zhaoxin host doesn't support, there could be issues (although I think this situation doesn't currently exist in KVM-QEMU, one reason is QEMU uses the pmu_version in 0xa queried from KVM directly, which means QEMU currently doesn't support custom pmu_version). Yeah, that's exactly what I was concerned about. Thanks for clearing that up! perfmon_version is a great idea --- I might add it as a property to the QEMU vCPU template in the future, so it can adjust based on user input and host support. Can't promise a timeline yet, but it's definitely something I'll keep in mind.
Re: [PATCH v2] target/i386: Fix model number of Zhaoxin YongFeng vCPU template
On 4/11/25 11:22 AM, Zhao Liu wrote: On Thu, Apr 10, 2025 at 10:07:15PM +0800, Ewan Hai wrote: Date: Thu, 10 Apr 2025 22:07:15 +0800 From: Ewan Hai Subject: Re: [PATCH v2] target/i386: Fix model number of Zhaoxin YongFeng vCPU template On 4/10/25 8:22 PM, Paolo Bonzini wrote: On 4/7/25 04:07, Ewan Hai wrote: The model number was mistakenly set to 0x0b (11) in commit ff04bc1ac4. The correct value is 0x5b. This mistake occurred because the extended model bits in cpuid[eax=0x1].eax were overlooked, and only the base model was used. This patch corrects the model field. Hi, please follow commit e0013791b9326945ccd09b5b602437beb322cab8 to define a new version of the CPU. I’ve noticed that in the QEMU repository at commit e0013791b9326945ccd09b5b602437beb322cab8 (as HEAD), the following patches I previously submitted (which the Zhaoxin YongFeng vCPU model depends on) are not included: :-) e0013791b9326945ccd09b5b602437beb322cab8 is an example case to show how to fix model id. - 5d20aa540b6991c0dbeef933d2055e5372f52e0e: "target/i386: Add support for Zhaoxin CPU vendor identification" - c0799e8b003713e07b546faba600363eccd179ee: "target/i386: Add CPUID leaf 0xC000_0001 EDX definitions" - ff04bc1ac478656e5d6a255bf4069edb3f55bc58: "target/i386: Introduce Zhaoxin Yongfeng CPU model" (this is the main patch that needs to be fixed) - a4e749780bd20593c0c386612a51bf4d64a80132: "target/i386: Mask CMPLegacy bit in CPUID[0x8001].ECX for Zhaoxin CPUs" Should I resend the entire patchset, or would it be sufficient to just send a revised version of the “target/i386: Introduce Zhaoxin Yongfeng CPU model” patch? IIUC, because this fix is planning to land in v10.1 (next release cycle), current CPU model (will be released in v10.0) can't be modified directly. It is only possible to directly modify an unreleased CPU model during the same release cycle. Thus it's enough to just introduce a v2 and correct your model id like this: diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 1b64ceaaba46..1ca1c3a729e8 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -5621,6 +5621,17 @@ static const X86CPUDefinition builtin_x86_defs[] = { .features[FEAT_VMX_VMFUNC] = MSR_VMX_VMFUNC_EPT_SWITCHING, .xlevel = 0x8008, .model_id = "Zhaoxin YongFeng Processor", +.versions = (X86CPUVersionDefinition[]) { +{ .version = 1 }, +{ +.version = 2, +.props = (PropValue[]) { +{ "model", "0x5b" }, +{ /* end of list */ } +} +}, +{ /* end of list */ } +} }, }; Thanks again for your patience and explanation. I'm not entirely sure if this is the best approach. I have one thought, and I'd like your help to confirm whether I'm on the right track or not. From what I can tell, most other vCPU definitions that use the .versions mechanism do so incrementally: for instance, they add new features in v2, v3, etc., but each of those versions (v1, v2, v3) remains valid for practical use. However, in our specific case, the v1 version of the Zhaoxin vCPU definition has an incorrect .model value, which breaks the Linux guest's vPMU functionality. That makes me uncertain whether using new version definitions to fix this issue is really the best solution. After all, v1 itself would remain problematic. Do you have any thoughts on whether it might be better to correct the existing definition, or do you think the versioned approach is still the recommended path? I appreciate any input or guidance you can provide.
Re: [PATCH v3] target/i386: Fix model number of Zhaoxin YongFeng vCPU template
On 4/14/25 11:05 PM, Zhao Liu wrote: On Mon, Apr 14, 2025 at 03:53:42AM -0400, Ewan Hai wrote: Date: Mon, 14 Apr 2025 03:53:42 -0400 From: Ewan Hai Subject: [PATCH v3] target/i386: Fix model number of Zhaoxin YongFeng vCPU template X-Mailer: git-send-email 2.34.1 The model number was mistakenly set to 0x0b (11) in commit ff04bc1ac4. The correct value is 0x5b. This mistake occurred because the extended model bits in cpuid[eax=0x1].eax were overlooked, and only the base model was used. Using the wrong model number can affect guest behavior. One known issue is that vPMU (which relies on the model number) may fail to operate correctly. This patch corrects the model field by introducing a new vCPU version. Fixes: ff04bc1ac4 ("target/i386: Introduce Zhaoxin Yongfeng CPU model") Signed-off-by: Ewan Hai --- target/i386/cpu.c | 12 1 file changed, 12 insertions(+) Reviewed-by: Zhao Liu BTW, if you want to add more notes or explaination to strongly ask users to use v2, you can add a section "Preferred CPU models for Zhaoxin x86 hosts" in docs/system/cpu-models-x86.rst.inc. Thanks for the reminder and the review, Zhao. I'll send a v4 patch that includes a new section titled “Preferred CPU models for Zhaoxin x86 hosts” in docs/system/cpu-models-x86.rst.inc to strongly recommend using v2.
Re: [PATCH v2] target/i386: Fix model number of Zhaoxin YongFeng vCPU template
On 4/10/25 8:22 PM, Paolo Bonzini wrote: On 4/7/25 04:07, Ewan Hai wrote: The model number was mistakenly set to 0x0b (11) in commit ff04bc1ac4. The correct value is 0x5b. This mistake occurred because the extended model bits in cpuid[eax=0x1].eax were overlooked, and only the base model was used. This patch corrects the model field. Hi, please follow commit e0013791b9326945ccd09b5b602437beb322cab8 to define a new version of the CPU. I’ve noticed that in the QEMU repository at commit e0013791b9326945ccd09b5b602437beb322cab8 (as HEAD), the following patches I previously submitted (which the Zhaoxin YongFeng vCPU model depends on) are not included: - 5d20aa540b6991c0dbeef933d2055e5372f52e0e: "target/i386: Add support for Zhaoxin CPU vendor identification" - c0799e8b003713e07b546faba600363eccd179ee: "target/i386: Add CPUID leaf 0xC000_0001 EDX definitions" - ff04bc1ac478656e5d6a255bf4069edb3f55bc58: "target/i386: Introduce Zhaoxin Yongfeng CPU model" (this is the main patch that needs to be fixed) - a4e749780bd20593c0c386612a51bf4d64a80132: "target/i386: Mask CMPLegacy bit in CPUID[0x8001].ECX for Zhaoxin CPUs" Should I resend the entire patchset, or would it be sufficient to just send a revised version of the “target/i386: Introduce Zhaoxin Yongfeng CPU model” patch? Thank you for your guidance.
Re: [PATCH v2] target/i386: Fix model number of Zhaoxin YongFeng vCPU template
On 4/14/25 2:44 PM, Xiaoyao Li wrote: On 4/11/2025 3:42 PM, Ewan Hai wrote: On 4/11/25 11:22 AM, Zhao Liu wrote: On Thu, Apr 10, 2025 at 10:07:15PM +0800, Ewan Hai wrote: Date: Thu, 10 Apr 2025 22:07:15 +0800 From: Ewan Hai Subject: Re: [PATCH v2] target/i386: Fix model number of Zhaoxin YongFeng vCPU template On 4/10/25 8:22 PM, Paolo Bonzini wrote: On 4/7/25 04:07, Ewan Hai wrote: The model number was mistakenly set to 0x0b (11) in commit ff04bc1ac4. The correct value is 0x5b. This mistake occurred because the extended model bits in cpuid[eax=0x1].eax were overlooked, and only the base model was used. This patch corrects the model field. Hi, please follow commit e0013791b9326945ccd09b5b602437beb322cab8 to define a new version of the CPU. I’ve noticed that in the QEMU repository at commit e0013791b9326945ccd09b5b602437beb322cab8 (as HEAD), the following patches I previously submitted (which the Zhaoxin YongFeng vCPU model depends on) are not included: :-) e0013791b9326945ccd09b5b602437beb322cab8 is an example case to show how to fix model id. - 5d20aa540b6991c0dbeef933d2055e5372f52e0e: "target/i386: Add support for Zhaoxin CPU vendor identification" - c0799e8b003713e07b546faba600363eccd179ee: "target/i386: Add CPUID leaf 0xC000_0001 EDX definitions" - ff04bc1ac478656e5d6a255bf4069edb3f55bc58: "target/i386: Introduce Zhaoxin Yongfeng CPU model" (this is the main patch that needs to be fixed) - a4e749780bd20593c0c386612a51bf4d64a80132: "target/i386: Mask CMPLegacy bit in CPUID[0x8001].ECX for Zhaoxin CPUs" Should I resend the entire patchset, or would it be sufficient to just send a revised version of the “target/i386: Introduce Zhaoxin Yongfeng CPU model” patch? IIUC, because this fix is planning to land in v10.1 (next release cycle), current CPU model (will be released in v10.0) can't be modified directly. It is only possible to directly modify an unreleased CPU model during the same release cycle. Thus it's enough to just introduce a v2 and correct your model id like this: diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 1b64ceaaba46..1ca1c3a729e8 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -5621,6 +5621,17 @@ static const X86CPUDefinition builtin_x86_defs[] = { .features[FEAT_VMX_VMFUNC] = MSR_VMX_VMFUNC_EPT_SWITCHING, .xlevel = 0x8008, .model_id = "Zhaoxin YongFeng Processor", + .versions = (X86CPUVersionDefinition[]) { + { .version = 1 }, + { + .version = 2, + .props = (PropValue[]) { + { "model", "0x5b" }, + { /* end of list */ } + } + }, + { /* end of list */ } + } }, }; Thanks again for your patience and explanation. I'm not entirely sure if this is the best approach. I have one thought, and I'd like your help to confirm whether I'm on the right track or not. From what I can tell, most other vCPU definitions that use the .versions mechanism do so incrementally: for instance, they add new features in v2, v3, etc., but each of those versions (v1, v2, v3) remains valid for practical use. However, in our specific case, the v1 version of the Zhaoxin vCPU definition has an incorrect .model value, which breaks the Linux guest's vPMU functionality. That makes me uncertain whether using new version definitions to fix this issue is really the best solution. After all, v1 itself would remain problematic. Do you have any thoughts on whether it might be better to correct the existing definition, or do you think the versioned approach is still the recommended path? I appreciate any input or guidance you can provide. If changing the @model value directly, it will introduce user visible change. E.g., live migrating from old QEMU to new QEMU, the guest will find the model number changes. That's why versioned CPU model was introduced. It becomes the fact already that YongFeng-v1 has model id 11 and broken vpmu. If user create vcpu with YongFeng-v1, user has to accept it. If user wants a working vpmu, go and use YongFeng-v2 Got it. Thanks for the explanation. I'll send a new fix patch as soon as possible.
[PATCH v3] target/i386: Fix model number of Zhaoxin YongFeng vCPU template
The model number was mistakenly set to 0x0b (11) in commit ff04bc1ac4. The correct value is 0x5b. This mistake occurred because the extended model bits in cpuid[eax=0x1].eax were overlooked, and only the base model was used. Using the wrong model number can affect guest behavior. One known issue is that vPMU (which relies on the model number) may fail to operate correctly. This patch corrects the model field by introducing a new vCPU version. Fixes: ff04bc1ac4 ("target/i386: Introduce Zhaoxin Yongfeng CPU model") Signed-off-by: Ewan Hai --- target/i386/cpu.c | 12 1 file changed, 12 insertions(+) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 1b64ceaaba..3fb1ec62da 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -5621,6 +5621,18 @@ static const X86CPUDefinition builtin_x86_defs[] = { .features[FEAT_VMX_VMFUNC] = MSR_VMX_VMFUNC_EPT_SWITCHING, .xlevel = 0x8008, .model_id = "Zhaoxin YongFeng Processor", +.versions = (X86CPUVersionDefinition[]) { +{ .version = 1 }, +{ +.version = 2, +.note = "with the correct model number", +.props = (PropValue[]) { +{ "model", "0x5b" }, +{ /* end of list */ } +} +}, +{ /* end of list */ } +} }, }; -- 2.34.1
Re: [PATCH v4] target/i386: Fix model number of Zhaoxin YongFeng vCPU template
On 4/24/25 3:25 PM, Michael Tokarev wrote: On 15.04.2025 05:45, Ewan Hai wrote: The model number was mistakenly set to 0x0b (11) in commit ff04bc1ac4. The correct value is 0x5b. This mistake occurred because the extended model bits in cpuid[eax=0x1].eax were overlooked, and only the base model was used. Using the wrong model number can affect guest behavior. One known issue is that vPMU (which relies on the model number) may fail to operate correctly. This patch corrects the model field by introducing a new vCPU version. Additionally, it adds a "Preferred CPU models for Zhaoxin x86 hosts" section in docs/system/cpu-models-x86.rst.inc to recommend the appropriate Zhaoxin CPU model(s). Fixes: ff04bc1ac4 ("target/i386: Introduce Zhaoxin Yongfeng CPU model") Signed-off-by: Ewan Hai Reviewed-by: Zhao Liu Is it a qemu-stable material (for 10.0.x)? I'm picking this one up, please let me know if I should not. I'm not sure if this should go into qemu-stable. Maybe Paolo Bonzini or Zhao can answer?
Re: [RFC 01/10] i386/cpu: Mark CPUID[0x80000005] as reserved for Intel
On 4/23/25 7:46 PM, Zhao Liu wrote: Per SDM, 0x8005 leaf is reserved for Intel CPU, and its current "assert" check blocks adding new cache model for non-AMD CPUs. Therefore, check the vendor and encode this leaf as all-0 for Intel CPU. And since Zhaoxin mostly follows Intel behavior, apply the vendor check for Zhaoxin as well. Thanks for taking Zhaoxin CPUs into account. Zhaoxin follows AMD's definition for CPUID leaf 0x8005, so this leaf is valid on our CPUs rather than reserved. We do, however, follow Intel's definition for leaf 0x8006. Note, for !vendor_cpuid_only case, non-AMD CPU would get the wrong information, i.e., get AMD's cache model for Intel or Zhaoxin CPUs. For this case, there is no need to tweak for non-AMD CPUs, because vendor_cpuid_only has been turned on by default since PC machine v6.1. Signed-off-by: Zhao Liu --- target/i386/cpu.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 1b64ceaaba46..8fdafa8aedaf 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -7248,11 +7248,23 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, *edx = env->cpuid_model[(index - 0x8002) * 4 + 3]; break; case 0x8005: -/* cache info (L1 cache) */ -if (cpu->cache_info_passthrough) { +/* + * cache info (L1 cache) + * + * For !vendor_cpuid_only case, non-AMD CPU would get the wrong + * information, i.e., get AMD's cache model. It doesn't matter, + * vendor_cpuid_only has been turned on by default since + * PC machine v6.1. + */ +if (cpu->vendor_cpuid_only && +(IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env))) { Given that, there is no need to add IS_ZHAOXIN_CPU(env) to the 0x8005 path. Note that the L1 TLB constants for the YongFeng core differ from the current values in target/i386/cpu.c(YongFeng defaults shown in brackets): #define L1_DTLB_2M_ASSOC 1 (4) #define L1_DTLB_2M_ENTRIES 255 (32) #define L1_DTLB_4K_ASSOC 1 (6) #define L1_DTLB_4K_ENTRIES 255 (96) #define L1_ITLB_2M_ASSOC 1 (4) #define L1_ITLB_2M_ENTRIES 255 (32) #define L1_ITLB_4K_ASSOC 1 (6) #define L1_ITLB_4K_ENTRIES 255 (96) I am still reviewing how these constants flow through cpu_x86_cpuid() for leaf 0x8005, so I'm not yet certain whether they are overridden. For now, the patchset can ignore Zhaoxin in leaf 0x8005. Once I have traced the code path, I will send an update if needed. Please include Zhaoxin in the handling for leaf 0x8006. I should have sent this after completing my review, but I did not want to delay your work. Sorry for the noise. Thanks again for your work.
[PATCH v4] target/i386: Fix model number of Zhaoxin YongFeng vCPU template
The model number was mistakenly set to 0x0b (11) in commit ff04bc1ac4. The correct value is 0x5b. This mistake occurred because the extended model bits in cpuid[eax=0x1].eax were overlooked, and only the base model was used. Using the wrong model number can affect guest behavior. One known issue is that vPMU (which relies on the model number) may fail to operate correctly. This patch corrects the model field by introducing a new vCPU version. Additionally, it adds a "Preferred CPU models for Zhaoxin x86 hosts" section in docs/system/cpu-models-x86.rst.inc to recommend the appropriate Zhaoxin CPU model(s). Fixes: ff04bc1ac4 ("target/i386: Introduce Zhaoxin Yongfeng CPU model") Signed-off-by: Ewan Hai Reviewed-by: Zhao Liu --- docs/system/cpu-models-x86.rst.inc | 17 + target/i386/cpu.c | 12 2 files changed, 29 insertions(+) diff --git a/docs/system/cpu-models-x86.rst.inc b/docs/system/cpu-models-x86.rst.inc index 6a770ca835..ba001422e2 100644 --- a/docs/system/cpu-models-x86.rst.inc +++ b/docs/system/cpu-models-x86.rst.inc @@ -369,6 +369,23 @@ features are included if using "Host passthrough" or "Host model". Note that not all CPU hardware will support this feature. +Preferred CPU models for Zhaoxin x86 hosts +^^ +The following CPU models are preferred for use on Zhaoxin hosts. +Administrators / applications are recommended to use the CPU model that +matches the generation of the host CPUs in use. In a deployment with a +mixture of host CPU models between machines, if live migration +compatibility is required, use the newest CPU model that is compatible +across all desired hosts. + +Currently, Zhaoxin provides a single CPU model (with potential for more in +the near future), which has two versions. Among them, version 2 is recommended +as it resolves several guest runtime issues related to the model field (FMS). + +``YongFeng-v2`` +Zhaoxin KH-4 Processor (2022) + + Default x86 CPU models ^^ diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 1b64ceaaba..3fb1ec62da 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -5621,6 +5621,18 @@ static const X86CPUDefinition builtin_x86_defs[] = { .features[FEAT_VMX_VMFUNC] = MSR_VMX_VMFUNC_EPT_SWITCHING, .xlevel = 0x8008, .model_id = "Zhaoxin YongFeng Processor", +.versions = (X86CPUVersionDefinition[]) { +{ .version = 1 }, +{ +.version = 2, +.note = "with the correct model number", +.props = (PropValue[]) { +{ "model", "0x5b" }, +{ /* end of list */ } +} +}, +{ /* end of list */ } +} }, }; -- 2.34.1
[Bug] QEMU TCG warnings after commit c6bd2dd63420 - HTT / CMP_LEG bits
Hi Community, This email contains 3 bugs appear to share the same root cause. [1] We ran into the following warnings when running QEMU v10.0.0 in TCG mode: qemu-system-x86_64 \ -machine q35 \ -m 4G -smp 4 \ -kernel ./arch/x86/boot/bzImage \ -bios /usr/share/ovmf/OVMF.fd \ -drive file=~/kernel/rootfs.ext4,index=0,format=raw,media=disk \ -drive file=~/kernel/swap.img,index=1,format=raw,media=disk \ -nographic \ -append 'root=/dev/sda rw resume=/dev/sdb console=ttyS0 nokaslr' qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:EDX.ht [bit 28] qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.8001H:ECX.cmp-legacy [bit 1] (repeats 4 times, once per vCPU) Tracing the history shows that commit c6bd2dd63420 "i386/cpu: Set up CPUID_HT in x86_cpu_expand_features() instead of cpu_x86_cpuid()" is what introduced the warnings. Since that commit, TCG unconditionally advertises HTT (CPUID 1 EDX[28]) and CMP_LEG (CPUID 8000_0001 ECX[1]). Because TCG itself has no SMT support, these bits trigger the warnings above. [2] Also, Zhao pointed me to a similar report on GitLab: https://gitlab.com/qemu-project/qemu/-/issues/2894 The symptoms there look identical to what we're seeing. By convention we file one issue per email, but these two appear to share the same root cause, so I'm describing them together here. [3] My colleague Alan noticed what appears to be a related problem: if we launch a guest with '-cpu ,-ht --enable-kvm', which means explicitly removing the ht flag, but the guest still reports HT(cat /proc/cpuinfo in linux guest) enabled. In other words, under KVM the ht bit seems to be forced on even when the user tries to disable it. Best regards, Ewan
Re: [Bug] QEMU TCG warnings after commit c6bd2dd63420 - HTT / CMP_LEG bits
On 4/29/25 11:02 AM, Ewan Hai wrote: Hi Community, This email contains 3 bugs appear to share the same root cause. [1] We ran into the following warnings when running QEMU v10.0.0 in TCG mode: qemu-system-x86_64 \ -machine q35 \ -m 4G -smp 4 \ -kernel ./arch/x86/boot/bzImage \ -bios /usr/share/ovmf/OVMF.fd \ -drive file=~/kernel/rootfs.ext4,index=0,format=raw,media=disk \ -drive file=~/kernel/swap.img,index=1,format=raw,media=disk \ -nographic \ -append 'root=/dev/sda rw resume=/dev/sdb console=ttyS0 nokaslr' qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:EDX.ht [bit 28] qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.8001H:ECX.cmp-legacy [bit 1] (repeats 4 times, once per vCPU) Tracing the history shows that commit c6bd2dd63420 "i386/cpu: Set up CPUID_HT in x86_cpu_expand_features() instead of cpu_x86_cpuid()" is what introduced the warnings. Since that commit, TCG unconditionally advertises HTT (CPUID 1 EDX[28]) and CMP_LEG (CPUID 8000_0001 ECX[1]). Because TCG itself has no SMT support, these bits trigger the warnings above. [2] Also, Zhao pointed me to a similar report on GitLab: https://gitlab.com/qemu-project/qemu/-/issues/2894 The symptoms there look identical to what we're seeing. By convention we file one issue per email, but these two appear to share the same root cause, so I'm describing them together here. [3] My colleague Alan noticed what appears to be a related problem: if we launch a guest with '-cpu ,-ht --enable-kvm', which means explicitly removing the ht flag, but the guest still reports HT(cat /proc/cpuinfo in linux guest) enabled. In other words, under KVM the ht bit seems to be forced on even when the user tries to disable it. XiaoYao reminded me that issue [3] stems from a different patch. Please ignore it for now—I'll start a separate thread to discuss that one independently. Best regards, Ewan
[Bug] "-ht" flag ignored under KVM - guest still reports HT
Hi Community, We have observed that the 'ht' feature bit cannot be disabled when QEMU runs with KVM acceleration. qemu-system-x86_64 \ --enable-kvm \ -machine q35 \ -cpu host,-ht \ -smp 4 \ -m 4G \ -drive file=rootfs.img,format=raw \ -nographic \ -append 'console=ttyS0 root=/dev/sda rw' Because '-ht' is specified, the guest should expose no HT capability (cpuid.1.edx[28] = 0), and /proc/cpuinfo shouldn't show HT feature, but we still saw ht in linux guest when run 'cat /proc/cpuinfo'. XiaoYao mentioned that: It has been the behavior of QEMU since commit 400281af34e5ee6aa9f5496b53d8f82c6fef9319 Author: Andre Przywara Date: Wed Aug 19 15:42:42 2009 +0200 set CPUID bits to present cores and threads topology that we cannot remove HT CPUID bit from guest via "-cpu xxx,-ht" if the VM has >= 2 vcpus. I'd like to know whether there's a plan to address this issue, or if the current behaviour is considered acceptable. Best regards, Ewan.
Re: [Bug] QEMU TCG warnings after commit c6bd2dd63420 - HTT / CMP_LEG bits
On 5/8/25 5:04 PM, Zhao Liu wrote: [3] My colleague Alan noticed what appears to be a related problem: if we launch a guest with '-cpu ,-ht --enable-kvm', which means explicitly removing the ht flag, but the guest still reports HT(cat /proc/cpuinfo in linux guest) enabled. In other words, under KVM the ht bit seems to be forced on even when the user tries to disable it. XiaoYao reminded me that issue [3] stems from a different patch. Please ignore it for now—I'll start a separate thread to discuss that one independently. I haven't found any other thread :-). Please refer to https://lore.kernel.org/all/db6ae3bb-f4e5-4719-9beb-623fcff56...@zhaoxin.com/. By the way, just curious, in what cases do you need to disbale the HT flag? "-smp 4" means 4 cores with 1 thread per core, and is it not enough? As for the “-ht” behavior, I'm also unsure whether this should be fixed or not - one possible consideration is whether “-ht” would be useful. I wasn't trying to target any specific use case, using "-ht" was simply a way to check how the ht feature behaves under both KVM and TCG. There's no special workload behind it; I just wanted to confirm that the flag is respected (or not) in each mode.
[PATCH] target/i386: Fix model number of Zhaoxin YongFeng vCPU template
The model number was mistakenly set to 0x0b (11) in commit ff04bc1ac4. The correct value is 0x5b. This mistake occurred because the extended model bits in cpuid[eax=0x1].eax were overlooked, and only the base model was used. This patch corrects the model field. Signed-off-by: Ewan Hai --- target/i386/cpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 1b64ceaaba..0dd9788a68 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -5503,7 +5503,7 @@ static const X86CPUDefinition builtin_x86_defs[] = { .level = 0x1F, .vendor = CPUID_VENDOR_ZHAOXIN1, .family = 7, -.model = 11, +.model = 0x5b, .stepping = 3, /* missing: CPUID_HT, CPUID_TM, CPUID_PBE */ .features[FEAT_1_EDX] = -- 2.34.1
Re: [RFC 01/10] i386/cpu: Mark CPUID[0x80000005] as reserved for Intel
On 4/23/25 7:46 PM, Zhao Liu wrote: Per SDM, 0x8005 leaf is reserved for Intel CPU, and its current "assert" check blocks adding new cache model for non-AMD CPUs. Therefore, check the vendor and encode this leaf as all-0 for Intel CPU. And since Zhaoxin mostly follows Intel behavior, apply the vendor check for Zhaoxin as well. Note, for !vendor_cpuid_only case, non-AMD CPU would get the wrong information, i.e., get AMD's cache model for Intel or Zhaoxin CPUs. For this case, there is no need to tweak for non-AMD CPUs, because vendor_cpuid_only has been turned on by default since PC machine v6.1. Signed-off-by: Zhao Liu --- target/i386/cpu.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 1b64ceaaba46..8fdafa8aedaf 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -7248,11 +7248,23 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, *edx = env->cpuid_model[(index - 0x8002) * 4 + 3]; break; case 0x8005: -/* cache info (L1 cache) */ -if (cpu->cache_info_passthrough) { +/* + * cache info (L1 cache) + * + * For !vendor_cpuid_only case, non-AMD CPU would get the wrong + * information, i.e., get AMD's cache model. It doesn't matter, + * vendor_cpuid_only has been turned on by default since + * PC machine v6.1. + */ +if (cpu->vendor_cpuid_only && +(IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env))) { +*eax = *ebx = *ecx = *edx = 0; +break; +} else if (cpu->cache_info_passthrough) { x86_cpu_get_cache_cpuid(index, 0, eax, ebx, ecx, edx); break; } + *eax = (L1_DTLB_2M_ASSOC << 24) | (L1_DTLB_2M_ENTRIES << 16) | (L1_ITLB_2M_ASSOC << 8) | (L1_ITLB_2M_ENTRIES); *ebx = (L1_DTLB_4K_ASSOC << 24) | (L1_DTLB_4K_ENTRIES << 16) | I've reviewed the cache-related CPUID path and noticed an oddity: every AMD vCPU model still reports identical hard-coded values for the L1 ITLB and L1 DTLB fields in leaf 0x8000_0005. Your patch fixes this for Intel(and Zhaoxin), but all AMD models continue to receive the same constants in EAX/EBX. Do you know the reason for this choice? Is the guest expected to ignore those L1 TLB numbers? If so, I'll prepare a patch that adjusts only the Zhaoxin defaults in leaf 0x8000_0005 like below, matching real YongFeng behaviour in ecx and edx, but keep eax and ebx following AMD's behaviour. ## Notes 1. Changes tied to "-machine smp-cache xxx" (mainly x86_cpu_update_smp_cache_topo()) are not included here. 2. Do you think I need Zhaoxin-specific legacy_l1d/l1i/l2/l3_cache helpers? If yes, I'll add them with YongFeng sillicon topology data. --- patch prototype start --- diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 7b223642ba..8a17e5ffe9 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -2726,6 +2726,66 @@ static const CPUCaches xeon_srf_cache_info = { }, }; +static const CPUCaches yongfeng_cache_info = { +.l1d_cache = &(CPUCacheInfo) { +.type = DATA_CACHE, +.level = 1, +.size = 32 * KiB, +.line_size = 64, +.associativity = 8, +.partitions = 1, +.sets = 64, +.lines_per_tag = 1, +.inclusive = false, +.self_init = true, +.no_invd_sharing = false, +.share_level = CPU_TOPOLOGY_LEVEL_CORE, +}, +.l1i_cache = &(CPUCacheInfo) { +.type = INSTRUCTION_CACHE, +.level = 1, +.size = 64 * KiB, +.line_size = 64, +.associativity = 16, +.partitions = 1, +.sets = 64, +.lines_per_tag = 1, +.inclusive = false, +.self_init = true, +.no_invd_sharing = false, +.share_level = CPU_TOPOLOGY_LEVEL_CORE, +}, +.l2_cache = &(CPUCacheInfo) { +.type = UNIFIED_CACHE, +.level = 2, +.size = 256 * KiB, +.line_size = 64, +.associativity = 8, +.partitions = 1, +.sets = 512, +.lines_per_tag = 1, +.inclusive = true, +.self_init = true, +.no_invd_sharing = false, +.share_level = CPU_TOPOLOGY_LEVEL_CORE, +}, +.l3_cache = &(CPUCacheInfo) { +.type = UNIFIED_CACHE, +.level = 3, +.size = 8 * MiB, +.line_size = 64, +.associativity = 16, +.partitions = 1, +.sets = 8192, +.lines_per_tag = 1, +.self_init = true, +.inclusive = true, +.no_invd_sharing = true, +.complex_indexing = false, +.share_level = CPU_TOPOLOGY_LEVEL_DIE, +}, +}; + /* The following VMX features are not supported by KVM and are left out in the * CPU definitions: * @@ -5928,6 +5988,15 @@ static const X86CPUDefinition builtin_x86_defs[] = { { /* end of list
Re: [RFC 01/10] i386/cpu: Mark CPUID[0x80000005] as reserved for Intel
On 5/27/25 5:15 PM, Zhao Liu wrote: On 4/23/25 7:46 PM, Zhao Liu wrote: Per SDM, 0x8005 leaf is reserved for Intel CPU, and its current "assert" check blocks adding new cache model for non-AMD CPUs. Therefore, check the vendor and encode this leaf as all-0 for Intel CPU. And since Zhaoxin mostly follows Intel behavior, apply the vendor check for Zhaoxin as well. Note, for !vendor_cpuid_only case, non-AMD CPU would get the wrong information, i.e., get AMD's cache model for Intel or Zhaoxin CPUs. For this case, there is no need to tweak for non-AMD CPUs, because vendor_cpuid_only has been turned on by default since PC machine v6.1. Signed-off-by: Zhao Liu --- target/i386/cpu.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 1b64ceaaba46..8fdafa8aedaf 100644 [snip]>>> + *eax = (L1_DTLB_2M_ASSOC << 24) | (L1_DTLB_2M_ENTRIES << 16) | (L1_ITLB_2M_ASSOC << 8) | (L1_ITLB_2M_ENTRIES); *ebx = (L1_DTLB_4K_ASSOC << 24) | (L1_DTLB_4K_ENTRIES << 16) | I've reviewed the cache-related CPUID path and noticed an oddity: every AMD vCPU model still reports identical hard-coded values for the L1 ITLB and L1 DTLB fields in leaf 0x8000_0005. Your patch fixes this for Intel(and Zhaoxin), but all AMD models continue to receive the same constants in EAX/EBX. Yes, TLB info is hardcoded here. Previously, Babu and Eduardo cleaned up the cache info but didn't cover TLB [*]. I guess one reason would there are very few use cases related to TLB's info, and people are more concerned about the cache itself. [*]: https://lore.kernel.org/qemu-devel/20180510204148.11687-2-babu.mo...@amd.com/ Understood. Keeping the L1 I/D-TLB fields hard-coded for every vCPU model is acceptable. Do you know the reason for this choice? Is the guest expected to ignore those L1 TLB numbers? If so, I'll prepare a patch that adjusts only the Zhaoxin defaults in leaf 0x8000_0005 like below, matching real YongFeng behaviour in ecx and edx, but keep eax and ebx following AMD's behaviour. This way is fine for me. Thanks for confirming. I'll post the YongFeng cache-info series once your refactor lands. ## Notes 1. Changes tied to "-machine smp-cache xxx" (mainly x86_cpu_update_smp_cache_topo()) are not included here. 2. Do you think I need Zhaoxin-specific legacy_l1d/l1i/l2/l3_cache helpers? If yes, I'll add them with YongFeng sillicon topology data. Legacy cache info stands for information on very old machines. So I think your yongfeng_cache_info is enough for Yongfeng CPU. --- patch prototype start --- Thank you for your patch! You're welcome! diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 7b223642ba..8a17e5ffe9 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -2726,6 +2726,66 @@ static const CPUCaches xeon_srf_cache_info = { }, }; +static const CPUCaches yongfeng_cache_info = { [snip]>> +.l3_cache = &(CPUCacheInfo) { +.type = UNIFIED_CACHE, +.level = 3, +.size = 8 * MiB, +.line_size = 64, +.associativity = 16, +.partitions = 1, +.sets = 8192, +.lines_per_tag = 1, +.self_init = true, +.inclusive = true, +.no_invd_sharing = true, +.complex_indexing = false, +.share_level = CPU_TOPOLOGY_LEVEL_DIE, Just to confirm: Is the topology of cache on your physical machines per-die instead of per-socket? Apologies for the confusion, the code I shared was only a prototype. Before submitting the real patch I'll verify every value in yongfeng_cache_info against silicon. For YongFeng the hierarchy is: L1/L2 shared per core, L3 shared per die (four cores per L3). +}, +}; + /* The following VMX features are not supported by KVM and are left out in the * CPU definitions: * @@ -5928,6 +5988,15 @@ static const X86CPUDefinition builtin_x86_defs[] = { { /* end of list */ } } }, +{ +.version = 3, +.note = "wiith the correct model number and cache info", +.props = (PropValue[]) { +{ "model", "0x5b" }, +{ /* end of list */ } +}, +.cache_info = &yongfeng_cache_info +}, { /* end of list */ } } }, Adding a cache model can be done as a separate patch. :-) Ok, the current blob was for review only; the formal submission will be split into logical patches.> @@ -7565,8 +7634,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, * vendor_cpuid_only has been turned on by default since * PC machine v6.1. */ -if (cpu->vendor_cpuid_only && -(IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env))) { +if (cpu->vendor_cpuid_only && IS_INTEL_CPU(env)) { *eax = *ebx = *
Re: [RFC 00/10] i386/cpu: Cache CPUID fixup, Intel cache model & topo CPUID enhencement
On 5/27/25 5:19 PM, Zhao Liu wrote: On Mon, May 26, 2025 at 06:52:41PM +0800, Ewan Hai wrote: Date: Mon, 26 May 2025 18:52:41 +0800 From: Ewan Hai Subject: Re: [RFC 00/10] i386/cpu: Cache CPUID fixup, Intel cache model & topo CPUID enhencement On 4/23/25 7:46 PM, Zhao Liu wrote: Hi all, (Since patches 1 and 2 involve changes to x86 vendors other than Intel, I have also cc'd friends from AMD and Zhaoxin.) These are the ones I was going to clean up a long time ago: * Fixup CPUID 0x8005 & 0x8006 for Intel (and Zhaoxin now). * Add cache model for Intel CPUs. * Enable 0x1f CPUID leaf for specific Intel CPUs, which already have this leaf on host by default. If you run into vendor specific branches while refactoring the topology-related code, please feel free to treat Intel and Zhaoxin as one class. For every topology CPUID leaf(0x0B, 0x1F, ...) so far, Zhaoxin has followed the Intel SDM definition exactly. Thank you for your confirmation. I'll post v2 soon (If things go well, it'll be in the next two weeks. :-) ) No rush, everyone is busy, maintainers especially so. Just handle it whenever it best fits your schedule.
Re: [RFC 00/10] i386/cpu: Cache CPUID fixup, Intel cache model & topo CPUID enhencement
On 4/23/25 7:46 PM, Zhao Liu wrote: Hi all, (Since patches 1 and 2 involve changes to x86 vendors other than Intel, I have also cc'd friends from AMD and Zhaoxin.) These are the ones I was going to clean up a long time ago: * Fixup CPUID 0x8005 & 0x8006 for Intel (and Zhaoxin now). * Add cache model for Intel CPUs. * Enable 0x1f CPUID leaf for specific Intel CPUs, which already have this leaf on host by default. If you run into vendor specific branches while refactoring the topology-related code, please feel free to treat Intel and Zhaoxin as one class. For every topology CPUID leaf(0x0B, 0x1F, ...) so far, Zhaoxin has followed the Intel SDM definition exactly.
[PATCH v3 RESEND] target/i386/kvm: Refine VMX controls setting for backward compatibility
From: EwanHai Commit 4a910e1 ("target/i386: do not set unsupported VMX secondary execution controls") implemented a workaround for hosts that have specific CPUID features but do not support the corresponding VMX controls, e.g., hosts support RDSEED but do not support RDSEED-Exiting. In detail, commit 4a910e1 introduced a flag `has_msr_vmx_procbased_clts2`. If KVM has `MSR_IA32_VMX_PROCBASED_CTLS2` in its msr list, QEMU would use KVM's settings, avoiding any modifications to this MSR. However, this commit (4a910e1) didn't account for cases in older Linux kernels(4.17~5.2) where `MSR_IA32_VMX_PROCBASED_CTLS2` is in `kvm_feature_msrs`-obtained by ioctl(KVM_GET_MSR_FEATURE_INDEX_LIST), but not in `kvm_msr_list`-obtained by ioctl(KVM_GET_MSR_INDEX_LIST). As a result,it did not set the `has_msr_vmx_procbased_clts2` flag based on `kvm_msr_list` alone, even though KVM does maintain the value of this MSR. This patch supplements the above logic, ensuring that `has_msr_vmx_procbased_clts2` is correctly set by checking both MSR lists, thus maintaining compatibility with older kernels. Signed-off-by: EwanHai --- Changes in v3: - Use a more precise version range in the comment, specifically "4.17~5.2" instead of "<5.3". Changes in v2: - Adjusted some punctuation in the commit message as per suggestions. - Added comments to the newly added code to indicate that it is a compatibility fix. v1 link: https://lore.kernel.org/all/20230925071453.14908-1-ewanhai...@zhaoxin.com/ v2 link: https://lore.kernel.org/all/20231127034326.257596-1-ewanhai...@zhaoxin.com/ --- target/i386/kvm/kvm.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c index a6bc089d02..4ff9b5995c 100644 --- a/target/i386/kvm/kvm.c +++ b/target/i386/kvm/kvm.c @@ -2466,6 +2466,7 @@ void kvm_arch_do_init_vcpu(X86CPU *cpu) static int kvm_get_supported_feature_msrs(KVMState *s) { int ret = 0; +int i; if (kvm_feature_msrs != NULL) { return 0; @@ -2500,6 +2501,20 @@ static int kvm_get_supported_feature_msrs(KVMState *s) return ret; } + /* +* Compatibility fix: +* Older Linux kernels (4.17~5.2) report MSR_IA32_VMX_PROCBASED_CTLS2 +* in KVM_GET_MSR_FEATURE_INDEX_LIST but not in KVM_GET_MSR_INDEX_LIST. +* This leads to an issue in older kernel versions where QEMU, +* through the KVM_GET_MSR_INDEX_LIST check, assumes the kernel +* doesn't maintain MSR_IA32_VMX_PROCBASED_CTLS2, resulting in +* incorrect settings by QEMU for this MSR. +*/ +for (i = 0; i < kvm_feature_msrs->nmsrs; i++) { +if (kvm_feature_msrs->indices[i] == MSR_IA32_VMX_PROCBASED_CTLS2) { +has_msr_vmx_procbased_ctls2 = true; +} +} return 0; } -- 2.34.1
Re: [RFC 01/10] i386/cpu: Mark CPUID[0x80000005] as reserved for Intel
On 6/24/25 3:22 PM, Zhao Liu wrote: On Tue, May 27, 2025 at 05:56:07PM +0800, Ewan Hai wrote: Date: Tue, 27 May 2025 17:56:07 +0800 From: Ewan Hai Subject: Re: [RFC 01/10] i386/cpu: Mark CPUID[0x8005] as reserved for Intel On 5/27/25 5:15 PM, Zhao Liu wrote: On 4/23/25 7:46 PM, Zhao Liu wrote: Per SDM, 0x8005 leaf is reserved for Intel CPU, and its current "assert" check blocks adding new cache model for non-AMD CPUs. Therefore, check the vendor and encode this leaf as all-0 for Intel CPU. And since Zhaoxin mostly follows Intel behavior, apply the vendor check for Zhaoxin as well. Note, for !vendor_cpuid_only case, non-AMD CPU would get the wrong information, i.e., get AMD's cache model for Intel or Zhaoxin CPUs. For this case, there is no need to tweak for non-AMD CPUs, because vendor_cpuid_only has been turned on by default since PC machine v6.1. Signed-off-by: Zhao Liu --- target/i386/cpu.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 1b64ceaaba46..8fdafa8aedaf 100644 [snip]>>> + *eax = (L1_DTLB_2M_ASSOC << 24) | (L1_DTLB_2M_ENTRIES << 16) | (L1_ITLB_2M_ASSOC << 8) | (L1_ITLB_2M_ENTRIES); *ebx = (L1_DTLB_4K_ASSOC << 24) | (L1_DTLB_4K_ENTRIES << 16) | I've reviewed the cache-related CPUID path and noticed an oddity: every AMD vCPU model still reports identical hard-coded values for the L1 ITLB and L1 DTLB fields in leaf 0x8000_0005. Your patch fixes this for Intel(and Zhaoxin), but all AMD models continue to receive the same constants in EAX/EBX. Yes, TLB info is hardcoded here. Previously, Babu and Eduardo cleaned up the cache info but didn't cover TLB [*]. I guess one reason would there are very few use cases related to TLB's info, and people are more concerned about the cache itself. [*]: https://lore.kernel.org/qemu-devel/20180510204148.11687-2-babu.mo...@amd.com/ Understood. Keeping the L1 I/D-TLB fields hard-coded for every vCPU model is acceptable. Do you know the reason for this choice? Is the guest expected to ignore those L1 TLB numbers? If so, I'll prepare a patch that adjusts only the Zhaoxin defaults in leaf 0x8000_0005 like below, matching real YongFeng behaviour in ecx and edx, but keep eax and ebx following AMD's behaviour. This way is fine for me. Thanks for confirming. I'll post the YongFeng cache-info series once your refactor lands. Hi Ewan, By this patch: https://lore.kernel.org/qemu-devel/20250620092734.1576677-14-zhao1@intel.com/ I fixed the 0x8005 leaf for Intel and Zhaoxin based on your feedback by the way. It looks like the unified cache_info would be very compatible with various vendor needs and corner cases. So I'll respin this series based on that cache_info series. Before sending the patch, I was thinking that maybe I could help you rebase and include your Yongfeng cache model patch you added into my v2 series, or you could rebase and send it yourself afterward. Which way do you like? It would be great if you could include the Yongfeng cache-model patch in your v2 series. Let me know if you need any more information about the Yongfeng cache model. After you submit v2, I can review the Zhaoxin parts and make any necessary code changes if needed. And thanks again for taking Zhaoxin into account. And for TLB, we can wait and see what maintainers think. Maybe it is possible, considering that the cache also transitioned from hardcoding to a cache model (since commit 7e3482f82480). Let's wait.> Thanks, Zhao
Re: [RFC 01/10] i386/cpu: Mark CPUID[0x80000005] as reserved for Intel
On 6/25/25 11:03 AM, Zhao Liu wrote: On Tue, Jun 24, 2025 at 07:04:02PM +0800, Ewan Hai wrote: Date: Tue, 24 Jun 2025 19:04:02 +0800 From: Ewan Hai Subject: Re: [RFC 01/10] i386/cpu: Mark CPUID[0x8005] as reserved for Intel On 6/24/25 3:22 PM, Zhao Liu wrote: On Tue, May 27, 2025 at 05:56:07PM +0800, Ewan Hai wrote: Date: Tue, 27 May 2025 17:56:07 +0800 From: Ewan Hai Subject: Re: [RFC 01/10] i386/cpu: Mark CPUID[0x8005] as reserved for Intel On 5/27/25 5:15 PM, Zhao Liu wrote: On 4/23/25 7:46 PM, Zhao Liu wrote: Per SDM, 0x8005 leaf is reserved for Intel CPU, and its current "assert" check blocks adding new cache model for non-AMD CPUs. Therefore, check the vendor and encode this leaf as all-0 for Intel CPU. And since Zhaoxin mostly follows Intel behavior, apply the vendor check for Zhaoxin as well. Note, for !vendor_cpuid_only case, non-AMD CPU would get the wrong information, i.e., get AMD's cache model for Intel or Zhaoxin CPUs. For this case, there is no need to tweak for non-AMD CPUs, because vendor_cpuid_only has been turned on by default since PC machine v6.1. Signed-off-by: Zhao Liu --- target/i386/cpu.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 1b64ceaaba46..8fdafa8aedaf 100644 [snip]>>> + *eax = (L1_DTLB_2M_ASSOC << 24) | (L1_DTLB_2M_ENTRIES << 16) | (L1_ITLB_2M_ASSOC << 8) | (L1_ITLB_2M_ENTRIES); *ebx = (L1_DTLB_4K_ASSOC << 24) | (L1_DTLB_4K_ENTRIES << 16) | I've reviewed the cache-related CPUID path and noticed an oddity: every AMD vCPU model still reports identical hard-coded values for the L1 ITLB and L1 DTLB fields in leaf 0x8000_0005. Your patch fixes this for Intel(and Zhaoxin), but all AMD models continue to receive the same constants in EAX/EBX. Yes, TLB info is hardcoded here. Previously, Babu and Eduardo cleaned up the cache info but didn't cover TLB [*]. I guess one reason would there are very few use cases related to TLB's info, and people are more concerned about the cache itself. [*]: https://lore.kernel.org/qemu-devel/20180510204148.11687-2-babu.mo...@amd.com/ Understood. Keeping the L1 I/D-TLB fields hard-coded for every vCPU model is acceptable. Do you know the reason for this choice? Is the guest expected to ignore those L1 TLB numbers? If so, I'll prepare a patch that adjusts only the Zhaoxin defaults in leaf 0x8000_0005 like below, matching real YongFeng behaviour in ecx and edx, but keep eax and ebx following AMD's behaviour. This way is fine for me. Thanks for confirming. I'll post the YongFeng cache-info series once your refactor lands. Hi Ewan, By this patch: https://lore.kernel.org/qemu-devel/20250620092734.1576677-14-zhao1@intel.com/ I fixed the 0x8005 leaf for Intel and Zhaoxin based on your feedback by the way. It looks like the unified cache_info would be very compatible with various vendor needs and corner cases. So I'll respin this series based on that cache_info series. Before sending the patch, I was thinking that maybe I could help you rebase and include your Yongfeng cache model patch you added into my v2 series, or you could rebase and send it yourself afterward. Which way do you like? It would be great if you could include the Yongfeng cache-model patch in your v2 series. Let me know if you need any more information about the Yongfeng cache model. After you submit v2, I can review the Zhaoxin parts and make any necessary code changes if needed. And thanks again for taking Zhaoxin into account. Welcome; it's something I can easily help with. If possible, when v2 is out, hope you could help test it on your platform to ensure everything is fine. :-) And I've verified it myself through TCG. There's no problem!> Thanks, Zhao
Re: [RFC 01/10] i386/cpu: Mark CPUID[0x80000005] as reserved for Intel
On 6/25/25 5:19 PM, Zhao Liu wrote: Just want to confirm with the "lines_per_tag" field, which is related about how to handle current "assert(lines_per_tag > 0)": --- patch prototype start --- diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 7b223642ba..8a17e5ffe9 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -2726,6 +2726,66 @@ static const CPUCaches xeon_srf_cache_info = { }, }; +static const CPUCaches yongfeng_cache_info = { +.l1d_cache = &(CPUCacheInfo) { +.type = DATA_CACHE, +.level = 1, +.size = 32 * KiB, +.line_size = 64, +.associativity = 8, +.partitions = 1, +.sets = 64, +.lines_per_tag = 1, This fits AMD APM, and is fine. +.inclusive = false, +.self_init = true, +.no_invd_sharing = false, +.share_level = CPU_TOPOLOGY_LEVEL_CORE, +}, +.l1i_cache = &(CPUCacheInfo) { +.type = INSTRUCTION_CACHE, +.level = 1, +.size = 64 * KiB, +.line_size = 64, +.associativity = 16, +.partitions = 1, +.sets = 64, +.lines_per_tag = 1, Fine, too. +.inclusive = false, +.self_init = true, +.no_invd_sharing = false, +.share_level = CPU_TOPOLOGY_LEVEL_CORE, +}, +.l2_cache = &(CPUCacheInfo) { +.type = UNIFIED_CACHE, +.level = 2, +.size = 256 * KiB, +.line_size = 64, +.associativity = 8, +.partitions = 1, +.sets = 512, +.lines_per_tag = 1, SDM reserves this field: For 0x8006 ECX: Bits 11-08: Reserved. So I think this field should be 0, to align with "Reserved". I agree. For Zhaoxin, the "lines-per-tag" field appears only in CPUID leaf 0x8005. Because Zhaoxin follows AMD behavior on this leaf, and the AMD manual states that it reports L1 cache/TLB information, so any "lines-per-tag" value for levels other than L1 should be omitted or set to zero. In this patch: https://lore.kernel.org/qemu-devel/20250620092734.1576677-9-zhao1@intel.com/ I add an argument (lines_per_tag_supported) in encode_cache_cpuid8006(), and for the case that lines_per_tag_supported=false, I assert "lines_per_tag == 0" to align with "Reserved". +.inclusive = true, +.self_init = true, +.no_invd_sharing = false, +.share_level = CPU_TOPOLOGY_LEVEL_CORE, +}, +.l3_cache = &(CPUCacheInfo) { +.type = UNIFIED_CACHE, +.level = 3, +.size = 8 * MiB, +.line_size = 64, +.associativity = 16, +.partitions = 1, +.sets = 8192, +.lines_per_tag = 1, The 0x8006 EDX is also reserved in SDM. So I think this field should be 0, too. Do you agree? Ditto.> +.self_init = true, +.inclusive = true, +.no_invd_sharing = true, +.complex_indexing = false, +.share_level = CPU_TOPOLOGY_LEVEL_DIE, +}, +}; +
Re: [PATCH 4/8] i386/cpu: Introduce cache model for YongFeng
On 6/26/25 4:31 PM, Zhao Liu wrote: From: Ewan Hai Add the cache model to YongFeng (v3) to better emulate its environment. Note, although YongFeng v2 was added after v10.0, it was also back ported to v10.0.2. Therefore, the new version (v3) is needed to avoid conflict. The cache model is as follows: --- cache 0 --- cache type = data cache (1) cache level= 0x1 (1) self-initializing cache level = true fully associative cache= false maximum IDs for CPUs sharing cache = 0x0 (0) maximum IDs for cores in pkg = 0x0 (0) system coherency line size = 0x40 (64) physical line partitions = 0x1 (1) ways of associativity = 0x8 (8) number of sets = 0x40 (64) WBINVD/INVD acts on lower caches = false inclusive to lower caches = false complex cache indexing = false number of sets (s) = 64 (size synth) = 32768 (32 KB) --- cache 1 --- cache type = instruction cache (2) cache level= 0x1 (1) self-initializing cache level = true fully associative cache= false maximum IDs for CPUs sharing cache = 0x0 (0) maximum IDs for cores in pkg = 0x0 (0) system coherency line size = 0x40 (64) physical line partitions = 0x1 (1) ways of associativity = 0x10 (16) number of sets = 0x40 (64) WBINVD/INVD acts on lower caches = false inclusive to lower caches = false complex cache indexing = false number of sets (s) = 64 (size synth) = 65536 (64 KB) --- cache 2 --- cache type = unified cache (3) cache level= 0x2 (2) self-initializing cache level = true fully associative cache= false maximum IDs for CPUs sharing cache = 0x0 (0) maximum IDs for cores in pkg = 0x0 (0) system coherency line size = 0x40 (64) physical line partitions = 0x1 (1) ways of associativity = 0x8 (8) number of sets = 0x200 (512) WBINVD/INVD acts on lower caches = false inclusive to lower caches = true complex cache indexing = false number of sets (s) = 512 (size synth) = 262144 (256 KB) --- cache 3 --- cache type = unified cache (3) cache level= 0x3 (3) self-initializing cache level = true fully associative cache= false maximum IDs for CPUs sharing cache = 0x0 (0) maximum IDs for cores in pkg = 0x0 (0) system coherency line size = 0x40 (64) physical line partitions = 0x1 (1) ways of associativity = 0x10 (16) number of sets = 0x2000 (8192) WBINVD/INVD acts on lower caches = true inclusive to lower caches = true complex cache indexing = false number of sets (s) = 8192 (size synth) = 8388608 (8 MB) --- cache 4 --- cache type = no more caches (0) Signed-off-by: Ewan Hai Signed-off-by: Zhao Liu --- Changes on the original codes: * Rearrange cache model fields to make them easier to check. * And add explanation of why v3 is needed. * Drop lines_per_tag field for L2 & L3. --- target/i386/cpu.c | 104 ++ 1 file changed, 104 insertions(+) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index a7f2e5dd3fcb..08c84ba90f52 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -3159,6 +3159,105 @@ static const CPUCaches xeon_srf_cache_info = { }, }; +static const CPUCaches yongfeng_cache_info = { +.l1d_cache = &(CPUCacheInfo) { +/* CPUID 0x4.0x0.EAX */ +.type = DATA_CACHE, +.level = 1, +.self_init = true, + +/* CPUID 0x4.0x0.EBX */ +.line_size = 64, +.partitions = 1, +.associativity = 8, + +/* CPUID 0x4.0x0.ECX */ +.sets = 64, + +/* CPUID 0x4.0x0.EDX */ +.no_invd_sharing = false, +.inclusive = false, +.complex_indexing = false, + +/* CPUID 0x8005.ECX */ +.lines_per_tag = 1, +.size = 32 * KiB, + +.share_level = CPU_TOPOLOGY_LEVEL_CORE, +}, +.l1i_cache = &(CPUCacheInfo) { +/* CPUID 0x4.0x1.EAX */ +.type = INSTRUCTION_CACHE, +
Re: [PATCH 4/8] i386/cpu: Introduce cache model for YongFeng
On 7/2/25 2:35 PM, Zhao Liu wrote: +{ +.version = 3, +.note = "with the cache info", I realize that my previous use of "cache info" was not precise; "cache model" is more appropriate. Please help me adjust accordingly, thank you. Nope, will fix. Thank you! +.cache_info = &yongfeng_cache_info +}, { /* end of list */ } } }, -- 2.34.1 Hi Zhao, I tested the patchsets you provided on different hosts, and here are the results: 1. On an Intel host with KVM enabled The CPUID leaves 0x2 and 0x4 reported inside the YongFeng-V3 VM match our expected cache details exactly. However, CPUID leaf 0x8005 returns all zeros. This is because when KVM is in use, QEMU uses the host's vendor for the IS_INTEL_CPU(env), IS_ZHAOXIN_CPU(env), and IS_AMD_CPU(env) checks. This is a bug: https://lore.kernel.org/qemu-devel/d429b6f5-b59c-4884-b18f-8db71cb8d...@oracle.com/ And we expect we can change the vendor with KVM. Oh, I've reviewed your discussion on this bug, and it looks like it will be resolved soon. Given that behavior, a zeroed 0x8005 leaf in the guest is expected and, to me, acceptable. What are your thoughts? Well, (with this bug) since VM is "Intel" vendor, so this is correct. 2. On a YongFeng host (with or without KVM) The CPUID leaves 0x2, 0x4, and 0x8006 inside the VM all return the values we want, and the L1D/L1I cache info in leaf 0x8005 is also correct. Nice! 3. TLB info in leaf 0x8005 On both Intel and YongFeng hosts, the L1 TLB fields in leaf 0x8005 remain constant, as we discussed. As you mentioned before, "we can wait and see what maintainers think" about this. Yes. I suppose Zhaoxin also uses 0x2 to present TLB info like Intel does. Yeah. Same behaviour. To support TLB, I feel like there is still some work to be done, and it depends on if it's worth it... The community will provide the final decision, let's wait and see. In summary, both patchsets look good for Zhaoxin support, I don't see any issues so far. Thanks! Btw, YongFeng host also support 0x1F, does YongFeng need to turn on "x-force-cpuid-0x1f" default ? I think maybe yes. OK, will add it. Thanks a lot! BTW...my colleague reports a bug that Intel/Zhaoxin CPUs with cache model will meet assertion failure on the v10.0 or old machine. So I think it's necessary to drop all the assert() checks on lines_per_tag directly: I'm not sure in which scenarios this assertion failure occurs, so I can't offer any ideas on a solution… diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 18bb0e9cf9f6..f73943a46945 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -491,7 +491,6 @@ static void encode_topo_cpuid1f(CPUX86State *env, uint32_t count, static uint32_t encode_cache_cpuid8005(CPUCacheInfo *cache) { assert(cache->size % 1024 == 0); -assert(cache->lines_per_tag > 0); assert(cache->associativity > 0); assert(cache->line_size > 0); return ((cache->size / 1024) << 24) | (cache->associativity << 16) | @@ -520,13 +519,10 @@ static uint32_t encode_cache_cpuid8005(CPUCacheInfo *cache) */ static void encode_cache_cpuid8006(CPUCacheInfo *l2, CPUCacheInfo *l3, - uint32_t *ecx, uint32_t *edx, - bool lines_per_tag_supported) + uint32_t *ecx, uint32_t *edx) { assert(l2->size % 1024 == 0); assert(l2->associativity > 0); -assert(lines_per_tag_supported ? - l2->lines_per_tag > 0 : l2->lines_per_tag == 0); *ecx = ((l2->size / 1024) << 16) | (X86_ENC_ASSOC(l2->associativity) << 12) | (l2->lines_per_tag << 8) | (l2->line_size); @@ -535,8 +531,6 @@ static void encode_cache_cpuid8006(CPUCacheInfo *l2, if (l3) { assert(l3->size % (512 * 1024) == 0); assert(l3->associativity > 0); -assert(lines_per_tag_supported ? - l3->lines_per_tag > 0 : l3->lines_per_tag == 0); assert(l3->line_size > 0); *edx = ((l3->size / (512 * 1024)) << 18) | (X86_ENC_ASSOC(l3->associativity) << 12) | @@ -8353,7 +8347,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, (IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env))) { *eax = *ebx = 0; encode_cache_cpuid8006(caches->l2_cache, - NULL, ecx, edx, false); + NULL, ecx, edx); break; } @@ -8369,7 +8363,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, encode_cache_cpuid8006(caches->l2_cache, cpu->enable_l3_cache ? caches->l3_cache : NULL, -
Re: [PATCH 2/4] i386/cpu: Mark CPUID 0x80000007[EBX] as reserved for Intel
On 6/27/25 11:51 AM, Zhao Liu wrote: Per SDM, 8007H EAX Reserved = 0. EBX Reserved = 0. ECX Reserved = 0. EDX Bits 07-00: Reserved = 0. Bit 08: Invariant TSC available if 1. Bits 31-09: Reserved = 0. EAX/EBX/ECX in CPUID 0x8007 leaf are reserved for Intel. At present, EAX is reserved for AMD, too. And AMD hasn't used ECX in QEMU. So these 2 registers are both left as 0. Therefore, only fix the EBX and excode it as 0 for Intel. Signed-off-by: Zhao Liu --- target/i386/cpu.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 867e08236540..6d590a9af389 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -8374,7 +8374,11 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, } case 0x8007: *eax = 0; -*ebx = env->features[FEAT_8000_0007_EBX]; +if (cpu->vendor_cpuid_only_v2 && IS_INTEL_CPU(env)) { Please add IS_ZHAOXIN_CPU(env), because Zhaoxin follows the SDM definition for CPUID leaf 0x8008. +*ebx = 0; +} else { +*ebx = env->features[FEAT_8000_0007_EBX]; +} *ecx = 0; *edx = env->features[FEAT_8000_0007_EDX]; break; -- 2.34.1
Re: [PATCH 1/4] i386/cpu: Mark EBX/ECX/EDX in CPUID 0x80000000 leaf as reserved for Intel
On 6/27/25 11:51 AM, Zhao Liu wrote: +if (cpu->vendor_cpuid_only_v2 && IS_INTEL_CPU(env)) { Please also add IS_ZHAOXIN_CPU(env), since Zhaoxin follows the SDM definition for CPUID leaf 0x8000. +*ebx = *ecx = *edx = 0; +} else { +*ebx = env->cpuid_vendor1; +*edx = env->cpuid_vendor2; +*ecx = env->cpuid_vendor3; +}
Re: [PATCH 3/4] i386/cpu: Mark ECX/EDX in CPUID 0x80000008 leaf as reserved for Intel
On 6/27/25 11:51 AM, Zhao Liu wrote: Per SDM, 8008H EAX Linear/Physical Address size. Bits 07-00: #Physical Address Bits*. Bits 15-08: #Linear Address Bits. Bits 31-16: Reserved = 0. EBX Bits 08-00: Reserved = 0. Bit 09: WBNOINVD is available if 1. Bits 31-10: Reserved = 0. ECX Reserved = 0. EDX Reserved = 0. ECX/EDX in CPUID 0x8008 leaf are reserved. Encode these 2 registers as 0 for Intel. Signed-off-by: Zhao Liu --- target/i386/cpu.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 6d590a9af389..5d5a227d4c8a 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -8391,6 +8391,12 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, *eax |= (cpu->guest_phys_bits << 16); } *ebx = env->features[FEAT_8000_0008_EBX]; + +if (cpu->vendor_cpuid_only_v2 && IS_INTEL_CPU(env)) { Please add IS_ZHAOXIN_CPU(env), Zhaoxin follow SDM's definition on cpuid leaf 0x8000_0008. +*ecx = *edx = 0; +break; +} + if (threads_per_pkg > 1) { /* * Bits 15:12 is "The number of bits in the initial -- 2.34.1
Re: [PATCH 5/8] i386/cpu: Add a "x-force-cpuid-0x1f" property
On 6/27/25 11:05 AM, Zhao Liu wrote: After applying these patches to QEMU mainline at commit 6e1571533fd9: Ah, I forgot I've rebased these patches...Now you can rebase all the patches at the latest master branch. Or, you can try this repo - I just created it to make it easier for you: https://gitlab.com/zhao.liu/qemu/-/tree/cache-model-v2.6-rebase-06-23-2025 I cloned the repo and then ran: $ git am 20250620_zhao1_liu_i386_cpu_unify_the_cache_model_in_x86cpustate.mbx The *.mbx is got from b4 tool. That applied several patches successfully, but on patch #11 I got this error: error: patch failed: target/i386/cpu.c:7482 error: target/i386/cpu.c: patch does not apply Patch failed at 0011 i386/cpu: Select legacy cache model based on vendor in CPUID 0x2 hint: Use 'git am --show-current-patch=diff' to see the failed patch This error also occured on qemu master when I do 'git am 20250620_zhao1_liu_i386_cpu_unify_the_cache_model_in_x86cpustate.mbx'. Have you run into this before, or did I miss any steps?
Re: [PATCH 5/8] i386/cpu: Add a "x-force-cpuid-0x1f" property
On 6/26/25 4:31 PM, Zhao Liu wrote: From: Manish Mishra Add a "x-force-cpuid-0x1f" property so that CPU models can enable it and have 0x1f CPUID leaf natually as the Host CPU. The advantage is that when the CPU model's cache model is already consistent with the Host CPU, for example, SRF defaults to l2 per module & l3 per package, 0x1f can better help users identify the topology in the VM. Adding 0x1f for specific CPU models should not cause any trouble in principle. This property is only enabled for CPU models that already have 0x1f leaf on the Host, so software that originally runs normally on the Host won't encounter issues in the Guest with corresponding CPU model. Conversely, some software that relies on checking 0x1f might have problems in the Guest due to the lack of 0x1f [*]. In summary, adding 0x1f is also intended to further emulate the Host CPU environment. [*]: https://lore.kernel.org/qemu-devel/ph0pr02mb738410511bf51b12db09be6cf6...@ph0pr02mb7384.namprd02.prod.outlook.com/ Signed-off-by: Manish Mishra Co-authored-by: Xiaoyao Li Signed-off-by: Xiaoyao Li [Integrated and rebased 2 previous patches (ordered by post time)] Signed-off-by: Zhao Liu --- Note: This patch integrates the idea from 2 previous posted patches (ordered by post time)[1] [2], following the s-o-b policy of "Re-starting abandoned work" in docs/devel/code-provenance.rst. [1]: From Manish: https://lore.kernel.org/qemu-devel/20240722101859.47408-1-manish.mis...@nutanix.com/ [2]: From Xiaoyao: https://lore.kernel.org/qemu-devel/20240813033145.279307-1-xiaoyao...@intel.com/ --- Changes since RFC: * Rebase and rename the property as "x-force-cpuid-0x1f". (Igor) --- target/i386/cpu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 08c84ba90f52..ee36f7ee2ccc 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -9934,6 +9934,7 @@ static const Property x86_cpu_properties[] = { DEFINE_PROP_BOOL("x-intel-pt-auto-level", X86CPU, intel_pt_auto_level, true), DEFINE_PROP_BOOL("x-l1-cache-per-thread", X86CPU, l1_cache_per_core, true), +DEFINE_PROP_BOOL("x-force-cpuid-0x1f", X86CPU, force_cpuid_0x1f, false), }; After applying these patches to QEMU mainline at commit 6e1571533fd9: $ git am patches-from-https://lore.kernel.org/qemu-devel/20250620092734.1576677-1-zhao1@intel.com/ $ git am patches-from-https://lore.kernel.org/all/20250626083105.2581859-6-zhao1@intel.com/ and configure && make qemu with: $ ./configure --target-list=x86_64-softmmu --enable-debug --enable-kvm --enable-sdl --enable-gtk --enable-spice --prefix=/usr --enable-libusb --enable-usb-redir --enable-trace-backends=simple && make -j32 I ran into this build error: target/i386/cpu.c:9942:52: error: 'X86CPU' {aka 'struct ArchCPU'} has no member named 'force_cpuid_0x1f' ; did you mean 'enable_cpuid_0x1f' ? I haven't debug it yet, because it seems like a simple mistake, asking you directly might be quicker. #ifndef CONFIG_USER_ONLY -- 2.34.1
Re: [PATCH 05/16] i386/cpu: Consolidate CPUID 0x4 leaf
On 6/20/25 5:27 PM, Zhao Liu wrote: Modern Intel CPUs use CPUID 0x4 leaf to describe cache information and leave space in 0x2 for prefetch and TLBs (even TLB has its own leaf CPUID 0x18). And 0x2 leaf provides a descriptor 0xFF to instruct software to check cache information in 0x4 leaf instead. Therefore, follow this behavior to encode 0xFF when Intel CPU has 0x4 leaf with "x-consistent-cache=true" for compatibility. In addition, for older CPUs without 0x4 leaf, still enumerate the cache descriptor in 0x2 leaf, except the case that there's no descriptor matching the cache model, then directly encode 0xFF in 0x2 leaf. This makes sense, as in the 0x2 leaf era, all supported caches should have the corresponding descriptor. Signed-off-by: Zhao Liu --- target/i386/cpu.c | 48 --- 1 file changed, 37 insertions(+), 11 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 2f895bf13523..a06aa1d629dc 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -223,7 +223,7 @@ struct CPUID2CacheDescriptorInfo cpuid2_cache_descriptors[] = { * Return a CPUID 2 cache descriptor for a given cache. * If no known descriptor is found, return CACHE_DESCRIPTOR_UNAVAILABLE */ -static uint8_t cpuid2_cache_descriptor(CPUCacheInfo *cache) +static uint8_t cpuid2_cache_descriptor(CPUCacheInfo *cache, bool *unmacthed) { int i; @@ -240,9 +240,44 @@ static uint8_t cpuid2_cache_descriptor(CPUCacheInfo *cache) } } +*unmacthed |= true; return CACHE_DESCRIPTOR_UNAVAILABLE; } +/* Encode cache info for CPUID[4] */ Maybe this should be /* Encode cache info for CPUID[2] */ ? I'm not sure. +static void encode_cache_cpuid2(X86CPU *cpu, +uint32_t *eax, uint32_t *ebx, +uint32_t *ecx, uint32_t *edx) +{ +CPUX86State *env = &cpu->env; +CPUCaches *caches = &env->cache_info_cpuid2; +int l1d, l1i, l2, l3; +bool unmatched = false; + +*eax = 1; /* Number of CPUID[EAX=2] calls required */ +*ebx = *ecx = *edx = 0; + +l1d = cpuid2_cache_descriptor(caches->l1d_cache, &unmatched); +l1i = cpuid2_cache_descriptor(caches->l1i_cache, &unmatched); +l2 = cpuid2_cache_descriptor(caches->l2_cache, &unmatched); +l3 = cpuid2_cache_descriptor(caches->l3_cache, &unmatched); + +if (!cpu->consistent_cache || +(env->cpuid_min_level < 0x4 && !unmatched)) { +/* + * Though SDM defines code 0x40 for cases with no L2 or L3. It's + * also valid to just ignore l3's code if there's no l2. + */ +if (cpu->enable_l3_cache) { +*ecx = l3; +} +*edx = (l1d << 16) | (l1i << 8) | l2; +} else { +*ecx = 0; +*edx = CACHE_DESCRIPTOR_UNAVAILABLE; +} +} + /* CPUID Leaf 4 constants: */ /* EAX: */ @@ -7451,16 +7486,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, *eax = *ebx = *ecx = *edx = 0; break; } -*eax = 1; /* Number of CPUID[EAX=2] calls required */ -*ebx = 0; -if (!cpu->enable_l3_cache) { -*ecx = 0; -} else { -*ecx = cpuid2_cache_descriptor(env->cache_info_cpuid2.l3_cache); -} -*edx = (cpuid2_cache_descriptor(env->cache_info_cpuid2.l1d_cache) << 16) | - (cpuid2_cache_descriptor(env->cache_info_cpuid2.l1i_cache) << 8) | - (cpuid2_cache_descriptor(env->cache_info_cpuid2.l2_cache)); +encode_cache_cpuid2(cpu, eax, ebx, ecx, edx); break; case 4: /* cache info: needed for Core compatibility */ -- 2.34.1