This bug is missing log files that will aid in diagnosing the problem.
While running an Ubuntu kernel (not a mainline or third-party kernel)
please enter the following command in a terminal window:
apport-collect 2015455
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable
to run this command, please add a comment stating that fact and change
the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the
Ubuntu Kernel Team.
** Changed in: linux (Ubuntu)
Status: New => Incomplete
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2015455
Title:
Intel PET not available on recent kernel causing QEMU VM crashes
Status in linux package in Ubuntu:
Incomplete
Bug description:
Hi
Following a recent kernel update on Ubuntu Server 22.04.2 x86_64 to
5.19.0-35 (and 5.19.0-38), QEMU (via LXD) Windows Server 2022 VMs are
crashing every day.
The CPU has the Intel EPT feature, but I've had to disable tdp_mmu using
modprobe to stabilise the VMs.
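
For anyone reproducing the work-around, here is a minimal Python sketch of checking the current setting and composing the modprobe option line. It assumes the kvm module exposes tdp_mmu under /sys/module/kvm/parameters/. The helper names and the .conf file name in the comment are illustrative, not taken from the linked pages.

```python
from pathlib import Path
from typing import Optional

# Assumed sysfs location of the kvm module parameters.
KVM_PARAMS = Path("/sys/module/kvm/parameters")


def read_kvm_param(name: str, params_dir: Path = KVM_PARAMS) -> Optional[str]:
    """Return the current value of a kvm module parameter, or None if absent."""
    try:
        return (params_dir / name).read_text().strip()
    except OSError:
        # Parameter file missing or unreadable (e.g. kvm not loaded).
        return None


def modprobe_option(module: str, param: str, value: str) -> str:
    """Build one 'options' line for a file under /etc/modprobe.d/."""
    return f"options {module} {param}={value}"


if __name__ == "__main__":
    print("tdp_mmu is currently:", read_kvm_param("tdp_mmu"))
    # Line to place in e.g. /etc/modprobe.d/kvm-disable-tdp-mmu.conf
    # (hypothetical file name).
    print(modprobe_option("kvm", "tdp_mmu", "N"))
```

An option written this way only takes effect the next time the kvm module is loaded, for example after a reboot.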
The platform:
-------------
Linux 5.19.0-38-generic #39~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Mar 17 21:16:15 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
The CPU tech specs on Dell R620:
https://www.intel.com/content/www/us/en/products/sku/75277/intel-xeon-processor-e52680-v2-25m-cache-2-80-ghz/specifications.html
The work-around (success with modprobe):
----------------------------------------
https://pve.proxmox.com/wiki/Upgrade_from_6.x_to_7.0#KVM:_entry_failed.2C_hardware_error_0x80000021
LXD Issue:
----------
https://github.com/lxc/lxd/issues/11520
The QEMU log:
-------------
someadmin@us2204-iph-lxd03:/home/someadmin# cat /var/snap/lxd/common/lxd/logs/mw2022-ivm-test01/qemu.log.old
qemu-system-x86_64: Issue while setting TUNSETSTEERINGEBPF: Invalid argument with fd: 48, prog_fd: -1
KVM: entry failed, hardware error 0x80000021
If you're running a guest on an Intel machine without unrestricted mode
support, the failure can be most likely due to the guest entering an invalid
state for Intel VT. For example, the guest maybe running in big real mode
which is not supported on less recent Intel processors.
EAX=00000008 EBX=00040ee0 ECX=800003ac EDX=00000000
ESI=32e2f000 EDI=32e26040 EBP=813d2810 ESP=813d2790
EIP=00008000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=1 HLT=0
ES =0000 00000000 ffffffff 00809300
CS =8000 7ff80000 ffffffff 00809300
SS =0000 00000000 ffffffff 00809300
DS =0000 00000000 ffffffff 00809300
FS =0000 00000000 ffffffff 00809300
GS =0000 00000000 ffffffff 00809300
LDT=0000 00000000 00000000 00000000
TR =0040 ff2a0000 00000067 00008b00
GDT= ff2a1fb0 00000057
IDT= 00000000 00000000
CR0=00050032 CR2=7c3fa0b0 CR3=001ae002 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=qemu-system-x86_64: ../hw/core/cpu-sysemu.c:77: cpu_asidx_from_attrs:
Assertion `ret < cpu->num_ases && ret >= 0' failed.
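
As a reading aid for the "hardware error 0x80000021" line above: assuming the value is the raw VMX exit reason as laid out in the Intel SDM (bit 31 flags a VM-entry failure, the low 16 bits hold the basic exit reason), it can be decoded as in this sketch. The function name is illustrative.

```python
from typing import Tuple


def decode_vmx_exit_reason(raw: int) -> Tuple[bool, int]:
    """Split a raw VMX exit reason into (entry_failed, basic_reason)."""
    entry_failed = bool((raw >> 31) & 1)  # bit 31: VM-entry failure
    basic_reason = raw & 0xFFFF           # bits 15:0: basic exit reason
    return entry_failed, basic_reason


if __name__ == "__main__":
    failed, reason = decode_vmx_exit_reason(0x80000021)
    print(f"VM-entry failure: {failed}, basic exit reason: {reason}")
```

Under that assumption the value decodes to a VM-entry failure with basic exit reason 33, which the SDM lists as "VM-entry failure due to invalid guest state". That is consistent with the hint QEMU prints.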
The CPU via lscpu:
------------------
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 40
On-line CPU(s) list: 0-39
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
CPU family: 6
Model: 62
Thread(s) per core: 2
Core(s) per socket: 10
Socket(s): 2
Stepping: 4
CPU max MHz: 3600.0000
CPU min MHz: 1200.0000
BogoMIPS: 5599.96
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est
tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt
tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault pti ssbd ibrs
ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
xsaveopt dtherm ida arat pln pts md_clear flush_l1d
Virtualization features:
Virtualization: VT-x
Caches (sum of all):
L1d: 640 KiB (20 instances)
L1i: 640 KiB (20 instances)
L2: 5 MiB (20 instances)
L3: 50 MiB (2 instances)
NUMA:
NUMA node(s): 2
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
Vulnerabilities:
Itlb multihit: KVM: Mitigation: Split huge pages
L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Mds: Mitigation; Clear CPU buffers; SMT vulnerable
Meltdown: Mitigation; PTI
Mmio stale data: Unknown: No mitigations
Retbleed: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
Srbds: Not affected
Tsx async abort: Not affected
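
The flag list in the lscpu output above can be checked mechanically for the virtualization features discussed in this report. A minimal sketch follows. The function name is illustrative, and the sample string is a shortened excerpt of the flags reported above.

```python
from typing import Iterable, List


def missing_flags(flags_line: str, required: Iterable[str]) -> List[str]:
    """Return the required CPU flags that do not appear in flags_line."""
    present = set(flags_line.split())
    return [flag for flag in required if flag not in present]


if __name__ == "__main__":
    # Shortened excerpt of the flags from the lscpu output in this report.
    sample = "vmx smx est tm2 tpr_shadow vnmi flexpriority ept vpid fsgsbase"
    print("missing:", missing_flags(sample, ["vmx", "ept", "vpid"]))
```

An empty result means every requested flag is advertised by the CPU. It does not by itself show that KVM is using the feature.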
QEMU version:
I have not yet been able to determine this; it is whatever is bundled with
the 5.12-c63881f version of the LXD snap on the latest/stable channel. I will
update this report when I find out.
Thanks
Mark