Re: VMM with a Linux guest with the 6.12 kernel branch crashes

Dave Voutila Fri, 15 Aug 2025 11:16:22 -0700

Adrian Ali <adrian...@fortix.com.ar> writes:

> On 8/13/25 12:05 PM, Dave Voutila wrote:
>> Adrian Ali <adrian...@fortix.com.ar> writes:
>>
>>> On 8/13/25 7:33 AM, Dave Voutila wrote:
>>>> Adrian Ali <adrian...@fortix.com.ar> writes:
>>>>
>>>>> Hello, my installation:
>>>>>
>>>>> OpenBSD Release 7.7 amd64
>>>>>
>>>>> server ~ aali $ uname -a
>>>>> OpenBSD server.fortix.com.ar 7.7 GENERIC.MP#2 amd64
>>>>> server ~ aali $
>>>>>
>>>>> server ~ aali $ syspatch -l
>>>>> 001_nfs
>>>>> 002_zic
>>>>> 003_zoneinfo
>>>>> 004_pfsyncook
>>>>> 005_acme
>>>>> 006_xserver
>>>>> 007_xserver
>>>>> 008_pledge
>>>>> server ~ aali $
>>>>>
>>>>> When booting a Linux guest with the 6.12 kernel on VMM:
>>>>>
>>>>> vmctl start -c -n uplink_veb0 -m 512M -i 1 -r
>>>>> install-amd64-minimal-20250810T165238Z.iso -d gentoo.qcow2 gentoo
>>>>>
>>>>> it produces the error:
>>>>>
>>>>> [    2.798040] ------------[ cut here ]------------
>>>>> [    2.799107] WARNING: CPU: 0 PID: 1 at
>>>>> arch/x86/kernel/fpu/xstate.c:1009 get_xsave_addr_user+0x48/0x80
>>>>> [    2.801157] Modules linked in:
>>>>> [    2.801830] CPU: 0 UID: 0 PID: 1 Comm: init Not tainted 6.12.38 #1
>>>>> [    2.803160] Hardware name: OpenBSD VMM, BIOS 1.16.3p0-OpenBSD-vmm
>>>>> 01/01/2011
>>>>> [    2.804676] RIP: 0010:get_xsave_addr_user+0x48/0x80
>>>>> [    2.805731] Code: 00 00 48 d3 e2 48 23 15 ae 4f e9 01 74 1c 48 63
>>>>> c9 48 83 f9 13 73 20 8b 14 8d 00 19 ef bc 48 83 c4 10 48 01 d0 c3 cc
>>>>> cc cc cc <0f> 0b 31 c0 48 83 c4 10 e9 5b 14 2c 01 48 89 ce 48 c7 c7 e0
>>>>> d6 83
>>>>> [    2.809755] RSP: 0018:ffffd3e5c000bd08 EFLAGS: 00010246
>>>>> [    2.810901] RAX: 00007ffe83371640 RBX: 0000000000000000 RCX:
>>>>> 0000000000000009
>>>>> [    2.812444] RDX: 0000000000000000 RSI: 0000000000000009 RDI:
>>>>> 00007ffe83371640
>>>>> [    2.814014] RBP: ffff8c49812ff9c0 R08: ffffd3e5c000be28 R09:
>>>>> 0000000000000000
>>>>> [    2.815850] R10: 0000000000000000 R11: 0000000000000010 R12:
>>>>> 00007ffe83371640
>>>>> [    2.817366] R13: ffff8c49812ff980 R14: 00007ffe83371640 R15:
>>>>> ffff8c49812fd380
>>>>> [    2.818880] FS:  00007f8f95584d40(0000) GS:ffff8c499f400000(0000)
>>>>> knlGS:0000000000000000
>>>>> [    2.820747] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050013
>>>>> [    2.821987] CR2: 000055ba5c099774 CR3: 0000000001094000 CR4:
>>>>> 0000000000f50eb0
>>>>> [    2.823518] PKRU: 00000000
>>>>> [    2.824119] Call Trace:
>>>>> [    2.824670]  <TASK>
>>>>> [    2.825150]  copy_fpstate_to_sigframe+0x203/0x3a0
>>>>> [    2.826197]  get_sigframe+0xf6/0x280
>>>>> [    2.826993]  x64_setup_rt_frame+0x6c/0x2f0
>>>>> [    2.827887]  arch_do_signal_or_restart+0x1cd/0x260
>>>>> [    2.828929]  syscall_exit_to_user_mode+0x172/0x200
>>>>> [    2.830001]  do_syscall_64+0x8e/0x190
>>>>> [    2.830826]  ? exc_page_fault+0x7e/0x180
>>>>> [    2.831726]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>>>> [    2.832891] RIP: 0033:0x7f8f9565a638
>>>>> [    2.833680] Code: 48 85 f6 74 15 48 b9 00 00 00 80 01 00 00 00 48
>>>>> 8b 06 48 85 c8 75 43 48 89 f0 48 89 c6 41 ba 08 00 00 00 b8 0e 00 00
>>>>> 00 0f 05 <89> c2 f7 da 3d 00 f0 ff ff b8 00 00 00 00 0f 47 c2 48 8b 94
>>>>> 24 88
>>>>> [    2.837671] RSP: 002b:00007ffe83371a30 EFLAGS: 00000246 ORIG_RAX:
>>>>> 000000000000000e
>>>>> [    2.839429] RAX: 0000000000000000 RBX: 000000000000004b RCX:
>>>>> 00007f8f9565a638
>>>>> [    2.840950] RDX: 0000000000000000 RSI: 00007ffe83371b90 RDI:
>>>>> 0000000000000002
>>>>> [    2.842477] RBP: 0000000000000001 R08: 00007f8f957a3ac0 R09:
>>>>> 0000000000000001
>>>>> [    2.844004] R10: 0000000000000008 R11: 0000000000000246 R12:
>>>>> 00007ffe83371b10
>>>>> [    2.845461] R13: 00007ffe83371c10 R14: 000055ba5c066ecc R15:
>>>>> 000055ba79e33580
>>>>> [    2.846976]  </TASK>
>>>>> [    2.847440] ---[ end trace 0000000000000000 ]---
>>>>>


Finally had time to diagnose this.

Linux detects the userland memory protection key register (PKRU) and
tries to use the XSAVE area and instructions to save/restore the
register value. Two things go wrong.

1. vmm masks the cpuid bits that describe the available xsave features
and sizes down to what we use on the host, and the host does not use
xsave to save/restore PKRU state on context switch.

2. Linux assumes that if PKRU exists, the matching xsave support
exists, too. Probably a safe assumption for real hardware, but not in
this case. It doesn't handle this edge case gracefully and instead
spits out a warning and sort of goes into a tailspin.

I have a diff I'm cleaning up that I'll share on tech@ when ready. It
will apply at least to -current. Depending on how that goes, I'll
backport to 7.7 for testing as well.
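
The mismatch in points 1 and 2 can be sketched as a check on two raw
CPUID register values. This is only an illustration, not the vmm diff.
The bit positions are the ones documented in the Intel SDM: PKU is
CPUID leaf 7, subleaf 0, ECX bit 3, and the PKRU state is XSAVE
component 9, reported in leaf 0xD, subleaf 0, EAX bit 9. Component 9
appears to match RSI = 0x9 in the trace above, assuming that register
holds the feature number passed to get_xsave_addr_user. The function
name and the sample values are made up, and how the register values
are read inside the guest is left open.

```python
# Bit positions from the Intel SDM; all names here are hypothetical.
PKU_BIT = 3        # CPUID.(EAX=07H,ECX=0):ECX[3], protection keys supported
XFEATURE_PKRU = 9  # CPUID.(EAX=0DH,ECX=0):EAX[9], PKRU XSAVE component


def pkru_xsave_mismatch(leaf7_ecx: int, leafd_eax: int) -> bool:
    """Return True when PKU is advertised but its XSAVE component is not."""
    has_pku = bool((leaf7_ecx >> PKU_BIT) & 1)
    has_pkru_xsave = bool((leafd_eax >> XFEATURE_PKRU) & 1)
    return has_pku and not has_pkru_xsave


if __name__ == "__main__":
    # Made-up sample values: PKU set in leaf 7, but leaf 0xD only
    # advertises x87, SSE and AVX state (bits 0-2).
    print(pkru_xsave_mismatch(0x8, 0x7))
```

A guest that sees the first combination is in the situation described
above: PKU looks usable, but there is nowhere in the XSAVE area to put
PKRU.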


>>>>> It crashes the kernel and boot failed. I tested with a Linux guest
>>>>> "kernel 6.12.31-gentoo" and with the Gentoo minimal installation image
>>>>> "install-amd64-minimal-20250810T165238Z.iso" which comes with a kernel
>>>>> version "Linux version 6.12.38".
>>>>>
>>>>> On the host:
>>>>>
>>>>> tail -f /var/log/daemon
>>>>> Aug 12 23:04:21 server vmd[89690]: started gentoo (vm 1) successfully,
>>>>> tty /dev/ttypu
>>>>>
>>>>> Searching, I found a report that the Linux kernel 6.12 branch also has
>>>>> problems with the VZ hypervisor of macOS. A workaround is to disable
>>>>> the "CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS" option at Linux kernel
>>>>> boot by adding the "nopku" argument to the kernel. The link with the
>>>>> error details and how to solve it in VZ (it works the same in VMM):
>>>>>

For now, the "nopku" kernel arg is the best workaround. It disables
support in Linux and avoids this mess. I've confirmed this myself using
a Debian 13 installer that ships with Linux 6.12.38 with PKU support
compiled in and a userland that's trying to use it.
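
For an installed guest, the argument has to end up on the kernel
command line that the guest's bootloader passes on every boot. Where
that line lives depends on the bootloader and distribution. For GRUB
it is commonly GRUB_CMDLINE_LINUX in /etc/default/grub, but that is an
assumption about the guest and not something tested here. A small
hypothetical helper that adds the argument to such a line without
duplicating it might look like this:

```python
def add_kernel_arg(cmdline: str, arg: str = "nopku") -> str:
    """Append arg to a kernel command line unless it is already present."""
    args = cmdline.split()
    if arg not in args:
        args.append(arg)
    return " ".join(args)


if __name__ == "__main__":
    # "console=ttyS0" is just a sample existing argument.
    print(add_kernel_arg("console=ttyS0"))
```

For a one-off boot such as an installer ISO, editing the boot entry by
hand at the bootloader menu achieves the same thing.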

Note to anyone reading: this only affects you if you have Intel (not
AMD) hardware that supports PKU. Check dmesg output: dmesg | grep PKU.

-dv
