Hello,

Something similar also appears to be affecting bhyve, at least on an AMD Opteron 4228 HE. The error produced is different depending on whether bhyve is instructed to ignore accesses to model-specific registers that are not implemented in the current CPU. I haven't had to toggle that flag previously. I've included the dmesg and trace from both setups below.

A snapshot of -current with a build date of 1533181438 - Thu Aug 2 03:43:58 UTC 2018 boots successfully with ignore_bad_msr set to on. I'm not entirely sure if Bryan's patch will have made it into that snapshot or not, but if it has, it appears to also be fixing the issue on bhyve.

Thanks!

Sincerely,
Nulani.

### 6.3 without ignore_bad_msr ###
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.
Copyright (c) 1995-2018 OpenBSD. All rights reserved. https://www.OpenBSD.org

OpenBSD 6.3 (GENERIC) #7: Sun Jul 29 11:30:47 CEST 2018
    r...@syspatch-63-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
real mem = 1056964608 (1008MB)
avail mem = 1019158528 (971MB)
warning: no entropy supplied by boot loader
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.6 @ 0xf101f (9 entries)
bios0: vendor BHYVE version "1.00" date 03/14/2014
bios0: bhyve BHYVE
acpi0 at bios0: rev 2
acpi0: sleep states S5
acpi0: tables DSDT APIC FACP HPET MCFG
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD Opteron(tm) Processor 4228 HE, 2800.42 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,SSSE3,CX16,SSE4.1,SSE4.2,POPCNT,AES,XSAVE,AVX,HV,NXE,MMXX,FFXSR,PAGE1GB,LONG,LAHF,CMPLEG,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,XOP,SKINIT,WDT,FMA4,ITSC
cpu0: 64KB 64b/line 2-way I-cache, 16KB 64b/line 4-way D-cache, 2MB 64b/line 16-way L2 cache, 8MB 64b/line 64-way L3 cache
cpu0: ITLB 48 4KB entries fully associative, 24 4MB entries fully associative
cpu0: DTLB 32 4KB entries fully associative, 32 4MB entries fully associative
kernel: protection fault trap, code=0
Stopped at 0xffffffff81219c59: wrmsr
ddb> trace
ffffffff81219c59(ffff800000031700,ffffffff81a7fff0,ffffffff81a7d028,ffffffff81c06a58,ffff800000031724,0) at 0xffffffff81219c59
ffffffff81008d2e(ffff800000023100,ffffffff81c06a58,ffff800000031700,ffffffff81a7d000,ffffffff81008d2e,ffffffff81c069b0) at 0xffffffff81008d2e
ffffffff813618b8(0,ffff8000000232c4,ffff800000023298,ffff8000000232c4,ffffffff81c06a38,ffffffff816c51a0) at 0xffffffff813618b8
ffffffff816c4c36(ffff800000020400,ffffffff81c06b60,ffffffff81ab3f38,ffff800000031200,ffff800000031224,0) at 0xffffffff816c4c36
ffffffff813618b8(ffffffff81c06b60,ffff800000020400,ffff800000020470,ffff800000020460,ffff800000023280,ffffffff8100d040) at 0xffffffff813618b8
ffffffff8100c571(ffff800000023180,ffffffff81c06c50,ffffffff81a811a8,ffff800000020400,ffff800000020424,0) at 0xffffffff8100c571
ffffffff813618b8(ffff800014a67023,ffff800000023180,3c,104,ffff800014a67042,ffffffff8140b6f0) at 0xffffffff813618b8
ffffffff8140a766(ffff800000023100,ffffffff81c06d88,ffffffff81aa1ea8,ffff800000023180,ffff8000000231a4,0) at 0xffffffff8140a766
ffffffff813618b8(ffffffff81c06d88,ffff800000023100,ffffffff81a8ba98,ffff800000023100,ffff800000023124,ffffffff811a9830) at 0xffffffff813618b8
ffffffff811a95a1(0,0,0,ffffffff81c06db0,ffffffff81c06e20,3000000010) at 0xffffffff811a95a1
ffffffff813618b8(0,ffffffff81827131,ffffffff81a9fd8a,ffffffff81c06e78,b28,0) at 0xffffffff813618b8
ffffffff81361a53(0,0,0,0,ffffffff81c00008,0) at 0xffffffff81361a53
ffffffff8101187b(0,0,ffffffff8101187b,ffffffff81c06ef0,0,0) at 0xffffffff8101187b
ffffffff8116b8c3(0,0,0,0,ffffffff8116b8c3,ffffffff81c06f20) at 0xffffffff8116b8c3
end trace frame: 0x0, count: -14
ddb> ps
   PID   TID  PPID   UID  S    FLAGS  WAIT  COMMAND
*    0     0    -1     0  7  0x10200        swapper

### 6.3 with ignore_bad_msr ###
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.
Copyright (c) 1995-2018 OpenBSD. All rights reserved. https://www.OpenBSD.org

OpenBSD 6.3 (GENERIC) #7: Sun Jul 29 11:30:47 CEST 2018
    r...@syspatch-63-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
real mem = 1056964608 (1008MB)
avail mem = 1019158528 (971MB)
warning: no entropy supplied by boot loader
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.6 @ 0xf101f (9 entries)
bios0: vendor BHYVE version "1.00" date 03/14/2014
bios0: bhyve BHYVE
acpi0 at bios0: rev 2
acpi0: sleep states S5
acpi0: tables DSDT APIC FACP HPET MCFG
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD Opteron(tm) Processor 4228 HE, 2800.54 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,SSSE3,CX16,SSE4.1,SSE4.2,POPCNT,AES,XSAVE,AVX,HV,NXE,MMXX,FFXSR,PAGE1GB,LONG,LAHF,CMPLEG,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,XOP,SKINIT,WDT,FMA4,ITSC
cpu0: 64KB 64b/line 2-way I-cache, 16KB 64b/line 4-way D-cache, 2MB 64b/line 16-way L2 cache, 8MB 64b/line 64-way L3 cache
cpu0: ITLB 48 4KB entries fully associative, 24 4MB entries fully associative
cpu0: DTLB 32 4KB entries fully associative, 32 4MB entries fully associative
kernel: protection fault trap, code=0
Stopped at 0xffffffff81219c59: wrmsr
ddb> trace
ffffffff81219c59(ffff800000031700,ffffffff81a7fff0,ffffffff81a7d028,ffffffff81c06a58,ffff800000031724,0) at 0xffffffff81219c59
ffffffff81008d2e(ffff800000023100,ffffffff81c06a58,ffff800000031700,ffffffff81a7d000,ffffffff81008d2e,ffffffff81c069b0) at 0xffffffff81008d2e
ffffffff813618b8(0,ffff8000000232c4,ffff800000023298,ffff8000000232c4,ffffffff81c06a38,ffffffff816c51a0) at 0xffffffff813618b8
ffffffff816c4c36(ffff800000020400,ffffffff81c06b60,ffffffff81ab3f38,ffff800000031200,ffff800000031224,0) at 0xffffffff816c4c36
ffffffff813618b8(ffffffff81c06b60,ffff800000020400,ffff800000020470,ffff800000020460,ffff800000023280,ffffffff8100d040) at 0xffffffff813618b8
ffffffff8100c571(ffff800000023180,ffffffff81c06c50,ffffffff81a811a8,ffff800000020400,ffff800000020424,0) at 0xffffffff8100c571
ffffffff813618b8(ffff800014a67023,ffff800000023180,3c,104,ffff800014a67042,ffffffff8140b6f0) at 0xffffffff813618b8
ffffffff8140a766(ffff800000023100,ffffffff81c06d88,ffffffff81aa1ea8,ffff800000023180,ffff8000000231a4,0) at 0xffffffff8140a766
ffffffff813618b8(ffffffff81c06d88,ffff800000023100,ffffffff81a8ba98,ffff800000023100,ffff800000023124,ffffffff811a9830) at 0xffffffff813618b8
ffffffff811a95a1(0,0,0,ffffffff81c06db0,ffffffff81c06e20,3000000010) at 0xffffffff811a95a1
ffffffff813618b8(0,ffffffff81827131,ffffffff81a9fd8a,ffffffff81c06e78,b28,0) at 0xffffffff813618b8
ffffffff81361a53(0,0,0,0,ffffffff81c00008,0) at 0xffffffff81361a53
ffffffff8101187b(0,0,ffffffff8101187b,ffffffff81c06ef0,0,0) at 0xffffffff8101187b
ffffffff8116b8c3(0,0,0,0,ffffffff8116b8c3,ffffffff81c06f20) at 0xffffffff8116b8c3
end trace frame: 0x0, count: -14
ddb> ps
   PID   TID  PPID   UID  S    FLAGS  WAIT  COMMAND
*    0     0    -1     0  7  0x10200        swapper

On 1 August 2018 at 23:11, Bryan Steele <bry...@gmail.com> wrote:
> On Wed, Aug 01, 2018 at 01:07:33PM -0700, Mike Larkin wrote:
>> On Wed, Aug 01, 2018 at 12:14:59PM -0400, Bryan Steele wrote:
>> > On Wed, Aug 01, 2018 at 11:27:26AM -0400, Bryan Steele wrote:
>> > > On Wed, Aug 01, 2018 at 03:46:25PM +0200, Elmer Skjødt Henriksen wrote:
>> > > > After installing the 014_amdlfence patch released yesterday for 6.3, my
>> > > > OpenBSD VM crashes on boot. It's running under KVM on a Linux box (Ubuntu
>> > > > 18.04 w/ kernel 4.15) on an AMD Ryzen 7 1700 (microcode 0x8001137).
>> > > > I suppose this would also happen on vmm(4) and bhyve, however I don't have
>> > > > any such AMD hosts available for testing.
>> > >
>> > > Hi Elmer,
>> > >
>> > > This was tested in vmm(4), which does work, unfortunately there was not
>> > > extensive testing by in other virtualization software. The MSR that is
>> > > being set here is only mentioned in AMDs whitepaper and I had no reason
>> > > to believe any special consideration was needed for guest VMs on AMD
>> > > processors.
>> > >
>> > > > It occurs both using libvirt's "EPYC" CPU model and using "host-passthrough"
>> > > > (i.e. no virtual CPU model), but the "core2duo" CPU model works fine.
>> > > >
>> > > > I guess not many people are running OpenBSD as a VM, and even less on AMD
>> > > > hardware. But still, a syspatch leaving the system unable to boot is
>> > > > probably not a good thing. :)
>> > > >
>> > >
>> > > Even so, I would like to apologize. This situation is unfortunate, and
>> > > I'll try to work with other developers to find the best way forward.
>> > > But, I regret I am only but an amateur magician.
>> > >
>> > > -Bryan.
>> >
>> > Actually, it looks like this is at least partially a KVM/QEMU bug. In
>> > the meantime I guess the solution would be to do as you suggested and
>> > set a different CPU model for now until Linux distros include a fix for
>> > this.
>> >
>> > https://lkml.org/lkml/2018/2/21/1202
>> >
>> > Afterwards, on the OpenBSD side, it looks like one small change may be
>> > required in addition..
>> >
>> > -Bryan.
>> >
>> > Index: sys/arch/amd64/amd64/identcpu.c
>> > ===================================================================
>> > RCS file: /cvs/src/sys/arch/amd64/amd64/identcpu.c,v
>> > retrieving revision 1.95.2.2
>> > diff -u -p -u -r1.95.2.2 identcpu.c
>> > --- sys/arch/amd64/amd64/identcpu.c 30 Jul 2018 14:45:05 -0000 1.95.2.2
>> > +++ sys/arch/amd64/amd64/identcpu.c 1 Aug 2018 16:09:50 -0000
>> > @@ -650,8 +650,10 @@ identifycpu(struct cpu_info *ci)
>> >
>> >              msr = rdmsr(MSR_DE_CFG);
>> >  #define DE_CFG_SERIALIZE_LFENCE (1 << 1)
>> > -            msr |= DE_CFG_SERIALIZE_LFENCE;
>> > -            wrmsr(MSR_DE_CFG, msr);
>> > +            if ((msr & DE_CFG_SERIALIZE_LFENCE) == 0) {
>> > +                msr |= DE_CFG_SERIALIZE_LFENCE;
>> > +                wrmsr(MSR_DE_CFG, msr);
>> > +            }
>> >          }
>> >      }
>> >
>> >
>>
>> As expected, -current works properly on real AMD hardware. So my assumption
>> about KVM doing something odd seems to be correct.
>>
>> The issue should be reported upstream to the KVM folks. But if the diff above
>> also fixes the issue (I didn't test because I cannot reproduce it), ok
>> mlarkin.
>>
>> -ml
>
> I committed a fix for the potential MSR write #GP bug to -current:
>
> https://marc.info/?l=openbsd-cvs&m=153315564121057&w=2
>
> Unfortunately, for the MSR read issue on older KVMs, it would require
> adding additional code to determine if we're running under KVM, there's
> really not much at all we can do here..
>
> I agree these seem like KVM bugs, as this does not happen on real
> hardware, and at least also not in OpenBSD vmm(4).
>
> -Bryan.
>
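
For anyone following along, here is a small userland sketch of why the guarded write in the quoted diff can matter. It is not kernel code: rdmsr/wrmsr are privileged, so the register is simulated with a hypothetical struct sim_msr, and the sim_* and set_lfence_* names are made up for illustration. The assumed scenario (a hypervisor that already reports the LFENCE bit as set but faults on any write to the register) is my reading of the "MSR write #GP" description, not something confirmed in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DE_CFG_SERIALIZE_LFENCE (1ULL << 1)	/* same bit as in the diff */

/* Hypothetical stand-in for a single MSR as a hypervisor might expose it. */
struct sim_msr {
	uint64_t value;		/* current register contents */
	int write_faults;	/* nonzero: any write would raise #GP */
	int writes;		/* number of writes attempted */
	int faults;		/* number of simulated #GP faults */
};

static uint64_t
sim_rdmsr(const struct sim_msr *m)
{
	assert(m != NULL);
	return m->value;
}

static void
sim_wrmsr(struct sim_msr *m, uint64_t v)
{
	assert(m != NULL);
	m->writes++;
	if (m->write_faults) {
		m->faults++;	/* a real guest would trap here */
		return;
	}
	m->value = v;
}

/* Unconditional read-modify-write, as in the code before the diff. */
static void
set_lfence_always(struct sim_msr *m)
{
	uint64_t msr = sim_rdmsr(m);

	msr |= DE_CFG_SERIALIZE_LFENCE;
	sim_wrmsr(m, msr);
}

/* Guarded variant, as in the diff: only write when the bit is clear. */
static void
set_lfence_if_clear(struct sim_msr *m)
{
	uint64_t msr = sim_rdmsr(m);

	if ((msr & DE_CFG_SERIALIZE_LFENCE) == 0) {
		msr |= DE_CFG_SERIALIZE_LFENCE;
		sim_wrmsr(m, msr);
	}
}
```

With a simulated register that has the bit set and write_faults enabled, set_lfence_always() records one fault while set_lfence_if_clear() never attempts the write. Note that the guard does nothing for a host where the read itself faults, which matches Bryan's remark about the MSR read issue on older KVMs.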