Still investigating this. AMD 1700, FreeBSD 13.1 stable@3dd6497894. VM is Windows 11 22H2.
It happens on the setup disk -- at the TianoCore logo, before the "ring" has finished its first rotation -- so very early in the boot process. It's eventually happened for every Win 11 install I have made. Removing the passthrough devices and installing Windows, then re-adding the devices, a fresh install will boot with the passthrough devices a few times, but then shows the same hang behaviour forever after. Windows Boot Repair also hangs. On the host, bhyvectl --destroy hangs. gdb cannot stop bhyve and just hangs as well. None of these hangs show any CPU use. kldunload vmm crashes the host with a page fault. Only a reboot of the host will kill the guest. Setting the guest cpu count to 1, or removing all the passthrough devices allows Windows 11 to boot. The same behaviour happens for two different USB controllers I have and two different GPUs. The same bhyve configurations reliably boot Windows Server 2022 and Windows 10 with passthrough working. Debugging in userspace, I can see that Windows 11 does PCI enumeration in parallel across multiple cores, and sometimes during boot one vCPU writes a PCI config register at approximately the same time as another vCPU reads that exact register. The hang seems to be aligned with this synchronized write/read. Also, I can sometimes boot successfully under gdb when single stepping PCI cfg register writes, but it's difficult to be sure because my debugging is probably disturbing the timing. I looked at the bhyve code and I don't see what here could be racing in user space. In any event, it's a kernel-side bug. Spinning up the kernel debugger, what I always see is: 1. 1 bhyve thread in vioapic_mmio_write() -> ... -> vm_handle_rendezvous() -> _sleep() 2. 1 bhyve thread in vcpu_lock_one() -> ... -> vcpu_set_state_locked() -> msleep_spin_sbt() 3. All remaining bhyve threads, if any, in vm_run() -> vm_handle_rendezvous() -> _sleep(). Example backtrace attached. So it looks like we have some kind of a deadlock between vcpu_lock_one() and vioapci_mmio_write()? Anyone seen anything like it? — RHC.
(kgdb) thr 808 [Switching to thread 808 (Thread 101514)] #0 sched_switch (td=td@entry=0xfffffe0174dc5020, flags=flags@entry=260) at /usr/src/sys/kern/sched_ule.c:2287 2287 cpuid = td->td_oncpu = PCPU_GET(cpuid); (kgdb) bt #0 sched_switch (td=td@entry=0xfffffe0174dc5020, flags=flags@entry=260) at /usr/src/sys/kern/sched_ule.c:2287 #1 0xffffffff80c106a2 in mi_switch (flags=flags@entry=260) at /usr/src/sys/kern/kern_synch.c:544 #2 0xffffffff80c607cc in sleepq_switch (wchan=wchan@entry=0xfffffe0142deb110, pri=pri@entry=0) at /usr/src/sys/kern/subr_sleepqueue.c:609 #3 0xffffffff80c60e0f in sleepq_timedwait (wchan=wchan@entry=0xfffffe0142deb110, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:691 #4 0xffffffff80c0fdce in _sleep (ident=0xfffffe0142deb110, lock=0xfffffe0142deb118, priority=0, wmesg=0xffffffff8216690e "vmrndv", sbt=4294967000, pr=<optimized out>, flags=256) at /usr/src/sys/kern/kern_synch.c:220 #5 0xffffffff82144b0e in vm_handle_rendezvous (vm=0xfffffe0142deb000, vm@entry=0xfffff800afe241b0, vcpuid=vcpuid@entry=0) at /usr/src/sys/amd64/vmm/vmm.c:1332 #6 0xffffffff82145d9c in vm_smp_rendezvous (vm=<unavailable>, vcpuid=<unavailable>, vcpuid@entry=0, dest=..., func=0x0, arg=arg@entry=0x0) at /usr/src/sys/amd64/vmm/vmm.c:2623 #7 0xffffffff82152364 in vioapic_write (vioapic=0xfffff800afe24000, vcpuid=0, data=<optimized out>, addr=<optimized out>) at /usr/src/sys/amd64/vmm/io/vioapic.c:359 #8 vioapic_mmio_rw (vioapic=0xfffff800afe24000, vcpuid=vcpuid@entry=0, gpa=gpa@entry=4273995792, data=data@entry=0xfffffe0175140870, size=size@entry=4, doread=<optimized out>) at /usr/src/sys/amd64/vmm/io/vioapic.c:409 #9 0xffffffff821524f2 in vioapic_mmio_write (vm=vm@entry=0xfffffe0142deb000, vcpuid=vcpuid@entry=0, gpa=4273995792, wval=50331648, size=4, arg=arg@entry=0xfffffe0175140a77) at /usr/src/sys/amd64/vmm/io/vioapic.c:438 #10 0xffffffff821493dc in emulate_mov (vm=0xfffffe0142deb000, vcpuid=0, gpa=4273995792, vie=<optimized out>, memread=<optimized out>, memwrite=0xffffffff821524c0 <vioapic_mmio_write>, arg=0xfffffe0175140a77) at /usr/src/sys/amd64/vmm/vmm_instruction_emul.c:296 #11 vmm_emulate_instruction (vm=vm@entry=0xfffffe0142deb000, vcpuid=0, gpa=4273995792, vie=vie@entry=0xfffffe0142deb488, paging=paging@entry=0xfffffe0142deb470, memread=<optimized out>, memwrite=0xffffffff821524c0 <vioapic_mmio_write>, memarg=0xfffffe0175140a77) at /usr/src/sys/amd64/vmm/vmm_instruction_emul.c:1789 #12 0xffffffff82144206 in vm_handle_inst_emul (vm=0xfffffe0142deb000, retu=0xfffffe0175140a77, vcpuid=<optimized out>) at /usr/src/sys/amd64/vmm/vmm.c:1563 #13 vm_run (vm=0xfffffe0142deb000, vmrun=vmrun@entry=0xfffff8000c4b5800) at /usr/src/sys/amd64/vmm/vmm.c:1807 #14 0xffffffff821473e4 in vmmdev_ioctl (cdev=<unavailable>, cdev@entry=<error reading variable: value is not available>, cmd=<unavailable>, cmd@entry=<error reading variable: value is not available>, data=0xfffff8000c4b5800 "", data@entry=<error reading variable: value is not available>, fflag=<unavailable>, fflag@entry=<error reading variable: value is not available>, td=<unavailable>, td@entry=<error reading variable: value is not available>) at /usr/src/sys/amd64/vmm/vmm_dev.c:504 #15 0xffffffff80a9ac36 in devfs_ioctl (ap=0xfffffe0175140ba8, ap@entry=<error reading variable: value is not available>) at /usr/src/sys/fs/devfs/devfs_vnops.c:944 #16 0xffffffff80cf51f4 in vn_ioctl (fp=0xfffff8001e0fe640, com=<unavailable>, data=0xfffff8000c4b5800, active_cred=0xfffff8013083dd00, td=<unavailable>) at /usr/src/sys/kern/vfs_vnops.c:1696 #17 0xffffffff80a9b2ee in devfs_ioctl_f (fp=<unavailable>, fp@entry=<error reading variable: value is not available>, com=<unavailable>, com@entry=<error reading variable: value is not available>, data=<unavailable>, data@entry=<error reading variable: value is not available>, cred=<unavailable>, cred@entry=<error reading variable: value is not available>, td=<unavailable>, td@entry=<error reading variable: value is not available>) at /usr/src/sys/fs/devfs/devfs_vnops.c:875 #18 0xffffffff80c7164d in fo_ioctl (fp=0xfffff8001e0fe640, com=<unavailable>, data=<unavailable>, active_cred=<unavailable>, td=0xfffffe0174dc5020) at /usr/src/sys/sys/file.h:361 #19 kern_ioctl (td=<unavailable>, td@entry=0xfffffe0174dc5020, fd=<optimized out>, com=<unavailable>, com@entry=3230692865, data=<unavailable>, data@entry=0xfffff8000c4b5800 "") at /usr/src/sys/kern/sys_generic.c:803 #20 0xffffffff80c71330 in sys_ioctl (td=0xfffffe0174dc5020, uap=0xfffffe0174dc5408) at /usr/src/sys/kern/sys_generic.c:711 #21 0xffffffff810a967c in syscallenter (td=0xfffffe0174dc5020) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:189 #22 amd64_syscall (td=0xfffffe0174dc5020, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1183 #23 <signal handler called> #24 0x000027d652e40b7a in ?? () Backtrace stopped: Cannot access memory at address 0x27dbd5e37e58 (kgdb) thr 809 [Switching to thread 809 (Thread 101515)] #0 sched_switch (td=td@entry=0xfffffe01459bde40, flags=flags@entry=260) at /usr/src/sys/kern/sched_ule.c:2287 2287 cpuid = td->td_oncpu = PCPU_GET(cpuid); (kgdb) bt #0 sched_switch (td=td@entry=0xfffffe01459bde40, flags=flags@entry=260) at /usr/src/sys/kern/sched_ule.c:2287 #1 0xffffffff80c106a2 in mi_switch (flags=flags@entry=260) at /usr/src/sys/kern/kern_synch.c:544 #2 0xffffffff80c607cc in sleepq_switch (wchan=wchan@entry=0xfffffe0142deb3e8, pri=pri@entry=0) at /usr/src/sys/kern/subr_sleepqueue.c:609 #3 0xffffffff80c60e0f in sleepq_timedwait (wchan=wchan@entry=0xfffffe0142deb3e8, pri=pri@entry=0) at /usr/src/sys/kern/subr_sleepqueue.c:691 #4 0xffffffff80c10006 in msleep_spin_sbt (ident=ident@entry=0xfffffe0142deb3e8, mtx=mtx@entry=0xfffffe0142deb3c8, wmesg=0xffffffff82165fc7 "vmstat", sbt=4294967000, pr=pr@entry=0, flags=flags@entry=256) at /usr/src/sys/kern/kern_synch.c:293 #5 0xffffffff82145832 in vcpu_set_state_locked (vm=vm@entry=0xfffffe0142deb000, vcpuid=vcpuid@entry=0, newstate=newstate@entry=VCPU_FROZEN, from_idle=<optimized out>) at /usr/src/sys/amd64/vmm/vmm.c:1222 #6 0xffffffff8214570f in vcpu_set_state (vm=0xfffffe0142deb000, vcpuid=vcpuid@entry=0, newstate=newstate@entry=VCPU_FROZEN, from_idle=true) at /usr/src/sys/amd64/vmm/vmm.c:2367 #7 0xffffffff821466e3 in vcpu_lock_one (sc=0xfffff80189a2b3c0, vcpu=0) at /usr/src/sys/amd64/vmm/vmm_dev.c:134 #8 vcpu_lock_all (sc=sc@entry=0xfffff80189a2b3c0) at /usr/src/sys/amd64/vmm/vmm_dev.c:160 #9 0xffffffff82146eff in vmmdev_ioctl (cdev=<unavailable>, cdev@entry=<error reading variable: value is not available>, cmd=2150135342, cmd@entry=<error reading variable: value is not available>, data=0xfffffe017514ad50 "\t", data@entry=<error reading variable: value is not available>, fflag=<unavailable>, fflag@entry=<error reading variable: value is not available>, td=<unavailable>, td@entry=<error reading variable: value is not available>) at /usr/src/sys/amd64/vmm/vmm_dev.c:475 #10 0xffffffff80a9ac36 in devfs_ioctl (ap=0xfffffe017514aba8, ap@entry=<error reading variable: value is not available>) at /usr/src/sys/fs/devfs/devfs_vnops.c:944 #11 0xffffffff80cf51f4 in vn_ioctl (fp=0xfffff8001e0fe640, com=<unavailable>, data=0xfffffe017514ad50, active_cred=0xfffff8013083dd00, td=<unavailable>) at /usr/src/sys/kern/vfs_vnops.c:1696 #12 0xffffffff80a9b2ee in devfs_ioctl_f (fp=<unavailable>, fp@entry=<error reading variable: value is not available>, com=<unavailable>, com@entry=<error reading variable: value is not available>, data=<unavailable>, data@entry=<error reading variable: value is not available>, cred=<unavailable>, cred@entry=<error reading variable: value is not available>, td=<unavailable>, td@entry=<error reading variable: value is not available>) at /usr/src/sys/fs/devfs/devfs_vnops.c:875 #13 0xffffffff80c7164d in fo_ioctl (fp=0xfffff8001e0fe640, com=<unavailable>, data=<unavailable>, active_cred=<unavailable>, td=0xfffffe01459bde40) at /usr/src/sys/sys/file.h:361 #14 kern_ioctl (td=<unavailable>, td@entry=0xfffffe01459bde40, fd=<optimized out>, com=<unavailable>, com@entry=2150135342, data=<unavailable>, data@entry=0xfffffe017514ad50 "\t") at /usr/src/sys/kern/sys_generic.c:803 #15 0xffffffff80c71330 in sys_ioctl (td=0xfffffe01459bde40, uap=0xfffffe01459be228) at /usr/src/sys/kern/sys_generic.c:711 #16 0xffffffff810a967c in syscallenter (td=0xfffffe01459bde40) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:189 #17 amd64_syscall (td=0xfffffe01459bde40, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1183 #18 <signal handler called> #19 0x000027d652e40b7a in ?? () Backtrace stopped: Cannot access memory at address 0x27dbd7074af8 (kgdb) thr 810 [Switching to thread 810 (Thread 101516)] #0 sched_switch (td=td@entry=0xfffffe01459bd720, flags=flags@entry=260) at /usr/src/sys/kern/sched_ule.c:2287 2287 cpuid = td->td_oncpu = PCPU_GET(cpuid); (kgdb) bt #0 sched_switch (td=td@entry=0xfffffe01459bd720, flags=flags@entry=260) at /usr/src/sys/kern/sched_ule.c:2287 #1 0xffffffff80c106a2 in mi_switch (flags=flags@entry=260) at /usr/src/sys/kern/kern_synch.c:544 #2 0xffffffff80c607cc in sleepq_switch (wchan=wchan@entry=0xfffffe0142deb110, pri=pri@entry=0) at /usr/src/sys/kern/subr_sleepqueue.c:609 #3 0xffffffff80c60e0f in sleepq_timedwait (wchan=wchan@entry=0xfffffe0142deb110, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:691 #4 0xffffffff80c0fdce in _sleep (ident=0xfffffe0142deb110, lock=0xfffffe0142deb118, priority=0, wmesg=0xffffffff8216690e "vmrndv", sbt=4294967000, pr=<optimized out>, flags=256) at /usr/src/sys/kern/kern_synch.c:220 #5 0xffffffff82144b0e in vm_handle_rendezvous (vm=vm@entry=0xfffffe0142deb000, vcpuid=vcpuid@entry=2) at /usr/src/sys/amd64/vmm/vmm.c:1332 #6 0xffffffff82144031 in vm_run (vm=0xfffffe0142deb000, vmrun=vmrun@entry=0xfffff8014f411600) at /usr/src/sys/amd64/vmm/vmm.c:1797 #7 0xffffffff821473e4 in vmmdev_ioctl (cdev=<unavailable>, cdev@entry=<error reading variable: value is not available>, cmd=<unavailable>, cmd@entry=<error reading variable: value is not available>, data=0xfffff8014f411600 "\002", data@entry=<error reading variable: value is not available>, fflag=<unavailable>, fflag@entry=<error reading variable: value is not available>, td=<unavailable>, td@entry=<error reading variable: value is not available>) at /usr/src/sys/amd64/vmm/vmm_dev.c:504 #8 0xffffffff80a9ac36 in devfs_ioctl (ap=0xfffffe0175145ba8, ap@entry=<error reading variable: value is not available>) at /usr/src/sys/fs/devfs/devfs_vnops.c:944 #9 0xffffffff80cf51f4 in vn_ioctl (fp=0xfffff8001e0fe640, com=<unavailable>, data=0xfffff8014f411600, active_cred=0xfffff8013083dd00, td=<unavailable>) at /usr/src/sys/kern/vfs_vnops.c:1696 #10 0xffffffff80a9b2ee in devfs_ioctl_f (fp=<unavailable>, fp@entry=<error reading variable: value is not available>, com=<unavailable>, com@entry=<error reading variable: value is not available>, data=<unavailable>, data@entry=<error reading variable: value is not available>, cred=<unavailable>, cred@entry=<error reading variable: value is not available>, td=<unavailable>, td@entry=<error reading variable: value is not available>) at /usr/src/sys/fs/devfs/devfs_vnops.c:875 #11 0xffffffff80c7164d in fo_ioctl (fp=0xfffff8001e0fe640, com=<unavailable>, data=<unavailable>, active_cred=<unavailable>, td=0xfffffe01459bd720) at /usr/src/sys/sys/file.h:361 #12 kern_ioctl (td=<unavailable>, td@entry=0xfffffe01459bd720, fd=<optimized out>, com=<unavailable>, com@entry=3230692865, data=<unavailable>, data@entry=0xfffff8014f411600 "\002") at /usr/src/sys/kern/sys_generic.c:803 #13 0xffffffff80c71330 in sys_ioctl (td=0xfffffe01459bd720, uap=0xfffffe01459bdb08) at /usr/src/sys/kern/sys_generic.c:711 #14 0xffffffff810a967c in syscallenter (td=0xfffffe01459bd720) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:189 #15 amd64_syscall (td=0xfffffe01459bd720, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1183 #16 <signal handler called> #17 0x000027d652e40b7a in ?? () Backtrace stopped: Cannot access memory at address 0x27dbd955ee58 (kgdb) thr 811 [Switching to thread 811 (Thread 101517)] #0 sched_switch (td=td@entry=0xfffffe01459bd000, flags=flags@entry=260) at /usr/src/sys/kern/sched_ule.c:2287 2287 cpuid = td->td_oncpu = PCPU_GET(cpuid); (kgdb) bt #0 sched_switch (td=td@entry=0xfffffe01459bd000, flags=flags@entry=260) at /usr/src/sys/kern/sched_ule.c:2287 #1 0xffffffff80c106a2 in mi_switch (flags=flags@entry=260) at /usr/src/sys/kern/kern_synch.c:544 #2 0xffffffff80c607cc in sleepq_switch (wchan=wchan@entry=0xfffffe0142deb110, pri=pri@entry=0) at /usr/src/sys/kern/subr_sleepqueue.c:609 #3 0xffffffff80c60e0f in sleepq_timedwait (wchan=wchan@entry=0xfffffe0142deb110, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:691 #4 0xffffffff80c0fdce in _sleep (ident=0xfffffe0142deb110, lock=0xfffffe0142deb118, priority=0, wmesg=0xffffffff8216690e "vmrndv", sbt=4294967000, pr=<optimized out>, flags=256) at /usr/src/sys/kern/kern_synch.c:220 #5 0xffffffff82144b0e in vm_handle_rendezvous (vm=vm@entry=0xfffffe0142deb000, vcpuid=vcpuid@entry=3) at /usr/src/sys/amd64/vmm/vmm.c:1332 #6 0xffffffff82144031 in vm_run (vm=0xfffffe0142deb000, vmrun=vmrun@entry=0xfffff8013e457600) at /usr/src/sys/amd64/vmm/vmm.c:1797 #7 0xffffffff821473e4 in vmmdev_ioctl (cdev=<unavailable>, cdev@entry=<error reading variable: value is not available>, cmd=<unavailable>, cmd@entry=<error reading variable: value is not available>, data=0xfffff8013e457600 "\003", data@entry=<error reading variable: value is not available>, fflag=<unavailable>, fflag@entry=<error reading variable: value is not available>, td=<unavailable>, td@entry=<error reading variable: value is not available>) at /usr/src/sys/amd64/vmm/vmm_dev.c:504 #8 0xffffffff80a9ac36 in devfs_ioctl (ap=0xfffffe0175154ba8, ap@entry=<error reading variable: value is not available>) at /usr/src/sys/fs/devfs/devfs_vnops.c:944 #9 0xffffffff80cf51f4 in vn_ioctl (fp=0xfffff8001e0fe640, com=<unavailable>, data=0xfffff8013e457600, active_cred=0xfffff8013083dd00, td=<unavailable>) at /usr/src/sys/kern/vfs_vnops.c:1696 #10 0xffffffff80a9b2ee in devfs_ioctl_f (fp=<unavailable>, fp@entry=<error reading variable: value is not available>, com=<unavailable>, com@entry=<error reading variable: value is not available>, data=<unavailable>, data@entry=<error reading variable: value is not available>, cred=<unavailable>, cred@entry=<error reading variable: value is not available>, td=<unavailable>, td@entry=<error reading variable: value is not available>) at /usr/src/sys/fs/devfs/devfs_vnops.c:875 #11 0xffffffff80c7164d in fo_ioctl (fp=0xfffff8001e0fe640, com=<unavailable>, data=<unavailable>, active_cred=<unavailable>, td=0xfffffe01459bd000) at /usr/src/sys/sys/file.h:361 #12 kern_ioctl (td=<unavailable>, td@entry=0xfffffe01459bd000, fd=<optimized out>, com=<unavailable>, com@entry=3230692865, data=<unavailable>, data@entry=0xfffff8013e457600 "\003") at /usr/src/sys/kern/sys_generic.c:803 #13 0xffffffff80c71330 in sys_ioctl (td=0xfffffe01459bd000, uap=0xfffffe01459bd3e8) at /usr/src/sys/kern/sys_generic.c:711 #14 0xffffffff810a967c in syscallenter (td=0xfffffe01459bd000) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:189 #15 amd64_syscall (td=0xfffffe01459bd000, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1183 #16 <signal handler called> #17 0x000027d652e40b7a in ?? () Backtrace stopped: Cannot access memory at address 0x27dbdb4bde58 (kgdb)