Alyssa Ross <h...@alyssa.is> writes:

> Stefan Hajnoczi <stefa...@redhat.com> writes:
>
>> On Tue, Jul 21, 2020 at 07:14:38AM +0000, Alyssa Ross wrote:
>>> Hi -- I hope it's okay me reaching out like this.
>>>
>>> I've been trying to test out the virtio-vhost-user implementation that's
>>> been posted to this list a couple of times, but have been unable to get
>>> it to boot a kernel following the steps listed either on
>>> <https://wiki.qemu.org/Features/VirtioVhostUser> or
>>> <https://ndragazis.github.io/dpdk-vhost-vvu-demo.html>.
>>>
>>> Specifically, the kernel appears to be unable to write to the
>>> virtio-vhost-user device's PCI registers.  I've included the full panic
>>> output from the kernel at the end of this message.  The panic is
>>> reproducible with two different kernels I tried (with different configs
>>> and versions).  I tried both versions of the virtio-vhost-user I was
>>> able to find[1][2], and both exhibited the same behaviour.
>>>
>>> Is this a known issue?  Am I doing something wrong?
>>
>> Hi,
>> Unfortunately I'm not sure what the issue is.  This is an early
>> virtio-pci register access before a driver for any specific device type
>> (net, blk, vhost-user, etc) comes into play.
>
> Small update here: I tried on another computer, and it worked.  I made
> sure that it was exactly the same QEMU binary, command line, and VM
> disk/initrd/kernel, so I think I can fairly confidently say the panic
> depends on what hardware QEMU is running on.  I set the -cpu value to
> the same on both as well (SandyBridge).
>
> I also discovered that it works on my primary computer (the one it
> panicked on before) with KVM disabled.
>
> Note that I've only got so far as finding that it boots on the other
> machine -- I haven't verified yet that it actually works.
>
> Bad host CPU:  Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
> Good host CPU: AMD EPYC 7401P 24-Core Processor
>
> May I ask what host CPUs other people have tested this on?  Having more
> data would probably be useful.  Could it be an AMD vs. Intel thing?
I think I've figured it out!  Sandy Bridge and Ivy Bridge hosts
encounter this panic because the "additional resources" BAR size is too
big, at 1 << 36.  If I change this to 1 << 35, the kernel panic goes
away.  Skylake and later are fine with 1 << 36.  In between Ivy Bridge
and Skylake came Haswell and Broadwell, but I couldn't find anybody who
was able to help me test on either of those, so I don't know what they
do.

Perhaps related: the hosts that produce panics all seem to have a
physical address size of 36 bits, while the hosts that work have larger
physical address sizes, as reported by lscpu.
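If that's the cause, the arithmetic works out: a 2^36-byte BAR must be
naturally aligned, so on a host with a 36-bit (64 GiB) physical address
space the only possible placement is address 0, which is already
occupied -- there is simply nowhere to map it.  It would presumably
also explain why the VM boots with KVM disabled, since TCG isn't bound
by the host's physical address width.  For anyone who wants to check
their own host, here's a minimal standalone C sketch (my own
illustration, not code from the patches; only the 1ULL << 36 figure
comes from the virtio-vhost-user "additional resources" BAR) that reads
the physical address width from CPUID leaf 0x80000008 and compares:

  /* barcheck.c -- build with: cc -o barcheck barcheck.c */
  #include <cpuid.h>
  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
      unsigned int eax, ebx, ecx, edx;

      /* CPUID leaf 0x80000008: EAX bits 7:0 = physical address bits
       * (the same number lscpu reports as "Address sizes"). */
      if (!__get_cpuid(0x80000008, &eax, &ebx, &ecx, &edx)) {
          fprintf(stderr, "CPUID leaf 0x80000008 unsupported\n");
          return 1;
      }

      unsigned int phys_bits = eax & 0xff;
      uint64_t bar_size = 1ULL << 36;  /* the problematic BAR size */

      printf("host physical address bits: %u\n", phys_bits);

      /* A naturally aligned BAR as large as the whole physical address
       * space can only be placed at address 0, so it can never be
       * mapped on such a host. */
      if (bar_size >= (1ULL << phys_bits)) {
          printf("a %llu GiB BAR cannot be placed on this host\n",
                 (unsigned long long)(bar_size >> 30));
      } else {
          printf("a %llu GiB BAR is placeable on this host\n",
                 (unsigned long long)(bar_size >> 30));
      }
      return 0;
  }

On my i5-2520M this reports 36 physical address bits and flags the
64 GiB BAR as unplaceable, which lines up with the panics above.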