On Tue, Jul 27, 2021 at 09:04:20AM +0200, Ard Biesheuvel wrote:
> On Tue, 27 Jul 2021 at 07:12, Guenter Roeck <li...@roeck-us.net> wrote:
> >
> > On 7/26/21 9:45 PM, Michael S. Tsirkin wrote:
> > > On Mon, Jul 26, 2021 at 06:00:57PM +0200, Ard Biesheuvel wrote:
> > >> (cc Bjorn)
> > >>
> > >> On Mon, 26 Jul 2021 at 11:08, Philippe Mathieu-Daudé <phi...@redhat.com> wrote:
> > >>>
> > >>> On 7/26/21 12:56 AM, Guenter Roeck wrote:
> > >>>> On 7/25/21 3:14 PM, Michael S. Tsirkin wrote:
> > >>>>> On Sat, Jul 24, 2021 at 11:52:34AM -0700, Guenter Roeck wrote:
> > >>>>>> Hi all,
> > >>>>>>
> > >>>>>> starting with qemu v6.0, some of my aarch64 efi boot tests no longer
> > >>>>>> work. Analysis shows that PCI devices with IO ports do not instantiate
> > >>>>>> in qemu v6.0 (or v6.1-rc0) when booting through efi. The problem affects
> > >>>>>> (at least) ne2k_pci, tulip, dc390, and am53c974. The problem only affects
> > >>>>>> aarch64, not x86/x86_64.
> > >>>>>>
> > >>>>>> I bisected the problem to commit 0cf8882fd0 ("acpi/gpex: Inform os to
> > >>>>>> keep firmware resource map"). Since this commit, PCI device BAR
> > >>>>>> allocation has changed. Taking tulip as example, the kernel reports
> > >>>>>> the following PCI bar assignments when running qemu v5.2.
> > >>>>>>
> > >>>>>> [ 3.921801] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > >>>>>> [ 3.922207] pci 0000:00:01.0: reg 0x10: [io 0x0000-0x007f]
> > >>>>>> [ 3.922505] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > >>
> > >> IIUC, these lines are read back from the BARs
> > >>
> > >>>>>> [ 3.927111] pci 0000:00:01.0: BAR 0: assigned [io 0x1000-0x107f]
> > >>>>>> [ 3.927455] pci 0000:00:01.0: BAR 1: assigned [mem 0x10000000-0x1000007f]
> > >>>>>>
> > >>
> > >> ... and this is the assignment created by the kernel.
> > >>
> > >>>>>> With qemu v6.0, the assignment is reported as follows.
> > >>>>>>
> > >>>>>> [ 3.922887] pci 0000:00:01.0: [1011:0019] type 00 class 0x020000
> > >>>>>> [ 3.923278] pci 0000:00:01.0: reg 0x10: [io 0x0000-0x007f]
> > >>>>>> [ 3.923451] pci 0000:00:01.0: reg 0x14: [mem 0x10000000-0x1000007f]
> > >>>>>>
> > >>
> > >> The problem here is that Linux, for legacy reasons, does not support
> > >> I/O ports <= 0x1000 on PCI, so the I/O assignment created by EFI is
> > >> rejected.
> > >>
> > >> This might make sense on x86, where legacy I/O ports may exist, but on
> > >> other architectures, this makes no sense.
> > >
> > >
> > > Fixing Linux makes sense but OTOH EFI probably shouldn't create mappings
> > > that trip up existing guests, right?
> > >
> >
> > I think it is difficult to draw a line. Sure, maybe EFI should not create
> > such mappings, but then maybe qemu should not suddenly start to enforce
> > those mappings for existing guests either.
> >
>
> EFI creates the mappings primarily for itself, and up until DSM #5
> started to be enforced, all PCI resource allocations that existed at
> boot were ignored by Linux and recreated from scratch.
>
> Also, the commit in question looks dubious to me. I don't think it is
> likely that Linux would fail to create a resource tree. What does
> happen is that BARs get moved around, which may cause trouble in some
> cases: for instance, we had to add special code to the EFI framebuffer
> driver to cope with framebuffer BARs being relocated.
>
> > For my own testing, I simply reverted commit 0cf8882fd0 in my copy of
> > qemu. That solves my immediate problem, giving us time to find a solution
> > that is acceptable for everyone. After all, it doesn't look like anyone
> > else has noticed the problem, so there is no real urgency.
> >
>
> I would argue that it is better to revert that commit. DSM #5 has a
> long history of debate and misinterpretation, and while I think we
> ended up with something sane, I don't think we should be using it in
> this particular case.

Re-reading it I have to agree. I think I misunderstood the spec and guest
behaviour when I applied it.

-- 
MST
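
For readers following the thread, the behaviour discussed above can be illustrated with a rough model. This is only a sketch under stated assumptions and is not the kernel's or QEMU's actual code: `MIN_IO`, `align_up` and `place_io_bar` are hypothetical names, the 0x1000 floor is taken from the discussion and from the "assigned [io 0x1000-0x107f]" log line, and the `preserve_firmware_config` flag stands in for DSM #5 being honoured by the guest.

```python
# Simplified model of the I/O BAR placement discussed in this thread.
# NOT kernel code: every name here is hypothetical and chosen for illustration.

MIN_IO = 0x1000  # assumed lowest I/O port address the guest accepts for a PCI BAR


def align_up(value: int, alignment: int) -> int:
    """Round value up to a multiple of alignment (alignment is a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def place_io_bar(firmware_start, size, preserve_firmware_config, next_free=MIN_IO):
    """Return the start address the guest ends up using, or None if unassigned.

    preserve_firmware_config=False models the older behaviour described in the
    thread: the firmware allocation is ignored and recreated from scratch.
    preserve_firmware_config=True models the guest keeping the firmware map:
    the firmware value is used as-is, or rejected if it lies below MIN_IO.
    """
    if not preserve_firmware_config:
        # Allocate from scratch, never below the floor, naturally aligned.
        return align_up(max(next_free, MIN_IO), size)
    if firmware_start >= MIN_IO:
        return firmware_start
    # Below the floor and not allowed to move it: the BAR stays unassigned.
    return None


if __name__ == "__main__":
    size = 0x80  # tulip I/O BAR size, from "[io 0x0000-0x007f]" in the log
    for label, preserve in (("reassign", False), ("preserve", True)):
        start = place_io_bar(0x0000, size, preserve)
        shown = "unassigned" if start is None else f"{start:#06x}-{start + size - 1:#06x}"
        print(f"{label}: {shown}")
```

With the tulip BAR read back as 0x0000-0x007f, the "reassign" case lands at 0x1000, matching the v5.2 log, while the "preserve" case leaves the BAR unassigned, which is consistent with the missing "assigned" lines in the v6.0 log.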