* Gerd Hoffmann (kra...@redhat.com) wrote:
>   Hi,
>
> > Something somewhere in qemu/ kernel/ firmware is already reading the number
> > of physical bits to determine PCI mapping; if I do:
> >
> > ./x86_64-softmmu/qemu-system-x86_64 -m 4096,slots=16,maxmem=128T
>
> No, it's not the physbits.  You add some memory hotplug slots here.
> Qemu will ask seabios to reserve address space for those, which seabios
> promptly does and maps 64bit pci bars above the reserved address space.

Right, that's what I was trying to do - I wanted to see if I could get
something to use the non-existing address space.

> > -vga none -device
> > qxl-vga,bus=pcie.0,ram_size_mb=2048,vram64_size_mb=2048 -vnc 0.0.0.0:0
> > /home/vms/7.2a.qcow2 -chardev stdio,mux=on,id=mon -mon
> > chardev=mon,mode=readline -cpu host,phys-bits=48
> >
> > it will happily map the qxl VRAM right up high, but if I lower
> > the phys-bits down to 46 it won't.
>
> I suspect the linux kernel remaps the bar because the seabios mapping is
> unreachable.  Check dmesg.

Right, and that is dependent on physbits; if I run with:

./x86_64-softmmu/qemu-system-x86_64 -machine q35,accel=kvm,usb=off -m 4096,slots=16,maxmem=128T -vga none -device qxl-vga,bus=pcie.0,ram_size_mb=2048,vram64_size_mb=2048 -vnc 0.0.0.0:0 /home/vms/7.2a.qcow2 -chardev stdio,mux=on,id=mon -mon chardev=mon,mode=readline -cpu host,phys-bits=48

(on a 46 bit xeon) it happily maps that 64-bit bar into somewhere that
shouldn't be accessible:

[    0.266183] pci_bus 0000:00: root bus resource [mem 0x800480000000-0x8004ffffffff]
[    0.321611] pci 0000:00:02.0: reg 0x20: [mem 0x800480000000-0x8004ffffffff 64bit pref]
[    0.423257] pci_bus 0000:00: resource 8 [mem 0x800480000000-0x8004ffffffff]

lspci -v:

00:02.0 VGA compatible controller: Red Hat, Inc. QXL paravirtual graphic card (rev 04) (prog-if 00 [VGA controller])
        Subsystem: Red Hat, Inc QEMU Virtual Machine
        Flags: fast devsel, IRQ 22
        Memory at c0000000 (32-bit, non-prefetchable) [size=512M]
        Memory at e0000000 (32-bit, non-prefetchable) [size=64M]
        Memory at e4070000 (32-bit, non-prefetchable) [size=8K]
        I/O ports at c080 [size=32]
        Memory at 800480000000 (64-bit, prefetchable) [size=2G]
        Expansion ROM at e4060000 [disabled] [size=64K]
        Kernel driver in use: qxl

So that's mapped at an address beyond host phys-bits.
And it hasn't failed/crashed etc - but I guess maybe nothing is using that 2G space?
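
For anyone wanting to check the arithmetic, here is a rough sketch in Python. The helper name fits_in_phys_bits is made up for illustration and is not anything from qemu or the kernel; the BAR range is the one from the dmesg output above. It only compares the inclusive range against the highest address reachable with a given number of physical address bits:

```python
def fits_in_phys_bits(start, end, phys_bits):
    """Return True if the inclusive range [start, end] is addressable
    with phys_bits physical address bits (hypothetical helper)."""
    limit = (1 << phys_bits) - 1  # highest addressable physical address
    return start <= end <= limit


# 64-bit prefetchable BAR of the qxl device, taken from the dmesg above
BAR_START = 0x800480000000
BAR_END = 0x8004FFFFFFFF

for bits in (46, 48):
    limit = (1 << bits) - 1
    print(f"phys-bits={bits}: limit={limit:#x} "
          f"fits={fits_in_phys_bits(BAR_START, BAR_END, bits)}")
```

With those numbers the BAR sits just above 2^47 (128T), so it fits under a 48-bit limit but not under the 46 bits the host really has.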

If I change the phys-bits=48 to 46 the kernel avoids it:

[    0.414867] acpi PNP0A08:00: host bridge window [0x800480000000-0x8004ffffffff] (ignored, not CPU addressable)
[    0.683134] pci 0000:00:02.0: can't claim BAR 4 [mem 0x800480000000-0x8004ffffffff 64bit pref]: no compatible bridge window
[    0.703948] pci 0000:00:02.0: BAR 4: [mem size 0x80000000 64bit pref] conflicts with PCI mem [mem 0x00000000-0x3fffffffffff]
[    0.703951] pci 0000:00:02.0: BAR 4: failed to assign [mem size 0x80000000 64bit pref]

lspci shows:
        Memory at <ignored> (64-bit, prefetchable)

(Although, interestingly, qemu's 'info pci' still shows it).

The 'ignored, not CPU addressable' comes from
acpi_pci_root_validate_resources in the kernel's drivers/acpi/pci_root.c,
which uses a value set in arch/x86/kernel/setup.c:

    iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1;

So at least the Linux kernel does sanity check using the phys_bits value.

Obviously 128T is a bit silly for maxmem at the moment, however I was
worrying what happens with 36/39/40bit hosts, and it's not unusual to pick
a maxmem that's a few TB even if the VMs you're initially creating are
only a handful of GB. (oVirt/RHEV seems to use a 4TB default for maxmem).

Still, this only becomes a problem if you hit the combination of:
  a) You use large PCI bars
  b) On a 36/39/40bit host
  c) With a large maxmem that forces those PCI bars up to something silly.

Dave

>
> cheers,
>   Gerd
>
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK