Hi,

Following this guide <https://medium.com/@calerogers/gpu-virtualization-with-kvm-qemu-63ca98a6a172>, I had already tried the vbios-romdump route before, but now, with a lot of other things checked off, I gave it another try:
1. Booted a live Linux from a USB stick on the host with CSM enabled, making sure not to boot via UEFI.
2. Dumped the VGA BIOS.
3. Ran the dump through rom-parser.
4. Created the ROM dump.

At this point I got a bit confused: if I understand correctly, rom-parser is only for viewing, rom-fixer for editing. As I noticed that the device ID in the dump didn't correspond with the one reported by lspci -nnk, I made a few versions of the dump with changed device IDs and had rom-fixer correct the checksum. If it doesn't help, it can't hurt either.

This is what lspci -nnk reported:

VGA compatible controller [0300]: Intel Corporation Device *[8086:5a85]* (rev 0b)
  Subsystem: ASRock Incorporation Device *[1849:5a85]*

So, following the vendor and device IDs from lspci -nnk, I made and tested the following dumps in the VM (the procedure is condensed into sketches below):

a) original dump (unchanged): vendor ID *8086* (Intel), device ID *0406* (unknown)
b) dump with changed device ID: vendor ID *8086* (Intel), device ID *5a85* (Intel HD500)
c) dump with changed vendor ID and device ID: vendor ID *1849* (ASRock), device ID *5a85* (Intel HD500)

This proved to be useless, as the output was identical whichever version was used.

5. Added the path to the VM's XML:

<rom bar='on' file='/var/lib/libvirt/vbios_dump/vbios_intel_HD500.rom'/>

The rom bar='on' was added by virt-manager (as also documented here <https://doc.fedoraproject.org/en-US/Fedora_Draft_Documentation/0.1/html/Virtualization_Deployment_and_Administration_Guide/sub-sub-section-libvirt-dom-xml-devices-interface-ROM-BIOS-configuration.html>), and provided some interesting results:

With rom bar='off', the results were identical to the situation before (where rom bar was on, but no VGA BIOS ROM file was specified): black screen (that powers off), one of the four assigned virtual CPUs maxing out, as well as the virtual memory. The messages in dmesg were also identical to before.

With rom bar='on', this time the VM refused to start, and I got the error messages below on the virsh console:

error: Failed to start domain ubuntu16.04_desktop
error: internal error: process exited while connecting to monitor: warning: host doesn't support requested feature: CPUID.01H:EDX.ds [bit 21]
warning: host doesn't support requested feature: CPUID.01H:EDX.acpi [bit 22]
warning: host doesn't support requested feature: CPUID.01H:EDX.ht [bit 28]
warning: host doesn't support requested feature: CPUID.01H:EDX.tm [bit 29]
warning: host doesn't support requested feature: CPUID.01H:EDX.pbe [bit 31]
warning: host doesn't support requested feature: CPUID.01H:ECX.dtes64 [bit 2]
warning: host doesn't support requested feature: CPUID.01H:ECX.monitor [bit 3]
warning: host doesn't support requested feature: CPUID.01H:ECX.ds-cpl [bit 4]
warning: host doesn't support requested feature: CPUID.01H:ECX.est [bit 7]
warning: host doesn't support requested feature: CPUID.01H:ECX.tm2 [bit 8]
warning: host doesn't support requested feature: CPUID.01H:ECX.xtpr [bit 14]
warning: host doesn't support requested feature: CPUID.01H:ECX.pdcm [bit 15]
warning: host doesn't support requested feature: CPUID.01H:EC

Notice that the final line is incomplete. Often this last line would go on to:

warning: host doesn't support requested feature: CPUID.01H:ECX.osxs

Even stranger is the fact that even if I enter a path to a non-existent file, the result and error message are the same. I double-checked permissions and the path.
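For reference, here is roughly what steps 1-4 and the device-ID patching boil down to, condensed into a sketch (run as root from the legacy-booted live image; paths are illustrative, and the dd patch is my own reading of the PCI expansion ROM layout, with the 16-bit PCIR pointer at header offset 0x18 and the device ID at PCIR+6, both little-endian, so treat it as an assumption rather than a recipe):

    # 1. dump the VGA BIOS via sysfs
    cd /sys/bus/pci/devices/0000:00:02.0
    echo 1 > rom                 # make the ROM BAR readable
    cat rom > /tmp/vbios.rom
    echo 0 > rom                 # disable reads again

    # 2./3. inspect the dump with rom-parser
    git clone https://github.com/awilliam/rom-parser
    cd rom-parser && make
    ./rom-parser /tmp/vbios.rom  # shows the vendor/device IDs embedded in the ROM

    # variants b) and c): patch the device ID in place; 0x5a85 is
    # written as bytes 85 5a because the field is little-endian
    PCIR=$((0x$(xxd -p -s 0x18 -l 2 /tmp/vbios.rom | sed 's/\(..\)\(..\)/\2\1/')))
    printf '\x85\x5a' | dd of=/tmp/vbios.rom bs=1 seek=$((PCIR + 6)) conv=notrunc
    ./rom-fixer /tmp/vbios.rom   # then let rom-fixer repair the checksum

(rom-fixer can apparently also change the IDs itself interactively, which would make the manual dd patch unnecessary.)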
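And for context, the hostdev entry that the <rom> line from step 5 ends up in. This is a sketch assuming the IGD at host address 00:02.0; the guest-side <address> element that virt-manager generates is omitted:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
      </source>
      <rom bar='on' file='/var/lib/libvirt/vbios_dump/vbios_intel_HD500.rom'/>
    </hostdev>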
Running the .xml with only <rom bar='on'/> (and no ROM file specified) is what I was running before, with the usual result (black screen, no errors). Same when I remove all PCI passthrough, video and graphics devices.

If there is any other info I can provide or something I can try, I would gladly do so. Thanks for any suggestions.

(Quick sketches of the grub regeneration check and the virsh nodedev-detach workflow discussed in the quoted thread are at the end of this message.)

Kind regards,
Geert

On 27 July 2017 at 17:41, Alex Williamson <alex.william...@redhat.com> wrote:
> On Thu, 27 Jul 2017 11:32:24 +0200
> Geert Coulommier <g.coulomm...@gmail.com> wrote:
>
> > Hi,
> >
> > so I've tried the two options you suggested:
> >
> > 1) "look in /proc/iomem and identify the driver that's still claiming
> > portions of IGD and disable it"
> >
> > from /proc/iomem:
> >
> > ...
> > 80000000-cfffffff : PCI Bus 0000:00
> >   80000000-8fffffff : 0000:00:02.0
> >     80000000-808cffff : efifb
> > ...
> >
> > which is strange, as the part "video=efifb:off" was added to grub
> > precisely to prevent this:
> >
> > GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on iommu=pt
> > rd.driver.pre=vfio-pci video=vesafb:off,efifb:off"
> >
> > Because I'm running the host on UEFI, and to keep things clean, I removed
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> This is probably the next piece of the puzzle...
>
> > the "vesafb:off" part:
> >
> > GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on iommu=pt
> > rd.driver.pre=vfio-pci video=efifb:off"
> >
> > Unexpectedly, this seemed to have an effect. Now from /proc/iomem (when
> > not running the VM; full printout of /proc/iomem below in [1]):
> >
> > ...
> > 80000000-cfffffff : PCI Bus 0000:00
> >   80000000-8fffffff : 0000:00:02.0
> >   90000000-90ffffff : 0000:00:02.0
> >   91000000-910fffff : 0000:00:0e.0
> >   91100000-911fffff : PCI Bus 0000:03
> >     91100000-911001ff : 0000:03:00.0
> >       91100000-911001ff : ahci
> > ...
> >
> > when running the VM, it goes to:
> >
> > ...
> > 80000000-cfffffff : PCI Bus 0000:00
> >   80000000-8fffffff : 0000:00:02.0
> >     80000000-8fffffff : vfio-pci
> >   90000000-90ffffff : 0000:00:02.0
> >     90000000-90ffffff : vfio-pci
> >   91000000-910fffff : 0000:00:0e.0
> >     91000000-910fffff : vfio-pci
> >   91100000-911fffff : PCI Bus 0000:03
> >     91100000-911001ff : 0000:03:00.0
> >       91100000-911001ff : ahci
> >   91200000-912fffff : PCI Bus 0000:01
> >     91200000-91203fff : 0000:01:00.0
> >       91200000-91203fff : r8169
> >     91204000-91204fff : 0000:01:00.0
> >       91204000-91204fff : r8169
> >   91300000-9130ffff : 0000:00:15.0
> >     91300000-9130ffff : xhci-hcd
> >   91310000-91313fff : 0000:00:0e.0
> >     91310000-91313fff : vfio-pci
> >   91314000-91315fff : 0000:00:12.0
> >     91314000-91315fff : ahci
> >   91316000-913160ff : 0000:00:1f.1
> >   91317000-913177ff : 0000:00:12.0
> >     91317000-913177ff : ahci
> >   91318000-913180ff : 0000:00:12.0
> >     91318000-913180ff : ahci
> >   9131b000-9131bfff : 0000:00:0f.0
> >     9131b000-9131bfff : mei_me
> > ...
> >
> > and the dmesg log:
> >
> > dmesg | grep -aiE '((DMAR)|(kvm)|(drm)|(Command line)|(iommu)|(vfio))'
> > [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.12.3-041203-generic
> > root=/dev/mapper/granada--vg-root ro quiet splash intel_iommu=on iommu=pt
> > rd.driver.pre=vfio-pci video=efifb:off vt.handoff=7
> > [ 0.000000] ACPI: DMAR 0x000000006D9D0470 0000A8 (v01 INTEL EDK2
> > 00000003 BRXT 0100000D)
> > [ 0.000000] Kernel command line:
> > BOOT_IMAGE=/boot/vmlinuz-4.12.3-041203-generic
> > root=/dev/mapper/granada--vg-root ro quiet splash intel_iommu=on iommu=pt
> > rd.driver.pre=vfio-pci video=efifb:off vt.handoff=7
> > [ 0.000000] DMAR: IOMMU enabled
> > [ 0.044107] DMAR: Host address width 39
> > [ 0.044109] DMAR: DRHD base: 0x000000fed64000 flags: 0x0
> > [ 0.044126] DMAR: dmar0: reg_base_addr fed64000 ver 1:0 cap
> > 1c0000c40660462 ecap 7e3ff0505e
> > [ 0.044128] DMAR: DRHD base: 0x000000fed65000 flags: 0x1
> > [ 0.044139] DMAR: dmar1: reg_base_addr fed65000 ver 1:0 cap
> > d2008c40660462 ecap f050da
> > [ 0.044142] DMAR: RMRR base: 0x0000006d5af000 end: 0x0000006d5cefff
> > [ 0.044145] DMAR: RMRR base: 0x0000006f800000 end: 0x0000007fffffff
> > [ 0.044148] DMAR-IR: IOAPIC id 1 under DRHD base 0xfed65000 IOMMU 1
> > [ 0.044150] DMAR-IR: HPET id 0 under DRHD base 0xfed65000
> > [ 0.044152] DMAR-IR: Queued invalidation will be enabled to support
> > x2apic and Intr-remapping.
> > [ 0.046253] DMAR-IR: Enabled IRQ remapping in x2apic mode
> > [ 1.794596] DMAR: No ATSR found
> > [ 1.795685] DMAR: dmar0: Using Queued invalidation
> > [ 1.795694] DMAR: dmar1: Using Queued invalidation
> > [ 1.795872] DMAR: Hardware identity mapping for device 0000:00:00.0
> > [ 1.795882] DMAR: Hardware identity mapping for device 0000:00:02.0
> > [ 1.795886] DMAR: Hardware identity mapping for device 0000:00:0e.0
> > [ 1.795888] DMAR: Hardware identity mapping for device 0000:00:0f.0
> > [ 1.795890] DMAR: Hardware identity mapping for device 0000:00:12.0
> > [ 1.795892] DMAR: Hardware identity mapping for device 0000:00:13.0
> > [ 1.795895] DMAR: Hardware identity mapping for device 0000:00:13.1
> > [ 1.795897] DMAR: Hardware identity mapping for device 0000:00:13.2
> > [ 1.795899] DMAR: Hardware identity mapping for device 0000:00:13.3
> > [ 1.795902] DMAR: Hardware identity mapping for device 0000:00:15.0
> > [ 1.795904] DMAR: Hardware identity mapping for device 0000:00:1f.0
> > [ 1.795906] DMAR: Hardware identity mapping for device 0000:00:1f.1
> > [ 1.795911] DMAR: Hardware identity mapping for device 0000:01:00.0
> > [ 1.795916] DMAR: Hardware identity mapping for device 0000:03:00.0
> > [ 1.795917] DMAR: Setting RMRR:
> > [ 1.795920] DMAR: Ignoring identity map for HW passthrough device
> > 0000:00:02.0 [0x6f800000 - 0x7fffffff]
> > [ 1.795922] DMAR: Ignoring identity map for HW passthrough device
> > 0000:00:15.0 [0x6d5af000 - 0x6d5cefff]
> > [ 1.795924] DMAR: Prepare 0-16MiB unity mapping for LPC
> > [ 1.795926] DMAR: Ignoring identity map for HW passthrough device
> > 0000:00:1f.0 [0x0 - 0xffffff]
> > [ 1.795954] DMAR: Intel(R) Virtualization Technology for Directed I/O
> > [ 1.796125] iommu: Adding device 0000:00:00.0 to group 0
> > [ 1.796140] iommu: Adding device 0000:00:02.0 to group 1
> > [ 1.796157] iommu: Adding device 0000:00:0e.0 to group 2
> > [ 1.796174] iommu: Adding device 0000:00:0f.0 to group 3
> > [ 1.796187] iommu: Adding device 0000:00:12.0 to group 4
> > [ 1.796229] iommu: Adding device 0000:00:13.0 to group 5
> > [ 1.796254] iommu: Adding device 0000:00:13.1 to group 5
> > [ 1.796271] iommu: Adding device 0000:00:13.2 to group 5
> > [ 1.796288] iommu: Adding device 0000:00:13.3 to group 5
> > [ 1.796317] iommu: Adding device 0000:00:15.0 to group 6
> > [ 1.796338] iommu: Adding device 0000:00:1f.0 to group 7
> > [ 1.796350] iommu: Adding device 0000:00:1f.1 to group 7
> > [ 1.796361] iommu: Adding device 0000:01:00.0 to group 5
> > [ 1.796371] iommu: Adding device 0000:03:00.0 to group 5
> > [ 2.512432] ata1.00: supports DRM functions and may not be fully
> > accessible
> > [ 2.514160] ata1.00: supports DRM functions and may not be fully
> > accessible
> > [ 3.124755] VFIO - User Level meta-driver version: 0.3
> > [ 3.137417] vfio-pci 0000:00:02.0: vgaarb: changed VGA decodes:
> > olddecodes=io+mem,decodes=io+mem:owns=io+mem
> > [ 3.156196] vfio_pci: add [8086:5a85[ffff:ffff]] class 0x000000/00000000
> > [ 3.176202] vfio_pci: add [8086:5a98[ffff:ffff]] class 0x000000/00000000
> >
> > with these entries added when running the VM:
> >
> > [ 49.439866] vfio_cap_init: 0000:00:0e.0 pci config conflict @0x80, was
> > cap 0x9 now cap 0x10
> > [ 49.439869] vfio_cap_init: 0000:00:0e.0 pci config conflict @0x81, was
> > cap 0x9 now cap 0x10
> > [ 49.439871] vfio_cap_init: 0000:00:0e.0 pci config conflict @0x82, was
> > cap 0x9 now cap 0x10
> > [ 49.439873] vfio_cap_init: 0000:00:0e.0 pci config conflict @0x83, was
> > cap 0x9 now cap 0x10
> > [ 49.442695] DMAR: DRHD: handling fault status reg 3
> > [ 49.442710] DMAR: [DMA Write] Request device [00:02.0] fault addr 0
> > [fault reason 02] Present bit in context entry is clear
> > [ 49.567831] vfio_ecap_init: 0000:00:02.0 hiding ecap 0x1b@0x100
> >
> > Passthrough still doesn't work though, and the last two lines in the
> > kernel log seem to hint at that. So from option one to option two:
>
> Nope, those are normal.
>
> > 2) "don't blacklist i915, let the kernel boot with it, then do a 'virsh
> > nodedev-detach pci_0000_00_02_0' at boot before starting the VM so that
> > you're not binding it back to i915 after every instance of running the
> > VM."
> >
> > So I unblacklisted i915 and executed 'virsh nodedev-detach
> > pci_0000_00_02_0':
> >
> > virsh nodedev-detach pci_0000_00_02_0
> > Device pci_0000_00_02_0 detached
> >
> > Then ran the VM. Unfortunately, the results are the same, as are the
> > entries in the kernel log (see above).
> >
> > When running the same 'virsh nodedev-detach pci_0000_00_02_0' command
> > while the VM is running, I get:
> >
> > virsh nodedev-detach pci_0000_00_02_0
> > error: Failed to detach device pci_0000_00_02_0
> > error: Requested operation is not valid: PCI device 0000:00:02.0 is in
> > use by driver QEMU, domain ubuntu16.04_desktop
> >
> > So it does seem to be attached to the VM correctly.
> >
> > Maybe an interesting observation: when the host boots, the screen shows
> > grub and then goes black but stays powered on. When I launch the VM, the
> > screen stays black but powers off.
>
> I think either mechanism above is equally effective and the remaining
> piece is likely that there's no VGA BIOS to initialize the graphics
> because the host is running in UEFI mode. To solve this, burn some
> sort of Linux live CD/DVD image and temporarily set the host BIOS to
> CSM/legacy mode to boot that live image. Sometimes you'll be able to
> select between UEFI or legacy mode when booting the CD, or you might
> be able to prioritize legacy over UEFI; it depends on the system. Once
> you've booted the image, dump the ROM for the IGD to a file and copy
> it somewhere that you'll be able to get to it later.
> Undo any settings for booting the image and then look at my rom-fixer
> utility:
>
> https://github.com/awilliam/rom-parser
>
> Run that on the ROM and then add a <rom file='/path/to/vga.rom'/> to
> the IGD hostdev entry in the XML.
>
> > Finally, until now I had ignored the errors in the kernel log on the
> > audio device (0000:00:0e.0), as I was focusing on the GPU. As requested,
> > the output of 'sudo lspci -xxxxs 0000:00:0e.0' is in [2].
>
> Thanks, I'll take a look at this.
>
> Alex
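A quick sketch of the grub-side check discussed in the quoted thread, assuming a Debian/Ubuntu-style host (update-grub) and the IGD at 0000:00:02.0; the kernel command line change only takes effect after grub is regenerated and the host rebooted:

    sudo update-grub             # regenerate grub.cfg from /etc/default/grub
    sudo reboot
    cat /proc/cmdline            # confirm video=efifb:off actually made it in
    # only vfio-pci (or nothing) should appear under the IGD's ranges now
    grep -A2 '0000:00:02.0' /proc/iomem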
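And the nodedev-detach workflow from option 2, as a sketch; virsh nodedev-reattach is the documented counterpart that hands the device back to the host driver after the VM shuts down:

    # unbind 00:02.0 from i915 and hand it to vfio-pci before starting the VM
    virsh nodedev-detach pci_0000_00_02_0
    # ... run the VM ...
    # give the device back to the host driver afterwards
    virsh nodedev-reattach pci_0000_00_02_0
    # sanity check which driver currently owns the device
    lspci -nnks 00:02.0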