Hi all,

I recently installed a kernel upgrade to 4.1.13, which seems to have 
broken GOP output switching (and thus Xorg) when I sequester my PCI-E GPU 
during initrd. This happens with either pci-stub or vfio-pci and persists 
through at least kernel 4.4-rc7 (I haven't tried later versions). I've 
narrowed the problem down to a single kernel patch, but before reporting 
upstream I'd like to know whether anyone else has seen this issue, as the 
failure mode likely only affects vfio users.

This is the patch in question:

https://github.com/torvalds/linux/commit/bd69119

My normal boot sequence with kernel 4.1.12, or with 4.1.13 built without the 
patch, is as follows. Separate monitors are connected to the integrated GPU 
and the PCI-E GPU before boot.
* POST appears on both monitors, PCI-E GPU first then iGPU (despite UEFI 
setting...)
* GRUB appears on PCI-E GPU, linux kernel selected and loads initrd (iGPU blank 
at this point)
* GRUB prompt freezes on PCI-E GPU, boot splash screen appears on iGPU output
* KDE login appears on iGPU output, system acts as normal from this point
* xorg log shows successful binding to i915, no mention of radeon or attempts 
to bind to radeon
* dmesg indicates fbcon inteldrmfb is set as the primary device

Broken boot sequence, same setup as above but with 4.1.13+ as-is:
* POST appears on both monitors, PCI-E GPU first then iGPU (despite UEFI 
setting...)
* GRUB appears on PCI-E GPU, loads initrd
* PCI-E GPU goes blank, can see iGPU monitor light up but no output
* After waiting, must SSH into box to do anything
* Xorg logs show it trying to bind to radeon, which fails because the card is 
sequestered, and that is why Xorg dies
* dmesg shows no sign of fbcon switching; fb0 is bound to a generic EFI VGA 
frame buffer device, which could be anything?
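
For anyone who wants to compare their own setup, here is a minimal Python 
sketch that lists which driver owns each frame buffer. It assumes /proc/fb 
uses the usual "<index> <name>" layout (for example "0 EFI VGA" or 
"0 inteldrmfb"); the function name is my own and not part of any tool.

```python
from pathlib import Path
from typing import Dict


def list_framebuffers(proc_fb_text: str) -> Dict[int, str]:
    """Parse /proc/fb style text into {fb index: driver id string}.

    Assumes one "<index> <name>" entry per line, where the name may
    itself contain spaces (e.g. "EFI VGA").
    """
    result: Dict[int, str] = {}
    for line in proc_fb_text.splitlines():
        line = line.strip()
        if not line:
            continue
        index, _, name = line.partition(" ")
        result[int(index)] = name
    return result


if __name__ == "__main__":
    path = Path("/proc/fb")
    if path.exists():
        for idx, name in sorted(list_framebuffers(path.read_text()).items()):
            print(f"fb{idx}: {name}")
```

On the working kernel I would expect fb0 to end up as inteldrmfb, while on 
the broken one it stays on the EFI VGA device.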

I've identified three work-arounds which restore functionality:
1) Disconnect the monitor from the PCI-E GPU during boot and plug it back in 
after the host has booted
2) Re-compile the kernel without the above patch (works on at least 4.1.13; I 
haven't tried other versions, but there have been no commits to eboot.c since)
3) Use a custom xorg.conf to pin the primary monitor to the integrated GPU

My platform:

* Core i5 2500 w/ Intel HD2000
* Gigabyte Z77X-UD5H w/ firmware F14, iGPU set as primary adapter, VT-d enabled 
etc.
* Radeon 7750 used for PCI-E pass-through using vfio-pci
* OpenSUSE 42.1 booting from UEFI
* OpenSUSE KVM pattern (via YaST):
-libvirt 1.2.18.1
-virt-manager 1.2.1
-QEMU 2.3.1
* Kernel command line: "resume=/dev/system/swap splash=silent quiet showopts 
intel_iommu=on rd.driver.pre=vfio-pci"
* pci-ids for HD7750 are static entries in conf file within modprobe.d
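
To confirm that the card really is sequestered after boot, the bound driver 
can be read from sysfs. This is a rough sketch, assuming the standard 
/sys/bus/pci/devices/<address>/driver symlink; the helper name is made up 
and the address 0000:01:00.0 is only a placeholder, not necessarily where 
the HD7750 sits on my board.

```python
import os
from typing import Optional


def bound_driver(pci_address: str, sysfs_root: str = "/sys") -> Optional[str]:
    """Return the name of the driver bound to a PCI device, or None.

    Reads the "driver" symlink under the device's sysfs directory; the
    link is absent when no driver has claimed the device.
    """
    link = os.path.join(sysfs_root, "bus", "pci", "devices", pci_address, "driver")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # Placeholder address: substitute the one lspci reports for the GPU.
    print(bound_driver("0000:01:00.0"))
```

With the setup described above, the pass-through GPU should report vfio-pci 
(or pci-stub) and the iGPU should report i915.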

There's also the possibility that my hardware is suspect. My motherboard hasn't 
had the best track record with regard to firmware quality, and there are 
existing ACPI issues (automatic reboot on shutdown, for example). My GPU also 
uses a patched firmware in order to enable GOP. That said, everything works as 
it should when the kernel patch is removed.

Regards,
Joseph
