On Jul 12, 2016 08:12, "Alex Williamson" <alex.william...@redhat.com> wrote:
>
> On Tue, 12 Jul 2016 02:30:44 -0700
> Robert Ou <r...@robertou.com> wrote:
>
> > I would like to report on a hack that I created to successfully use vfio-pci to pass through a boot GPU. The short TL;DR summary is that the BOOTFB framebuffer memory region seems to cause a "BAR <n>: can't reserve [mem <...>]" error, and this can be hackily worked around by calling __release_region on the BOOTFB framebuffer. I was told by someone on IRC to send this hack to this list.
> >
> > My system setup is as follows: I have a Xeon E5-2630 v4 on an ASRock X99 Extreme6 motherboard. The GPU I am attempting to pass through is an NVIDIA GTX 1080 plugged into the slot closest to the CPU. There is a second GPU, an AMD R5 240 OEM (Oland), used as the "initial" GPU for Linux ("initial" here meaning that the text consoles and the X graphical login appear on the monitor connected to this GPU; after logging in, additional commands are run to either start a VM or start a new X server using the NVIDIA GPU). Each GPU has its own monitor cable connected to it; there is no attempt to somehow forward the output from one GPU to the other. Linux is booted using UEFI, not BIOS, and the CSM is disabled. The UEFI splash and the GRUB bootloader display on the NVIDIA GPU, and there does not appear to be an option to change the boot GPU. However, Linux is configured to display its output on the AMD GPU by a) only describing the AMD GPU in xorg.conf and b) passing "video=simplefb:off" on the command line, as well as putting radeon in the initrd so that it can load before the nvidia driver does. I am running Debian sid with kernel 4.6.
> >
> > I activate the vfio-pci driver manually by writing to /sys/bus/pci/drivers/vfio-pci/new_id and then unbinding the existing driver and binding vfio-pci. This actually works most of the time (more on this later). When I initially (without my hack) try to launch a qemu-kvm guest (using virt-manager; the guest OS is Windows 10; the guest boots via OVMF and uses i440fx), the host kernel log gets flooded with the error "vfio-pci 0000:04:00.0: BAR 1: can't reserve [mem 0xc0000000-0xcfffffff 64bit pref]". Examining /proc/iomem shows that the memory region vfio-pci is trying to claim overlaps a memory region named BOOTFB, which is apparently the UEFI framebuffer (despite simplefb being disabled, this memory region is still created). As a really terrible hack, I wrote a kernel module that calls "__release_region(&iomem_resource, <start of bootfb>, <size of bootfb>)". This fixed the issue for me, and I was successfully able to pass through the boot GPU to the guest.
> >
> > The source code of this hacky kernel module is below. It is used by running "insmod forcefully-remove-bootfb.ko bootfb_start=<addr> bootfb_end=<addr>" with addresses found in /proc/iomem; the module is then immediately unloaded with rmmod. (The kernel module can't find BOOTFB by itself because I couldn't and didn't bother to figure out how to traverse iomem_resource from a kernel module; the resource_lock lock doesn't seem to be accessible from modules.)
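(A minimal sketch of a module along those lines, not necessarily the exact listing, assuming only the bootfb_start/bootfb_end parameters described above and the exported __release_region()/iomem_resource symbols:)

/* forcefully-remove-bootfb.c: forcibly drop the BOOTFB region from
 * iomem_resource so vfio-pci can reserve the overlapping GPU BAR.
 * Usage: insmod forcefully-remove-bootfb.ko bootfb_start=0x... bootfb_end=0x...
 * with the addresses taken from the BOOTFB line in /proc/iomem, then rmmod.
 */
#include <linux/module.h>
#include <linux/init.h>
#include <linux/ioport.h>

static unsigned long bootfb_start;
static unsigned long bootfb_end;
module_param(bootfb_start, ulong, 0);
module_param(bootfb_end, ulong, 0);

static int __init remove_bootfb_init(void)
{
        if (!bootfb_start || bootfb_end <= bootfb_start)
                return -EINVAL;

        /* /proc/iomem prints inclusive ranges, so the size is end - start + 1 */
        __release_region(&iomem_resource, bootfb_start,
                         bootfb_end - bootfb_start + 1);
        return 0;
}

static void __exit remove_bootfb_exit(void)
{
}

module_init(remove_bootfb_init);
module_exit(remove_bootfb_exit);
MODULE_LICENSE("GPL");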
> Can't you simply boot with video=efifb:off (or video=vesafb:off if you were running BIOS rather than UEFI)? This is what I do for IGD assignment. I'm sure nvidia.ko causes more problems than i915 though; maybe that's where simplefb comes into play.
What I'm saying is that this doesn't work. The BOOTFB region is always created, even when video=simplefb:off (which seems to have replaced efifb) is passed. I got the idea of forcefully deleting BOOTFB from a nouveau-related email thread here: https://lists.freedesktop.org/archives/nouveau/2013-October/014667.html

> > Regarding activating the vfio-pci drivers, I actually do not have the nvidia/snd_hda_intel drivers blacklisted. I allow them to load normally on boot and unbind them when I run a VM. I also attempt to rebind the normal drivers after shutting down the VM. The idea is that I can either run a Windows VM using the NVIDIA GPU, or start a second X server using the NVIDIA GPU and a separate xorg.nv.conf, and switch between these two modes without rebooting the host (restarting the second X server is still required). Most of the time this actually works correctly. Occasionally, however, the kernel encounters a general protection fault, but that is an issue unrelated to the hack I am describing.
>
> It would be a new development if nvidia.ko were to properly release device resources on unload; I filed a bug with them about that a while ago that got closed WONTFIX. They simply don't support dynamically unbinding devices from their driver, but maybe they've fixed something. Even i915 isn't great at this: we can unbind devices from it, but occasionally on re-bind the kernel freaks out. Someone needs to spend some time debugging each driver for the unbind/re-bind use case, but unfortunately that's impossible to do on nvidia.ko. Thanks,

Interestingly, I don't recall having any issues that seem to be obviously related to nvidia.ko. I have had more issues unbinding and rebinding the snd_hda_intel driver. I am running NVIDIA driver version 367.27.

> Alex
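For reference, the bind/unbind dance described above boils down to something like the following (a sketch only; the vendor:device ID shown is illustrative, the 0000:04:00.0 address is the one from the log message quoted above, and the GPU's HDA audio function would need the same treatment):

# one-time: tell vfio-pci which device IDs it may drive
echo 10de 1b80 > /sys/bus/pci/drivers/vfio-pci/new_id

# VM mode: detach the GPU from nvidia and attach it to vfio-pci
echo 0000:04:00.0 > /sys/bus/pci/devices/0000:04:00.0/driver/unbind
echo 0000:04:00.0 > /sys/bus/pci/drivers/vfio-pci/bind

# host X mode: after the guest shuts down, give the GPU back to nvidia
echo 0000:04:00.0 > /sys/bus/pci/drivers/vfio-pci/unbind
echo 0000:04:00.0 > /sys/bus/pci/drivers/nvidia/bind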